Architecture¶
Deep dive into Hector's architecture, components, and design principles.
Overview¶
Hector is a declarative AI agent platform built on three core principles:
- A2A Protocol Native - 100% compliant with the A2A Protocol specification
- Zero Code Required - Build agents with YAML configuration
- Production Ready - Enterprise-grade performance and reliability
System Architecture¶
┌──────────────────────────────────────────────────────────────┐
│ APPLICATION │
│ (Your Agent & Logic) │
├──────────────────────────────────────────────────────────────┤
│ HECTOR RUNTIME │
│ • Configuration Loading • Agent Initialization │
│ • Component Management • Lifecycle Management │
├──────────────────────────────────────────────────────────────┤
│ CLIENT LAYER │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ A2AClient Interface (Protocol Native) │ │
│ ├─────────────────────────────────────────────────────────┤ │
│ │ HTTPClient │ LocalClient │ │
│ │ • Remote agents │ • In-process agents │ │
│ │ • Uses protojson │ • No network calls │ │
│ │ • Multi-transport │ • Direct protobuf │ │
│ └─────────────────────────────────────────────────────────┘ │
├──────────────────────────────────────────────────────────────┤
│ TRANSPORT LAYER │
│ ┌──────────────┬──────────────────┬─────────────────────┐ │
│ │ gRPC (Core) │ REST (Gateway) │ JSON-RPC (Adapter) │ │
│ │ • Native │ • Auto-gen │ • Custom HTTP │ │
│ │ • Binary │ • JSON │ • Simple RPC │ │
│ │ • Streaming │ • SSE │ • JSON │ │
│ │ Port: 8080 │ Port: 8081 │ Port: 8082 │ │
│ └──────────────┴──────────────────┴─────────────────────┘ │
├──────────────────────────────────────────────────────────────┤
│ AGENT LAYER │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Agent (A2A Interface) │ │
│ │ • SendMessage • GetAgentCard │ │
│ │ • SendStreamingMessage • GetTask/CancelTask │ │
│ │ • Task subscriptions • Push notifications │ │
│ └─────────────────────────────────────────────────────────┘ │
├──────────────────────────────────────────────────────────────┤
│ REASONING ENGINE │
│ Chain-of-Thought Strategy | Supervisor Strategy │
│ • Step-by-step reasoning | • Multi-agent coord │
│ • Natural termination | • Task decomposition │
├──────────────────────────────────────────────────────────────┤
│ CORE SERVICES │
│ ┌───────────┬──────────┬──────────┬──────────┬──────────┐ │
│ │ LLM │ Tools │ Memory │ RAG │ Tasks │ │
│ │ • OpenAI │ • Local │ • Buffer │ • Qdrant │ • Async │ │
│ │• Anthropic│ • MCP │ • Summary│ • Search │ • Status │ │
│ │ • Gemini │ • Plugin │ • Session│ • Embed │ • Track │ │
│ └───────────┴──────────┴──────────┴──────────┴──────────┘ │
└──────────────────────────────────────────────────────────────┘
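The client layer in the diagram is easiest to read as one interface with two interchangeable implementations: a remote client that goes over a transport, and a local client that invokes the agent in-process. The sketch below illustrates that split; the type and method names are assumptions for illustration, not Hector's actual client API.
package main

import (
    "context"
    "fmt"
)

// A2AClient is a hypothetical view of the client-layer interface from the
// diagram: the caller does not care whether the agent is remote or in-process.
type A2AClient interface {
    SendMessage(ctx context.Context, agent, text string) (string, error)
}

// HTTPClient would reach a remote agent over one of the transports.
type HTTPClient struct{ baseURL string }

func (c *HTTPClient) SendMessage(ctx context.Context, agent, text string) (string, error) {
    // A real client would serialize the message (e.g. with protojson) and send
    // it to the remote server; omitted here.
    return "", fmt.Errorf("not implemented: would call %s for agent %q", c.baseURL, agent)
}

// LocalClient invokes an in-process agent directly, with no network hop.
type LocalClient struct {
    agents map[string]func(ctx context.Context, text string) (string, error)
}

func (c *LocalClient) SendMessage(ctx context.Context, agent, text string) (string, error) {
    handler, ok := c.agents[agent]
    if !ok {
        return "", fmt.Errorf("unknown agent %q", agent)
    }
    return handler(ctx, text)
}

func main() {
    var client A2AClient = &LocalClient{
        agents: map[string]func(context.Context, string) (string, error){
            "assistant": func(_ context.Context, text string) (string, error) {
                return "echo: " + text, nil
            },
        },
    }
    reply, _ := client.SendMessage(context.Background(), "assistant", "Hello")
    fmt.Println(reply)
}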
Core Components¶
1. Runtime¶
The runtime manages agent lifecycle and component initialization.
Responsibilities:
- Load and validate configuration
- Initialize components (LLMs, databases, embedders)
- Create and register agents
- Handle graceful shutdown
Key Files:
- pkg/runtime/runtime.go
- pkg/hector.go
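Conceptually, the runtime is a start/stop sequence wrapped around those responsibilities. The following sketch is purely illustrative; the Runtime type and its methods are hypothetical and do not mirror the code in the files above.
package sketch

import "context"

// Runtime is a hypothetical stand-in for Hector's runtime: it owns the parsed
// configuration, the shared components, and the registered agents.
type Runtime struct {
    config     map[string]any // loaded and validated YAML
    components map[string]any // LLMs, databases, embedders, ...
    agents     map[string]any // agents built from the config
    shutdown   []func(ctx context.Context) error
}

// Start walks the lifecycle described above.
func (r *Runtime) Start(ctx context.Context, configPath string) error {
    // 1. Load and validate the configuration (see the Configuration System section).
    // 2. Initialize components: LLM providers, vector databases, embedders.
    // 3. Create agents from their definitions and register them with the
    //    transport layer so they become reachable over gRPC/REST/JSON-RPC.
    return nil
}

// Stop runs the registered shutdown hooks in reverse order for a graceful exit.
func (r *Runtime) Stop(ctx context.Context) error {
    for i := len(r.shutdown) - 1; i >= 0; i-- {
        if err := r.shutdown[i](ctx); err != nil {
            return err
        }
    }
    return nil
}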
2. Agent¶
The core agent implementation that handles message processing.
Responsibilities:
- Receive and process messages
- Execute reasoning strategy
- Manage conversation context
- Return responses
Key Files:
- pkg/agent/agent.go
- pkg/agent/orchestration.go
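The agent's external surface is the set of A2A operations named in the diagram above. A rough Go rendering of that surface, with simplified placeholder types instead of the real protobuf messages, might look like the sketch below; the signatures are assumptions, not the actual interface in pkg/agent.
package sketch

import "context"

// Message and Task are simplified placeholders for the A2A protobuf types.
type Message struct {
    Role string
    Text string
}

type Task struct {
    ID     string
    Status string
}

// Agent sketches the A2A-facing operations listed in the diagram.
// Signatures are illustrative only.
type Agent interface {
    SendMessage(ctx context.Context, msg Message) (Message, error)
    SendStreamingMessage(ctx context.Context, msg Message) (<-chan Message, error)
    GetAgentCard(ctx context.Context) (map[string]any, error)
    GetTask(ctx context.Context, taskID string) (Task, error)
    CancelTask(ctx context.Context, taskID string) error
}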
3. Reasoning Engine¶
Strategies for agent reasoning and decision-making.
Chain-of-Thought:
- Step-by-step reasoning
- Tool execution loop
- Natural termination
Supervisor:
- Multi-agent orchestration
- Task decomposition
- Agent selection
- Result synthesis
Key Files:
- pkg/reasoning/chain_of_thought.go
- pkg/reasoning/supervisor_strategy.go
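The chain-of-thought strategy boils down to a loop: ask the LLM, run any tools it requests, feed the results back, and stop once the model answers without requesting tools (natural termination). The sketch below shows that generic loop; all types and names are hypothetical rather than taken from pkg/reasoning.
package sketch

import (
    "context"
    "fmt"
)

// Illustrative types, not Hector's real LLM or tool interfaces.
type ToolCall struct {
    Name string
    Args map[string]any
}

type Completion struct {
    Text      string
    ToolCalls []ToolCall
}

type LLM interface {
    Complete(ctx context.Context, history []string) (Completion, error)
}

type ToolRunner interface {
    Run(ctx context.Context, call ToolCall) (string, error)
}

// chainOfThought loops until the model stops requesting tools (natural
// termination) or the step budget is exhausted.
func chainOfThought(ctx context.Context, llm LLM, tools ToolRunner, prompt string, maxSteps int) (string, error) {
    history := []string{prompt}
    for step := 0; step < maxSteps; step++ {
        out, err := llm.Complete(ctx, history)
        if err != nil {
            return "", err
        }
        if len(out.ToolCalls) == 0 {
            return out.Text, nil // no tool calls requested: the answer is final
        }
        for _, call := range out.ToolCalls {
            result, err := tools.Run(ctx, call)
            if err != nil {
                result = "tool error: " + err.Error()
            }
            history = append(history, "tool "+call.Name+": "+result)
        }
    }
    return "", fmt.Errorf("no final answer after %d steps", maxSteps)
}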
4. Memory System¶
Dual-layer memory for short- and long-term context.
Working Memory:
- Buffer Window (sliding window of the last N messages)
- Summary Buffer (token-based compression)
Long-Term Memory:
- Vector-based storage
- Semantic search
- Session scoping
Key Files:
- pkg/memory/memory.go
- pkg/memory/vector_memory.go
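As a concrete picture of the working-memory side, a buffer window simply keeps the most recent N messages. The sketch below is a generic illustration of that strategy, not Hector's implementation; the summary-buffer variant would additionally compress older messages into a running summary under a token budget.
package sketch

// bufferWindow is a sketch of the buffer-window strategy: keep only the most
// recent N messages as working memory.
type bufferWindow struct {
    size     int
    messages []string
}

func (b *bufferWindow) Add(msg string) {
    b.messages = append(b.messages, msg)
    if len(b.messages) > b.size {
        // Drop the oldest messages so the window stays at a fixed size.
        b.messages = b.messages[len(b.messages)-b.size:]
    }
}

func (b *bufferWindow) Context() []string { return b.messages }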
5. Tool System¶
Three-tier tool architecture.
Built-in Tools:
- File operations (read, write, search_replace)
- Command execution
- Document search
- Todo management
- Agent communication
MCP Integration:
- External tool servers
- 150+ integrations (Composio, Mem0)
- Standard protocol
gRPC Plugins:
- Custom high-performance tools
- Native Go interface
- Production-grade
Key Files:
- pkg/tools/*.go
- pkg/tools/mcp.go
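What makes the three tiers interchangeable is a common tool interface plus a registry keyed by tool name, so the reasoning loop never cares which tier a tool comes from. The sketch below illustrates that idea with hypothetical names; the real interface in pkg/tools may differ.
package sketch

import (
    "context"
    "fmt"
)

// Tool is a hypothetical common interface: built-in tools, MCP-backed tools,
// and gRPC plugin tools would all satisfy it.
type Tool interface {
    Name() string
    Call(ctx context.Context, args map[string]any) (string, error)
}

// Registry indexes tools by name regardless of their tier.
type Registry struct {
    tools map[string]Tool
}

func (r *Registry) Register(t Tool) {
    if r.tools == nil {
        r.tools = map[string]Tool{}
    }
    r.tools[t.Name()] = t
}

func (r *Registry) Call(ctx context.Context, name string, args map[string]any) (string, error) {
    t, ok := r.tools[name]
    if !ok {
        return "", fmt.Errorf("unknown tool %q", name)
    }
    return t.Call(ctx, args)
}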
6. LLM Providers¶
Unified interface for multiple LLM providers.
Supported:
- OpenAI (GPT-4o, GPT-4o-mini)
- Anthropic (Claude)
- Google (Gemini)
- Custom plugins
Key Files:
- pkg/llms/*.go
- pkg/llms/registry.go
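A unified provider interface usually pairs with a registry that maps the provider name used in the YAML configuration to a concrete implementation. The sketch below shows that pattern with illustrative names only; it is not the interface in pkg/llms.
package sketch

import "context"

// LLMProvider is a sketch of a unified provider interface; OpenAI, Anthropic,
// Gemini, and plugin-backed providers would each implement it.
type LLMProvider interface {
    Generate(ctx context.Context, prompt string) (string, error)
}

// providers maps the name referenced in the configuration to a concrete
// implementation, so agents only ever refer to providers by name.
var providers = map[string]LLMProvider{}

func RegisterProvider(name string, p LLMProvider) { providers[name] = p }

func Provider(name string) (LLMProvider, bool) {
    p, ok := providers[name]
    return p, ok
}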
7. RAG & Document Stores¶
Semantic search and retrieval augmented generation.
Components:
- Document parsers (text, markdown, code)
- Vector databases (Qdrant)
- Embedders (Ollama)
- Incremental indexing
Key Files:
- pkg/context/document_store.go
- pkg/context/search.go
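The RAG pipeline reduces to two paths: indexing (parse, chunk, embed, upsert into the vector database) and querying (embed the question, search for the nearest chunks). The sketch below illustrates both with placeholder interfaces standing in for the embedder and the vector database; none of the names are Hector's.
package sketch

import "context"

// Embedder and VectorDB are simplified stand-ins for the embedder (e.g. Ollama)
// and the vector database (e.g. Qdrant) described above.
type Embedder interface {
    Embed(ctx context.Context, text string) ([]float32, error)
}

type VectorDB interface {
    Upsert(ctx context.Context, id string, vector []float32, payload map[string]string) error
    Search(ctx context.Context, vector []float32, limit int) ([]map[string]string, error)
}

// indexChunk sketches incremental indexing: embed a parsed chunk and upsert it.
func indexChunk(ctx context.Context, e Embedder, db VectorDB, id, chunk string) error {
    vec, err := e.Embed(ctx, chunk)
    if err != nil {
        return err
    }
    return db.Upsert(ctx, id, vec, map[string]string{"text": chunk})
}

// semanticSearch sketches retrieval: embed the query and return the closest chunks.
func semanticSearch(ctx context.Context, e Embedder, db VectorDB, query string, limit int) ([]map[string]string, error) {
    vec, err := e.Embed(ctx, query)
    if err != nil {
        return nil, err
    }
    return db.Search(ctx, vec, limit)
}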
8. Transport Layer¶
Multiple protocols supported from a single codebase.
Protocols:
- gRPC (binary, HTTP/2)
- REST (JSON, HTTP/1.1)
- JSON-RPC (simple integration)
- WebSocket (real-time)
Key Files:
- pkg/transport/*.go
- pkg/a2a/server/*.go
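The value of the transport layer is that one agent-facing handler can be exposed over several wire formats. The sketch below mounts the same handler behind two plain-HTTP front ends standing in for the REST gateway and the JSON-RPC adapter (the gRPC core is omitted because it needs generated protobuf stubs); the paths and payload shapes are made up for illustration and only the port numbers come from the diagram.
package sketch

import (
    "encoding/json"
    "net/http"
)

// handleMessage is the single agent-facing entry point; each transport below
// is just a different wire format in front of it.
func handleMessage(text string) string { return "echo: " + text }

// serveTransports exposes the same handler on two ports. Endpoint paths and
// request shapes here are illustrative, not Hector's API.
func serveTransports() error {
    rest := http.NewServeMux()
    rest.HandleFunc("/v1/message", func(w http.ResponseWriter, r *http.Request) {
        var req struct {
            Text string `json:"text"`
        }
        _ = json.NewDecoder(r.Body).Decode(&req)
        _ = json.NewEncoder(w).Encode(map[string]string{"reply": handleMessage(req.Text)})
    })

    jsonrpc := http.NewServeMux()
    jsonrpc.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        var req struct {
            Method string `json:"method"`
            Params struct {
                Text string `json:"text"`
            } `json:"params"`
        }
        _ = json.NewDecoder(r.Body).Decode(&req)
        _ = json.NewEncoder(w).Encode(map[string]any{"result": handleMessage(req.Params.Text)})
    })

    go func() { _ = http.ListenAndServe(":8081", rest) }() // REST port from the diagram
    return http.ListenAndServe(":8082", jsonrpc)           // JSON-RPC port from the diagram
}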
Data Flow¶
Message Processing¶
1. Client → Transport Layer
└─ REST/gRPC/JSON-RPC request
2. Transport → Agent
└─ Parse and route to agent
3. Agent → Reasoning Engine
└─ Execute reasoning strategy
4. Reasoning → LLM + Tools
└─ Generate response, execute tools
5. Agent → Memory
└─ Store conversation context
6. Agent → Transport
└─ Return response
7. Transport → Client
└─ Stream or return complete response
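Collapsed into code, the seven steps look roughly like the function below. Every type is a placeholder, and steps 1-2 and 6-7 live in the transport layer rather than in this function.
package sketch

import "context"

// conversationalAgent is a placeholder for the agent's internal surface.
type conversationalAgent interface {
    Reason(ctx context.Context, text string) (string, error)
    Remember(ctx context.Context, userText, reply string) error
}

func processMessage(ctx context.Context, agent conversationalAgent, text string) (string, error) {
    // Steps 1-2: the transport layer has already parsed the request and routed it here.

    // Steps 3-4: run the reasoning strategy, which drives the LLM and tools.
    reply, err := agent.Reason(ctx, text)
    if err != nil {
        return "", err
    }

    // Step 5: store the exchange in memory for future turns.
    if err := agent.Remember(ctx, text, reply); err != nil {
        return "", err
    }

    // Steps 6-7: the transport streams the reply or returns it whole to the client.
    return reply, nil
}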
Multi-Agent Orchestration¶
1. Client → Supervisor Agent
└─ High-level task
2. Supervisor → LLM
└─ Analyze task, create plan
3. Supervisor → Sub-Agents
└─ Delegate via agent_call tool
4. Sub-Agents → Execution
└─ Process individual tasks
5. Sub-Agents → Supervisor
└─ Return results
6. Supervisor → LLM
└─ Synthesize final response
7. Supervisor → Client
└─ Return complete result
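In code form, the supervisor flow is: ask the LLM for a plan, delegate each sub-task through an agent_call-style tool, then ask the LLM to synthesize the results. The sketch below is a generic rendering of that flow with hypothetical types, not the supervisor strategy in pkg/reasoning.
package sketch

import (
    "context"
    "strings"
)

// callAgent stands in for the agent_call tool: it sends one sub-task to a
// named sub-agent and returns its result.
type callAgent func(ctx context.Context, agent, task string) (string, error)

// planner stands in for the supervisor's LLM: it decomposes a task into
// (agent, sub-task) pairs and later synthesizes the results.
type planner interface {
    Plan(ctx context.Context, task string) (map[string]string, error) // agent -> sub-task
    Synthesize(ctx context.Context, task string, results []string) (string, error)
}

// supervise sketches the orchestration flow above: plan, delegate, synthesize.
func supervise(ctx context.Context, p planner, call callAgent, task string) (string, error) {
    plan, err := p.Plan(ctx, task)
    if err != nil {
        return "", err
    }
    var results []string
    for agent, sub := range plan {
        out, err := call(ctx, agent, sub)
        if err != nil {
            out = "delegation to " + agent + " failed: " + err.Error()
        }
        results = append(results, strings.TrimSpace(out))
    }
    return p.Synthesize(ctx, task, results)
}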
Plugin System¶
Hector supports gRPC plugins for extending functionality.
Plugin Types¶
Type | Purpose | Interface |
---|---|---|
LLM | Custom language models | LLMProvider |
Database | Vector databases | Database |
Embedder | Text embeddings | Embedder |
Tool | Custom tools | Tool |
Parser | Document parsing | DocumentParser |
Reasoning | Custom strategies | ReasoningEngine |
Plugin Architecture¶
┌─────────────────────────────────────┐
│ Hector Core │
├─────────────────────────────────────┤
│ Plugin Interface (gRPC) │
├─────────────────────────────────────┤
│ ┌──────────┐ ┌──────────┐ │
│ │ Plugin 1 │ │ Plugin 2 │ │
│ │ (Go) │ │ (Python) │ │
│ └──────────┘ └──────────┘ │
└─────────────────────────────────────┘
Key Files:
- pkg/plugins/*.go
- pkg/plugins/grpc/*.proto
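Each plugin type in the table is ultimately a Go interface that an out-of-process gRPC server can satisfy through a thin adapter. The sketch below shows the idea for the Embedder type; the interface shape and the generated client stub are assumptions, not Hector's .proto contract.
package sketch

import "context"

// Embedder mirrors the plugin type from the table above; the other plugin
// kinds (LLMProvider, Database, Tool, DocumentParser, ReasoningEngine) follow
// the same pattern. Signatures are illustrative.
type Embedder interface {
    Embed(ctx context.Context, text string) ([]float32, error)
}

// embedderClient stands in for a generated gRPC client stub for the plugin's
// service definition.
type embedderClient interface {
    Embed(ctx context.Context, text string) ([]float32, error)
}

// grpcEmbedder adapts the out-of-process plugin (Go, Python, or any language
// that speaks the gRPC contract) to the in-process Embedder interface, so the
// rest of the system cannot tell a plugin from a built-in component.
type grpcEmbedder struct {
    client embedderClient
}

func (g *grpcEmbedder) Embed(ctx context.Context, text string) ([]float32, error) {
    return g.client.Embed(ctx, text)
}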
Configuration System¶
YAML-based declarative configuration.
Configuration Flow¶
1. Load YAML file
2. Parse and validate
3. Substitute environment variables
4. Create components
5. Register agents
6. Start server
Configuration Validation¶
- Schema validation
- Reference checking (LLM, database, embedder)
- Value range validation
- Default application
Key Files:
- pkg/config/*.go
- pkg/config/validation_test.go
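A minimal version of that flow, using gopkg.in/yaml.v3 and os.ExpandEnv, might look like the sketch below. The schema, field names, and the exact ordering of substitution and validation are assumptions for illustration, not Hector's config package.
package sketch

import (
    "fmt"
    "os"

    "gopkg.in/yaml.v3"
)

// config is a deliberately tiny stand-in for the real schema.
type config struct {
    Agents map[string]struct {
        LLM string `yaml:"llm"`
    } `yaml:"agents"`
    LLMs map[string]struct {
        Type string `yaml:"type"`
    } `yaml:"llms"`
}

// loadConfig follows the flow above: read the file, substitute ${ENV_VAR}
// references, parse the YAML, then check that references resolve.
func loadConfig(path string) (*config, error) {
    raw, err := os.ReadFile(path)
    if err != nil {
        return nil, err
    }
    expanded := os.ExpandEnv(string(raw)) // environment variable substitution

    var cfg config
    if err := yaml.Unmarshal([]byte(expanded), &cfg); err != nil {
        return nil, err
    }

    // Reference checking: every agent must point at a defined LLM.
    for name, agent := range cfg.Agents {
        if _, ok := cfg.LLMs[agent.LLM]; !ok {
            return nil, fmt.Errorf("agent %q references unknown llm %q", name, agent.LLM)
        }
    }
    return &cfg, nil
}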
Memory Architecture¶
Working Memory¶
┌────────────────────────────────────┐
│ Conversation History │
│ ┌──────────────────────────────┐ │
│ │ Recent Messages (Buffer) │ │
│ │ • Last N messages │ │
│ │ • Fixed size window │ │
│ └──────────────────────────────┘ │
│ ┌──────────────────────────────┐ │
│ │ Compressed Summary │ │
│ │ • Token budget management │ │
│ │ • Automatic summarization │ │
│ └──────────────────────────────┘ │
└────────────────────────────────────┘
Long-Term Memory¶
┌────────────────────────────────────┐
│ Vector Database (Qdrant) │
│ ┌──────────────────────────────┐ │
│ │ Session Memories │ │
│ │ • Scoped to conversation │ │
│ │ • Semantic search │ │
│ │ • Auto-recall │ │
│ └──────────────────────────────┘ │
│ ┌──────────────────────────────┐ │
│ │ User Memories │ │
│ │ • Across sessions │ │
│ │ • Persistent knowledge │ │
│ └──────────────────────────────┘ │
└────────────────────────────────────┘
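Session and user scoping can be pictured as filters over the same vector store: auto-recall first looks for memories tied to the current session, then falls back to the user's cross-session memories. The sketch below illustrates that with hypothetical types, not the code in pkg/memory.
package sketch

import "context"

// memoryRecord and memoryStore stand in for the vector database entries and
// client from the diagram; the scope fields model the two tiers above.
type memoryRecord struct {
    SessionID string // empty for user-level memories that span sessions
    UserID    string
    Text      string
}

type memoryStore interface {
    Store(ctx context.Context, rec memoryRecord) error
    Search(ctx context.Context, query string, filter memoryRecord, limit int) ([]memoryRecord, error)
}

// recall sketches auto-recall: session-scoped memories first, then the user's
// persistent cross-session memories.
func recall(ctx context.Context, store memoryStore, userID, sessionID, query string) ([]memoryRecord, error) {
    hits, err := store.Search(ctx, query, memoryRecord{UserID: userID, SessionID: sessionID}, 5)
    if err != nil || len(hits) > 0 {
        return hits, err
    }
    return store.Search(ctx, query, memoryRecord{UserID: userID}, 5)
}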
Performance Characteristics¶
Throughput¶
Operation | Latency | Throughput |
---|---|---|
Message (no tools) | ~200ms | ~50 req/s |
Message (with tools) | ~1-2s | ~10 req/s |
Streaming | ~50ms TTFT (time to first token) | Real-time |
RAG search | ~50-100ms | ~100 req/s |
Scalability¶
Horizontal Scaling:
- Stateless agent processing
- Shared vector database
- Session storage backends (SQL, Redis)
Vertical Scaling:
- Concurrent request handling
- Connection pooling
- Efficient memory management
Security Architecture¶
Authentication¶
┌─────────────────────────────────────┐
│ Client Request │
│ (with JWT token) │
└──────────────┬──────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ JWT Validation Middleware │
│ • Verify signature (JWKS) │
│ • Check issuer & audience │
│ • Validate expiration │
└──────────────┬──────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Agent-Level Authorization │
│ • Check required schemes │
│ • Validate credentials │
└──────────────┬──────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Agent Execution │
└─────────────────────────────────────┘
Key Files:
- pkg/auth/*.go
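As an illustration of the middleware stage, the sketch below applies the same checks with github.com/golang-jwt/jwt/v5; the key lookup is left as a stub where a real deployment would resolve signing keys from the configured JWKS endpoint. The wiring is an assumption, not Hector's auth code.
package sketch

import (
    "net/http"
    "strings"

    "github.com/golang-jwt/jwt/v5"
)

// jwtMiddleware applies the checks from the diagram: verify the signature,
// check issuer and audience, and reject expired tokens. lookupKey is a stub
// for JWKS-based key resolution.
func jwtMiddleware(next http.Handler, issuer, audience string, lookupKey jwt.Keyfunc) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        raw := strings.TrimPrefix(r.Header.Get("Authorization"), "Bearer ")
        token, err := jwt.Parse(raw, lookupKey,
            jwt.WithIssuer(issuer),                  // check issuer
            jwt.WithAudience(audience),              // check audience
            jwt.WithExpirationRequired(),            // require exp; expired tokens are rejected
            jwt.WithValidMethods([]string{"RS256"}), // pin the signing algorithm
        )
        if err != nil || !token.Valid {
            http.Error(w, "unauthorized", http.StatusUnauthorized)
            return
        }
        next.ServeHTTP(w, r) // agent-level authorization and execution follow
    })
}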
Extensibility¶
Hector provides multiple extension points:
1. Configuration Extension¶
# Add custom LLM
llms:
  custom:
    type: "plugin:my-llm"
    # Your config
2. MCP Tools¶
# Add external tools
tools:
  custom_tools:
    type: "mcp"
    server_url: "http://localhost:3000"
3. gRPC Plugins¶
// Implement the LLMProvider interface
type MyLLM struct{}

func (m *MyLLM) Generate(ctx context.Context, req *pb.GenerateRequest) (*pb.GenerateResponse, error) {
    // Your implementation
    return &pb.GenerateResponse{}, nil
}
4. Custom Reasoning¶
// Implement the ReasoningEngine interface
type MyStrategy struct{}

func (s *MyStrategy) Execute(ctx context.Context, input *ReasoningInput) (*ReasoningOutput, error) {
    // Your strategy
    return &ReasoningOutput{}, nil
}
Design Principles¶
1. Declarative First¶
Configuration over code. Agents defined in YAML, no programming required.
2. Protocol Native¶
Built entirely on the A2A protocol: no abstraction layers, maximum interoperability.
3. Production Ready¶
Enterprise features built-in: auth, streaming, tasks, monitoring.
4. Extensible¶
Multiple extension points: plugins, MCP, custom components.
5. Zero Dependencies¶
Single-binary deployment. No external services are required unless a feature calls for one (for example, a vector database for long-term memory).
Deployment Modes¶
Local Mode¶
hector call assistant "Hello"
- In-process execution
- No network overhead
- Fast startup
Server Mode¶
hector serve --config config.yaml
- Multi-client support
- Persistent state
- Production ready
Client Mode¶
hector call assistant "Hello" --server http://remote:8080
- Connect to remote servers
- Distributed agents
- Team collaboration
Next Steps¶
- Configuration Reference - Complete config options
- API Reference - REST/gRPC APIs
- A2A Protocol - Protocol details
- Deploy to Production - Production setup
Related Topics¶
- Core Concepts - Understanding components
- Multi-Agent - Orchestration architecture
- Security - Authentication architecture