Memory & Sessions¶
Hector's memory system manages conversation history, state, and semantic search across sessions.
Architecture¶
Hector uses a two-layer design:
Session Service (SOURCE OF TRUTH)
│
├─ Messages (full conversation history)
├─ State (key-value store)
└─ Artifacts (files, images)
Index Service (SEARCH INDEX)
│
└─ Built from session events
   (can be rebuilt at any time)
Key principles:
- The session service is the single source of truth
- The index service is a derived search index
- The index can be rebuilt from sessions at any time
- No data is lost if the index is corrupted
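The split can be pictured as two narrow interfaces. The Go sketch below is illustrative only; the type and method names (SessionService, IndexService, RebuildIndex, Event) are assumptions, not Hector's actual API:

package memory

import "context"

// Event stands in for one stored session event (message, state change, artifact).
type Event struct {
    SessionID string
    Role      string
    Content   string
}

// Illustrative interfaces for the two layers; names are assumptions.
type SessionService interface {
    AppendEvent(ctx context.Context, sessionID string, e Event) error // source of truth
    ListEvents(ctx context.Context, sessionID string) ([]Event, error)
}

type IndexService interface {
    IndexEvent(ctx context.Context, e Event) error // derived search index
}

// RebuildIndex shows why the index is disposable: sessions hold the full
// history, so the index can always be regenerated from them.
func RebuildIndex(ctx context.Context, s SessionService, idx IndexService, sessionID string) error {
    events, err := s.ListEvents(ctx, sessionID)
    if err != nil {
        return err
    }
    for _, e := range events {
        if err := idx.IndexEvent(ctx, e); err != nil {
            return err
        }
    }
    return nil
}

Because RebuildIndex only reads from the session service, dropping and regenerating the index is always safe.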
Session Configuration¶
In-Memory (Development)¶
storage:
  sessions:
    backend: inmemory
Data is lost on restart. Use for development only.
SQL (Production)¶
storage:
  sessions:
    backend: sql
    database: main
Persistent storage across restarts.
Supported databases:
- SQLite (embedded)
- PostgreSQL (production)
- MySQL (alternative)
Complete Example:
databases:
  main:
    driver: postgres
    host: localhost
    port: 5432
    database: hector
    user: ${DB_USER}
    password: ${DB_PASSWORD}

storage:
  sessions:
    backend: sql
    database: main
State Management¶
Sessions include a scoped key-value store for persisting data.
State Scopes¶
| Prefix | Scope | Lifecycle |
|---|---|---|
| (none) | Session | Persists within session only |
| user: | User | Shared across user's sessions |
| app: | Application | Global across all users |
| temp: | Temporary | Auto-cleared after invocation |
Usage Examples¶
Session-level state (default):
Store data specific to the current conversation:
// In a tool callback
session.SetState(ctx, "last_search", "deployment kubernetes")
session.SetState(ctx, "current_topic", "infrastructure")
// Retrieve later in the same session
topic, _ := session.GetState(ctx, "current_topic")
User-level state:
Store preferences that persist across all user's sessions:
// Prefixed with "user:" for user-scoped storage
session.SetState(ctx, "user:theme", "dark")
session.SetState(ctx, "user:language", "en")
session.SetState(ctx, "user:preferences", `{"notifications": true}`)
// Available in all sessions for this user
theme, _ := session.GetState(ctx, "user:theme")
App-level state:
Global data shared across all users:
// Prefixed with "app:" for application-wide storage
session.SetState(ctx, "app:version", "2.0.0")
session.SetState(ctx, "app:announcement", "Maintenance at 2 AM")
// Readable by any session
version, _ := session.GetState(ctx, "app:version")
Temporary state:
Data automatically cleared after each agent invocation:
// Prefixed with "temp:" for auto-clearing
session.SetState(ctx, "temp:processing", "true")
session.SetState(ctx, "temp:intermediate_result", partialData)
// Cleared automatically after invocation completes
Accessing state in instructions:
agents:
  assistant:
    instruction: |
      User's preferred language: {user:language}
      Current topic: {current_topic?}
      App version: {app:version}
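The {key} and {key?} placeholders above suggest simple template substitution against the state store. The sketch below shows one plausible way such placeholders could be resolved; treating a trailing "?" as "optional, render empty when missing" is an assumption, not a statement of Hector's exact semantics:

package instruction

import (
    "fmt"
    "regexp"
    "strings"
)

var placeholder = regexp.MustCompile(`\{([A-Za-z0-9_:.-]+?)(\?)?\}`)

// Render substitutes {key} placeholders with values from state.
// Missing optional keys ({key?}) render as empty strings; missing
// required keys produce an error.
func Render(template string, state map[string]string) (string, error) {
    var missing []string
    out := placeholder.ReplaceAllStringFunc(template, func(m string) string {
        parts := placeholder.FindStringSubmatch(m)
        key, optional := parts[1], parts[2] == "?"
        if v, ok := state[key]; ok {
            return v
        }
        if optional {
            return "" // optional placeholder: blank when absent
        }
        missing = append(missing, key)
        return m
    })
    if len(missing) > 0 {
        return "", fmt.Errorf("missing state keys: %s", strings.Join(missing, ", "))
    }
    return out, nil
}

With state containing user:language set to "en" and no current_topic, Render would fill in the language and leave the optional placeholder blank.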
Working Memory¶
Working memory manages the LLM context window by filtering conversation history.
No Strategy (Default)¶
Include entire conversation history:
agents:
  assistant:
    context:
      strategy: none  # Default
Best for: Short conversations where full context is needed.
Buffer Window¶
Keep only the last N messages:
agents:
  assistant:
    context:
      strategy: buffer_window
      window_size: 20  # Keep last 20 messages
Best for: Medium-length conversations, simple use cases.
Token Window¶
Keep messages within a token budget:
agents:
  assistant:
    context:
      strategy: token_window
      budget: 8000        # Max tokens
      preserve_recent: 5  # Always keep last 5 messages
Best for: Precise context control, cost optimization.
Summary Buffer¶
Summarize old messages when exceeding budget:
agents:
  assistant:
    context:
      strategy: summary_buffer
      budget: 8000          # Token budget
      threshold: 0.85       # Summarize at 85% usage
      target: 0.7           # Reduce to 70% after summarizing
      summarizer_llm: fast  # Use cheaper model for summarization
Best for: Long conversations where context matters.
How it works (sketched in code below):
1. The conversation exceeds the threshold (85% of budget)
2. Old messages are summarized using the summarizer LLM
3. The summary replaces the old messages
4. Context is reduced to the target level (70%)
5. Recent messages are preserved in full
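A compact Go sketch of that flow; the 5-message preserve window and the helper names (countTokens, summarizeWithLLM) are assumptions standing in for Hector's internals:

package workingmemory

import "context"

// Message stands in for one conversation turn.
type Message struct {
    Role    string
    Content string
}

const preserveRecent = 5 // assumed: always keep the last N messages verbatim

// Rough stand-in: ~4 characters per token.
func countTokens(msgs []Message) int {
    total := 0
    for _, m := range msgs {
        total += len(m.Content) / 4
    }
    return total
}

// A real implementation would call the configured summarizer_llm here.
func summarizeWithLLM(ctx context.Context, old []Message, maxTokens int) (Message, error) {
    return Message{Role: "system", Content: "Summary of earlier conversation..."}, nil
}

// maybeSummarize mirrors the steps above: do nothing below the threshold,
// otherwise replace old messages with a summary sized toward target*budget.
func maybeSummarize(ctx context.Context, msgs []Message, budget int, threshold, target float64) ([]Message, error) {
    if float64(countTokens(msgs)) < threshold*float64(budget) {
        return msgs, nil // e.g. under 85% of an 8000-token budget
    }
    cut := len(msgs) - preserveRecent
    if cut < 1 {
        return msgs, nil // too short to summarize
    }
    summary, err := summarizeWithLLM(ctx, msgs[:cut], int(target*float64(budget)))
    if err != nil {
        return nil, err
    }
    // The summary replaces the old messages; recent ones stay in full.
    return append([]Message{summary}, msgs[cut:]...), nil
}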
Memory Index¶
A searchable index over conversation history enables cross-session recall.
Keyword Index¶
Simple word matching, no embeddings required:
storage:
  memory:
    backend: keyword
Vector Index¶
Semantic similarity using embeddings:
storage:
  memory:
    backend: vector
    embedder: default
Requires an embedder configuration:
embedders:
  default:
    provider: openai
    model: text-embedding-3-small
    api_key: ${OPENAI_API_KEY}

storage:
  memory:
    backend: vector
    embedder: default
Vector Index with Persistence¶
For production, persist vectors to disk:
storage:
  memory:
    backend: vector
    embedder: default
    vector_provider:
      type: chromem  # Embedded vector store
      chromem:
        persist_path: .hector/memory_vectors
        compress: true  # Gzip compression
This ensures vectors survive restarts without re-embedding.
Session Management¶
Session APIs¶
List sessions:
curl "http://localhost:8080/sessions?user_id=user123"
Response:
{
  "sessions": [
    {"id": "sess_abc", "agent": "assistant", "updated_at": "2025-01-15T10:00:00Z"},
    {"id": "sess_def", "agent": "assistant", "updated_at": "2025-01-14T15:30:00Z"}
  ]
}
Delete a session:
curl -X DELETE "http://localhost:8080/sessions/sess_abc"
Get session details:
curl "http://localhost:8080/sessions/sess_abc"
Cross-Session Memory¶
Enable agents to recall information from past conversations.
Search Tool¶
agents:
  assistant:
    llm: gpt-4o
    search:
      enabled: true
      max_results: 5
With search enabled, the agent can access past conversations:
User: "What did I ask about yesterday?"
Agent searches past conversations
→ Finds relevant history
→ Uses context to answer
Artifacts¶
Sessions can store generated files.
Artifact Storage¶
Artifacts are binary files (images, documents, etc.) that persist with the session:
- Images generated by the agent
- Documents created during conversation
- Any binary data the agent produces
Usage¶
Agents can save and retrieve artifacts during execution. Artifacts are stored in the session and persist with it.
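The artifact API itself is not shown on this page, so the following is a purely hypothetical sketch of what saving and re-reading an artifact from a tool callback could look like; ArtifactStore, SaveArtifact, and GetArtifact are invented names for illustration:

package artifacts

import "context"

// Hypothetical only: these names are not Hector's documented API.
type ArtifactStore interface {
    SaveArtifact(ctx context.Context, name, mimeType string, data []byte) error
    GetArtifact(ctx context.Context, name string) ([]byte, error)
}

// In a tool callback: persist a generated image with the session.
func saveChart(ctx context.Context, session ArtifactStore, png []byte) error {
    return session.SaveArtifact(ctx, "report_chart.png", "image/png", png)
}

// In a later turn of the same session: read it back.
func loadChart(ctx context.Context, session ArtifactStore) ([]byte, error) {
    return session.GetArtifact(ctx, "report_chart.png")
}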
Session Lifecycle¶
Session Creation¶
Sessions are created automatically when a user starts a conversation:
1. User sends the first message
2. A session is created with a unique ID
3. State is initialized (empty or with defaults)
4. Event collection starts
During Conversation¶
Each invocation follows this flow (outlined in code below):
1. Load the session
2. Filter events (working memory strategy)
3. Run the agent
4. Save new events
5. Index events (for search)
6. Summarize if needed
7. Clear temporary state
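The same flow as a self-contained Go outline; every hook below is an illustrative stand-in for a Hector component, not its real interface:

package lifecycle

import "context"

// Event stands in for one stored conversation event.
type Event struct{ Role, Content string }

// Hooks bundles the steps of one invocation; each field is an assumption.
type Hooks struct {
    Load         func(ctx context.Context, sessionID string) ([]Event, error)
    Filter       func(history []Event) []Event // working-memory strategy
    Run          func(ctx context.Context, window []Event, input Event) ([]Event, error)
    Save         func(ctx context.Context, sessionID string, events []Event) error
    Index        func(ctx context.Context, events []Event) error
    NeedsSummary func(history []Event) bool
    Summarize    func(ctx context.Context, sessionID string) error
    ClearTemp    func(ctx context.Context, sessionID string) error
}

// RunInvocation walks the seven steps listed above in order.
func RunInvocation(ctx context.Context, h Hooks, sessionID string, input Event) error {
    history, err := h.Load(ctx, sessionID) // 1. load session
    if err != nil {
        return err
    }
    window := h.Filter(history)                 // 2. filter events
    newEvents, err := h.Run(ctx, window, input) // 3. run agent
    if err != nil {
        return err
    }
    if err := h.Save(ctx, sessionID, newEvents); err != nil { // 4. save new events
        return err
    }
    if err := h.Index(ctx, newEvents); err != nil { // 5. index for search
        return err
    }
    if h.NeedsSummary(append(history, newEvents...)) { // 6. summarize if needed
        if err := h.Summarize(ctx, sessionID); err != nil {
            return err
        }
    }
    return h.ClearTemp(ctx, sessionID) // 7. clear temp: state
}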
Session Cleanup¶
Sessions can be deleted when no longer needed:
- Manually via API
- Automatically after expiration (configurable)
- On user request
Best Practices¶
Choosing Working Memory Strategy¶
Short conversations (customer support, Q&A):
context:
  strategy: buffer_window
  window_size: 20
Long conversations (research, analysis):
context:
  strategy: summary_buffer
  budget: 8000
  threshold: 0.85
  target: 0.7
Cost-sensitive applications:
context:
  strategy: token_window
  budget: 4000
  preserve_recent: 3
State Scoping¶
Use appropriate prefixes for data:
- No prefix: Session-specific data (current topic, last action)
- user: prefix: User preferences (theme, language, settings)
- app: prefix: Global configuration (version, announcements)
- temp: prefix: Processing data (intermediate results)
Index Configuration¶
Development: Use keyword index (simpler, no embeddings needed)
storage:
  memory:
    backend: keyword
Production: Use vector index for semantic search
storage:
  memory:
    backend: vector
    embedder: default
Database Selection¶
| Use Case | Database | Configuration |
|---|---|---|
| Development | In-memory | backend: inmemory |
| Single server | SQLite | driver: sqlite |
| Production | PostgreSQL | driver: postgres |
| Alternative | MySQL | driver: mysql |
Examples¶
Basic Session Persistence¶
databases:
  main:
    driver: sqlite
    path: .hector/hector.db

storage:
  sessions:
    backend: sql
    database: main
Production Configuration¶
databases:
  main:
    driver: postgres
    host: ${DB_HOST}
    port: 5432
    database: hector
    user: ${DB_USER}
    password: ${DB_PASSWORD}

embedders:
  default:
    provider: openai
    model: text-embedding-3-small
    api_key: ${OPENAI_API_KEY}
storage:
  sessions:
    backend: sql
    database: main
  memory:
    backend: vector
    embedder: default
agents:
  assistant:
    llm: default
    context:
      strategy: summary_buffer
      budget: 8000
      threshold: 0.85
      target: 0.7
Multi-Agent with Shared Memory¶
agents:
  coordinator:
    llm: default
    sub_agents: [researcher, writer]
    context:
      strategy: buffer_window
      window_size: 30

  researcher:
    llm: default
    tools: [search]
    context:
      strategy: buffer_window
      window_size: 20

  writer:
    llm: default
    tools: [text_editor]
    context:
      strategy: buffer_window
      window_size: 20
All agents share the same session when working together.