Sessions & Streaming¶
Sessions provide context continuity across multiple interactions, while streaming delivers responses in real-time as the agent generates them.
Sessions¶
What are Sessions?¶
A session represents a continuous conversation between a user and an agent. Sessions enable:
- Context preservation - Agent remembers previous messages
- Memory persistence - Long-term memory scoped to sessions
- Conversation tracking - Monitor ongoing interactions
- Multi-turn conversations - Natural back-and-forth dialogue
Session Lifecycle¶
1. Session Created
├─ Unique session ID generated
├─ Memory initialized
└─ Context store created
2. Conversations Happen
├─ Messages exchanged
├─ Working memory updated
└─ Long-term memories stored
3. Session Ends
├─ Final memories stored
├─ Session marked complete
└─ Resources cleaned up
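The lifecycle above can be sketched as a minimal in-memory store. This is purely illustrative (the `SessionStore` class and its method names are hypothetical; the real server manages sessions for you), but it shows the same create → converse → end flow:

```python
import uuid

class SessionStore:
    """Hypothetical in-memory sketch of the session lifecycle."""

    def __init__(self):
        self._sessions = {}

    def create(self) -> str:
        # 1. Session created: unique ID generated, context store initialized
        session_id = f"sess_{uuid.uuid4().hex[:6]}"
        self._sessions[session_id] = []
        return session_id

    def add_message(self, session_id: str, role: str, text: str) -> None:
        # 2. Conversations happen: messages exchanged, working memory updated
        self._sessions[session_id].append((role, text))

    def history(self, session_id: str):
        return list(self._sessions[session_id])

    def end(self, session_id: str) -> None:
        # 3. Session ends: resources cleaned up
        del self._sessions[session_id]
```

A production implementation would also persist long-term memories before cleanup; this sketch only models the context store.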
Using Sessions¶
REST API¶
Start a session:
curl -X POST http://localhost:8080/agents/assistant/sessions \
-H "Content-Type: application/json"
Response:
{
"session_id": "sess_abc123"
}
Send message in session:
curl -X POST http://localhost:8080/agents/assistant/sessions/sess_abc123/messages \
-H "Content-Type: application/json" \
-d '{
"message": "Hello, remember me?"
}'
Continue conversation:
curl -X POST http://localhost:8080/agents/assistant/sessions/sess_abc123/messages \
-H "Content-Type: application/json" \
-d '{
"message": "What did we talk about earlier?"
}'
Agent remembers previous context!
gRPC API¶
// Create session
rpc CreateSession(CreateSessionRequest) returns (Session)
// Send message
rpc SendMessage(SendMessageRequest) returns (MessageResponse)
// Stream messages
rpc StreamMessage(SendMessageRequest) returns (stream MessageChunk)
CLI¶
# Interactive chat (automatic session)
hector chat assistant
# Specify session ID
hector chat assistant --session sess_abc123
Session Configuration¶
agents:
assistant:
memory:
working:
strategy: "summary_buffer"
budget: 4000
longterm:
storage_scope: "session" # Session-scoped long-term memory
Storage scopes:
- session - Memories per session (most common)
- conversational - Memories across all user sessions
- all - Global memory across all users
- summaries_only - Only summarized content
Session Management¶
List Sessions¶
GET /agents/{agent}/sessions
Get Session Info¶
GET /agents/{agent}/sessions/{session_id}
Delete Session¶
DELETE /agents/{agent}/sessions/{session_id}
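The endpoints above follow a simple pattern. As a sketch, a small Python helper can build them (the `session_url` function and `BASE` constant are hypothetical; adjust the base URL to your deployment):

```python
BASE = "http://localhost:8080"

def session_url(agent: str, session_id=None) -> str:
    """Build the sessions endpoint for an agent, optionally for one session."""
    url = f"{BASE}/agents/{agent}/sessions"
    return f"{url}/{session_id}" if session_id else url

# POST   session_url("assistant")                 -> create a session
# GET    session_url("assistant")                 -> list sessions
# GET    session_url("assistant", "sess_abc123")  -> get session info
# DELETE session_url("assistant", "sess_abc123")  -> delete session
```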
Streaming¶
What is Streaming?¶
Streaming delivers agent responses token-by-token as they're generated, instead of waiting for the complete response.
Without streaming:
[User waits...]
[User waits...]
[Complete response arrives]
With streaming:
The capital
The capital of
The capital of France
The capital of France is
The capital of France is Paris.
Benefits¶
- Real-time feedback - Users see progress immediately
- Better UX - Feels more interactive
- Early cancellation - Stop generation if it heads in the wrong direction
- Perceived speed - Feels faster even when the total time is the same
Enabling Streaming¶
Configuration¶
agents:
assistant:
reasoning:
enable_streaming: true # Enable streaming
REST API (SSE - Server-Sent Events)¶
curl -N http://localhost:8080/agents/assistant/messages/stream \
-H "Content-Type: application/json" \
-d '{
"message": "Explain quantum computing"
}'
Response (SSE format):
data: {"chunk": "Quantum"}
data: {"chunk": " computing"}
data: {"chunk": " uses"}
data: {"chunk": " quantum"}
data: {"chunk": " mechanics..."}
data: [DONE]
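On the client side, consuming this stream means reading `data:` lines, decoding each JSON payload, and stopping at the `[DONE]` sentinel. A minimal Python sketch (the `parse_sse` function is illustrative; a real client would read lines from the HTTP response as they arrive):

```python
import json

def parse_sse(lines):
    """Assemble streamed chunks from SSE 'data:' lines until [DONE]."""
    text = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        event = json.loads(payload)
        if "chunk" in event:
            text.append(event["chunk"])
    return "".join(text)
```

Concatenating the chunks from the example above yields the full response text.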
WebSocket¶
const ws = new WebSocket('ws://localhost:8080/agents/assistant/stream');
ws.send(JSON.stringify({
message: "Explain quantum computing"
}));
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
if (data.chunk) {
process.stdout.write(data.chunk);
}
};
gRPC Streaming¶
stream, err := client.StreamMessage(ctx, &pb.SendMessageRequest{
    Agent:   "assistant",
    Message: "Explain quantum computing",
})
if err != nil {
    log.Fatal(err)
}
for {
    chunk, err := stream.Recv()
    if err == io.EOF {
        break // stream finished
    }
    if err != nil {
        log.Fatal(err)
    }
    fmt.Print(chunk.Content)
}
CLI (Automatic)¶
# Streaming enabled by default in CLI
hector call assistant "Explain quantum computing"
# Response streams as it's generated
Streaming with Tools¶
When agents use tools, streaming shows progress:
agents:
coder:
reasoning:
enable_streaming: true
show_tool_execution: true
tools: ["write_file", "execute_command"]
Streamed output:
Let me create that file...
[Tool: write_file("hello.py", "print('Hello')")]
File created successfully.
Now let me test it...
[Tool: execute_command("python hello.py")]
Output: Hello
The program works correctly!
Sessions + Streaming¶
Combine both for the best experience:
agents:
assistant:
reasoning:
enable_streaming: true
memory:
working:
strategy: "summary_buffer"
budget: 4000
longterm:
storage_scope: "session"
REST API:
# Create session
SESSION_ID=$(curl -X POST http://localhost:8080/agents/assistant/sessions | jq -r '.session_id')
# Stream messages in session
curl -N http://localhost:8080/agents/assistant/sessions/$SESSION_ID/messages/stream \
-H "Content-Type: application/json" \
-d '{"message": "Hello"}'
Agent streams responses and maintains session context!
Advanced Configuration¶
Session Timeout¶
# Coming soon
sessions:
timeout: "30m" # Session expires after 30 minutes of inactivity
max_duration: "24h" # Maximum session duration
Streaming Options¶
agents:
assistant:
reasoning:
enable_streaming: true
show_tool_execution: true # Show tool calls in stream
show_thinking: false # Show internal reasoning
show_debug_info: false # Show debug details
Streaming Customization¶
agents:
custom:
streaming:
chunk_size: 10 # Characters per chunk
delay_ms: 50 # Delay between chunks
buffer_size: 1024 # Buffer size
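To make the `chunk_size` setting concrete, here is a small Python sketch of fixed-size chunking (illustrative only; the `chunk_text` function is hypothetical and the server performs this internally):

```python
def chunk_text(text: str, chunk_size: int = 10):
    """Split text into fixed-size chunks, as a chunk_size setting implies."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```

With `chunk_size: 10`, a 31-character response would be delivered as four chunks, the last one shorter than the rest.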
Use Cases¶
Chat Applications¶
agents:
chatbot:
reasoning:
enable_streaming: true
memory:
working:
strategy: "buffer_window"
window_size: 20
longterm:
storage_scope: "session"
Frontend:
const eventSource = new EventSource(
`http://localhost:8080/agents/chatbot/sessions/${sessionId}/messages/stream`
);
eventSource.onmessage = (event) => {
const data = JSON.parse(event.data);
appendToChat(data.chunk);
};
Customer Support¶
agents:
support:
reasoning:
enable_streaming: true
show_tool_execution: true
tools: ["search", "agent_call"]
memory:
longterm:
storage_scope: "conversational" # Remember across sessions
Code Assistants¶
agents:
coder:
reasoning:
enable_streaming: true
show_tool_execution: true
show_thinking: true # Show reasoning process
tools: ["write_file", "execute_command", "search"]
Monitoring & Debugging¶
Session Tracking¶
Enable debug logging:
agents:
debug:
reasoning:
show_debug_info: true
Output shows:
[Session: sess_abc123]
[Message: 1]
[Memory: 523 tokens]
[Response: ...]
Streaming Debug¶
Monitor stream chunks:
curl -N http://localhost:8080/agents/assistant/messages/stream?debug=true \
-H "Content-Type: application/json" \
-d '{"message": "Hello"}'
API Reference¶
REST Endpoints¶
Endpoint | Method | Description
---|---|---
/agents/{agent}/sessions | POST | Create session
/agents/{agent}/sessions | GET | List sessions
/agents/{agent}/sessions/{id} | GET | Get session info
/agents/{agent}/sessions/{id} | DELETE | Delete session
/agents/{agent}/sessions/{id}/messages | POST | Send message
/agents/{agent}/sessions/{id}/messages/stream | POST | Stream message (SSE)
/agents/{agent}/stream | WS | WebSocket streaming
gRPC Methods¶
service A2AService {
rpc CreateSession(CreateSessionRequest) returns (Session);
rpc SendMessage(SendMessageRequest) returns (MessageResponse);
rpc StreamMessage(SendMessageRequest) returns (stream MessageChunk);
rpc ListSessions(ListSessionsRequest) returns (ListSessionsResponse);
rpc DeleteSession(DeleteSessionRequest) returns (DeleteSessionResponse);
}
See API Reference for full details.
Best Practices¶
Session Management¶
# ✅ Good: Session-scoped memory
agents:
support:
memory:
longterm:
storage_scope: "session"
# ⚠️ Caution: Global memory (memory grows indefinitely)
agents:
risky:
memory:
longterm:
storage_scope: "all"
Streaming Performance¶
# ✅ Good: Streaming with reasonable chunk size
agents:
fast:
reasoning:
enable_streaming: true
# ❌ Bad: Streaming disabled for interactive apps
agents:
slow:
reasoning:
enable_streaming: false # Users wait for complete response
Error Handling¶
// ✅ Good: Handle stream errors
const eventSource = new EventSource(url);
eventSource.onerror = (error) => {
console.error('Stream error:', error);
eventSource.close();
};
// ❌ Bad: No error handling
const eventSource = new EventSource(url);
eventSource.onmessage = (event) => { /* ... */ };
Next Steps¶
- API Reference - Complete API documentation
- Memory - Session-scoped memory configuration
- Security - Session authentication
- Architecture - How sessions work internally
Related Topics¶
- Agent Overview - Understanding agents
- Configuration Reference - All session options
- Build a Chat Application - Complete tutorial