Observability

Monitor your Hector deployment with structured logs, Prometheus metrics, and OpenTelemetry tracing.

Logging

Hector uses structured logging for production observability.

Configuration

hector serve \
  --log-level info \
  --log-format json \
  --log-file /var/log/hector.log
| Flag | Env Variable | Default | Options |
|---|---|---|---|
| --log-level | HECTOR_LOG_LEVEL | info | debug, info, warn, error |
| --log-format | HECTOR_LOG_FORMAT | text | text, json |
| --log-file | HECTOR_LOG_FILE | stdout | File path |

JSON Log Format

Production-ready for log aggregators (Elasticsearch, Loki, Splunk):

{
  "time": "2026-01-20T18:00:00Z",
  "level": "INFO",
  "msg": "Agent invocation completed",
  "agent": "assistant",
  "session_id": "sess_123",
  "duration_ms": 1250,
  "tokens_used": 450
}
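
Because each line is a standalone JSON object, logs can be post-processed with any JSON tooling. A minimal Python sketch that parses the sample record above and flags slow invocations (the one-second threshold is an arbitrary example, not a Hector default):

```python
import json

# One log line in Hector's JSON format (the sample record from above).
line = ('{"time": "2026-01-20T18:00:00Z", "level": "INFO", '
        '"msg": "Agent invocation completed", "agent": "assistant", '
        '"session_id": "sess_123", "duration_ms": 1250, "tokens_used": 450}')

entry = json.loads(line)

# Flag invocations slower than one second (threshold is an arbitrary example).
if entry.get("duration_ms", 0) > 1000:
    print(f"slow: agent={entry['agent']} duration={entry['duration_ms']}ms")
```

The same pattern works line-by-line over a log file or a Loki/Elasticsearch export.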

Metrics (Prometheus)

Enable the /metrics endpoint:

hector serve --metrics

Available Metrics

| Metric | Type | Description |
|---|---|---|
| hector_requests_total | Counter | Total HTTP requests |
| hector_request_duration_seconds | Histogram | Request latency |
| hector_agent_invocations_total | Counter | Agent executions |
| hector_llm_tokens_total | Counter | Token usage (prompt/completion) |
| hector_tool_calls_total | Counter | Tool invocations |
| hector_scheduler_triggers_total | Counter | Scheduled trigger firings |
| hector_notifications_total | Counter | Outbound webhook notifications |
| hector_guardrail_violations_total | Counter | Blocked inputs/outputs |

Labels

Common labels across metrics:

| Label | Description |
|---|---|
| app | App/tenant name |
| agent | Agent name |
| status | success, error |
| tool | Tool name (for tool metrics) |
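
These labels make it easy to slice any metric in PromQL. For example, a per-agent error ratio (assuming, per the tables above, that the agent and status labels are present on hector_requests_total):

```
sum by (agent) (rate(hector_requests_total{status="error"}[5m]))
  /
sum by (agent) (rate(hector_requests_total[5m]))
```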

Prometheus Scrape Config

scrape_configs:
  - job_name: 'hector'
    static_configs:
      - targets: ['hector:8080']
    metrics_path: /metrics

Tracing (OpenTelemetry)

Enable distributed tracing:

hector serve --tracing-endpoint "jaeger:4317"

Trace Spans

| Span | Description |
|---|---|
| http.request | Full request lifecycle |
| agent.run | Agent execution |
| llm.generate | LLM API call |
| tool.call | Tool invocation |
| vectorstore.search | RAG retrieval |
| guardrail.check | Input/output validation |

Configuration

| Flag | Env Variable | Default |
|---|---|---|
| --tracing-endpoint | HECTOR_TRACING_ENDPOINT | localhost:4317 |
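
If you route traces through an OpenTelemetry Collector instead of pointing Hector directly at a backend, a minimal collector pipeline might look like this (a sketch using the stock OTLP gRPC receiver and exporter; endpoints are illustrative):

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317   # forward to the Jaeger OTLP port
    tls:
      insecure: true        # fine for local testing only

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/jaeger]
```

Hector would then be started with --tracing-endpoint pointing at the collector rather than Jaeger.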

Jaeger Setup

docker run -d --name jaeger \
  -p 16686:16686 \
  -p 4317:4317 \
  jaegertracing/all-in-one:latest

hector serve --tracing-endpoint "localhost:4317"

Access the Jaeger UI at http://localhost:16686.

Grafana Dashboard

Build Grafana dashboards on top of your Prometheus data source by querying metrics in the hector_* namespace.

Key Panels

| Panel | Query |
|---|---|
| Request Rate | rate(hector_requests_total[5m]) |
| Agent Latency P95 | histogram_quantile(0.95, rate(hector_request_duration_seconds_bucket[5m])) |
| Token Usage | sum(rate(hector_llm_tokens_total[1h])) by (agent) |
| Error Rate | rate(hector_requests_total{status="error"}[5m]) |
| Tool Usage | topk(10, sum by (tool) (hector_tool_calls_total)) |
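
The P95 panel relies on Prometheus's histogram_quantile, which linearly interpolates inside the bucket that contains the requested quantile. A small Python sketch of that interpolation, using made-up bucket counts rather than real Hector data:

```python
# Toy cumulative bucket counts (upper bound "le" -> cumulative count),
# purely illustrative numbers, not real hector_request_duration_seconds data.
buckets = [(0.1, 40), (0.5, 80), (1.0, 95), (float("inf"), 100)]

def quantile(q, buckets):
    """Linearly interpolate within the bucket containing the q-th quantile,
    mirroring the idea behind Prometheus's histogram_quantile."""
    target = q * buckets[-1][1]          # rank of the target observation
    lower_bound, lower_count = 0.0, 0
    for le, count in buckets:
        if count >= target:
            if le == float("inf"):
                return lower_bound       # cannot interpolate into +Inf bucket
            frac = (target - lower_count) / (count - lower_count)
            return lower_bound + (le - lower_bound) * frac
        lower_bound, lower_count = le, count

print(quantile(0.95, buckets))  # falls in the (0.5, 1.0] bucket
```

This is why quantile accuracy depends on bucket boundaries: the estimate can only be as precise as the bucket it lands in.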

Alerting Examples

# High error rate
- alert: HectorHighErrorRate
  expr: rate(hector_requests_total{status="error"}[5m]) > 0.1
  for: 5m
  labels:
    severity: warning

# Token budget exhaustion
- alert: HectorHighTokenUsage
  expr: sum(increase(hector_llm_tokens_total[1h])) > 100000
  for: 1h
  labels:
    severity: warning
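
Prometheus loads alerts from rule groups, so in practice these entries live in a rules file like the following (group and file names are arbitrary):

```yaml
groups:
  - name: hector
    rules:
      - alert: HectorHighErrorRate
        expr: rate(hector_requests_total{status="error"}[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
```

Reference the file under rule_files in prometheus.yml to activate it.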

Health Endpoint

curl http://localhost:8080/health

Response:

{
  "status": "ok",
  "version": "v1.20.0",
  "database": "connected"
}
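
On Kubernetes, the same endpoint can back liveness and readiness probes; Kubernetes treats any 2xx response as healthy. A sketch (the port matches the earlier examples; adjust to your deployment):

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 5
```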

Next Steps