Quick Observability Setup

Enable monitoring and tracing in development with a single flag - no config file needed!


The --observe Flag

The --observe flag automatically configures:

  • ✅ Metrics - Prometheus metrics at /metrics endpoint
  • ✅ Tracing - OpenTelemetry traces to localhost:4317
  • ✅ Full instrumentation - Agent calls, LLM requests, tool execution

Quick Start

# 1. Start observability stack (Prometheus, Jaeger, Grafana)
docker-compose -f docker-compose.observability.yaml up -d

# 2. Start Hector with observability enabled
hector serve --observe

# 3. Make requests
curl -X POST http://localhost:8080/v1/agents/assistant/message:send \
  -H "Content-Type: application/json" \
  -d '{"message":{"role":"user","parts":[{"text":"Hello"}]}}'

# 4. View dashboards
# Metrics:  http://localhost:9090
# Traces:   http://localhost:16686
# Grafana:  http://localhost:3000 (admin/Dev12345)
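
Between steps 2 and 3 the services may need a few seconds to come up. A minimal sketch of a retry helper that could be used to wait for them; the name wait_for is hypothetical and not part of Hector:

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or the given number of attempts is used up.
wait_for() {
  attempts="$1"; shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0            # command succeeded, stop waiting
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1                # still failing after all attempts
}
```

For example, `wait_for 30 curl -sf http://localhost:8080/metrics` blocks until the metrics endpoint answers, and `wait_for 30 curl -sf http://localhost:9090/-/ready` does the same for Prometheus's readiness endpoint.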

What Gets Configured

When you use --observe, Hector automatically enables:

Global Observability

global:
  observability:
    metrics:
      enabled: true
    tracing:
      enabled: true
      exporter_type: "otlp"
      endpoint_url: "localhost:4317"
      sampling_rate: 1.0
      service_name: "hector"

Metrics Available

  • hector_agent_calls_total - Total agent calls
  • hector_agent_call_duration_seconds - Agent call latency
  • hector_agent_errors_total - Agent errors
  • hector_agent_tokens_used_total - Token usage
  • hector_llm_* - LLM-specific metrics
  • hector_tool_* - Tool execution metrics
  • hector_http_* - HTTP request metrics
  • hector_grpc_* - gRPC metrics
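
To see which of these metric families a running instance is actually exporting, the /metrics output can be grouped by metric name. A minimal sketch, assuming the endpoint serves the standard Prometheus text exposition format; the function name hector_metric_families is hypothetical:

```shell
# Hypothetical helper: reads Prometheus text exposition format on stdin and
# prints each hector_* metric name with the number of sample lines seen.
hector_metric_families() {
  awk '
    /^#/ || NF == 0 { next }                 # skip HELP/TYPE comments and blank lines
    {
      name = $1
      brace = index(name, "{")               # position of the label set, if any
      if (brace > 0) name = substr(name, 1, brace - 1)
      if (name ~ /^hector_/) count[name]++
    }
    END { for (n in count) print n, count[n] }
  ' | sort
}
```

Usage: `curl -s http://localhost:8080/metrics | hector_metric_families`. A family that is missing from the output has not recorded any samples yet.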

Traces Collected

  • 🔍 Agent calls - Full request/response lifecycle
  • 🤖 LLM requests - Model interactions with token counts
  • 🔧 Tool executions - Tool invocations with timing
  • 📊 HTTP/gRPC - Transport layer spans

Combine with Other Flags

The --observe flag can be combined with the other zero-config flags:

# With tools enabled
hector serve --observe --tools

# With custom model
hector serve --observe --model gpt-4o

# With RAG/document store
hector serve --observe --tools --docs-folder ./knowledge

# All together
hector serve --observe --tools --docs-folder ./docs --model claude-sonnet-4

Access Points

Once running with --observe:

Service      URL                              Credentials
Metrics      http://localhost:8080/metrics    -
Prometheus   http://localhost:9090            -
Jaeger       http://localhost:16686           -
Grafana      http://localhost:3000            admin/Dev12345

Grafana Dashboards

Pre-built dashboards automatically load:

  1. Hector Overview - Service health, request rates, errors
  2. HTTP/gRPC - Transport layer metrics
  3. LLM & Tools - Model performance, tool usage
  4. Business Metrics - Sessions, conversations, tokens

Using Dashboards

  1. Open Grafana: http://localhost:3000
  2. Login: admin / Dev12345
  3. Navigate to Dashboards → Browse
  4. Select any Hector dashboard

Custom Endpoint

Need to send traces somewhere else? Use a config file:

global:
  observability:
    metrics:
      enabled: true
    tracing:
      enabled: true
      exporter_type: "otlp"
      endpoint_url: "my-collector.example.com:4317"
      sampling_rate: 0.1  # Sample 10% in production

llms:
  openai:
    type: openai
    api_key: "${OPENAI_API_KEY}"

agents:
  assistant:
    name: "Assistant"
    llm: openai

Then run:

hector serve --config my-config.yaml


Why This Matters

Zero-Config Observability means you can start monitoring immediately without writing configuration files. Perfect for development and quick debugging.

Full Instrumentation gives you visibility into every layer, from HTTP requests down to individual tool executions. You can see exactly where time is spent and where errors occur.

Production-Ready metrics and traces work with standard observability tools (Prometheus, Grafana, Jaeger), so you can use the same setup from development to production.


Troubleshooting

No metrics showing up

Check that metrics are being generated:

curl -s http://localhost:8080/metrics | grep hector_

If the output is empty, no metrics have been recorded yet. Make some requests first.

Prometheus not scraping

Check Prometheus targets:

curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets'

Look for job="hector" with health="up".
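
The full targets response can be long, so it may help to narrow it to the one job. A minimal sketch using jq, assuming the standard Prometheus targets response shape (labels.job, labels.instance, and health on each active target); the function name target_health is hypothetical:

```shell
# Hypothetical helper: reads the JSON from /api/v1/targets on stdin and prints
# one "instance health" line per active target in the given job.
target_health() {
  job="${1:-hector}"   # default job name "hector" is the one named in this guide
  jq -r --arg job "$job" '
    .data.activeTargets[]
    | select(.labels.job == $job)
    | "\(.labels.instance) \(.health)"
  '
}
```

Usage: `curl -s http://localhost:9090/api/v1/targets | target_health hector`. No output means Prometheus has no active target for that job, which points at the scrape configuration rather than at Hector.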

Jaeger not receiving traces

Verify that the OTLP endpoint is reachable:

# From host machine
nc -zv localhost 4317

Check that Jaeger is running:

docker ps | grep jaeger
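
If nc is not installed, the reachability check can be approximated with bash's /dev/tcp redirection. A sketch only: port_open is a hypothetical name, and it assumes bash and the coreutils timeout command are available:

```shell
# Hypothetical helper (needs bash and coreutils timeout): succeeds if a TCP
# connection to host:port can be opened within 2 seconds.
port_open() {
  # Inside the inner bash, $0 is the host and $1 is the port.
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

Usage: `port_open localhost 4317 && echo "OTLP port reachable"`. This only confirms that something is listening on the port, not that it speaks OTLP.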

Grafana dashboards empty

  1. Check that the Prometheus data source is configured:

       • Go to Configuration → Data Sources
       • Prometheus should be listed and marked as default

  2. Widen the time range in the top-right corner (for example, from Last 15 minutes to Last 1 hour)

  3. Generate some traffic to create data points


Production Use

For production, use a config file with:

  • Lower sampling rates (e.g., 0.1 for 10%)
  • External collectors (Datadog, Honeycomb, etc.)
  • Secure endpoints
  • Authentication

See: Observability Guide | Configuration Reference


Next Steps

Enhance your observability:

  • Add custom metrics: Track business-specific metrics
  • Set up alerts: Configure Prometheus alerting rules
  • Export to external services: Datadog, Honeycomb, New Relic
  • Production deployment: Secure endpoints, authentication


Conclusion

You've enabled full observability with a single flag:

  • ✅ Metrics at /metrics endpoint
  • ✅ Traces to OpenTelemetry collector
  • ✅ Pre-built Grafana dashboards
  • ✅ Zero configuration required

The best part? You can use the same observability setup from development to production, just by adjusting sampling rates and endpoints.

Ready to monitor your agents? Start with --observe, then customize with a config file as you scale to production.


About Hector: Hector is a production-grade A2A-native agent platform designed for enterprise deployments. Learn more at gohector.dev.