Structured Output Guide

Hector provides comprehensive structured output capabilities across all supported LLM providers (OpenAI, Anthropic, Gemini). This guide explains how to leverage provider-specific optimizations for consistent, reliable structured responses.


Overview

Structured output ensures that LLM responses conform to a specific format or schema. This is essential for:

  • Reliable parsing - No need to parse free-form text
  • Type safety - Guaranteed data types and structure
  • Downstream integration - Direct use in APIs, databases, and other systems
  • Consistency - Predictable outputs across multiple calls

Provider Comparison

Feature               OpenAI    Anthropic    Gemini
Native JSON Schema    Yes       No           Yes
Strict Validation     Yes       No           Partial
Response Prefill      No        Yes          No
Property Ordering     No        No           Yes
Enum Support          Yes       Via prompt   Yes

Implementation Details

  • OpenAI: Uses native response_format with JSON schema and strict validation
  • Anthropic: Uses system prompt instructions + prefill technique for JSON output
  • Gemini: Uses responseMimeType and responseSchema with optional property ordering

Configuration

Structured output is configured via the StructuredOutputConfig struct:

type StructuredOutputConfig struct {
    Format           string                 // "json" or "enum"
    Schema           interface{}            // JSON Schema (map[string]interface{})
    Enum             []string               // Enum values (for enum format)
    Prefill          string                 // Anthropic-specific: prefill response
    PropertyOrdering []string               // Gemini-specific: property order
}

JSON Schema Output

Basic JSON Schema

config := &llms.StructuredOutputConfig{
    Format: "json",
    Schema: map[string]interface{}{
        "type": "object",
        "properties": map[string]interface{}{
            "sentiment": map[string]interface{}{
                "type": "string",
                "enum": []string{"positive", "negative", "neutral"},
                "description": "The sentiment of the text",
            },
            "confidence": map[string]interface{}{
                "type": "number",
                "minimum": 0,
                "maximum": 1,
                "description": "Confidence score for the sentiment",
            },
            "reasoning": map[string]interface{}{
                "type": "string",
                "description": "Brief explanation of the sentiment",
            },
        },
        "required": []string{"sentiment", "confidence"},
    },
}

// Use with any provider
text, toolCalls, tokens, err := provider.GenerateStructured(messages, tools, config)

Complex Nested Schema

config := &llms.StructuredOutputConfig{
    Format: "json",
    Schema: map[string]interface{}{
        "type": "object",
        "properties": map[string]interface{}{
            "person": map[string]interface{}{
                "type": "object",
                "properties": map[string]interface{}{
                    "name": map[string]interface{}{
                        "type": "string",
                    },
                    "age": map[string]interface{}{
                        "type": "number",
                    },
                    "address": map[string]interface{}{
                        "type": "object",
                        "properties": map[string]interface{}{
                            "street": map[string]interface{}{"type": "string"},
                            "city": map[string]interface{}{"type": "string"},
                            "zipcode": map[string]interface{}{"type": "string"},
                        },
                    },
                },
            },
            "skills": map[string]interface{}{
                "type": "array",
                "items": map[string]interface{}{
                    "type": "string",
                },
            },
        },
        "required": []string{"person", "skills"},
    },
}

Enum Output

For selecting from a fixed set of options:

config := &llms.StructuredOutputConfig{
    Format: "enum",
    Enum:   []string{"Percussion", "String", "Woodwind", "Brass", "Keyboard"},
}

// Gemini will set responseMimeType to "text/x.enum"
text, toolCalls, tokens, err := provider.GenerateStructured(messages, tools, config)

Provider-Specific Optimizations

OpenAI: Strict JSON Mode

OpenAI's structured output uses strict JSON schema validation:

// Hector automatically enables strict mode for OpenAI
config := &llms.StructuredOutputConfig{
    Format: "json",
    Schema: yourSchema,
}

// Translates to:
// {
//   "response_format": {
//     "type": "json_schema",
//     "json_schema": {
//       "name": "response",
//       "schema": yourSchema,
//       "strict": true
//     }
//   }
// }

Anthropic: Prefill Technique

Anthropic uses response prefilling to enforce JSON output:

config := &llms.StructuredOutputConfig{
    Format: "json",
    Schema: yourSchema,
    Prefill: "{\"sentiment\":",  // Forces JSON start
}

// The assistant's response will begin with the prefill,
// ensuring JSON output from the start

Best prefills:

  • { - Generic JSON object
  • {"field_name": - Specific first field
  • [ - JSON array
  • {"type": " - When type is the first field

Gemini: Property Ordering

Gemini supports property ordering for consistent output:

config := &llms.StructuredOutputConfig{
    Format: "json",
    Schema: yourSchema,
    PropertyOrdering: []string{"name", "age", "email", "phone"},
}

// Properties will appear in this exact order in the response

Examples

Example 1: Sentiment Analysis

config := &llms.StructuredOutputConfig{
    Format: "json",
    Schema: map[string]interface{}{
        "type": "object",
        "properties": map[string]interface{}{
            "sentiment": map[string]interface{}{
                "type": "string",
                "enum": []string{"positive", "negative", "neutral"},
            },
            "score": map[string]interface{}{
                "type": "number",
                "minimum": -1,
                "maximum": 1,
            },
            "key_phrases": map[string]interface{}{
                "type": "array",
                "items": map[string]interface{}{"type": "string"},
            },
        },
        "required": []string{"sentiment", "score"},
    },
}

messages := []llms.Message{
    {Role: "user", Content: "I absolutely love this product! It's amazing!"},
}

text, _, _, err := provider.GenerateStructured(messages, nil, config)
// text: {"sentiment": "positive", "score": 0.95, "key_phrases": ["love", "amazing"]}

Example 2: Data Extraction

config := &llms.StructuredOutputConfig{
    Format: "json",
    Schema: map[string]interface{}{
        "type": "object",
        "properties": map[string]interface{}{
            "company": map[string]interface{}{
                "type": "string",
            },
            "position": map[string]interface{}{
                "type": "string",
            },
            "salary_range": map[string]interface{}{
                "type": "object",
                "properties": map[string]interface{}{
                    "min": map[string]interface{}{"type": "number"},
                    "max": map[string]interface{}{"type": "number"},
                    "currency": map[string]interface{}{"type": "string"},
                },
            },
            "requirements": map[string]interface{}{
                "type": "array",
                "items": map[string]interface{}{"type": "string"},
            },
        },
        "required": []string{"company", "position"},
    },
}

Example 3: Classification with Streaming

config := &llms.StructuredOutputConfig{
    Format: "json",
    Schema: map[string]interface{}{
        "type": "object",
        "properties": map[string]interface{}{
            "category": map[string]interface{}{
                "type": "string",
                "enum": []string{"bug", "feature", "question", "documentation"},
            },
            "priority": map[string]interface{}{
                "type": "string",
                "enum": []string{"low", "medium", "high", "critical"},
            },
            "tags": map[string]interface{}{
                "type": "array",
                "items": map[string]interface{}{"type": "string"},
            },
        },
        "required": []string{"category", "priority"},
    },
}

// Streaming works with structured output
chunks, err := provider.GenerateStructuredStreaming(messages, nil, config)
for chunk := range chunks {
    switch chunk.Type {
    case "text":
        fmt.Print(chunk.Text)  // Incremental JSON
    case "done":
        fmt.Printf("\nTokens: %d\n", chunk.Tokens)
    case "error":
        fmt.Printf("Error: %v\n", chunk.Error)
    }
}

Example 4: Multi-Turn with Structured Output

messages := []llms.Message{
    {Role: "user", Content: "Extract key information from this resume: John Doe, 5 years exp..."},
    {Role: "assistant", Content: `{"name": "John Doe", "experience_years": 5, ...}`},
    {Role: "user", Content: "Now add a relevance score for a software engineer position"},
}

config := &llms.StructuredOutputConfig{
    Format: "json",
    Schema: resumeWithScoreSchema,
}

text, _, _, err := provider.GenerateStructured(messages, nil, config)

Configuration via YAML

You can configure structured output in agent configs:

agents:
  - name: sentiment_analyzer
    description: Analyzes sentiment with structured output
    llm:
      type: openai
      model: gpt-4
      structured_output:
        format: json
        schema:
          type: object
          properties:
            sentiment:
              type: string
              enum: ["positive", "negative", "neutral"]
            confidence:
              type: number
              minimum: 0
              maximum: 1
          required: ["sentiment", "confidence"]

Best Practices

Schema Design

  1. Keep schemas simple - Complex nested schemas can be harder for LLMs to follow
  2. Use descriptive field names - Clear names help the LLM understand what to extract
  3. Include required fields - Specify which fields are mandatory
  4. Add constraints - Use min/max for numbers, enums for strings
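The map[string]interface{} literals used throughout this guide get repetitive for larger schemas; a small helper can keep them readable. prop and objectSchema below are hypothetical convenience functions, not part of Hector:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// prop builds a single JSON Schema property. Hypothetical helper.
func prop(typ, desc string) map[string]interface{} {
	return map[string]interface{}{"type": typ, "description": desc}
}

// objectSchema builds an object schema with required fields. Hypothetical helper.
func objectSchema(props map[string]interface{}, required []string) map[string]interface{} {
	return map[string]interface{}{
		"type":       "object",
		"properties": props,
		"required":   required,
	}
}

func main() {
	schema := objectSchema(map[string]interface{}{
		"sentiment":  prop("string", "The sentiment of the text"),
		"confidence": prop("number", "Confidence score"),
	}, []string{"sentiment", "confidence"})

	out, _ := json.MarshalIndent(schema, "", "  ")
	fmt.Println(string(out))
}
```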

Provider Selection

  1. OpenAI - Best for strict JSON validation and complex schemas
  2. Anthropic - Good for JSON with prefill technique, excellent reasoning
  3. Gemini - Best for property ordering and enum outputs

Error Handling

text, _, _, err := provider.GenerateStructured(messages, nil, config)
if err != nil {
    // Handle provider-specific errors
    log.Printf("Structured output failed: %v", err)
    return
}

// Validate JSON before using
var result map[string]interface{}
if err := json.Unmarshal([]byte(text), &result); err != nil {
    log.Printf("Invalid JSON response: %v", err)
    return
}

Performance Considerations

  1. Use streaming for long responses - Better user experience
  2. Cache schemas - Avoid recreating complex schemas
  3. Choose appropriate providers - Match provider strengths to your use case
  4. Track token usage - Structured output may use more tokens than free-form text