Structured Output via tool_use

Estimated time: 30 minutes

Asking the model for JSON in prose usually produces valid JSON, but not always. Asking via tool_use with a schema returns the data as a structured tool input, which is far more reliable. This lesson covers how to design schemas that prevent fabrication, handle ambiguity, and produce reliable structured output.

Why tool_use Beats Prompt-Asked JSON

When you write "respond in JSON with these fields…" in the system prompt, you're trusting the model to:

  • Produce syntactically valid JSON (matched braces, quoted keys, proper escaping)
  • Include all required fields
  • Use the right types (numbers as numbers, not strings)

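The model is good at this but not perfect. Syntax errors in JSON output are a regular failure mode in production prompts. Tool schemas combined with a forced tool call, tool_choice: { type: "any" } for any tool or tool_choice: { type: "tool", name: "extract_invoice" } for a specific one, remove this class of error in practice: the result arrives as a structured tool input rather than free text you have to parse. How strictly the schema itself is guaranteed depends on the API's enforcement mode, so keep a validation step for required fields and types.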
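
To make the failure modes listed above concrete, here is a minimal sketch of what happens when prompt-asked JSON is parsed directly. The reply strings are hand-written illustrations, not real model output, and tryParse is a hypothetical helper.

```javascript
// Hypothetical replies to a "respond in JSON" prompt (illustrative strings only).
const replies = [
  '{"invoice_number": "INV-001", "total_amount": 120.5}',                    // clean
  'Here is the JSON:\n{"invoice_number": "INV-001", "total_amount": 120.5}', // prose preamble
  '{"invoice_number": "INV-001", "total_amount": 120.5,}',                   // trailing comma
  '{"invoice_number": "INV-001", "total_amount": "120.50"}'                  // parses, but wrong type
];

// Wraps JSON.parse so a syntax error becomes a value instead of an exception.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

const results = replies.map(tryParse);
console.log(results.map((r) => r.ok)); // [ true, false, false, true ]
```

The last reply is the subtle one: it parses, but total_amount is a string, so the type problem only surfaces later unless you check for it.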

The Schema-First Pattern

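import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from the environment by default
const claude = new Anthropic();

const tools = [{
  name: "extract_invoice",
  description: "Record the fields extracted from an invoice.",
  input_schema: {
    type: "object",
    properties: {
      invoice_number: { type: "string" },
      total_amount: { type: "number" },
      // Nullable and not required: the model can return null instead of inventing a value
      po_number: { type: ["string", "null"] }
    },
    required: ["invoice_number", "total_amount"]
  }
}];

const response = await claude.messages.create({
  model: process.env.ANTHROPIC_MODEL, // assumption: set this to a current model ID
  max_tokens: 1024,
  // invoiceText holds the raw invoice text and is defined elsewhere
  messages: [{ role: "user", content: invoiceText }],
  tools,
  // Force this specific tool so the reply is always an extract_invoice call
  tool_choice: { type: "tool", name: "extract_invoice" }
});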
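
Once the call returns, the structured data sits in a tool_use content block rather than in text. The sketch below shows one way to pull it out; getToolInput is a hypothetical helper, and mockResponse is a hand-written stand-in with the same shape as a Messages API tool call response so the snippet runs without network access.

```javascript
// Returns the input of the named tool_use block, or throws if the model did not call it.
function getToolInput(response, toolName) {
  const block = response.content.find(
    (b) => b.type === "tool_use" && b.name === toolName
  );
  if (!block) {
    throw new Error(
      `No tool_use block for ${toolName} (stop_reason: ${response.stop_reason})`
    );
  }
  return block.input; // already a parsed object, no JSON.parse needed
}

// Hand-written stand-in for a real response (assumption: values are illustrative)
const mockResponse = {
  stop_reason: "tool_use",
  content: [
    {
      type: "tool_use",
      id: "toolu_example",
      name: "extract_invoice",
      input: { invoice_number: "INV-001", total_amount: 120.5, po_number: null }
    }
  ]
};

const invoice = getToolInput(mockResponse, "extract_invoice");
console.log(invoice.invoice_number, invoice.total_amount, invoice.po_number);
```

Including stop_reason in the error message is a deliberate choice: if the reply was cut off by max_tokens, that is usually the first thing you want to know.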

Nullable Optional Fields Prevent Fabrication

If po_number is required but the invoice doesn't have one, the model tends to fabricate a value to satisfy the schema. Mark optional fields as type: ["string", "null"] and remove them from required. The model now has a clean way to say "not present": it returns null instead of inventing a value.
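
On the consuming side, a field that is nullable and not required can show up as null or be absent altogether. A minimal sketch of normalizing both cases follows; readOptional is a hypothetical helper and the three inputs are hand-written examples.

```javascript
// Treats a missing key and an explicit null the same way: "not present on the invoice".
function readOptional(input, field) {
  const value = input[field];
  return value === undefined || value === null ? null : value;
}

// Illustrative tool inputs (hand-written, not real model output)
const withPo = { invoice_number: "INV-001", total_amount: 120.5, po_number: "PO-7781" };
const nullPo = { invoice_number: "INV-002", total_amount: 80, po_number: null };
const noPoKey = { invoice_number: "INV-003", total_amount: 42 };

const poNumbers = [withPo, nullPo, noPoKey].map((inv) => readOptional(inv, "po_number"));
console.log(poNumbers); // [ 'PO-7781', null, null ]
```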

Add "unclear" Enum Values for Ambiguous Cases

For categorization tasks, add an explicit unclear value to the enum:

{ "sentiment": { "enum": ["positive", "negative", "neutral", "unclear"] } }

Without unclear, ambiguous inputs are forced into one of the three definitive categories, and the label may be wrong. With unclear, the model has an honest option for genuinely ambiguous cases.
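
One way to put the unclear value to work is to route those results to a person instead of acting on them automatically. The sketch below assumes each result carries an id and a sentiment; routeBySentiment and the sample data are hypothetical.

```javascript
// Splits classified results into ones safe to process automatically
// and ones the model flagged as ambiguous.
function routeBySentiment(results) {
  const auto = [];
  const review = [];
  for (const r of results) {
    if (r.sentiment === "unclear") {
      review.push(r);
    } else {
      auto.push(r);
    }
  }
  return { auto, review };
}

// Hand-written sample results (assumption: ids and labels are illustrative)
const classified = [
  { id: 1, sentiment: "positive" },
  { id: 2, sentiment: "unclear" },
  { id: 3, sentiment: "negative" }
];

const routed = routeBySentiment(classified);
console.log(routed.review.map((r) => r.id)); // [ 2 ]
```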

Add "other" with Detail for Extensible Categories

{
  "category": { "enum": ["shipping", "billing", "technical", "other"] },
  "other_detail": { "type": ["string", "null"] }
}

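This lets the model handle inputs that don't fit your taxonomy: it selects other and describes the case in other_detail, and leaves other_detail null for the listed categories. Reviewing those details is also a useful way to discover new categories you should add to the enum.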
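
To use other_detail for discovering new categories, you can tally the free-text details across many results and look at the most frequent ones. This is a rough sketch: tallyOtherDetails is a hypothetical helper, the tickets are hand-written, and the normalization (trim plus lowercase) is deliberately naive.

```javascript
// Counts how often each other_detail value appears among "other" results,
// most frequent first.
function tallyOtherDetails(results) {
  const counts = new Map();
  for (const r of results) {
    if (r.category !== "other" || !r.other_detail) continue;
    const key = r.other_detail.trim().toLowerCase();
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Hand-written sample results (assumption: values are illustrative)
const tickets = [
  { category: "billing", other_detail: null },
  { category: "other", other_detail: "Returns" },
  { category: "other", other_detail: "returns " },
  { category: "other", other_detail: "warranty" }
];

const tally = tallyOtherDetails(tickets);
console.log(tally); // [ [ 'returns', 2 ], [ 'warranty', 1 ] ]
```

A detail that keeps appearing at the top of the tally is a candidate for promotion to a first-class enum value.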

tool_use Eliminates Syntax Errors, Not Semantic Errors

Schema enforcement gives you valid JSON. It does NOT give you correct values. The model can still make semantic errors: line items that don't sum to the stated total, dates in the wrong format, customer names misspelled. For semantic correctness, you need validation logic in your code, not just schema enforcement.
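
As an illustration of validation logic in code, the sketch below checks that line items sum to the stated total. The line_items array and its amount field are assumptions for this example (they are not part of the extract_invoice schema shown earlier), and validateInvoiceTotals is a hypothetical helper. Amounts are compared in integer cents to avoid floating-point drift.

```javascript
// Returns a list of human-readable problems; an empty list means the totals agree.
function validateInvoiceTotals(invoice) {
  const toCents = (n) => Math.round(n * 100);
  const itemsCents = invoice.line_items.reduce(
    (acc, item) => acc + toCents(item.amount),
    0
  );
  const errors = [];
  if (itemsCents !== toCents(invoice.total_amount)) {
    errors.push(
      `line items sum to ${itemsCents / 100} but total_amount is ${invoice.total_amount}`
    );
  }
  return errors;
}

// Hand-written examples (assumption: values are illustrative)
const consistent = {
  total_amount: 25,
  line_items: [{ amount: 19.99 }, { amount: 5.01 }]
};
const inconsistent = {
  total_amount: 30,
  line_items: [{ amount: 19.99 }, { amount: 5.01 }]
};

const okErrors = validateInvoiceTotals(consistent);
const badErrors = validateInvoiceTotals(inconsistent);
console.log(okErrors.length, badErrors.length); // 0 1
```

Both objects would pass a schema check, since every field has the right type. Only the arithmetic reveals that the second one is wrong.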

Format Normalization in the Prompt

A bare type: "string" or type: "number" says nothing about format conventions. Include format rules in the prompt:

- Dates: ISO 8601 format (YYYY-MM-DD)
- Currency: numeric only, no symbols, rounded to two decimal places
- Phone numbers: E.164 format (for example, +1XXXXXXXXXX for US numbers)
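
Because prompt rules are instructions rather than guarantees, it can help to verify the formats in code as well. The sketch below checks the shape of a date and a phone number with regular expressions. The field names invoice_date and phone are assumptions for illustration, checkFormats is a hypothetical helper, and the date pattern checks shape only, not calendar validity.

```javascript
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/; // YYYY-MM-DD shape only
const E164 = /^\+[1-9]\d{1,14}$/;       // "+" then up to 15 digits, no leading zero

// Returns a list of format problems; an empty list means both fields look right.
function checkFormats(record) {
  const errors = [];
  if (!ISO_DATE.test(record.invoice_date)) {
    errors.push(`invoice_date is not YYYY-MM-DD: ${record.invoice_date}`);
  }
  if (!E164.test(record.phone)) {
    errors.push(`phone is not E.164: ${record.phone}`);
  }
  return errors;
}

// Hand-written examples (assumption: values are illustrative)
const goodRecord = { invoice_date: "2024-03-15", phone: "+14155550123" };
const badRecord = { invoice_date: "03/15/2024", phone: "(415) 555-0123" };

const goodFormatErrors = checkFormats(goodRecord);
const badFormatErrors = checkFormats(badRecord);
console.log(goodFormatErrors.length, badFormatErrors.length); // 0 2
```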

Skills to Develop

  1. Use tool_use with schemas and a forced tool_choice ({ type: "any" }, or { type: "tool" } with the tool's name) for structured output.
  2. Mark optional fields as nullable to prevent fabrication.
  3. Add unclear enum values for genuinely ambiguous cases.
  4. Include format normalization rules in the prompt alongside schemas.
  5. Validate semantic correctness in code — schemas don't catch sums that don't match.

Exam tip: If the model is fabricating PO numbers when none exist, make the field type: ["string", "null"] and remove it from required. The fix is the schema, not the prompt.