Information Provenance & Multi-Source Synthesis

Estimated time: 25 minutes

When synthesizing information from multiple sources, claim attribution and provenance are critical. Without them, conflicting facts get arbitrarily resolved, temporal data gets misinterpreted as contradiction, and confidence in outputs collapses. This lesson covers structured claim-source mappings, contested-vs-established findings, and the role of human review.

Why Provenance Matters

When subagents return findings without source attribution, the coordinator can't tell where each claim came from. During synthesis, conflicting claims get arbitrarily resolved or averaged together. The user sees a confident answer with no way to verify, and contradictions in the source material get hidden.

The Structured Claim-Source Pattern

{
  "claim": "The iPhone was launched in 2007",
  "source": {
    "type": "press_release",
    "url": "https://apple.com/press/2007/01/09Apple-Reinvents-the-Phone...",
    "document_name": "Apple Reinvents the Phone with iPhone",
    "publication_date": "2007-01-09",
    "relevant_excerpt": "Apple today introduced iPhone, combining…"
  }
}

Each claim ships with its source: type, URL, document name, publication date, relevant excerpt. The coordinator uses this to handle conflicts, attribute findings, and answer follow-up questions about provenance.
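
A minimal sketch of the same mapping as typed Python structures (field names mirror the JSON above; the Claim and Source class names are illustrative, not a prescribed schema):

from dataclasses import dataclass

@dataclass
class Source:
    type: str               # e.g. "press_release", "analyst_report"
    url: str
    document_name: str
    publication_date: str   # ISO 8601, e.g. "2007-01-09"
    relevant_excerpt: str   # verbatim passage that backs the claim

@dataclass
class Claim:
    claim: str              # the asserted fact, stated plainly
    source: Source          # every claim ships with exactly one source

finding = Claim(
    claim="The iPhone was launched in 2007",
    source=Source(
        type="press_release",
        url="https://apple.com/press/2007/01/09Apple-Reinvents-the-Phone...",
        document_name="Apple Reinvents the Phone with iPhone",
        publication_date="2007-01-09",
        relevant_excerpt="Apple today introduced iPhone, combining…",
    ),
)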

Conflicting Statistics from Credible Sources

If two credible sources cite different numbers, do NOT arbitrarily pick one. Annotate the conflict explicitly:

Estimated market size: between $2.4B and $3.8B.
- $2.4B per Gartner Q1 2026 report (URL, date)
- $3.8B per IDC market scan, March 2026 (URL, date)
The discrepancy may be due to different scope definitions
(Gartner excludes embedded systems; IDC includes them).

Naming the conflict and explaining the likely cause is more useful than picking a number and hiding the disagreement.
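
A hedged sketch of how a synthesis step could emit this annotation instead of resolving the conflict (the annotate_conflict helper and its input shape are hypothetical):

def annotate_conflict(metric: str, estimates: list[dict], explanation: str | None = None) -> str:
    """Render the conflict with all sources rather than picking or averaging.

    Each estimate dict carries: value (float, in $B), source, url, date.
    """
    values = sorted(e["value"] for e in estimates)
    lines = [f"Estimated {metric}: between ${values[0]}B and ${values[-1]}B."]
    for e in estimates:
        lines.append(f"- ${e['value']}B per {e['source']} ({e['url']}, {e['date']})")
    if explanation:
        lines.append(f"The discrepancy may be due to {explanation}.")
    return "\n".join(lines)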

Temporal Data Requires Dates

If two sources report different facts for the same question and one is from 2022 while the other is from 2026, that's not a contradiction — it's a change over time. The synthesis must surface dates so the coordinator (and ultimately the user) can interpret correctly. Without dates, temporal evolution looks like factual disagreement.
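
One way to make that distinction mechanical, as a sketch (assumes every claim for a question arrives as a (value, publication_date) pair; real matching logic would be fuzzier):

from collections import defaultdict

def classify_disagreement(claims: list[tuple[str, str]]) -> str:
    """Distinguish change-over-time from genuine contradiction."""
    by_date = defaultdict(set)
    for value, date in claims:
        by_date[date].add(value)
    if any(len(values) > 1 for values in by_date.values()):
        return "contradiction"        # same date, different facts
    if len({value for value, _ in claims}) > 1:
        return "temporal_evolution"   # facts changed between dates
    return "agreement"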

Structure: Established vs Contested Findings

For research outputs, organize findings into explicit sections (a bucketing sketch follows the list):

  • Well-established findings. Multiple credible sources agree.
  • Contested findings. Sources disagree; surface the disagreement explicitly.
  • Single-source claims. Only one source backs this; flag for verification.
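
A sketch of the bucketing logic, assuming claims are already grouped by the question they answer and agreement is judged by exact value match (real synthesis would need looser matching):

def bucket_findings(claims_by_question: dict[str, list[dict]]) -> dict[str, list]:
    """Each claim dict carries "claim" (the asserted value) and "source"."""
    buckets = {"well_established": [], "contested": [], "single_source": []}
    for question, claims in claims_by_question.items():
        distinct_values = {c["claim"] for c in claims}
        if len(claims) == 1:
            buckets["single_source"].append((question, claims))     # flag for verification
        elif len(distinct_values) == 1:
            buckets["well_established"].append((question, claims))  # multiple sources agree
        else:
            buckets["contested"].append((question, claims))         # surface the disagreement
    return buckets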

Render Content Types Appropriately

  • Financial data → tables. Numbers in prose are hard to scan.
  • News and narrative → prose. Tables fragment the story.
  • Technical findings → lists. Easy to scan and reference.
  • Multi-source comparisons → side-by-side tables with source attribution per row.

Human Review Workflow Considerations

Aggregate accuracy can mask poor performance on specific document types. A 97% overall accuracy might be 99% on receipts and 70% on invoices. Use stratified random sampling for ongoing measurement — sample by document type and field, not just by document.
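
A minimal sketch of stratified sampling over (document type, field) strata (the record keys doc_type and field are assumptions about the data shape; the per-stratum size is a placeholder):

import random
from collections import defaultdict

def stratified_sample(records: list[dict], per_stratum: int = 50, seed: int = 0) -> list[dict]:
    """Sample every (doc_type, field) stratum so rare strata still get measured."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[(record["doc_type"], record["field"])].append(record)
    sample = []
    for items in strata.values():
        sample.extend(rng.sample(items, min(per_stratum, len(items))))
    return sample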

Field-Level Confidence for Review Routing

Calibrate confidence scores per field using labeled validation sets. Route fields with low calibrated confidence to human review; auto-accept fields with high confidence. This is more efficient than reviewing every field of every document or relying on aggregate confidence.
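
A sketch of both halves, calibration and routing (the target precision and fallback threshold are placeholders, not recommendations):

def calibrate_threshold(scores: list[float], correct: list[bool], target_precision: float = 0.99) -> float:
    """Lowest confidence threshold whose auto-accepted set still meets
    the precision target on a labeled validation set, for one field."""
    ranked = sorted(zip(scores, correct), reverse=True)
    threshold, n_accepted, n_correct = 1.01, 0, 0   # 1.01 = accept nothing
    for score, is_correct in ranked:
        n_accepted += 1
        n_correct += is_correct
        if n_correct / n_accepted >= target_precision:
            threshold = score
    return threshold

def route_field(field: str, confidence: float, thresholds: dict[str, float]) -> str:
    """Auto-accept above the field's calibrated threshold; else human review."""
    return "auto_accept" if confidence >= thresholds.get(field, 1.01) else "human_review"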

Self-Review Limitation Revisited

The model that generated content retains its reasoning context, so it is less likely to question its own output. For quality assurance, use a separate model instance or a human reviewer. Same model plus same session is the worst configuration for catching errors.
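
A hedged sketch of that separation (generator and reviewer stand in for any two independent model clients; the callables and prompt wording are illustrative):

def generate_then_review(task: str, generator, reviewer) -> dict:
    """Run generation and review on separate instances so the reviewer
    never sees the generator's reasoning context."""
    draft = generator(task)
    critique = reviewer(
        "Review the following output for factual and logical errors. "
        "Do not assume it is correct.\n\n" + draft
    )
    return {"draft": draft, "critique": critique}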

Skills to Develop

  1. Require subagents to output structured claim-source mappings (URL, document, date, excerpt).
  2. Annotate conflicting statistics with sources rather than arbitrarily picking one.
  3. Surface publication/collection dates for temporal data.
  4. Organize research outputs into established vs contested findings.
  5. Use stratified random sampling and field-level confidence calibration for human review routing.

Exam tip: Conflicting numbers from credible sources → annotate with both sources and a likely explanation for the discrepancy. Don't arbitrarily pick. Don't average. Surface the conflict.