AI Cited a URL That Didn't Contain the Claim. I Built the Tooling to Measure How Often
Current Situation Analysis
The Silent Failure of Grounded Generation
In production Retrieval-Augmented Generation (RAG) systems, the industry has largely solved the problem of "hallucinated facts" by grounding models in external knowledge bases. However, a more insidious failure mode has emerged: Citation Hallucination. This occurs when a model generates a response that appears factually grounded (complete with valid URLs and structured references) but the link between the claim and the source is broken.
The user experience is deceptive. The response looks authoritative. The links are clickable. The domain names are correct. Yet, the cited document does not support the specific assertion made in the text. This is not a simple "model made up a fact" error; it is a structural failure in the attribution pipeline.
Why This Problem is Overlooked
Most engineering teams treat citation verification as a binary check: "Does the URL exist?" or "Is the URL in the retrieved set?"
This approach misses the nuance of modern LLM behavior. Models are optimized for fluency and synthesis. When generating text, they often compress information from multiple sources or substitute a "canonical" URL from their training data in place of the actual retrieved document. Because the output is semantically coherent and the links are valid, these errors slip past standard automated tests and are rarely caught until they reach end-users or compliance audits.
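The gap is easy to see in code. A minimal sketch of the binary check most pipelines run today (function and variable names are illustrative, not from a real library):

```typescript
// Naive "binary" citation check: it only asks whether the cited URL
// appeared in the retrieved set.
function naiveCitationCheck(citedUrl: string, retrievedUrls: string[]): boolean {
  return retrievedUrls.includes(citedUrl);
}

const retrieved = [
  "https://docs.example.com/auth",
  "https://forum.example.com/post/42",
];

// A misattributed citation sails through: the URL is real and was
// retrieved, but nothing verifies that it actually supports the claim.
console.log(naiveCitationCheck("https://docs.example.com/auth", retrieved)); // true
```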
Data-Backed Evidence
Analysis of grounded queries across major search-tool APIs reveals that citation errors are not rare edge cases; they are systemic. In controlled audits of factual and product-oriented queries, citation failures account for a significant percentage of "correct-looking" responses.
Crucially, these failures are not monolithic. They fall into distinct categories, each with a different root cause and requiring a different mitigation strategy. Treating them all as "hallucinations" prevents engineers from applying the correct fix.
WOW Moment: Key Findings
The critical insight for engineering teams is that not all citation errors are equal. A fabricated URL is a hard failure. A URL substitution might be acceptable depending on the use case. Anchor-text drift is the hardest to detect but often the most damaging in regulated industries.
The following matrix breaks down the four distinct failure modes observed in production environments.
| Failure Mode | Mechanism | Detection Difficulty | User Impact |
|---|---|---|---|
| Fabricated URL | Model generates a plausible URL pattern not present in the retrieved context. | Low | High (Broken link / 404) |
| Retrieve-then-Misquote | Model cites a real URL, but the claim is supported by a different source or synthesis. | Medium | High (Misinformation) |
| URL Substitution | Model cites a "canonical" URL from training data instead of the actual retrieved source. | Medium | Medium (Broken audit trail) |
| Anchor-Text Drift | Model cites the correct URL, but the phrasing subtly alters the meaning of the source. | High | High (Compliance risk) |
Why this matters: By categorizing errors, teams can prioritize fixes. Blocking fabricated URLs is a quick win. Fixing anchor-text drift requires sophisticated semantic verification. Understanding the distribution of these errors allows for better resource allocation in model evaluation.
Core Solution
Architecture: The Citation Faithfulness Layer
To address these failures, we introduce a Citation Faithfulness Layer that sits between the LLM output and the user interface. This layer performs automated verification before the response is rendered.
The architecture consists of three stages:
- Extraction: Parse the LLM response to identify claims and their associated citations.
- Verification: Compare claims against the retrieved context using a tiered verification strategy.
- Enforcement: Apply policy-based actions (block, warn, or pass) based on the verification results.
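The three stages can be sketched as a thin orchestration function. The `extract`, `verify`, and `decide` callbacks here are hypothetical stand-ins for the components built out in the rest of this section:

```typescript
// Hedged sketch of the three-stage flow: extract citations, verify
// each, and let the strictest enforcement action win.
interface Citation {
  url: string;
  claim: string;
}
type Action = 'BLOCK' | 'WARN' | 'PASS';

function enforceFaithfulness(
  response: string,
  extract: (text: string) => Citation[],   // Stage 1: Extraction
  verify: (c: Citation) => boolean,        // Stage 2: Verification (true = supported)
  decide: (supported: boolean) => Action   // Stage 3: Enforcement
): Action {
  let worst: Action = 'PASS';
  for (const citation of extract(response)) {
    const action = decide(verify(citation));
    if (action === 'BLOCK') return 'BLOCK'; // strictest action short-circuits
    if (action === 'WARN') worst = 'WARN';
  }
  return worst;
}

// Toy wiring: every citation "verifies" except ones pointing at /bad.
const action = enforceFaithfulness(
  "claim [1](https://example.com/bad)",
  () => [{ url: "https://example.com/bad", claim: "claim" }],
  (c) => !c.url.endsWith("/bad"),
  (ok) => (ok ? 'PASS' : 'BLOCK')
);
console.log(action); // "BLOCK"
```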
Implementation Strategy
We will implement this using TypeScript. The solution focuses on modularity, allowing different verification strategies to be plugged in based on the required strictness.
1. Data Models
Define the structures for claims, citations, and verification results.
```typescript
// models.ts
export interface Citation {
  url: string;
  claim: string;
}

export interface VerificationResult {
  citation: Citation;
  status: 'VALID' | 'FABRICATED' | 'MISQUOTE' | 'SUBSTITUTION' | 'DRIFT';
  details: string;
}

export interface FaithfulnessReport {
  totalCitations: number;
  validCitations: number;
  errors: VerificationResult[];
  summary: {
    fabricated: number;
    misquote: number;
    substitution: number;
    drift: number;
  };
}
```
2. The Verification Engine
The core logic implements the tiered verification strategy. It first checks for fabricated URLs, then validates the claim against the source text.
```typescript
// verifier.ts
import { Citation, VerificationResult, FaithfulnessReport } from './models';

export class CitationVerifier {
  private retrievedUrls: Set<string>;
  private contextMap: Map<string, string>;

  constructor(retrievedUrls: string[], contextMap: Map<string, string>) {
    this.retrievedUrls = new Set(retrievedUrls);
    this.contextMap = contextMap;
  }

  public verify(citations: Citation[]): FaithfulnessReport {
    const results: VerificationResult[] = [];
    const summary = { fabricated: 0, misquote: 0, substitution: 0, drift: 0 };

    for (const citation of citations) {
      const result = this.verifySingle(citation);
      results.push(result);
      if (result.status !== 'VALID') {
        summary[result.status.toLowerCase() as keyof typeof summary]++;
      }
    }

    const errors = results.filter(r => r.status !== 'VALID');
    return {
      totalCitations: citations.length,
      validCitations: citations.length - errors.length,
      errors,
      summary
    };
  }

  private verifySingle(citation: Citation): VerificationResult {
    // Check 1: Fabricated URL
    if (!this.retrievedUrls.has(citation.url)) {
      return {
        citation,
        status: 'FABRICATED',
        details: 'URL not found in retrieved context.'
      };
    }

    const sourceText = this.contextMap.get(citation.url) || '';

    // Check 2: Claim support (semantic)
    if (!this.checkClaimSupport(sourceText, citation.claim)) {
      // Check 3: Substitution (is the claim supported by another URL?)
      if (this.checkSubstitution(citation.url, citation.claim)) {
        return {
          citation,
          status: 'SUBSTITUTION',
          details: 'Claim supported by a different retrieved URL.'
        };
      }
      return {
        citation,
        status: 'MISQUOTE',
        details: 'Claim not supported by cited URL.'
      };
    }

    // Check 4: Anchor-text drift (semantic nuance)
    if (this.checkDrift(sourceText, citation.claim)) {
      return {
        citation,
        status: 'DRIFT',
        details: 'Claim supported but phrasing drifts from source.'
      };
    }

    return {
      citation,
      status: 'VALID',
      details: 'Citation verified successfully.'
    };
  }

  private checkClaimSupport(text: string, claim: string): boolean {
    // In production, use an embedding model or LLM-as-a-judge
    // to determine whether the text supports the claim.
    return text.includes(claim) || this.semanticMatch(text, claim);
  }

  private checkSubstitution(citedUrl: string, claim: string): boolean {
    // Check whether the claim is supported by any OTHER retrieved URL.
    for (const url of this.retrievedUrls) {
      if (url !== citedUrl) {
        const text = this.contextMap.get(url) || '';
        if (this.checkClaimSupport(text, claim)) {
          return true;
        }
      }
    }
    return false;
  }

  private checkDrift(text: string, claim: string): boolean {
    // Detect subtle semantic shifts (e.g., "supports OAuth" vs "OAuth-compliant").
    // Requires fine-grained semantic analysis.
    return this.detectSemanticShift(text, claim);
  }

  // Mock semantic functions: replace with real implementations.
  private semanticMatch(text: string, claim: string): boolean {
    return false; // e.g., embedding similarity above a threshold
  }

  private detectSemanticShift(text: string, claim: string): boolean {
    return false; // e.g., entailment check between source and claim
  }
}
```
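For quick experimentation, the same tiered logic can be condensed into one self-contained function. This is a sketch, not the production class: the `supports` callback defaults to a literal substring test standing in for real semantic matching, and drift detection is omitted for brevity.

```typescript
// Condensed tiered check mirroring CitationVerifier.
type TierStatus = 'VALID' | 'FABRICATED' | 'MISQUOTE' | 'SUBSTITUTION';

function classifyCitation(
  url: string,
  claim: string,
  contextMap: Map<string, string>,
  supports: (text: string, claim: string) => boolean = (t, c) => t.includes(c)
): TierStatus {
  if (!contextMap.has(url)) return 'FABRICATED';              // Tier 1: URL not retrieved
  if (supports(contextMap.get(url)!, claim)) return 'VALID';  // Tier 2: cited source supports claim
  for (const [other, text] of contextMap) {                   // Tier 3: supported by another source?
    if (other !== url && supports(text, claim)) return 'SUBSTITUTION';
  }
  return 'MISQUOTE';
}

const ctx = new Map<string, string>([
  ["https://example.com/a", "The service supports OAuth 2.0 login."],
  ["https://example.com/b", "Rate limits are 100 requests per minute."],
]);

console.log(classifyCitation("https://example.com/a", "supports OAuth 2.0", ctx));                       // "VALID"
console.log(classifyCitation("https://example.com/a", "Rate limits are 100 requests per minute.", ctx)); // "SUBSTITUTION"
```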
3. Enforcement Policy
Define how the system should react to different error types.
```typescript
// policy.ts
import { VerificationResult } from './models';

export type EnforcementAction = 'BLOCK' | 'WARN' | 'PASS';

export class EnforcementPolicy {
  public determineAction(result: VerificationResult): EnforcementAction {
    switch (result.status) {
      case 'FABRICATED':
        return 'BLOCK'; // Hard failure
      case 'MISQUOTE':
        return 'BLOCK'; // High risk of misinformation
      case 'SUBSTITUTION':
        return 'WARN';  // Acceptable in some contexts, but flagged
      case 'DRIFT':
        return 'WARN';  // Requires human review for compliance
      case 'VALID':
        return 'PASS';
      default:
        return 'WARN';
    }
  }
}
```
Architecture Decisions and Rationale
- Modular Verification: By separating the verification logic from the enforcement policy, the system can be adapted to different use cases. A customer support bot might tolerate substitutions, while a legal research tool must block them.
- Tiered Checking: The verification process is ordered by cost and complexity. Fabricated URLs are checked first (cheap, deterministic). Semantic checks are performed only if the URL is valid, optimizing for performance.
- Context Map: Storing retrieved content in a map allows for efficient lookups during verification, avoiding repeated network requests.
- Semantic Fallbacks: The implementation includes placeholders for semantic matching. In production, this should be backed by a dedicated embedding model or a smaller LLM fine-tuned for fact-checking.
Pitfall Guide
1. Ignoring URL Substitution
Explanation: Teams often focus only on fabricated URLs and misquotes, assuming that if the URL is real and the claim is true, the citation is valid. However, URL substitution breaks the audit trail. If the model cites a canonical documentation page instead of the specific forum post that contained the answer, the user cannot verify the exact source of the information. Fix: Implement substitution detection by checking if the claim is supported by any other retrieved URL. Flag these as warnings or block them in regulated contexts.
2. Over-Reliance on Exact String Matching
Explanation: Using simple string matching (`text.includes(claim)`) to verify citations is insufficient. LLMs paraphrase content, so exact matches will miss many valid citations, producing false positives in misquote detection.
Fix: Use semantic similarity checks (embeddings) or an LLM-as-a-judge approach to determine if the source text supports the claim, even if the wording differs.
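One possible shape for such a check is cosine similarity over embeddings with a tunable threshold. The sketch below assumes that approach; the bag-of-words `embed` is a toy stand-in so the example runs without a model, and in production you would swap in a real embedding API.

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Toy bag-of-words "embedding" over a fixed vocabulary, standing in
// for a real embedding model.
const VOCAB = ["product", "supports", "oauth", "login", "rate", "limits"];
function embed(text: string): number[] {
  const words = text.toLowerCase().split(/\W+/);
  return VOCAB.map(v => words.filter(w => w === v).length);
}

// A claim counts as supported when similarity clears the threshold.
function semanticSupports(sourceText: string, claim: string, threshold = 0.75): boolean {
  return cosine(embed(sourceText), embed(claim)) >= threshold;
}

console.log(semanticSupports("The product supports OAuth login.", "The product supports OAuth")); // true (cosine ≈ 0.87)
```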
3. Neglecting Anchor-Text Drift
Explanation: Anchor-text drift is subtle. A model might change "The product supports OAuth" to "The product is OAuth-compliant." While similar, the latter is a stronger claim that might not be true. Automated tools often miss this because the URL is valid and the topic matches. Fix: Implement fine-grained semantic analysis to detect shifts in meaning. This may require a specialized model or human-in-the-loop review for critical claims.
4. Treating All Errors Equally
Explanation: Applying the same enforcement policy to all citation errors can lead to poor user experience. Blocking a response for a URL substitution might be unnecessary if the substituted URL is equally authoritative. Fix: Use a tiered enforcement policy. Block hard failures (fabricated URLs, misquotes) but warn on softer failures (substitutions, drift) where appropriate.
5. Failing to Handle Synthesis Claims
Explanation: When a model synthesizes information from multiple sources, it may cite only one of them. This is not necessarily a misquote, but it is incomplete attribution. Standard verification might flag this as an error. Fix: Allow for multi-citation claims. If a claim is supported by the union of multiple retrieved documents, ensure the model cites all relevant sources or explicitly indicates synthesis.
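One way to operationalize this is to accept a synthesized claim when every sentence of it is supported by at least one of its cited sources. A sketch, again with a literal substring test standing in for semantic support:

```typescript
// Hedged sketch: a synthesized claim is attributed if each of its
// sentences is supported by at least one cited source.
function synthesisSupported(
  claim: string,
  citedTexts: string[],
  supports: (text: string, part: string) => boolean = (t, p) => t.includes(p)
): boolean {
  // Split on sentence boundaries (whitespace after ., !, or ?).
  const parts = claim.split(/(?<=[.!?])\s+/).filter(p => p.length > 0);
  return parts.every(part => citedTexts.some(text => supports(text, part)));
}

const sources = [
  "OAuth 2.0 login is supported.",
  "Rate limits are 100 requests per minute.",
];

console.log(synthesisSupported(
  "OAuth 2.0 login is supported. Rate limits are 100 requests per minute.",
  sources
)); // true: each sentence is backed by one of the two sources
```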
6. Performance Bottlenecks
Explanation: Running semantic verification on every citation can introduce significant latency, especially for long responses with many citations. Fix: Optimize the verification pipeline. Use caching for repeated checks, parallelize verification where possible, and consider sampling citations for large responses if full verification is too costly.
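Caching can be as simple as memoizing the support check keyed on the (URL, claim) pair. A sketch (the key scheme is an assumption; a production system might hash normalized text instead):

```typescript
// Memoized wrapper around an expensive support check. The cache key
// joins URL and claim with a NUL separator to avoid collisions.
function memoizeSupport(
  check: (text: string, claim: string) => boolean
): (url: string, text: string, claim: string) => boolean {
  const cache = new Map<string, boolean>();
  return (url, text, claim) => {
    const key = `${url}\u0000${claim}`;
    const hit = cache.get(key);
    if (hit !== undefined) return hit;
    const result = check(text, claim);
    cache.set(key, result);
    return result;
  };
}

// Repeated verification of the same (url, claim) pair runs the
// underlying check only once.
let calls = 0;
const cached = memoizeSupport((t, c) => { calls++; return t.includes(c); });
cached("https://example.com/a", "OAuth is supported.", "OAuth");
cached("https://example.com/a", "OAuth is supported.", "OAuth");
console.log(calls); // 1
```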
7. Lack of Human Review Workflow
Explanation: Automated verification is not perfect. Edge cases and nuanced errors will slip through. Without a mechanism for human review, these errors can persist. Fix: Integrate a human-in-the-loop workflow for flagged citations. Provide reviewers with the claim, the cited text, and the verification result to facilitate quick decisions.
Production Bundle
Action Checklist
- Define Citation Schema: Establish a consistent format for claims and citations in your LLM prompts and response parsing.
- Implement URL Extraction: Build a parser to reliably extract URLs and associated claims from the model output.
- Set Up Context Map: Store retrieved documents in a fast-access structure (e.g., in-memory map or cache) for verification.
- Deploy Verification Engine: Integrate the `CitationVerifier` into your response pipeline.
- Configure Enforcement Policy: Define rules for blocking, warning, or passing based on your risk tolerance.
- Add Semantic Verification: Implement embedding-based or LLM-based claim support checks.
- Monitor Metrics: Track citation error rates by category to identify trends and model regressions.
- Establish Review Workflow: Create a process for human review of flagged citations.
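As a starting point for the extraction step in the checklist above, a regex-based parser for inline markdown citations of the form `claim. [label](url)` might look like this. The citation format is an assumption about your prompt template; adapt the pattern to whatever structure you instruct the model to emit.

```typescript
// Hypothetical extractor: pairs each sentence with the markdown link
// that immediately follows it.
function extractCitations(response: string): { url: string; claim: string }[] {
  const pattern = /([^.!?]*[.!?]?)\s*\[[^\]]*\]\((https?:\/\/[^)\s]+)\)/g;
  const out: { url: string; claim: string }[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(response)) !== null) {
    out.push({ claim: m[1].trim(), url: m[2] });
  }
  return out;
}

const sample =
  "OAuth is supported. [docs](https://example.com/a) Limits apply. [faq](https://example.com/b)";

// Yields one {claim, url} pair per citation in order of appearance.
console.log(extractCitations(sample));
```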
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Customer Support Bot | Block Fabricated/Misquote; Warn on Substitution/Drift | High volume, lower risk. Focus on preventing broken links and clear misinformation. | Low |
| Legal Research Tool | Block All Errors; Require Multi-Citation | Zero tolerance for errors. Audit trail is critical. | High |
| Internal Knowledge Base | Warn on All Errors; Allow Override | Balance accuracy with usability. Users can verify sources manually. | Medium |
| High-Stakes Financial Data | Block All Errors; Human Review Required | Compliance and regulatory requirements demand strict verification. | Very High |
Configuration Template
```yaml
# citation-verification-config.yaml
verification:
  strategy: "tiered"
  semantic_threshold: 0.75
  drift_sensitivity: "high"

enforcement:
  policies:
    - error_type: "FABRICATED"
      action: "BLOCK"
    - error_type: "MISQUOTE"
      action: "BLOCK"
    - error_type: "SUBSTITUTION"
      action: "WARN"
    - error_type: "DRIFT"
      action: "WARN"

monitoring:
  metrics:
    - "citation_error_rate"
    - "error_distribution"
    - "verification_latency"
  alerts:
    - threshold: 0.05
      metric: "citation_error_rate"
      action: "notify_engineering"
```
Quick Start Guide
- Install Dependencies: Ensure you have the necessary libraries for text processing and semantic analysis (e.g., `@langchain/core`, `@tensorflow-models/universal-sentence-encoder`).
- Define Models: Copy the `models.ts` definitions into your project.
- Implement Verifier: Use the `verifier.ts` code as a starting point. Replace the mock semantic functions with actual implementations.
- Configure Policy: Set up your enforcement rules based on your use case.
- Integrate: Add the verification step to your response pipeline. Test with a sample of queries to validate the logic.
- Deploy: Roll out the verification layer and monitor metrics. Adjust thresholds and policies as needed.
