bration:** Require the model to assign a confidence level to each claim based on the clarity and recency of the source text. Low-confidence items should trigger human review.
4. **Unknown Declaration:** The system must explicitly list what cannot be determined from the provided sources, preventing the model from filling gaps with inference.
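As a minimal sketch of the confidence rule above, the snippet below splits claims into an accepted set and a human-review queue. The `ScoredClaim` shape and the `routeForReview` name are assumptions made for illustration; only the three confidence levels come from the text.

```typescript
// Hypothetical shapes for illustration; only the confidence levels come from the text.
type Confidence = 'High' | 'Medium' | 'Low';

interface ScoredClaim {
  text: string;
  confidence: Confidence;
}

interface ReviewRouting {
  accepted: ScoredClaim[];
  needsHumanReview: ScoredClaim[];
}

// Send every Low-confidence claim to a human reviewer; keep the rest.
function routeForReview(claims: ScoredClaim[]): ReviewRouting {
  const routing: ReviewRouting = { accepted: [], needsHumanReview: [] };
  for (const claim of claims) {
    if (claim.confidence === 'Low') {
      routing.needsHumanReview.push(claim);
    } else {
      routing.accepted.push(claim);
    }
  }
  return routing;
}
```

Whether Medium-confidence items should also be queued is a policy choice; this sketch routes only Low.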
Implementation Example
The following TypeScript implementation demonstrates a source-grounded research engine. This code defines the data structures and a validation function that enforces the constraints derived from the source material.
```typescript
// Core types enforcing source-awareness
type FilingType = '10-K' | '10-Q' | '8-K' | 'S-1' | 'Earnings-Transcript' | 'News';

interface SourceCitation {
  filingType: FilingType;
  filingDate: string;   // ISO 8601 format
  exactExcerpt: string; // Must match source text
  relevance: string;    // Why this excerpt matters
  confidence: 'High' | 'Medium' | 'Low';
}

interface FinancialClaim {
  metric: string;
  value: string;
  period: string;
  citations: SourceCitation[];
  isManagementNarrative: boolean;
}

interface ResearchSnapshot {
  businessModel: string;
  revenueStreams: string[];
  concentrationRisks: string[];
  keyChanges: FinancialClaim[];
  unknowns: string[];
  citations: SourceCitation[];
}

// Result returned by the validator
interface ValidationResult {
  isValid: boolean;
  errors: string[];
}

// Placeholder for whatever LLM client you use. The name and the generate<T>
// signature are assumptions for this example, not a real library API.
declare const llmClient: {
  generate<T>(prompt: string): Promise<T>;
};

// Validation logic to ensure compliance
function validateResearchOutput(snapshot: ResearchSnapshot): ValidationResult {
  const errors: string[] = [];

  // Check for missing citations on key changes
  snapshot.keyChanges.forEach(change => {
    if (change.citations.length === 0) {
      errors.push(`Missing citation for metric: ${change.metric}`);
    }
    // Reject excerpts that are absent or too short to verify
    change.citations.forEach(cit => {
      if (!cit.exactExcerpt || cit.exactExcerpt.length < 20) {
        errors.push(`Excerpt too short or missing for ${cit.filingType} on ${cit.filingDate}`);
      }
    });
  });

  // Ensure unknowns are declared
  if (snapshot.unknowns.length === 0) {
    errors.push('Research must explicitly list unknowns or areas requiring human review.');
  }

  return {
    isValid: errors.length === 0,
    errors
  };
}

// Example usage of the constrained extraction
async function generateCompanySnapshot(
  ticker: string,
  filingTexts: string[]
): Promise<ResearchSnapshot> {
  // In production, this would invoke an LLM with a system prompt enforcing
  // the ResearchSnapshot schema and requiring exact excerpts.
  const prompt = `
You are a source-aware equity research assistant.
Analyze the provided filings for ${ticker}.
OUTPUT REQUIREMENTS:
1. Return a JSON object matching the ResearchSnapshot interface.
2. Every claim in keyChanges must include at least one SourceCitation.
3. Citations must include the exact text excerpt from the filing.
4. Mark management narrative claims separately from reported numbers.
5. List all unknowns that cannot be resolved from the provided text.
6. Do not provide investment advice or price targets.
FILINGS:
${filingTexts.join('\n---\n')}
`;

  // LLM invocation would occur here with JSON schema enforcement
  const result = await llmClient.generate<ResearchSnapshot>(prompt);

  // Reject the output outright if any constraint is violated
  const validation = validateResearchOutput(result);
  if (!validation.isValid) {
    throw new Error(`Validation failed: ${validation.errors.join(', ')}`);
  }
  return result;
}
```
Rationale for Choices
- `exactExcerpt` Requirement: Forcing the model to output the exact text snippet allows for programmatic verification. You can normalize whitespace and check that the excerpt appears as a substring of the raw filing text; an excerpt that does not appear was paraphrased or hallucinated.
- `isManagementNarrative` Flag: Management's Discussion and Analysis (MD&A) sections often contain forward-looking statements or adjusted metrics. Tagging these separately lets the researcher distinguish reported GAAP figures from management spin.
- `unknowns` Array: This prevents the model from hallucinating answers to fill gaps. In financial research, knowing what is missing is as valuable as knowing what is present.
- Validation Layer: The `validateResearchOutput` function acts as a guardrail. If the model fails to cite a claim or omits unknowns, the pipeline rejects the output, forcing a retry or human intervention.
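One way to perform the programmatic verification of `exactExcerpt` is a whitespace-normalized substring check against the raw filing text. The sketch below is a minimal version under that assumption; `normalizeWhitespace` and `excerptAppearsInFiling` are illustrative names, and the sample filing sentence is made up.

```typescript
// Collapse whitespace runs so line wrapping in the filing does not break matching.
function normalizeWhitespace(text: string): string {
  return text.replace(/\s+/g, ' ').trim();
}

// True only if the excerpt appears verbatim (modulo whitespace) in the filing text.
function excerptAppearsInFiling(excerpt: string, filingText: string): boolean {
  const needle = normalizeWhitespace(excerpt);
  if (needle.length === 0) return false; // an empty excerpt verifies nothing
  return normalizeWhitespace(filingText).includes(needle);
}

// Made-up sample text, wrapped across two lines as extracted filings often are.
const sampleFiling = 'Total revenue increased 12%\n  to $4.1 billion for the fiscal year.';
console.log(excerptAppearsInFiling('increased 12% to $4.1 billion', sampleFiling)); // true
console.log(excerptAppearsInFiling('increased 15% to $4.1 billion', sampleFiling)); // false
```

A check like this could run inside the validation layer for each citation, provided the pipeline keeps the raw text of each filing alongside its type and date.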
Pitfall Guide
Even with structured prompts, specific failure modes can compromise research integrity. The following pitfalls are common in production environments and require explicit mitigation.
- Period Mismatch Confusion
  - Explanation: The model mixes quarterly (10-Q) and annual (10-K) data, or compares Q3 2023 against Q4 2023 without labeling the periods. This leads to incorrect growth calculations.
  - Fix: Enforce explicit period tagging in all comparisons. Require the model to state the exact fiscal period for every metric. Use prompts that explicitly instruct: "Flag if metrics are not comparable due to period differences."
- EBITDA vs. Cash Flow Conflation
  - Explanation: Models often treat Adjusted EBITDA as a proxy for cash generation. This ignores capital expenditures, working capital changes, and debt service, which are critical for liquidity assessment.
  - Fix: In liquidity checks, explicitly request Operating Cash Flow and Free Cash Flow. Add a constraint: "Do not equate Adjusted EBITDA with cash flow. Extract cash flow from the statement of cash flows."
- Dilution Blindness
  - Explanation: The model focuses on revenue and margin but ignores changes in share count, ATM programs, or warrant exercises. Dilution can significantly impact per-share value even if top-line growth is strong.
  - Fix: Include a dedicated dilution checklist in the workflow. Require extraction of share count changes, shelf registrations, and recent financing activities. Prompt: "Check for share count changes, ATM programs, and convertible instruments."
- Social Sentiment Contamination
  - Explanation: The model incorporates Reddit, StockTwits, or news rumors as evidence of business quality. Crowd excitement is attention context, not fundamental proof.
  - Fix: Isolate crowd data in a separate "Attention Context" section. Add a rule: "Do not treat social sentiment as evidence of business performance. Label crowd discussion clearly as unverified chatter."
- Boilerplate Risk Acceptance
  - Explanation: The model lists generic risk factors (e.g., "competition," "economic downturn") without assessing whether they are company-specific or newly intensified. This lowers the signal-to-noise ratio.
  - Fix: Implement risk triage. Require the model to rank risks by directness to business performance and mark risks as "Generic," "Company-Specific," or "Newly Intensified." Prompt: "Do not list generic boilerplate risks unless they have changed significantly."
- Citation Drift
  - Explanation: The model cites a filing type and date but provides an excerpt that does not match the source text, or the excerpt is too vague to verify.
  - Fix: Enforce minimum excerpt length and require exact string matching. Use the validation layer to reject outputs where excerpts are paraphrased or missing.
- Narrative vs. Number Gap
  - Explanation: Management claims "strong growth" or "market leadership," but the underlying numbers do not support these assertions. The model may accept the narrative without auditing it against the data.
  - Fix: Implement a management claim audit. Create a section that lists claims supported by numbers, claims not yet proven, and claims needing context. Prompt: "Audit management narrative against reported metrics. Flag discrepancies."
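Some of these fixes can be enforced in code rather than left to the prompt. As a hedged sketch of the period-mismatch fix, the snippet below refuses to compute growth across non-comparable periods. The `FiscalPeriod` and `PeriodMetric` shapes and both function names are assumptions for illustration; the original schema stores `period` as a free-form string.

```typescript
// Hypothetical structured period label; field names are assumptions for this sketch.
interface FiscalPeriod {
  fiscalYear: number;
  // 'FY' for annual (10-K) figures, 'Q1'..'Q4' for quarterly (10-Q) figures.
  span: 'FY' | 'Q1' | 'Q2' | 'Q3' | 'Q4';
}

interface PeriodMetric {
  metric: string;
  value: number;
  period: FiscalPeriod;
}

// Returns a warning when two data points should not be compared directly, else null.
function comparabilityWarning(current: PeriodMetric, prior: PeriodMetric): string | null {
  if (current.metric !== prior.metric) {
    return `Different metrics: ${current.metric} vs ${prior.metric}`;
  }
  const currentIsAnnual = current.period.span === 'FY';
  const priorIsAnnual = prior.period.span === 'FY';
  if (currentIsAnnual !== priorIsAnnual) {
    return `Annual and quarterly figures mixed for ${current.metric}`;
  }
  return null;
}

// Year-over-year growth, refusing to compute across mismatched periods.
function yearOverYearGrowth(current: PeriodMetric, prior: PeriodMetric): number {
  const warning = comparabilityWarning(current, prior);
  if (warning) throw new Error(warning);
  if (
    current.period.span !== prior.period.span ||
    current.period.fiscalYear !== prior.period.fiscalYear + 1
  ) {
    throw new Error('Periods are not the same span one fiscal year apart');
  }
  if (prior.value === 0) throw new Error('Prior value is zero; growth is undefined');
  return (current.value - prior.value) / Math.abs(prior.value);
}
```

With this in place, a Q3 2023 versus Q4 2023 comparison throws instead of silently producing a "growth" figure, which is the failure described in the first pitfall.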
Production Bundle
Action Checklist
Use this checklist to validate your research pipeline before deployment.
Decision Matrix
Select the appropriate approach based on your research requirements and resource constraints.
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| High-Volume Screening | Automated extraction with validation | Speed and consistency; validation catches major errors | Low API cost; high throughput |
| Deep Dive Analysis | Human-in-the-loop with source-grounded AI | Complex filings require nuanced judgment; AI handles extraction | Higher human cost; moderate API cost |
| Risk-Heavy Sectors | Strict risk triage + liquidity audit | Regulatory and financing risks require precise extraction | Moderate API cost; requires specialized prompts |
| Social-Driven Tickers | Isolated crowd context + filing verification | Prevents sentiment bias; ensures fundamental grounding | Low API cost; requires crowd data ingestion |
Configuration Template
Use this JSON configuration to define the constraints for your research pipeline. This template can be loaded dynamically to adjust behavior based on the research stage.
```json
{
  "pipelineConfig": {
    "version": "2.0",
    "constraints": {
      "requireExactExcerpts": true,
      "minExcerptLength": 20,
      "enforcePeriodTagging": true,
      "separateNarrativeFromFacts": true,
      "requireUnknownDeclaration": true,
      "prohibitedOutputs": [
        "price_targets",
        "buy_sell_recommendations",
        "return_predictions"
      ]
    },
    "riskTriage": {
      "rankByDirectness": true,
      "filterGenericBoilerplate": true,
      "flagNewlyIntensified": true
    },
    "liquidityCheck": {
      "checkDilution": true,
      "checkATMPrograms": true,
      "checkShelfRegistrations": true,
      "checkGoingConcern": true
    },
    "validation": {
      "rejectMissingCitations": true,
      "rejectPeriodMismatches": true,
      "rejectSocialSentimentAsFact": true
    }
  }
}
```
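Loading the template dynamically could look like the sketch below, which parses the JSON and fails fast on missing or mistyped fields. It checks only a subset of the template, and the `PipelineConfig` interface and `parsePipelineConfig` name are assumptions for illustration rather than part of any existing library.

```typescript
// Subset of the template's fields that this sketch checks; extend as needed.
interface PipelineConstraints {
  requireExactExcerpts: boolean;
  minExcerptLength: number;
  prohibitedOutputs: string[];
}

interface PipelineConfig {
  version: string;
  constraints: PipelineConstraints;
}

// Parse the JSON template and fail fast on missing or mistyped fields.
function parsePipelineConfig(json: string): PipelineConfig {
  const root = JSON.parse(json) as { pipelineConfig?: Partial<PipelineConfig> } | null;
  const cfg = root?.pipelineConfig;
  if (!cfg || typeof cfg.version !== 'string' || !cfg.constraints) {
    throw new Error('pipelineConfig.version and pipelineConfig.constraints are required');
  }
  const c = cfg.constraints;
  if (typeof c.requireExactExcerpts !== 'boolean') {
    throw new Error('constraints.requireExactExcerpts must be a boolean');
  }
  if (!Number.isInteger(c.minExcerptLength) || c.minExcerptLength < 1) {
    throw new Error('constraints.minExcerptLength must be a positive integer');
  }
  if (
    !Array.isArray(c.prohibitedOutputs) ||
    !c.prohibitedOutputs.every(item => typeof item === 'string')
  ) {
    throw new Error('constraints.prohibitedOutputs must be an array of strings');
  }
  return { version: cfg.version, constraints: c };
}
```

The parsed `minExcerptLength` could then replace the hard-coded `20` in the validator, so the threshold can change per research stage without a code change.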
Quick Start Guide
Get a source-grounded research workflow running in under five minutes.
- Define Your Schema: Create TypeScript interfaces for `SourceCitation`, `FinancialClaim`, and your research output types. Ensure all required fields are marked as non-optional.
- Ingest Primary Sources: Load SEC filings (10-K, 10-Q, 8-K) into your context window. Ensure text is clean and metadata (filing date, type) is attached.
- Construct Constrained Prompts: Write prompts that explicitly require the output schema, exact excerpts, and confidence levels. Include rules to separate narrative from facts and declare unknowns.
- Implement Validation: Add a validation function that checks for missing citations, period mismatches, and prohibited outputs. Integrate this into your pipeline to reject non-compliant results.
- Run and Review: Execute the pipeline on a test ticker. Verify that all claims are cited, unknowns are listed, and crowd context is isolated. Iterate on prompts based on validation failures.
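The prohibited-output check in the validation step can start as a simple pattern scan over the generated text. The sketch below is one possible starting point; the regular expressions are illustrative assumptions keyed to the category names used in the configuration template, and a production list would need to be far broader.

```typescript
// Illustrative patterns only, keyed by the category names from the config template.
const PROHIBITED_PATTERNS: Record<string, RegExp> = {
  price_targets: /\bprice target\b/i,
  buy_sell_recommendations: /\b(strong )?(buy|sell|hold) (rating|recommendation)\b/i,
  return_predictions: /\bexpected (return|upside)\b/i,
};

// Returns the key of every prohibited category whose pattern appears in the text.
function findProhibitedOutputs(text: string): string[] {
  return Object.entries(PROHIBITED_PATTERNS)
    .filter(([, pattern]) => pattern.test(text))
    .map(([category]) => category);
}
```

Any non-empty result can be appended to the validator's `errors` array so the pipeline rejects the output the same way it rejects a missing citation.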