ires separating data assembly from language model inference. The architecture should treat context construction as a deterministic engineering problem, not a probabilistic one.
Step 1: Define a Bounded Workflow Scope
Start with a single, measurable use case. Instead of deploying an "enterprise copilot," target a specific operational task: resolving account status, retrieving contract renewal dates, or summarizing open support tickets. Bounded scopes prevent scope creep, establish clear success metrics, and allow iterative refinement of the data pipeline without blocking other teams.
Step 2: Implement a Thin Canonical Layer
Create a resolved view for the entities your workflow touches. This is not a full data warehouse. It is a materialized mapping table that links external identifiers to a stable internal ID. The layer should include:
- A primary canonical identifier
- Cross-reference mappings to source system keys
- Field ownership declarations (which system is authoritative for which attribute)
- Update timestamps for freshness tracking
This layer acts as the single source of truth for entity resolution. When a query arrives, the system translates the user's input into the canonical ID, then routes field requests to the appropriate source systems.
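The four elements listed above can be sketched as a small in-memory index. This is an illustration under stated assumptions, not a prescribed schema: `CanonicalRecord`, `CanonicalIndex`, the `acct-001` ID, and the field names are hypothetical, and a real deployment would more likely back this with a materialized view or database table.

```typescript
// Hypothetical shape of one row in the canonical mapping table.
interface CanonicalRecord {
  canonicalId: string;                 // primary canonical identifier
  sourceKeys: Record<string, string>;  // source system name -> key in that system
  fieldOwners: Record<string, string>; // attribute -> authoritative source system
  updatedAt: Date;                     // freshness tracking
}

class CanonicalIndex {
  private byCanonicalId = new Map<string, CanonicalRecord>();
  private bySourceKey = new Map<string, string>(); // "system:key" -> canonicalId

  upsert(record: CanonicalRecord): void {
    this.byCanonicalId.set(record.canonicalId, record);
    for (const [system, key] of Object.entries(record.sourceKeys)) {
      this.bySourceKey.set(`${system}:${key}`, record.canonicalId);
    }
  }

  // Translate any known source key into the canonical record.
  resolve(system: string, key: string): CanonicalRecord | undefined {
    const id = this.bySourceKey.get(`${system}:${key}`);
    return id === undefined ? undefined : this.byCanonicalId.get(id);
  }

  // Decide which system owns a field and which key to use when asking it.
  route(record: CanonicalRecord, field: string): { system: string; key: string } | undefined {
    const system = record.fieldOwners[field];
    if (system === undefined) return undefined;
    const key = record.sourceKeys[system];
    return key === undefined ? undefined : { system, key };
  }
}

const canonicalIndex = new CanonicalIndex();
canonicalIndex.upsert({
  canonicalId: 'acct-001',
  sourceKeys: { billing: 'cus_J4k2', crm: '0014x' },
  fieldOwners: { payment_state: 'billing', renewal_date: 'crm' },
  updatedAt: new Date(),
});
const resolvedRecord = canonicalIndex.resolve('crm', '0014x');
```

Keeping field ownership next to the key mappings lets the router answer "who is authoritative for this attribute?" with a lookup rather than a convention.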
Step 3: Build a Permission-Aware Retrieval Router
The retrieval router must inherit the requesting user's identity and access scope. It should never use a blanket service account for data fetching. Instead, it propagates the user's credentials or session tokens to each downstream system, or applies a policy engine that filters results based on the user's role. The router should also enforce schema alignment by translating source field names into a unified context vocabulary before assembly.
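Where credentials cannot be propagated, the policy-engine option might look like the following field-level filter. It is a sketch only: `FieldPolicy`, `filterByRole`, and the sample field and role names are assumptions for illustration, and the deny-by-default rule for fields without a policy is a design choice rather than something the approach requires.

```typescript
// Hypothetical policy: canonical field name -> roles allowed to read it.
type FieldPolicy = Record<string, string[]>;

function filterByRole(
  record: Record<string, unknown>,
  policy: FieldPolicy,
  userRoles: string[]
): { allowed: Record<string, unknown>; filtered: string[] } {
  const allowed: Record<string, unknown> = {};
  const filtered: string[] = [];
  for (const [field, value] of Object.entries(record)) {
    const roles = policy[field] ?? []; // fields with no policy are denied by default
    if (roles.some(role => userRoles.includes(role))) {
      allowed[field] = value;
    } else {
      filtered.push(field); // keep the name so the exclusion can be traced
    }
  }
  return { allowed, filtered };
}

const fieldPolicy: FieldPolicy = {
  subscription_status: ['finance', 'admin'],
  open_ticket_status: ['agent', 'admin'],
};
const agentView = filterByRole(
  { subscription_status: 'active', open_ticket_status: 'open', internal_notes: 'n/a' },
  fieldPolicy,
  ['agent']
);
// agentView.allowed  -> { open_ticket_status: 'open' }
// agentView.filtered -> ['subscription_status', 'internal_notes']
```

Returning the filtered field names alongside the allowed values means the exclusion is visible in the retrieval trace instead of disappearing silently.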
Step 4: Instrument Retrieval and Generation Separately
Log every record fetched, filtered, or excluded during context construction. Store retrieval traces independently from model outputs. This separation enables precise failure diagnosis: if the right records were retrieved but the answer is wrong, the issue is reasoning or prompt structure. If the records are missing, stale, or incorrectly filtered, the issue is data integration.
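The diagnosis rule in this step can be written down directly. The sketch below assumes a trace with `fetched`, `filtered`, and `stale` buckets and a caller-supplied list of sources the answer depended on; `classifyFailure` and `FailureClass` are hypothetical names.

```typescript
// Hypothetical trace shape: one bucket per retrieval outcome.
interface RetrievalTrace {
  fetched: string[];
  filtered: string[];
  stale: string[];
}

type FailureClass = 'none' | 'data-integration' | 'reasoning-or-prompt';

function classifyFailure(
  trace: RetrievalTrace,
  requiredSources: string[],
  answerCorrect: boolean
): FailureClass {
  if (answerCorrect) return 'none';
  // Any required source that was not fetched (missing, stale, or filtered)
  // points at the data pipeline rather than the model.
  const missing = requiredSources.filter(source => !trace.fetched.includes(source));
  return missing.length > 0 ? 'data-integration' : 'reasoning-or-prompt';
}
```

For example, a wrong answer with `fetched: ['billing']` and `requiredSources: ['billing', 'support']` classifies as `data-integration`, while a wrong answer with both sources fetched classifies as `reasoning-or-prompt`.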
Implementation Example (TypeScript)
The following example demonstrates a permission-aware retrieval router with canonical resolution and explicit tracing. Entity lookups and source fetches are simulated.
```typescript
import { v4 as uuidv4 } from 'uuid';

interface DataSourceConfig {
  name: string;
  endpoint: string;
  authStrategy: 'user-propagated' | 'service-key';
  // Maps source-system field names to canonical field names.
  fieldMapping: Record<string, string>;
  maxAgeSeconds: number;
}

interface ResolvedEntity {
  canonicalId: string;
  sourceKeys: Record<string, string>;
  lastSynced: Date;
}

interface RetrievalContext {
  userId: string;
  requestedFields: string[];
  traceId: string;
}

interface RetrievalResult {
  context: Record<string, unknown>;
  trace: {
    fetched: string[];
    filtered: string[];
    stale: string[];
  };
}

class PermissionPolicyEngine {
  constructor(private userRoles: string[]) {}

  // `field` is accepted so that field-level rules can be added later;
  // this example only checks access at the resource level.
  canAccess(resource: string, field: string): boolean {
    const policyMap: Record<string, string[]> = {
      billing: ['finance', 'admin'],
      support: ['agent', 'admin'],
      crm: ['sales', 'admin'],
    };
    const allowedRoles = policyMap[resource] || [];
    return allowedRoles.some(role => this.userRoles.includes(role));
  }
}

class RetrievalRouter {
  private entityCache: Map<string, ResolvedEntity> = new Map();
  private traceStore: Map<string, RetrievalResult['trace']> = new Map();

  constructor(
    private sources: DataSourceConfig[],
    private policyEngine: PermissionPolicyEngine
  ) {}

  async resolveEntity(inputKey: string): Promise<ResolvedEntity> {
    const cached = this.entityCache.get(inputKey);
    if (cached) return cached;

    // Simulate cross-system lookup and canonical mapping
    const resolved: ResolvedEntity = {
      canonicalId: uuidv4(),
      sourceKeys: {
        billing: `bill_${inputKey}`,
        support: `sup_${inputKey}`,
        crm: `crm_${inputKey}`,
      },
      lastSynced: new Date(),
    };
    this.entityCache.set(inputKey, resolved);
    return resolved;
  }

  async assembleContext(
    entityKey: string,
    context: RetrievalContext
  ): Promise<RetrievalResult> {
    const entity = await this.resolveEntity(entityKey);
    const trace: RetrievalResult['trace'] = { fetched: [], filtered: [], stale: [] };
    const assembled: Record<string, unknown> = {};

    for (const source of this.sources) {
      const sourceKey = entity.sourceKeys[source.name];
      if (!sourceKey) continue;

      // Permission gate: skip the source unless every requested field is allowed
      const hasAccess = context.requestedFields.every(field =>
        this.policyEngine.canAccess(source.name, field)
      );
      if (!hasAccess) {
        trace.filtered.push(`${source.name}:${context.requestedFields.join(',')}`);
        continue;
      }

      // Freshness check against the source's maximum allowed age
      const isStale =
        (Date.now() - entity.lastSynced.getTime()) / 1000 > source.maxAgeSeconds;
      if (isStale) {
        trace.stale.push(source.name);
        continue;
      }

      // Simulate data fetch with schema alignment
      const rawData = await this.fetchFromSource(source, sourceKey);
      const alignedData = this.alignSchema(rawData, source.fieldMapping);
      Object.assign(assembled, alignedData);
      trace.fetched.push(source.name);
    }

    this.traceStore.set(context.traceId, trace);
    return { context: assembled, trace };
  }

  private async fetchFromSource(
    source: DataSourceConfig,
    key: string
  ): Promise<Record<string, unknown>> {
    // In production: authenticated HTTP/GraphQL call with user token propagation.
    // The stub returns a record keyed by the source system's own field names
    // (`status`, `tier`) so that alignSchema can translate them.
    return { status: 'active', tier: 'enterprise' };
  }

  // Rename source fields to canonical names; unmapped fields are dropped.
  private alignSchema(
    raw: Record<string, unknown>,
    mapping: Record<string, string>
  ): Record<string, unknown> {
    const aligned: Record<string, unknown> = {};
    for (const [sourceField, canonicalField] of Object.entries(mapping)) {
      if (raw[sourceField] !== undefined) {
        aligned[canonicalField] = raw[sourceField];
      }
    }
    return aligned;
  }

  getTrace(traceId: string): RetrievalResult['trace'] | undefined {
    return this.traceStore.get(traceId);
  }
}
```
Architecture Rationale
- Separation of Concerns: Context assembly is deterministic. Language model inference is probabilistic. Mixing them creates untraceable failure modes. The router handles data topology; the model handles reasoning.
- Permission Propagation: Access control must be evaluated before context construction. Filtering at the model level is unreliable and insecure. The policy engine applies access rules deterministically, so data the user may not see never reaches the prompt.
- Schema Alignment: Field mapping occurs at retrieval time, not prompt time. This prevents semantic collision and ensures the model receives a consistent vocabulary regardless of source system variations.
- Traceability: Retrieval logs are stored separately from generation logs. This enables precise failure classification and reduces mean time to resolution.
Pitfall Guide
1. The Super Service Account Trap
Explanation: Using a single high-privilege service account to query all data sources bypasses application-level access controls. The agent can then retrieve data that the requesting user is not authorized to access, whether for legal or operational reasons.
Fix: Propagate the end-user's session token or JWT to each downstream system. If direct propagation is impossible, implement a policy engine that filters results based on the user's role before context assembly.
2. Semantic Collision in Field Names
Explanation: Multiple systems use identical field names for different concepts. Feeding status from billing, support, and CRM into the same prompt causes the model to conflate payment state, ticket lifecycle, and deal stage.
Fix: Enforce a canonical vocabulary at the retrieval layer. Map source fields to unified names during assembly. Never pass raw source schemas directly to the model.
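As a rough illustration of the fix, the sketch below maps a shared `status` field from three systems onto distinct canonical names before anything is placed in the prompt. The canonical names (`payment_state`, `ticket_state`, `deal_stage`), the sample values, and the `toCanonical` helper are assumptions made for this example.

```typescript
type FieldMapping = Record<string, string>; // source field -> canonical field

// Hypothetical per-system mappings; every system calls its field `status`.
const statusMappings: Record<string, FieldMapping> = {
  billing: { status: 'payment_state' },
  support: { status: 'ticket_state' },
  crm: { status: 'deal_stage' },
};

function toCanonical(system: string, raw: Record<string, unknown>): Record<string, unknown> {
  const mapping = statusMappings[system] ?? {};
  const out: Record<string, unknown> = {};
  for (const [sourceField, canonicalField] of Object.entries(mapping)) {
    if (raw[sourceField] !== undefined) out[canonicalField] = raw[sourceField];
  }
  return out; // unmapped source fields are dropped, never passed through raw
}

const promptContext: Record<string, unknown> = {
  ...toCanonical('billing', { status: 'past_due' }),
  ...toCanonical('support', { status: 'open' }),
  ...toCanonical('crm', { status: 'negotiation' }),
};
// promptContext holds three distinct keys instead of one overwritten `status`.
```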
3. Nightly Sync Staleness
Explanation: Vector indexes or cached tables rebuilt on a fixed schedule return outdated information. For operational queries, stale data is functionally equivalent to incorrect data.
Fix: Classify fields by freshness requirements. Route real-time fields through direct API calls. Cache only static or slowly changing attributes. Implement TTL-based invalidation with explicit staleness flags in retrieval traces.
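One way to express this classification is a small planner that decides, per field, whether to read from cache or go to the source. This is a sketch under assumptions: `FieldFreshness`, `ReadPlan`, and `planRead` are hypothetical names, and treating a TTL of 0 as "always fetch live" is a convention chosen for the example.

```typescript
interface FieldFreshness {
  ttlSeconds: number; // 0 marks a real-time field that is always fetched live
}

type ReadPlan =
  | { action: 'fetch-live'; reason: 'real-time' | 'no-cache' | 'expired' }
  | { action: 'use-cache'; ageSeconds: number };

function planRead(
  policy: FieldFreshness,
  cachedAtMs: number | undefined,
  nowMs: number = Date.now()
): ReadPlan {
  if (policy.ttlSeconds === 0) return { action: 'fetch-live', reason: 'real-time' };
  if (cachedAtMs === undefined) return { action: 'fetch-live', reason: 'no-cache' };
  const ageSeconds = (nowMs - cachedAtMs) / 1000;
  // An expired entry is worth recording as a staleness flag in the retrieval trace.
  return ageSeconds > policy.ttlSeconds
    ? { action: 'fetch-live', reason: 'expired' }
    : { action: 'use-cache', ageSeconds };
}
```

The `reason` on a live fetch gives the trace enough detail to tell a cold cache apart from an expired one when tuning TTLs.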
4. Unbounded Workflow Scope
Explanation: Attempting to unify the entire data estate before shipping any AI feature creates massive integration debt and delays measurable value.
Fix: Scope to a single workflow with clear success criteria. Build a thin canonical layer only for the entities that workflow requires. Expand incrementally after validating retrieval accuracy and permission enforcement.
5. Silent Retrieval Failures
Explanation: The agent returns an answer without logging which records were fetched, filtered, or excluded. When the answer is wrong, debugging requires guesswork.
Fix: Instrument every retrieval step. Log fetched sources, filtered fields, stale flags, and canonical IDs. Store traces in a structured format queryable by trace ID. Separate retrieval logs from generation logs.
6. Over-Engineering the Canonical Model
Explanation: Building a full data warehouse or complex event pipeline before the workflow proves viable introduces unnecessary latency and maintenance overhead.
Fix: Start with a materialized view or lightweight mapping table. Add event-driven synchronization only after validating that batch updates cannot meet freshness requirements.
7. Prompt-Driven Entity Resolution
Explanation: Relying on the model to recognize that cus_J4k2 and 0014x refer to the same customer is fragile. Language models are not deterministic matchers and will hallucinate connections or miss edge cases.
Fix: Resolve entities deterministically before prompt construction. Use a dedicated resolution service with explicit mapping rules, fuzzy matching thresholds, and human-in-the-loop override capabilities for ambiguous cases.
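A resolution service along these lines might try explicit mappings first, fall back to a thresholded fuzzy match, and send anything ambiguous to review. The sketch below is illustrative only: the bigram Dice similarity, the 0.85 threshold, and all names (`resolveCustomer`, `Candidate`, `Resolution`) are assumptions, not a recommended matcher.

```typescript
interface Candidate { canonicalId: string; displayName: string }

type Resolution =
  | { kind: 'matched'; canonicalId: string; method: 'exact' | 'fuzzy'; score: number }
  | { kind: 'needs-review'; candidates: Candidate[] };

// Character bigrams of the lowercased, alphanumeric-only string.
function bigrams(text: string): Set<string> {
  const norm = text.toLowerCase().replace(/[^a-z0-9]/g, '');
  const grams = new Set<string>();
  for (let i = 0; i < norm.length - 1; i++) grams.add(norm.slice(i, i + 2));
  return grams;
}

// Dice coefficient over bigram sets, in the range [0, 1].
function dice(a: string, b: string): number {
  const ga = bigrams(a);
  const gb = bigrams(b);
  if (ga.size === 0 || gb.size === 0) return 0;
  let overlap = 0;
  ga.forEach(gram => { if (gb.has(gram)) overlap++; });
  return (2 * overlap) / (ga.size + gb.size);
}

function resolveCustomer(
  input: { system: string; key: string; displayName: string },
  exactIndex: Map<string, string>, // "system:key" -> canonicalId
  candidates: Candidate[],
  threshold = 0.85
): Resolution {
  // 1. Explicit mapping rules win outright.
  const exact = exactIndex.get(`${input.system}:${input.key}`);
  if (exact !== undefined) {
    return { kind: 'matched', canonicalId: exact, method: 'exact', score: 1 };
  }
  // 2. Fuzzy match, accepted only when exactly one candidate clears the threshold.
  const scored = candidates
    .map(candidate => ({ candidate, score: dice(input.displayName, candidate.displayName) }))
    .sort((x, y) => y.score - x.score);
  const best = scored[0];
  const runnerUp = scored[1];
  if (best && best.score >= threshold && (!runnerUp || runnerUp.score < threshold)) {
    return { kind: 'matched', canonicalId: best.candidate.canonicalId, method: 'fuzzy', score: best.score };
  }
  // 3. Everything else goes to a human override queue.
  return { kind: 'needs-review', candidates: scored.filter(s => s.score > 0).map(s => s.candidate) };
}
```

The same inputs always produce the same result, and the `method` and `score` fields can be written to the retrieval trace so a reviewer can see why two records were linked.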
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| High-frequency operational queries (account status, ticket state) | Real-time API routing with permission propagation | Ensures freshness and enforces row-level security per request | Higher per-query latency, lower security risk |
| Static reference data (product catalogs, pricing tiers) | Cached canonical view with TTL invalidation | Reduces downstream load and improves response consistency | Low compute cost, requires sync monitoring |
| Cross-system entity matching with ambiguous identifiers | Deterministic resolution service + manual override queue | Prevents hallucinated matches and maintains auditability | Moderate engineering overhead, high accuracy gain |
| Unstructured knowledge retrieval (wikis, runbooks) | Permission-scoped document indexing with metadata filtering | Consolidates fragmented institutional knowledge safely | Storage and indexing costs, high retrieval quality lift |
| Multi-model fallback routing | Context-agnostic retrieval layer with model-agnostic prompt template | Decouples data quality from model selection | Minimal infrastructure change, improves resilience |
Configuration Template
```yaml
retrieval_pipeline:
  canonical_layer:
    enabled: true
    refresh_interval: 300s
    fallback_strategy: "return_stale_with_flag"
  sources:
    - name: billing_ledger
      endpoint: "https://api.billing.internal/v2/accounts"
      auth: "user_propagated"
      field_mapping:
        payment_state: "subscription_status"
        plan_tier: "billing_tier"
      freshness_ttl: 60s
      access_roles: ["finance", "admin"]
    - name: support_portal
      endpoint: "https://api.support.internal/v1/orgs"
      auth: "user_propagated"
      field_mapping:
        ticket_state: "open_ticket_status"
        priority: "support_priority"
      freshness_ttl: 30s
      access_roles: ["agent", "admin"]
    - name: crm_platform
      endpoint: "https://api.crm.internal/v3/contacts"
      auth: "service_key"
      field_mapping:
        renewal_date: "contract_renewal"
        account_stage: "sales_stage"
      freshness_ttl: 3600s
      access_roles: ["sales", "admin"]
  tracing:
    enabled: true
    storage: "structured_json"
    retention_days: 90
    separate_from_generation: true
```
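A loader for this template could lint the source list before the router starts. The sketch below assumes the YAML has already been parsed into plain objects (no YAML library is used here); `SourceConfig`, `lintSources`, and the warning texts are hypothetical. It flags sources that use `service_key` auth, since those depend entirely on the policy engine to avoid the super-service-account problem.

```typescript
// Hypothetical parsed form of one `sources` entry from the template.
interface SourceConfig {
  name: string;
  auth: 'user_propagated' | 'service_key';
  freshnessTtlSeconds: number;
  accessRoles: string[];
}

function lintSources(sources: SourceConfig[]): string[] {
  const warnings: string[] = [];
  for (const source of sources) {
    if (source.auth === 'service_key') {
      warnings.push(`${source.name}: service_key auth, results must be policy-filtered before assembly`);
    }
    if (source.accessRoles.length === 0) {
      warnings.push(`${source.name}: no access_roles declared`);
    }
    if (source.freshnessTtlSeconds <= 0) {
      warnings.push(`${source.name}: freshness_ttl must be positive`);
    }
  }
  return warnings;
}

const templateSources: SourceConfig[] = [
  { name: 'billing_ledger', auth: 'user_propagated', freshnessTtlSeconds: 60, accessRoles: ['finance', 'admin'] },
  { name: 'support_portal', auth: 'user_propagated', freshnessTtlSeconds: 30, accessRoles: ['agent', 'admin'] },
  { name: 'crm_platform', auth: 'service_key', freshnessTtlSeconds: 3600, accessRoles: ['sales', 'admin'] },
];
const configWarnings = lintSources(templateSources);
// With the template values, only crm_platform is flagged (service_key auth).
```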
Quick Start Guide
- Scope the Workflow: Select one operational task (e.g., "Resolve account subscription status and open tickets"). Define success metrics: accuracy threshold, latency budget, and permission compliance rate.
- Deploy the Canonical Mapper: Create a lightweight mapping table linking source identifiers to a stable internal ID. Populate it with known entities from your primary systems.
- Wire the Retrieval Router: Implement the permission-aware router using the configuration template. Connect it to your data sources with user-propagated authentication. Enable retrieval tracing.
- Validate and Iterate: Run 50-100 test queries. Compare retrieval traces against expected results. Classify failures as data integration vs reasoning issues. Adjust field mappings, TTLs, or policy rules before introducing model variations.