The LLM owns this space entirely. When a new source is ingested, the synthesis engine:
- Extracts factual claims and maps them to existing pages
- Creates new pages for novel concepts
- Updates cross-references and dependency graphs
- Flags contradictions with existing knowledge
- Appends changes to a versioned log
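The ingestion steps above can be sketched as a pure function over an in-memory wiki. This is a minimal illustration, not the actual engine: the names `Claim`, `WikiPage`, `ChangeLogEntry`, and `applyClaims` are hypothetical, cross-reference updates are left out, and contradiction detection is reduced to a caller-supplied predicate where a real system would ask the model.

```typescript
// Hypothetical shapes: a claim extracted from a source, a wiki page, a log entry.
interface Claim { topic: string; statement: string; }
interface WikiPage { topic: string; statements: string[]; version: number; }
interface ChangeLogEntry {
  topic: string;
  action: "created" | "updated" | "conflict";
  statement: string;
}

// Maps each claim to an existing page, creates a page for a novel topic,
// flags contradictions instead of merging them, and logs every change.
function applyClaims(
  wiki: Map<string, WikiPage>,
  claims: Claim[],
  contradicts: (existing: string, incoming: string) => boolean,
): ChangeLogEntry[] {
  const log: ChangeLogEntry[] = [];
  for (const claim of claims) {
    const page = wiki.get(claim.topic);
    if (!page) {
      wiki.set(claim.topic, { topic: claim.topic, statements: [claim.statement], version: 1 });
      log.push({ topic: claim.topic, action: "created", statement: claim.statement });
      continue;
    }
    if (page.statements.some((s) => contradicts(s, claim.statement))) {
      // The page is left untouched; the conflict is only recorded.
      log.push({ topic: claim.topic, action: "conflict", statement: claim.statement });
      continue;
    }
    if (page.statements.indexOf(claim.statement) === -1) {
      page.statements.push(claim.statement);
      page.version += 1;
      log.push({ topic: claim.topic, action: "updated", statement: claim.statement });
    }
  }
  return log;
}
```

The returned log doubles as the versioned change record: persisting it after each ingestion gives an audit trail of what was created, updated, or held back.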
Layer 3: Orchestration Schema (Instructional)
A single configuration file (e.g., AGENTS.md or SYSTEM.md) defines the conventions for ingestion, querying, and maintenance. It instructs the model on how to interact with the synthesis layer, enforce naming standards, and handle edge cases.
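Because the schema is plain markdown, a harness can check it before handing it to the model. The sketch below assumes the file is organized under `## ` headings; `parseSchemaSections` and `missingSections` are hypothetical helper names, not part of any existing tool.

```typescript
// Splits an instruction file into its "## " sections (heading -> non-empty lines).
function parseSchemaSections(markdown: string): Map<string, string[]> {
  const sections = new Map<string, string[]>();
  let current: string[] | null = null;
  for (const rawLine of markdown.split("\n")) {
    const line = rawLine.trim();
    if (line.indexOf("## ") === 0) {
      current = [];
      sections.set(line.slice(3).trim(), current);
    } else if (current !== null && line.length > 0) {
      current.push(line);
    }
  }
  return sections;
}

// Returns the required section names that the file does not define.
function missingSections(markdown: string, required: string[]): string[] {
  const sections = parseSchemaSections(markdown);
  return required.filter((name) => !sections.has(name));
}
```

A startup check could refuse to run the agent when `missingSections` returns a non-empty list, so that a half-edited schema never reaches the model.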
Implementation Architecture
The backend uses SQLite with two extensions: FTS5 for lexical keyword search and sqlite-vec for dense vector operations. Embeddings are generated locally with bge-base-en-v1.5, a roughly 200MB model that runs on Apple Silicon and on CPU. Local inference removes API dependencies and rate limits, and no content leaves the machine.
Search is a hybrid pipeline: BM25 ranks lexical matches, cosine similarity ranks semantic matches, and Reciprocal Rank Fusion (RRF) merges the two ranked lists. Because RRF works on ranks rather than raw scores, the two signals need no score normalization. The result balances exact term matching with conceptual relevance and reduces the false positives common in pure vector search.
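As a small worked example of the fusion step, the sketch below applies the usual RRF formula, in which each list contributes 1 / (k + rank) per document with a 1-based rank. The constant k = 60 is the value commonly used in the literature; the function name and the two result lists are made up for illustration.

```typescript
// Reciprocal Rank Fusion over any number of ranked id lists.
function reciprocalRankFusion(rankings: string[][], k: number = 60): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // index is 0-based, so the 1-based rank is index + 1.
      scores.set(id, (scores.get(id) || 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

// Hypothetical result lists for the query "payments-v2 retry policy".
const lexicalIds = ["payments-v2", "retry-policy", "billing-faq"];
const semanticIds = ["retry-policy", "backoff-strategies", "payments-v2"];

const fused = reciprocalRankFusion([lexicalIds, semanticIds]);
// Fused order: retry-policy, payments-v2, backoff-strategies, billing-faq
```

The two ids that appear in both lists outrank the ids found by only one retriever, which is the behavior the hybrid pipeline relies on.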
TypeScript Integration Example
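```typescript
import Database from 'better-sqlite3';
import * as sqliteVec from 'sqlite-vec';
import { createEmbedding } from './embedding-runtime';
import { KnowledgeSchema } from './schema-types';

// Smoothing constant from the original RRF formulation.
const RRF_K = 60;

// Columns returned by both search queries.
type SearchRow = { id: string; title: string; content: string };

export class KnowledgeSynthesisEngine {
  private db: Database.Database;
  private schemaVersion: string;

  constructor(dbPath: string, schemaVersion: string) {
    this.db = new Database(dbPath);
    // vec_distance_cosine comes from sqlite-vec, which is loaded per connection.
    sqliteVec.load(this.db);
    this.schemaVersion = schemaVersion;
    this.initializeTables();
  }

  private initializeTables(): void {
    this.db.exec(`
      CREATE TABLE IF NOT EXISTS knowledge_nodes (
        id TEXT PRIMARY KEY,
        title TEXT NOT NULL,
        content TEXT NOT NULL,
        embedding BLOB,
        last_updated INTEGER DEFAULT (strftime('%s', 'now')),
        version INTEGER DEFAULT 1
      );
      CREATE VIRTUAL TABLE IF NOT EXISTS node_search USING fts5(title, content);
      -- FTS5 rowids are integers, so mirror the table's rowid rather than the text id.
      CREATE TRIGGER IF NOT EXISTS sync_search AFTER INSERT ON knowledge_nodes
      BEGIN
        INSERT INTO node_search(rowid, title, content) VALUES (new.rowid, new.title, new.content);
      END;
      -- Keep the lexical index current when upsertNode rewrites an existing row.
      CREATE TRIGGER IF NOT EXISTS sync_search_update AFTER UPDATE ON knowledge_nodes
      BEGIN
        DELETE FROM node_search WHERE rowid = old.rowid;
        INSERT INTO node_search(rowid, title, content) VALUES (new.rowid, new.title, new.content);
      END;
    `);
  }

  // buildSynthesisPrompt, invokeModel, and logIngestion are application-specific
  // and are not shown in this example.
  async ingestSource(rawContent: string, sourceMeta: Record<string, string>): Promise<void> {
    const synthesisPrompt = this.buildSynthesisPrompt(rawContent, sourceMeta);
    const structuredOutput = await this.invokeModel(synthesisPrompt);
    for (const node of structuredOutput.nodes) {
      const embedding = await createEmbedding(node.content);
      this.upsertNode(node.id, node.title, node.content, embedding);
    }
    this.logIngestion(rawContent, sourceMeta);
  }

  async queryKnowledge(query: string, topK: number = 5): Promise<KnowledgeSchema.Node[]> {
    const queryEmbedding = await createEmbedding(query);

    // Lexical leg: the FTS table has no id column, so join back on rowid.
    // Note that the query string is interpreted as FTS5 query syntax.
    const lexicalResults = this.db.prepare(`
      SELECT n.id, n.title, n.content
      FROM node_search
      JOIN knowledge_nodes AS n ON n.rowid = node_search.rowid
      WHERE node_search MATCH ?
      ORDER BY rank LIMIT ?
    `).all(query, topK) as SearchRow[];

    // Semantic leg: vec_distance_cosine returns a distance, so smaller is better.
    const semanticResults = this.db.prepare(`
      SELECT id, title, content,
        vec_distance_cosine(embedding, ?) AS distance
      FROM knowledge_nodes
      WHERE embedding IS NOT NULL
      ORDER BY distance ASC LIMIT ?
    `).all(queryEmbedding, topK) as SearchRow[];

    return this.fuseResults(lexicalResults, semanticResults, topK);
  }

  private fuseResults(lexical: SearchRow[], semantic: SearchRow[], limit: number): KnowledgeSchema.Node[] {
    const rankMap = new Map<string, number>();
    for (const ranking of [lexical, semantic]) {
      ranking.forEach((item, idx) => {
        // RRF: each list contributes 1 / (k + rank), with rank starting at 1.
        rankMap.set(item.id, (rankMap.get(item.id) || 0) + 1 / (RRF_K + idx + 1));
      });
    }
    return Array.from(rankMap.entries())
      .sort((a, b) => b[1] - a[1])
      .slice(0, limit)
      .map(([id]) => this.getNodeById(id));
  }

  private getNodeById(id: string): KnowledgeSchema.Node {
    return this.db
      .prepare('SELECT id, title, content FROM knowledge_nodes WHERE id = ?')
      .get(id) as KnowledgeSchema.Node;
  }

  private upsertNode(id: string, title: string, content: string, embedding: Buffer): void {
    const tx = this.db.transaction(() => {
      this.db.prepare(`
        INSERT INTO knowledge_nodes (id, title, content, embedding)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(id) DO UPDATE SET
          title = excluded.title,
          content = excluded.content,
          embedding = excluded.embedding,
          last_updated = strftime('%s', 'now'),
          version = version + 1
      `).run(id, title, content, embedding);
    });
    tx();
  }
}
```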
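One practical detail when passing user text to `MATCH`: FTS5 parses the argument as its own query syntax, so an unquoted term such as payments-v2 may fail to parse or be interpreted as an operator expression. A common workaround, sketched here with a hypothetical helper name, is to turn every whitespace-separated term into a quoted FTS5 string, doubling any embedded double quotes.

```typescript
// Converts free text into an FTS5 query of quoted terms (implicitly ANDed).
function toFtsMatchQuery(userQuery: string): string {
  return userQuery
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map((term) => '"' + term.replace(/"/g, '""') + '"')
    .join(" ");
}
```

Quoting every term gives up FTS5 operators such as `OR` and prefix matching in exchange for predictable behavior. An empty input yields an empty string, so the caller should skip the lexical query in that case.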
Architecture Rationale
- Local-First Storage: SQLite eliminates network round-trips and external dependencies. The entire knowledge base is a single file, trivial to backup, version, or migrate.
- Hybrid Search via RRF: Pure vector search struggles with exact identifiers (e.g., payments-v2, OAuth2.1). Pure lexical search misses conceptual matches. RRF combines both rankings, improving precision without a separate re-ranking model.
- Markdown as Source of Truth: The SQLite index is derived from disk files. If the database corrupts, it rebuilds from markdown. This ensures compatibility with Obsidian, VS Code, and Git workflows.
- Daemon Mode for Latency: Loading bge-base-en-v1.5 takes 2-3 seconds on cold start. Running a persistent background process warms the model in memory, reducing subsequent search latency to 50-150ms.
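To make the "Markdown as Source of Truth" point concrete, the sketch below derives index records from file contents alone, which is what a rebuild after database corruption would need. It assumes the node id is the file path without `.md` and the title is the first `# ` heading; both conventions, and the names `MarkdownNode`, `nodeFromMarkdown`, and `rebuildIndex`, are illustrative assumptions.

```typescript
interface MarkdownNode { id: string; title: string; content: string; }

// Derives one index record from one markdown file.
function nodeFromMarkdown(path: string, markdown: string): MarkdownNode {
  const id = path.replace(/\.md$/, "");
  const heading = markdown.split("\n").filter((line) => line.indexOf("# ") === 0)[0];
  const title = heading ? heading.slice(2).trim() : id; // fall back to the id
  return { id, title, content: markdown };
}

// Rebuilds the full record set from a path -> content map, ignoring non-markdown files.
function rebuildIndex(files: Map<string, string>): MarkdownNode[] {
  return Array.from(files.entries())
    .filter(([path]) => /\.md$/.test(path))
    .map(([path, markdown]) => nodeFromMarkdown(path, markdown))
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
}
```

Each record would then be re-embedded and upserted, so the database never holds state that cannot be regenerated from disk.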
Pitfall Guide
1. Schema Drift
Explanation: Instruction files (AGENTS.md) become outdated as project conventions evolve. The LLM continues following deprecated formatting or ingestion rules, causing inconsistent wiki entries.
Fix: Version-control the schema file. Implement a pre-commit hook that validates new wiki entries against the current schema version. Force schema updates through explicit model prompts rather than implicit behavior.
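A pre-commit check along these lines could look like the sketch below. It assumes each wiki entry starts with a front-matter block containing a `schema_version: <n>` line; that convention and the function names are hypothetical.

```typescript
// Reads "schema_version: <n>" from a leading front-matter block delimited by "---".
function readSchemaVersion(entry: string): number | null {
  const frontMatter = /^---\n([\s\S]*?)\n---/.exec(entry);
  if (!frontMatter) return null;
  const versionLine = /^schema_version:\s*(\d+)\s*$/m.exec(frontMatter[1]);
  return versionLine ? parseInt(versionLine[1], 10) : null;
}

// Lists entries whose declared version is missing or differs from the current one.
function findOutdatedEntries(entries: Map<string, string>, currentVersion: number): string[] {
  const outdated: string[] = [];
  entries.forEach((text, path) => {
    if (readSchemaVersion(text) !== currentVersion) outdated.push(path);
  });
  return outdated.sort();
}
```

The hook would exit with a non-zero status whenever `findOutdatedEntries` returns anything, which blocks the commit until the listed pages are regenerated under the current schema.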
2. Contradiction Blindness
Explanation: LLMs tend to merge conflicting information rather than flagging it. Two architecture decisions from different quarters might coexist without warning, leading to implementation errors.
Fix: Add explicit contradiction detection to the synthesis prompt. Require the model to output a conflicts array when new facts clash with existing nodes. Implement a human-in-the-loop review queue for flagged conflicts.
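One way to enforce the `conflicts` array is to validate the model's JSON before anything is written. In this sketch the output shape (`nodes` plus `conflicts` with `nodeId`, `existingClaim`, `incomingClaim`) is an assumed contract, and `triageSynthesis` is a hypothetical name.

```typescript
interface ConflictReport { nodeId: string; existingClaim: string; incomingClaim: string; }
interface SynthesisOutput {
  nodes: { id: string; title: string; content: string }[];
  conflicts: ConflictReport[];
}

// Rejects output without a conflicts array, then separates nodes that are safe
// to write from nodes that must wait for human review.
function triageSynthesis(rawJson: string): { accepted: string[]; needsReview: ConflictReport[] } {
  const parsed = JSON.parse(rawJson) as Partial<SynthesisOutput>;
  const nodes = parsed.nodes;
  const conflicts = parsed.conflicts;
  if (!Array.isArray(nodes) || !Array.isArray(conflicts)) {
    throw new Error("Synthesis output must contain 'nodes' and 'conflicts' arrays");
  }
  const conflicted = new Set<string>();
  conflicts.forEach((c) => conflicted.add(c.nodeId));
  const accepted = nodes.filter((n) => !conflicted.has(n.id)).map((n) => n.id);
  return { accepted, needsReview: conflicts };
}
```

Treating a missing `conflicts` key as an error, rather than as "no conflicts", prevents the model from silently skipping the check.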
3. Cold-Start Embedding Latency
Explanation: Loading the embedding model on every CLI invocation adds 2-3 seconds of overhead. In rapid iteration workflows, this compounds into significant friction.
Fix: Deploy a persistent daemon (the equivalent of kb serve --detached). Route all search and ingest calls through a local IPC or HTTP endpoint. Keep the model resident in memory during active development sessions.
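Inside the daemon, keeping the model resident comes down to loading it once and sharing the result. A minimal sketch, assuming some async `load` function that returns the model (the name `keepWarm` is hypothetical):

```typescript
// Wraps an expensive async loader so it runs at most once per process.
// Concurrent callers share the same in-flight promise.
function keepWarm<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | null = null;
  return () => {
    if (cached === null) {
      cached = load().catch((error) => {
        // Allow a retry after a failed load instead of caching the failure.
        cached = null;
        throw error;
      });
    }
    return cached;
  };
}
```

Every request handler awaits the same getter, so only the first call pays the cold-start cost and later calls resolve from memory.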
4. Over-Indexing Raw Artifacts
Explanation: Ingesting every log file, debug output, or transient note bloats the knowledge base with noise. The synthesis layer wastes tokens processing irrelevant data.
Fix: Implement a curation pipeline. Only ingest sources that pass a relevance filter (e.g., architecture docs, decision records, API specs). Use file-type allowlisting and size thresholds before triggering synthesis.
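The allowlist and size threshold can be a single predicate that runs before any tokens are spent. The policy values and type names below are illustrative assumptions, not recommended defaults.

```typescript
interface SourceCandidate { path: string; sizeBytes: number; }
interface CurationPolicy { allowedExtensions: string[]; maxSizeBytes: number; }

// Returns true only for allowlisted file types under the size threshold.
function shouldIngest(source: SourceCandidate, policy: CurationPolicy): boolean {
  const dot = source.path.lastIndexOf(".");
  const extension = dot === -1 ? "" : source.path.slice(dot).toLowerCase();
  return (
    policy.allowedExtensions.indexOf(extension) !== -1 &&
    source.sizeBytes <= policy.maxSizeBytes
  );
}

// Example policy: markdown and YAML files up to 256 KiB.
const examplePolicy: CurationPolicy = {
  allowedExtensions: [".md", ".yaml"],
  maxSizeBytes: 256 * 1024,
};
```

A relevance filter based on content, such as a cheap classifier, would run after this check on the files that pass.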
5. Vector Drift & Stale Embeddings
Explanation: As wiki pages are updated, their embeddings may not reflect the latest content if re-embedding is skipped for performance reasons. Search results gradually degrade.
Fix: Tie embedding regeneration to the version column. Trigger async re-embedding on every UPDATE. Use a background worker pool to handle embedding jobs without blocking the main synthesis loop.
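Tying re-embedding to the version column can be done by recording which version each embedding was computed from. The sketch below assumes an extra `embeddedVersion` field per node, which the schema shown earlier does not have, and a caller-supplied `embed` function; both are hypothetical.

```typescript
interface EmbeddedNode { id: string; version: number; embeddedVersion: number | null; }

// A node is stale when it was never embedded or its content version has moved on.
function findStaleNodes(nodes: EmbeddedNode[]): string[] {
  return nodes
    .filter((n) => n.embeddedVersion === null || n.embeddedVersion < n.version)
    .map((n) => n.id);
}

// Re-embeds stale nodes in small batches so one large job does not starve other work.
async function reembedStale(
  nodes: EmbeddedNode[],
  embed: (id: string) => Promise<void>,
  batchSize: number = 4,
): Promise<void> {
  const byId = new Map<string, EmbeddedNode>();
  nodes.forEach((n) => byId.set(n.id, n));
  const stale = findStaleNodes(nodes);
  for (let i = 0; i < stale.length; i += batchSize) {
    const batch = stale.slice(i, i + batchSize);
    await Promise.all(batch.map(async (id) => {
      const node = byId.get(id);
      if (!node) return;
      // Record the version seen before embedding; an edit that lands while
      // the embedding is being computed leaves the node stale for the next pass.
      const versionAtStart = node.version;
      await embed(id);
      node.embeddedVersion = versionAtStart;
    }));
  }
}
```

Running `findStaleNodes` periodically also gives a simple health metric: a growing stale count means the workers are falling behind.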
6. Git Merge Conflicts in Wiki Files
Explanation: Multiple agents or developers editing markdown files simultaneously creates merge conflicts. LLMs lack native conflict resolution strategies for structured text.
Fix: Enforce atomic writes with file locking. Use a branch-per-feature workflow for knowledge updates. Implement a deterministic merge strategy that prioritizes the latest last_updated timestamp and preserves cross-reference integrity.
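A deterministic merge of two page revisions might look like the following sketch. The `PageRevision` shape is an assumption; the rule is the one described above: the newer `lastUpdated` wins, and cross-references from both sides are kept.

```typescript
interface PageRevision {
  id: string;
  content: string;
  lastUpdated: number; // Unix seconds, as in the last_updated column
  crossRefs: string[];
}

function mergeRevisions(ours: PageRevision, theirs: PageRevision): PageRevision {
  // Latest timestamp wins. Ties break on content so that both sides of a
  // merge pick the same winner regardless of argument order.
  const theirsWins =
    theirs.lastUpdated > ours.lastUpdated ||
    (theirs.lastUpdated === ours.lastUpdated && theirs.content > ours.content);
  const winner = theirsWins ? theirs : ours;
  // Union of cross-references, sorted for a stable result.
  const crossRefs = Array.from(new Set(ours.crossRefs.concat(theirs.crossRefs))).sort();
  return { id: winner.id, content: winner.content, lastUpdated: winner.lastUpdated, crossRefs };
}
```

Keeping the union of cross-references errs on the side of extra links, so a follow-up pass should still verify that each reference points at an existing page.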
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Rapid prototyping / local dev | Compilation/Wiki Pattern (Local SQLite) | Zero API costs, instant iteration, full data sovereignty | $0 infrastructure, minimal compute |
| Enterprise multi-agent orchestration | Hybrid (Wiki + Centralized Vector DB) | Wiki handles synthesis, vector DB enables cross-team search | Moderate cloud costs, high ROI on knowledge reuse |
| High-frequency real-time queries | Daemon-backed Hybrid Search | Warm model cache reduces latency to <150ms | Higher RAM usage, negligible CPU overhead |
| Compliance-heavy / regulated data | Local-First Compilation | No data leaves the machine, full audit trail via Git | Zero third-party risk, higher internal maintenance |
Configuration Template
```markdown
# AGENTS.md - Knowledge Synthesis Protocol

## Core Directives
1. You maintain a persistent wiki located in `./knowledge/`.
2. Raw sources are immutable. Never modify original documents.
3. Synthesize facts into entity pages, concept summaries, and decision logs.
4. Update cross-references when dependencies change.
5. Flag contradictions explicitly using the `[[CONFLICT]]` tag.

## File Structure
- `./knowledge/entities/` - System components, services, libraries
- `./knowledge/concepts/` - Architectural patterns, protocols, methodologies
- `./knowledge/decisions/` - ADRs, trade-off analyses, implementation choices
- `./knowledge/logs/` - Chronological synthesis history

## Search & Ingestion Rules
- Use `kb search <query>` to retrieve relevant context before answering.
- Use `kb add <source>` to ingest new documentation.
- Use `kb update <entity>` to modify existing pages.
- Always verify cross-references after updates.
- Maintain markdown formatting consistency.
```
Quick Start Guide
- Initialize the knowledge store: Create a dedicated directory for markdown files and configure the SQLite backend with FTS5 and vector extensions. Set the embedding model path to your local cache.
- Deploy the orchestration schema: Place AGENTS.md in your project root. Define naming conventions, file structure, and synthesis rules. Ensure your AI agent reads this file on initialization.
- Start the embedding daemon: Launch the background service to warm the model in memory. Verify latency drops below 200ms for subsequent queries.
- Ingest initial sources: Run the synthesis pipeline against your existing architecture docs, decision records, and API specifications. Verify cross-references and contradiction flags.
- Validate agent behavior: Query the knowledge base through your agent. Confirm it autonomously calls search, update, and add operations without manual prompt engineering. Monitor synthesis logs for accuracy.