Build Your Own LLM Wiki: A Persistent, Queryable Knowledge Base on Zo

By Codcompass Team·2026-05-19·7 min read

Architecting Incremental Knowledge Systems: Beyond Stateless Retrieval

Current Situation Analysis

Modern development and research workflows generate massive volumes of unstructured data. Teams accumulate markdown notes, technical specifications, research PDFs, and internal documentation across multiple platforms. The industry standard for interacting with this data has been keyword search or Retrieval-Augmented Generation (RAG). Both approaches share a fundamental architectural flaw: they treat knowledge as static and stateless.

Keyword search returns file paths. RAG returns retrieved chunks paired with a generated response. Neither approach accumulates understanding. When a query is executed, the system fetches relevant passages, generates an answer, and discards the context. The next query repeats the entire retrieval cycle, even if the underlying documents haven't changed. This creates a perpetual compute tax and prevents the system from evolving alongside the data it processes.

The problem is often overlooked because RAG pipelines are heavily marketed as "AI-native" solutions. In practice, they require maintaining embedding providers, vector databases, chunking strategies, and inference endpoints. The infrastructure overhead frequently eclipses the value of the retrieval itself. More critically, stateless retrieval cannot form cross-document relationships. A concept documented in a 2023 architecture review may directly contradict a 2024 implementation guide, but without a persistent synthesis layer, the system surfaces both as equally valid chunks. The burden of reconciliation falls entirely on the human operator.

The alternative paradigm shifts from retrieval-first to compilation-first. Instead of querying raw documents on demand, the system incrementally reads source material, extracts structured concepts, maps relationships, and compiles them into a persistent knowledge layer. This approach, often referred to as the LLM Wiki pattern, treats documents as raw inputs to a continuous synthesis pipeline. The knowledge base grows organically, queries operate against compiled structure rather than fragmented chunks, and cross-referencing becomes a native capability rather than an afterthought.

WOW Moment: Key Findings

The architectural shift from stateless retrieval to incremental compilation fundamentally changes how knowledge systems scale. The following comparison highlights the operational differences across three common approaches:

Approach	Context Persistence	Cross-Document Synthesis	Infrastructure Complexity	Query Latency (Post-Build)
Static Keyword Search	None	None	Low	<50ms
Traditional RAG	Stateless (per query)	Limited (chunk overlap only)	High (vector DB, embeddings, retrieval layer)	200-800ms
Incremental Compilation	Persistent & cumulative	Native (relationship mapping)	Medium (workspace + compilation prompts)	100-300ms

Traditional RAG optimizes for immediate retrieval accuracy but pays for it with recurring compute costs and fragmented context. Incremental compilation front-loads the synthesis work. Once the knowledge layer is built, queries operate against structured markdown notes rather than raw embeddings. This reduces latency, eliminates redundant retrieval cycles, and surfaces relationships that span multiple source documents. The system stops acting as a search engine and starts functioning as a compounding knowledge asset.

Core Solution

Building an incremental knowledge system requires three architectural decisions: persistent storage, structured compilation, and constrained querying. The following implementation demonstrates how to orchestrate this workflow using a persistent workspace environment.

Step 1: Establish the Persistent Storage Schema

Raw documents, staged inputs, and compiled outputs must be isolated to prevent contamination during synthesis. A clean directory structure enforces separation of concerns:

/knowledge-archival/
  /raw-inputs/
    /pdfs/
    /markdown/
    /specs/
  /staging/
  /synthesized/
    /concepts/
    /relationships/
    /master-index.md

/raw-inputs/ holds source material. /staging/ acts as a buffer for new documents before compilation. /synthesized/ contains the structured output. This separation ensures that queries never accidentally read unprocessed files, and compilation prompts can target specific directories without ambiguity.
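As a minimal sketch, the schema above could be created with a short Node.js script. The function name `initWorkspace`, the initial contents of the index file, and the use of a temporary directory for the demo are assumptions made for illustration; the directory names come from the schema itself.

```typescript
import { existsSync, mkdirSync, mkdtempSync, writeFileSync } from 'fs';
import { tmpdir } from 'os';
import { join } from 'path';

// Directory layout taken from the schema above, relative to the workspace root.
const SCHEMA_DIRS = [
  'raw-inputs/pdfs',
  'raw-inputs/markdown',
  'raw-inputs/specs',
  'staging',
  'synthesized/concepts',
  'synthesized/relationships',
];

// Creates any missing directories and an empty master index.
// Safe to re-run: existing directories and an existing index are left untouched.
// Returns the directories that were created by this call.
function initWorkspace(root: string): string[] {
  const created: string[] = [];
  for (const dir of SCHEMA_DIRS) {
    const full = join(root, dir);
    if (!existsSync(full)) {
      mkdirSync(full, { recursive: true });
      created.push(dir);
    }
  }
  const indexPath = join(root, 'synthesized', 'master-index.md');
  if (!existsSync(indexPath)) {
    writeFileSync(indexPath, '# Master Index\n'); // assumed starting content
  }
  return created;
}

// Usage: build a throwaway workspace under the OS temp directory.
const demoRoot = mkdtempSync(join(tmpdir(), 'knowledge-archival-'));
const firstRun = initWorkspace(demoRoot);  // creates all six directories
const secondRun = initWorkspace(demoRoot); // finds nothing left to create
console.log(firstRun.length, secondRun.length); // prints "6 0"
```

Making the setup idempotent means the same script can run at the start of every compilation cycle without disturbing existing notes.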

Step 2: Orchestrate Incremental Compilation

Compilation runs in two phases: concept extraction and relationship mapping. Both phases use fixed prompt templates that request standardized markdown output. The orchestration script below shows how to trigger these phases programmatically:

import { existsSync, readdirSync } from 'fs';
import { join } from 'path';

interface CompilationConfig {
  workspaceRoot: string;
  modelEndpoint: string;
  apiKey: string;
}

class KnowledgeCompiler {
  private config: CompilationConfig;

  constructor(config: CompilationConfig) {
    this.config = config;
  }

  // Pass 1: ask the model to turn one source file into a structured concept note.
  async extractConcepts(sourceFile: string): Promise<string> {
    const prompt = `
      Analyze ${sourceFile} and extract core technical concepts.
      Output a markdown file with the following structure:
      - Concept name (H2)
      - Definition (1-2 sentences)
      - Key properties (bullet list)
      - Source reference (filename + section)
      Maintain consistent formatting across all concept notes.
    `;
    return this.invokeModel(prompt);
  }

  // Pass 2: ask the model to cross-reference the concept notes already compiled.
  async mapRelationships(conceptDir: string): Promise<string> {
    const files = this.listMarkdownFiles(conceptDir);
    const prompt = `
      Review the following concept files: ${files.join(', ')}.
      Identify up to 5 implicit relationships between concepts.
      For each relationship, generate a markdown note containing:
      - Concept A and Concept B
      - Relationship type (dependency, contradiction, extension, alternative)
      - Evidence summary (1-2 sentences)
      Append all relationships to the master index.
    `;
    return this.invokeModel(prompt);
  }

  private async invokeModel(prompt: string): Promise<string> {
    // Placeholder for the actual API call to the persistent workspace model.
    // Echoes the prompt so the pipeline can be exercised without a model.
    return `# Compiled Output\n${prompt}\n// Model response placeholder`;
  }

  // Returns the paths of the .md files directly inside dir, sorted so that
  // the generated prompt is stable between runs. A missing directory yields [].
  private listMarkdownFiles(dir: string): string[] {
    if (!existsSync(dir)) return [];
    return readdirSync(dir)
      .filter((name) => name.endsWith('.md'))
      .sort()
      .map((name) => join(dir, name));
  }
}

The compiler separates extraction from relationship mapping. This two-pass approach keeps each model call small, which reduces the risk of context window overflow, and ensures each concept is normalized before cross-referencing. The output is always structured markdown, which remains human-readable, version-controllable, and easy for downstream query engines to parse.
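One way to keep compilation incremental is to skip staged files whose content has not changed since the last run. The sketch below assumes a stored map of content hashes per source path; the names `CompiledHashes` and `selectFilesToCompile` and the sample file contents are hypothetical and not part of the article's pipeline.

```typescript
import { createHash } from 'crypto';

// Hypothetical record of what has already been compiled: source path -> content hash.
type CompiledHashes = Record<string, string>;

function sha256(content: string): string {
  return createHash('sha256').update(content).digest('hex');
}

// Returns the staged files whose content is new or has changed since the last run.
function selectFilesToCompile(
  staged: Record<string, string>, // source path -> current file content
  compiled: CompiledHashes,
): string[] {
  return Object.entries(staged)
    .filter(([path, content]) => compiled[path] !== sha256(content))
    .map(([path]) => path)
    .sort();
}

// Usage with made-up file contents.
const stagedFiles = {
  'staging/auth-spec.md': '# Auth\nTokens expire after 15 minutes.',
  'staging/cache-notes.md': '# Cache\nWrite-through only.',
};
const alreadyCompiled: CompiledHashes = {
  'staging/auth-spec.md': sha256(stagedFiles['staging/auth-spec.md']),
};
const pending = selectFilesToCompile(stagedFiles, alreadyCompiled);
console.log(pending); // only staging/cache-notes.md still needs compiling
```

Only the files returned here would be passed to `extractConcepts`, so an unchanged document costs a hash comparison rather than a model call.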

Step 3: Configure Constrained Querying

Once compilation completes, the system must answer questions using only the synthesized layer. This requires a persona configuration that enforces citation, scope boundaries, and gap detection:

ROLE: Knowledge Synthesis Engine
SCOPE: /synthesized/concepts/ and /synthesized/relationships/
CONSTRAINTS:
  - Answer exclusively from compiled notes
  - Cite source concept file for every claim
  - Flag unanswered questions as "knowledge gaps"
  - Never speculate beyond documented relationships
OUTPUT_FORMAT:
  1. Direct answer (2-3 sentences)
  2. Supporting evidence (bullet list with citations)
  3. Gap analysis (if applicable)

This configuration shifts the model from open-ended generation toward verification. It is instructed not to invent relationships, to ground every statement in compiled markdown, and to surface missing coverage explicitly. The architecture prioritizes accuracy over fluency, which is critical for technical and research workflows.
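The persona only constrains the model if the prompt it receives contains nothing outside the synthesized layer. A minimal sketch of assembling such a prompt follows. The `CompiledNote` shape, the `--- FILE: ... ---` delimiter, and the function name `buildQueryPrompt` are assumptions for illustration; the persona text is abbreviated from the configuration above.

```typescript
interface CompiledNote {
  path: string;    // relative to the workspace root
  content: string; // markdown body of the note
}

const PERSONA = [
  'ROLE: Knowledge Synthesis Engine',
  'SCOPE: /synthesized/concepts/ and /synthesized/relationships/',
  'CONSTRAINTS: answer exclusively from compiled notes, cite the source file',
  'for every claim, flag unanswered questions as "knowledge gaps".',
].join('\n');

const IN_SCOPE_PREFIXES = ['synthesized/concepts/', 'synthesized/relationships/'];

// Builds the full prompt, silently dropping any note that lies outside the scoped directories.
function buildQueryPrompt(question: string, notes: CompiledNote[]): string {
  const context = notes
    .filter((note) => IN_SCOPE_PREFIXES.some((prefix) => note.path.startsWith(prefix)))
    .map((note) => `--- FILE: ${note.path} ---\n${note.content.trim()}`)
    .join('\n\n');
  return `${PERSONA}\n\nCOMPILED NOTES:\n${context}\n\nQUESTION: ${question}`;
}

// Usage with made-up notes: the raw input never reaches the model.
const demoPrompt = buildQueryPrompt('How long do tokens live?', [
  { path: 'synthesized/concepts/token-expiry.md', content: '## Token Expiry\nTokens expire after 15 minutes.' },
  { path: 'raw-inputs/specs/auth-v1.md', content: 'Unprocessed draft text.' },
]);
console.log(demoPrompt);
```

Labeling each note with its path gives the model something concrete to cite, which later makes citations checkable.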

Pitfall Guide

1. Aggressive Pre-Chunking Before Compilation

Explanation: Splitting documents into small fragments before running extraction prompts destroys semantic continuity. The model loses narrative context and outputs shallow concept lists. Fix: Feed complete documents or logical sections to the extraction prompt. Let the model determine conceptual boundaries rather than imposing arbitrary token limits.
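When a document is too large to send whole, splitting on its own headings is one way to obtain logical sections. The sketch below assumes markdown sources with `#` and `##` headings and does not handle heading-like lines inside fenced code; `splitByHeadings` and the `(preamble)` label are hypothetical names.

```typescript
interface Section {
  heading: string;
  body: string;
}

// Splits markdown at level-1 and level-2 headings so each section keeps its full text.
// Text before the first heading is returned under the label "(preamble)".
function splitByHeadings(markdown: string): Section[] {
  const sections: Section[] = [];
  let heading = '(preamble)';
  let bodyLines: string[] = [];

  const flush = (): void => {
    const body = bodyLines.join('\n').trim();
    // Skip an empty preamble, but keep headed sections even when their body is empty.
    if (body !== '' || heading !== '(preamble)') {
      sections.push({ heading, body });
    }
  };

  for (const line of markdown.split('\n')) {
    const match = /^#{1,2}\s+(.*)$/.exec(line);
    if (match) {
      flush();
      heading = (match[1] ?? '').trim();
      bodyLines = [];
    } else {
      bodyLines.push(line);
    }
  }
  flush();
  return sections;
}

// Usage with a made-up document.
const demoSections = splitByHeadings(
  '# Auth\nTokens expire.\n\n## Refresh\nRefresh tokens rotate.\nOld ones are revoked.\n',
);
console.log(demoSections.map((section) => section.heading)); // [ 'Auth', 'Refresh' ]
```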

2. Skipping Relationship Mapping

Explanation: Compiling concepts in isolation creates a knowledge silo. Without explicit cross-linking, the system cannot answer comparative or synthesis queries. Fix: Always run the relationship mapping pass after extraction. Target 3-5 relationship notes per compilation cycle to maintain graph density.
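Graph density is easier to manage when relationships are held as structured data before they are written out as markdown. The sketch below uses the four relationship types from the mapping prompt; the `Relationship` shape, the note layout, and the concept names are illustrative assumptions.

```typescript
type RelationshipType = 'dependency' | 'contradiction' | 'extension' | 'alternative';

interface Relationship {
  conceptA: string;
  conceptB: string;
  type: RelationshipType;
  evidence: string;
}

// Renders one relationship as a markdown note with the fields the mapping prompt asks for.
function renderRelationshipNote(rel: Relationship): string {
  return [
    `## ${rel.conceptA} <-> ${rel.conceptB}`,
    `- Relationship type: ${rel.type}`,
    `- Evidence: ${rel.evidence}`,
  ].join('\n');
}

// Lists concepts that appear in no relationship: the silos a mapping pass should target next.
function findIsolatedConcepts(concepts: string[], relationships: Relationship[]): string[] {
  const linked = new Set<string>();
  for (const rel of relationships) {
    linked.add(rel.conceptA);
    linked.add(rel.conceptB);
  }
  return concepts.filter((name) => !linked.has(name));
}

// Usage with made-up concept names.
const demoRelationships: Relationship[] = [
  {
    conceptA: 'Token Expiry',
    conceptB: 'Session Cache',
    type: 'dependency',
    evidence: 'Cache entries are evicted when the token expires.',
  },
];
const isolated = findIsolatedConcepts(
  ['Token Expiry', 'Session Cache', 'Rate Limiting'],
  demoRelationships,
);
console.log(isolated); // [ 'Rate Limiting' ]
console.log(renderRelationshipNote(demoRelationships[0]));
```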

3. Weak Persona Constraints

Explanation: Allowing the model to draw from external training data during queries introduces hallucination and breaks citation integrity. Fix: Lock the scope to /synthesized/ directories. Use negative constraints (never speculate, do not reference external knowledge) and require explicit source citations for every claim.
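Prompt constraints can be backed by a check on the answer itself. The sketch below flags evidence bullets that carry no citation into the synthesized directories. The citation style (a path in square brackets on the bullet line) is an assumption, since the article does not fix one, and `findUncitedBullets` is a hypothetical helper.

```typescript
// Assumed citation style: each evidence bullet carries a path in square brackets, e.g.
// "- Tokens expire after 15 minutes [synthesized/concepts/token-expiry.md]".
const CITATION_PATTERN = /\[synthesized\/(?:concepts|relationships)\/[^\]\s]+\.md\]/;

// Returns the bullet lines of an answer that lack a citation into the synthesized layer.
function findUncitedBullets(answer: string): string[] {
  return answer
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.startsWith('- ') && !CITATION_PATTERN.test(line));
}

// Usage with a made-up model answer.
const demoAnswer = [
  'Tokens are short-lived.',
  '- Tokens expire after 15 minutes [synthesized/concepts/token-expiry.md]',
  '- Most systems use refresh tokens',
  '- The v1 spec allowed 60 minutes [raw-inputs/specs/auth-v1.md]',
].join('\n');
const uncited = findUncitedBullets(demoAnswer);
console.log(uncited); // the second and third bullets are flagged
```

An answer with flagged bullets could be rejected or retried rather than shown to the user.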

4. Treating the Master Index as Static

Explanation: The index file drifts as new concepts are added. Stale indexes cause query engines to miss recently compiled material. Fix: Update the index automatically during every compilation cycle, appending new entries rather than rewriting existing ones, and maintain a last_updated timestamp for cache invalidation.
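A minimal sketch of an append-only index update follows. It assumes the index is a markdown file whose entries are `- <path>` bullets followed by a single `last_updated:` line. That layout and the function name `updateMasterIndex` are illustrative assumptions, not a format the article specifies.

```typescript
// Appends entries that are not yet listed and refreshes the last_updated line.
// Existing lines are kept in their original order, and duplicates are ignored.
function updateMasterIndex(index: string, conceptFiles: string[], now: Date): string {
  const kept = index
    .split('\n')
    .filter((line) => line.trim() !== '' && !line.startsWith('last_updated:'));
  const listed = new Set(
    kept.filter((line) => line.startsWith('- ')).map((line) => line.slice(2).trim()),
  );
  const additions = Array.from(new Set(conceptFiles)).filter((file) => !listed.has(file));
  const header = kept.length > 0 ? kept : ['# Master Index'];
  return (
    [
      ...header,
      ...additions.map((file) => `- ${file}`),
      `last_updated: ${now.toISOString()}`,
    ].join('\n') + '\n'
  );
}

// Usage with a made-up index and a fixed timestamp.
const oldIndex = '# Master Index\n- concepts/a.md\nlast_updated: 2026-01-01T00:00:00.000Z\n';
const newIndex = updateMasterIndex(
  oldIndex,
  ['concepts/a.md', 'concepts/b.md'],
  new Date('2026-05-19T00:00:00Z'),
);
console.log(newIndex);
```

Passing the timestamp in as a parameter keeps the function deterministic, which makes the update easy to test.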

5. Ignoring Model Version Drift

Explanation: Swapping models without version pinning changes compilation output formatting and relationship logic. Downstream parsers break. Fix: Pin model versions in the orchestration script. Maintain a model_manifest.json that tracks which model version produced each compiled batch. Re-compile only when necessary.
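The manifest can be as simple as a list of batch records. The sketch below shows one possible shape for model_manifest.json and a check for notes last compiled by a model other than the pinned one. The field names and the model name `example-model-v1` are placeholders invented for illustration.

```typescript
interface ManifestEntry {
  batchId: string;
  model: string;      // exact pinned model identifier used for the batch
  compiledAt: string; // ISO 8601 timestamp
  files: string[];    // compiled notes produced by the batch
}

// Returns a new manifest with the batch appended; the input array is not mutated.
function recordBatch(manifest: ManifestEntry[], entry: ManifestEntry): ManifestEntry[] {
  return [...manifest, entry];
}

// Lists files whose most recent batch was produced by a model other than the pinned one.
function filesNeedingRecompile(manifest: ManifestEntry[], pinnedModel: string): string[] {
  const latestModel = new Map<string, string>();
  for (const entry of manifest) {
    for (const file of entry.files) {
      latestModel.set(file, entry.model); // later batches overwrite earlier ones
    }
  }
  const stale: string[] = [];
  latestModel.forEach((model, file) => {
    if (model !== pinnedModel) stale.push(file);
  });
  return stale.sort();
}

// Usage with made-up batches.
const PINNED_MODEL = 'claude-sonnet-4-20250514';
let manifest: ManifestEntry[] = [];
manifest = recordBatch(manifest, {
  batchId: 'batch-001',
  model: 'example-model-v1', // placeholder name
  compiledAt: '2026-04-01T09:00:00Z',
  files: ['concepts/token-expiry.md', 'concepts/session-cache.md'],
});
manifest = recordBatch(manifest, {
  batchId: 'batch-002',
  model: PINNED_MODEL,
  compiledAt: '2026-05-01T09:00:00Z',
  files: ['concepts/session-cache.md'],
});
const staleNotes = filesNeedingRecompile(manifest, PINNED_MODEL);
console.log(staleNotes); // [ 'concepts/token-expiry.md' ]
console.log(JSON.stringify(manifest, null, 2)); // contents for model_manifest.json
```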

6. Neglecting Citation Verification

Explanation: Models occasionally fabricate file paths or section references. Unverified citations erode trust in the knowledge base. Fix: Implement a post-compilation validation step that checks if cited files exist and contain the referenced keywords. Flag mismatches for manual review.
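A validation pass along these lines might look like the sketch below. It assumes each compiled note records its source as a line of the form `Source: <file>#<section>`, which is a hypothetical format, and it takes the source texts as an in-memory map so the check stays independent of the filesystem.

```typescript
interface Citation {
  file: string;
  section: string;
}

interface CitationIssue {
  citation: Citation;
  reason: 'missing-file' | 'section-not-found';
}

// Extracts citations written as "Source: <file>#<section>" (assumed format).
function parseCitations(note: string): Citation[] {
  const citations: Citation[] = [];
  for (const line of note.split('\n')) {
    const match = /^\s*-?\s*Source:\s*([^#\s]+)#(.+)$/.exec(line);
    if (match) {
      citations.push({ file: match[1] ?? '', section: (match[2] ?? '').trim() });
    }
  }
  return citations;
}

// Flags citations whose file is unknown or whose section name never appears in that file.
function validateCitations(note: string, sources: Map<string, string>): CitationIssue[] {
  const issues: CitationIssue[] = [];
  for (const citation of parseCitations(note)) {
    const text = sources.get(citation.file);
    if (text === undefined) {
      issues.push({ citation, reason: 'missing-file' });
    } else if (!text.toLowerCase().includes(citation.section.toLowerCase())) {
      issues.push({ citation, reason: 'section-not-found' });
    }
  }
  return issues;
}

// Usage with made-up sources and a made-up compiled note.
const demoSources = new Map<string, string>([
  ['auth-spec.md', '# Auth\n## Token Expiry\nTokens expire after 15 minutes.'],
]);
const demoNote = [
  '## Token Expiry',
  '- Source: auth-spec.md#Token Expiry',
  '- Source: auth-spec.md#Key Rotation',
  '- Source: billing-guide.md#Invoices',
].join('\n');
const citationIssues = validateCitations(demoNote, demoSources);
console.log(citationIssues.map((issue) => issue.reason)); // [ 'section-not-found', 'missing-file' ]
```

A substring match is a weak test of relevance, so flagged and borderline citations would still go to manual review.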

7. Accumulating Obsolete Sources

Explanation: Old specifications and deprecated guides remain in /raw-inputs/, polluting future compilation cycles. Fix: Archive deprecated sources to /raw-inputs/archived/ and exclude that directory from prompt scopes. Maintain a status: active/deprecated field in source metadata.
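The archiving step can be planned before any file is moved. The sketch below assumes source metadata is available as a list of path and status pairs. The `SourceMeta` and `ArchivePlan` shapes are hypothetical, and the plan ignores the possibility of two deprecated files sharing a basename.

```typescript
import { posix } from 'path';

type SourceStatus = 'active' | 'deprecated';

interface SourceMeta {
  path: string; // relative to the workspace root
  status: SourceStatus;
}

interface ArchivePlan {
  compile: string[];                     // sources that stay in prompt scope
  moves: { from: string; to: string }[]; // deprecated sources to relocate
}

const ARCHIVE_DIR = 'raw-inputs/archived';

// Separates active sources from deprecated ones and proposes archive destinations.
// Files already under the archive directory are excluded from both lists.
function planArchive(sources: SourceMeta[]): ArchivePlan {
  const plan: ArchivePlan = { compile: [], moves: [] };
  for (const source of sources) {
    if (source.path.startsWith(`${ARCHIVE_DIR}/`)) continue;
    if (source.status === 'active') {
      plan.compile.push(source.path);
    } else {
      plan.moves.push({
        from: source.path,
        to: posix.join(ARCHIVE_DIR, posix.basename(source.path)),
      });
    }
  }
  return plan;
}

// Usage with made-up sources.
const archivePlan = planArchive([
  { path: 'raw-inputs/specs/auth-v2.md', status: 'active' },
  { path: 'raw-inputs/specs/auth-v1.md', status: 'deprecated' },
  { path: 'raw-inputs/archived/auth-v0.md', status: 'deprecated' },
]);
console.log(archivePlan.compile); // [ 'raw-inputs/specs/auth-v2.md' ]
console.log(archivePlan.moves);   // auth-v1.md is moved under raw-inputs/archived/
```

Returning a plan instead of moving files directly lets the moves be reviewed or logged before they are applied.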

Production Bundle

Action Checklist

Initialize persistent workspace with isolated directory schema (/raw-inputs/, /staging/, /synthesized/)
Configure model endpoint and API credentials in workspace settings
Deploy extraction prompt template with strict markdown output schema
Deploy relationship mapping prompt template with gap detection logic
Lock query persona to synthesized directories only
Implement citation validation script to verify file paths post-compilation
Schedule incremental compilation runs for new staging files
Archive deprecated sources and update prompt scopes accordingly

Decision Matrix

Scenario	Recommended Approach	Why	Cost Impact
Small team (<5 users), static docs	Incremental Compilation	Low infra overhead, high synthesis quality	Medium (compute for initial build)
High-frequency updates, real-time sync	Traditional RAG + Vector DB	Faster ingestion, lower latency for fresh data	High (embedding + storage costs)
Compliance/audit requirements	Incremental Compilation	Full citation trail, version-controllable markdown	Low (storage only)
Cross-domain research synthesis	Incremental Compilation	Native relationship mapping across disparate sources	Medium (prompt engineering overhead)

Configuration Template

# knowledge-system-config.yaml
workspace:
  root: /knowledge-archival
  scopes:
    raw: /raw-inputs/
    staging: /staging/
    synthesized: /synthesized/

compilation:
  extraction_prompt: |
    Analyze {source_file} and extract core technical concepts.
    Output markdown with H2 concept names, 1-2 sentence definitions,
    bullet-point properties, and source references.
  relationship_prompt: |
    Review {concept_files}. Identify implicit relationships.
    Output relationship notes with Concept A/B, relationship type,
    and evidence summary. Update master-index.md.
  max_concepts_per_run: 15
  relationship_pairs_per_run: 5

query:
  persona: |
    ROLE: Knowledge Synthesis Engine
    SCOPE: /synthesized/concepts/, /synthesized/relationships/
    CONSTRAINTS: Cite sources, flag gaps, never speculate.
  model: claude-sonnet-4-20250514
  temperature: 0.1
  max_tokens: 1024

validation:
  citation_check: true
  index_regenerate: true
  archive_deprecated: true

Quick Start Guide

1. Initialize Workspace: Create the directory schema (/raw-inputs/, /staging/, /synthesized/) in your persistent environment. Upload 3-5 source documents to /staging/.
2. Run Extraction: Execute the extraction prompt against each staging file. Verify output matches the markdown schema and move results to /synthesized/concepts/.
3. Map Relationships: Run the relationship prompt against the compiled concepts. Confirm relationship notes are generated and master-index.md is updated.
4. Configure Query Persona: Apply the constrained persona configuration. Test with a cross-concept question and verify citations point to compiled files.
5. Validate & Iterate: Run citation validation. Archive any stale sources. Add new documents to /staging/ and repeat the compilation cycle.
