n edge-list stores only existing connections, enabling fast indexed relationship lookups (O(log n) via SQLite's B-tree indexes) and straightforward traversal queries.
3. Identifier Strategy: UUIDv4 for nodes. Random, coordination-free generation makes ID collisions vanishingly unlikely when merging knowledge across machines or exporting/importing datasets.
4. Temporal Tracking: ISO-8601 timestamps on creation and modification. Enables time-based queries (e.g., "show decisions made in Q3") and supports temporal decay analysis during cleanup cycles.
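The time-based queries in item 4 rest on one property of ISO-8601 UTC timestamps as produced by `Date.prototype.toISOString()`: they are fixed-width, so plain string comparison matches chronological order. A minimal sketch, assuming UTC calendar quarters; the helpers `quarterBounds` and `isWithin` are hypothetical names, not part of the original design:

```typescript
// Hypothetical helper: half-open [start, end) bounds of a calendar quarter in UTC.
function quarterBounds(year: number, quarter: 1 | 2 | 3 | 4): { start: string; end: string } {
  const startMonth = (quarter - 1) * 3; // 0-based month index: Q1 -> 0, Q3 -> 6
  return {
    start: new Date(Date.UTC(year, startMonth, 1)).toISOString(),
    // Date.UTC rolls month 12 over into January of the following year.
    end: new Date(Date.UTC(year, startMonth + 3, 1)).toISOString(),
  };
}

// toISOString() output is fixed-width UTC, so string order equals time order.
function isWithin(timestamp: string, bounds: { start: string; end: string }): boolean {
  return timestamp >= bounds.start && timestamp < bounds.end;
}

const q3Bounds = quarterBounds(2024, 3);
console.log(q3Bounds.start); // 2024-07-01T00:00:00.000Z
console.log(q3Bounds.end); // 2024-10-01T00:00:00.000Z
console.log(isWithin('2024-08-15T09:30:00.000Z', q3Bounds)); // true
```

The same two bounds could be bound as parameters to a `WHERE created_at >= ? AND created_at < ?` clause, since the column stores the same string format.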
Implementation: TypeScript Repository Layer
The following implementation uses better-sqlite3 for synchronous, high-performance local access. The schema separates knowledge units from semantic links, enabling flexible querying without rigid inheritance hierarchies.
import Database from 'better-sqlite3';
import { v4 as uuidv4 } from 'uuid';

// Domain types
type NodeType = 'hypothesis' | 'resolution' | 'implementation' | 'citation';
type EdgeType = 'grounds' | 'overrides' | 'realizes' | 'attributes' | 'contradicts';

interface KnowledgeUnit {
  id: string;
  type: NodeType;
  label: string;
  payload: string;
  tags: string[];
  created_at: string;
  updated_at: string;
}

interface SemanticLink {
  from_id: string;
  to_id: string;
  relation: EdgeType;
  context: string;
  created_at: string;
}

class ContextGraph {
  private db: Database.Database;

  constructor(dbPath: string) {
    this.db = new Database(dbPath);
    this.db.pragma('journal_mode = WAL');
    // ON DELETE CASCADE only fires when foreign keys are enforced on this connection.
    this.db.pragma('foreign_keys = ON');
    this.initializeSchema();
  }

  private initializeSchema(): void {
    this.db.exec(`
      CREATE TABLE IF NOT EXISTS knowledge_units (
        id TEXT PRIMARY KEY,
        type TEXT NOT NULL CHECK(type IN ('hypothesis', 'resolution', 'implementation', 'citation')),
        label TEXT NOT NULL,
        payload TEXT,
        tags TEXT DEFAULT '[]',
        created_at TEXT NOT NULL,
        updated_at TEXT NOT NULL
      );
      CREATE TABLE IF NOT EXISTS semantic_links (
        from_id TEXT NOT NULL,
        to_id TEXT NOT NULL,
        relation TEXT NOT NULL CHECK(relation IN ('grounds', 'overrides', 'realizes', 'attributes', 'contradicts')),
        context TEXT,
        created_at TEXT NOT NULL,
        PRIMARY KEY (from_id, to_id, relation),
        FOREIGN KEY (from_id) REFERENCES knowledge_units(id) ON DELETE CASCADE,
        FOREIGN KEY (to_id) REFERENCES knowledge_units(id) ON DELETE CASCADE
      );
      CREATE INDEX IF NOT EXISTS idx_links_from ON semantic_links(from_id);
      CREATE INDEX IF NOT EXISTS idx_links_to ON semantic_links(to_id);
    `);
  }

  // Inserts a new unit and returns its generated UUID.
  registerUnit(type: NodeType, label: string, payload: string, tags: string[]): string {
    const id = uuidv4();
    const now = new Date().toISOString();
    const stmt = this.db.prepare(`
      INSERT INTO knowledge_units (id, type, label, payload, tags, created_at, updated_at)
      VALUES (?, ?, ?, ?, json(?), ?, ?)
    `);
    stmt.run(id, type, label, payload, JSON.stringify(tags), now, now);
    return id;
  }

  // Creates a directed link. Re-linking the same (from, to, relation) triple
  // overwrites its context and created_at because of INSERT OR REPLACE.
  connectUnits(fromId: string, toId: string, relation: EdgeType, context: string): void {
    const now = new Date().toISOString();
    const stmt = this.db.prepare(`
      INSERT OR REPLACE INTO semantic_links (from_id, to_id, relation, context, created_at)
      VALUES (?, ?, ?, ?, ?)
    `);
    stmt.run(fromId, toId, relation, context, now);
  }

  // Returns the starting unit plus every unit reachable through outgoing links within `depth` hops.
  traverseContext(unitId: string, depth: number = 2): KnowledgeUnit[] {
    const query = `
      WITH RECURSIVE context_path AS (
        SELECT id, type, label, payload, tags, created_at, updated_at, 0 AS depth
        FROM knowledge_units WHERE id = ?
        UNION ALL
        SELECT ku.id, ku.type, ku.label, ku.payload, ku.tags, ku.created_at, ku.updated_at, cp.depth + 1
        FROM knowledge_units ku
        JOIN semantic_links sl ON ku.id = sl.to_id
        JOIN context_path cp ON sl.from_id = cp.id
        WHERE cp.depth < ?
      )
      -- A unit reachable by several paths (or through a cycle) is returned once,
      -- ordered by its shortest depth; the helper depth column is not returned.
      SELECT id, type, label, payload, tags, created_at, updated_at
      FROM context_path
      GROUP BY id
      ORDER BY MIN(depth);
    `;
    const rows = this.db.prepare(query).all(unitId, depth) as Array<Omit<KnowledgeUnit, 'tags'> & { tags: string }>;
    return rows.map(row => ({
      ...row,
      tags: JSON.parse(row.tags) as string[]
    }));
  }

  close(): void {
    this.db.close();
  }
}

export { ContextGraph };
// Types are exported separately so the module also compiles under isolatedModules.
export type { NodeType, EdgeType, KnowledgeUnit };
Why This Structure Works
- JSON tag storage: Tags are stored as a JSON array in a single TEXT column. SQLite's json() function validates and minifies the value on insert, and json_each() can expand it for tag queries without a separate join table. Tags remain lightweight metadata while edges carry semantic weight.
- Recursive CTE traversal: The traverseContext method uses SQLite's recursive common table expressions to fetch multi-hop relationships along outgoing links. This removes the need for an external graph traversal library, and the index on from_id keeps each hop an indexed lookup.
- Composite primary key on edges: Prevents duplicate relationships and enforces directional semantics. ON DELETE CASCADE removes links automatically when either endpoint unit is deleted, provided foreign key enforcement is enabled on the connection.
- WAL journal mode: Lets readers proceed while a single writer commits, which matters for CLI tools that query while background processes write.
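To make the traversal semantics concrete, the sketch below performs the same depth-limited walk along outgoing links in memory. It is an illustration only, not part of the repository layer: `Link` and `reachableWithin` are hypothetical names, and the function records each unit once, at the shortest depth that reaches it.

```typescript
// Illustrative only: a directed link, mirroring from_id/to_id in semantic_links.
interface Link {
  from_id: string;
  to_id: string;
}

// Breadth-first walk along outgoing links, stopping after maxDepth hops.
// Returns each reachable unit id mapped to the shortest hop count that reaches it.
function reachableWithin(startId: string, links: Link[], maxDepth: number): Map<string, number> {
  const depthById = new Map<string, number>([[startId, 0]]);
  let frontier = new Set<string>([startId]);

  for (let depth = 1; depth <= maxDepth && frontier.size > 0; depth++) {
    const next = new Set<string>();
    for (const link of links) {
      // Follow from_id -> to_id only; skip ids already seen (guards against cycles).
      if (frontier.has(link.from_id) && !depthById.has(link.to_id)) {
        depthById.set(link.to_id, depth);
        next.add(link.to_id);
      }
    }
    frontier = next;
  }
  return depthById;
}

const sampleLinks: Link[] = [
  { from_id: 'a', to_id: 'b' },
  { from_id: 'b', to_id: 'c' },
  { from_id: 'c', to_id: 'd' },
  { from_id: 'c', to_id: 'a' }, // cycle back to the start
];
const reached = reachableWithin('a', sampleLinks, 2);
console.log(Array.from(reached.entries())); // a -> 0, b -> 1, c -> 2 ('d' is three hops away)
```

With a depth of 2, `d` is excluded because it sits three hops from `a`, and the cycle `c -> a` does not add `a` a second time.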
Pitfall Guide
1. Schema Over-Engineering
Explanation: Attempting to model every possible relationship type upfront creates rigid structures that resist evolution. Developers spend more time designing the graph than capturing knowledge.
Fix: Start with four node types and five edge types. Add new relations only when three distinct use cases require them. Treat the schema as emergent, not prescriptive.
2. Tag Sprawl & Vocabulary Drift
Explanation: Uncontrolled tagging creates synonym fragmentation (cache, caching, memoization, store). Search becomes unreliable, and automated queries fail.
Fix: Enforce a controlled vocabulary. Use edges for semantic relationships and reserve tags for orthogonal metadata (e.g., language:typescript, domain:auth). Implement a CLI command that validates tags against a known list before insertion.
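A validator of the kind described in this fix might look like the following sketch. The vocabulary contents and the strict `namespace:value` format are assumptions made for illustration (modeled on the `language:typescript` and `domain:auth` examples), and `TAG_VOCABULARY` and `findInvalidTags` are hypothetical names:

```typescript
// Hypothetical controlled vocabulary: allowed values per tag namespace.
const TAG_VOCABULARY = new Map<string, readonly string[]>([
  ['language', ['typescript', 'python', 'go']],
  ['domain', ['auth', 'billing', 'perf']],
]);

// Returns the tags that are malformed or not present in the vocabulary.
function findInvalidTags(tags: string[]): string[] {
  return tags.filter((tag) => {
    const separator = tag.indexOf(':');
    if (separator <= 0) return true; // must look like "namespace:value"
    const namespace = tag.slice(0, separator);
    const value = tag.slice(separator + 1);
    const allowed = TAG_VOCABULARY.get(namespace);
    return allowed === undefined || allowed.indexOf(value) === -1;
  });
}

console.log(findInvalidTags(['language:typescript', 'domain:auth'])); // []
console.log(findInvalidTags(['language:ts', 'caching', 'domain:auth'])); // ['language:ts', 'caching']
```

A CLI could call such a check before insertion and refuse to write the unit while the returned list is non-empty.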
3. Ignoring Edge Directionality
Explanation: Treating relationships as undirected collapses causal chains. hypothesis -> resolution means something fundamentally different from resolution -> hypothesis.
Fix: Always define source and target explicitly. Document edge semantics in a RELATIONSHIP_GUIDE.md. When querying, filter by direction: WHERE from_id = ? for outgoing context, WHERE to_id = ? for incoming dependencies.
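The difference between the two filters can be shown on a small in-memory set of links. This is a sketch under assumptions: `DirectedLink` and `linkedIds` are hypothetical names, and the example edges (a hypothesis that grounds a resolution, an implementation that realizes it) are illustrative readings of the relation names.

```typescript
// Illustrative link shape mirroring from_id/to_id/relation in semantic_links.
interface DirectedLink {
  from_id: string;
  to_id: string;
  relation: string;
}

type Direction = 'outgoing' | 'incoming';

// Outgoing: units this one points to (the WHERE from_id = ? case).
// Incoming: units that point at this one (the WHERE to_id = ? case).
function linkedIds(links: DirectedLink[], unitId: string, direction: Direction): string[] {
  return direction === 'outgoing'
    ? links.filter((link) => link.from_id === unitId).map((link) => link.to_id)
    : links.filter((link) => link.to_id === unitId).map((link) => link.from_id);
}

const exampleLinks: DirectedLink[] = [
  { from_id: 'hypothesis-1', to_id: 'resolution-1', relation: 'grounds' },
  { from_id: 'implementation-1', to_id: 'resolution-1', relation: 'realizes' },
];
console.log(linkedIds(exampleLinks, 'resolution-1', 'outgoing')); // []
console.log(linkedIds(exampleLinks, 'resolution-1', 'incoming')); // ['hypothesis-1', 'implementation-1']
```

Asking the same unit for its outgoing and incoming neighbors gives different answers, which is the information an undirected model would lose.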
4. Neglecting Temporal Decay
Explanation: Knowledge graphs accumulate stale entries. Outdated implementations and superseded decisions create noise, reducing trust in the system.
Fix: Schedule quarterly graph audits. Query units whose updated_at is more than 180 days old. Archive them or mark them as deprecated using a status field (not part of the schema shown here, so it would need to be added as a column). Automate decay alerts via a cron job that flags low-activity nodes.
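The 180-day check can be sketched as a pure function over units that have already been loaded. The names `AuditableUnit` and `findStaleUnits` are hypothetical, and the sample rows are invented for illustration:

```typescript
// Minimal shape for this sketch; only the fields the audit needs.
interface AuditableUnit {
  id: string;
  updated_at: string; // ISO-8601 UTC, as written by toISOString()
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns units whose last update is older than maxAgeDays relative to `now`.
function findStaleUnits(units: AuditableUnit[], now: Date, maxAgeDays: number = 180): AuditableUnit[] {
  const cutoff = new Date(now.getTime() - maxAgeDays * DAY_MS).toISOString();
  // Fixed-width UTC timestamps compare correctly as plain strings.
  return units.filter((unit) => unit.updated_at < cutoff);
}

const auditDate = new Date('2024-07-01T00:00:00.000Z');
const staleUnits = findStaleUnits(
  [
    { id: 'old-decision', updated_at: '2023-11-20T10:00:00.000Z' },
    { id: 'recent-fix', updated_at: '2024-06-01T08:00:00.000Z' },
  ],
  auditDate,
);
console.log(staleUnits.map((unit) => unit.id)); // ['old-decision']
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to test and to run from a scheduled job.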
5. Storing Sensitive Context
Explanation: Developers occasionally paste API keys, internal URLs, or proprietary logic into knowledge units. Local storage doesn't guarantee security if devices are shared or backed up to cloud services.
Fix: Implement a pre-commit hook or CLI validator that scans payloads for common secret patterns (regex for AKIA, sk-, password:). Replace sensitive values with placeholder references like {{vault:aws_key}}.
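A payload scanner along these lines could start from a handful of regular expressions. The three patterns below are illustrative assumptions covering only the prefixes named above; they are far from exhaustive, and `SECRET_PATTERNS` and `detectSecrets` are hypothetical names:

```typescript
// Illustrative patterns only; a real deployment would need a broader, tuned set.
const SECRET_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: 'aws-access-key-id', pattern: /AKIA[0-9A-Z]{16}/ },
  { label: 'sk-prefixed-token', pattern: /\bsk-[A-Za-z0-9_-]{16,}/ },
  { label: 'inline-password', pattern: /password\s*[:=]\s*\S+/i },
];

// Returns the labels of every pattern that matches the payload.
function detectSecrets(payload: string): string[] {
  return SECRET_PATTERNS
    .filter(({ pattern }) => pattern.test(payload))
    .map(({ label }) => label);
}

console.log(detectSecrets('Use key AKIAIOSFODNN7EXAMPLE for staging')); // ['aws-access-key-id']
console.log(detectSecrets('password: hunter2')); // ['inline-password']
console.log(detectSecrets('Investigate edge runtime caching')); // []
```

A validator could reject the insert when the result is non-empty and prompt for a placeholder such as {{vault:aws_key}} instead.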
6. Forcing Hierarchical Thinking
Explanation: Treating the graph like a folder tree leads to artificial parent-child constraints. Real engineering knowledge is mesh-like, with multiple overlapping contexts.
Fix: Embrace many-to-many relationships. A single implementation can realize multiple resolutions. A hypothesis can contradict several prior decisions. Use the graph's non-linear nature intentionally; avoid creating "category" nodes that act as folders.
7. Skipping the "Why" in Resolution Nodes
Explanation: Recording decisions without trade-offs or success criteria creates black-box artifacts. Future engineers (including yourself) cannot evaluate whether the decision still holds.
Fix: Mandate a structured payload format for resolution nodes: Problem, Chosen Path, Rejected Alternatives, Success Metrics, Review Date. Enforce this via template injection in the CLI or UI.
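One way to enforce the structured payload is to generate a template and check payloads against it before saving. The sketch below assumes a plain-text layout with one "Heading:" line per section; that layout, the helper names, and the sample payload are all assumptions for illustration.

```typescript
// Section headings taken from the recommended resolution payload format.
const REQUIRED_SECTIONS = [
  'Problem',
  'Chosen Path',
  'Rejected Alternatives',
  'Success Metrics',
  'Review Date',
] as const;

// Hypothetical template: one "Heading:" line per section, to be filled in by the author.
function resolutionTemplate(): string {
  return REQUIRED_SECTIONS.map((section) => `${section}:\n`).join('\n');
}

// Splits a payload into sections; text after a heading belongs to it until the next heading.
function parseSections(payload: string): Map<string, string> {
  const sections = new Map<string, string>();
  let current: string | null = null;
  for (const line of payload.split('\n')) {
    const heading = REQUIRED_SECTIONS.find((section) => line.startsWith(`${section}:`));
    if (heading !== undefined) {
      current = heading;
      sections.set(heading, line.slice(heading.length + 1));
    } else if (current !== null) {
      sections.set(current, `${sections.get(current) ?? ''}\n${line}`);
    }
  }
  return sections;
}

// Returns the required sections that are absent or left empty.
function missingSections(payload: string): string[] {
  const sections = parseSections(payload);
  return REQUIRED_SECTIONS.filter((section) => (sections.get(section) ?? '').trim() === '');
}

const draftPayload = [
  'Problem: Cold starts exceed the latency budget',
  'Chosen Path: Edge runtime caching',
  'Rejected Alternatives: Provisioned concurrency (cost)',
  'Success Metrics: Cold start latency within budget in staging',
].join('\n');
console.log(missingSections(draftPayload)); // ['Review Date']
console.log(missingSections(resolutionTemplate()).length); // 5
```

A CLI could pre-fill the editor with `resolutionTemplate()` and refuse to register a resolution node until `missingSections` returns an empty list.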
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Solo developer / small team | SQLite + TypeScript CLI | Zero infrastructure, ACID compliance, portable | $0 (local storage only) |
| Multi-user collaboration | SQLite + REST API + IndexedDB sync | Centralized writes, offline resilience, conflict resolution | ~$5-15/mo (VPS or serverless) |
| Enterprise-scale knowledge base | Neo4j / Amazon Neptune | Advanced graph algorithms, role-based access, audit trails | $50-200+/mo (managed service) |
| Rapid prototyping / experimentation | JSON file + in-memory graph | No setup, instant iteration, easy serialization | $0 (file I/O overhead) |
Configuration Template
schema.sql
-- journal_mode = WAL is stored in the database file and persists across connections.
PRAGMA journal_mode = WAL;
-- foreign_keys is a per-connection setting; it applies to this session only
-- and must be enabled again by every connection that relies on ON DELETE CASCADE.
PRAGMA foreign_keys = ON;

CREATE TABLE IF NOT EXISTS knowledge_units (
  id TEXT PRIMARY KEY,
  type TEXT NOT NULL CHECK(type IN ('hypothesis', 'resolution', 'implementation', 'citation')),
  label TEXT NOT NULL,
  payload TEXT,
  tags TEXT DEFAULT '[]',
  created_at TEXT NOT NULL,
  updated_at TEXT NOT NULL
);

CREATE TABLE IF NOT EXISTS semantic_links (
  from_id TEXT NOT NULL,
  to_id TEXT NOT NULL,
  relation TEXT NOT NULL CHECK(relation IN ('grounds', 'overrides', 'realizes', 'attributes', 'contradicts')),
  context TEXT,
  created_at TEXT NOT NULL,
  PRIMARY KEY (from_id, to_id, relation),
  FOREIGN KEY (from_id) REFERENCES knowledge_units(id) ON DELETE CASCADE,
  FOREIGN KEY (to_id) REFERENCES knowledge_units(id) ON DELETE CASCADE
);

CREATE INDEX IF NOT EXISTS idx_links_from ON semantic_links(from_id);
CREATE INDEX IF NOT EXISTS idx_links_to ON semantic_links(to_id);
CREATE INDEX IF NOT EXISTS idx_units_type ON knowledge_units(type);
CREATE INDEX IF NOT EXISTS idx_units_updated ON knowledge_units(updated_at);
package.json (dependencies)
{
  "dependencies": {
    "better-sqlite3": "^9.4.3",
    "uuid": "^9.0.0",
    "commander": "^12.0.0"
  },
  "devDependencies": {
    "typescript": "^5.3.3",
    "@types/better-sqlite3": "^7.6.9",
    "@types/uuid": "^9.0.7"
  }
}
Quick Start Guide
- Initialize project: Run npm init -y && npm i better-sqlite3 uuid commander && npm i -D typescript @types/better-sqlite3 @types/uuid
- Create database: Execute sqlite3 kg.db < schema.sql to provision tables and indexes.
- Register first unit: Use the TypeScript repository or CLI. The one-liner below assumes the repository layer has already been compiled to ./graph.js:
node -e "const { ContextGraph } = require('./graph'); const g = new ContextGraph('./kg.db'); console.log(g.registerUnit('hypothesis', 'Reduce cold starts', 'Investigate edge runtime caching', ['perf', 'serverless'])); g.close();"
- Link concepts (in a script that holds an open ContextGraph instance g and the IDs returned by registerUnit):
g.connectUnits(unitId1, unitId2, 'grounds', 'Based on latency benchmarks in staging');
- Query context, then close the connection:
const context = g.traverseContext(unitId1, 2); console.log(JSON.stringify(context, null, 2)); g.close();
The graph is now operational. Iterate by adding edge types as domain complexity grows, enforce tag discipline through automation, and treat the knowledge base as a living artifact that compounds with every engineering decision.