ce DocEntry {
  uid: string;       // Stable unique identifier
  slug: string;      // Human-readable name; may change
  payload: string;
  checksum: string;  // SHA-256 hex digest of payload
  metadata: {
    created: string;   // ISO 8601, UTC
    modified: string;  // ISO 8601, UTC
    owner: string;
  };
  revisionLog: RevisionRecord[];
}

interface RevisionRecord {
  seq: number;
  timestamp: string;  // ISO 8601, UTC
  actor: string;
  changes: DeltaSet;
}

interface DeltaSet {
  added: number;
  removed: number;
  summary: string;
}
```

**Rationale:**
- **`uid` and `slug`:** Separating the unique identifier from the human-readable slug prevents collisions and allows renaming without breaking references.
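- **`checksum`:** A SHA-256 hash of the payload makes corruption and out-of-band edits detectable: any tool can recompute the hash and compare it with the stored value. It is an integrity check rather than a signature, so it does not protect against someone who rewrites both the payload and the checksum.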
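- **`revisionLog`:** Storing a compact change record per revision rather than a full snapshot keeps storage overhead low while preserving the audit trail. Each record captures the actor, timestamp, and a summary of changes. As defined here, a `DeltaSet` holds line counts and a summary, which is enough for auditing but not for reconstructing earlier content.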
- **ISO 8601 Timestamps:** All timestamps use UTC to avoid timezone ambiguity in distributed teams.
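
The UTC rule can be checked at write time with a small guard. The following is a minimal sketch, assuming every timestamp is produced by `Date.prototype.toISOString()`; the name `isUtcTimestamp` is hypothetical and does not appear elsewhere in this design.

```typescript
// Matches the exact shape produced by toISOString(): YYYY-MM-DDTHH:mm:ss.sssZ
const UTC_ISO_8601 = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z$/;

// Hypothetical helper: true only for well-formed UTC timestamps.
function isUtcTimestamp(value: string): boolean {
  if (!UTC_ISO_8601.test(value)) return false;
  const ms = Date.parse(value);
  // Round-tripping rejects impossible dates (for example month 13 or February 30).
  return !Number.isNaN(ms) && new Date(ms).toISOString() === value;
}
```

A timestamp with a local offset such as `+02:00` fails the pattern, so it would be rejected before it reaches the revision log.
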
#### 2. Storage Engine
The storage layer manages file I/O with atomic writes to prevent corruption. It maintains a manifest file that indexes all documents for fast lookup.
```typescript
import fs from 'fs/promises';
import path from 'path';
import crypto from 'crypto';
// DocEntry is the interface defined in the Data Model section.

// Shape of manifest.json: the list of known document UIDs.
interface Manifest {
  entries: string[];
}

// Exported so the CLI module can import it.
export class DocVault {
  private basePath: string;
  private manifestPath: string;

  constructor(baseDir: string) {
    this.basePath = path.resolve(baseDir);
    this.manifestPath = path.join(this.basePath, 'manifest.json');
  }

  // Create the vault directory and an empty manifest on first use.
  private async ensureVault(): Promise<void> {
    await fs.mkdir(this.basePath, { recursive: true });
    try {
      await fs.access(this.manifestPath);
    } catch {
      await this.writeManifest({ entries: [] });
    }
  }

  // Write to a temp file, then rename it over the target so readers never see a partial file.
  private async writeManifest(manifest: Manifest): Promise<void> {
    const tmpPath = `${this.manifestPath}.tmp`;
    await fs.writeFile(tmpPath, JSON.stringify(manifest, null, 2));
    await fs.rename(tmpPath, this.manifestPath);
  }

  private computeChecksum(content: string): string {
    return crypto.createHash('sha256').update(content).digest('hex');
  }

  async saveEntry(entry: DocEntry): Promise<void> {
    await this.ensureVault();
    const entryPath = path.join(this.basePath, `${entry.uid}.json`);
    const tmpPath = `${entryPath}.tmp`;

    // Refresh derived fields before persisting (this mutates the caller's object).
    entry.checksum = this.computeChecksum(entry.payload);
    entry.metadata.modified = new Date().toISOString();

    await fs.writeFile(tmpPath, JSON.stringify(entry, null, 2));
    await fs.rename(tmpPath, entryPath);

    // Register the document in the manifest if it is new.
    const manifest: Manifest = JSON.parse(await fs.readFile(this.manifestPath, 'utf-8'));
    if (!manifest.entries.includes(entry.uid)) {
      manifest.entries.push(entry.uid);
      await this.writeManifest(manifest);
    }
  }

  async loadEntry(uid: string): Promise<DocEntry | null> {
    const entryPath = path.join(this.basePath, `${uid}.json`);
    try {
      const raw = await fs.readFile(entryPath, 'utf-8');
      return JSON.parse(raw) as DocEntry;
    } catch (err) {
      // A missing file means "not found"; bad JSON or permission problems are real errors.
      if ((err as NodeJS.ErrnoException).code === 'ENOENT') return null;
      throw err;
    }
  }
}
```

**Rationale:**
- **Atomic Writes:** Writing to a temporary file and then calling `rename` means a reader never sees a partially written file. If the process crashes mid-write, the original file remains intact. This relies on the temporary file and the target living on the same filesystem, which is the case here because both sit in the vault directory.
- **Checksum Verification:** The `computeChecksum` method generates a SHA-256 hash of the payload. This allows downstream tools to verify that the content hasn't been altered outside the CLI.
- **Manifest Management:** The manifest acts as a lightweight index of document UIDs, enabling fast listing without scanning the entire directory.
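
The storage code computes a checksum on save but never compares it on load. A downstream check could look roughly like the sketch below; `verifyChecksum` and `StoredEntry` are hypothetical names, and the only assumption is that the stored checksum is the hex SHA-256 of the payload, as in `computeChecksum`.

```typescript
import { createHash } from 'crypto';

// Only the two fields needed for verification; a full DocEntry also satisfies this shape.
interface StoredEntry {
  payload: string;
  checksum: string;
}

function sha256Hex(content: string): string {
  return createHash('sha256').update(content).digest('hex');
}

// Hypothetical helper: true when the payload still matches its recorded checksum.
function verifyChecksum(entry: StoredEntry): boolean {
  return sha256Hex(entry.payload) === entry.checksum;
}
```

A caller could run this right after loading an entry and refuse to apply an update when it returns `false`.
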
#### 3. Delta Engine
The delta engine computes differences between revisions. Instead of storing full text copies, it records how many lines were added and removed, along with a short summary. This keeps each revision record small and cheap to display.
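```typescript
// DeltaSet is the interface defined in the Data Model section.
// Exported so the CLI module can import it.
export function generateDelta(oldContent: string, newContent: string): DeltaSet {
  const oldLines = oldContent.split(/\r?\n/);
  const newLines = newContent.split(/\r?\n/);
  let added = 0;
  let removed = 0;

  // Compare lines position by position; indexes past the end of one side fall back to ''.
  const maxLen = Math.max(oldLines.length, newLines.length);
  for (let i = 0; i < maxLen; i++) {
    const oldLine = oldLines[i] ?? '';
    const newLine = newLines[i] ?? '';
    if (oldLine !== newLine) {
      // Empty strings are falsy, so blank and missing lines are not counted on either side.
      if (oldLine) removed++;
      if (newLine) added++;
    }
  }

  return {
    added,
    removed,
    summary: `${added} added, ${removed} removed`,
  };
}
```
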
**Rationale:**
- **Line-Based Comparison:** Splitting on `/\r?\n/` handles both Unix and Windows line endings. Lines are compared position by position, which keeps the code simple but means an insertion near the top of a file shifts every later line and inflates the counts.
- **Delta Summary:** The `summary` field offers a quick human-readable overview of changes, useful for CLI output and audit logs.
- **Extensibility:** This function can be replaced with a more sophisticated algorithm (e.g., Myers diff) if needed, while the `DeltaSet` interface remains stable.
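
To illustrate the extensibility point, here is a sketch of a drop-in alternative that returns the same `DeltaSet` shape. It is not Myers diff: it uses a plain longest-common-subsequence (LCS) table, which is simpler and costs O(n·m) time. The names `lcsLength` and `generateLcsDelta` are hypothetical.

```typescript
interface DeltaSet {
  added: number;
  removed: number;
  summary: string;
}

// Length of the longest common subsequence of two line arrays (row-by-row dynamic programming).
function lcsLength(a: string[], b: string[]): number {
  // prev[j] holds the LCS length of the first i-1 lines of a and the first j lines of b.
  let prev: number[] = new Array<number>(b.length + 1).fill(0);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = new Array<number>(b.length + 1).fill(0);
    for (let j = 1; j <= b.length; j++) {
      curr[j] = a[i - 1] === b[j - 1]
        ? prev[j - 1] + 1
        : Math.max(prev[j], curr[j - 1]);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Lines outside the common subsequence count as removed (old side) or added (new side).
function generateLcsDelta(oldContent: string, newContent: string): DeltaSet {
  const oldLines = oldContent.split(/\r?\n/);
  const newLines = newContent.split(/\r?\n/);
  const common = lcsLength(oldLines, newLines);
  const removed = oldLines.length - common;
  const added = newLines.length - common;
  return { added, removed, summary: `${added} added, ${removed} removed` };
}
```

With this variant, inserting one line at the top of a three-line document is reported as a single addition, because the three unchanged lines are matched regardless of their new positions.
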
#### 4. CLI Interface
The CLI provides commands for creating, updating, and reviewing documents. It integrates with the storage engine and delta engine to manage the lifecycle.
```typescript
import { Command } from 'commander';
import { randomUUID } from 'crypto';
import { DocVault } from './vault';
import { generateDelta } from './delta';
// DocEntry is the interface defined in the Data Model section.

const program = new Command();
const vault = new DocVault('./docs');

program
  .command('create <slug> <owner>')
  .argument('<content>', 'Initial content')
  .action(async (slug: string, owner: string, content: string) => {
    const uid = randomUUID();
    const now = new Date().toISOString();
    const entry: DocEntry = {
      uid,
      slug,
      payload: content,
      checksum: '', // Filled in by saveEntry
      metadata: { created: now, modified: now, owner },
      revisionLog: [{
        seq: 1,
        timestamp: now,
        actor: owner,
        changes: { added: content.split('\n').length, removed: 0, summary: 'Initial creation' },
      }],
    };
    await vault.saveEntry(entry);
    console.log(`Created doc: ${uid}`);
  });

program
  .command('update <uid> <actor>')
  .argument('<content>', 'New content')
  .action(async (uid: string, actor: string, content: string) => {
    const entry = await vault.loadEntry(uid);
    if (!entry) {
      console.error('Doc not found');
      process.exit(1);
    }
    // Compute the delta against the previous payload before overwriting it.
    const delta = generateDelta(entry.payload, content);
    const nextSeq = entry.revisionLog.length + 1;
    const now = new Date().toISOString();
    entry.payload = content;
    entry.revisionLog.push({
      seq: nextSeq,
      timestamp: now,
      actor,
      changes: delta,
    });
    await vault.saveEntry(entry);
    console.log(`Updated doc: ${uid} | ${delta.summary}`);
  });

// The action handlers are async, so use parseAsync and surface any rejection.
program.parseAsync().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

**Rationale:**
- **Commander.js:** A lightweight argument parser that supports subcommands and type coercion. This keeps the CLI modular and extensible.
- **Sequential Revision Numbers:** The `seq` field ensures that revisions are ordered correctly, even if timestamps collide.
- **Immediate Feedback:** The CLI outputs the UID and delta summary, providing instant confirmation of actions.
## Pitfall Guide
Implementing a local-first documentation system requires careful attention to data integrity and workflow consistency. The following pitfalls highlight common mistakes and their solutions.
| Pitfall | Explanation | Fix |
|---|---|---|
| Race Conditions | Concurrent writes to the same file can corrupt data or overwrite changes. | Use atomic writes with temporary files and rename. Implement file locking for multi-process environments. |
| ID Instability | Using slugs or titles as identifiers leads to collisions when content changes. | Generate UUIDs for `uid` and treat slugs as mutable metadata. Never rely on slugs for internal references. |
| Timezone Drift | Local timestamps vary across machines, causing audit confusion. | Enforce UTC ISO 8601 timestamps at write time. Validate timestamps during sync operations. |
| History Bloat | Storing full snapshots for every edit consumes excessive storage. | Store deltas instead of full copies. Implement a pruning strategy to compress old revisions. |
| Diff Noise | Whitespace changes trigger false positives in diffs. | Normalize whitespace before comparison. Ignore trailing spaces and line ending differences. |
| Sync Conflicts | Offline edits can diverge, causing merge failures. | Use last-write-wins with checksum validation. Implement a manual merge workflow for critical conflicts. |
| Missing Provenance | Edits without actor attribution break audit trails. | Require the `actor` field for all updates. Validate that every revision record includes a valid actor. |
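
For the Diff Noise row, normalization can happen just before content is handed to the delta function. The sketch below is one possible approach; `normalizeForDiff` is a hypothetical name, and it only handles line endings and trailing spaces or tabs, leaving leading indentation untouched.

```typescript
// Normalize text before diffing so purely cosmetic whitespace edits do not register as changes.
function normalizeForDiff(content: string): string {
  return content
    .replace(/\r\n?/g, '\n')                     // CRLF or lone CR becomes LF
    .split('\n')
    .map((line) => line.replace(/[ \t]+$/, ''))  // strip trailing spaces and tabs
    .join('\n');
}
```

Passing both the old and the new payload through this function before calling `generateDelta` would make a CRLF-to-LF conversion produce an empty delta.
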
## Production Bundle
This section provides actionable resources for deploying and maintaining a local-first documentation system in production.
### Action Checklist
### Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Solo Developer | Pure CLI + Local Files | Minimal overhead, full control, instant feedback. | Low (No infrastructure) |
| Small Team | CLI + Git Sync | Leverages existing Git workflows, enables code review for docs. | Low (Git hosting) |
| Enterprise | CLI + Sync Service | Centralized audit, access control, automated compliance. | Medium (Service maintenance) |
| High Compliance | CLI + Checksums + CI | Immutable audit trail, automated validation, tamper detection. | Medium (CI/CD setup) |
### Configuration Template
Use this configuration file to customize the vault behavior. It covers paths, sync settings, validation rules, and pruning.
```json
{
  "vault": {
    "basePath": "./docs",
    "manifestFile": "manifest.json",
    "atomicWrites": true
  },
  "sync": {
    "strategy": "git",
    "remote": "origin",
    "branch": "main"
  },
  "validation": {
    "requireChecksum": true,
    "requireActor": true,
    "maxHistorySize": 100
  },
  "pruning": {
    "enabled": true,
    "threshold": 30,
    "compressOld": true
  }
}
```

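
The code shown earlier hard-codes `./docs` and does not read this file, so how the template is consumed is left open. As one possibility, a loader could merge the file over defaults and validate a few fields. Everything in the sketch below (`VaultConfig`, `DEFAULT_CONFIG`, `parseConfig`, and the specific validation rules) is an assumption for illustration; only the `vault` and `validation` sections are modelled.

```typescript
// Hypothetical typed view of two sections of the template.
interface VaultSection { basePath: string; manifestFile: string; atomicWrites: boolean; }
interface ValidationSection { requireChecksum: boolean; requireActor: boolean; maxHistorySize: number; }
interface VaultConfig { vault: VaultSection; validation: ValidationSection; }

// Defaults mirror the values in the template above.
const DEFAULT_CONFIG: VaultConfig = {
  vault: { basePath: './docs', manifestFile: 'manifest.json', atomicWrites: true },
  validation: { requireChecksum: true, requireActor: true, maxHistorySize: 100 },
};

// Parse a JSON string, fill in missing keys from the defaults, and reject obviously bad values.
function parseConfig(json: string): VaultConfig {
  const raw = JSON.parse(json) as {
    vault?: Partial<VaultSection>;
    validation?: Partial<ValidationSection>;
  };
  const config: VaultConfig = {
    vault: { ...DEFAULT_CONFIG.vault, ...raw.vault },
    validation: { ...DEFAULT_CONFIG.validation, ...raw.validation },
  };
  const { maxHistorySize } = config.validation;
  if (!Number.isInteger(maxHistorySize) || maxHistorySize < 1) {
    throw new Error('validation.maxHistorySize must be a positive integer');
  }
  if (config.vault.basePath.trim() === '') {
    throw new Error('vault.basePath must not be empty');
  }
  return config;
}
```

A partial file such as `{"vault": {"basePath": "./notes"}}` would then yield a complete configuration with the remaining keys taken from the defaults.
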
### Quick Start Guide
Follow these steps to initialize and use the documentation vault in under five minutes.
- **Initialize Vault:** Run `doc-vault init` to create the directory structure and manifest file.
- **Create Document:** Execute `doc-vault create "api-guide" "alice" "Initial API documentation..."` to add a new entry.
- **Update Content:** Use `doc-vault update <uid> "bob" "Updated API guide with rate limits..."` to apply changes and record the delta.
- **Review History:** Run `doc-vault history <uid>` to view the revision log, including timestamps, actors, and change summaries.
- **Sync Changes:** If using Git, commit the `docs/` directory and push to the remote repository to share updates with the team.
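
The CLI code shown earlier defines only `create` and `update`, so the `history` step needs a little extra code. As a rough sketch, the output for that step could be produced by a pure formatting function like the one below; `formatHistory` and its line layout are assumptions, and the interfaces are repeated so the snippet stands alone.

```typescript
interface DeltaSet {
  added: number;
  removed: number;
  summary: string;
}

interface RevisionRecord {
  seq: number;
  timestamp: string;
  actor: string;
  changes: DeltaSet;
}

// Hypothetical helper: one display line per revision, ordered by sequence number.
function formatHistory(log: RevisionRecord[]): string[] {
  return [...log]                      // copy so the caller's array is not reordered
    .sort((a, b) => a.seq - b.seq)
    .map((r) => `#${r.seq}  ${r.timestamp}  ${r.actor}  ${r.changes.summary}`);
}
```

A `history <uid>` command could load the entry with `vault.loadEntry(uid)` and print each line returned by `formatHistory(entry.revisionLog)`.
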