I made Claude Code and Codex talk to each other across machines. Here's what broke.
Cross-Session Agent Coordination: A File-First Architecture for Convergent AI Workflows
Current Situation Analysis
Modern AI coding assistants operate in strict isolation. Each session maintains a private context window, unaware of parallel work happening in other terminals, hosts, or even different vendor ecosystems. When developers run Claude Code on Windows, Claude Code on Linux, and Codex in a tmux pane simultaneously, they are effectively managing three independent reasoning engines. The default architecture assumes human mediation: you read output from Session A, manually relay context to Session B, and hope the convergence happens before context windows drift.
This siloed design is a direct consequence of how the industry prioritizes parallelism over convergence. Most multi-agent frameworks focus on task decomposition: split a repository into five subtasks, dispatch them to isolated workers, and merge the results. That model works for batch processing, but it fails for architectural decision-making, API contract negotiation, and iterative refactoring. Those workflows require conversational turn-taking, shared evidence, and explicit convergence protocols.
The problem is frequently overlooked because developers treat AI sessions as disposable scratchpads rather than persistent collaborators. When sessions are ephemeral, coordination overhead feels acceptable. When sessions become long-running, cross-platform, and multi-vendor, manual context shuttling becomes a bottleneck. File-based state sharing emerged as a pragmatic response to this gap. By leveraging append-only markdown files with structured frontmatter, teams can establish a product-neutral, platform-agnostic coordination layer that requires zero infrastructure, zero network ports, and zero vendor-specific APIs. The trade-off is latency: synchronous file locking and read-back verification typically add ~1 second per turn. For human-paced AI coordination, that delay is negligible. For high-frequency message passing, it is unacceptable. The architecture deliberately optimizes for auditability and convergence over throughput.
WOW Moment: Key Findings
The shift from isolated execution to shared-state coordination fundamentally changes how AI sessions behave. When given a transparent, append-only surface, sessions stop operating as independent calculators and start exhibiting convergent reasoning patterns. They reference shared evidence, challenge conflicting assumptions, and self-correct when discrepancies appear in the conversation log.
| Approach | Context Synchronization | Cross-Platform Support | Debugging Visibility | Convergence Latency |
|---|---|---|---|---|
| Parallel Task Queues | None (stateless workers) | Limited (requires unified runtime) | Low (aggregated logs only) | High (merge conflicts post-execution) |
| API-Based Orchestration | Tight (vendor-dependent) | Poor (cross-vendor friction) | Medium (structured payloads) | Low (<100ms, but high infra cost) |
| File-Based Convergence Layer | Shared (append-only evidence) | Native (OS-agnostic file I/O) | High (plain-text audit trail) | Medium (~1s/turn, acceptable for reasoning) |
This finding matters because it decouples coordination from vendor lock-in. The file system becomes the universal protocol. Sessions no longer need to understand each other's internal APIs; they only need to agree on a shared directory structure, a turn-taking schema, and a delivery mechanism. The result is a lightweight convergence layer that works across Windows, Linux, tmux, and Windows Terminal without requiring custom plugins or cloud dependencies.
Core Solution
The architecture rests on three pillars: a shared state format, an atomic turn submission mechanism, and a platform-aware notification system. Each component is designed to be auditable, fallback-safe, and vendor-neutral.
Step 1: Define the Shared State Format
Conversations live in a dedicated directory. Each conversation is a single append-only markdown file with a YAML frontmatter header. The header tracks metadata; the body tracks turn history.
// types/conversation.ts
export interface ConversationHeader {
  topic: string;
  initiator: string;
  participants: string[];
  status: 'ACTIVE' | 'RESOLVED' | 'BLOCKED';
  turnCount: number;
  maxTurns: number;
  nextSpeaker: string;
}
export interface TurnEntry {
  speaker: string;
  timestamp: string;
  content: string;
  nonce: string;
}
The append-only constraint is non-negotiable. Every turn adds a new block to the file. The header updates atomically alongside the body. This design guarantees that any session can reconstruct the full conversation history by reading the file from top to bottom, and any discrepancy becomes immediately visible through diff tools.
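To make the format concrete, here is a minimal sketch of how a header could be written and read back. The article only specifies the field names, so the flat `key: value` layout, the comma-separated `participants` line, and the helper names `renderHeader` and `parseHeader` are assumptions for illustration.

```typescript
// Hypothetical helpers: the on-disk layout is an assumption, not the article's exact format.
const STATUSES = ['ACTIVE', 'RESOLVED', 'BLOCKED'] as const;
type Status = (typeof STATUSES)[number];

interface ConversationHeader {
  topic: string;
  initiator: string;
  participants: string[];
  status: Status;
  turnCount: number;
  maxTurns: number;
  nextSpeaker: string;
}

function isStatus(value: string): value is Status {
  return (STATUSES as readonly string[]).indexOf(value) !== -1;
}

// Serialize the header as a frontmatter block delimited by '---' lines.
function renderHeader(h: ConversationHeader): string {
  const lines = [
    `topic: ${h.topic}`,
    `initiator: ${h.initiator}`,
    `participants: ${h.participants.join(', ')}`,
    `status: ${h.status}`,
    `turnCount: ${h.turnCount}`,
    `maxTurns: ${h.maxTurns}`,
    `nextSpeaker: ${h.nextSpeaker}`,
  ];
  return `---\n${lines.join('\n')}\n---\n`;
}

// Read the frontmatter block back into a typed header, failing loudly on gaps.
function parseHeader(source: string): ConversationHeader {
  const match = source.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!match) throw new Error('Invalid conversation format');
  const fields = new Map<string, string>();
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(':'); // split on the first colon only
    if (idx > 0) fields.set(line.slice(0, idx).trim(), line.slice(idx + 1).trim());
  }
  const get = (key: string): string => {
    const value = fields.get(key);
    if (value === undefined) throw new Error(`Missing header field: ${key}`);
    return value;
  };
  const status = get('status');
  if (!isStatus(status)) throw new Error(`Unknown status: ${status}`);
  return {
    topic: get('topic'),
    initiator: get('initiator'),
    participants: get('participants').split(',').map((p) => p.trim()).filter((p) => p.length > 0),
    status,
    turnCount: Number(get('turnCount')),
    maxTurns: Number(get('maxTurns')),
    nextSpeaker: get('nextSpeaker'),
  };
}
```

A hand-rolled parser like this only handles flat scalar fields; anything richer is better served by a real YAML library.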
Step 2: Implement Atomic Turn Submission
Direct file writes introduce race conditions. The solution uses a file lock, a unique nonce, and a read-back verification step.
// core/turn-submitter.ts
import { randomBytes } from 'crypto';
import { readFileSync, writeFileSync, renameSync } from 'fs';
import { lock } from 'proper-lockfile';
export class TurnSubmitter {
  private readonly lockPath: string;
  private readonly conversationPath: string;
  constructor(conversationPath: string) {
    this.conversationPath = conversationPath;
    this.lockPath = `${conversationPath}.lock`;
  }
  async submitTurn(speaker: string, content: string): Promise<boolean> {
    // proper-lockfile resolves the path it locks, so lock the conversation file (which exists)
    // and pass lockPath as the location of the lock artifact.
    const release = await lock(this.conversationPath, { lockfilePath: this.lockPath, retries: 3, stale: 10000 });
    try {
      // 12 hex characters of randomness, used as a write receipt
      const nonce = randomBytes(6).toString('hex');
      const current = readFileSync(this.conversationPath, 'utf-8');
      const updated = this.appendTurnBlock(current, speaker, content, nonce);
      // Atomic write via temp file + rename
      const tempPath = `${this.conversationPath}.tmp`;
      writeFileSync(tempPath, updated, 'utf-8');
      renameSync(tempPath, this.conversationPath);
      // Read-back verification
      const verified = readFileSync(this.conversationPath, 'utf-8');
      if (!verified.includes(`<!-- nonce: ${nonce} -->`)) {
        throw new Error('Write verification failed: nonce missing');
      }
      return true;
    } finally {
      await release();
    }
  }
  private appendTurnBlock(source: string, speaker: string, content: string, nonce: string): string {
    // Accept both LF and CRLF line endings around the frontmatter delimiters
    const headerMatch = source.match(/^---\r?\n([\s\S]*?)\r?\n---/);
    if (!headerMatch) throw new Error('Invalid conversation format');
    const header = headerMatch[1];
    const body = source.slice(headerMatch[0].length);
    // Count only headings that start a line, so quoted text inside a turn is not miscounted
    const turnCount = (body.match(/^## Turn \d+/gm) || []).length + 1;
    return `---\n${header.replace(/turnCount: \d+/, `turnCount: ${turnCount}`)}\n---\n${body}\n## Turn ${turnCount} -- ${speaker}\n${content}\n<!-- nonce: ${nonce} -->\n`;
  }
}
Architecture Rationale:
- proper-lockfile provides cross-platform advisory locking without requiring a daemon.
- The temp-file + renameSync pattern swaps the file in a single step on POSIX file systems and Windows NTFS, so readers see either the old version or the new one, never a partial write.
- The nonce acts as a write receipt. If the read-back fails, the submitter knows a concurrent write occurred and can retry with jitter.
- Header parsing is regex-based for simplicity, but production deployments should migrate to a strict YAML parser with schema validation.
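The rationale mentions retrying with jitter when read-back verification fails, but the submitter shown only throws. A minimal sketch of such a retry wrapper follows; `backoffDelayMs`, `submitWithRetry`, and the default delays are hypothetical and not part of the article's code.

```typescript
// Hypothetical helper: delay before retry number `attempt` (0-based),
// using exponential backoff with full jitter, capped at capMs.
function backoffDelayMs(
  attempt: number,
  baseMs = 100,
  capMs = 2000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Hypothetical wrapper: re-runs a submit function until it succeeds
// or the attempt budget is exhausted, then rethrows the last error.
async function submitWithRetry(
  submit: () => Promise<boolean>,
  maxAttempts = 5,
): Promise<boolean> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await submit();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        const delay = backoffDelayMs(attempt);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

With the class above, a call would look like `await submitWithRetry(() => submitter.submitTurn('codex', 'Proposed contract: ...'))`. Because every attempt re-reads the file and generates a fresh nonce, a retry does not reuse stale content.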
Step 3: Build the Notification Watcher
Polling every session for new turns wastes CPU and introduces latency. Instead, a lightweight watcher monitors the conversation directory and injects a prompt into the target terminal when it's that session's turn.
// daemon/watcher.ts
import { watch } from 'chokidar';
import { execFileSync } from 'child_process';
import { readFileSync } from 'fs';
export class ConversationWatcher {
  private readonly targetDir: string;
  private readonly sessionAlias: string;
  constructor(targetDir: string, sessionAlias: string) {
    this.targetDir = targetDir;
    this.sessionAlias = sessionAlias;
  }
  start(): void {
    const watcher = watch(this.targetDir, { ignoreInitial: true });
    watcher.on('change', (filePath) => {
      if (!filePath.endsWith('.md')) return;
      const content = readFileSync(filePath, 'utf-8');
      // Aliases such as dev-session-1 contain hyphens, so \w+ alone would truncate them
      const nextMatch = content.match(/^nextSpeaker:\s*([\w-]+)\s*$/m);
      if (!nextMatch || nextMatch[1] !== this.sessionAlias) return;
      this.injectPrompt(filePath, this.sessionAlias);
    });
  }
  private injectPrompt(filePath: string, alias: string): void {
    const message = `Read ${filePath} (you are '${alias}')`;
    if (process.platform === 'win32') {
      // Windows Terminal injection via PowerShell SendKeys.
      // SendKeys treats + ^ % ~ ( ) { } [ ] as control characters, so wrap each in braces;
      // single quotes are doubled to keep the PowerShell string literal intact.
      const keys = message.replace(/[+^%~(){}[\]]/g, '{$&}').replace(/'/g, "''");
      const script = `Add-Type -AssemblyName System.Windows.Forms; [System.Windows.Forms.SendKeys]::SendWait('${keys}{ENTER}')`;
      // SendKeys types into the foreground window, so the bound window (Step 4) must have focus
      execFileSync('powershell', ['-NoProfile', '-Command', script], { stdio: 'inherit' });
    } else {
      // Linux/macOS injection via tmux; an argument array avoids shell quoting problems
      execFileSync('tmux', ['send-keys', '-t', alias, message, 'Enter'], { stdio: 'inherit' });
    }
  }
}
Architecture Rationale:
- chokidar provides cross-platform file system events and can be switched to polling (its usePolling option) where native watchers are unreliable, e.g., on network mounts.
- Terminal injection bypasses the need for custom agent plugins. The AI session receives the prompt as if a human typed it, triggering its standard read-and-respond loop.
- Platform branching isolates delivery mechanics from the core protocol. The watcher only cares about the nextSpeaker field; it doesn't validate conversation logic.
Step 4: Handle Terminal Binding Ambiguity
On Windows, a single Windows Terminal process can host several windows and tabs, so a process ID alone does not identify the target. Injecting into the wrong window breaks coordination. The solution uses an OSC title escape sequence at session startup to tag the window, then resolves the target via window enumeration.
// utils/terminal-binder.ts
import { execFileSync } from 'child_process';
import { mkdirSync, readFileSync, writeFileSync } from 'fs';
const REGISTRY_DIR = './.talks/registry';
export class TerminalBinder {
  static bindWindowsTerminal(alias: string): void {
    const marker = `TALK-INIT-${alias}-${Math.random().toString(36).slice(2, 8)}`;
    // Write OSC escape to stdout to set window title
    process.stdout.write(`\x1b]0;${marker}\x07`);
    // Store marker in a local registry for later HWND resolution
    mkdirSync(REGISTRY_DIR, { recursive: true });
    writeFileSync(`${REGISTRY_DIR}/${alias}.marker`, marker);
  }
  static resolveTarget(alias: string): string {
    const marker = readFileSync(`${REGISTRY_DIR}/${alias}.marker`, 'utf-8').trim();
    // PowerShell enumeration returns the main window handle (HWND) of the matching process.
    // Get-Process exposes one main window title per process, so only the currently
    // titled window of each process can match.
    const script = `Get-Process | Where-Object { $_.MainWindowTitle -like '*${marker}*' } | Select-Object -ExpandProperty MainWindowHandle`;
    return execFileSync('powershell', ['-NoProfile', '-Command', script], { encoding: 'utf-8' }).trim();
  }
}
Architecture Rationale:
- The OSC escape (\x1b]0;...\x07) is an xterm-style title sequence that Windows Terminal respects. It runs before the AI agent captures stdout, ensuring the marker reaches the PTY.
- Binding is launcher-only. Agents cannot self-bind, which prevents malicious or misconfigured sessions from hijacking terminal targets.
- The marker registry provides a deterministic lookup path for the watcher daemon.
Pitfall Guide
1. Direct File Mutation Bypass
Explanation: AI sessions may attempt to edit conversation files directly using their native Edit or Write tools. This bypasses file locks, skips nonce generation, and corrupts the turn counter.
Fix: Enforce a PreToolUse hook or wrapper script that intercepts file operations targeting .talks/. Block direct writes and return a strict instruction to use the coordinator CLI. Document the rule in AGENTS.md for sessions lacking hook support.
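The core of such a hook or wrapper is a path check. The sketch below shows one way to decide whether a tool call targets the protected directory; `isProtectedPath` is a hypothetical helper, and wiring it into a specific agent's hook mechanism is left out because that part is vendor-specific.

```typescript
import * as path from 'node:path';

// Hypothetical guard: returns true when targetPath resolves to a location
// inside <projectRoot>/<protectedDir>, even when written with ../ segments.
function isProtectedPath(
  targetPath: string,
  projectRoot: string,
  protectedDir = '.talks',
): boolean {
  const protectedRoot = path.resolve(projectRoot, protectedDir);
  const resolved = path.resolve(projectRoot, targetPath);
  const rel = path.relative(protectedRoot, resolved);
  if (rel === '') return true; // the protected directory itself
  const escapes = rel === '..' || rel.startsWith('..' + path.sep);
  return !escapes && !path.isAbsolute(rel);
}
```

Resolving both paths before comparing them matters: a plain prefix check on the raw string would miss `src/../.talks/...` and would wrongly match a sibling such as `.talks-backup/`.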
2. Network File System Metadata Staleness
Explanation: SSHFS, NFS, and SMB often cache directory metadata. Concurrent writers may both believe they hold the lock, leading to silent overwrites.
Fix: Implement nonce-based read-back verification with exponential backoff. If the written nonce is missing after rename, retry up to 5 times with jitter. Add a doctor command that scans for missing or duplicate nonces to detect silent corruption.
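The doctor scan can be approximated with a pure function over the file contents. This is a sketch under the assumption that every turn block starts with a `## Turn N` heading and carries exactly one `<!-- nonce: ... -->` comment, as in the submitter above; `scanNonces` and `NonceReport` are hypothetical names.

```typescript
interface NonceReport {
  turnsWithoutNonce: number[]; // turn numbers whose block has no nonce comment
  duplicateNonces: string[];   // nonce values that appear more than once
}

// Hypothetical doctor check: walks each turn block and tallies nonce comments.
function scanNonces(source: string): NonceReport {
  // Split at the start of every line that begins a turn heading.
  const blocks = source
    .split(/^(?=## Turn \d+)/m)
    .filter((block) => /^## Turn \d+/.test(block));
  const counts = new Map<string, number>();
  const turnsWithoutNonce: number[] = [];

  for (const block of blocks) {
    const heading = block.match(/^## Turn (\d+)/);
    const turn = heading ? Number(heading[1]) : NaN;
    const nonceRe = /<!-- nonce: ([0-9a-f]+) -->/g;
    let found = 0;
    let m: RegExpExecArray | null;
    while ((m = nonceRe.exec(block)) !== null) {
      found++;
      counts.set(m[1], (counts.get(m[1]) ?? 0) + 1);
    }
    if (found === 0) turnsWithoutNonce.push(turn);
  }

  const duplicateNonces: string[] = [];
  counts.forEach((count, nonce) => {
    if (count > 1) duplicateNonces.push(nonce);
  });
  return { turnsWithoutNonce, duplicateNonces };
}
```

A missing nonce suggests a turn written outside the coordinator, while a duplicate suggests a block that was copied or replayed; either is worth flagging even if the file still parses.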
3. Terminal Target Ambiguity
Explanation: On Windows, process IDs map to multiple HWNDs. Injecting into the wrong terminal causes prompts to appear in unrelated sessions.
Fix: Use OSC title escapes at startup to tag windows. Resolve targets via window title enumeration rather than PID. On Linux, explicitly target tmux panes using -t session:window.pane syntax to avoid broadcast injection.
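For the tmux side, the explicit target can be built and validated before anything is sent. The helpers below are a hypothetical sketch: they only construct argument arrays for `tmux send-keys`, using its `-l` flag to send the prompt text literally, and leave execution to the caller.

```typescript
// Hypothetical helper: builds a session:window.pane target and rejects
// session names that contain anything other than word characters and hyphens.
function tmuxTarget(session: string, windowIndex: number, paneIndex: number): string {
  if (!/^[\w-]+$/.test(session)) {
    throw new Error(`Unsafe tmux session name: ${session}`);
  }
  if (!Number.isInteger(windowIndex) || !Number.isInteger(paneIndex) || windowIndex < 0 || paneIndex < 0) {
    throw new Error('Window and pane indexes must be non-negative integers');
  }
  return `${session}:${windowIndex}.${paneIndex}`;
}

// Hypothetical helper: returns the argument lists for two tmux invocations,
// one that types the message literally and one that presses Enter.
function buildSendKeysArgs(target: string, message: string): string[][] {
  return [
    ['send-keys', '-t', target, '-l', message],
    ['send-keys', '-t', target, 'Enter'],
  ];
}
```

Each inner array can be passed to `execFileSync('tmux', args)`. Keeping the text and the Enter key in separate calls means a message that happens to spell a tmux key name is still typed as text.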
4. Polling Storms & CPU Burn
Explanation: A naive watcher that polls every 100ms across dozens of conversations will spike CPU usage and trigger file system event throttling.
Fix: Use chokidar or native inotify/FSEvents watchers. Fall back to adaptive polling (1s → 5s → 10s) only when native events fail. Debounce rapid file changes to avoid duplicate injections.
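Both ideas fit in a few lines. The sketch below is an illustration under assumptions: `nextPollInterval` steps through the 1s, 5s, 10s ladder and resets on activity, and `InjectionGate` is a leading-edge variant of debouncing (the first event per file fires, repeats inside the window are dropped). Neither name comes from the article.

```typescript
const POLL_STEPS_MS = [1000, 5000, 10000];

// Hypothetical helper: back off one step when nothing changed, reset on activity.
function nextPollInterval(currentMs: number, sawChange: boolean): number {
  if (sawChange) return POLL_STEPS_MS[0];
  const idx = POLL_STEPS_MS.indexOf(currentMs);
  return POLL_STEPS_MS[Math.min(idx + 1, POLL_STEPS_MS.length - 1)];
}

// Hypothetical helper: suppresses repeat injections for the same file
// that arrive within windowMs of the last injection that fired.
class InjectionGate {
  private readonly windowMs: number;
  private readonly lastFired = new Map<string, number>();

  constructor(windowMs = 500) {
    this.windowMs = windowMs;
  }

  shouldFire(filePath: string, nowMs: number = Date.now()): boolean {
    const last = this.lastFired.get(filePath);
    if (last !== undefined && nowMs - last < this.windowMs) return false;
    this.lastFired.set(filePath, nowMs);
    return true;
  }
}
```

In a watcher, the change handler would call `gate.shouldFire(filePath)` before injecting a prompt. Passing the clock as a parameter keeps the gate easy to test without real timers.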
5. Turn Sequence Drift
Explanation: The YAML header turnCount may fall out of sync with the actual number of turn blocks in the body, especially after manual edits or failed writes.
Fix: Validate turn counts on every read. If drift is detected, automatically reconcile by counting ## Turn \d+ blocks and updating the header. Log discrepancies for audit purposes.
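The reconciliation step can be written as a pure function that takes the file contents and returns a corrected copy. This is a sketch that assumes the `turnCount: N` header line and `## Turn N` headings used earlier; `reconcileTurnCount` is a hypothetical name.

```typescript
interface ReconcileResult {
  headerCount: number; // value found in the header
  actualCount: number; // number of turn headings in the body
  drift: boolean;      // true when the two disagree
  fixed: string;       // source with the header corrected (unchanged if no drift)
}

// Hypothetical helper: treats the body as the source of truth for the turn count.
function reconcileTurnCount(source: string): ReconcileResult {
  const headerMatch = source.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!headerMatch) throw new Error('Invalid conversation format');
  const header = headerMatch[1];
  const body = source.slice(headerMatch[0].length);

  const countMatch = header.match(/^turnCount:\s*(\d+)\s*$/m);
  if (!countMatch) throw new Error('Header has no turnCount field');
  const headerCount = Number(countMatch[1]);
  const actualCount = (body.match(/^## Turn \d+/gm) || []).length;

  if (headerCount === actualCount) {
    return { headerCount, actualCount, drift: false, fixed: source };
  }
  // Rewrite only the header slice so the body is copied through untouched.
  const fixedHeader = header.replace(/^turnCount:\s*\d+/m, `turnCount: ${actualCount}`);
  const start = source.indexOf(header);
  const fixed = source.slice(0, start) + fixedHeader + source.slice(start + header.length);
  return { headerCount, actualCount, drift: true, fixed };
}
```

Returning both counts alongside the corrected text lets the caller log the discrepancy before deciding whether to write the fix back through the normal locked submission path.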
6. Cross-Vendor Protocol Drift
Explanation: Different AI models may interpret the conversation format differently. One session might add markdown tables, another might strip YAML headers, breaking parser consistency.
Fix: Define a strict schema version in the header (schema: v1.2). Include a validation step that rejects turns failing schema checks. Provide a normalize command that strips non-compliant formatting before injection.
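A validation step along these lines might look as follows. The specific rules are assumptions chosen to protect the regex-based parser shown earlier (no stray frontmatter delimiters, turn headings, or nonce markers inside a turn); `validateTurnContent` and `isSupportedSchema` are hypothetical names, and the `schema:` key follows the example in the fix above.

```typescript
interface TurnCheck {
  ok: boolean;
  problems: string[];
}

// Hypothetical validator: rejects turn content that would confuse the parser.
function validateTurnContent(content: string): TurnCheck {
  const problems: string[] = [];
  if (/^---\s*$/m.test(content)) {
    problems.push('contains a frontmatter delimiter line (---)');
  }
  if (/^## Turn \d+/m.test(content)) {
    problems.push('contains a line that looks like a turn heading');
  }
  if (/<!-- nonce: /.test(content)) {
    problems.push('contains a nonce marker');
  }
  return { ok: problems.length === 0, problems };
}

// Hypothetical check: accepts only headers that declare a known schema version.
function isSupportedSchema(header: string, supported: string[] = ['v1.2']): boolean {
  const m = header.match(/^schema:\s*(\S+)\s*$/m);
  return m !== null && supported.indexOf(m[1]) !== -1;
}
```

Ordinary markdown such as tables and code fences passes these checks; only constructs that collide with the conversation file's own markers are rejected.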
7. Silent Write Failures
Explanation: Disk full errors, permission changes, or antivirus locks can cause atomic writes to fail without throwing exceptions, leaving the conversation in a corrupted state.
Fix: Wrap all file operations in try/catch blocks with explicit error propagation. Use checksum verification post-rename. Implement a fallback .bak directory that preserves the last known good state before each write attempt.
Production Bundle
Action Checklist
- Initialize .talks/ directory structure with conversations/, registry/, and locks/ subdirectories
- Configure PreToolUse hooks or wrapper scripts to block direct file edits on conversation paths
- Deploy the watcher daemon as a background service with auto-restart on crash
- Validate terminal binding on each host using the OSC escape marker test
- Run the doctor command weekly to scan for nonce mismatches and header drift
- Set up a .gitignore rule to exclude .talks/ from version control while preserving conversation history locally
- Document the turn-taking protocol in AGENTS.md with explicit examples for cross-vendor sessions
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Single-host, single-vendor sessions | Local file coordination + tmux/terminal injection | Zero infra, native tool compatibility | None |
| Multi-host, cross-vendor sessions | SSHFS-mounted .talks/ + nonce verification | Shared state without cloud dependencies | SSHFS latency overhead |
| High-frequency agent loops (>10 turns/min) | Redis/ZMQ message bus + structured JSON | File locking becomes a bottleneck | Infrastructure cost + maintenance |
| Audit-heavy compliance environments | Append-only markdown + cryptographic nonces | Immutable history, plain-text diffing | Storage growth (~50KB/conversation) |
| Enterprise air-gapped networks | Local file coordination + manual sync scripts | No external ports or cloud APIs required | Manual sync overhead |
Configuration Template
# .talks/config.yaml
schema_version: "1.2"
conversation_dir: ".talks/conversations"
lock_strategy: "proper-lockfile"
max_turns_per_session: 20
turn_delay_ms: 1000
watcher:
  polling_fallback: true
  debounce_ms: 500
  inject_method: "auto" # auto, tmux, sendkeys
security:
  block_direct_edits: true
  require_nonce_verification: true
  audit_log: ".talks/audit.log"
Quick Start Guide
- Initialize the coordination layer: Run mkdir -p .talks/{conversations,registry,locks} and place the config.yaml template in the root directory.
- Launch the watcher daemon: Execute node watcher.js --alias dev-session-1 --dir .talks/conversations in a background terminal. Verify it detects file changes.
- Bind your terminal: Run the launcher script that emits the OSC escape sequence. Confirm the window title updates with the TALK-INIT- marker.
- Start a conversation: Create .talks/conversations/api-contract.md with the YAML header and first turn block. The watcher will inject the prompt into the target session.
- Validate convergence: Monitor .talks/audit.log for nonce verification results. Run the doctor command to confirm header integrity and turn sequence alignment.
This architecture trades millisecond latency for transparency, auditability, and cross-platform neutrality. When AI sessions share a single source of truth, they stop operating as isolated calculators and start behaving as coordinated reasoning units. The file system isn't just storage; it's the protocol.