ecc: Building the Operating System for AI Coding Agents - 230+ Skills, 60 Agents, Cross-Harness
By Codcompass Team · 9 min read
Orchestrating AI Coding Agents: A Cross-Platform Infrastructure for Reproducible Engineering
Current Situation Analysis
The proliferation of AI coding agents has shifted the developer landscape from manual implementation to orchestration. Tools like OpenAI Codex, Cursor, Gemini CLI, and GitHub Copilot provide raw generative power, but relying on them as isolated utilities introduces significant engineering debt. The industry is currently facing a "structure gap" where model capability outpaces workflow governance.
The Core Pain Points:
Non-Deterministic Output: Without standardized instruction sets, the same task yields divergent results across sessions or team members. This variance breaks CI/CD consistency and complicates code reviews.
Context Fragmentation: Agents operate in ephemeral sessions. Long-running engineering tasks suffer from context window exhaustion, leading to lost state, forgotten constraints, and regression errors.
Security and Scope Drift: Unrestricted agents can modify critical infrastructure, bypass verification steps, or introduce supply-chain vulnerabilities. There is rarely a runtime mechanism to enforce file-scoping or action boundaries.
Vendor Lock-in: Skills, rules, and workflows are often hardcoded to a specific harness (e.g., Cursor rules vs. Codex plugins). Migrating tools requires rewriting entire configuration layers, stifling adoption of superior interfaces.
Why This Is Overlooked:
Engineering teams often treat AI agents as advanced autocomplete rather than autonomous subsystems. The focus remains on prompt engineering rather than system architecture. However, production-grade AI assistance requires the same rigor as microservices: defined interfaces, state management, security gates, and observability.
Data-Backed Evidence:
Analysis of mature agent workflows reveals that structured systems managing 60 distinct agent roles, 230 reusable skill modules, and 110 language-specific rules reduce output variance by enforcing deterministic behavior. Systems implementing runtime hooks and state persistence demonstrate higher reliability in complex refactoring tasks compared to ad-hoc usage.
WOW Moment: Key Findings
Transitioning from ad-hoc agent usage to a structured infrastructure yields measurable improvements in consistency, security, and portability. The following comparison highlights the operational delta between unstructured usage and a governed agent environment.
| Dimension | Ad-Hoc Agent Usage | Structured Agent Infrastructure |
| --- | --- | --- |
| Output Consistency | Low. Results vary based on session context and prompt phrasing. | High. Deterministic rules and scoped agents ensure repeatable outcomes. |
| Security Posture | Reactive. Issues detected post-merge or via manual review. | Proactive. Runtime hooks validate scope, block destructive operations, and scan for vulnerabilities before changes reach the codebase. |
| Portability | Zero. Skills and rules are tied to a single tool's configuration format. | High. Cross-harness adapters allow skills to function across Codex, Cursor, Gemini, and others. |
| State Persistence | None. Context resets per session; long tasks require manual context injection. | Persistent. SQLite-backed state stores and compaction prompts maintain continuity across restarts. |
| Observability | Black box. Limited visibility into agent decision paths. | Transparent. Session snapshots, status dashboards, and audit logs provide full traceability. |
Why This Matters:
This shift enables AI agents to function as reliable engineering teammates rather than experimental tools. By decoupling skills from the harness and enforcing runtime governance, teams can scale AI adoption without compromising code quality or security compliance. The infrastructure approach turns agent interactions from a liability into a reproducible asset.
Core Solution
Building a robust agent infrastructure requires three architectural pillars: Cross-Harness Abstraction, Runtime Governance, and State Management. The following implementation details outline how to construct this system using TypeScript-based patterns.
1. Cross-Harness Adapter Architecture
Skills and rules should be defined independently of the target tool. An adapter layer maps these definitions to the native configuration format of each harness. This decoupling allows a single skill definition to propagate to Cursor, Codex, Gemini CLI, Copilot, OpenCode, Zed, and Trae.
Implementation Strategy:
Define a universal skill schema and an adapter registry. The installer resolves the target harness and generates the appropriate configuration files.
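The schema-and-registry idea can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Skill` fields, the `HarnessAdapter` interface, and the output paths are all assumptions made for the example.

```typescript
// Universal skill definition (hypothetical schema for illustration).
interface Skill {
  id: string;
  description: string;
  instructions: string;
}

// A generated configuration file: where it goes and what it contains.
interface GeneratedFile {
  path: string;
  content: string;
}

// Each adapter renders a skill into one harness's native format.
interface HarnessAdapter {
  harness: string;
  render(skill: Skill): GeneratedFile;
}

class AdapterRegistry {
  private readonly adapters = new Map<string, HarnessAdapter>();

  register(adapter: HarnessAdapter): void {
    if (this.adapters.has(adapter.harness)) {
      throw new Error(`Adapter already registered for harness: ${adapter.harness}`);
    }
    this.adapters.set(adapter.harness, adapter);
  }

  // Resolve the target harness and generate one file per skill.
  generate(harness: string, skills: Skill[]): GeneratedFile[] {
    const adapter = this.adapters.get(harness);
    if (!adapter) {
      throw new Error(`No adapter registered for harness: ${harness}`);
    }
    return skills.map(skill => adapter.render(skill));
  }
}

// Example adapters; the file layouts below are assumptions, not verified formats.
const cursorAdapter: HarnessAdapter = {
  harness: 'cursor',
  render: skill => ({
    path: `.cursor/rules/${skill.id}.md`,
    content: `# ${skill.description}\n\n${skill.instructions}\n`,
  }),
};

const codexAdapter: HarnessAdapter = {
  harness: 'codex',
  render: skill => ({
    path: `codex/skills/${skill.id}.md`,
    content: `## ${skill.id}\n${skill.instructions}\n`,
  }),
};

const registry = new AdapterRegistry();
registry.register(cursorAdapter);
registry.register(codexAdapter);

const reviewSkill: Skill = {
  id: 'security-review',
  description: 'Security review checklist',
  instructions: 'Check input validation and dependency versions.',
};

// The same skill definition yields one file per target harness.
const cursorFiles = registry.generate('cursor', [reviewSkill]);
const codexFiles = registry.generate('codex', [reviewSkill]);
```

Because the skill never references a harness, supporting another tool in this sketch means registering one more adapter rather than editing any skill.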
Rationale:
Write Once, Run Everywhere: Eliminates duplication of effort when supporting multiple tools.
Centralized Governance: Updates to skills propagate automatically to all connected harnesses.
Extensibility: New adapters can be added without modifying core skill logic.
2. Runtime Governance via Hook System
Safety and quality must be enforced at runtime. A hook system acts as a gatekeeper, executing checks before, during, and after agent actions. Hooks can validate scope, prevent destructive operations, and scan for vulnerabilities.
Hook Configuration:
Profiles define the strictness level. Environment variables allow dynamic adjustment without code changes.
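// SessionContext, SecurityViolation, and the helpers parseScopeFile, blockForcePush,
// scanSupplyChain, and verifyMCPHealth are assumed to be defined elsewhere.

// Hook profile configuration
type HookProfile = 'minimal' | 'standard' | 'strict';
const KNOWN_PROFILES: readonly HookProfile[] = ['minimal', 'standard', 'strict'];

// Fall back to 'standard' when AO_SECURITY_LEVEL is unset or not a known profile.
const requested = process.env.AO_SECURITY_LEVEL as HookProfile | undefined;
const HOOK_CONFIG: { level: HookProfile; disabledHooks: string[] } = {
  level: requested && KNOWN_PROFILES.includes(requested) ? requested : 'standard',
  disabledHooks: process.env.AO_DISABLED_HOOKS?.split(',').map(name => name.trim()) ?? [],
};

// Runtime hook execution
async function executeSessionHooks(sessionContext: SessionContext): Promise<void> {
  if (HOOK_CONFIG.level === 'strict') {
    await enforceScopeCheck(sessionContext, 'AGENTS.md');
    await blockForcePush(sessionContext);
    // Disabling 'supply-chain' skips only this scan; every other hook still runs.
    if (!HOOK_CONFIG.disabledHooks.includes('supply-chain')) {
      await scanSupplyChain(sessionContext.dependencies);
    }
  }
  // Always run health checks
  await verifyMCPHealth(sessionContext.connectedTools);
}

// Scope enforcement
async function enforceScopeCheck(context: SessionContext, scopeFile: string): Promise<void> {
  // Exact-match comparison: parseScopeFile is expected to return individual file paths.
  const allowedPaths = parseScopeFile(scopeFile);
  const requestedPaths = context.targetFiles;
  const violations = requestedPaths.filter(p => !allowedPaths.includes(p));
  if (violations.length > 0) {
    throw new SecurityViolation(`Agent attempted to access restricted paths: ${violations.join(', ')}`);
  }
}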
Rationale:
Defense in Depth: Multiple layers of checks reduce the attack surface.
Configurable Risk: Teams can adjust strictness based on environment (e.g., minimal for local prototyping, strict for production branches).
Immediate Feedback: Violations are caught before they impact the codebase.
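Scope validation is often expressed in terms of directories rather than individual files. The following is a minimal sketch of prefix-based matching under that assumption. The helper names (`normalizePath`, `isWithinScope`, `findScopeViolations`) are hypothetical, and the sketch rejects any path containing `..` rather than resolving it.

```typescript
// Normalize a path to forward slashes without a leading "./" or trailing "/".
function normalizePath(p: string): string {
  return p.replace(/\\/g, '/').replace(/^\.\//, '').replace(/\/+$/, '');
}

// True when `target` is an allowed root itself or lives underneath one.
function isWithinScope(target: string, allowedRoots: string[]): boolean {
  const t = normalizePath(target);
  // Reject traversal segments outright instead of trying to resolve them.
  if (t.split('/').indexOf('..') !== -1) return false;
  return allowedRoots.some(root => {
    const r = normalizePath(root);
    // Require a "/" boundary so "src" does not match "src-legacy".
    return t === r || t.indexOf(r + '/') === 0;
  });
}

// Collect every requested path that falls outside the allowed roots.
function findScopeViolations(targets: string[], allowedRoots: string[]): string[] {
  return targets.filter(t => !isWithinScope(t, allowedRoots));
}

const scopeViolations = findScopeViolations(
  ['src/api/users.ts', 'src-legacy/old.ts', 'infra/prod.tf', 'src/../infra/prod.tf'],
  ['src', 'docs/'],
);
```

Here only `src/api/users.ts` is in scope. The `/` boundary check is what keeps `src-legacy/old.ts` from slipping through a naive string-prefix comparison.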
3. State Management and Session Continuity
Long-running tasks require persistent state. A dedicated state manager captures session data, compresses context to avoid window overflow, and provides queryable history.
State Architecture:
Use a lightweight database for persistence and compaction prompts for context optimization.
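// Assumes the promise-based `sqlite` wrapper package; substitute your own driver's Database type if it differs.
import { Database } from 'sqlite';

// State manager interface
interface StateManager {
  captureSnapshot(sessionId: string): Promise<void>;
  restoreState(sessionId: string): Promise<SessionState>;
  compactContext(sessionId: string): Promise<string>;
}

// SQLite implementation. SessionState, gatherSessionData, generateSummary, and
// COMPACTION_PROMPT are assumed to be defined elsewhere.
class SQLiteStateManager implements StateManager {
  constructor(private readonly db: Database) {}

  async captureSnapshot(sessionId: string): Promise<void> {
    const state = await gatherSessionData(sessionId);
    await this.db.run('INSERT INTO sessions (id, state, timestamp) VALUES (?, ?, ?)',
      sessionId, JSON.stringify(state), Date.now());
  }

  // Load the most recent snapshot for the session.
  async restoreState(sessionId: string): Promise<SessionState> {
    const row = await this.db.get(
      'SELECT state FROM sessions WHERE id = ? ORDER BY timestamp DESC LIMIT 1', sessionId);
    if (!row) throw new Error(`No snapshot found for session ${sessionId}`);
    return JSON.parse(row.state) as SessionState;
  }

  async compactContext(sessionId: string): Promise<string> {
    const row = await this.db.get('SELECT context FROM sessions WHERE id = ?', sessionId);
    if (!row) throw new Error(`No context found for session ${sessionId}`);
    // Use the compaction prompt to summarize the stored context
    const summary = await generateSummary(row.context, COMPACTION_PROMPT);
    await this.db.run('UPDATE sessions SET context = ? WHERE id = ?', summary, sessionId);
    return summary;
  }
}

// Status snapshot generation (`stateManager` is assumed to be a shared StateManager instance)
async function generateStatusReport(sessionId: string): Promise<string> {
  const state = await stateManager.restoreState(sessionId);
  return `# Session Status\n- **ID:** ${sessionId}\n- **State:** ${state.status}\n- **Last Action:** ${state.lastAction}`;
}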
Rationale:
Continuity: Agents can resume tasks without losing context or re-explaining constraints.
Efficiency: Compaction reduces token usage and keeps the context window focused on relevant information.
Auditability: Queryable history supports debugging and compliance requirements.
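Compaction only helps if it runs before the window overflows. One plausible way to decide when to trigger it is a simple usage threshold, sketched below. The `ContextBudget` shape and the `shouldCompact` name are assumptions for illustration; the 0.8 default mirrors the 80% capacity example given in the Pitfall Guide.

```typescript
// Token accounting for one session (hypothetical shape).
interface ContextBudget {
  usedTokens: number;
  windowTokens: number;
}

// Returns true once usage reaches the given fraction of the context window.
function shouldCompact(budget: ContextBudget, threshold: number = 0.8): boolean {
  if (budget.windowTokens <= 0) {
    throw new RangeError('windowTokens must be positive');
  }
  if (threshold <= 0 || threshold > 1) {
    throw new RangeError('threshold must be in the range (0, 1]');
  }
  return budget.usedTokens / budget.windowTokens >= threshold;
}

// Usage: check before each agent turn and compact when the check passes.
const nearlyFull = shouldCompact({ usedTokens: 170_000, windowTokens: 200_000 });
const plentyLeft = shouldCompact({ usedTokens: 50_000, windowTokens: 200_000 });
```

A caller would typically invoke something like `compactContext` when the check returns true, then recompute `usedTokens` from the summarized context.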
4. Skill Taxonomy and Specialization
A comprehensive skill library enables agents to handle diverse engineering domains. Skills should be categorized by function and language ecosystem.
Skill Categories:
Backend: Django patterns, Spring Boot, NestJS, REST/GraphQL/gRPC design.
Frontend: Next.js Turbopack, Bun runtime, presentation builders.
Content & Media: Article engines, market research, video generation (Manim/Remotion).
Language Support:
The infrastructure supports TypeScript, Python, Go, Java, Kotlin/Android/KMP, C++, Rust, PHP, Perl, and Shell. Each language has dedicated rules and agents to ensure idiomatic output.
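Categorizing skills by function and language amounts to a small lookup structure. The sketch below shows one possible shape. The `SkillEntry` fields and the skill identifiers are made up for the example and do not reflect the project's real catalog.

```typescript
type SkillCategory = 'backend' | 'frontend' | 'content-media';

interface SkillEntry {
  id: string;            // hypothetical identifier
  category: SkillCategory;
  languages: string[];   // an empty list means language-agnostic
}

const catalog: SkillEntry[] = [
  { id: 'django-patterns', category: 'backend', languages: ['python'] },
  { id: 'nestjs', category: 'backend', languages: ['typescript'] },
  { id: 'api-design', category: 'backend', languages: [] },
  { id: 'nextjs-turbopack', category: 'frontend', languages: ['typescript'] },
];

// Select skills in a category that apply to a language or to every language.
function findSkills(entries: SkillEntry[], category: SkillCategory, language: string): SkillEntry[] {
  return entries.filter(entry =>
    entry.category === category &&
    (entry.languages.length === 0 || entry.languages.indexOf(language) !== -1));
}

const tsBackendSkills = findSkills(catalog, 'backend', 'typescript').map(entry => entry.id);
```

In this example the TypeScript backend query returns the NestJS entry plus the language-agnostic API design entry, and skips the Django one.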
5. Control Plane and Observability
A control plane orchestrates sessions, manages state, and provides a dashboard for monitoring. The ecc2 Rust-based control plane offers a high-performance alpha implementation with commands for session lifecycle management.
Control Plane Commands:
ao dashboard: Launch GUI overview.
ao start: Initialize new session.
ao sessions: List active/historical sessions.
ao status: Display current state.
ao stop: Terminate session.
ao resume: Continue interrupted session.
ao daemon: Run background service.
The dashboard provides visual feedback on session health, hook status, and agent activity, supporting dark/light themes and project branding.
Pitfall Guide
Implementing agent infrastructure introduces specific risks. The following pitfalls highlight common mistakes and their remedies based on production experience.
| Pitfall | Explanation | Fix |
| --- | --- | --- |
| Hook Stacking | Installing multiple configuration layers that conflict, causing unpredictable behavior or performance degradation. | Use a single source of truth for installation. Avoid stacking plugin and manual installs. Validate configuration uniqueness during setup. |
| Vague Prompt Contracts | Agents receive ambiguous instructions, leading to scope creep or hallucinated requirements. | Implement PromptGuard patterns. Treat prompts as executable contracts with explicit verification criteria and conflict detection. |
| Ignoring Compaction Thresholds | Context windows overflow, causing agents to lose critical constraints or repeat errors. | Configure automatic compaction triggers based on token usage. Set thresholds (e.g., 80% capacity) to summarize context before overflow. |
| Security Blindness | Agents access sensitive files or bypass verification steps due to missing scope definitions. | Enforce AGENTS.md scoping and supply-chain hooks. Regularly audit hook configurations and disable risky actions by default. |
| Tool Lock-in | Writing skills specific to one harness, preventing migration or multi-tool usage. | Adopt cross-harness adapters. Define skills in a universal format and map to harness-specific configs via the adapter layer. |
| Over-Engineering Minimal Tasks | Applying full security profiles and state management to simple scripts, adding unnecessary latency. | Use profile selection (minimal, core, full). Match the infrastructure complexity to the task risk and duration. |
| Neglecting Legacy Shims | Failing to maintain backward compatibility for older commands or workflows, breaking existing integrations. | Include legacy shims (e.g., 75 command mappings) to bridge old and new interfaces. Test migrations thoroughly. |
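The "validate configuration uniqueness" remedy for hook stacking can be illustrated with a small check that flags any hook registered from more than one install source. This is only a sketch: the `HookRegistration` shape and the source labels (`plugin`, `manual`) are assumptions, not the tool's real data model.

```typescript
interface HookRegistration {
  hook: string;    // hook name, e.g. 'supply-chain'
  source: string;  // where it was installed from (hypothetical labels)
}

// Report every hook that was installed from more than one source.
function findStackedHooks(registrations: HookRegistration[]): Map<string, string[]> {
  const sourcesByHook = new Map<string, Set<string>>();
  for (const { hook, source } of registrations) {
    const sources = sourcesByHook.get(hook) ?? new Set<string>();
    sources.add(source);
    sourcesByHook.set(hook, sources);
  }

  const stacked = new Map<string, string[]>();
  sourcesByHook.forEach((sources, hook) => {
    if (sources.size > 1) {
      stacked.set(hook, Array.from(sources).sort());
    }
  });
  return stacked;
}

const stackedHooks = findStackedHooks([
  { hook: 'scope-check', source: 'plugin' },
  { hook: 'scope-check', source: 'manual' },
  { hook: 'supply-chain', source: 'plugin' },
]);
```

An installer could run a check like this at setup time and refuse to continue, or warn, when the returned map is non-empty.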
Production Bundle
Action Checklist
Define Agent Roles: Map out distinct agent responsibilities (e.g., reviewer, builder, resolver) and assign scoped instruction sets.
Select Security Profile: Choose minimal, standard, or strict based on team risk tolerance and environment requirements.
Configure Adapters: Set up cross-harness adapters to ensure skills propagate to all target tools (Cursor, Codex, Gemini, etc.).
Implement State Strategy: Configure SQLite state store and compaction thresholds to maintain session continuity.
Audit Prompt Contracts: Review all skill instructions for clarity, conflicts, and verification criteria using PromptGuard patterns.
Verify Hook Coverage: Ensure runtime hooks cover scope enforcement, supply-chain scanning, and MCP health checks.
Test Cross-Harness Parity: Validate that skills produce consistent results across different tools using the adapter layer.
Monitor Dashboard: Use the control plane dashboard to track session health, hook status, and agent activity in real-time.
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
| --- | --- | --- | --- |
| Solo Developer, Quick Script | minimal Profile, No Hooks | Low overhead, fast setup, sufficient for low-risk tasks. | |
Install CLI: Run the installer script or npm package with your desired profile and target harness.
./install.sh --profile core --target cursor
Verify Configuration: Check that adapters have generated the correct files in your project directory.
ls -la .cursor/rules
Initialize Session: Start a new agent session using the control plane.
ao start --project my-app
Deploy Skills: Install required skills using the consult command to discover available modules.
ao consult "security reviews" --target cursor
Monitor Activity: Launch the dashboard to observe session health and hook status.
ao dashboard
This infrastructure transforms AI coding agents from unpredictable utilities into reliable, governed engineering systems. By adopting cross-harness adapters, runtime governance, and persistent state management, teams can scale AI assistance while maintaining code quality, security, and reproducibility.