ecc: Building the Operating System for AI Coding Agents — 230+ Skills, 60 Agents, Cross-Harness

By Codcompass Team·2026-05-19·9 min read

Orchestrating AI Coding Agents: A Cross-Platform Infrastructure for Reproducible Engineering

Current Situation Analysis

The proliferation of AI coding agents has shifted the developer landscape from manual implementation to orchestration. Tools like OpenAI Codex, Cursor, Gemini CLI, and GitHub Copilot provide raw generative power, but relying on them as isolated utilities introduces significant engineering debt. The industry is currently facing a "structure gap" where model capability outpaces workflow governance.

The Core Pain Points:

Non-Deterministic Output: Without standardized instruction sets, the same task yields divergent results across sessions or team members. This variance breaks CI/CD consistency and complicates code reviews.
Context Fragmentation: Agents operate in ephemeral sessions. Long-running engineering tasks suffer from context window exhaustion, leading to lost state, forgotten constraints, and regression errors.
Security and Scope Drift: Unrestricted agents can modify critical infrastructure, bypass verification steps, or introduce supply-chain vulnerabilities. There is rarely a runtime mechanism to enforce file-scoping or action boundaries.
Vendor Lock-in: Skills, rules, and workflows are often hardcoded to a specific harness (e.g., Cursor rules vs. Codex plugins). Migrating tools requires rewriting entire configuration layers, stifling adoption of superior interfaces.

Why This Is Overlooked: Engineering teams often treat AI agents as advanced autocomplete rather than autonomous subsystems. The focus remains on prompt engineering rather than system architecture. However, production-grade AI assistance requires the same rigor as microservices: defined interfaces, state management, security gates, and observability.

Data-Backed Evidence: Analysis of mature agent workflows reveals that structured systems managing 60 distinct agent roles, 230 reusable skill modules, and 110 language-specific rules reduce output variance by enforcing deterministic behavior. Systems implementing runtime hooks and state persistence demonstrate higher reliability in complex refactoring tasks compared to ad-hoc usage.

WOW Moment: Key Findings

Transitioning from ad-hoc agent usage to a structured infrastructure yields measurable improvements in consistency, security, and portability. The following comparison highlights the operational delta between unstructured usage and a governed agent environment.

| Dimension | Ad-Hoc Agent Usage | Structured Agent Infrastructure |
| --- | --- | --- |
| Output Consistency | Low. Results vary based on session context and prompt phrasing. | High. Deterministic rules and scoped agents ensure repeatable outcomes. |
| Security Posture | Reactive. Issues detected post-merge or via manual review. | Proactive. Runtime hooks enforce scope, block dangerous actions, and scan dependencies. |
| Portability | Zero. Skills and rules are tied to a single tool's configuration format. | High. Cross-harness adapters allow skills to function across Codex, Cursor, Gemini, and others. |
| State Persistence | None. Context resets per session; long tasks require manual context injection. | Persistent. SQLite-backed state stores and compaction prompts maintain continuity across restarts. |
| Observability | Black box. Limited visibility into agent decision paths. | Transparent. Session snapshots, status dashboards, and audit logs provide full traceability. |

Why This Matters: This shift enables AI agents to function as reliable engineering teammates rather than experimental tools. By decoupling skills from the harness and enforcing runtime governance, teams can scale AI adoption without compromising code quality or security compliance. The infrastructure approach turns agent interactions from a liability into a reproducible asset.

Core Solution

Building a robust agent infrastructure rests on three architectural pillars: Cross-Harness Abstraction, Runtime Governance, and State Management. A skill taxonomy and a control plane sit on top of them. The following implementation details outline how to construct this system using TypeScript-based patterns.

1. Cross-Harness Adapter Architecture

Skills and rules should be defined independently of the target tool. An adapter layer maps these definitions to the native configuration format of each harness. This decoupling allows a single skill definition to propagate to Cursor, Codex, Gemini CLI, Copilot, OpenCode, Zed, and Trae.

Implementation Strategy: Define a universal skill schema and an adapter registry. The installer resolves the target harness and generates the appropriate configuration files.

import { mkdir, writeFile } from 'node:fs/promises';

// Universal Skill Definition
interface AgentSkill {
  id: string;
  name: string;
  description: string;
  category: 'backend' | 'frontend' | 'security' | 'ml' | 'ops';
  instructions: string;
}

// Adapter Interface
interface HarnessAdapter {
  target: 'cursor' | 'codex' | 'gemini' | 'copilot' | 'opencode' | 'zed' | 'trae';
  generateConfig(skill: AgentSkill): string;
  installPath: string;
}

// Example Adapter Implementation
const cursorAdapter: HarnessAdapter = {
  target: 'cursor',
  installPath: '.cursor/rules',
  generateConfig: (skill) => `# ${skill.name}\n${skill.instructions}`
};

// Installation Orchestration
// getAdapter() is assumed to return the registered adapter for the given target.
async function deploySkill(skill: AgentSkill, targetHarness: HarnessAdapter['target']): Promise<void> {
  const adapter = getAdapter(targetHarness);
  const configContent = adapter.generateConfig(skill);
  // Create the install directory if it is missing, then write without blocking the event loop.
  await mkdir(adapter.installPath, { recursive: true });
  await writeFile(`${adapter.installPath}/${skill.id}.md`, configContent, 'utf8');
}

Rationale:

Write Once, Run Everywhere: Eliminates duplication of effort when supporting multiple tools.
Centralized Governance: Updates to skills propagate automatically to all connected harnesses.
Extensibility: New adapters can be added without modifying core skill logic.
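The `getAdapter` lookup used by `deploySkill` could be backed by a small registry. The following is a minimal sketch under stated assumptions: `AdapterRegistry`, `renderAll`, the `extension` field, and the sample `sql-review` skill are hypothetical names introduced for illustration, and the output formats are simplified stand-ins for each harness's real configuration format.

```typescript
// Hypothetical registry sketch; these names are illustrative, not a published API.
type HarnessTarget = 'cursor' | 'codex' | 'gemini';

interface Skill {
  id: string;
  name: string;
  instructions: string;
}

interface Adapter {
  target: HarnessTarget;
  installPath: string;
  extension: string; // assumed per-harness file extension
  render(skill: Skill): string;
}

class AdapterRegistry {
  private adapters: Partial<Record<HarnessTarget, Adapter>> = {};

  // Refuse duplicate registrations so one harness never has two competing adapters.
  register(adapter: Adapter): void {
    if (this.adapters[adapter.target]) {
      throw new Error(`Adapter already registered for ${adapter.target}`);
    }
    this.adapters[adapter.target] = adapter;
  }

  get(target: HarnessTarget): Adapter {
    const adapter = this.adapters[target];
    if (!adapter) throw new Error(`No adapter registered for ${target}`);
    return adapter;
  }

  // Render one skill for every registered harness: output path -> file content.
  renderAll(skill: Skill): Record<string, string> {
    const files: Record<string, string> = {};
    for (const key of Object.keys(this.adapters)) {
      const adapter = this.adapters[key as HarnessTarget];
      if (!adapter) continue;
      files[`${adapter.installPath}/${skill.id}.${adapter.extension}`] = adapter.render(skill);
    }
    return files;
  }
}

const registry = new AdapterRegistry();
registry.register({
  target: 'cursor',
  installPath: '.cursor/rules',
  extension: 'md',
  render: (s) => `# ${s.name}\n${s.instructions}`,
});
registry.register({
  target: 'codex',
  installPath: '.codex/skills',
  extension: 'json',
  render: (s) => JSON.stringify({ name: s.name, instructions: s.instructions }),
});

const rendered = registry.renderAll({
  id: 'sql-review',
  name: 'SQL Review',
  instructions: 'Flag unparameterized queries.',
});
```

Adding a harness in this sketch means one more `register` call; the skill definition itself does not change.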

2. Runtime Governance via Hook System

Safety and quality must be enforced at runtime. A hook system acts as a gatekeeper, executing checks before, during, and after agent actions. Hooks can validate scope, prevent destructive operations, and scan for vulnerabilities.

Hook Configuration: Profiles define the strictness level. Environment variables allow dynamic adjustment without code changes.

// Hook Profile Configuration
type HookProfile = 'minimal' | 'standard' | 'strict';

const HOOK_CONFIG = {
  level: process.env.AO_SECURITY_LEVEL as HookProfile || 'standard',
  disabledHooks: process.env.AO_DISABLED_HOOKS?.split(',') || [],
};

// Runtime Hook Execution
async function executeSessionHooks(sessionContext: SessionContext): Promise<void> {
  const isEnabled = (hook: string) => !HOOK_CONFIG.disabledHooks.includes(hook);

  if (HOOK_CONFIG.level === 'strict') {
    await enforceScopeCheck(sessionContext, 'AGENTS.md');
    await blockForcePush(sessionContext);
    // Disabling the supply-chain hook skips only that scan, not the other hooks.
    if (isEnabled('supply-chain')) {
      await scanSupplyChain(sessionContext.dependencies);
    }
  }

  // Always run health checks
  await verifyMCPHealth(sessionContext.connectedTools);
}

// Scope Enforcement
async function enforceScopeCheck(context: SessionContext, scopeFile: string): Promise<void> {
  const allowedPaths = parseScopeFile(scopeFile);
  const requestedPaths = context.targetFiles;
  
  const violations = requestedPaths.filter(p => !allowedPaths.includes(p));
  if (violations.length > 0) {
    throw new SecurityViolation(`Agent attempted to access restricted paths: ${violations.join(', ')}`);
  }
}
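The listing above compares requested paths to the allowed list by exact match, so a scope entry that names a directory would not cover the files inside it. One possible variant, sketched here under the assumption that scope entries are directory or file paths (the source does not specify the `AGENTS.md` scope format), uses prefix matching and rejects `..` traversal. The helper names `normalize`, `isInScope`, and `findViolations` are hypothetical.

```typescript
// Normalize to forward slashes, drop a leading "./" and any trailing "/".
function normalize(p: string): string {
  return p.replace(/\\/g, '/').replace(/^\.\//, '').replace(/\/+$/, '');
}

// A path is in scope if it equals an allowed entry or sits beneath an allowed directory.
function isInScope(path: string, allowed: string[]): boolean {
  const target = normalize(path);
  // Reject any path containing a ".." segment rather than trying to resolve it.
  if (target.split('/').indexOf('..') !== -1) return false;
  return allowed.some((entry) => {
    const base = normalize(entry);
    // Compare against "base/" so "src/api" does not also match "src/apix".
    return target === base || target.indexOf(base + '/') === 0;
  });
}

function findViolations(requested: string[], allowed: string[]): string[] {
  return requested.filter((p) => !isInScope(p, allowed));
}

const violations = findViolations(
  ['src/api/users.ts', './src/api/../../.env', 'infra/prod.tf', 'src/apix/leak.ts'],
  ['src/api', 'docs/']
);
// violations: ['./src/api/../../.env', 'infra/prod.tf', 'src/apix/leak.ts']
```

Rejecting traversal outright is a deliberately conservative choice for this sketch; a real implementation might resolve paths against the repository root instead.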

Rationale:

Defense in Depth: Multiple layers of checks reduce the attack surface.
Configurable Risk: Teams can adjust strictness based on environment (e.g., minimal for local prototyping, strict for production branches).
Immediate Feedback: Violations are caught before they impact the codebase.
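Because the profile comes from environment variables, a typo such as `AO_SECURITY_LEVEL=strcit` would pass through a plain type cast unchecked. A minimal sketch of a validating loader follows; `loadHookConfig` is a hypothetical helper, and failing on unknown values (rather than silently falling back to `standard`) is an assumption about the desired behavior.

```typescript
type HookProfile = 'minimal' | 'standard' | 'strict';
const PROFILES: HookProfile[] = ['minimal', 'standard', 'strict'];

interface HookConfig {
  level: HookProfile;
  disabledHooks: string[];
}

// Takes the environment as a parameter so it can be exercised without touching process.env.
function loadHookConfig(env: Record<string, string | undefined>): HookConfig {
  const raw = (env['AO_SECURITY_LEVEL'] || 'standard').trim().toLowerCase();
  const level = PROFILES.filter((p) => p === raw)[0];
  if (!level) {
    throw new Error(`Unknown AO_SECURITY_LEVEL "${raw}"; expected one of ${PROFILES.join(', ')}`);
  }
  // Split the comma-separated list, trimming whitespace and dropping empty entries.
  const disabledHooks = (env['AO_DISABLED_HOOKS'] || '')
    .split(',')
    .map((h) => h.trim())
    .filter((h) => h.length > 0);
  return { level, disabledHooks };
}

const defaults = loadHookConfig({});
const strictConfig = loadHookConfig({
  AO_SECURITY_LEVEL: ' STRICT ',
  AO_DISABLED_HOOKS: 'supply-chain, ,mcp-health',
});
```

In an application, the call would typically be `loadHookConfig(process.env)`.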

3. State Management and Session Continuity

Long-running tasks require persistent state. A dedicated state manager captures session data, compresses context to avoid window overflow, and provides queryable history.

State Architecture: Use a lightweight database for persistence and compaction prompts for context optimization.

// State Manager Interface
interface StateManager {
  captureSnapshot(sessionId: string): Promise<void>;
  restoreState(sessionId: string): Promise<SessionState>;
  compactContext(sessionId: string): Promise<string>;
}

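// SQLite Implementation
// Assumes a promise-based SQLite wrapper exposing run() and get(), and a
// sessions table with id, state, context, and timestamp columns.
class SQLiteStateManager implements StateManager {
  constructor(private db: Database) {}

  async captureSnapshot(sessionId: string): Promise<void> {
    const state = await this.gatherSessionData(sessionId);
    await this.db.run('INSERT INTO sessions (id, state, timestamp) VALUES (?, ?, ?)', 
      sessionId, JSON.stringify(state), Date.now());
  }

  // Required by StateManager: load the most recent snapshot for the session.
  async restoreState(sessionId: string): Promise<SessionState> {
    const row = await this.db.get(
      'SELECT state FROM sessions WHERE id = ? ORDER BY timestamp DESC LIMIT 1', sessionId);
    if (!row) throw new Error(`No snapshot found for session ${sessionId}`);
    return JSON.parse(row.state) as SessionState;
  }

  async compactContext(sessionId: string): Promise<string> {
    const row = await this.db.get('SELECT context FROM sessions WHERE id = ?', sessionId);
    if (!row) throw new Error(`No context stored for session ${sessionId}`);
    // Use compaction prompt to summarize context
    const summary = await generateSummary(row.context, COMPACTION_PROMPT);
    await this.db.run('UPDATE sessions SET context = ? WHERE id = ?', summary, sessionId);
    return summary;
  }
}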

// Status Snapshot Generation
async function generateStatusReport(sessionId: string): Promise<string> {
  const state = await stateManager.restoreState(sessionId);
  return `# Session Status\n- **ID:** ${sessionId}\n- **State:** ${state.status}\n- **Last Action:** ${state.lastAction}`;
}

Rationale:

Continuity: Agents can resume tasks without losing context or re-explaining constraints.
Efficiency: Compaction reduces token usage and keeps the context window focused on relevant information.
Auditability: Queryable history supports debugging and compliance requirements.
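The capture, restore, and history behavior can be illustrated without a database. This is a simplified, synchronous in-memory sketch, not the SQLite-backed design: `InMemorySnapshotStore`, the `SessionState` fields, and the explicit `timestamp` argument are assumptions made so the example is deterministic and self-contained.

```typescript
interface SessionState {
  status: 'running' | 'paused' | 'done';
  lastAction: string;
}

interface Snapshot {
  sessionId: string;
  state: SessionState;
  timestamp: number;
}

class InMemorySnapshotStore {
  private snapshots: Snapshot[] = [];

  capture(sessionId: string, state: SessionState, timestamp: number): void {
    // Store a copy so later mutation of `state` cannot rewrite history.
    this.snapshots.push({ sessionId, state: { ...state }, timestamp });
  }

  // Queryable history for one session, oldest first.
  history(sessionId: string): Snapshot[] {
    return this.snapshots
      .filter((s) => s.sessionId === sessionId)
      .sort((a, b) => a.timestamp - b.timestamp);
  }

  // Resume from the most recent snapshot, or undefined if none exists.
  restore(sessionId: string): SessionState | undefined {
    const entries = this.history(sessionId);
    const latest = entries[entries.length - 1];
    return latest ? { ...latest.state } : undefined;
  }
}

const store = new InMemorySnapshotStore();
store.capture('s1', { status: 'running', lastAction: 'edit src/a.ts' }, 1000);
store.capture('s1', { status: 'paused', lastAction: 'run tests' }, 2000);
const resumed = store.restore('s1');
```

Keeping snapshots append-only is what makes the history auditable; restoring simply reads the latest entry.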

4. Skill Taxonomy and Specialization

A comprehensive skill library enables agents to handle diverse engineering domains. Skills should be categorized by function and language ecosystem.

Skill Categories:

Backend: Django patterns, Spring Boot, NestJS, REST/GraphQL/gRPC design.
Frontend: Next.js Turbopack, Bun runtime, presentation builders.
Security: AgentShield integration, vulnerability scanning pipelines.
ML/Deep Learning: PyTorch patterns, MLOps workflows.
Operations: Brand voice management, billing ops, workspace integrations.
Content & Media: Article engines, market research, video generation (Manim/Remotion).

Language Support: The infrastructure supports TypeScript, Python, Go, Java, Kotlin/Android/KMP, C++, Rust, PHP, Perl, and Shell. Each language has dedicated rules and agents to ensure idiomatic output.
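Categorizing skills by function and language makes selection a simple filter. The sketch below assumes a flat catalog; `SkillEntry`, `selectSkills`, and the sample skill IDs are hypothetical, and treating an empty `languages` list as "language-agnostic" is a convention chosen for this example.

```typescript
type Category = 'backend' | 'frontend' | 'security' | 'ml' | 'ops' | 'content';

interface SkillEntry {
  id: string;
  category: Category;
  languages: string[]; // empty means the skill applies to any language
}

// Return the IDs of skills in a category that apply to the given language.
function selectSkills(catalog: SkillEntry[], category: Category, language: string): string[] {
  return catalog
    .filter((s) => s.category === category)
    .filter((s) => s.languages.length === 0 || s.languages.indexOf(language) !== -1)
    .map((s) => s.id);
}

const catalog: SkillEntry[] = [
  { id: 'django-patterns', category: 'backend', languages: ['python'] },
  { id: 'nestjs-patterns', category: 'backend', languages: ['typescript'] },
  { id: 'rest-api-design', category: 'backend', languages: [] },
  { id: 'pytorch-patterns', category: 'ml', languages: ['python'] },
];

const pythonBackend = selectSkills(catalog, 'backend', 'python');
// pythonBackend: ['django-patterns', 'rest-api-design']
```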

5. Control Plane and Observability

A control plane orchestrates sessions, manages state, and provides a dashboard for monitoring. The ecc2 Rust-based control plane offers a high-performance alpha implementation with commands for session lifecycle management.

Control Plane Commands:

ao dashboard: Launch GUI overview.
ao start: Initialize new session.
ao sessions: List active/historical sessions.
ao status: Display current state.
ao stop: Terminate session.
ao resume: Continue interrupted session.
ao daemon: Run background service.

The dashboard provides visual feedback on session health, hook status, and agent activity, supporting dark/light themes and project branding.

Pitfall Guide

Implementing agent infrastructure introduces specific risks. The following pitfalls highlight common mistakes and their remedies based on production experience.

| Pitfall | Explanation | Fix |
| --- | --- | --- |
| Hook Stacking | Installing multiple configuration layers that conflict, causing unpredictable behavior or performance degradation. | Use a single source of truth for installation. Avoid stacking plugin and manual installs. Validate configuration uniqueness during setup. |
| Vague Prompt Contracts | Agents receive ambiguous instructions, leading to scope creep or hallucinated requirements. | Implement PromptGuard patterns. Treat prompts as executable contracts with explicit verification criteria and conflict detection. |
| Ignoring Compaction Thresholds | Context windows overflow, causing agents to lose critical constraints or repeat errors. | Configure automatic compaction triggers based on token usage. Set thresholds (e.g., 80% capacity) to summarize context before overflow. |
| Security Blindness | Agents access sensitive files or bypass verification steps due to missing scope definitions. | Enforce `AGENTS.md` scoping and supply-chain hooks. Regularly audit hook configurations and disable risky actions by default. |
| Tool Lock-in | Writing skills specific to one harness, preventing migration or multi-tool usage. | Adopt cross-harness adapters. Define skills in a universal format and map to harness-specific configs via the adapter layer. |
| Over-Engineering Minimal Tasks | Applying full security profiles and state management to simple scripts, adding unnecessary latency. | Use profile selection (`minimal`, `core`, `full`). Match the infrastructure complexity to the task risk and duration. |
| Neglecting Legacy Shims | Failing to maintain backward compatibility for older commands or workflows, breaking existing integrations. | Include legacy shims (e.g., 75 command mappings) to bridge old and new interfaces. Test migrations thoroughly. |
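The compaction-threshold fix can be expressed as a small check that runs before each agent turn. This is a sketch under assumptions: `CompactionPolicy` and `shouldCompact` are hypothetical names, and the 10,000-token window is an illustrative figure, not a value from any particular model.

```typescript
interface CompactionPolicy {
  windowTokens: number;   // total context window size, in tokens
  triggerPercent: number; // compact once usage reaches this percentage (1-100)
}

function shouldCompact(usedTokens: number, policy: CompactionPolicy): boolean {
  if (policy.windowTokens <= 0) {
    throw new Error('windowTokens must be positive');
  }
  if (policy.triggerPercent <= 0 || policy.triggerPercent > 100) {
    throw new Error('triggerPercent must be between 1 and 100');
  }
  // Integer comparison avoids floating-point rounding at the boundary.
  return usedTokens * 100 >= policy.windowTokens * policy.triggerPercent;
}

const policy: CompactionPolicy = { windowTokens: 10000, triggerPercent: 80 };
const belowThreshold = shouldCompact(7999, policy); // false
const atThreshold = shouldCompact(8000, policy);    // true
```

Triggering at a percentage rather than at the hard limit leaves headroom for the summarization call itself.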

Production Bundle

Action Checklist

Define Agent Roles: Map out distinct agent responsibilities (e.g., reviewer, builder, resolver) and assign scoped instruction sets.
Select Security Profile: Choose minimal, standard, or strict based on team risk tolerance and environment requirements.
Configure Adapters: Set up cross-harness adapters to ensure skills propagate to all target tools (Cursor, Codex, Gemini, etc.).
Implement State Strategy: Configure SQLite state store and compaction thresholds to maintain session continuity.
Audit Prompt Contracts: Review all skill instructions for clarity, conflicts, and verification criteria using PromptGuard patterns.
Verify Hook Coverage: Ensure runtime hooks cover scope enforcement, supply-chain scanning, and MCP health checks.
Test Cross-Harness Parity: Validate that skills produce consistent results across different tools using the adapter layer.
Monitor Dashboard: Use the control plane dashboard to track session health, hook status, and agent activity in real-time.

Decision Matrix

| Scenario | Recommended Approach | Why | Cost Impact |
| --- | --- | --- | --- |
| Solo Developer, Quick Script | `minimal` Profile, No Hooks | Low overhead, fast setup, sufficient for low-risk tasks. | Free |
| Team Project, Security Critical | `full` Profile, `strict` Hooks | Enforces compliance, prevents drift, ensures auditability. | Higher latency, setup time |
| Multi-Tool Environment | Cross-Harness Adapters | Maintains consistency across Cursor, Codex, Gemini, etc. | Initial configuration effort |
| Long-Running Refactoring | State Manager + Compaction | Preserves context, enables resume, reduces token waste. | Storage overhead |
| Legacy Codebase Migration | Legacy Shims + Selective Install | Bridges old commands, minimizes disruption during transition. | Maintenance burden |

Configuration Template

# agent-orchestrator.config.yaml
version: "2.0"

security:
  level: "standard"  # minimal | standard | strict
  disabled_hooks: []
  scope_file: "AGENTS.md"

state:
  backend: "sqlite"
  compaction_threshold: 8000  # tokens
  snapshot_interval: 300      # seconds

adapters:
  - target: "cursor"
    path: ".cursor/rules"
    format: "markdown"
  - target: "codex"
    path: ".codex/skills"
    format: "json"
  - target: "gemini"
    path: ".gemini/config"
    format: "yaml"

skills:
  categories:
    - backend
    - frontend
    - security
    - ml
    - ops
  languages:
    - typescript
    - python
    - go
    - java
    - rust

control_plane:
  rust_alpha: true
  dashboard:
    theme: "dark"
    font: "monospace"
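A configuration like this is easier to trust if it is validated at load time. The sketch below checks an already-parsed config object (YAML parsing is out of scope here); `OrchestratorConfig` and `validateConfig` are hypothetical names, and the specific rules (known security level, positive thresholds, unique adapter targets) are assumptions about what a loader might enforce.

```typescript
interface AdapterEntry {
  target: string;
  path: string;
  format: string;
}

interface OrchestratorConfig {
  security: { level: string; disabled_hooks: string[]; scope_file: string };
  state: { backend: string; compaction_threshold: number; snapshot_interval: number };
  adapters: AdapterEntry[];
}

// Collect every problem instead of stopping at the first one.
function validateConfig(cfg: OrchestratorConfig): string[] {
  const errors: string[] = [];
  if (['minimal', 'standard', 'strict'].indexOf(cfg.security.level) === -1) {
    errors.push(`security.level "${cfg.security.level}" is not minimal, standard, or strict`);
  }
  if (!(cfg.state.compaction_threshold > 0)) {
    errors.push('state.compaction_threshold must be a positive token count');
  }
  if (!(cfg.state.snapshot_interval > 0)) {
    errors.push('state.snapshot_interval must be a positive number of seconds');
  }
  // Two adapters for one harness would stack conflicting configuration layers.
  const seen: Record<string, boolean> = {};
  for (const adapter of cfg.adapters) {
    if (seen[adapter.target]) errors.push(`duplicate adapter target: ${adapter.target}`);
    seen[adapter.target] = true;
    if (adapter.path.trim() === '') errors.push(`adapter ${adapter.target} has an empty path`);
  }
  return errors;
}

const sampleConfig: OrchestratorConfig = {
  security: { level: 'standard', disabled_hooks: [], scope_file: 'AGENTS.md' },
  state: { backend: 'sqlite', compaction_threshold: 8000, snapshot_interval: 300 },
  adapters: [
    { target: 'cursor', path: '.cursor/rules', format: 'markdown' },
    { target: 'codex', path: '.codex/skills', format: 'json' },
  ],
};

const configErrors = validateConfig(sampleConfig); // empty for the template values
```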

Quick Start Guide

Install CLI: Run the installer script or npm package with your desired profile and target harness.
```
./install.sh --profile core --target cursor
```
Verify Configuration: Check that adapters have generated the correct files in your project directory.
```
ls -la .cursor/rules
```
Initialize Session: Start a new agent session using the control plane.
```
ao start --project my-app
```
Deploy Skills: Use the consult command to discover available modules and install the skills you need.
```
ao consult "security reviews" --target cursor
```
Monitor Activity: Launch the dashboard to observe session health and hook status.
```
ao dashboard
```

This infrastructure transforms AI coding agents from unpredictable utilities into reliable, governed engineering systems. By adopting cross-harness adapters, runtime governance, and persistent state management, teams can scale AI assistance while maintaining code quality, security, and reproducibility.
