Run a small headless task in your project.

By Codcompass Team · 4 min read

Current Situation Analysis

AI coding agents generate functional code diffs but leave no auditable reasoning trail. In regulated engineering environments (avionics DO-178C/DO-330, medical IEC 62304, automotive ISO 26262, defense CMMC, finance SOC 2), this creates a hard compliance stop. When a TUI session closes, the model version, tool invocations, policy decisions, and causal reasoning vanish.

Traditional logging approaches fail for three structural reasons:

  1. Logs are developer artifacts, not reviewer artifacts: Flat log files lack the integrity guarantees, structured schema, and audience-focused design required for external audit.
  2. Missing causal chain linkage: Auditors need to trace file change → tool call → model response. Flat logs cannot establish parent-child event ordering or prove causality.
  3. Portability and offline verification gaps: Audit evidence must travel independently of the source repository. Shell-script stitching creates fragile, host-dependent infrastructure that cannot be verified offline or across air-gapped environments.

Without a deterministic, tamper-evident evidence chain, AI-generated code becomes a compliance liability rather than a productivity multiplier.

WOW Moment: Key Findings

| Approach | Audit Completeness | Chain Integrity | Offline Portability |
|---|---|---|---|
| Standard AI Agent + Flat Logs | ~20% (answers 1/5 core questions) | None (unordered, no parent linkage) | Requires live repo/dashboard access |
| Akmon (AGEF Audit Chain) | 100% (answers all 5 core questions) | Tamper-evident, content-addressed CBOR stream | Self-contained tar.zst bundle, verifiable without source repo |

Key Findings:

  • Evidence vs. Logs: Structured evidence summaries paired with content-addressed audit chains close the compliance gap by explicitly answering what the agent read, which tools were called, what changed, which policy allowed it, and whether the session is replayable.
  • Deterministic Policy Merge: A fixed 4-layer precedence model eliminates ad-hoc policy ambiguity, replacing subjective "I think the policy was" with inspectable, version-controlled TOML/JSON merges.
  • Runtime Reproducibility: Shipping as a single compiled Rust binary eliminates plugin/runtime drift, ensuring identical behavior across developer laptops, CI runners, and hardened SSH hosts.

Core Solution

Akmon is a single Rust binary that functions as an AI coding agent (interactive TUI + headless --task mode) engineered for environments where reasoning, tool calls, and file changes must be independently reviewable. Every session writes a tamper-evident, content-addressed audit chain to .akmon/audit/<session-id>.jsonl and a structured evidence summary to .akmon/evidence/<session-id>.json. Sessions support deterministic replay, diff comparison, pre-review redaction, and export as portable AGEF (Agent Governance Evidence Format) bundles.
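
To make "tamper-evident" concrete, here is a toy hash chain in shell. It is a minimal sketch and not Akmon's on-disk format: the line layout (`<hash> <parent-hash> <payload>`), the use of SHA-256, and the names `chain_append` and `chain_verify` are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Toy hash chain: illustrative only, NOT Akmon's actual audit format.
# Each line is "<hash> <parent-hash> <payload>", where
# hash = SHA-256 of the string "<parent-hash> <payload>".

# Hash stdin with whichever SHA-256 tool is installed.
sha256_hex() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | awk '{print $1}'
  else
    shasum -a 256 | awk '{print $1}'
  fi
}

GENESIS=0000  # hypothetical parent value for the first record

# chain_append FILE PAYLOAD: append one record linked to the previous one.
chain_append() {
  local file=$1 payload=$2 parent hash
  parent=$GENESIS
  if [ -s "$file" ]; then
    parent=$(tail -n 1 "$file" | cut -d' ' -f1)
  fi
  hash=$(printf '%s %s' "$parent" "$payload" | sha256_hex)
  printf '%s %s %s\n' "$hash" "$parent" "$payload" >> "$file"
}

# chain_verify FILE: succeed only if every hash and parent link recomputes.
chain_verify() {
  local file=$1 expected_parent=$GENESIS hash parent payload actual
  while read -r hash parent payload; do
    [ "$parent" = "$expected_parent" ] || return 1
    actual=$(printf '%s %s' "$parent" "$payload" | sha256_hex)
    [ "$actual" = "$hash" ] || return 1
    expected_parent=$hash
  done < "$file"
}
```

Because each hash covers the parent hash, editing any earlier payload invalidates that record and breaks the link to every record after it, which is the property a flat log lacks.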

Architecture Decisions:

  • Single Binary Runtime: Explicit runtime state prevents host mismatch. Troubleshooting isolates to policy, providers, repository state, or model behavior.

  • Deterministic Policy Merge: Four fixed precedence layers ensure predictable rule resolution:
    1. Built-in profile (dev, staging, prod)
    2. Policy packs (.akmon/policy-packs/*.toml|json)
    3. Project-local policy (.akmon/policy.toml|json)
    4. CLI override (--policy-override)
    List fields append and deduplicate, preserving last-occurrence order. Higher-precedence rules override lower ones deterministically.
  • Open Evidence Format (AGEF): The tar.zst bundle contains manifest.json, an events.bin stream of length-delimited canonical CBOR records, and content-addressed objects/<hex> files. Spec v0.1.1 is governed independently to outlive any single implementation.
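
One way to read the list-field rule ("append and deduplicate, preserving last-occurrence order") is: concatenate the lists from lowest to highest precedence, then keep only the final occurrence of each item. The sketch below illustrates that interpretation with plain text files, one item per line. The function name `merge_list_fields` and the file layout are hypothetical, and Akmon's real merge may differ in detail.

```shell
#!/usr/bin/env bash
# Sketch of "append, then deduplicate keeping the last occurrence".
# ASSUMPTION: files are passed from lowest to highest precedence,
# with one list item per line. Not Akmon's implementation.

# merge_list_fields FILE...: print the merged list, one item per line.
merge_list_fields() {
  cat -- "$@" | awk '
    { item[NR] = $0; last[$0] = NR }   # remember where each item last appears
    END {
      for (i = 1; i <= NR; i++)
        if (last[item[i]] == i) print item[i]
    }
  '
}
```

For example, merging a pack list of `read_file`, `run_tests`, `write_file` with a project list of `run_tests`, `git_commit` yields `read_file`, `write_file`, `run_tests`, `git_commit`: the repeated `run_tests` moves to the position of its last occurrence.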

Implementation Pipeline:

# Run a small headless task in your project.
cd your-project
akmon --yes --output json --task "summarize failing tests and propose minimal fixes" | tee run.json

# Verify the tamper-evident audit chain for the session.
akmon audit verify .akmon/audit/<session-id>.jsonl

# Verify the evidence schema and the linkage to the audit chain.
akmon evidence verify .akmon/evidence/<session-id>.json

# Enforce a single-run SLO check.
akmon slo verify .akmon/evidence/<session-id>.json --strict

# Detect regressions vs a historical baseline.
akmon slo trend .akmon/evidence/<session-id>.json \
  --baseline-dir .akmon/evidence/history \
  --window 20 \
  --strict

# Show the effective merged policy for the prod profile and two policy packs.
akmon policy show-effective --profile prod \
  --policy-pack .akmon/policy-packs/org.toml \
  --policy-pack .akmon/policy-packs/team.toml
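
The verification steps above can be wrapped in a small gate function for CI. This is a sketch under two assumptions: that each `akmon` subcommand exits non-zero on failure, and that the session ID is supplied by the caller (how to extract it from run.json is not covered here). `AKMON` and `run_gate` are illustrative names, not part of the tool.

```shell
#!/usr/bin/env bash
# Sketch of a CI gate built only from the commands shown above.
# ASSUMPTION: each akmon subcommand signals failure via a non-zero exit code.

AKMON=${AKMON:-akmon}   # override to point at a specific binary

# run_gate SESSION_ID: run each verification step, stop at the first failure.
run_gate() {
  local id=$1
  "$AKMON" audit verify ".akmon/audit/$id.jsonl" || return 1
  "$AKMON" evidence verify ".akmon/evidence/$id.json" || return 1
  "$AKMON" slo verify ".akmon/evidence/$id.json" --strict || return 1
  "$AKMON" slo trend ".akmon/evidence/$id.json" \
    --baseline-dir .akmon/evidence/history --window 20 --strict || return 1
}
```

A CI job could then call `run_gate "$SESSION_ID"` and fail the build on a non-zero return. Stopping at the first failure keeps the failing step obvious in the job log.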

Pitfall Guide

  1. Treating Logs as Evidence: Flat logs lack integrity guarantees and reviewer-focused structure. Evidence requires tamper-evident chains, explicit parent-child linkage, and offline verifiability. Always validate against the 5 core auditor questions.
  2. Ignoring Causal Chain Linkage: Unordered event streams cannot prove that a specific file change resulted from a specific tool call and model response. Without parent linkage, audit trails are legally and technically inadmissible in regulated workflows.
  3. Assuming Host Runtime Equivalence: Plugin-based or interpreted runtimes drift across environments. A single compiled binary ensures behavioral reproducibility. Never rely on host-specific dependencies for compliance-critical tooling.
  4. Relying on Implicit Policy Resolution: Ad-hoc or undocumented policy overrides create audit gaps. Policy must be deterministically merged, inspectable, and version-controlled. Use akmon policy show-effective to capture the exact merged configuration before execution.
  5. Expecting Model Behavior Guarantees: Governance tools do not prevent model hallucinations or misbehavior. They bound, observe, and prove consequences. Design workflows assuming unpredictable model actions and rely on the audit chain for post-hoc validation and regression detection.
  6. Overlooking Offline Portability: Audit bundles must travel independently of the source repository. Relying on live dashboards, network-dependent verifiers, or repo-linked logs fails during external reviews or air-gapped audits. Always export and validate AGEF bundles in isolated environments.
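
As an illustration of what offline validation of content-addressed files can look like, the sketch below re-hashes every objects/<hex> file in an already-extracted bundle directory and compares the digest to the file name. It assumes the name is the SHA-256 of the raw file bytes, which the text above does not state; the AGEF spec is the authority on the real hashing rule. `verify_objects` is a hypothetical helper, not an Akmon command.

```shell
#!/usr/bin/env bash
# Sketch: check that each objects/<hex> file is named after its own digest.
# ASSUMPTION: the digest is SHA-256 over the raw file content.

# Hash stdin with whichever SHA-256 tool is installed.
sha256_hex() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | awk '{print $1}'
  else
    shasum -a 256 | awk '{print $1}'
  fi
}

# verify_objects DIR: report mismatches on stderr, return non-zero if any.
verify_objects() {
  local dir=$1 status=0 path name actual
  for path in "$dir"/objects/*; do
    [ -f "$path" ] || continue          # skip an unmatched glob or non-files
    name=$(basename "$path")
    actual=$(sha256_hex < "$path")
    if [ "$actual" != "$name" ]; then
      echo "mismatch: $name" >&2
      status=1
    fi
  done
  return "$status"
}
```

The bundle would first need to be unpacked on the isolated host, for example with `zstd -dc <bundle>.tar.zst | tar -x -C <dir>`, where both names are placeholders. The check needs only standard tools, so it runs without network access or the source repository.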

Deliverables

📘 Akmon Trust Pipeline Blueprint: A step-by-step implementation guide for integrating Akmon into regulated CI/CD workflows. Covers binary deployment, headless task execution, audit/evidence verification, SLO enforcement, regression trending, and AGEF bundle export. Includes architecture diagrams for policy merge precedence and content-addressed journal storage.

✅ Regulated AI Agent Adoption Checklist

  • Install single Rust binary and verify runtime reproducibility across target environments
  • Execute headless task and capture JSON output + session ID
  • Run akmon audit verify to confirm tamper-evident chain integrity
  • Run akmon evidence verify to validate schema and chain linkage
  • Execute akmon slo verify --strict for single-run compliance gating
  • Configure --baseline-dir and run akmon slo trend for regression detection
  • Export AGEF bundle and validate offline using independent verifier
  • Document effective policy merge using akmon policy show-effective
  • Integrate 5-command pipeline exit codes into CI gate configuration

βš™οΈ Configuration Templates

  • .akmon/policy.toml (project-local override template)
  • .akmon/policy-packs/org.toml & team.toml (layered precedence examples)
  • manifest.json & events.bin structure reference for AGEF v0.1.1 offline verification