AI/ML · 2026-05-13 · 81 min read

Every AI toolchain is inventing its own safety layer.

By Sunil Prakash

Cross-Runtime Policy Enforcement for AI Agent Toolchains

Current Situation Analysis

The rapid adoption of tool-use capabilities in AI agent frameworks has created a critical operational blind spot: safety controls are now fragmented across execution environments. Anthropic's Claude Code exposes PreToolUse and PostToolUse hooks: the former evaluates a tool request before it executes, and the latter runs once the call completes. OpenAI's Agents SDK provides inputGuardrails that attach directly to tool definitions and abort runs on policy violations. The Model Context Protocol (MCP) ecosystem relies on proxy gateways that intercept JSON-RPC traffic between clients and servers. Each approach is technically sound within its own boundary, but none share a common policy language or audit format.

This fragmentation is frequently overlooked because engineering teams prioritize model capability and prompt engineering over runtime control planes. Security reviews typically ask a single question: what can the agents actually do? Answering it requires manually reconciling disparate configuration files, framework-specific guardrail modules, gateway YAML dialects, and internal documentation. Audit trails are scattered across platform-native viewers, Slack notification bots, and local log directories. Approval workflows operate in isolation, forcing operators to switch contexts to authorize or reject pending actions.

The result is policy drift. A rule that blocks destructive database operations in one runtime may be missing or loosely defined in another. Compliance teams cannot generate a single source of truth for agent capabilities. Engineering velocity suffers because every new toolchain integration requires rebuilding safety logic from scratch. The problem is not the quality of individual hook APIs or guardrail patterns; it is the absence of a portable enforcement layer that spans them all.

WOW Moment: Key Findings

When teams consolidate safety controls into a unified policy engine, operational overhead drops dramatically while enforcement consistency improves. The following comparison illustrates the measurable impact of adopting a cross-runtime control plane versus maintaining fragmented, framework-specific guardrails.

| Dimension | Fragmented Toolchain Approach | Unified Policy Engine |
| --- | --- | --- |
| Policy Maintenance | 3+ distinct config formats, manual sync required | Single declarative schema, version-controlled |
| Audit Consolidation | Manual log aggregation across platforms (45+ mins/day) | Chronological JSONL stream, queryable in <1 min |
| Approval Workflow | Platform-specific UIs, Slack bots, PagerDuty escalations | Centralized CLI + filesystem state, deterministic routing |
| Enforcement Consistency | Rule drift across runtimes, silent bypasses possible | Identical glob/action evaluation, zero drift |
| Compliance Reporting | Manual evidence collection, audit gaps common | Immutable append-only logs, exportable to SIEM |

This finding matters because it shifts safety from a framework-specific concern to an infrastructure-level capability. Teams can define policies once, enforce them everywhere, and audit decisions chronologically without platform lock-in. The unified approach enables automated compliance reporting, reduces mean-time-to-detect policy violations, and eliminates the operational tax of maintaining parallel safety stacks.

Core Solution

A cross-runtime policy engine decouples safety logic from execution frameworks. The architecture relies on three pillars: a declarative policy schema, an adapter pattern for framework integration, and a standardized audit format.

Step 1: Define the Policy Schema

The policy file uses a simple YAML structure with glob-based matching and explicit action directives. This design choice prioritizes human readability, version control compatibility, and deterministic evaluation.

version: 1
enforcement:
  default_action: audit
  timeout_ms: 3000
rules:
  - pattern: "database.drop_*"
    action: block
    reason: "Prevents irreversible schema modifications"
  - pattern: "shell.*"
    action: block
    reason: "Restricts arbitrary command execution"
  - pattern: "payments.*"
    action: require_approval
    reason: "Financial operations need human verification"
  - pattern: "files.read_*"
    action: allow
    reason: "Read-only file access is low-risk"
audit:
  output_dir: "./.agent-safety/audit"
  rotation: daily
  retention_days: 90

Rationale: Glob patterns provide flexible matching without regex complexity. Explicit reason fields improve audit readability and compliance reporting. Default actions and timeouts prevent indefinite blocking when approval workflows stall.
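To make the evaluation semantics concrete, here is a minimal sketch of glob matching over tool names. It assumes that `*` matches any run of characters, that rules are checked in file order with the first match winning, and that unmatched tools fall back to `default_action`. None of these rules is spelled out above, and `globToRegExp` and `evaluate` are hypothetical names rather than the engine's actual API.

```typescript
type Action = 'allow' | 'audit' | 'block' | 'require_approval';

interface Rule {
  pattern: string;
  action: Action;
  reason: string;
}

interface Decision {
  action: Action;
  reason: string;
  matchedPattern: string | null;
}

// Convert a glob such as "database.drop_*" into an anchored RegExp.
// Regex metacharacters are escaped first, then each "*" becomes ".*".
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
}

// Assumption: first matching rule wins; otherwise the default action applies.
function evaluate(toolName: string, rules: Rule[], defaultAction: Action): Decision {
  for (const rule of rules) {
    if (globToRegExp(rule.pattern).test(toolName)) {
      return { action: rule.action, reason: rule.reason, matchedPattern: rule.pattern };
    }
  }
  return { action: defaultAction, reason: 'No rule matched', matchedPattern: null };
}
```

Because the dot is escaped, `database.drop_*` matches `database.drop_table` but not a differently namespaced tool such as `postgres.delete_rows`, which would fall through to the default action.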

Step 2: Implement Framework Adapters

Each runtime exposes extension points. Adapters translate framework-specific payloads into a common evaluation context, apply the policy, and route decisions accordingly.
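As a rough illustration of the "common evaluation context" idea, the sketch below normalizes two payload shapes into one structure. The `EvaluationContext` type and both function names are hypothetical. The field names (`tool_name` and `tool_input` for Claude Code hook input, `params.name` and `params.arguments` for an MCP `tools/call` request) reflect my reading of those formats and should be checked against the versions you run.

```typescript
type AdapterId = 'claude-hook' | 'openai-guardrail' | 'mcp-proxy';

// Hypothetical shared shape that every adapter hands to the policy engine.
interface EvaluationContext {
  adapter: AdapterId;
  toolName: string;
  args: unknown;
}

// Claude Code hook input: assumed to carry `tool_name` and `tool_input`.
function fromClaudeHook(payload: { tool_name?: string; tool_input?: unknown }): EvaluationContext {
  return { adapter: 'claude-hook', toolName: payload.tool_name || '', args: payload.tool_input };
}

// MCP tools/call request: tool name and arguments sit under `params`.
function fromMcpRequest(request: { params?: { name?: string; arguments?: unknown } }): EvaluationContext {
  return {
    adapter: 'mcp-proxy',
    toolName: request.params?.name || '',
    args: request.params?.arguments,
  };
}
```

An empty tool name is returned rather than thrown so that the policy engine, not the adapter, decides what to do with a malformed request.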

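Claude Code Hook Adapter

Claude Code passes tool request metadata via stdin to subprocess hooks. The adapter reads the JSON payload, evaluates it against the policy, and signals its decision through the exit code.

import { PolicyEngine } from './policy-engine';

const engine = new PolicyEngine({
  policyPath: process.env.POLICY_FILE || './policy.yaml'
});

// Buffer stdin until it closes: a single 'data' event may hold only part of the JSON payload.
let raw = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', (chunk) => { raw += chunk; });

process.stdin.on('end', () => {
  const payload = JSON.parse(raw);
  // Claude Code's hook input carries the tool name in `tool_name`;
  // `tool.name` is accepted as a fallback for other payload shapes.
  const toolName = payload.tool_name || payload.tool?.name || '';
  const decision = engine.evaluate(toolName);

  // require_approval fails closed here so that it is never silently allowed.
  if (decision.action === 'block' || decision.action === 'require_approval') {
    // For PreToolUse hooks, exit code 2 blocks the call and stderr is fed back to the model.
    // Exit code 1 is treated as a non-blocking hook error.
    process.stderr.write(JSON.stringify({
      allow: false,
      reason: decision.reason,
      audit_ref: decision.auditId
    }));
    process.exitCode = 2;
    return;
  }

  // Exit code 0: the call continues through Claude Code's normal permission flow.
  process.exitCode = 0;
});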

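OpenAI Agents SDK Guardrail Adapter

The SDK accepts guardrail functions that run before tool execution. The adapter wraps the policy engine and throws structured exceptions on violations.

// `GuardrailFunction`, `ToolCallContext`, and the package name are illustrative;
// check them against the SDK version you install.
import { GuardrailFunction, ToolCallContext } from 'openai-agents';
import { PolicyEngine } from './policy-engine';

// Structured errors thrown by the guardrail; callers can branch on `name` and read `details`.
export class PolicyViolationError extends Error {
  constructor(public details: { tool: string; rule: string; auditId: string }) {
    super(`Policy blocked ${details.tool}`);
    this.name = 'PolicyViolationError';
  }
}

export class ApprovalRequiredError extends Error {
  constructor(public details: { tool: string; auditId: string; pendingPath: string }) {
    super(`Approval required for ${details.tool}`);
    this.name = 'ApprovalRequiredError';
  }
}

export function createPolicyGuardrail(policyPath: string): GuardrailFunction {
  const engine = new PolicyEngine({ policyPath });

  return async (context: ToolCallContext) => {
    const toolName = context.toolName;
    const decision = engine.evaluate(toolName);

    if (decision.action === 'block') {
      throw new PolicyViolationError({
        tool: toolName,
        rule: decision.matchedPattern,
        auditId: decision.auditId
      });
    }

    if (decision.action === 'require_approval') {
      throw new ApprovalRequiredError({
        tool: toolName,
        auditId: decision.auditId,
        pendingPath: decision.pendingFilePath
      });
    }

    return { approved: true };
  };
}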

MCP Proxy Adapter

MCP clients communicate with servers via JSON-RPC. The proxy intercepts tools/call requests, evaluates them, and either forwards the request or returns a structured error.

import { JsonRpcServer, JsonRpcRequest } from 'mcp-proxy-core';
import { PolicyEngine } from './policy-engine';

export class PolicyMcpProxy extends JsonRpcServer {
  private engine: PolicyEngine;

  constructor(policyPath: string, upstreamUrl: string) {
    super(upstreamUrl);
    this.engine = new PolicyEngine({ policyPath });
  }

  async handleToolCall(request: JsonRpcRequest) {
    const toolName = request.params?.name || '';
    const decision = this.engine.evaluate(toolName);

    // require_approval fails closed: the call is not forwarded until a reviewer approves it.
    if (decision.action === 'block' || decision.action === 'require_approval') {
      const verb = decision.action === 'block' ? 'blocked' : 'requires approval';
      return {
        jsonrpc: '2.0',
        id: request.id,
        error: {
          code: -32000,
          message: `Policy ${verb}: ${decision.reason}`,
          data: { auditId: decision.auditId, rule: decision.matchedPattern }
        }
      };
    }

    return this.forwardToUpstream(request);
  }
}

Step 3: Standardize Audit Output

Every decision writes to an append-only JSONL file. The format includes timestamps, adapter identifiers, tool names, actions, matched rules, and audit references. This enables chronological replay, SIEM ingestion, and compliance reporting.

{"ts":"2026-05-11T10:14:02Z","adapter":"claude-hook","tool":"files.read_file","action":"allow","rule":"files.read_*","audit_id":"a8f3c2"}
{"ts":"2026-05-11T10:14:18Z","adapter":"claude-hook","tool":"shell.exec","action":"block","rule":"shell.*","audit_id":"b9d4e1"}
{"ts":"2026-05-11T10:21:47Z","adapter":"mcp-proxy","tool":"database.drop_table","action":"block","rule":"database.drop_*","audit_id":"c1e5f2"}
{"ts":"2026-05-11T10:33:11Z","adapter":"openai-guardrail","tool":"payments.refund","action":"pending_approval","rule":"payments.*","audit_id":"d2f6g3"}

Rationale: JSONL is stream-friendly, easily parsed by log aggregators, and supports incremental processing. Structured fields enable filtering by adapter, tool, action, or rule. Local-first storage keeps policy evaluation off the network path and keeps audit data on the host unless it is explicitly exported.
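The filtering described here can be sketched in a few lines. The record fields mirror the sample entries; `parseAuditLog` and `filterAudit` are hypothetical helper names, and the sketch assumes one complete JSON object per line.

```typescript
interface AuditRecord {
  ts: string;
  adapter: string;
  tool: string;
  action: string;
  rule: string;
  audit_id: string;
}

// Parse a JSONL string into records, skipping blank lines.
function parseAuditLog(jsonl: string): AuditRecord[] {
  return jsonl
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line) as AuditRecord);
}

// Keep only records whose fields equal every field given in `where`.
function filterAudit(records: AuditRecord[], where: Partial<AuditRecord>): AuditRecord[] {
  const keys = Object.keys(where) as (keyof AuditRecord)[];
  return records.filter((r) => keys.every((k) => r[k] === where[k]));
}
```

For example, `filterAudit(records, { adapter: 'claude-hook', action: 'block' })` would return every call the Claude Code hook blocked, in the order it was logged.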

Step 4: Manage Approval Workflows

Pending approvals are stored as isolated JSON files keyed by audit ID. A CLI or webhook service reads these files, presents them to human reviewers, and updates state upon approval or rejection. The next agent run polls the state file and resumes or aborts accordingly.

// Approval state management
import { mkdir, readFile, writeFile } from 'fs/promises';

const PENDING_DIR = './.agent-safety/pending';

interface ApprovalState {
  auditId: string;
  tool: string;
  status: 'pending' | 'approved' | 'rejected';
  createdAt: string;
  expiresAt: string;
  reviewer?: string;
  resolvedAt?: string; // set when a reviewer approves or rejects
}

export class ApprovalManager {
  async requestApproval(state: ApprovalState): Promise<string> {
    // writeFile does not create parent directories, so ensure the pending directory exists.
    await mkdir(PENDING_DIR, { recursive: true });
    const filePath = `${PENDING_DIR}/${state.auditId}.json`;
    await writeFile(filePath, JSON.stringify(state, null, 2));
    return filePath;
  }

  async resolveApproval(auditId: string, decision: 'approve' | 'reject', reviewer: string): Promise<void> {
    const filePath = `${PENDING_DIR}/${auditId}.json`;
    const state: ApprovalState = JSON.parse(await readFile(filePath, 'utf-8'));
    state.status = decision === 'approve' ? 'approved' : 'rejected';
    state.reviewer = reviewer;
    state.resolvedAt = new Date().toISOString();
    await writeFile(filePath, JSON.stringify(state, null, 2));
  }
}

Pitfall Guide

1. Overly Broad Glob Patterns

Explanation: Wildcards such as * or *.* match tools the author never intended. On a block rule this produces false positives; on an allow rule it produces a silent policy bypass. Fix: Scope patterns to explicit namespaces (database.*, payments.*, files.read_*). Validate patterns against a tool registry before deployment.
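One way to approach the registry check is sketched below. It assumes the registry is simply a list of known tool names, treats any pattern that begins with `*` as a catch-all, and flags patterns that match nothing. `lintPatterns` and its report shape are hypothetical.

```typescript
// Same glob semantics as the policy rules: "*" matches any run of characters.
function matchesGlob(glob: string, name: string): boolean {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$').test(name);
}

interface PatternReport {
  pattern: string;
  matched: string[];
  problem: 'catch-all' | 'no-match' | null;
}

function lintPatterns(patterns: string[], registry: string[]): PatternReport[] {
  return patterns.map((pattern) => {
    const matched = registry.filter((tool) => matchesGlob(pattern, tool));
    let problem: PatternReport['problem'] = null;
    if (pattern.charAt(0) === '*') {
      problem = 'catch-all'; // no namespace prefix, e.g. "*" or "*.*"
    } else if (matched.length === 0) {
      problem = 'no-match'; // likely a typo or a stale rule
    }
    return { pattern, matched, problem };
  });
}
```

Printing the `matched` list for each pattern during review also makes it obvious when a rule covers more tools than its author expected.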

2. Ignoring Async Approval Latency

Explanation: Blocking agent runs indefinitely while waiting for human approval breaks execution loops and causes timeout cascades. Fix: Implement expiration thresholds in the policy schema. Route expired approvals to a fallback action (block or audit) and notify operators via webhook or messaging platform.
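The expiration logic might look like the following sketch. It assumes the pending state carries an ISO 8601 `expiresAt` timestamp and that the fallback is either `block` or `audit`; `resolveOutcome` is a hypothetical helper, and sending the operator notification is left out.

```typescript
type ApprovalStatus = 'pending' | 'approved' | 'rejected';
type Outcome = 'proceed' | 'abort' | 'wait';

interface PendingApproval {
  status: ApprovalStatus;
  expiresAt: string; // ISO 8601 timestamp
}

// Decide what an agent run should do with a pending approval at time `now`.
function resolveOutcome(state: PendingApproval, now: Date, fallback: 'block' | 'audit'): Outcome {
  if (state.status === 'approved') return 'proceed';
  if (state.status === 'rejected') return 'abort';
  // Still pending: apply the fallback once the deadline has passed.
  if (now.getTime() >= Date.parse(state.expiresAt)) {
    return fallback === 'block' ? 'abort' : 'proceed';
  }
  return 'wait';
}
```

Keeping this a pure function of the state file and the clock means the same check can run in the agent, the CLI, and tests without any shared process.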

3. Mixing Policy Schema Versions

Explanation: Upgrading adapters without updating the version field causes silent evaluation failures or deprecated rule parsing. Fix: Pin schema versions in CI pipelines. Use a validation CLI that rejects policies with mismatched versions or unsupported fields.
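A validation step along these lines could be as small as the sketch below. It assumes the YAML has already been parsed into a plain object by whatever YAML library you use, that version 1 is the only supported schema, and that the top-level keys and actions shown in this article are the complete set. `validatePolicy` is a hypothetical name.

```typescript
const SUPPORTED_VERSIONS = [1];
const KNOWN_TOP_LEVEL = ['version', 'enforcement', 'rules', 'audit', 'approval'];
const KNOWN_ACTIONS = ['allow', 'audit', 'block', 'require_approval'];

// Returns a list of human-readable errors; an empty list means the policy passed.
function validatePolicy(policy: Record<string, unknown>): string[] {
  const errors: string[] = [];

  if (SUPPORTED_VERSIONS.indexOf(policy.version as number) === -1) {
    errors.push('Unsupported schema version: ' + String(policy.version));
  }
  Object.keys(policy).forEach((key) => {
    if (KNOWN_TOP_LEVEL.indexOf(key) === -1) errors.push('Unknown top-level field: ' + key);
  });

  const rules = (Array.isArray(policy.rules) ? policy.rules : []) as Array<Record<string, unknown>>;
  rules.forEach((rule, i) => {
    if (KNOWN_ACTIONS.indexOf(rule.action as string) === -1) {
      errors.push('rules[' + i + ']: unknown action ' + String(rule.action));
    }
    const reason = rule.reason;
    if (typeof reason !== 'string' || reason.trim() === '') {
      errors.push('rules[' + i + ']: missing reason');
    }
  });

  return errors;
}
```

In CI, a non-empty result would fail the build and print each error.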

4. Audit Log Rotation Neglect

Explanation: JSONL files grow unbounded, consuming disk space and degrading query performance over time. Fix: Implement automated rotation by date or size. Archive rotated logs to object storage with immutable retention policies. Compress archives older than 30 days.
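Date-based rotation and retention can be expressed as two small pure functions, sketched here. The `audit-YYYY-MM-DD.jsonl` naming scheme is an assumption made for illustration, as are both function names; actually deleting or archiving the returned files is left to the caller.

```typescript
// File that a record with the given timestamp belongs in (one file per UTC day).
function auditFileName(ts: Date): string {
  return 'audit-' + ts.toISOString().slice(0, 10) + '.jsonl';
}

// Names of audit files (optionally gzipped) older than the retention window.
function filesPastRetention(fileNames: string[], now: Date, retentionDays: number): string[] {
  const cutoff = now.getTime() - retentionDays * 24 * 60 * 60 * 1000;
  return fileNames.filter((name) => {
    const m = /^audit-(\d{4}-\d{2}-\d{2})\.jsonl(\.gz)?$/.exec(name);
    return m !== null && Date.parse(m[1] + 'T00:00:00Z') < cutoff;
  });
}
```

Files that do not follow the naming scheme are ignored rather than deleted, which keeps a stray file in the audit directory from being removed by accident.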

5. Bypassing the Interceptor in Production

Explanation: Direct tool calls or framework-level overrides skip the policy engine, creating enforcement gaps. Fix: Enforce interceptor injection at the framework initialization layer. Use dependency injection or middleware registration to prevent direct tool access. Audit all initialization paths.
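One possible shape for injection at initialization is to wrap the tool table once and hand out only the wrapped version. The sketch below assumes tools are plain functions keyed by name and that the evaluator returns an action and a reason; `guardTools` is a hypothetical helper, and it fails closed on `require_approval`.

```typescript
type ToolFn = (args: unknown) => unknown;
type Evaluate = (toolName: string) => { action: string; reason: string };

// Wrap every tool so the policy check runs before the tool body.
function guardTools(tools: Record<string, ToolFn>, evaluate: Evaluate): Record<string, ToolFn> {
  const guarded: Record<string, ToolFn> = {};
  Object.keys(tools).forEach((name) => {
    const original = tools[name];
    guarded[name] = (args: unknown) => {
      const decision = evaluate(name);
      if (decision.action === 'block' || decision.action === 'require_approval') {
        throw new Error(`Policy ${decision.action} for ${name}: ${decision.reason}`);
      }
      return original(args);
    };
  });
  // Freeze the table so later code cannot swap an unguarded function back in.
  return Object.freeze(guarded);
}
```

This only helps if the unwrapped `tools` object never leaves the initialization module, which is why auditing every initialization path still matters.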

6. Hardcoding Environment-Specific Context

Explanation: Policies containing absolute paths, secrets, or environment-specific values break across dev/staging/production. Fix: Use environment variable interpolation (${DB_HOST}) or external secret managers. Validate policies against a dry-run environment before promotion.
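The interpolation step can be illustrated with a small helper. It assumes the `${NAME}` syntax used in the configuration template and treats a missing variable as an error rather than substituting an empty string; `interpolate` is a hypothetical name.

```typescript
// Replace each ${NAME} in a policy value with the matching environment variable.
function interpolate(value: string, env: Record<string, string | undefined>): string {
  return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match: string, name: string) => {
    const resolved = env[name];
    if (resolved === undefined) {
      // Fail loudly: a silently empty endpoint is harder to debug than a startup error.
      throw new Error('Missing environment variable: ' + name);
    }
    return resolved;
  });
}
```

In a Node.js adapter this would typically be called as `interpolate(rawValue, process.env)` while the policy file is loaded.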

7. Assuming stdio Covers All MCP Transports

Explanation: HTTP/SSE transports require different proxying logic and connection pooling. stdio-only adapters fail in cloud deployments. Fix: Validate transport compatibility before deployment. Implement transport-aware routing in the proxy layer. Test with both stdio and HTTP/SSE endpoints during CI.

Production Bundle

Action Checklist

  • Define policy schema version and validate with CI linting before every commit
  • Scope glob patterns to explicit namespaces; reject catch-all wildcards in code review
  • Implement approval timeout thresholds and fallback actions to prevent run deadlocks
  • Configure JSONL audit rotation with daily archival and 90-day retention
  • Inject policy interceptors at framework initialization; verify no direct tool bypass paths
  • Externalize environment-specific values using variable interpolation or secret managers
  • Test adapters against both stdio and HTTP/SSE transports before production rollout
  • Integrate audit logs with SIEM or log aggregation platform for compliance reporting

Decision Matrix

| Scenario | Recommended Approach | Why | Cost Impact |
| --- | --- | --- | --- |
| Local Development & CI | Local JSONL + filesystem approvals | Zero latency, no external dependencies, fast iteration | Minimal infrastructure cost |
| Multi-Team Enterprise | Centralized policy repo + cloud audit sync | Consistent enforcement, shared visibility, audit compliance | Moderate cloud storage & sync costs |
| High-Throughput Production | Async approval queue + webhook routing | Prevents run blocking, scales with agent volume | Requires message queue & notification service |
| Compliance-Heavy (SOC2/ISO) | Immutable audit retention + SIEM integration | Meets audit trail requirements, enables automated reporting | Higher storage & log processing costs |

Configuration Template

# policy.yaml
version: 1
enforcement:
  default_action: audit
  timeout_ms: 5000
  fallback_on_timeout: block
rules:
  - pattern: "database.drop_*"
    action: block
    reason: "Irreversible schema modifications prohibited"
  - pattern: "shell.*"
    action: block
    reason: "Arbitrary command execution restricted"
  - pattern: "payments.*"
    action: require_approval
    reason: "Financial operations require human verification"
  - pattern: "files.read_*"
    action: allow
    reason: "Read-only access permitted"
  - pattern: "files.write_*"
    action: require_approval
    reason: "Write operations need oversight"
audit:
  output_dir: "./.agent-safety/audit"
  rotation: daily
  retention_days: 90
  compression: gzip
  siem_endpoint: "${SIEM_INGEST_URL}"
approval:
  timeout_hours: 24
  notification_webhook: "${APPROVAL_WEBHOOK_URL}"
  fallback_action: block

Quick Start Guide

  1. Initialize the policy directory: Create .agent-safety/ with audit/, pending/, and config/ subdirectories. Place policy.yaml in config/.
  2. Install framework adapters: Add the hook, guardrail, or proxy adapter to your agent runtime. Point each adapter to .agent-safety/config/policy.yaml via environment variable or constructor parameter.
  3. Validate locally: Run a dry-run agent loop. Verify that blocked tools throw structured errors, pending approvals generate JSON state files, and audit entries appear in audit/ as JSONL.
  4. Configure rotation & notifications: Set up log rotation via cron or systemd timer. Configure the approval webhook to route pending files to your team's messaging platform or ticketing system.
  5. Promote to CI/CD: Add policy validation to your pipeline. Reject commits that introduce unsupported schema versions, catch-all patterns, or missing reason fields. Deploy adapters alongside agent services with version-pinned configurations.