Cross-Runtime Safety Parity for Multi-Service LLM Architectures

Current Situation Analysis

Modern LLM applications rarely run inside a single process. Production systems typically distribute workloads across edge functions, serverless handlers, and containerized agent runtimes. This polyglot deployment model introduces a critical security blind spot: most guardrail libraries and safety frameworks assume a monolithic execution environment. They guarantee safety only if every request traverses the specific runtime where the filter is installed.

When traffic can reach secondary runtimes through alternate routing, webhook handlers, or internal service meshes, the primary safety gate becomes irrelevant. An adversary or a hallucinating model can route malicious payloads, price manipulation attempts, or identity spoofing directly to a backend agent that lacks the original safety context. This is the deployed-agent gap. It is frequently overlooked because safety tooling is marketed as a drop-in middleware for inference endpoints, not as a distributed system primitive.

The industry response has been to centralize safety checks behind a dedicated microservice. While conceptually clean, this approach introduces measurable latency penalties. A centralized safety API typically adds 10–50 ms per request due to network round-trips, TLS handshakes, and serialization overhead. For high-throughput commerce or real-time chat interfaces, this latency compounds quickly. In-process deterministic filters, by contrast, operate in microseconds, but they only protect the system if they are consistently deployed across every reachable runtime.

The fundamental misunderstanding lies in treating LLM safety as a probabilistic property rather than a deterministic boundary condition. When safety rules are expressed as finite-state machines, regex patterns, or rule engines, they admit mathematical equivalence testing. This property enables a deployable security primitive that scales across language and runtime boundaries without sacrificing performance or architectural flexibility.

WOW Moment: Key Findings

The following comparison illustrates why cross-runtime parity contracts outperform traditional single-process guardrails and centralized safety services in polyglot deployments.

Approach	Latency (p99)	Cross-Runtime Coverage	False Positive Rate	Deployment Friction
Single-Process Guardrail	~3.4 μs	None (runtime-bound)	<0.1%	Low
Centralized Safety API	10–50 ms	Full (if all traffic routed)	~0.5%	High
Cross-Runtime Parity Contract	~8.79 μs	Full (enforced via CI)	<0.1%	Medium

The parity contract approach delivers microsecond-level filtering comparable to single-process guardrails while guaranteeing behavioral equivalence across independently deployed runtimes. At 8.79 μs p99, the deterministic check is roughly 1,100 to 5,700 times faster than the 10–50 ms centralized API (10 ms / 8.79 μs ≈ 1,138; 50 ms / 8.79 μs ≈ 5,688), which is about three orders of magnitude. More importantly, the CI-enforced equivalence gate eliminates silent divergence when engineering teams modify safety rules in one language but forget to update the corresponding implementation in another. This pattern enables safe polyglot architectures without forcing teams into a single runtime or accepting network latency penalties.

Core Solution

Building a cross-runtime safety system requires three architectural shifts: reclassifying model outputs as untrusted input, implementing layered deterministic filters, and enforcing behavioral parity through automated testing.

Step 1: Treat LLM Tool Arguments as Untrusted Input

LLM-generated tool calls must be handled identically to client-supplied JSON payloads. The model cannot be trusted to supply accurate pricing, customer identifiers, or inventory states. Every tool boundary requires server-side truth re-derivation.

When a commerce agent invokes a checkout function, the backend must never act on a total_cents or customer_ref supplied by the model. Instead, it should fetch the authoritative price from the product catalog, apply modifier deltas, and resolve identity exclusively from the authenticated session token. If the model-supplied values differ from the server-computed values by any amount, the transaction must fail immediately. This defense covers both prompt injection attacks and stochastic hallucinations, as both manifest as untrusted input at the tool boundary.
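
The re-derivation rule above can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the catalog contents, the modifier table, the `Session` type, and the `checkout` signature are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass

# Hypothetical authoritative data; a real system would query the product catalog.
CATALOG_CENTS = {"latte": 450, "bagel": 325}
MODIFIER_DELTAS_CENTS = {"oat_milk": 75, "extra_shot": 100}


class UntrustedArgumentError(Exception):
    """Raised when model-supplied values disagree with server-side truth."""


@dataclass(frozen=True)
class Session:
    # Resolved from the authenticated session token, never from the model.
    customer_ref: str


def derive_total_cents(sku: str, modifiers: list) -> int:
    """Recompute the price from the catalog; unknown SKUs or modifiers raise KeyError."""
    return CATALOG_CENTS[sku] + sum(MODIFIER_DELTAS_CENTS[m] for m in modifiers)


def checkout(tool_args: dict, session: Session) -> dict:
    """Validate a model-generated tool call against server-derived values."""
    total = derive_total_cents(tool_args["sku"], tool_args.get("modifiers", []))

    # Fail on any drift between what the model claimed and what the server computed.
    claimed_total = tool_args.get("total_cents")
    if claimed_total is not None and claimed_total != total:
        raise UntrustedArgumentError(f"price drift: model={claimed_total}, server={total}")

    claimed_customer = tool_args.get("customer_ref")
    if claimed_customer is not None and claimed_customer != session.customer_ref:
        raise UntrustedArgumentError("customer_ref does not match authenticated session")

    # Only server-derived values leave this function.
    return {"customer_ref": session.customer_ref, "total_cents": total}
```

The returned order never contains a model-supplied price or identity; the model's values are used only as a tripwire.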

Step 2: Implement Layered Deterministic Filters

Safety gates should operate at three distinct stages of the request lifecycle. Each layer serves a specific purpose and compensates for the limitations of the others.

Layer 1: Pre-Inference Interception

Before any prompt reaches the LLM, a synchronous regex engine evaluates the raw input. This layer blocks known violation patterns before token billing occurs. It operates entirely in memory with no external dependencies.

// edge-runtime/safety/pre_inference_filter.ts

// Patterns are compiled once at module load. None use the `g` flag, so
// RegExp.prototype.test() stays stateless across calls.
const BLOCKED_PATTERNS = [
  /\b(?:peanut|tree.nut|shellfish)\b/i,
  /\b(?:guaranteed|100%)\s+(?:safe|free|allergen)\b/i,
  /\b(?:medical|prescription|dosage)\s+(?:override|bypass)\b/i
];

const BLOCKED_RESPONSE = { status: 'blocked', message: 'Safety policy violation detected.' };

// Returns { allowed: true }, or a blocked payload naming the first matching rule.
export function evaluateInput(rawText: string): { allowed: boolean; payload?: object } {
  const normalized = rawText.trim().toLowerCase();
  const matchIndex = BLOCKED_PATTERNS.findIndex(pattern => pattern.test(normalized));

  if (matchIndex !== -1) {
    return {
      allowed: false,
      payload: { ...BLOCKED_RESPONSE, ruleIndex: matchIndex }
    };
  }
  return { allowed: true };
}
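
For a secondary runtime, the same filter might look like the following Python sketch. It assumes the agent runtime is written in Python and that the three patterns are copied one-for-one from the TypeScript source; the function name `evaluate_input` and the dict return shape are illustrative choices, not part of the original code.

```python
import re

# Assumed to mirror BLOCKED_PATTERNS in pre_inference_filter.ts, in the same order.
BLOCKED_PATTERNS = [
    re.compile(r"\b(?:peanut|tree.nut|shellfish)\b", re.IGNORECASE),
    re.compile(r"\b(?:guaranteed|100%)\s+(?:safe|free|allergen)\b", re.IGNORECASE),
    re.compile(r"\b(?:medical|prescription|dosage)\s+(?:override|bypass)\b", re.IGNORECASE),
]

BLOCKED_RESPONSE = {"status": "blocked", "message": "Safety policy violation detected."}


def evaluate_input(raw_text: str) -> dict:
    """Return {'allowed': True} or a blocked payload naming the first matching rule."""
    normalized = raw_text.strip().lower()
    for index, pattern in enumerate(BLOCKED_PATTERNS):
        if pattern.search(normalized):
            return {"allowed": False, "payload": {**BLOCKED_RESPONSE, "ruleIndex": index}}
    return {"allowed": True}


print(evaluate_input("Is this guaranteed safe?"))
print(evaluate_input("What are your operating hours?"))
```

A hand-copied mirror like this is exactly the kind of duplicate that can drift, which is the problem the parity contract in Step 3 addresses.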

Layer 2: Stream-Level Scrubbing

If a prompt evades pre-inference checks, the outbound token stream must be monitored. A lookahead buffer captures partial phrases before they reach the client. When a dangerous pattern completes, the stream terminates and substitutes a safe fallback payload.

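// edge-runtime/safety/stream_scrubber.ts
const LOOKAHEAD_WINDOW = 50;
const DANGEROUS_REPLY_RE = /\b(?:guaranteed|certified)\s+(?:safe|free|nutless)\b/i;
const INTERCEPTED = JSON.stringify({ status: 'intercepted', message: 'Safety policy violation detected.' });

export class StreamSafetyWrapper {
  // Text received from the model but not yet released to the client.
  private held = '';
  private terminated = false;

  // Returns text that is safe to forward, the fallback payload on interception,
  // or null when nothing should be sent yet.
  processChunk(token: string): string | null {
    if (this.terminated) return null;
    this.held += token;

    // Test before trimming so a match early in a long chunk is not discarded.
    if (DANGEROUS_REPLY_RE.test(this.held)) {
      this.held = '';
      this.terminated = true;
      return INTERCEPTED;
    }

    // Hold back the trailing window; release only the text that precedes it.
    if (this.held.length <= LOOKAHEAD_WINDOW) return null;
    const released = this.held.slice(0, -LOOKAHEAD_WINDOW);
    this.held = this.held.slice(-LOOKAHEAD_WINDOW);
    return released;
  }

  // Call once when the model stream ends to release the remaining held text.
  flush(): string {
    const rest = this.terminated ? '' : this.held;
    this.held = '';
    return rest;
  }
}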

Layer 3: Post-Response Audit

Every interception event must be logged to a persistent store. This layer provides forensic evidence but should never be treated as an execution guarantee. Serverless and containerized runtimes operate on fire-and-forget execution models, so the absence of an audit row does not prove non-execution. Treat audit rows as positive evidence only.
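
The "positive evidence only" rule can be made concrete with a small sketch. It uses an in-memory SQLite table purely as a stand-in for the persistent store; the table name `safety_audit` and both function names are hypothetical.

```python
import json
import sqlite3
import time
from typing import Optional


def open_audit_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the audit table. SQLite stands in for the real store."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS safety_audit (ts REAL, layer TEXT, detail TEXT)")
    return conn


def record_interception(conn: sqlite3.Connection, layer: str, detail: dict) -> bool:
    """Best-effort write: a logging failure is reported, never raised into the request path."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO safety_audit VALUES (?, ?, ?)",
                (time.time(), layer, json.dumps(detail)),
            )
        return True
    except sqlite3.Error:
        return False


def was_intercepted(conn: sqlite3.Connection, layer: str) -> Optional[bool]:
    """True when a row exists; None (unknown) when it does not. Never returns False."""
    row = conn.execute(
        "SELECT 1 FROM safety_audit WHERE layer = ? LIMIT 1", (layer,)
    ).fetchone()
    return True if row else None
```

Returning `None` rather than `False` for a missing row encodes the rule in the type: callers cannot accidentally read "no row" as "did not happen".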

Step 3: Enforce Parity Contracts via CI

A parity contract guarantees that deterministic safety classifiers behave identically across runtimes. The contract consists of three obligations: mathematical equivalence, a shared test corpus, and automated CI enforcement.

The most robust implementation parses the source regex from one runtime and recompiles it under the other runtime's engine. This eliminates drift when engineers update patterns in one language but neglect the other.

# agent-runtime/tests/test_safety_parity.py
import pathlib
import re

import pytest

# Load TypeScript source at test time (path is relative to the working directory)
TS_SOURCE = pathlib.Path('../edge-runtime/safety/pre_inference_filter.ts').read_text(encoding='utf-8')

# Extract the body of the BLOCKED_PATTERNS array, then each /pattern/flags literal
PATTERN_RE = re.compile(r'const\s+BLOCKED_PATTERNS\s*=\s*\[(.*?)\]', re.DOTALL)
ARRAY_MATCH = PATTERN_RE.search(TS_SOURCE)
assert ARRAY_MATCH is not None, "BLOCKED_PATTERNS array not found in TypeScript source"
EXTRACTED_PATTERNS = re.findall(r'/(.+?)/([a-z]*)', ARRAY_MATCH.group(1))
assert EXTRACTED_PATTERNS, "no regex literals extracted from BLOCKED_PATTERNS"

# Compile under Python's re engine, honoring the case-insensitive flag from the source
PYTHON_PATTERNS = [
    re.compile(pat, re.IGNORECASE if 'i' in flags else 0)
    for pat, flags in EXTRACTED_PATTERNS
]

# Shared corpus: 90 cases (27 allergen-positive, 27 medical-positive, 10 dietary-safety-positive, 19 dangerous-reply-positive, 7 negative controls)
CORPUS = [
    ("I need a peanut-free option", True),
    ("Certified safe for shellfish allergies", True),
    ("What are your operating hours?", False),
    # ... 87 additional cases
]

@pytest.mark.parametrize("input_text, should_block", CORPUS)
def test_cross_runtime_equivalence(input_text: str, should_block: bool):
    blocked = any(p.search(input_text) for p in PYTHON_PATTERNS)
    assert blocked == should_block, f"Parity violation on: {input_text}"

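This approach catches the most common parity bug: unilateral pattern updates. Because the Python patterns are derived from the TypeScript source at test time, there is no second copy to forget. If a TypeScript change behaves differently under Python's engine, or changes the outcome for any case in the shared corpus, the test fails and the CI gate blocks deployment. The test either passes (parity preserved on the corpus) or fails explicitly (CI blocks). There is no middle ground where silent divergence on a covered case ships to production.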
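
Recompiling a JavaScript pattern under Python only works when the pattern stays within the syntax both engines share. The helper below is a hedged sketch of one way to make that explicit; `compile_js_pattern` is a hypothetical name, and the decision to ignore `g` and `d` while rejecting `u`, `v`, and `y` is an assumption about what is safe for a match/no-match test.

```python
import re

# JavaScript flags with a direct counterpart in Python's re module.
_JS_TO_PY_FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}
# Flags assumed not to change whether a match exists anywhere in the input.
_IGNORABLE_JS_FLAGS = {"g", "d"}
# JavaScript-only constructs: (?<name>...) groups, \p{...}/\P{...}, \u{...}.
_JS_ONLY_SYNTAX = re.compile(r"\(\?<(?![=!])|\\p\{|\\P\{|\\u\{")


def compile_js_pattern(source: str, js_flags: str) -> re.Pattern:
    """Compile a JavaScript regex literal body under Python's re, or raise ValueError."""
    py_flags = 0
    for flag in js_flags:
        if flag in _JS_TO_PY_FLAGS:
            py_flags |= _JS_TO_PY_FLAGS[flag]
        elif flag not in _IGNORABLE_JS_FLAGS:
            raise ValueError(f"JavaScript flag {flag!r} has no safe Python equivalent")
    if _JS_ONLY_SYNTAX.search(source):
        raise ValueError(f"pattern uses JavaScript-only syntax: {source!r}")
    try:
        return re.compile(source, py_flags)
    except re.error as exc:
        raise ValueError(f"pattern does not compile under re: {source!r}") from exc
```

Passing this check does not prove equivalence. For example, Python's `\w` and `\b` are Unicode-aware for `str` patterns by default, while JavaScript's are ASCII-based, so the shared corpus should include non-ASCII inputs to surface such differences.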

Pitfall Guide

1. Trusting Model-Generated Tool Payloads

Explanation: Engineers often assume that because a tool call originates from a trusted agent, its arguments are safe. LLMs hallucinate prices, swap customer IDs, and fabricate inventory states.

Fix: Treat all tool arguments as untrusted input. Re-derive pricing, identity, and inventory state server-side. Fail transactions on any drift between model-supplied and server-computed values.

2. Applying Parity Contracts to Probabilistic Filters

Explanation: Parity contracts require deterministic equivalence. Applying them to LLM-based guardrails or neural classifiers introduces false confidence because probabilistic models cannot guarantee byte-identical behavior across runs.

Fix: Reserve parity contracts for regex, finite-state automata, and rule engines. For probabilistic safety layers, use distributional equivalence testing and confidence thresholding instead.
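
One possible reading of "distributional equivalence testing" is to compare decisions statistically instead of case by case. The sketch below measures how often two scorers agree at a given confidence threshold; the `Scorer` type, the stub scorers, and the corpus are placeholders invented for illustration, not real classifiers.

```python
from typing import Callable, Iterable

# A scorer returns an estimated probability that the text is unsafe.
Scorer = Callable[[str], float]


def agreement_rate(a: Scorer, b: Scorer, corpus: Iterable[str], threshold: float) -> float:
    """Fraction of corpus items on which both scorers reach the same block/allow decision."""
    items = list(corpus)
    if not items:
        raise ValueError("corpus must not be empty")
    same = sum((a(text) >= threshold) == (b(text) >= threshold) for text in items)
    return same / len(items)


# Stub scorers standing in for the classifiers deployed in each runtime.
def edge_scorer(text: str) -> float:
    return 0.9 if "allergy" in text else 0.1


def agent_scorer(text: str) -> float:
    return 0.8 if "allerg" in text else 0.2


corpus = ["nut allergy", "allergen info", "opening hours", "menu"]
rate = agreement_rate(edge_scorer, agent_scorer, corpus, threshold=0.5)
print(f"agreement: {rate:.2f}")
```

A CI gate built on this would assert a minimum agreement rate rather than exact equality, which is a weaker guarantee than a parity contract and should be labeled as such.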

3. Ignoring Stream Boundary Conditions

Explanation: Safety filters that only evaluate complete messages miss partial phrase injection. Adversaries split dangerous tokens across multiple chunks to evade detection.

Fix: Implement a lookahead buffer (typically 40–60 characters) that maintains state across stream chunks. Evaluate the buffer on every token arrival and terminate the stream immediately upon pattern completion.
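
The difference between stateless and stateful checks is easy to demonstrate. In this minimal sketch the pattern is copied from the stream scrubber; the two helper functions and the sample chunks are illustrative.

```python
import re

DANGEROUS_REPLY_RE = re.compile(
    r"\b(?:guaranteed|certified)\s+(?:safe|free|nutless)\b", re.IGNORECASE
)
LOOKAHEAD_WINDOW = 50


def per_chunk_hits(chunks) -> bool:
    """Naive check: each chunk is evaluated in isolation."""
    return any(DANGEROUS_REPLY_RE.search(chunk) for chunk in chunks)


def buffered_hits(chunks, window: int = LOOKAHEAD_WINDOW) -> bool:
    """Stateful check: carry the trailing `window` characters across chunk boundaries."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Test before trimming so a match early in a long chunk is not discarded.
        if DANGEROUS_REPLY_RE.search(buffer):
            return True
        buffer = buffer[-window:]
    return False


# The phrase "guaranteed safe" is split across three chunks.
chunks = ["This dish is guaran", "teed ", "safe for everyone"]
print(per_chunk_hits(chunks), buffered_hits(chunks))
```

The window only needs to be as long as the longest phrase a pattern can match; a phrase padded with enough whitespace to exceed it would still slip through, so the window size is itself a safety parameter worth testing.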

4. Timestamp Drift in Cross-Service Signing

Explanation: HMAC-based request signing between runtimes fails when system clocks drift beyond the freshness window. A 60-second tolerance is common, but containerized agents often run on unsynchronized hosts.

Fix: Enforce NTP synchronization across all deployment targets. Reject requests whose timestamp falls outside the maximum window, and log explicitly when an accepted request lands within 5 seconds of that limit so that drift becomes visible before it causes rejections.
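
A minimal sketch of signing with a freshness check, using only the standard library. The message layout (`<timestamp>.<body>`), the function names, and the interpretation of the 5-second grace period as "warn when close to the limit" are assumptions made for this example.

```python
import hashlib
import hmac
import logging
import time
from typing import Optional

MAX_SKEW_SECONDS = 60  # freshness window
GRACE_SECONDS = 5      # warn when an accepted request is this close to the limit

log = logging.getLogger("safety.signing")


def sign_request(secret: bytes, body: bytes, timestamp: int) -> str:
    """HMAC-SHA256 over '<timestamp>.<body>', returned as a hex digest."""
    message = str(timestamp).encode("ascii") + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, body: bytes, timestamp: int, signature: str,
                   now: Optional[float] = None) -> bool:
    """Reject stale or future-dated requests, then verify the signature."""
    now = time.time() if now is None else now
    skew = abs(now - timestamp)
    if skew > MAX_SKEW_SECONDS:
        return False
    expected = sign_request(secret, body, timestamp)
    # Constant-time comparison avoids leaking how many characters matched.
    valid = hmac.compare_digest(expected, signature)
    if valid and skew > MAX_SKEW_SECONDS - GRACE_SECONDS:
        log.warning("request accepted near freshness limit: skew=%.1fs", skew)
    return valid
```

Binding the timestamp into the signed message matters: otherwise an attacker could replay an old body with a fresh timestamp.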

5. Assuming Audit Logs Are Execution Guarantees

Explanation: Serverless and containerized runtimes use fire-and-forget execution models. If a safety filter intercepts a request but the logging call fails due to network partitioning, the audit row will be missing.

Fix: Treat audit logs as positive evidence only. Never use absence of a log to prove non-execution. Implement synchronous blocking for critical safety layers and asynchronous logging for forensic analysis.

6. Regex Boundary Blind Spots

Explanation: Word boundary anchors (\b) fail on plural forms, hyphenated compounds, and Unicode whitespace. Patterns like \bsulfite\b miss sulfites, and {0,30} proximity windows break when keywords are separated by punctuation.

Fix: Use Unicode-aware boundary matching. Test plural/singular variants explicitly. Replace fixed-distance proximity windows with flexible token-based parsing or finite-state machines for complex linguistic patterns.
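
The plural blind spot and the whitespace sensitivity can both be reproduced directly. This sketch uses Python's `re`; the variant list and helper name are illustrative.

```python
import re

strict = re.compile(r"\bsulfite\b", re.IGNORECASE)
plural_aware = re.compile(r"\bsulfites?\b", re.IGNORECASE)

VARIANTS = ["sulfite", "sulfites", "Sulfite-free", "SULFITES"]


def missed_variants(pattern: re.Pattern, variants: list) -> list:
    """Return the variants the pattern fails to match anywhere."""
    return [v for v in variants if not pattern.search(v)]


print(missed_variants(strict, VARIANTS))        # plural forms slip past \b...\b
print(missed_variants(plural_aware, VARIANTS))  # optional 's' closes the gap

# Whitespace handling depends on engine and flags: a no-break space (U+00A0)
# matches \s for str patterns by default in Python, but not under re.ASCII.
nbsp_text = "guaranteed\u00a0safe"
unicode_ws = re.compile(r"guaranteed\s+safe")
ascii_ws = re.compile(r"guaranteed\s+safe", re.ASCII)
print(bool(unicode_ws.search(nbsp_text)), bool(ascii_ws.search(nbsp_text)))
```

Cases like these belong in the shared parity corpus, since they are precisely where two regex engines or two flag settings are most likely to disagree.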

7. Skipping CI Enforcement Gates

Explanation: Parity contracts degrade silently when engineers update safety rules in one runtime but forget the other. Manual code reviews cannot catch regex divergence at scale.

Fix: Integrate parity tests into the deployment pipeline. Block merges when behavioral equivalence fails. Require security-tagged reviewers for any modification to safety-critical files across runtimes.

Production Bundle

Action Checklist

Reclassify all LLM tool arguments as untrusted input and implement server-side truth re-derivation for pricing, identity, and inventory
Deploy deterministic safety filters at pre-inference, stream-level, and post-audit stages
Extract safety patterns into a shared configuration format that can be parsed across runtimes
Implement a CI pipeline that compiles patterns from one language under another and runs a shared test corpus
Configure HMAC-SHA256 request signing with 60-second timestamp freshness for cross-runtime communication
Establish CODEOWNERS rules requiring security-reviewed approvals for safety filter modifications
Monitor p99 latency of safety layers and alert when execution exceeds 50 μs
Maintain a living adversarial corpus and re-run parity tests before every major deployment
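
For the latency item in the checklist, a rough in-process measurement might look like the sketch below. It is an assumption-laden microbenchmark, not a monitoring setup: the pattern, the `evaluate` function, and the iteration count are placeholders, and results will vary by machine and interpreter.

```python
import re
import statistics
import time

PATTERN = re.compile(r"\b(?:peanut|tree.nut|shellfish)\b", re.IGNORECASE)
ALERT_THRESHOLD_US = 50.0  # alert threshold from the checklist


def evaluate(text: str) -> bool:
    """Stand-in for the deployed safety filter."""
    return PATTERN.search(text) is not None


def p99_latency_us(fn, sample: str, iterations: int = 10_000) -> float:
    """Time `fn(sample)` repeatedly and return the 99th-percentile latency in microseconds."""
    timings_ns = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn(sample)
        timings_ns.append(time.perf_counter_ns() - start)
    # quantiles(n=100) returns 99 cut points; the last one is the 99th percentile.
    return statistics.quantiles(timings_ns, n=100)[-1] / 1_000


p99 = p99_latency_us(evaluate, "I need a peanut-free option")
print(f"p99 = {p99:.2f} us, alert = {p99 > ALERT_THRESHOLD_US}")
```

In production the same percentile would normally come from request-level metrics rather than a synthetic loop, since timer overhead and warm caches make microbenchmarks optimistic.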

Decision Matrix

Scenario	Recommended Approach	Why	Cost Impact
Single runtime, low throughput	In-process guardrail middleware	Simplicity outweighs distribution complexity	Low
Multi-runtime, deterministic rules	Cross-runtime parity contract	Guarantees equivalence without network latency	Medium
Multi-runtime, probabilistic filters	Centralized safety API + confidence thresholds	Probabilistic models require unified inference context	High
High-frequency commerce transactions	Pre-inference regex + stream scrubbing	Microsecond latency prevents billing drift	Low
Cross-team language boundaries	CI-enforced parity tests	Eliminates silent divergence from unilateral updates	Medium

Configuration Template

# safety-pipeline/config/parity-contract.yaml
version: "2.0"
contract:
  name: "commerce_safety_boundary"
  deterministic: true
  layers:
    pre_inference:
      runtime: "edge"
      pattern_source: "edge-runtime/safety/pre_inference_filter.ts"
      timeout_ms: 5
    stream_scrub:
      runtime: "edge"
      lookahead_chars: 50
      timeout_ms: 10
    audit_log:
      runtime: "agent"
      destination: "supabase.franklin_safety_audit"
      async: true
  test_corpus:
    path: "tests/corpus/commerce_safety.json"
    total_cases: 90
    breakdown:
      allergen_positive: 27
      medical_positive: 27
      dietary_safety_positive: 10
      dangerous_reply_positive: 19
      negative_controls: 7
  ci_gate:
    block_on_divergence: true
    required_reviewers: ["security-team"]
    max_latency_p99_us: 50
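
A contract file like this is only useful if it is internally consistent. The sketch below checks a few invariants on the parsed contract; it takes a plain dict so that it stays independent of any YAML library, and `validate_contract` plus the specific checks are illustrative assumptions, not part of a published schema.

```python
def validate_contract(contract: dict) -> list:
    """Return human-readable problems; an empty list means the contract looks consistent."""
    problems = []

    corpus = contract["test_corpus"]
    breakdown_total = sum(corpus["breakdown"].values())
    if breakdown_total != corpus["total_cases"]:
        problems.append(
            f"breakdown sums to {breakdown_total}, expected {corpus['total_cases']}"
        )

    if not contract.get("deterministic", False):
        problems.append("parity contracts apply only to deterministic filters")

    if contract["ci_gate"]["max_latency_p99_us"] <= 0:
        problems.append("max_latency_p99_us must be positive")

    return problems


# The relevant subset of the template above, as it would look after parsing.
CONTRACT = {
    "name": "commerce_safety_boundary",
    "deterministic": True,
    "test_corpus": {
        "total_cases": 90,
        "breakdown": {
            "allergen_positive": 27,
            "medical_positive": 27,
            "dietary_safety_positive": 10,
            "dangerous_reply_positive": 19,
            "negative_controls": 7,
        },
    },
    "ci_gate": {"block_on_divergence": True, "max_latency_p99_us": 50},
}

print(validate_contract(CONTRACT))
```

Running a check like this in CI catches the case where someone adds corpus cases but forgets to update the declared totals.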

Quick Start Guide

1. Define your deterministic safety rules as regex patterns or finite-state machines. Ensure they cover positive, negative, and boundary cases. Store them in a single source file per runtime.
2. Implement the three-layer filter architecture in your primary runtime. Add pre-inference blocking, stream-level lookahead scrubbing, and asynchronous audit logging.
3. Create the parity test harness in your secondary runtime. Write a script that parses the primary runtime's pattern source, recompiles it under the secondary engine, and executes the shared test corpus.
4. Integrate the parity test into your CI pipeline. Configure the gate to block deployments when behavioral equivalence fails. Add CODEOWNERS rules for safety-critical files.
5. Deploy and monitor. Verify p99 latency remains under 50 μs. Confirm that cross-runtime requests are signed with HMAC-SHA256 and timestamp freshness is enforced. Run the adversarial corpus monthly to validate continued coverage.
