e execution." This enables immediate isolation of compromised tools, cryptographic attribution of tool origins, and automated policy enforcement without blocking agent autonomy.
Core Solution
Securing an MCP toolchain requires three coordinated layers: cryptographic manifest signing, runtime behavior validation, and decentralized provenance resolution. Each layer addresses a specific failure mode in the agent execution lifecycle.
Step 1: Cryptographic Manifest Signing
Every MCP server must ship with a signed manifest that declares its tools, resources, and expected data schemas. The manifest is signed using an Ed25519 key pair controlled by the publisher. Verification happens before the server is loaded into the agent's execution context.
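import { ed25519 } from '@noble/curves/ed25519';
import { createHash } from 'crypto';
interface ManifestDeclaration {
  serverId: string;
  version: string;
  tools: Array<{ name: string; schema: Record<string, unknown> }>;
  resources: Array<{ uri: string; access: 'read' | 'write' }>;
  timestamp: number;
}
export class ManifestSigner {
  private privateKey: Uint8Array;
  constructor(privateKeyHex: string) {
    // Ed25519 private keys are 32 bytes (64 hex characters).
    this.privateKey = Uint8Array.from(Buffer.from(privateKeyHex, 'hex'));
  }
  public sign(manifest: ManifestDeclaration): string {
    // Sort keys at every depth. An array replacer such as Object.keys(manifest).sort()
    // is an allowlist applied to nested objects too, so it would drop tools[].name etc.
    const payload = JSON.stringify(ManifestSigner.sortKeys(manifest));
    const hash = createHash('sha256').update(payload).digest();
    const signature = ed25519.sign(hash, this.privateKey);
    return Buffer.from(signature).toString('base64');
  }
  // Returns a copy of `value` with object keys in sorted order, recursively.
  private static sortKeys(value: unknown): unknown {
    if (Array.isArray(value)) return value.map((item) => ManifestSigner.sortKeys(item));
    if (value === null || typeof value !== 'object') return value;
    const source = value as Record<string, unknown>;
    return Object.fromEntries(
      Object.keys(source).sort().map((key) => [key, ManifestSigner.sortKeys(source[key])]),
    );
  }
}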
Architecture Rationale: We separate signing from verification so that publishers can generate manifests offline while agents validate them at runtime. Ed25519 is chosen for its compact 64-byte signatures and a design that lends itself to constant-time implementations, which reduces side-channel exposure. Sorting keys before serialization makes the hash deterministic, provided the sort is applied at every nesting level; this prevents signature mismatches caused by differences in JSON key order.
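The step above shows only the signing half. A minimal sketch of the matching verification step follows, under some stated assumptions: it uses Node's built-in `node:crypto` Ed25519 support instead of `@noble/curves` so that it runs without dependencies, it signs the canonical bytes directly rather than a SHA-256 digest, and the helper names (`canonicalize`, `signManifest`, `verifyManifest`) are hypothetical. A verifier for the `ManifestSigner` above would need to hash the payload with SHA-256 first, exactly as the signer does.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from 'node:crypto';

// Recursively sort object keys so serialization is deterministic at every depth.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => canonicalize(item));
  if (value === null || typeof value !== 'object') return value;
  const source = value as Record<string, unknown>;
  const sorted: Record<string, unknown> = {};
  for (const key of Object.keys(source).sort()) sorted[key] = canonicalize(source[key]);
  return sorted;
}

function canonicalBytes(manifest: unknown): Buffer {
  return Buffer.from(JSON.stringify(canonicalize(manifest)), 'utf8');
}

// For Ed25519, node:crypto expects `null` as the digest algorithm.
function signManifest(manifest: unknown, privateKey: KeyObject): string {
  return sign(null, canonicalBytes(manifest), privateKey).toString('base64');
}

function verifyManifest(manifest: unknown, signatureB64: string, publicKey: KeyObject): boolean {
  return verify(null, canonicalBytes(manifest), publicKey, Buffer.from(signatureB64, 'base64'));
}

// Throwaway key pair and illustrative manifest values (not from any real server).
const { publicKey, privateKey } = generateKeyPairSync('ed25519');
const manifest = {
  serverId: 'example-server',
  version: '1.0.0',
  tools: [{ name: 'search', schema: { type: 'object' } }],
  resources: [{ uri: 'file:///tmp', access: 'read' }],
  timestamp: 1700000000000,
};
const signature = signManifest(manifest, privateKey);

// Same content with keys in a different order: must still verify.
const reordered = {
  timestamp: 1700000000000,
  resources: [{ access: 'read', uri: 'file:///tmp' }],
  tools: [{ schema: { type: 'object' }, name: 'search' }],
  version: '1.0.0',
  serverId: 'example-server',
};
// A nested field changed after signing: must be rejected.
const tampered = { ...manifest, tools: [{ name: 'exfiltrate', schema: { type: 'object' } }] };

console.log(verifyManifest(manifest, signature, publicKey));  // true
console.log(verifyManifest(reordered, signature, publicKey)); // true
console.log(verifyManifest(tampered, signature, publicKey));  // false
```

The tampered case changes only a nested tool name, which is why the canonical form has to include nested fields and not just top-level keys.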
Step 2: Runtime Behavior Validation
Static manifests declare intent, but runtime validation enforces it. A guard interceptor wraps every tool invocation, comparing actual behavior against declared schemas and monitoring for anomalous data flows.
import { z } from 'zod';
interface ToolInvocation {
  toolName: string;
  input: unknown;
  output: unknown;
  executionTime: number; // milliseconds
}
interface ToolSchemas {
  input: z.ZodType;
  output: z.ZodType;
}
export class RuntimeGuard {
  private schemas: Map<string, ToolSchemas> = new Map();
  // Inputs and outputs generally have different shapes, so each tool registers one schema for each.
  public registerSchema(toolName: string, input: z.ZodType, output: z.ZodType): void {
    this.schemas.set(toolName, { input, output });
  }
  public validateInvocation(invocation: ToolInvocation): boolean {
    const schemas = this.schemas.get(invocation.toolName);
    if (!schemas) return false; // Unregistered tools are rejected.
    const inputValid = schemas.input.safeParse(invocation.input).success;
    const outputValid = schemas.output.safeParse(invocation.output).success;
    const withinTimeLimit = invocation.executionTime < 5000;
    return inputValid && outputValid && withinTimeLimit;
  }
}
Architecture Rationale: Runtime validation catches dynamic attacks that static analysis misses, such as prompt-induced tool abuse or chained data poisoning. We use Zod for schema validation because it checks values at runtime and reports failures as structured results instead of thrown exceptions. The 5-second execution threshold bounds how long a single invocation may run, which limits resource-exhaustion attacks; tools that legitimately run longer need a higher, explicitly configured limit.
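The paragraph above describes a guard that wraps every tool invocation. One way to sketch that interception is shown below. To stay dependency-free, plain predicate functions stand in for Zod schemas, tools are synchronous, and every name here (`ToolInterceptor`, `GuardedTool`, `GuardResult`) is hypothetical. The sketch measures elapsed time after the call returns, so it flags slow tools but does not preempt a tool that never returns.

```typescript
type Validator = (value: unknown) => boolean;

interface GuardedTool {
  inputOk: Validator;   // stand-in for an input schema
  outputOk: Validator;  // stand-in for an output schema
  run: (input: unknown) => unknown;
}

type GuardFailure = 'unknown_tool' | 'invalid_input' | 'tool_threw' | 'too_slow' | 'invalid_output';
type GuardResult =
  | { ok: true; output: unknown; elapsedMs: number }
  | { ok: false; reason: GuardFailure };

class ToolInterceptor {
  private tools = new Map<string, GuardedTool>();
  private maxExecutionMs: number;
  private now: () => number;

  // The clock is injectable so the time limit can be exercised in tests.
  constructor(maxExecutionMs: number = 5000, now: () => number = () => Date.now()) {
    this.maxExecutionMs = maxExecutionMs;
    this.now = now;
  }

  register(name: string, tool: GuardedTool): void {
    this.tools.set(name, tool);
  }

  invoke(name: string, input: unknown): GuardResult {
    const tool = this.tools.get(name);
    if (!tool) return { ok: false, reason: 'unknown_tool' };
    // Reject bad input before the tool runs at all.
    if (!tool.inputOk(input)) return { ok: false, reason: 'invalid_input' };

    const start = this.now();
    let output: unknown;
    try {
      output = tool.run(input);
    } catch {
      return { ok: false, reason: 'tool_threw' };
    }
    const elapsedMs = this.now() - start;

    if (elapsedMs >= this.maxExecutionMs) return { ok: false, reason: 'too_slow' };
    // Never hand unvalidated output back to the caller.
    if (!tool.outputOk(output)) return { ok: false, reason: 'invalid_output' };
    return { ok: true, output, elapsedMs };
  }
}

const isAddInput: Validator = (v) =>
  typeof v === 'object' && v !== null &&
  typeof (v as { a?: unknown }).a === 'number' &&
  typeof (v as { b?: unknown }).b === 'number';
const isNumber: Validator = (v) => typeof v === 'number';

const addTool: GuardedTool = {
  inputOk: isAddInput,
  outputOk: isNumber,
  run: (input) => {
    const { a, b } = input as { a: number; b: number };
    return a + b;
  },
};

const interceptor = new ToolInterceptor();
interceptor.register('add', addTool);
// Declares a numeric output but returns a string, so the output check fails.
interceptor.register('leaky', { inputOk: isAddInput, outputOk: isNumber, run: () => 'secret' });

const good = interceptor.invoke('add', { a: 2, b: 3 });
const badInput = interceptor.invoke('add', { a: '2' });
const badOutput = interceptor.invoke('leaky', { a: 1, b: 1 });
const unknownTool = interceptor.invoke('missing', {});

// A fake clock that jumps 6000 ms per reading makes every call look slow.
let fakeTime = 0;
const slowInterceptor = new ToolInterceptor(5000, () => (fakeTime += 6000));
slowInterceptor.register('add', addTool);
const slow = slowInterceptor.invoke('add', { a: 1, b: 1 });
```

Returning a reason code instead of a bare boolean gives the trust layer something to record as a runtime violation.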
Step 3: Trust Scoring & DID Provenance
Trust is not binary. We calculate a composite trust score based on manifest validity, runtime behavior history, and decentralized identifier (DID) resolution. DIDs provide cryptographic attribution without relying on centralized certificate authorities.
interface TrustSignal {
  manifestValid: boolean;
  runtimeCompliance: number; // expected range: 0 to 1
  didResolutionSuccess: boolean;
  historicalIncidents: number;
}
export class TrustEvaluator {
  // Component maxima: manifest 40, runtime 30, provenance 20, history 10 (total 100).
  public calculateScore(signals: TrustSignal): number {
    const base = signals.manifestValid ? 40 : 0;
    // Clamp to [0, 1] so out-of-range input cannot push this component below 0 or above 30.
    const runtime = Math.min(Math.max(signals.runtimeCompliance, 0), 1) * 30;
    const provenance = signals.didResolutionSuccess ? 20 : 0;
    // Each incident costs 5 points; the component bottoms out at 0 after two incidents.
    const history = Math.max(10 - (signals.historicalIncidents * 5), 0);
    return Math.min(base + runtime + provenance + history, 100);
  }
  public isTrusted(score: number, threshold: number = 75): boolean {
    return score >= threshold;
  }
}
Architecture Rationale: The scoring model weights manifest validity highest (40 of 100 points) because cryptographic integrity is the foundation of supply chain security. Runtime compliance (up to 30 points) accounts for behavioral drift. DID resolution (20 points) adds decentralized attribution, reducing reliance on registry trust. The history component (up to 10 points) penalizes repeat offenders. Combining several signals makes score gaming harder than with any single signal and aligns with zero-trust principles.
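A few worked cases make the weighting concrete. The sketch below reproduces the formula from `calculateScore` in a standalone helper (`scoreBreakdown` is a hypothetical name) and assumes `runtimeCompliance` is a fraction between 0 and 1, which the source does not state explicitly.

```typescript
interface Signals {
  manifestValid: boolean;
  runtimeCompliance: number; // assumed to be a fraction in [0, 1]
  didResolutionSuccess: boolean;
  historicalIncidents: number;
}

// Same arithmetic as calculateScore, but returning each component for inspection.
function scoreBreakdown(s: Signals) {
  const base = s.manifestValid ? 40 : 0;
  const runtime = Math.min(s.runtimeCompliance * 30, 30);
  const provenance = s.didResolutionSuccess ? 20 : 0;
  const history = Math.max(10 - s.historicalIncidents * 5, 0);
  const total = Math.min(base + runtime + provenance + history, 100);
  return { base, runtime, provenance, history, total };
}

// 40 + 30 + 20 + 10 = 100
const clean = scoreBreakdown({
  manifestValid: true, runtimeCompliance: 1, didResolutionSuccess: true, historicalIncidents: 0,
});
// 40 + 30 + 0 + 10 = 80: clears the default threshold of 75 without any DID attribution.
const noDid = scoreBreakdown({
  manifestValid: true, runtimeCompliance: 1, didResolutionSuccess: false, historicalIncidents: 0,
});
// 0 + 30 + 20 + 10 = 60: an invalid manifest cannot reach 75.
const badManifest = scoreBreakdown({
  manifestValid: false, runtimeCompliance: 1, didResolutionSuccess: true, historicalIncidents: 0,
});
// 40 + 30 + 20 + 0 = 90: five incidents cost at most 10 points.
const repeatOffender = scoreBreakdown({
  manifestValid: true, runtimeCompliance: 1, didResolutionSuccess: true, historicalIncidents: 5,
});

console.log(clean.total, noDid.total, badManifest.total, repeatOffender.total); // 100 80 60 90
```

Two consequences follow from the numbers: at the default threshold a failed DID resolution alone does not block a server, and the history penalty is capped at 10 points however many incidents accumulate. Both are reasons to pair the score with explicit gates.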
Pitfall Guide
1. Static-Only Scanning
Explanation: Relying exclusively on package-level vulnerability scanners misses runtime behavior, tool chaining effects, and prompt-induced abuse.
Fix: Deploy a runtime guard that intercepts tool calls and validates against declared schemas before execution completes.
2. Blind Registry Trust
Explanation: Public registries verify package integrity but do not validate MCP semantics or publisher identity beyond basic authentication.
Fix: Require cryptographic manifests and resolve DIDs before loading any server into the agent context.
3. Over-Reliance on Trust Scores
Explanation: Trust scores can be manipulated if based on a single signal or if thresholds are too permissive.
Fix: Use multi-signal scoring with hard gates for manifest validity and DID resolution. Treat scores as advisory, not authoritative.
4. Prompt-Induced Tool Abuse
Explanation: Agents can be tricked into invoking tools with malicious payloads or chaining tools in unintended sequences.
Fix: Implement input/output sanitization at the runtime guard layer and enforce strict tool invocation policies based on user intent classification.
5. DID Resolution Without Caching
Explanation: Resolving DIDs on every invocation introduces latency and creates a denial-of-service vector.
Fix: Cache DID documents with TTL-based expiration and validate resolution signatures against known root keys.
6. Chaining Without Sandboxing
Explanation: Unisolated tool chains allow a compromised server to access context from other tools or escalate privileges.
Fix: Execute each MCP server in a sandboxed process with restricted network access and explicit data-sharing contracts.
7. Missing Runtime Output Validation
Explanation: Tools may return poisoned data that corrupts downstream agent reasoning or exfiltrates sensitive context.
Fix: Validate all tool outputs against declared schemas and apply content filtering before passing data back to the model.
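The fix for pitfall 5 (caching DID documents with TTL-based expiration) can be sketched as follows. This is an illustration under assumptions: `DidCache`, `DidDocument`, and the resolver signature are hypothetical, the resolver is synchronous for brevity (real resolvers are typically asynchronous), and validation of resolution signatures against root keys is left out.

```typescript
interface DidDocument {
  id: string;
  verificationKeys: string[];
}

type Resolver = (did: string) => DidDocument;

class DidCache {
  private entries = new Map<string, { doc: DidDocument; expiresAt: number }>();
  private resolve: Resolver;
  private ttlMs: number;
  private now: () => number;

  // ttlSeconds defaults to 3600 to match cache_ttl_seconds in the configuration template.
  constructor(resolve: Resolver, ttlSeconds: number = 3600, now: () => number = () => Date.now()) {
    this.resolve = resolve;
    this.ttlMs = ttlSeconds * 1000;
    this.now = now;
  }

  get(did: string): DidDocument {
    const current = this.now();
    const hit = this.entries.get(did);
    // Serve from cache while the entry is still fresh.
    if (hit && hit.expiresAt > current) return hit.doc;
    // Otherwise resolve again and restart the TTL.
    const doc = this.resolve(did);
    this.entries.set(did, { doc, expiresAt: current + this.ttlMs });
    return doc;
  }
}

// Demo with a counting resolver and a manually advanced clock.
let resolverCalls = 0;
let clockMs = 0;
const cache = new DidCache(
  (did) => {
    resolverCalls += 1;
    return { id: did, verificationKeys: ['example-key'] };
  },
  3600,
  () => clockMs,
);

cache.get('did:example:123');
cache.get('did:example:123');
const callsWithinTtl = resolverCalls; // 1: the second lookup was a cache hit

clockMs += 3600 * 1000; // advance the clock to the expiry instant
cache.get('did:example:123');
const callsAfterExpiry = resolverCalls; // 2: the entry expired and was resolved again
```

A production cache would also bound its size and decide how to behave when the resolver is unreachable and only a stale entry exists.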
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Internal tool deployment | Static manifest signing + runtime guard | Controlled publisher identity reduces need for DID resolution | Low |
| Public marketplace integration | Full provenance pipeline with DID resolution | Untrusted publishers require cryptographic attribution and multi-signal trust | Medium |
| High-risk data processing | Sandboxed execution + strict output validation | Prevents context exfiltration and data poisoning | High |
| Low-latency agent workflows | Cached DID resolution + lightweight runtime guard | Balances security with performance constraints | Low-Medium |
Configuration Template
mcp_security_policy:
  manifest_verification:
    algorithm: ed25519
    require_signature: true
    reject_unsigned: true
  runtime_guard:
    enabled: true
    execution_timeout_ms: 5000
    schema_validation: strict
    output_filtering: true
  trust_engine:
    scoring_model: multi_signal
    threshold: 75
    did_resolution:
      cache_ttl_seconds: 3600
      require_root_signature: true
    historical_penalties:
      max_incidents: 2
      penalty_per_incident: 5
  sandbox:
    enabled: true
    network_isolation: true
    resource_limits:
      memory_mb: 512
      cpu_cores: 1
    data_sharing: explicit_only
Quick Start Guide
- Initialize the signing pipeline: Generate an Ed25519 key pair using @noble/curves/ed25519 and configure your CI/CD pipeline to sign MCP manifests before publishing.
- Deploy the runtime guard: Integrate the RuntimeGuard class into your agent execution layer. Register tool schemas during initialization and wrap all tool invocations with validation logic.
- Configure DID resolution: Set up a DID resolver service with TTL caching. Point your trust engine to the resolver endpoint and enable root signature verification.
- Enforce trust thresholds: Apply the configuration template to your security policy engine. Set the trust threshold to 75, enable sandboxing, and configure automated rollback triggers for score drops or runtime violations.
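The last step, together with the hard gates recommended in pitfall 3, can be sketched as a single decision function. The names `decide`, `GateInput`, and `DenyReason` are hypothetical, and the composite score is assumed to have been computed separately.

```typescript
interface GateInput {
  manifestValid: boolean;
  didResolutionSuccess: boolean;
  score: number; // composite trust score, 0 to 100
}

type DenyReason = 'manifest_invalid' | 'did_unresolved' | 'score_below_threshold';
type Decision = { allow: true } | { allow: false; reason: DenyReason };

// Hard gates run first; the score is consulted only when both gates pass.
function decide(input: GateInput, threshold: number = 75): Decision {
  if (!input.manifestValid) return { allow: false, reason: 'manifest_invalid' };
  if (!input.didResolutionSuccess) return { allow: false, reason: 'did_unresolved' };
  if (input.score < threshold) return { allow: false, reason: 'score_below_threshold' };
  return { allow: true };
}

// A score of 80 would pass a score-only check, but the DID gate blocks it.
const unresolved = decide({ manifestValid: true, didResolutionSuccess: false, score: 80 });
// Both gates pass, but the score is under the threshold.
const lowScore = decide({ manifestValid: true, didResolutionSuccess: true, score: 70 });
// Gates pass and the score clears the threshold.
const accepted = decide({ manifestValid: true, didResolutionSuccess: true, score: 90 });
```

Checking the gates before the score means a high score can never compensate for a missing signature or an unresolved publisher identity, and the deny reason can feed whatever rollback trigger the policy engine uses.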