exposes the results via MCP-compatible interfaces.
Architecture Decisions and Rationale
- Sliding Window Telemetry: Trust scores must reflect recent behavior, not historical averages. A 24-hour sliding window with exponential decay ensures that sudden degradation or improvement impacts the score proportionally.
- Tool-Level Granularity: MCP servers often bundle multiple tools. Aggregating metrics at the server level masks broken endpoints. Scoring must operate at the tool level, with a weighted server-level composite.
- Anomaly Detection via Baseline Drift: Absolute thresholds fail in dynamic environments. Instead, the system tracks rolling baselines and flags deviations exceeding two standard deviations, capturing both performance drops and suspicious improvements (e.g., cached garbage responses).
- MCP-Native Exposure: The scoring engine itself operates as an MCP server using Streamable HTTP transport. This allows any MCP-capable agent to query trust data without custom adapters.
- Economic Integration Hook: The beforeSettle mechanism intercepts agent-to-agent payment flows, evaluating trust scores before funds transfer. This prevents settlement with degraded or compromised servers.
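The sliding-window decision can be illustrated with a decay-weighted success rate. This is a minimal sketch rather than the article's implementation: the `decayedSuccessRate` helper and the 6-hour half-life are assumptions chosen for illustration.

```typescript
// Hypothetical helper: weights each interaction by its age using exponential decay.
interface Sample {
  timestamp: number; // epoch milliseconds
  success: boolean;
}

const HALF_LIFE_MS = 6 * 60 * 60 * 1000; // assumed half-life, not taken from the article

function decayedSuccessRate(samples: Sample[], now: number): number {
  let weightedSuccess = 0;
  let totalWeight = 0;
  for (const s of samples) {
    const age = Math.max(0, now - s.timestamp);
    // Weight is 1.0 for a fresh sample and halves with every half-life of age.
    const weight = Math.pow(0.5, age / HALF_LIFE_MS);
    totalWeight += weight;
    if (s.success) weightedSuccess += weight;
  }
  return totalWeight === 0 ? 0 : weightedSuccess / totalWeight;
}
```

With this weighting, a failure seen a minute ago pulls the score down more than a success recorded six hours earlier can lift it, which is the "recent behavior first" property the bullet describes.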
Implementation
// telemetry.types.ts
export interface ToolInteraction {
  serverId: string;
  toolName: string;
  timestamp: number;
  success: boolean;
  latencyMs: number;
  statusCode?: number;
}

export interface TrustScore {
  serverId: string;
  toolScores: Record<string, number>;
  compositeScore: number;
  anomalyFlag: boolean;
  lastUpdated: number;
}
// telemetry.collector.ts
import type { ToolInteraction } from './telemetry.types';

export class InteractionCollector {
  private buffer: ToolInteraction[] = [];
  private readonly WINDOW_MS = 24 * 60 * 60 * 1000; // 24-hour sliding window

  // Append an interaction, then drop everything older than the window.
  ingest(interaction: ToolInteraction): void {
    this.buffer.push(interaction);
    this.prune();
  }

  private prune(): void {
    const cutoff = Date.now() - this.WINDOW_MS;
    this.buffer = this.buffer.filter(i => i.timestamp >= cutoff);
  }

  // Return buffered interactions for a server, optionally narrowed to one tool.
  getRecent(serverId: string, toolName?: string): ToolInteraction[] {
    return this.buffer.filter(i =>
      i.serverId === serverId &&
      (!toolName || i.toolName === toolName)
    );
  }
}
// scoring.engine.ts
import type { ToolInteraction, TrustScore } from './telemetry.types';

export class ReputationEngine {
  // Smoothed latency baseline (mean and standard deviation, in ms) per server.
  private baselines: Map<string, { mean: number; stdDev: number }> = new Map();

  calculateScore(interactions: ToolInteraction[]): TrustScore {
    const serverId = interactions[0]?.serverId ?? 'unknown';
    const toolGroups = this.groupByTool(interactions);
    const toolScores: Record<string, number> = {};
    for (const [tool, records] of Object.entries(toolGroups)) {
      toolScores[tool] = this.computeToolScore(records);
    }
    const composite = this.weightedAverage(Object.values(toolScores));
    const anomaly = this.detectAnomaly(serverId, interactions);
    return {
      serverId,
      toolScores,
      compositeScore: Math.round(composite * 100) / 100,
      anomalyFlag: anomaly,
      lastUpdated: Date.now()
    };
  }

  // Success rate scaled down by a latency penalty that saturates at 2 s.
  private computeToolScore(records: ToolInteraction[]): number {
    if (records.length === 0) return 0;
    const successRate = records.filter(r => r.success).length / records.length;
    const avgLatency = records.reduce((sum, r) => sum + r.latencyMs, 0) / records.length;
    const latencyPenalty = Math.min(avgLatency / 2000, 1); // 2s threshold
    return Math.max(0, successRate * (1 - latencyPenalty));
  }

  // Flags the window when its mean latency drifts more than 2 standard deviations from the baseline.
  private detectAnomaly(serverId: string, interactions: ToolInteraction[]): boolean {
    if (interactions.length === 0) return false; // no data: avoid a NaN mean
    const recentLatencies = interactions.map(i => i.latencyMs);
    const mean = recentLatencies.reduce((a, b) => a + b, 0) / recentLatencies.length;
    const variance =
      recentLatencies.reduce((a, b) => a + (b - mean) * (b - mean), 0) / recentLatencies.length;
    const stdDev = Math.sqrt(variance);
    const baseline = this.baselines.get(serverId);
    if (!baseline) {
      // First observation seeds the baseline and is never flagged.
      this.baselines.set(serverId, { mean, stdDev });
      return false;
    }
    const deviation = Math.abs(mean - baseline.mean);
    const threshold = baseline.stdDev * 2 || 500; // 500 ms fallback when the baseline has no spread
    this.updateBaseline(serverId, mean, stdDev);
    return deviation > threshold;
  }

  // Exponential smoothing (factor 0.1) of both the baseline mean and its standard deviation.
  private updateBaseline(serverId: string, newMean: number, newStdDev: number): void {
    const current = this.baselines.get(serverId)!;
    current.mean = current.mean * 0.9 + newMean * 0.1;
    current.stdDev = current.stdDev * 0.9 + newStdDev * 0.1;
  }

  private groupByTool(records: ToolInteraction[]): Record<string, ToolInteraction[]> {
    return records.reduce((acc, r) => {
      acc[r.toolName] = acc[r.toolName] || [];
      acc[r.toolName].push(r);
      return acc;
    }, {} as Record<string, ToolInteraction[]>);
  }

  // Currently an unweighted mean: every tool counts equally. Returns 0 when no tools were scored.
  private weightedAverage(scores: number[]): number {
    if (scores.length === 0) return 0;
    return scores.reduce((sum, s) => sum + s, 0) / scores.length;
  }
}
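// mcp.bridge.ts
import type { ToolInteraction, TrustScore } from './telemetry.types';
import { InteractionCollector } from './telemetry.collector';
import { ReputationEngine } from './scoring.engine';

export class TrustMCPBridge {
  constructor(
    private collector: InteractionCollector,
    private engine: ReputationEngine
  ) {}

  // Score a server, or a single tool on it when toolName is given.
  async evaluateServerReputation(serverId: string, toolName?: string): Promise<TrustScore> {
    const interactions = this.collector.getRecent(serverId, toolName);
    return this.engine.calculateScore(interactions);
  }

  async flagRuntimeAnomalies(serverId: string): Promise<boolean> {
    const interactions = this.collector.getRecent(serverId);
    // Bracket access reaches the engine's private detectAnomaly; each call also updates the stored baseline.
    return this.engine['detectAnomaly'](serverId, interactions);
  }

  async submitInteractionLog(interaction: ToolInteraction): Promise<void> {
    this.collector.ingest(interaction);
  }
}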
The architecture separates concerns cleanly: ingestion handles buffering and pruning, calculation manages statistical baselines and scoring logic, and the bridge exposes MCP-compatible endpoints. Exponential smoothing keeps the anomaly baseline stable while still letting it track genuine shifts. Tool-level scoring ensures that a single broken endpoint doesn't artificially inflate or deflate the entire server's reputation.
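One way to make the server-level composite genuinely weighted is to weight each tool by its call volume, so that rarely used tools do not dominate the result. A minimal sketch, assuming per-tool call counts are available; `volumeWeightedComposite` and the `ToolStat` shape are hypothetical names introduced here.

```typescript
interface ToolStat {
  score: number; // tool score in [0, 1]
  calls: number; // interactions observed in the window
}

// Hypothetical helper: average of tool scores weighted by how often each tool was called.
function volumeWeightedComposite(stats: Record<string, ToolStat>): number {
  let weighted = 0;
  let totalCalls = 0;
  for (const tool of Object.keys(stats)) {
    weighted += stats[tool].score * stats[tool].calls;
    totalCalls += stats[tool].calls;
  }
  return totalCalls === 0 ? 0 : weighted / totalCalls;
}
```

Volume weighting is only one option. Weighting by business criticality would be equally defensible, and the per-tool scores should stay queryable either way.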
Pitfall Guide
1. Server-Level Aggregation Masking Tool Failures
Explanation: Calculating trust scores at the server level averages out performance across all exposed tools. A server with four healthy tools and one consistently failing endpoint will still report a moderate score, causing agents to route requests to the broken tool.
Fix: Implement tool-level scoring with explicit tool routing validation. Require agents to query toolScores directly before invocation, and fail fast if a specific tool's score drops below the operational threshold.
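The fail-fast check described in this fix might look like the sketch below. The `isToolUsable` helper and the choice to reject unscored tools are assumptions for illustration; the `ToolTrust` shape mirrors the relevant fields of `TrustScore`.

```typescript
// Mirrors the TrustScore fields that this check needs.
interface ToolTrust {
  toolScores: Record<string, number>;
  compositeScore: number;
}

// Hypothetical guard: consult the tool's own score, never the server composite.
function isToolUsable(trust: ToolTrust, toolName: string, minScore: number): boolean {
  const score = trust.toolScores[toolName];
  // An unscored tool is treated as unusable instead of inheriting the composite.
  if (score === undefined) return false;
  return score >= minScore;
}
```

A server with a healthy composite can still fail this check for one tool, which is exactly the case that server-level aggregation hides.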
2. Static Thresholds in Dynamic Environments
Explanation: Hardcoding trust score cutoffs (e.g., score < 0.7 = reject) ignores context. A financial transaction tool requires stricter thresholds than a logging utility. Static thresholds cause false rejections during legitimate traffic spikes or maintenance windows.
Fix: Implement context-aware thresholds. Allow agents to pass risk profiles (critical, standard, best-effort) that dynamically adjust acceptance criteria. Combine score thresholds with anomaly flags for compound decision logic.
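A compact way to express risk profiles is a lookup table that combines the score cutoff with the anomaly rule. This is a sketch under assumptions: the cutoffs echo the configuration template in this article, the override value for best-effort is assumed, and `shouldAccept` is a hypothetical name.

```typescript
type RiskProfile = 'critical' | 'standard' | 'best-effort';

interface GatePolicy {
  minCompositeScore: number;
  allowAnomalyOverride: boolean; // whether a call may proceed despite an anomaly flag
}

const POLICIES: Record<RiskProfile, GatePolicy> = {
  critical: { minCompositeScore: 0.85, allowAnomalyOverride: false },
  standard: { minCompositeScore: 0.7, allowAnomalyOverride: true },
  'best-effort': { minCompositeScore: 0.5, allowAnomalyOverride: true }, // override value assumed
};

// Compound decision: the score must clear the cutoff, and an anomaly blocks strict profiles.
function shouldAccept(compositeScore: number, anomalyFlag: boolean, profile: RiskProfile): boolean {
  const policy = POLICIES[profile];
  if (compositeScore < policy.minCompositeScore) return false;
  if (anomalyFlag && !policy.allowAnomalyOverride) return false;
  return true;
}
```

The same score can therefore be acceptable for a logging call and rejected for a payment, without hardcoding a single global threshold.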
3. Ignoring Temporal Availability Patterns
Explanation: Many MCP servers exhibit geographic or time-zone-dependent availability. A server that performs well during US business hours may drop requests during Asian or European peak times. Ignoring temporal patterns leads to unpredictable agent failures.
Fix: Incorporate time-bucketed telemetry. Track success rates and latency across rolling 4-hour windows. Flag servers with >30% variance between time buckets and route agents to regionally optimal endpoints when available.
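Time-bucketed telemetry could be sketched as follows. Two simplifications are assumed: fixed 4-hour UTC buckets stand in for rolling windows, and ">30% variance" is read as a spread of more than 0.30 between the best and worst bucket. Both function names are hypothetical.

```typescript
interface TimedResult {
  timestamp: number; // epoch milliseconds
  success: boolean;
}

const BUCKET_HOURS = 4;

// Success rate per 4-hour UTC bucket (index 0 covers 00:00-03:59); null marks a bucket without data.
function successRateByBucket(results: TimedResult[]): (number | null)[] {
  const bucketCount = 24 / BUCKET_HOURS;
  const ok: number[] = [];
  const total: number[] = [];
  for (let i = 0; i < bucketCount; i++) {
    ok.push(0);
    total.push(0);
  }
  for (const r of results) {
    const bucket = Math.floor(new Date(r.timestamp).getUTCHours() / BUCKET_HOURS);
    total[bucket] += 1;
    if (r.success) ok[bucket] += 1;
  }
  return total.map((n, i) => (n === 0 ? null : ok[i] / n));
}

// True when the gap between the best and worst observed bucket exceeds maxSpread.
function hasTemporalVariance(rates: (number | null)[], maxSpread: number = 0.3): boolean {
  const observed = rates.filter((r): r is number => r !== null);
  if (observed.length < 2) return false;
  return Math.max(...observed) - Math.min(...observed) > maxSpread;
}
```

Buckets with no traffic are excluded from the comparison so that quiet hours are not mistaken for outages.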
4. Anomaly False Positives from Baseline Drift
Explanation: Sudden infrastructure upgrades or dependency updates can shift latency baselines legitimately. If the anomaly detector reacts too aggressively, it will flag healthy servers as compromised, causing unnecessary routing changes.
Fix: Use dual-threshold anomaly detection. Require both statistical deviation (>2σ) and sustained duration (e.g., 3 consecutive measurement cycles) before raising an anomaly flag. Implement a grace period for newly deployed servers to establish baselines.
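The dual-threshold rule can be sketched as a small stateful detector that only raises the flag after several consecutive deviating cycles. The class name and method signature are hypothetical, the defaults (2σ, 3 cycles) come from the fix above, and the grace period for new servers is left out for brevity.

```typescript
// Hypothetical detector: requires a >2σ deviation sustained over consecutive cycles.
class SustainedAnomalyDetector {
  private consecutive = 0;
  private readonly sigmaThreshold: number;
  private readonly requiredCycles: number;

  constructor(sigmaThreshold: number = 2, requiredCycles: number = 3) {
    this.sigmaThreshold = sigmaThreshold;
    this.requiredCycles = requiredCycles;
  }

  // Call once per measurement cycle with that cycle's mean latency and the current baseline.
  observe(cycleMean: number, baselineMean: number, baselineStdDev: number): boolean {
    const deviates =
      baselineStdDev > 0 &&
      Math.abs(cycleMean - baselineMean) > this.sigmaThreshold * baselineStdDev;
    // Any in-range cycle resets the streak, so isolated spikes never raise the flag.
    this.consecutive = deviates ? this.consecutive + 1 : 0;
    return this.consecutive >= this.requiredCycles;
  }
}
```

A one-off latency spike after a deployment is absorbed, while a shift that persists for three cycles is still reported.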
5. Telemetry Data Poisoning
Explanation: Malicious or misconfigured agents can flood the scoring engine with falsified interaction logs, artificially inflating or deflating trust scores. Without validation, the reputation system becomes a single point of manipulation.
Fix: Implement weighted contribution scoring. New or low-reputation agents contribute less to the global score. Require cryptographic signatures for interaction logs and validate against known agent identities. Apply outlier rejection algorithms before baseline updates.
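Weighted contribution scoring might be sketched as below. How a reporter's own reputation is derived is outside this example, the 0.1 floor is an assumed value, and signature verification is not shown; `weightedSuccessRate` is a hypothetical helper.

```typescript
interface Report {
  reporterReputation: number; // reporter's own trust in [0, 1]; its derivation is out of scope here
  success: boolean;
}

// Success rate in which each report counts in proportion to its reporter's reputation.
function weightedSuccessRate(reports: Report[], minReputation: number = 0.1): number {
  let weightedOk = 0;
  let totalWeight = 0;
  for (const r of reports) {
    if (r.reporterReputation < minReputation) continue; // ignore reporters below the floor
    totalWeight += r.reporterReputation;
    if (r.success) weightedOk += r.reporterReputation;
  }
  return totalWeight === 0 ? 0 : weightedOk / totalWeight;
}
```

Under this scheme a burst of reports from unknown agents cannot move the score on its own, which limits the payoff of flooding the engine with falsified logs.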
6. Blocking Critical Paths Without Fallbacks
Explanation: Strict trust gating can halt agent workflows entirely when all available servers fall below thresholds. In production, this creates cascading failures rather than graceful degradation.
Fix: Design fallback routing chains. When primary servers fail trust checks, automatically query secondary providers or cached responses. Implement circuit breaker patterns that temporarily bypass trust checks for idempotent operations during ecosystem-wide degradation events.
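A fallback chain can be as simple as walking candidates in priority order and degrading to a cached response when none passes. A minimal sketch; the `Route` type and `selectRoute` function are hypothetical, and the circuit-breaker bypass for idempotent operations is not covered.

```typescript
interface Candidate {
  serverId: string;
  compositeScore: number;
}

type Route =
  | { kind: 'server'; serverId: string }
  | { kind: 'cache' }
  | { kind: 'unavailable' };

// Picks the first candidate (in priority order) that meets the threshold, then falls back to cache.
function selectRoute(candidates: Candidate[], minScore: number, cacheAvailable: boolean): Route {
  for (const c of candidates) {
    if (c.compositeScore >= minScore) return { kind: 'server', serverId: c.serverId };
  }
  return cacheAvailable ? { kind: 'cache' } : { kind: 'unavailable' };
}
```

Returning an explicit `unavailable` result lets the caller decide how to degrade instead of blocking the workflow on a failed trust check.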
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| High-frequency tool calls (>100/min) | Tool-level scoring with 5-minute aggregation windows | Prevents latency bottlenecks while maintaining endpoint accuracy | Low compute overhead, higher storage |
| Financial settlement flows | Strict trust gating + anomaly flag validation | Prevents economic loss from degraded or compromised servers | Higher latency (~50ms), reduced fraud risk |
| Development/testing environments | Relaxed thresholds + baseline grace periods | Allows rapid iteration without false trust rejections | Minimal infrastructure cost |
| Multi-region agent deployments | Time-bucketed telemetry + regional routing | Accounts for geographic availability differences | Moderate increase in telemetry processing |
| Legacy MCP servers with no telemetry | Fallback to static analysis + conservative scoring | Bridges gap until behavioral data accumulates | Higher operational risk initially |
Configuration Template
# trust-engine.config.yaml
telemetry:
  window_hours: 24
  prune_interval_minutes: 15
  max_buffer_size: 100000
scoring:
  granularity: tool_level
  latency_threshold_ms: 2000
  success_rate_weight: 0.7
  latency_weight: 0.3
  baseline_smoothing_factor: 0.1
anomaly_detection:
  std_dev_threshold: 2.0
  sustained_cycles: 3
  grace_period_hours: 48
gating:
  critical_operations:
    min_composite_score: 0.85
    allow_anomaly_override: false
  standard_operations:
    min_composite_score: 0.70
    allow_anomaly_override: true
  best_effort:
    min_composite_score: 0.50
    fallback_enabled: true
mcp_transport:
  protocol: streamable_http
  endpoint: /mcp/trust
  max_concurrent_queries: 500
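The `success_rate_weight` and `latency_weight` keys suggest a linear blend, whereas `computeToolScore` in the implementation multiplies success rate by the latency factor. The sketch below shows how the weights could be applied under the linear interpretation; that reading, along with the `ScoringConfig` shape and `linearToolScore` name, is an assumption.

```typescript
// Hypothetical in-memory form of the scoring section of the YAML template.
interface ScoringConfig {
  latencyThresholdMs: number;
  successRateWeight: number;
  latencyWeight: number;
}

const scoring: ScoringConfig = {
  latencyThresholdMs: 2000,
  successRateWeight: 0.7,
  latencyWeight: 0.3,
};

// Linear blend: latency contributes a 0..1 score that reaches 0 at the threshold.
function linearToolScore(successRate: number, avgLatencyMs: number, cfg: ScoringConfig): number {
  const latencyScore = 1 - Math.min(avgLatencyMs / cfg.latencyThresholdMs, 1);
  return cfg.successRateWeight * successRate + cfg.latencyWeight * latencyScore;
}
```

The two forms behave differently at the extremes: the multiplicative score drops to 0 once latency reaches the threshold, while the linear blend still credits a fully reliable tool with 0.7.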
Quick Start Guide
- Initialize the telemetry collector: Deploy the InteractionCollector alongside your MCP client runtime. Hook into every tool invocation to capture latency, success state, and server/tool identifiers.
- Configure scoring parameters: Adjust the YAML template to match your operational risk tolerance. Set stricter thresholds for financial or infrastructure-modifying tools, and relaxed limits for read-only or logging endpoints.
- Expose the trust bridge: Register the TrustMCPBridge as an MCP server using Streamable HTTP. Ensure your agent framework can route trust queries to this endpoint before executing tool calls.
- Integrate settlement gating: Wrap agent-to-agent payment flows with a beforeSettle evaluation. Query the trust score, validate anomaly flags, and conditionally proceed or route to fallback providers.
- Monitor and iterate: Track false positive rates and threshold adjustments over a 7-day period. Tune baseline smoothing and anomaly detection parameters based on your specific server ecosystem behavior.
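The settlement-gating step could be sketched as a pure decision function evaluated just before funds move. The article names beforeSettle but does not define its signature, so the parameters, the 0.85 default (taken from the critical_operations cutoff in the template), and the fail-closed behavior here are assumptions.

```typescript
interface SettlementTrust {
  compositeScore: number;
  anomalyFlag: boolean;
}

type SettleDecision = 'proceed' | 'fallback';

// Hypothetical signature: decides whether a payment may settle with the scored server.
function beforeSettle(trust: SettlementTrust | null, minScore: number = 0.85): SettleDecision {
  if (trust === null) return 'fallback'; // fail closed when no trust data is available
  if (trust.anomalyFlag) return 'fallback'; // settlement flows do not override anomalies
  return trust.compositeScore >= minScore ? 'proceed' : 'fallback';
}
```

In practice the caller would first fetch the score through the trust bridge and pass the result in, routing to a fallback provider whenever the decision is not `proceed`.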