m 'lru-cache';
type CacheTier = 'human_chat' | 'bot_trial' | 'bot_paid';
interface CacheConfig {
ttlMs: number;
maxEntries: number;
}
const TIER_CONFIG: Record<CacheTier, CacheConfig> = {
human_chat: { ttlMs: 1800_000, maxEntries: 5000 },
bot_trial: { ttlMs: 60_000, maxEntries: 10000 },
bot_paid: { ttlMs: 60_000, maxEntries: 10000 }
};
class DualTierCache {
private stores: Record<CacheTier, LRUCache<string, unknown>>;
constructor() {
this.stores = {
human_chat: new LRUCache(TIER_CONFIG.human_chat),
bot_trial: new LRUCache(TIER_CONFIG.bot_trial),
bot_paid: new LRUCache(TIER_CONFIG.bot_paid)
};
}
get(tier: CacheTier, key: string): unknown | undefined {
return this.stores[tier].get(${tier}:${key});
}
set(tier: CacheTier, key: string, value: unknown): void {
this.stores[tier].set(${tier}:${key}, value);
}
has(tier: CacheTier, key: string): boolean {
return this.stores[tier].has(${tier}:${key});
}
}
**Architecture Rationale:**
- Separate `LRUCache` instances per tier prevent eviction collisions. A burst of bot requests won't flush human chat snapshots.
- Cache keys are prefixed with the tier identifier. This guarantees that a `human_chat` response never satisfies a `bot_paid` lookup, eliminating stale-data leakage.
- TTLs are enforced at the cache layer, not the application layer. This reduces downstream compute and ensures consistent SLA delivery.
### Step 2: Stateful Trial-to-Payment Funnel
Autonomous agents require frictionless onboarding. Requiring wallet signatures or smart contract interactions for trial access creates unnecessary drop-off. Instead, track usage per caller identifier and trigger micro-payments only after quota exhaustion.
```typescript
interface CallerQuota {
used: number;
limit: number;
tier: 'trial' | 'paid';
}
class QuotaManager {
private registry = new Map<string, CallerQuota>();
private readonly TRIAL_LIMIT = 100;
private readonly PAID_FEE = 0.01; // FET
checkAccess(callerId: string): { allowed: boolean; fee: number; tier: CacheTier } {
let record = this.registry.get(callerId);
if (!record) {
record = { used: 0, limit: this.TRIAL_LIMIT, tier: 'trial' };
this.registry.set(callerId, record);
}
if (record.used < record.limit) {
record.used++;
return { allowed: true, fee: 0, tier: 'bot_trial' };
}
// Trial exhausted, enforce micro-payment
return { allowed: true, fee: this.PAID_FEE, tier: 'bot_paid' };
}
}
Architecture Rationale:
- Trial enforcement lives in the agent runtime storage, not on-chain. This eliminates signature friction while maintaining accurate per-caller accounting.
- The quota manager returns the appropriate cache tier alongside the fee requirement. This tightly couples access control with cache routing.
- Conversion logging (
trial_exhausted) should be emitted when the threshold is crossed, enabling product teams to track A2A monetization funnels rather than traditional pageview metrics.
Step 3: Machine-Readable Discovery & Sync Fallback
Agents do not parse HTML marketing pages. They consume structured manifests. Expose capabilities through llm.txt for crawler indexing, Model Context Protocol (MCP) endpoints for tool enumeration, and agent.json for capability contracts. Additionally, document a direct synchronous HTTP fallback to bypass resolver dependencies in constrained environments.
// MCP Tool Manifest (simplified)
const mcpManifest = {
name: "defi_market_brief",
description: "Returns risk-scored liquidity pool analysis",
inputSchema: {
type: "object",
properties: {
poolId: { type: "string" },
window: { type: "string", enum: ["1h", "24h", "7d"] }
},
required: ["poolId"]
}
};
// Direct sync fallback route
app.post('/submit', async (req, res) => {
const connectionHeader = req.headers['x-uagents-connection'];
if (connectionHeader === 'sync') {
// Bypass mailbox/Almanac, return structured envelope immediately
const payload = await processAgentRequest(req.body);
res.json(payload);
}
});
Architecture Rationale:
- MCP tools over Streamable HTTP allow LLM orchestrators to dynamically discover and invoke capabilities without hardcoded endpoints.
- The
/submit sync route eliminates dependency on distributed mailboxes or Brotli-compressed resolvers, which frequently timeout in CI pipelines or Windows environments.
robots.txt should explicitly permit /llm.txt, /llms-full.txt, /mcp, and /agent.json to ensure crawler compliance without scraping restrictions.
Step 4: Verifiable Response Schema
Bots validate responses through schema contracts, not marketing claims. Every response must include metadata that allows the consumer to verify tier compliance, freshness, and data provenance.
interface MarketBriefEnvelope {
status: 'ok' | 'error';
schemaVersion: string;
verdict: 'SAFE' | 'CAUTION' | 'AVOID';
riskScore: number;
flags: string[];
evidence: {
citations: string[];
notes: string;
};
meta: {
consumerTier: CacheTier;
ttlSeconds: number;
cacheHit: boolean;
generatedAt: string;
};
}
Architecture Rationale:
meta.consumerTier and meta.ttlSeconds enable automated verification that the delivered freshness matches the paid tier.
schemaVersion allows backward-compatible evolution without breaking downstream parsers.
- Publishing a public backtest dashboard (e.g., ~70% proxy accuracy on historical DLMM data) provides the empirical trust surface that autonomous systems require before integrating paid endpoints.
Pitfall Guide
1. Shared Cache Keys Across Tiers
Explanation: Using a single cache namespace for both human and bot requests causes tier collision. A human chat request may populate the cache, causing a bot to receive 30-minute stale data despite paying for 60-second freshness.
Fix: Namespace all cache keys with the consumer tier prefix. Maintain separate LRU instances per tier to prevent eviction interference.
2. Over-Engineering Trial Payments
Explanation: Requiring smart contract interactions or wallet signatures for trial access creates friction that kills adoption. Bots will route to competitors with frictionless onboarding.
Fix: Track trial usage in agent runtime storage per caller ID. Enforce quota limits programmatically and trigger on-chain micro-payments only after exhaustion.
3. Ignoring Machine-Readable Discovery
Explanation: Publishing OpenAPI specs behind login walls or relying on SEO leaves agents unable to discover capabilities. Bots parse llm.txt, MCP manifests, and agent.jsonβnot landing pages.
Fix: Serve structured discovery files at root paths. Explicitly allow crawler access in robots.txt. Keep MCP tool definitions synchronized with actual endpoint behavior.
4. Assuming Resolver Reliability
Explanation: Relying exclusively on distributed mailboxes or Almanac-style resolvers fails in CI environments, Windows hosts, or high-latency networks due to Brotli compression timeouts and DNS resolution delays.
Fix: Document and maintain a direct synchronous HTTP fallback (/submit with x-uagents-connection: sync). This bypasses mailbox routing and guarantees deterministic response delivery.
5. Marketing-First Trust Signals
Explanation: Bots do not trust adjectives, brand logos, or UI polish. They require verifiable metrics, reproducible backtests, and explicit schema contracts before committing compute or capital.
Fix: Publish historical accuracy scores, sample response envelopes, and open trial quotas. Include meta.ttlSeconds and meta.cacheHit in every response for automated SLA verification.
6. JWT-Based Tier Routing
Explanation: Routing cache behavior based on JWT claims or user roles assumes human authentication flows. Headless agents often operate without traditional session management, causing routing failures or misaligned cache hits.
Fix: Use explicit request headers (X-Consumer-Type) or protocol-level metadata to determine cache tier. Decouple authentication from cache routing logic.
7. Missing Conversion Telemetry
Explanation: Tracking pageviews or API call volume misses the actual monetization funnel. The critical metric for A2A products is trial-to-paid conversion, not raw traffic.
Fix: Emit structured events when trial quotas are exhausted and when the first paid micro-transaction completes. Feed these into product analytics to optimize pricing and quota limits.
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|
| High-volume bot automation with strict freshness | Dual-tier cache with 60s TTL + direct sync fallback | Prevents stale data from breaking downstream logic; avoids resolver timeouts | Higher compute for frequent refreshes, but prevents economic loss from stale signals |
| Human-facing chat interface | Single-tier cache with 30m TTL + standard auth | Reduces backend load; humans tolerate delayed updates | Lower compute cost; optimal for conversational latency |
| Early-stage A2A product launch | Frictionless trial (100 calls) + on-chain micro-payment after | Maximizes adoption; removes wallet/signature friction during validation | Minimal upfront cost; conversion tracking drives pricing iteration |
| Enterprise bot integration | MCP tool discovery + signed response envelopes | Enables dynamic capability enumeration; provides auditability | Requires MCP server maintenance; higher trust justifies premium pricing |
Configuration Template
# cache-router.config.yaml
tiers:
human_chat:
ttl_seconds: 1800
max_entries: 5000
eviction_policy: lru
bot_trial:
ttl_seconds: 60
max_entries: 10000
eviction_policy: lru
bot_paid:
ttl_seconds: 60
max_entries: 10000
eviction_policy: lru
discovery:
llm_txt: /llm.txt
llms_full_txt: /llms-full.txt
agent_json: /agent.json
mcp_endpoint: /mcp
robots_allow: [/llm.txt, /llms-full.txt, /mcp, /agent.json]
commerce:
trial_limit: 100
paid_fee_fet: 0.01
trial_storage: runtime_agent_context
conversion_event: trial_exhausted_first_paid
fallback:
sync_route: /submit
sync_header: x-uagents-connection
bypass_mailbox: true
Quick Start Guide
- Initialize the dual-tier cache: Deploy separate LRU instances with tier-prefixed keys. Configure TTLs to 1800s for human chat and 60s for bot tiers.
- Wire the quota manager: Track caller IDs in agent runtime storage. Allow 100 free requests, then enforce micro-payment routing for subsequent calls.
- Expose machine discovery: Serve
llm.txt, agent.json, and MCP manifests at root paths. Update robots.txt to permit crawler indexing.
- Add sync fallback: Implement
/submit with x-uagents-connection: sync header support to bypass mailbox routing in constrained environments.
- Validate with schema contract: Ensure every response includes
meta.consumerTier, meta.ttlSeconds, and meta.cacheHit. Publish backtest accuracy metrics to establish trust before enabling paid routing.