quivalence is detected.
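2. SHA-256 Hashing: Storing full argument objects in the buffer is memory-inefficient for large payloads. Hashing reduces each entry to a fixed 32-byte digest, so a window of 100 calls needs about 3.2 KB of digest data regardless of payload size. Storing the digest as a hex string, as the example below does, takes 64 characters per entry instead.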
3. Linear Scan vs. HashMap: For typical window sizes (10–50 calls), a linear scan over the ring buffer is faster than a HashMap. The small size ensures cache locality, and the overhead of hash map allocation and collision handling outweighs the benefits of O(1) lookup.
4. History Integrity: When a loop is detected, the sentinel must not record the blocked call in the history. Recording it would artificially inflate the count and potentially mask legitimate subsequent calls if the agent retries with modified arguments.
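To make the sizing in point 2 concrete, the sketch below hashes a tool name plus serialized arguments with Node's built-in crypto module and checks the digest sizes. The `fingerprint` helper and the sample payloads are illustrative assumptions, not part of the implementation that follows.

```typescript
import { createHash } from 'crypto';

// Hypothetical helper: returns the raw SHA-256 digest of a tool call.
function fingerprint(toolName: string, canonicalArgs: string): Buffer {
  return createHash('sha256').update(`${toolName}::${canonicalArgs}`).digest();
}

const small = fingerprint('search_web', '{"q":"apple"}');
const large = fingerprint('search_web', JSON.stringify({ q: 'x'.repeat(100000) }));

// A SHA-256 digest is always 32 bytes, regardless of payload size.
console.log(small.length, large.length); // 32 32

// Hex encoding doubles the character count.
console.log(small.toString('hex').length); // 64

// Digest storage for a 100-entry window, in bytes.
const windowBytes = 100 * small.length;
console.log(windowBytes); // 3200
```

Keeping raw digests rather than hex strings is what yields the 32-bytes-per-entry figure; the hex form is easier to log and compare.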
Code Example
The following TypeScript implementation demonstrates the sentinel pattern. This example uses distinct naming and structure from reference implementations to illustrate the architecture.
```typescript
import { createHash } from 'crypto';

interface SentinelConfig {
  threshold: number;
  windowSize: number;
}

interface SentinelEntry {
  hash: string;
  toolName: string;
}

class ToolRecursionSentinel {
  private buffer: SentinelEntry[];
  private head: number = 0;
  private count: number = 0;
  private config: SentinelConfig;

  constructor(config: SentinelConfig) {
    this.config = config;
    this.buffer = new Array<SentinelEntry>(config.windowSize);
  }

  /**
   * Evaluates a tool call against the recursion history.
   * Returns true if the call is safe to proceed.
   * Returns false if the recursion threshold is exceeded.
   */
  evaluate(toolName: string, args: Record<string, unknown>): boolean {
    const canonicalArgs = this.toCanonicalJSON(args);
    const hash = this.computeHash(toolName, canonicalArgs);

    // Check for threshold violation
    let matchCount = 0;
    for (let i = 0; i < this.count; i++) {
      const index = (this.head - 1 - i + this.buffer.length) % this.buffer.length;
      if (this.buffer[index].hash === hash) {
        matchCount++;
        if (matchCount >= this.config.threshold) {
          return false; // Loop detected
        }
      }
    }

    // Record the call
    const entry: SentinelEntry = { hash, toolName };
    this.buffer[this.head] = entry;
    this.head = (this.head + 1) % this.buffer.length;
    if (this.count < this.config.windowSize) {
      this.count++;
    }
    return true;
  }

  /**
   * Resets the sentinel state. Essential for session boundaries.
   */
  reset(): void {
    this.head = 0;
    this.count = 0;
    this.buffer = new Array<SentinelEntry>(this.config.windowSize);
  }

  private toCanonicalJSON(obj: Record<string, unknown>): string {
    const sortedKeys = Object.keys(obj).sort();
    const sortedObj: Record<string, unknown> = {};
    for (const key of sortedKeys) {
      sortedObj[key] = obj[key];
    }
    return JSON.stringify(sortedObj);
  }

  private computeHash(toolName: string, args: string): string {
    const payload = `${toolName}::${args}`;
    return createHash('sha256').update(payload).digest('hex');
  }
}
```
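Note that toCanonicalJSON above sorts only top-level keys, so two calls whose nested objects differ only in key order would produce different hashes. If tool arguments can be nested, a recursive variant may be worth considering. The following is a minimal sketch under that assumption; `toDeepCanonicalJSON` is a hypothetical name, and it relies on the standard `JSON.stringify` replacer being invoked for every nested value.

```typescript
// Hypothetical variant of toCanonicalJSON that sorts keys at every depth.
function toDeepCanonicalJSON(value: unknown): string {
  return JSON.stringify(value, (_key, val) => {
    // Arrays keep their order; only plain objects get their keys sorted.
    if (val !== null && typeof val === 'object' && !Array.isArray(val)) {
      const sorted: Record<string, unknown> = {};
      for (const k of Object.keys(val).sort()) {
        sorted[k] = (val as Record<string, unknown>)[k];
      }
      return sorted;
    }
    return val;
  });
}

const a = toDeepCanonicalJSON({ filter: { lang: 'en', year: 2024 }, q: 'apple' });
const b = toDeepCanonicalJSON({ q: 'apple', filter: { year: 2024, lang: 'en' } });

console.log(a === b); // true
console.log(a); // {"filter":{"lang":"en","year":2024},"q":"apple"}
```

Array order is deliberately preserved, since reordering list arguments usually changes the meaning of a call.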
Integration Pattern
Integrate the sentinel into the agent's dispatch loop. The sentinel must be checked before tool execution.
```typescript
const sentinel = new ToolRecursionSentinel({ threshold: 3, windowSize: 10 });

async function runAgentLoop(agent: Agent) {
  while (agent.isActive()) {
    const action = await agent.planNextStep();

    if (!sentinel.evaluate(action.tool, action.args)) {
      // Loop detected: Inject recovery feedback and break
      await agent.injectFeedback(
        "Warning: Repeated action detected. Switching strategy."
      );
      break;
    }

    const result = await executeTool(action);
    await agent.consumeResult(result);
  }
}
```
Pitfall Guide
1. The Polling Trap
Explanation: Guarding tools that are designed to be called repeatedly, such as get_job_status or poll_webhook, causes false positives. These tools legitimately return identical results until a state change occurs.
Fix: Maintain an allowlist of polling tools that bypass the sentinel, or configure a separate sentinel with a much higher threshold for specific tool categories.
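One way to express the allowlist is a thin wrapper in front of the sentinel. The sketch below is an assumption about how that could look: `LoopSentinel`, `POLLING_TOOLS`, and `guardedEvaluate` are hypothetical names, and the stand-in sentinel rejects everything purely to make the bypass visible.

```typescript
// Minimal shape of a sentinel; any implementation with this method works.
interface LoopSentinel {
  evaluate(toolName: string, args: Record<string, unknown>): boolean;
}

// Hypothetical allowlist; the tool names come from the pitfall description.
const POLLING_TOOLS = new Set<string>(['get_job_status', 'poll_webhook']);

function guardedEvaluate(
  sentinel: LoopSentinel,
  toolName: string,
  args: Record<string, unknown>
): boolean {
  // Allowlisted tools never reach the sentinel, so they are never counted.
  if (POLLING_TOOLS.has(toolName)) {
    return true;
  }
  return sentinel.evaluate(toolName, args);
}

// Stand-in sentinel that rejects every call.
const strict: LoopSentinel = { evaluate: () => false };

console.log(guardedEvaluate(strict, 'get_job_status', { id: 1 })); // true
console.log(guardedEvaluate(strict, 'search_web', { q: 'apple' })); // false
```

A bypassed tool has no loop protection at all, so a separate bound such as a polling timeout is probably still needed for it.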
2. Session Bleed
Explanation: Failing to reset the sentinel between user turns or conversation sessions causes history from previous interactions to trigger false loops.
Fix: Call sentinel.reset() immediately upon receiving a new user message or starting a new session. Ensure the sentinel lifecycle is bound to the session scope.
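Binding the sentinel to the session scope could be sketched as a small registry keyed by session ID. Everything here is hypothetical scaffolding (`SessionSentinels`, `CountingStub`, and the session ID format); the only contract assumed from the article is that a sentinel exposes `evaluate` and `reset`.

```typescript
interface ResettableSentinel {
  evaluate(toolName: string, args: Record<string, unknown>): boolean;
  reset(): void;
}

// Hypothetical registry: one sentinel per session, reset on each user message.
class SessionSentinels<S extends ResettableSentinel> {
  private bySession = new Map<string, S>();
  private factory: () => S;

  constructor(factory: () => S) {
    this.factory = factory;
  }

  // Call when a new user message arrives for the session.
  onUserMessage(sessionId: string): S {
    let sentinel = this.bySession.get(sessionId);
    if (sentinel === undefined) {
      sentinel = this.factory();
      this.bySession.set(sessionId, sentinel);
    } else {
      sentinel.reset();
    }
    return sentinel;
  }

  // Call when the session ends so its history is released.
  endSession(sessionId: string): void {
    this.bySession.delete(sessionId);
  }
}

// Stand-in sentinel that allows two calls and then blocks, for demonstration.
class CountingStub implements ResettableSentinel {
  private calls = 0;
  evaluate(_toolName: string, _args: Record<string, unknown>): boolean {
    this.calls++;
    return this.calls <= 2;
  }
  reset(): void {
    this.calls = 0;
  }
}

const scopes = new SessionSentinels(() => new CountingStub());
const turn1 = scopes.onUserMessage('session-a');
turn1.evaluate('search_web', {});
turn1.evaluate('search_web', {});
const blocked = turn1.evaluate('search_web', {});
console.log(blocked); // false

// The next user message resets the same instance, clearing the history.
const turn2 = scopes.onUserMessage('session-a');
const allowed = turn2.evaluate('search_web', {});
console.log(allowed); // true
```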
3. Semantic Blindness
Explanation: The sentinel detects exact argument matches only. It cannot identify semantically similar calls, such as search("apple") vs search("fruit"). If the agent varies arguments slightly to avoid detection, the sentinel will not block the loop.
Fix: Accept this limitation as a trade-off for performance. For semantic deduplication, implement a result cache with embedding-based similarity upstream of the sentinel. The sentinel handles exact loops; the cache handles semantic redundancy.
4. Thread Contention
Explanation: The sentinel is not thread-safe. In environments with parallel tool dispatch, concurrent calls may corrupt the ring buffer or produce inconsistent counts. In a single-threaded JavaScript runtime the synchronous evaluate call cannot be interleaved, so this concern applies mainly to multi-threaded ports and worker-based dispatch.
Fix: Guard shared access with a lock (in Rust, for example, Arc<Mutex<Sentinel>>), or instantiate a separate sentinel per dispatch thread. For parallel execution, consider a detection strategy that aggregates counts across threads.
5. Aggressive Thresholds
Explanation: Setting threshold: 1 blocks any repeated call, including legitimate retries after transient errors. This can cause the agent to fail on recoverable failures.
Fix: Set threshold >= 3 to allow for retry logic. The threshold should reflect the maximum number of identical calls expected before a loop is suspected.
6. Silent Aborts
Explanation: Breaking the loop without providing feedback to the LLM leaves the agent in an undefined state. The LLM may continue reasoning based on stale context, leading to further errors.
Fix: Always inject a descriptive message into the conversation history when a loop is detected. This informs the LLM of the constraint and encourages it to select an alternative tool or strategy.
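A descriptive message might name the blocked call and state what the model should do instead. The sketch below is one possible shape, not a prescribed format: the `ChatMessage` type, the `system` role, and the wording are all assumptions that would need adapting to the LLM provider in use.

```typescript
// Hypothetical message shape; real provider SDKs define their own.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

function buildLoopFeedback(
  toolName: string,
  args: Record<string, unknown>,
  threshold: number
): ChatMessage {
  return {
    role: 'system',
    content:
      `The call ${toolName}(${JSON.stringify(args)}) was blocked because it already ran ` +
      `${threshold} times with identical arguments in the recent window. ` +
      `Do not repeat it. Change the arguments, pick a different tool, ` +
      `or report what is blocking progress.`,
  };
}

// Append the feedback to the conversation history before the next planning step.
const history: ChatMessage[] = [];
history.push(buildLoopFeedback('search_web', { q: 'apple' }, 3));
console.log(history[0].content);
```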
7. Memory Leaks in History
Explanation: Using an unbounded list instead of a ring buffer causes memory usage to grow linearly with the number of turns. In long-running batch agents, this can lead to OOM errors.
Fix: Ensure the implementation uses a fixed-size ring buffer with O(1) eviction. The history should never hold more than windowSize entries, which bounds digest storage at windowSize * 32 bytes.
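The bounded-history property can be isolated in a small generic ring buffer. This is a minimal sketch, assuming a hypothetical `RingBuffer` class that mirrors the head/count indexing of the sentinel shown earlier; pushing beyond capacity overwrites the oldest slot instead of growing.

```typescript
class RingBuffer<T> {
  private slots: (T | undefined)[];
  private head = 0; // next slot to write
  private size = 0; // number of occupied slots

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError('capacity must be a positive integer');
    }
    this.slots = new Array<T | undefined>(capacity).fill(undefined);
  }

  // O(1): overwrites the oldest entry once the buffer is full.
  push(item: T): void {
    this.slots[this.head] = item;
    this.head = (this.head + 1) % this.slots.length;
    if (this.size < this.slots.length) {
      this.size++;
    }
  }

  // Returns the stored entries, newest first.
  toArray(): T[] {
    const out: T[] = [];
    for (let i = 0; i < this.size; i++) {
      const idx = (this.head - 1 - i + this.slots.length) % this.slots.length;
      out.push(this.slots[idx] as T);
    }
    return out;
  }

  get length(): number {
    return this.size;
  }
}

const ring = new RingBuffer<string>(3);
for (const h of ['a', 'b', 'c', 'd', 'e']) {
  ring.push(h);
}

console.log(ring.length); // 3
console.log(ring.toArray()); // [ 'e', 'd', 'c' ]
```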
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Exact Argument Repetition | Loop Sentinel | Detects identical tool+args patterns with minimal overhead. | Prevents 13x cost inflation. |
| Semantic Redundancy | Result Cache | Caches results for similar queries using embeddings. | Reduces API calls by ~30%. |
| Provider Outage | Circuit Breaker | Stops calls on consistent 5xx errors. | Prevents wasted retries during downtime. |
| Runaway Token Usage | Budget Guard | Halts execution when token/cost limits are reached. | Caps maximum session cost. |
| Long-Running Batch | Per-Item Sentinel | Isolates loops to individual batch items. | Prevents one loop from blocking the entire job. |
Configuration Template
For production systems, per-tool thresholds provide finer control. The following configuration pattern allows dynamic threshold assignment:
```yaml
# agent-sentinel-config.yaml
sentinel:
  default:
    threshold: 3
    window_size: 10
  overrides:
    search_web:
      threshold: 2
      window_size: 5
      reason: "Search should converge quickly; tight limit."
    poll_job_status:
      threshold: 20
      window_size: 50
      reason: "Polling requires repeated calls; loose limit."
    get_user_profile:
      threshold: 5
      window_size: 15
      reason: "Moderate repetition allowed for caching."
```
Implement a wrapper that loads this configuration and routes calls to the appropriate sentinel instance. This pattern addresses the limitation of single-threshold guards and enables granular control over agent behavior.
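A routing wrapper along these lines could look like the sketch below. It assumes the YAML has already been parsed and its snake_case keys mapped to `threshold` and `windowSize`; `SentinelRouter`, `Limits`, and the compact `WindowCounter` stand-in are hypothetical names, with the stand-in used only so the example runs on its own.

```typescript
interface Limits {
  threshold: number;
  windowSize: number;
}

interface SentinelLike {
  evaluate(toolName: string, args: Record<string, unknown>): boolean;
}

// Compact stand-in: counts exact matches among the last windowSize calls.
class WindowCounter implements SentinelLike {
  private recent: string[] = [];
  private limits: Limits;

  constructor(limits: Limits) {
    this.limits = limits;
  }

  evaluate(toolName: string, args: Record<string, unknown>): boolean {
    const key = `${toolName}::${JSON.stringify(args)}`;
    const matches = this.recent.filter((k) => k === key).length;
    if (matches >= this.limits.threshold) {
      return false; // blocked calls are not recorded
    }
    this.recent.push(key);
    if (this.recent.length > this.limits.windowSize) {
      this.recent.shift();
    }
    return true;
  }
}

// Routes each tool to its own sentinel, falling back to the default limits.
class SentinelRouter {
  private perTool = new Map<string, SentinelLike>();
  private fallback: SentinelLike;

  constructor(
    defaults: Limits,
    overrides: Record<string, Limits>,
    make: (limits: Limits) => SentinelLike
  ) {
    this.fallback = make(defaults);
    for (const tool of Object.keys(overrides)) {
      this.perTool.set(tool, make(overrides[tool]));
    }
  }

  evaluate(toolName: string, args: Record<string, unknown>): boolean {
    const sentinel = this.perTool.get(toolName);
    return (sentinel !== undefined ? sentinel : this.fallback).evaluate(toolName, args);
  }
}

const router = new SentinelRouter(
  { threshold: 3, windowSize: 10 },
  {
    search_web: { threshold: 2, windowSize: 5 },
    poll_job_status: { threshold: 20, windowSize: 50 },
  },
  (limits) => new WindowCounter(limits)
);

const searchResults = [1, 2, 3].map(() => router.evaluate('search_web', { q: 'apple' }));
console.log(searchResults); // [ true, true, false ]
```

Giving each overridden tool its own instance also keeps a noisy tool from evicting other tools' history out of a shared window.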
Quick Start Guide
- Add Dependency: Include the sentinel library in your project dependencies.
- Instantiate: Create a sentinel with threshold: 3 and windowSize: 10 as a starting point.
- Wrap Dispatch: Insert sentinel.evaluate(tool, args) before every tool execution.
- Handle Failure: On a false return, inject a recovery message and break the loop.
- Reset: Call sentinel.reset() at the start of each new user session.
By integrating deterministic loop detection, you transform agent reliability from a probabilistic hope into an engineered guarantee. The sentinel acts as a circuit breaker for logic errors, ensuring that agents remain responsive, cost-effective, and recoverable in production environments.