Step 2: Implement the Execution Orchestrator
The orchestrator manages the tool registry, validates inputs against contracts, executes handlers, and normalizes responses into a consistent observation format. This layer isolates the LLM from infrastructure volatility.
```typescript
class ExecutionOrchestrator {
  private registry: Map<string, ToolContract> = new Map();
  private handlers: Map<string, ToolHandler> = new Map();

  // Register a tool's contract together with the handler that implements it.
  register(contract: ToolContract, handler: ToolHandler): void {
    this.registry.set(contract.name, contract);
    this.handlers.set(contract.name, handler);
  }

  // Look up the tool, validate the arguments, run the handler, and wrap the
  // outcome in an Observation. Handler failures are returned, not rethrown.
  async execute(toolName: string, args: Record<string, unknown>): Promise<Observation> {
    const contract = this.registry.get(toolName);
    if (!contract) {
      return { status: 'error', payload: `Unknown tool: ${toolName}` };
    }

    const validation = this.validateArgs(contract, args);
    if (!validation.valid) {
      return { status: 'error', payload: validation.reason };
    }

    try {
      const result = await this.handlers.get(toolName)!(args);
      // JSON.stringify(undefined) returns undefined, so map a missing result to null.
      return { status: 'success', payload: JSON.stringify(result ?? null) };
    } catch (err) {
      // Handlers may throw non-Error values, so do not assume a .message property.
      const message = err instanceof Error ? err.message : String(err);
      return { status: 'error', payload: `Execution failed: ${message}` };
    }
  }

  // Checks presence and primitive type only. spec.type is compared against
  // JavaScript typeof names ('string', 'number', 'boolean', 'object').
  private validateArgs(contract: ToolContract, args: Record<string, unknown>): ValidationResult {
    for (const [key, spec] of Object.entries(contract.parameters)) {
      if (spec.required && !(key in args)) {
        return { valid: false, reason: `Missing required parameter: ${key}` };
      }
      if (key in args && typeof args[key] !== spec.type) {
        return { valid: false, reason: `Type mismatch for ${key}: expected ${spec.type}` };
      }
    }
    return { valid: true, reason: '' };
  }
}

type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;
type Observation = { status: 'success' | 'error'; payload: string };
type ValidationResult = { valid: boolean; reason: string };
```
Step 3: Construct the Control Loop with Hard Caps
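The loop manages context, enforces iteration limits, and surfaces observations back to the model. Context management should follow a sliding window strategy: retain the initial system instructions and the most recent exchanges, and compress or truncate the middle turns to prevent context window exhaustion. To keep the control flow readable, the listing below appends every turn unpruned; pruning would be applied to the message list before each model call.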
```typescript
class AgentControlLoop {
  private maxIterations: number;
  private contextWindow: Message[] = [];
  private telemetry: TraceLogger;

  constructor(maxIterations: number = 10) {
    this.maxIterations = maxIterations;
    this.telemetry = new TraceLogger();
  }

  async run(initialPrompt: string, orchestrator: ExecutionOrchestrator): Promise<string> {
    this.contextWindow.push({ role: 'user', content: initialPrompt });
    let iteration = 0;

    while (iteration < this.maxIterations) {
      iteration++;
      const modelResponse = await this.queryModel(this.contextWindow);
      this.telemetry.log({ iteration, phase: 'model_response', tokens: modelResponse.usage });

      if (modelResponse.tool_call) {
        const observation = await orchestrator.execute(
          modelResponse.tool_call.name,
          modelResponse.tool_call.arguments
        );
        // Record the call and its observation so the next model turn sees both.
        this.contextWindow.push({
          role: 'assistant',
          content: JSON.stringify(modelResponse.tool_call)
        });
        this.contextWindow.push({
          role: 'tool',
          content: observation.payload
        });
        this.telemetry.log({ iteration, phase: 'tool_execution', status: observation.status });
      } else {
        // No tool call means the model has produced its final answer.
        this.telemetry.log({ iteration, phase: 'final_response', completed: true });
        return modelResponse.content;
      }
    }

    // Hard cap reached: stop and report instead of looping further.
    this.telemetry.log({ iteration, phase: 'loop_terminated', reason: 'max_iterations_reached' });
    return 'Task exceeded maximum iteration limit. Please refine your request.';
  }

  private async queryModel(messages: Message[]): Promise<ModelResponse> {
    // Integration with target LLM API
    throw new Error('Model integration placeholder');
  }
}

interface Message { role: 'user' | 'assistant' | 'tool' | 'system'; content: string; }
interface ModelResponse { content: string; tool_call?: { name: string; arguments: Record<string, unknown> }; usage: number; }
```
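The sliding window strategy described for the control loop could be sketched as a small pure function. This is only an illustration under assumptions: `pruneContext` and the omission marker are hypothetical names, not part of the loop above, and the middle turns are simply dropped rather than summarized. A production version would also need to avoid cutting between a tool call and its observation.

```typescript
// Mirrors the Message interface used by the control loop.
interface Message { role: 'user' | 'assistant' | 'tool' | 'system'; content: string; }

// Hypothetical helper: keep the first `head` and last `tail` messages and
// replace everything in between with one marker message.
function pruneContext(messages: Message[], head: number, tail: number): Message[] {
  if (messages.length <= head + tail) {
    return messages.slice(); // nothing to drop; return a copy
  }
  const omitted = messages.length - head - tail;
  const marker: Message = {
    role: 'system',
    content: `[${omitted} earlier messages omitted]`
  };
  return [
    ...messages.slice(0, head),
    marker,
    ...messages.slice(messages.length - tail)
  ];
}
```

Called as `pruneContext(history, 1, 4)` before each model query, this keeps the opening message plus the four most recent ones, which matches the head/tail retention idea without changing the stored history.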
Architecture Rationale
- Tool-First Design: Reduces model hallucination by constraining the action space. The LLM selects from a known, validated set of operations rather than inventing workflows.
- Progressive Disclosure: Prevents prompt bloat. Heavy execution guidelines load only when a tool is selected, preserving context window capacity for reasoning.
- Hard Iteration Cap: Protects against budget blowouts and infinite loops. Ten iterations is a practical default; production tuning should derive from trace analysis.
- Structured Observations: Standardizing tool outputs into success/error payloads ensures the model receives consistent feedback, enabling deterministic recovery paths.
- Telemetry Integration: Trace logging from day one transforms agent development from guesswork into data-driven iteration. Every tool call, latency spike, and error state must be recorded.
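As a rough illustration of progressive disclosure, the sketch below keeps only a one-line summary per tool in the standing prompt and fetches the longer guidelines once a tool has been chosen. The `DisclosableTool` shape and both function names are assumptions made for this example; the field names follow the `trigger` and `execution_guidelines` fields used in the tool contracts.

```typescript
// Hypothetical reduced view of a tool contract, enough for this sketch.
interface DisclosableTool {
  name: string;
  trigger: string;
  execution_guidelines?: string;
}

// Always-present part of the prompt: one short line per tool.
function buildToolManifest(tools: DisclosableTool[]): string {
  return tools.map(t => `- ${t.name}: ${t.trigger}`).join('\n');
}

// Detailed guidance, looked up only after the model has selected a tool.
// Returns an empty string for unknown tools or tools without guidelines.
function guidelinesFor(tools: DisclosableTool[], selected: string): string {
  const tool = tools.find(t => t.name === selected);
  return tool?.execution_guidelines ?? '';
}
```

The manifest grows by one line per tool, while the heavier text is paid for only on the turns where that tool is actually in play.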
Pitfall Guide
1. The Generic Verb Trap
Explanation: Naming tools with broad actions like search_data or process_request without specifying trigger conditions or argument constraints. The model defaults to the most obvious interpretation, which rarely aligns with business logic.
Fix: Replace generic names with domain-specific actions. Add explicit trigger conditions and parameter constraints. Example: resolve_customer_order_status with order_id (format: UUID) and include_shipping (boolean).
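The example named in the fix might look like the following. Treat it as a sketch: the trigger wording, the descriptions, and the `isValidOrderId` helper are invented for illustration, and the pattern accepts any RFC 4122 UUID version 1 to 5 rather than a specific one.

```typescript
// Illustrative contract for a narrowly scoped tool (wording is hypothetical).
const resolveCustomerOrderStatus = {
  name: 'resolve_customer_order_status',
  trigger: 'Use when the user asks about the state of a specific order and supplies its order ID.',
  parameters: {
    order_id: { type: 'string', description: 'Order UUID. Do not pass order numbers or names.', required: true },
    include_shipping: { type: 'boolean', description: 'Include carrier and tracking details.', required: false }
  }
};

// RFC 4122 layout: 8-4-4-4-12 hex digits with version and variant nibbles checked.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

// Hypothetical format check a handler could run before touching the order system.
function isValidOrderId(value: unknown): boolean {
  return typeof value === 'string' && UUID_PATTERN.test(value);
}
```

Rejecting a malformed `order_id` at this point gives the model a precise reason to correct the argument instead of a downstream lookup failure.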
2. Prompt Inflation
Explanation: Adding new rules to the system prompt every time the agent fails. Each instruction reduces the salience of previous ones, creating conflicting directives that degrade performance.
Fix: Move behavioral logic into tool schemas or progressive disclosure guidelines. If a tool is misused, fix the tool contract, not the system prompt.
3. Silent Execution Failures
Explanation: Catching errors in the execution layer but returning empty or generic responses. The model retries blindly because it lacks visibility into why the previous attempt failed.
Fix: Standardize error payloads. Return structured messages containing the failure reason, expected format, and recovery suggestion. Feed the exact error string back into the context window.
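One possible shape for such a payload is sketched below. The `ToolError` fields and the `errorObservation` helper are assumptions for illustration; the `Observation` type repeats the one used by the orchestrator so the block stands alone.

```typescript
type Observation = { status: 'success' | 'error'; payload: string };

// Hypothetical structured error: what failed, what was expected, what to try next.
interface ToolError {
  code: string;
  reason: string;
  expected: string;
  suggestion: string;
}

// Serializes the error so the exact same text reaches the model's context.
function errorObservation(error: ToolError): Observation {
  return { status: 'error', payload: JSON.stringify(error) };
}

const invalidOrderId = errorObservation({
  code: 'INVALID_FORMAT',
  reason: 'order_id is not a UUID',
  expected: 'UUID string, e.g. 123e4567-e89b-12d3-a456-426614174000',
  suggestion: 'Ask the user for the order ID shown in their confirmation email.'
});
```

Because the payload is plain JSON, the same object can be logged for telemetry and fed back to the model without a second formatting step.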
4. Uncapped Autonomy Loops
Explanation: Allowing the agent to run indefinitely until it produces a final response. A single malformed tool call or ambiguous output can trigger infinite retry cycles, consuming tokens and blocking downstream processes.
Fix: Implement hard iteration limits. Add exponential backoff for transient errors. Define a graceful degradation path that returns a partial result or escalation prompt when the cap is reached.
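A minimal sketch of capped exponential backoff follows. The function names, the 200 ms base, the 5 second ceiling, and the caller-supplied `isTransient` predicate are all assumptions chosen for the example; what counts as a transient error depends on the underlying service.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 200, capMs = 5000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

// Retries `operation` while the error is transient and retries remain.
// Non-transient errors and the final failure are rethrown to the caller.
async function withBackoff<T>(
  operation: () => Promise<T>,
  isTransient: (err: unknown) => boolean,
  maxRetries = 3
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt >= maxRetries || !isTransient(err)) {
        throw err;
      }
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Retries inside a single tool call are separate from the loop's iteration cap: the cap bounds how many model turns are spent, while `maxRetries` bounds how long one handler keeps trying before it reports an error observation.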
5. Over-Composability Expectations
Explanation: Deploying 6+ tools on day one and expecting the model to chain them into complex workflows. LLMs struggle with multi-step composition without explicit scaffolding, leading to tool selection errors and fragmented execution.
Fix: Start with 2-3 atomic tools. Validate single-step reliability before introducing chaining. Use sub-agents or explicit workflow definitions for multi-step processes.
6. Context Window Starvation
Explanation: Appending every tool call and response to the conversation history without compression. The context window fills rapidly, forcing the model to drop earlier instructions and lose task continuity.
Fix: Implement a sliding window strategy. Retain system instructions and the most recent 3-5 exchanges. Compress or summarize middle turns using semantic truncation or token-aware pruning.
7. Ignoring Execution Telemetry
Explanation: Shipping agents without structured logging. When failures occur, developers lack visibility into tool selection patterns, latency bottlenecks, or error recurrence rates.
Fix: Log every iteration with metadata: tool name, argument payload, execution duration, status code, and, where the provider exposes one, a confidence signal such as token log-probabilities. Use this data to refine tool contracts and adjust iteration caps.
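The control loop refers to a `TraceLogger` without showing one. A minimal in-memory version might look like this; the event shape and the `statusCounts` method are assumptions for illustration, and a real deployment would forward events to a logging backend instead of an array.

```typescript
// Hypothetical event shape: iteration and phase are required, the rest is free-form.
interface TraceEvent {
  iteration: number;
  phase: string;
  [key: string]: unknown;
}

class TraceLogger {
  private events: Array<TraceEvent & { timestamp: number }> = [];

  // Stores the event with a wall-clock timestamp in milliseconds.
  log(event: TraceEvent): void {
    this.events.push({ ...event, timestamp: Date.now() });
  }

  // Tallies tool executions by their `status` field, ignoring other phases.
  statusCounts(): Record<string, number> {
    const counts: Record<string, number> = {};
    for (const e of this.events) {
      if (e.phase !== 'tool_execution') continue;
      const status = String(e.status);
      counts[status] = (counts[status] ?? 0) + 1;
    }
    return counts;
  }
}
```

Fields such as tool name or execution duration can be passed as extra keys on the event, and a tally like `statusCounts()` is the kind of summary that shows whether a contract change actually reduced errors.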
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Internal Knowledge Retrieval | Tool-Centric with 2 tools (search + citation) | Reduces hallucination, enforces evidence-based responses | Low (predictable token usage) |
| Transactional API Calls | Strict schema validation + idempotency keys | Prevents duplicate charges, ensures deterministic state changes | Medium (validation overhead) |
| Multi-Step Workflow Automation | Sub-agent routing + explicit step definitions | LLMs struggle with implicit chaining; explicit paths improve reliability | High (requires orchestration layer) |
| Customer Support Triage | Progressive disclosure + escalation fallback | Balances autonomy with safety; routes complex cases to humans | Low-Medium (scales with volume) |
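For the transactional row, the idea behind idempotency keys can be sketched with an in-memory store. Everything here is an assumption for illustration: `IdempotentExecutor` is a hypothetical name, the action is synchronous for brevity, and a real system would use durable storage and let the caller derive the key (for example from the conversation and the tool arguments).

```typescript
// Hypothetical guard: runs an action at most once per idempotency key.
class IdempotentExecutor<T> {
  private completed = new Map<string, T>();

  // First call with a key executes `action` and stores the result.
  // Later calls with the same key return the stored result without re-running it.
  run(key: string, action: () => T): T {
    if (this.completed.has(key)) {
      return this.completed.get(key) as T;
    }
    const result = action();
    this.completed.set(key, result);
    return result;
  }
}
```

If the model repeats a tool call after a timeout or a misread observation, the second call resolves to the original result instead of triggering a second charge or a duplicate ticket.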
Configuration Template
```typescript
// tool-definitions.ts
export const SUPPORT_TOOLS: ToolContract[] = [
  {
    name: 'lookup_customer_profile',
    trigger: 'Use when the user provides a customer ID, email, or account reference and requires account status, tier, or contact history.',
    parameters: {
      identifier: {
        type: 'string',
        description: 'Customer email or UUID. Do not pass full names or partial strings.',
        constraints: ['Must match RFC 5322 email format or UUID v4'],
        required: true
      }
    },
    output_schema: {
      format: 'json',
      success_payload: '{ "id": string, "tier": string, "status": "active" | "suspended", "last_contact": string }',
      error_payload: '{ "code": "NOT_FOUND" | "INVALID_FORMAT", "message": string }'
    },
    execution_guidelines: 'Extract identifier from user input. If format is ambiguous, ask for clarification before calling. Cache results for 5 minutes.'
  },
  {
    name: 'create_support_ticket',
    trigger: 'Use when the user explicitly requests ticket creation, escalation, or when a resolved issue requires formal tracking.',
    parameters: {
      customer_id: { type: 'string', description: 'UUID from lookup_customer_profile', required: true },
      category: { type: 'string', description: 'One of: billing, technical, account, feature_request', required: true },
      summary: { type: 'string', description: '2-3 sentence summary. Do not include raw logs.', required: true }
    },
    output_schema: {
      format: 'json',
      success_payload: '{ "ticket_id": string, "status": "open", "estimated_response": string }',
      error_payload: '{ "code": "VALIDATION_ERROR" | "RATE_LIMITED", "message": string }'
    }
  }
];
```

```typescript
// loop-config.ts
export const AGENT_LOOP_CONFIG = {
  maxIterations: 10,
  contextWindow: {
    headRetention: 1,
    tailRetention: 4,
    compressionStrategy: 'semantic_truncation'
  },
  telemetry: {
    enabled: true,
    logLevel: 'debug',
    retentionDays: 30
  }
};
```
Quick Start Guide
1. Define two tool contracts: Write explicit trigger conditions, parameter constraints, and output schemas. Avoid generic names.
2. Implement the execution layer: Build handlers that validate inputs, execute business logic, and return standardized success/error payloads.
3. Initialize the control loop: Set a hard iteration cap (10), configure sliding context retention, and attach a trace logger.
4. Run a single-user test: Provide a goal-oriented prompt. Observe tool selection, argument formatting, and error recovery in the telemetry output.
5. Iterate on schemas, not prompts: If the agent misbehaves, refine the tool contract or execution guidelines. Only adjust the system prompt for role clarification or safety boundaries.