ce to avoid cascading errors. For example, merging consecutive messages must happen before enforcing alternation.
4. Warning Telemetry: Every fix should generate a warning. This allows operators to monitor the health of their agent loops and identify upstream issues causing structural corruption.
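One way to make these warnings easier to monitor is to give each one a stable code, so a telemetry system can count fixes by kind. The sketch below is an assumption layered on this design: `WarningCode`, `SanitizeWarning`, and `countByCode` are hypothetical names, and the sanitizer in this article emits plain strings instead.

```typescript
// Hypothetical warning shape: a stable machine-readable code plus free-form detail.
type WarningCode =
  | 'empty_removed'
  | 'roles_merged'
  | 'tool_use_unpaired'
  | 'alternation_fixed';

interface SanitizeWarning {
  code: WarningCode;
  detail: string;
}

// Aggregate warnings by code so a dashboard can alert on rising rates.
function countByCode(warnings: SanitizeWarning[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const warning of warnings) {
    counts[warning.code] = (counts[warning.code] ?? 0) + 1;
  }
  return counts;
}
```

Counting by code rather than by message text keeps the metric stable when the wording of a warning changes.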
Implementation Strategy
The sanitization process runs as an ordered pipeline of fixes:
- Remove Empty Content: Eliminate messages with empty content arrays or null strings. Remove empty content blocks within messages. This runs first because removing a message can leave two same-role messages adjacent.
- Merge Consecutive Same-Role Messages: Combine adjacent messages with the same role. For user messages, concatenate content. For assistant messages, merge content blocks.
- Pair Tool Calls: Ensure every tool_use has a corresponding tool_result. If a pair is broken, remove the tool_use block, as the result is unrecoverable.
- Enforce Alternation: Verify the list alternates user/assistant starting with user. Fix violations by merging or removing offending messages.
- Strip Trailing Assistant (Optional): Remove trailing assistant messages if the provider requires a user turn next.
Code Example: TypeScript Sanitizer
This example demonstrates a robust, immutable sanitizer implemented in TypeScript. It uses a functional pipeline approach for clarity and testability.
// types.ts
export type Role = 'user' | 'assistant';
export type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: object }
  | { type: 'tool_result'; tool_use_id: string; content: string };
export interface Message {
  role: Role;
  content: ContentBlock[];
}
export interface SanitizeResult {
  messages: Message[];
  warnings: string[];
}
// sanitizer.ts
import { Message, SanitizeResult } from './types';
export class ContextSanitizer {
  private provider: 'anthropic' | 'openai';
  constructor(provider: 'anthropic' | 'openai') {
    this.provider = provider;
  }
  sanitize(messages: Message[]): SanitizeResult {
    const warnings: string[] = [];
    let currentMessages = [...messages]; // Shallow copy; each step below returns a new array
    // Step 1: Remove empty content
    currentMessages = this.removeEmptyContent(currentMessages, warnings);
    // Step 2: Merge consecutive same-role messages
    currentMessages = this.mergeConsecutiveRoles(currentMessages, warnings);
    // Step 3: Pair tool calls (provider specific)
    if (this.provider === 'anthropic') {
      currentMessages = this.pairToolCalls(currentMessages, warnings);
    }
    // Step 4: Enforce alternation
    currentMessages = this.enforceAlternation(currentMessages, warnings);
    // Step 5: Strip trailing assistant (optional, based on provider)
    if (this.provider === 'anthropic') {
      currentMessages = this.stripTrailingAssistant(currentMessages, warnings);
    }
    return { messages: currentMessages, warnings };
  }
  private removeEmptyContent(messages: Message[], warnings: string[]): Message[] {
    const cleaned: Message[] = [];
    messages.forEach((msg, index) => {
      // Drop whitespace-only text blocks; tool_use/tool_result blocks are
      // kept even if their input/content is empty.
      const content = msg.content.filter(
        block => block.type !== 'text' || block.text.trim().length > 0
      );
      if (content.length === 0) {
        warnings.push(`Removed empty message at index ${index}`);
        return;
      }
      if (content.length !== msg.content.length) {
        warnings.push(`Removed empty content blocks from message at index ${index}`);
      }
      cleaned.push({ ...msg, content });
    });
    return cleaned;
  }
  private mergeConsecutiveRoles(messages: Message[], warnings: string[]): Message[] {
    const merged: Message[] = [];
    for (let i = 0; i < messages.length; i++) {
      const current = messages[i];
      const prev = merged[merged.length - 1];
      if (prev && prev.role === current.role) {
        // Merge content blocks into the previous message (a copy, not the caller's object)
        prev.content = [...prev.content, ...current.content];
        warnings.push(`Merged consecutive ${current.role} messages at index ${i}`);
      } else {
        merged.push({ ...current, content: [...current.content] });
      }
    }
    return merged;
  }
  private pairToolCalls(messages: Message[], warnings: string[]): Message[] {
    // Simplified logic: remove tool_use blocks that have no tool_result anywhere in the list.
    // A stricter production version would also check that each result sits in the user
    // message directly after its tool_use, and would handle tool_result blocks whose
    // tool_use is missing.
    const toolResultIds = new Set<string>();
    messages.forEach(msg => {
      if (msg.role === 'user') {
        msg.content.forEach(block => {
          if (block.type === 'tool_result') toolResultIds.add(block.tool_use_id);
        });
      }
    });
    const paired: Message[] = [];
    for (const msg of messages) {
      if (msg.role !== 'assistant') {
        paired.push(msg);
        continue;
      }
      const content = msg.content.filter(block => {
        if (block.type === 'tool_use' && !toolResultIds.has(block.id)) {
          warnings.push(`Removed unpaired tool_use ${block.id}`);
          return false;
        }
        return true;
      });
      if (content.length === 0 && msg.content.length > 0) {
        // Every block was an unpaired tool_use; do not leave an empty message behind.
        warnings.push('Removed assistant message left empty after removing unpaired tool_use');
        continue;
      }
      paired.push(content.length === msg.content.length ? msg : { ...msg, content });
    }
    return paired;
  }
  private enforceAlternation(messages: Message[], warnings: string[]): Message[] {
    // Ensure the list starts with user and alternates.
    // Implementation details omitted for brevity, but it follows the same pattern:
    // merge or remove violations and push a warning for each fix.
    return messages;
  }
  private stripTrailingAssistant(messages: Message[], warnings: string[]): Message[] {
    if (messages.length > 0 && messages[messages.length - 1].role === 'assistant') {
      warnings.push('Stripped trailing assistant message');
      return messages.slice(0, -1);
    }
    return messages;
  }
}
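The class leaves the body of `enforceAlternation` out. A minimal sketch of one possible policy follows, written as a standalone function with local copies of the types so it compiles on its own. The policy itself is an assumption, not the author's implementation: leading assistant messages are dropped and same-role neighbours are merged.

```typescript
// Local copies of the article's types so the sketch is self-contained.
type Role = 'user' | 'assistant';
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: object }
  | { type: 'tool_result'; tool_use_id: string; content: string };
interface Message {
  role: Role;
  content: ContentBlock[];
}

// Assumed policy: drop leading assistant messages, merge same-role neighbours.
function enforceAlternation(messages: Message[], warnings: string[]): Message[] {
  const result: Message[] = [];
  messages.forEach((msg, index) => {
    const prev = result[result.length - 1];
    if (!prev && msg.role !== 'user') {
      // The list must open with a user turn, so a leading assistant message is dropped.
      warnings.push(`Removed leading assistant message at index ${index}`);
      return;
    }
    if (prev && prev.role === msg.role) {
      // Two same-role messages in a row: fold the later one into the earlier one.
      prev.content = [...prev.content, ...msg.content];
      warnings.push(`Merged non-alternating ${msg.role} message at index ${index}`);
      return;
    }
    // Copy the message and its content array so the input is never mutated.
    result.push({ ...msg, content: [...msg.content] });
  });
  return result;
}
```

Dropping a leading assistant message can orphan a `tool_result` in the user message that follows it, so a real pipeline would probably need to re-check tool pairing after this step.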
Usage Pattern
Integrate the sanitizer into your agent loop immediately before the API call.
// agent-loop.ts
import { ContextSanitizer } from './sanitizer';
import { AnthropicClient } from './api-client';
import { Message } from './types';
// trimContextWindow is assumed to be a helper defined elsewhere in your codebase.
const sanitizer = new ContextSanitizer('anthropic');
const client = new AnthropicClient();
async function runAgentStep(history: Message[]): Promise<void> {
  // 1. Trim context window (may introduce structural issues)
  const trimmedHistory = trimContextWindow(history, 180_000);
  // 2. Sanitize
  const result = sanitizer.sanitize(trimmedHistory);
  // 3. Log warnings for telemetry
  if (result.warnings.length > 0) {
    console.warn('Sanitization warnings:', result.warnings);
    // Send to monitoring system
  }
  // 4. Call API with clean messages
  const response = await client.messages.create({
    model: 'claude-sonnet-4-6',
    messages: result.messages,
    max_tokens: 1024,
  });
  // 5. Append response to history (assumes the client wrapper returns a Message in response.message)
  history.push(response.message);
}
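Trimming can also cut a `tool_use` while keeping its `tool_result`. The `pairToolCalls` step in the class above only removes unpaired `tool_use` blocks, so the reverse case needs its own pass. The function below is a sketch of that pass under the same simplification, matching IDs anywhere in the list without checking order. `removeOrphanedToolResults` is a hypothetical name, not part of the original class.

```typescript
// Local copies of the article's types so the sketch is self-contained.
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: object }
  | { type: 'tool_result'; tool_use_id: string; content: string };
interface Message {
  role: 'user' | 'assistant';
  content: ContentBlock[];
}

function removeOrphanedToolResults(messages: Message[], warnings: string[]): Message[] {
  // Collect every tool_use id that still exists in an assistant message.
  const toolUseIds = new Set<string>();
  for (const msg of messages) {
    if (msg.role !== 'assistant') continue;
    for (const block of msg.content) {
      if (block.type === 'tool_use') toolUseIds.add(block.id);
    }
  }
  const cleaned: Message[] = [];
  for (const msg of messages) {
    // Keep everything except tool_result blocks that point at a missing tool_use.
    const content = msg.content.filter(block => {
      if (block.type === 'tool_result' && !toolUseIds.has(block.tool_use_id)) {
        warnings.push(`Removed orphaned tool_result for ${block.tool_use_id}`);
        return false;
      }
      return true;
    });
    if (content.length === 0 && msg.content.length > 0) {
      // The message held only orphaned results; drop it rather than send it empty.
      warnings.push('Removed message left empty after dropping tool_result blocks');
      continue;
    }
    cleaned.push({ ...msg, content });
  }
  return cleaned;
}
```

Removing a whole message here can leave two assistant messages adjacent, so this pass would most naturally run before the merge and alternation steps.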
Pitfall Guide
Avoid these common mistakes when implementing message sanitization.
| Pitfall | Explanation | Fix |
|---|---|---|
| In-Place Mutation | Modifying the input list directly corrupts the history buffer, making debugging impossible and breaking fallback logic. | Always return a new list. Use immutable operations like filter, map, and spread operators. |
| Silent Swallowing | Ignoring warnings from the sanitizer hides upstream bugs. If your agent constantly generates empty messages, you need to fix the generator, not just hide the symptom. | Log all warnings to your telemetry system. Alert on high warning rates. |
| Content Validation Confusion | Sanitizers check structure, not content. A message with valid structure but malformed JSON in a tool argument will pass sanitization but fail execution. | Use a separate content validator for tool arguments and text quality. Sanitization is only for structural integrity. |
| Truncation Blindness | Trimming context windows without sanitizing afterward leaves orphaned tool calls. The truncation algorithm may cut a tool_use but keep the tool_result. | Always run the sanitizer after context window trimming. This is the most critical integration point. |
| Over-Sanitization | Running the sanitizer on every message addition adds unnecessary overhead. If you know the list is valid, skip it. | Apply sanitization only before API calls or after operations that may corrupt structure (trimming, retries). |
| Provider Agnosticism | Assuming all providers have the same rules leads to errors. Anthropic requires tool pairing; OpenAI handles tools differently. | Pass the provider flag to the sanitizer. Implement provider-specific rule sets. |
| Ignoring SanitizeError | If the sanitizer encounters ambiguous corruption, it may throw an error. Failing to catch this crashes the agent. | Wrap sanitizer calls in try-catch blocks. Implement a fallback strategy, such as using only the last user message. |
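The last row can be sketched as a small wrapper. The table names `SanitizeError` but the article never defines it, so the class below is a hypothetical stand-in, as is `sanitizeWithFallback`. The fallback follows the strategy suggested in the table: keep only the last user message.

```typescript
// Simplified local types so the sketch is self-contained.
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: object }
  | { type: 'tool_result'; tool_use_id: string; content: string };
interface Message {
  role: 'user' | 'assistant';
  content: ContentBlock[];
}
interface SanitizeResult {
  messages: Message[];
  warnings: string[];
}

// Hypothetical error type; the article mentions it without defining it.
class SanitizeError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'SanitizeError';
  }
}

function sanitizeWithFallback(
  sanitize: (messages: Message[]) => SanitizeResult,
  messages: Message[],
): SanitizeResult {
  try {
    return sanitize(messages);
  } catch (err) {
    // Match on name rather than instanceof, which can fail for Error subclasses on ES5 targets.
    // Anything that is not a SanitizeError is a real bug and is rethrown.
    if (!(err instanceof Error) || err.name !== 'SanitizeError') throw err;
    // Fallback: keep only the most recent user message.
    let lastUser: Message | undefined;
    for (const msg of messages) {
      if (msg.role === 'user') lastUser = msg;
    }
    return {
      messages: lastUser ? [{ ...lastUser, content: [...lastUser.content] }] : [],
      warnings: [`Sanitizer failed (${err.message}); fell back to last user message`],
    };
  }
}
```

The fallback still reports a warning, so a sanitizer failure shows up in telemetry instead of being silently absorbed.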
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Simple Chat (1:1) | Skip Sanitization | Message lists are static and known-good. Overhead outweighs benefit. | Low |
| Agent Loop with Tools | Sanitize Before Every Call | Dynamic lists and tool calls introduce high risk of structural corruption. | High ROI |
| Context Window Trimming | Sanitize After Trimming | Trimming frequently breaks tool pairs and alternation. Essential here. | Critical |
| High-Volume API Calls | Conditional Sanitization | Profile latency. If overhead is significant, sanitize only after risky operations. | Balanced |
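The "Conditional Sanitization" row could be implemented with a dirty flag that is set only by risky operations. The wrapper below is a sketch under that assumption. `GuardedHistory` and its method names are hypothetical, and it trusts the caller to append only messages that keep the list valid.

```typescript
// Simplified local types so the sketch is self-contained.
interface Message {
  role: 'user' | 'assistant';
  content: { type: 'text'; text: string }[];
}
interface SanitizeResult {
  messages: Message[];
  warnings: string[];
}

// Hypothetical wrapper that sanitizes only after a risky operation has run.
class GuardedHistory {
  private messages: Message[] = [];
  private dirty = false;
  private sanitize: (messages: Message[]) => SanitizeResult;

  constructor(sanitize: (messages: Message[]) => SanitizeResult) {
    this.sanitize = sanitize;
  }

  // Plain appends are assumed safe and do not mark the list dirty.
  append(message: Message): void {
    this.messages = [...this.messages, message];
  }

  // Risky operations (trimming, retries) replace the list and mark it dirty.
  replace(messages: Message[]): void {
    this.messages = [...messages];
    this.dirty = true;
  }

  // Call immediately before the API request.
  prepareForCall(): SanitizeResult {
    if (!this.dirty) {
      return { messages: this.messages, warnings: [] };
    }
    const result = this.sanitize(this.messages);
    this.messages = result.messages;
    this.dirty = false;
    return result;
  }
}
```

Whether this saves anything meaningful depends on profiling, as the table notes. For short histories, sanitizing before every call is likely simpler and cheap enough.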
Configuration Template
Use this template to set up a robust sanitization pipeline in your agent codebase.
// config/sanitizer-config.ts
import { ContextSanitizer } from '../sanitizer';
export const createSanitizer = (provider: 'anthropic' | 'openai') => {
  return new ContextSanitizer(provider);
};
// agent-config.ts
// (import path assumes agent-config.ts sits next to the config/ directory)
import { createSanitizer } from './config/sanitizer-config';
export const agentConfig = {
  provider: 'anthropic' as const,
  model: 'claude-sonnet-4-6',
  maxTokens: 1024,
  contextWindow: 180_000,
  sanitizer: createSanitizer('anthropic'),
};
Quick Start Guide
- Define Types: Create TypeScript interfaces for Message, ContentBlock, and SanitizeResult.
- Implement Sanitizer: Build the ContextSanitizer class with immutable methods for merging, filtering, and pairing.
- Integrate Pipeline: Hook the sanitizer into your agent loop, ensuring it runs after context trimming and before the API call.
- Add Telemetry: Log warnings and errors to monitor agent health.
- Test Edge Cases: Verify the sanitizer handles empty content, consecutive roles, and broken tool pairs correctly.
By implementing structural sanitization, you ensure your LLM agents operate reliably in production, minimizing API errors and maximizing context efficiency. This pre-flight validation is essential for any agent that dynamically manages message lists or context windows.