The following implementation demonstrates a layered architecture using modern PHP practices. The example focuses on an e-commerce support workflow that routes customer inquiries to specialized agents, executes external tools, and maintains conversation state.
Step 1: Define a Provider-Agnostic Gateway
Hardcoding provider clients creates vendor lock-in and complicates fallback routing. A unified gateway abstracts the underlying HTTP layer and standardizes request/response contracts.
interface ModelGateway
{
public function generateCompletion(CompletionRequest $request): CompletionResponse;
public function getProviderName(): string;
public function supportsToolCalling(): bool;
}
final class UnifiedModelGateway implements ModelGateway
{
public function __construct(
private readonly array $providers,
private readonly string $defaultProvider
) {}
public function generateCompletion(CompletionRequest $request): CompletionResponse
{
$provider = $this->providers[$this->defaultProvider]
?? throw new ProviderNotFoundException('Default provider not configured');
return $provider->generateCompletion($request);
}
public function getProviderName(): string
{
return $this->defaultProvider;
}
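public function supportsToolCalling(): bool
{
// Delegate to the configured provider instead of assuming support.
$provider = $this->providers[$this->defaultProvider] ?? null;
return $provider?->supportsToolCalling() ?? false;
}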
}
Rationale: Separating the gateway from concrete implementations allows runtime provider switching, circuit breaker integration, and cost-based routing without touching business logic.
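The fallback routing mentioned above can be sketched as a decorator over several gateways. This is a minimal illustration, not the article's implementation: `FallbackGateway` and `StubGateway` are hypothetical names, the request and response classes are reduced stand-ins, and the interface omits `supportsToolCalling()` for brevity.

```php
<?php
declare(strict_types=1);

// Minimal stand-ins for the article's types so the sketch runs on its own.
final class CompletionRequest
{
    public function __construct(public readonly array $messages) {}
}

final class CompletionResponse
{
    public function __construct(public readonly string $content) {}
}

interface ModelGateway
{
    public function generateCompletion(CompletionRequest $request): CompletionResponse;
    public function getProviderName(): string;
}

// Hypothetical decorator: tries each gateway in order until one succeeds.
final class FallbackGateway implements ModelGateway
{
    private string $lastUsed = 'none';

    /** @param ModelGateway[] $gateways Ordered by preference. */
    public function __construct(private readonly array $gateways) {}

    public function generateCompletion(CompletionRequest $request): CompletionResponse
    {
        $lastError = null;
        foreach ($this->gateways as $gateway) {
            try {
                $response = $gateway->generateCompletion($request);
                $this->lastUsed = $gateway->getProviderName();
                return $response;
            } catch (RuntimeException $e) {
                $lastError = $e; // Remember the failure and try the next provider.
            }
        }
        throw new RuntimeException('All providers failed', 0, $lastError);
    }

    public function getProviderName(): string
    {
        return $this->lastUsed;
    }
}

// Hypothetical in-memory provider used only to exercise the fallback path.
final class StubGateway implements ModelGateway
{
    public function __construct(private readonly string $name, private readonly bool $fails) {}

    public function generateCompletion(CompletionRequest $request): CompletionResponse
    {
        if ($this->fails) {
            throw new RuntimeException("{$this->name} unavailable");
        }
        return new CompletionResponse("answer from {$this->name}");
    }

    public function getProviderName(): string
    {
        return $this->name;
    }
}

$gateway = new FallbackGateway([
    new StubGateway('primary', true),
    new StubGateway('secondary', false),
]);
$response = $gateway->generateCompletion(new CompletionRequest([['role' => 'user', 'content' => 'Hi']]));
```

Because the decorator implements the same interface, callers do not need to know whether they hold a single provider or a fallback chain.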
Step 2: Enforce Structured Output Contracts
Free-form text parsing is unreliable in production. Defining strict DTOs ensures predictable data shapes and enables static analysis.
final class SupportTicketAnalysis
{
public function __construct(
public readonly string $category,
public readonly int $urgencyScore,
public readonly array $requiredActions,
public readonly string $summary
) {}
}
final class StructuredOutputHandler
{
public function parseResponse(string $rawJson): SupportTicketAnalysis
{
$data = json_decode($rawJson, true, 512, JSON_THROW_ON_ERROR);
return new SupportTicketAnalysis(
category: $data['category'],
urgencyScore: $data['urgency_score'],
requiredActions: $data['required_actions'],
summary: $data['summary']
);
}
}
Rationale: Structured output eliminates regex-based extraction, reduces parsing failures, and integrates cleanly with validation layers. JSON schema enforcement at the provider level further guarantees contract compliance.
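The handler above trusts the decoded array. A validation step before constructing the DTO might look like the following sketch. The function name `validateAnalysisPayload` and the 1 to 10 urgency range are assumptions made for illustration; the article does not specify either.

```php
<?php
declare(strict_types=1);

/**
 * Returns a list of human-readable problems; an empty list means the payload
 * matches the shape SupportTicketAnalysis expects.
 */
function validateAnalysisPayload(array $data): array
{
    $errors = [];
    $expected = [
        'category' => 'is_string',
        'urgency_score' => 'is_int',
        'required_actions' => 'is_array',
        'summary' => 'is_string',
    ];
    foreach ($expected as $key => $check) {
        if (!array_key_exists($key, $data)) {
            $errors[] = "Missing field: {$key}";
        } elseif (!$check($data[$key])) {
            $errors[] = "Wrong type for field: {$key}";
        }
    }
    // Assumed range for illustration; the article does not define one.
    if (isset($data['urgency_score']) && is_int($data['urgency_score'])
        && ($data['urgency_score'] < 1 || $data['urgency_score'] > 10)) {
        $errors[] = 'urgency_score must be between 1 and 10';
    }
    return $errors;
}

$valid = json_decode(
    '{"category":"billing","urgency_score":7,"required_actions":["refund_order"],"summary":"Double charge"}',
    true, 512, JSON_THROW_ON_ERROR
);
$invalid = json_decode('{"category":"billing","urgency_score":"high"}', true, 512, JSON_THROW_ON_ERROR);

$validErrors = validateAnalysisPayload($valid);
$invalidErrors = validateAnalysisPayload($invalid);
```

Returning a list of problems rather than throwing on the first one makes it easier to log every contract violation in a single pass.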
Step 3: Centralize Tool Execution in a Registry
Agents require access to external systems. A centralized registry enforces permissions, timeouts, and result serialization.
interface ExecutableTool
{
public function getName(): string;
public function execute(array $parameters): ToolResult;
}
final class ToolRegistry
{
private array $tools = [];
public function register(ExecutableTool $tool): void
{
$this->tools[$tool->getName()] = $tool;
}
public function execute(string $name, array $params): ToolResult
{
if (!isset($this->tools[$name])) {
throw new UnknownToolException("Tool {$name} is not registered");
}
$tool = $this->tools[$name];
return $tool->execute($params);
}
}
Rationale: Centralizing tool execution enables audit logging, rate limiting, and sandboxed execution. Each tool can implement its own timeout and retry policy without polluting the orchestration layer.
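Audit logging can be layered on without changing the registry by wrapping each tool before it is registered. The sketch below is one possible shape, assuming a simplified `ToolResult`; `AuditedTool` and `OrderLookupTool` are hypothetical names that do not appear in the article.

```php
<?php
declare(strict_types=1);

// Minimal stand-ins for the article's types.
final class ToolResult
{
    public function __construct(public readonly mixed $value) {}
}

interface ExecutableTool
{
    public function getName(): string;
    public function execute(array $parameters): ToolResult;
}

// Hypothetical decorator that records one audit entry per call.
final class AuditedTool implements ExecutableTool
{
    /** @var array<int, array{tool: string, ok: bool, ms: float}> */
    private array $entries = [];

    public function __construct(private readonly ExecutableTool $inner) {}

    public function getName(): string
    {
        return $this->inner->getName();
    }

    public function execute(array $parameters): ToolResult
    {
        $start = hrtime(true); // Monotonic clock, in nanoseconds.
        $ok = false;
        try {
            $result = $this->inner->execute($parameters);
            $ok = true;
            return $result;
        } finally {
            // Runs on success and on failure, so every call is recorded.
            $this->entries[] = [
                'tool' => $this->getName(),
                'ok' => $ok,
                'ms' => (hrtime(true) - $start) / 1e6,
            ];
        }
    }

    public function entries(): array
    {
        return $this->entries;
    }
}

// Hypothetical tool used only to exercise the decorator.
final class OrderLookupTool implements ExecutableTool
{
    public function getName(): string
    {
        return 'order_lookup';
    }

    public function execute(array $parameters): ToolResult
    {
        if (!isset($parameters['order_id'])) {
            throw new InvalidArgumentException('order_id is required');
        }
        return new ToolResult(['order_id' => $parameters['order_id'], 'status' => 'shipped']);
    }
}

$tool = new AuditedTool(new OrderLookupTool());
$result = $tool->execute(['order_id' => 'A-100']);
try {
    $tool->execute([]);
} catch (InvalidArgumentException $e) {
    // The failure is still recorded in the audit entries.
}
$entries = $tool->entries();
```

In a real system the entries would go to a logger or tracer rather than an in-memory array.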
Step 4: Add State Management and Context Trimming
Long-running workflows require persistent memory, but unbounded conversation history degrades response quality and increases cost.
interface StateStore
{
public function load(string $sessionId): ConversationState;
public function save(string $sessionId, ConversationState $state): void;
}
final class ContextTrimmer
{
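public function trim(ConversationState $state, int $maxTokens): ConversationState
{
$messages = $state->messages;
// Drop the oldest unpinned, non-system messages until the estimate fits the budget.
foreach (array_keys($messages) as $index) {
if ($this->estimateTokens($messages) <= $maxTokens) {
break;
}
$isProtected = $messages[$index]['role'] === 'system' || ($messages[$index]['is_pinned'] ?? false);
if (!$isProtected) {
unset($messages[$index]);
}
}
return new ConversationState(array_values($messages), $state->metadata);
}
// Rough heuristic (about four characters per token); assumes each message has a 'content' string.
private function estimateTokens(array $messages): int
{
$characters = array_sum(array_map(fn(array $msg): int => strlen((string) ($msg['content'] ?? '')), $messages));
return (int) ceil($characters / 4);
}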
}
Rationale: State persistence decouples inference from session management. Context trimming prevents token overflow, maintains response quality, and controls API costs. Pinning critical system instructions ensures behavioral consistency.
Step 5: Orchestrate Multi-Agent Routing
Specialized agents tend to handle narrow tasks more reliably than a single general-purpose prompt. A coordinator distributes tasks and aggregates results.
final class AgentCoordinator
{
public function __construct(
private readonly StateStore $state,
private readonly ToolRegistry $tools,
private readonly ModelGateway $gateway
) {}
public function route(IncomingRequest $request): AgentResponse
{
$state = $this->state->load($request->sessionId);
$trimmedState = (new ContextTrimmer())->trim($state, 8000);
$analysis = $this->gateway->generateCompletion(
new CompletionRequest($trimmedState->messages, 'gpt-5')
);
$structured = (new StructuredOutputHandler())->parseResponse($analysis->content);
$results = [];
foreach ($structured->requiredActions as $action) {
$results[] = $this->tools->execute($action, $request->payload);
}
$this->state->save($request->sessionId, $trimmedState->append($analysis->content));
return new AgentResponse($structured, $results);
}
}
Rationale: The coordinator acts as a thin orchestration layer. It handles state loading, context management, model invocation, tool execution, and state persistence. Business logic remains isolated in tools and output handlers. This separation enables independent testing, scaling, and provider swapping.
Pitfall Guide
1. Hardcoding Provider Endpoints
Explanation: Directly instantiating provider clients ties the codebase to a single vendor. Switching models requires refactoring multiple service classes.
Fix: Implement a gateway interface with runtime configuration. Use environment variables or feature flags to route requests without code changes.
2. Ignoring Context Window Limits
Explanation: Feeding unlimited conversation history into the model causes token overflow, degraded response quality, and unexpected billing spikes.
Fix: Implement a context trimmer that enforces token budgets. Pin system instructions, summarize older turns, and evict low-priority messages based on recency and relevance.
3. Parsing Free-Form Text for Critical Data
Explanation: Relying on regex or string manipulation to extract structured data from model outputs introduces fragility. Minor phrasing changes break parsers.
Fix: Enforce JSON schema output at the provider level. Validate responses against DTOs before processing. Retry failed parses with a lower temperature setting.
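A retry loop around parsing could be sketched as follows. The helper `generateJsonWithRetry` is a hypothetical name, and the simulated model stands in for a real provider call; how the attempt number maps to prompt or temperature changes is left to the caller.

```php
<?php
declare(strict_types=1);

/**
 * Calls $generate until its output decodes to a JSON object or array, up to
 * $maxAttempts times. $generate receives the attempt number so the caller
 * can adjust the prompt or temperature on later attempts.
 */
function generateJsonWithRetry(callable $generate, int $maxAttempts = 3): array
{
    $lastError = null;
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            $decoded = json_decode($generate($attempt), true, 512, JSON_THROW_ON_ERROR);
            if (is_array($decoded)) {
                return $decoded;
            }
            $lastError = new UnexpectedValueException('Expected a JSON object or array');
        } catch (JsonException $e) {
            $lastError = $e; // Malformed output: keep the error and try again.
        }
    }
    throw new RuntimeException("No valid JSON after {$maxAttempts} attempts", 0, $lastError);
}

// Simulated model: the first reply is prose, the second is valid JSON.
$calls = 0;
$fakeModel = function (int $attempt) use (&$calls): string {
    $calls++;
    return $attempt === 1
        ? 'Sure! The category is billing.'
        : '{"category":"billing","urgency_score":7}';
};

$data = generateJsonWithRetry($fakeModel);
```

Capping the attempts keeps a persistently malformed response from turning into an unbounded billing loop.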
4. Unbounded Memory Accumulation
Explanation: Storing every interaction in memory without eviction policies leads to performance degradation and storage costs.
Fix: Use a state store with TTL-based eviction. Implement semantic caching for repeated queries. Archive completed sessions to cold storage.
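TTL-based eviction can be illustrated with a small in-memory store. This sketch is an assumption-laden simplification: `ExpiringSessionStore` is a hypothetical class, state is a plain array rather than `ConversationState`, and the current time is passed in explicitly so expiry is easy to test. A production store would more likely rely on the TTL features of its backing database or cache.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory store; the clock is injected so expiry can be tested.
final class ExpiringSessionStore
{
    /** @var array<string, array{state: array, expiresAt: int}> */
    private array $sessions = [];

    public function __construct(private readonly int $ttlSeconds) {}

    public function save(string $sessionId, array $state, int $now): void
    {
        $this->sessions[$sessionId] = ['state' => $state, 'expiresAt' => $now + $this->ttlSeconds];
    }

    public function load(string $sessionId, int $now): ?array
    {
        $entry = $this->sessions[$sessionId] ?? null;
        if ($entry === null || $entry['expiresAt'] <= $now) {
            unset($this->sessions[$sessionId]); // Evict lazily on read.
            return null;
        }
        return $entry['state'];
    }

    /** Removes every expired session and returns how many were evicted. */
    public function evictExpired(int $now): int
    {
        $before = count($this->sessions);
        $this->sessions = array_filter(
            $this->sessions,
            fn(array $entry): bool => $entry['expiresAt'] > $now
        );
        return $before - count($this->sessions);
    }
}

$store = new ExpiringSessionStore(ttlSeconds: 60);
$store->save('s1', ['messages' => ['hello']], now: 1000);
$store->save('s2', ['messages' => ['hi']], now: 1050);

$fresh = $store->load('s1', now: 1030);      // still within the TTL
$evicted = $store->evictExpired(now: 1070);  // s1 expired at 1060, s2 lives until 1110
$stale = $store->load('s1', now: 1070);
```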
5. Missing Timeouts on Tool Execution
Explanation: External API calls or database queries triggered by agents can hang indefinitely, blocking the entire workflow.
Fix: Wrap all tool executions in timeout guards. Implement circuit breakers that fall back to cached or default responses when downstream services degrade.
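The circuit breaker half of this fix can be sketched in a few lines. The class below is a hypothetical, simplified breaker that counts consecutive failures; it does not implement timeouts, which in PHP are usually configured on the HTTP or database client itself. The clock is passed in to keep the example deterministic.

```php
<?php
declare(strict_types=1);

// Hypothetical breaker: opens after $threshold consecutive failures and
// stays open for $cooldownSeconds. The clock is passed in to keep it testable.
final class CircuitBreaker
{
    private int $failures = 0;
    private ?int $openedAt = null;

    public function __construct(
        private readonly int $threshold,
        private readonly int $cooldownSeconds
    ) {}

    public function call(callable $operation, callable $fallback, int $now): mixed
    {
        if ($this->isOpen($now)) {
            return $fallback(); // Open: skip the downstream call entirely.
        }
        try {
            $result = $operation();
            $this->failures = 0;
            $this->openedAt = null;
            return $result;
        } catch (Throwable $e) {
            $this->failures++;
            if ($this->failures >= $this->threshold) {
                $this->openedAt = $now;
            }
            return $fallback();
        }
    }

    public function isOpen(int $now): bool
    {
        return $this->openedAt !== null && $now - $this->openedAt < $this->cooldownSeconds;
    }
}

$breaker = new CircuitBreaker(threshold: 2, cooldownSeconds: 60);
$attempts = 0;
$failing = function () use (&$attempts): string {
    $attempts++;
    throw new RuntimeException('inventory service down');
};
$fallback = fn(): string => 'cached stock level';

$first = $breaker->call($failing, $fallback, now: 100);
$second = $breaker->call($failing, $fallback, now: 101); // second failure opens the circuit
$third = $breaker->call($failing, $fallback, now: 102);  // short-circuited, $failing is not invoked
```

Once the cooldown elapses, the next call is allowed through as a probe; a success closes the circuit and a failure reopens it.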
6. Neglecting Cost Attribution Per Request
Explanation: Aggregating AI spend at the project level obscures which features or endpoints drive costs. Optimization becomes guesswork.
Fix: Tag every model call with metadata (tenant, feature, agent type). Log token consumption alongside business identifiers. Build dashboards that map cost to ROI.
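Tagging and aggregation might be sketched like this. `UsageLedger` is a hypothetical class and the token counts are made-up sample values; in practice the counts would come from the provider's usage metadata and the records would be exported to a metrics backend.

```php
<?php
declare(strict_types=1);

// Hypothetical ledger that aggregates token usage by a tag such as tenant or feature.
final class UsageLedger
{
    /** @var array<int, array{tags: array<string, string>, tokens: int}> */
    private array $records = [];

    public function record(array $tags, int $inputTokens, int $outputTokens): void
    {
        $this->records[] = ['tags' => $tags, 'tokens' => $inputTokens + $outputTokens];
    }

    /** @return array<string, int> Total tokens keyed by the value of $tagKey. */
    public function totalsBy(string $tagKey): array
    {
        $totals = [];
        foreach ($this->records as $record) {
            $value = $record['tags'][$tagKey] ?? 'untagged';
            $totals[$value] = ($totals[$value] ?? 0) + $record['tokens'];
        }
        return $totals;
    }
}

$ledger = new UsageLedger();
$ledger->record(['tenant_id' => 'acme', 'feature_name' => 'triage'], 1200, 300);
$ledger->record(['tenant_id' => 'acme', 'feature_name' => 'refunds'], 800, 200);
$ledger->record(['tenant_id' => 'globex', 'feature_name' => 'triage'], 500, 100);

$byTenant = $ledger->totalsBy('tenant_id');
$byFeature = $ledger->totalsBy('feature_name');
```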
7. Treating Agents as Stateless Functions
Explanation: Assuming each request is independent ignores the reality of multi-step workflows that require memory, tool results, and user feedback loops.
Fix: Design agents around explicit state machines. Persist intermediate results. Use deterministic routing for critical paths and allow probabilistic branching only where appropriate.
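An explicit state machine can be expressed with a backed enum (PHP 8.1 or later). The states and transitions below are illustrative assumptions for a support ticket workflow, not a list taken from the article.

```php
<?php
declare(strict_types=1);

// Hypothetical workflow states for a support ticket; the names are illustrative.
enum TicketState: string
{
    case Received = 'received';
    case Analyzed = 'analyzed';
    case ToolsExecuted = 'tools_executed';
    case Resolved = 'resolved';
    case Escalated = 'escalated';

    /** @return TicketState[] States that may follow this one. */
    public function allowedNext(): array
    {
        return match ($this) {
            self::Received => [self::Analyzed],
            self::Analyzed => [self::ToolsExecuted, self::Escalated],
            self::ToolsExecuted => [self::Resolved, self::Escalated],
            self::Resolved, self::Escalated => [],
        };
    }

    public function transitionTo(TicketState $next): TicketState
    {
        if (!in_array($next, $this->allowedNext(), true)) {
            throw new LogicException("Cannot move from {$this->value} to {$next->value}");
        }
        return $next;
    }
}

$state = TicketState::Received
    ->transitionTo(TicketState::Analyzed)
    ->transitionTo(TicketState::ToolsExecuted);

$rejected = false;
try {
    TicketState::Received->transitionTo(TicketState::Resolved);
} catch (LogicException $e) {
    $rejected = true; // Skipping analysis is not an allowed transition.
}
```

Persisting the enum's string value alongside the session makes it possible to resume a workflow at the correct step after a restart.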
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Single-turn classification or embedding generation | AI SDK | Minimal overhead, direct provider access, no state management required | Low |
| Multi-step workflow with tool calling and memory | Agent Framework | Built-in state stores, execution loops, and observability reduce boilerplate | Medium |
| Fleet of specialized agents requiring routing, A/B testing, and quality gates | Agent Platform | Centralized orchestration, regression testing, and distributed tracing justify infrastructure cost | High |
| Strict compliance requirements with audit trails | Agent Framework + Custom Observability | Framework provides execution control; custom spans ensure regulatory compliance | Medium-High |
| Rapid prototyping with frequent model swapping | AI SDK + Gateway Abstraction | Fast iteration, easy provider switching, low initial complexity | Low |
Configuration Template
<?php
// config/ai-pipeline.php
return [
'gateway' => [
'default_provider' => env('AI_DEFAULT_PROVIDER', 'openai'),
'fallback_provider' => env('AI_FALLBACK_PROVIDER', 'anthropic'),
'timeout_seconds' => 30,
'max_retries' => 2,
],
'context' => [
'max_tokens' => 8000,
'pin_system_messages' => true,
'eviction_strategy' => 'fifo', // fifo, semantic, or hybrid
],
'tools' => [
'execution_timeout' => 10,
'circuit_breaker_threshold' => 5,
'circuit_breaker_window' => 60,
],
'observability' => [
'enabled' => true,
'tracer_name' => 'ai-pipeline',
'log_token_usage' => true,
'export_endpoint' => env('OTEL_EXPORTER_ENDPOINT'),
],
'cost_tracking' => [
'enabled' => true,
'tag_keys' => ['tenant_id', 'feature_name', 'agent_type'],
'budget_alert_threshold' => 0.85,
],
];
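How a value such as `budget_alert_threshold` gets consumed depends on the application. One plausible reading, sketched below, treats it as the fraction of a budget at which an alert fires. The function `shouldAlert` and this interpretation are assumptions; the template itself does not define the semantics.

```php
<?php
declare(strict_types=1);

// Assumes the threshold is a fraction of the budget and that both amounts
// are expressed in the same currency.
function shouldAlert(array $config, float $spent, float $budget): bool
{
    $tracking = $config['cost_tracking'] ?? [];
    if (!($tracking['enabled'] ?? false) || $budget <= 0.0) {
        return false;
    }
    return ($spent / $budget) >= (float) ($tracking['budget_alert_threshold'] ?? 1.0);
}

$config = ['cost_tracking' => ['enabled' => true, 'budget_alert_threshold' => 0.85]];
$below = shouldAlert($config, 400.0, 500.0); // 80% of budget
$above = shouldAlert($config, 450.0, 500.0); // 90% of budget
```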
Quick Start Guide
- Initialize the gateway: Install your preferred provider SDKs and register them against the ModelGateway interface. Configure default and fallback providers via environment variables.
- Define output contracts: Create DTOs for expected model responses. Enable JSON schema enforcement in your provider configuration to guarantee structural compliance.
- Register tools: Implement the ExecutableTool interface for each external dependency. Add them to the ToolRegistry with timeout and retry configurations.
- Deploy the coordinator: Wire the state store, context trimmer, and gateway into the AgentCoordinator. Attach OpenTelemetry instrumentation and cost tagging before routing production traffic.