public function switchProvider(string $provider, string $model): self
{
$this->provider = $provider;
$this->model = $model;
return $this;
}
public function generateResponse(string $prompt, array $tools = []): string
{
$builder = Prism::text()
->using($this->provider, $this->model)
->withMaxTokens($this->maxTokens)
->withMaxSteps($this->maxSteps)
->withPrompt($prompt);
if (!empty($tools)) {
$builder->withTools($tools);
}
$response = $builder->asText();
return $response->text;
}
}
```
**Architecture Rationale:** Centralizing provider resolution in a service class isolates vendor-specific logic. The `switchProvider` method enables runtime fallbacks without touching business logic. Explicit `maxTokens` and `maxSteps` defaults prevent runaway costs at the architectural level.
### Step 2: Defining Executable Tools
Tools should be class-based rather than closure-based to enable dependency injection, mocking, and idempotency tracking. Each tool must return structured strings that the LLM can parse reliably.
```php
// app/Tools/InventoryLookupTool.php
namespace App\Tools;

use App\Services\InventoryService;
use Prism\Prism\Facades\Tool;
use Prism\Prism\Tool as PrismTool;

class InventoryLookupTool
{
    // The facade builds a Prism\Prism\Tool instance, so that class is the return type
    public static function definition(): PrismTool
    {
        return Tool::as('check_inventory')
            ->for('Retrieve current stock levels and warehouse location for a given SKU.')
            ->withStringParameter('sku', 'The unique stock keeping unit identifier.')
            ->using(function (string $sku): string {
                $inventory = app(InventoryService::class)->getStockLevel($sku);

                // Report a missing SKU as data so the model can react to it
                if ($inventory === null) {
                    return json_encode(['status' => 'not_found', 'sku' => $sku]);
                }

                return json_encode([
                    'status' => 'available',
                    'sku' => $sku,
                    'quantity' => $inventory['quantity'],
                    'warehouse' => $inventory['location'],
                ]);
            });
    }
}
```

```php
// app/Tools/DispatchAlertTool.php
namespace App\Tools;

use App\Services\NotificationGateway;
use Prism\Prism\Facades\Tool;
use Prism\Prism\Tool as PrismTool;

class DispatchAlertTool
{
    public static function definition(): PrismTool
    {
        return Tool::as('send_alert')
            ->for('Dispatch a priority notification to the operations team via SMS or email.')
            ->withStringParameter('recipient', 'Contact identifier (email or E.164 phone number).')
            ->withStringParameter('message', 'The alert payload content.')
            ->using(function (string $recipient, string $message): string {
                try {
                    app(NotificationGateway::class)->deliver($recipient, $message);

                    return json_encode(['status' => 'delivered', 'recipient' => $recipient]);
                } catch (\Exception $e) {
                    // Return the failure as a structured payload instead of rethrowing
                    return json_encode(['status' => 'failed', 'error' => $e->getMessage()]);
                }
            });
    }
}
```
**Architecture Rationale:** Class-based tools separate definition from execution. Returning JSON-encoded strings ensures the LLM receives parseable data regardless of provider. Resolving services through the container with `app()` allows swapping gateways during testing. Error states are explicitly returned rather than thrown, preventing loop termination.
### Step 3: Orchestrating the Agentic Loop
The agent loop combines system context, tool definitions, and step limits into a single execution path. The orchestrator manages the conversation state internally.
```php
// app/Services/AgentRunner.php
namespace App\Services;

use App\Contracts\AiOrchestrator;
use App\Tools\DispatchAlertTool;
use App\Tools\InventoryLookupTool;

class AgentRunner
{
    public function __construct(
        protected AiOrchestrator $orchestrator
    ) {}

    public function executeSupportWorkflow(string $userQuery): string
    {
        // Tool set exposed to the model for this use case
        $tools = [
            InventoryLookupTool::definition(),
            DispatchAlertTool::definition(),
        ];

        $systemContext = <<<'PROMPT'
        You are an automated supply chain assistant. You have access to inventory data and alert dispatch capabilities.
        Always verify stock levels before confirming availability. If inventory is below threshold, trigger an alert.
        Never fabricate warehouse locations or quantities.
        PROMPT;

        return $this->orchestrator
            ->withSystemPrompt($systemContext)
            ->generateResponse($userQuery, $tools);
    }
}
```
**Architecture Rationale:** The `AgentRunner` acts as a use-case boundary. It injects domain-specific system prompts and tool sets without exposing Prism internals to controllers. Step limits are enforced at the orchestrator level, guaranteeing that tool failures or ambiguous queries terminate predictably.
### Step 4: Building the RAG Pipeline with pgvector
Retrieval-Augmented Generation requires precise dimensional alignment and efficient indexing. PostgreSQL 16+ with the pgvector extension provides native vector operations without a separate vector database.
```php
// database/migrations/2024_01_15_000000_create_knowledge_base_table.php
use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    public function up(): void
    {
        DB::statement('CREATE EXTENSION IF NOT EXISTS vector');

        Schema::create('knowledge_documents', function (Blueprint $table) {
            $table->id();
            $table->string('title');
            $table->text('content');
            $table->string('category');
            $table->timestamps();
        });

        // text-embedding-3-small outputs 1536 dimensions by default
        DB::statement('ALTER TABLE knowledge_documents ADD COLUMN embedding vector(1536)');

        // HNSW index optimizes for approximate nearest-neighbor search at scale
        DB::statement(
            'CREATE INDEX knowledge_documents_embedding_idx ON knowledge_documents USING hnsw (embedding vector_cosine_ops)'
        );
    }

    public function down(): void
    {
        // Dropping the table also drops its HNSW index
        Schema::dropIfExists('knowledge_documents');
    }
};
```
```php
// app/Services/VectorEmbeddingService.php
namespace App\Services;

use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Http;
use RuntimeException;

class VectorEmbeddingService
{
    public function generateEmbedding(string $text): array
    {
        // retry() throws after the last attempt by default; throw: false
        // hands failed responses to the explicit check below instead
        $response = Http::withToken(config('services.openai.api_key'))
            ->timeout(15)
            ->retry(3, 500, throw: false)
            ->post('https://api.openai.com/v1/embeddings', [
                'model' => 'text-embedding-3-small',
                'input' => $text,
            ]);

        if (!$response->successful()) {
            throw new RuntimeException('Embedding generation failed: ' . $response->body());
        }

        return data_get($response->json(), 'data.0.embedding', []);
    }

    public function storeEmbedding(int $documentId, array $vector): void
    {
        // pgvector accepts a text literal such as "[0.1,0.2,...]";
        // bind it as a parameter rather than interpolating it into the SQL
        $vectorString = '[' . implode(',', $vector) . ']';

        DB::update(
            'UPDATE knowledge_documents SET embedding = ?::vector WHERE id = ?',
            [$vectorString, $documentId]
        );
    }
}
```
**Architecture Rationale:** Dimensional alignment is non-negotiable. `text-embedding-3-small` produces 1,536 dimensions by default; a vector of any other length is rejected by the `vector(1536)` column with a dimension error. HNSW generally offers a better speed-recall trade-off than IVFFlat, at the cost of slower index builds and higher memory use. The embedding service is isolated to enable mocking, batch processing, and provider swapping without touching retrieval logic.
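The code above covers storage; the retrieval side is not shown in this section. As a rough sketch of how a nearest-neighbor lookup could be assembled, the helper below builds a parameterized query using pgvector's `<=>` cosine distance operator, which matches the `vector_cosine_ops` index from the migration. The function name `buildSimilarityQuery` and the selected columns are assumptions for illustration, not part of Prism or Laravel.

```php
<?php

/**
 * Build a parameterized pgvector similarity query (hypothetical helper).
 *
 * @param float[] $queryVector Embedding of the user's question
 * @return array{sql: string, bindings: array}
 */
function buildSimilarityQuery(array $queryVector, int $limit = 5): array
{
    // pgvector accepts a text literal such as "[0.1,0.2,0.3]" cast to vector
    $literal = '[' . implode(',', array_map('floatval', $queryVector)) . ']';

    // "<=>" is cosine distance, so 1 - distance gives cosine similarity
    $sql = 'SELECT id, title, content, 1 - (embedding <=> ?::vector) AS similarity '
         . 'FROM knowledge_documents '
         . 'WHERE embedding IS NOT NULL '
         . 'ORDER BY embedding <=> ?::vector '
         . 'LIMIT ?';

    return ['sql' => $sql, 'bindings' => [$literal, $literal, $limit]];
}

$query = buildSimilarityQuery([0.1, 0.2, 0.3], 3);
```

In a Laravel application the result could be passed to `DB::select($query['sql'], $query['bindings'])`, and the returned `content` values placed into the prompt as context.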
## Pitfall Guide
### 1. Unbounded Token Consumption
**Explanation:** Omitting explicit token ceilings leaves the output length to the provider or library default, which can be far higher than the task needs. A single complex query can consume thousands of tokens across multiple steps.

**Fix:** Always set `withMaxTokens()` at the orchestrator level. Implement middleware that logs token usage per request and triggers alerts when thresholds are breached.
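The logging-and-alerting idea can be sketched with a small per-request accumulator. This is a minimal illustration in plain PHP; the class name `TokenBudget` and the ceiling value are assumptions, and the prompt/completion counts would come from whatever usage data your provider response exposes.

```php
<?php

/**
 * Tracks token usage for one request and flags when a ceiling is crossed.
 * Class name and thresholds are illustrative assumptions.
 */
final class TokenBudget
{
    private int $used = 0;

    public function __construct(private int $ceiling) {}

    /** Record usage from one model call; returns true once the ceiling is exceeded. */
    public function record(int $promptTokens, int $completionTokens): bool
    {
        $this->used += $promptTokens + $completionTokens;

        return $this->used > $this->ceiling;
    }

    public function used(): int
    {
        return $this->used;
    }
}

$budget = new TokenBudget(2000);
$first = $budget->record(900, 300);   // 1200 total, under the ceiling
$second = $budget->record(700, 400);  // 2300 total, over the ceiling
```

When `record()` returns `true`, the caller can log the request, raise an alert, or stop issuing further steps.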
### 2. Autonomous Financial Execution
**Explanation:** Tools that process payments, refunds, or inventory deductions without human verification create compliance and financial risk. LLMs can misinterpret ambiguous prompts and trigger irreversible actions.

**Fix:** Route high-impact tools through a confirmation queue. Return a `pending_approval` status from the tool, then require a webhook callback or admin UI action before execution.
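One way to picture the confirmation queue is a tool handler that only records the request and reports `pending_approval` back to the model. The sketch below uses an in-memory queue to stay self-contained; `ApprovalQueue` and `requestRefund` are hypothetical names, and a real system would persist the queue and execute the action only after approval.

```php
<?php

/**
 * In-memory approval queue (hypothetical). A real system would persist entries.
 */
final class ApprovalQueue
{
    /** @var array<string, array{action: string, payload: array, approved: bool}> */
    private array $items = [];

    public function request(string $action, array $payload): string
    {
        $id = bin2hex(random_bytes(8));
        $this->items[$id] = ['action' => $action, 'payload' => $payload, 'approved' => false];

        return $id;
    }

    public function approve(string $id): void
    {
        if (!isset($this->items[$id])) {
            throw new InvalidArgumentException("Unknown approval id: $id");
        }
        $this->items[$id]['approved'] = true;
    }

    public function isApproved(string $id): bool
    {
        return $this->items[$id]['approved'] ?? false;
    }
}

/** Tool handler: never executes the refund, only queues it and tells the model so. */
function requestRefund(ApprovalQueue $queue, string $orderId, float $amount): string
{
    $id = $queue->request('refund', ['order_id' => $orderId, 'amount' => $amount]);

    return json_encode(['status' => 'pending_approval', 'approval_id' => $id]);
}

$queue = new ApprovalQueue();
$result = json_decode(requestRefund($queue, 'ORD-1001', 49.90), true);
```

The webhook or admin action would call `approve()` with the returned `approval_id`, and a separate worker would perform the refund once `isApproved()` is true.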
### 3. Dimensional Mismatch in Vector Stores
**Explanation:** Embedding models output fixed-dimensional arrays. Inserting a 3,072-dimensional vector into a `vector(1536)` column fails with a dimension error, and mixing vectors from different embedding models in one column degrades retrieval accuracy because their distances are not comparable.

**Fix:** Validate dimension counts during migration and embedding generation. Rely on the typed `vector(1536)` column as the database constraint, and add an application-level assertion that verifies `count($vector) === 1536` before insertion.
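The application-level assertion could look like the following sketch, which validates an embedding before turning it into the text literal pgvector expects. The constant and function names are illustrative, not taken from the original service.

```php
<?php

const EMBEDDING_DIMENSIONS = 1536; // must match the vector(1536) column

/**
 * Validate an embedding and convert it to pgvector's text literal (hypothetical helper).
 *
 * @param array<int, int|float> $vector
 */
function toVectorLiteral(array $vector, int $expected = EMBEDDING_DIMENSIONS): string
{
    if (count($vector) !== $expected) {
        throw new LengthException(
            sprintf('Expected %d dimensions, got %d.', $expected, count($vector))
        );
    }

    foreach ($vector as $value) {
        // Reject strings, nulls, NaN, and infinities before they reach the database
        if ((!is_int($value) && !is_float($value)) || !is_finite((float) $value)) {
            throw new InvalidArgumentException('Embedding values must be finite numbers.');
        }
    }

    return '[' . implode(',', $vector) . ']';
}

// A three-dimensional vector is used here only to keep the example short
$literal = toVectorLiteral([0.5, 1, -0.25], 3);
```

Failing early in PHP gives a clearer error than waiting for the database to reject the row, and it keeps malformed values out of the SQL literal.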
### 4. Hardcoded Provider Routing
**Explanation:** Scattering provider strings across controllers prevents runtime fallbacks and complicates cost optimization. Vendor outages immediately break user-facing features.

**Fix:** Centralize provider resolution in a configuration-driven service. Store active provider and model strings in `.env` or a database-backed config table. Implement a circuit breaker pattern that switches providers after consecutive failures.
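A consecutive-failure breaker can be sketched in a few lines. This is a deliberately simplified version with no cool-down or half-open state; the class name, model names, and threshold are placeholders.

```php
<?php

/**
 * Minimal consecutive-failure circuit breaker over an ordered provider list.
 * Provider/model pairs and the threshold are placeholders.
 */
final class ProviderCircuitBreaker
{
    private int $index = 0;
    private int $failures = 0;

    /** @param list<array{provider: string, model: string}> $providers */
    public function __construct(private array $providers, private int $threshold = 3) {}

    public function current(): array
    {
        return $this->providers[$this->index];
    }

    public function recordSuccess(): void
    {
        $this->failures = 0;
    }

    /** Count a failure; move to the next provider once the threshold is hit. */
    public function recordFailure(): void
    {
        if (++$this->failures >= $this->threshold) {
            $this->index = ($this->index + 1) % count($this->providers);
            $this->failures = 0;
        }
    }
}

$breaker = new ProviderCircuitBreaker([
    ['provider' => 'anthropic', 'model' => 'primary-model'],
    ['provider' => 'openai', 'model' => 'fallback-model'],
], threshold: 2);

$breaker->recordFailure();
$afterOne = $breaker->current()['provider'];   // still on the first provider
$breaker->recordFailure();
$afterTwo = $breaker->current()['provider'];   // switched to the second
```

The pair returned by `current()` could be handed to `switchProvider()` before each call. State here lives in one PHP process, so sharing it across workers would require a cache or database.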
### 5. Unhandled Tool Exceptions
**Explanation:** When a tool throws an exception instead of returning a structured error string, the agentic loop terminates abruptly. The LLM receives no context about the failure and cannot retry or adapt.

**Fix:** Wrap all tool execution in `try`/`catch` blocks. Return JSON-encoded error payloads that the model can parse. Design tools to be idempotent so safe retries don't duplicate side effects.
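Rather than repeating `try`/`catch` in every tool, the wrapping can be factored into one higher-order function. The sketch below assumes PHP 8.1 or later (so named arguments pass through the variadic) and uses a hypothetical helper name, `safeTool`.

```php
<?php

/**
 * Wrap a tool handler so any exception becomes a JSON error string.
 * Helper name is illustrative; the returned closure is what a tool would use as its handler.
 */
function safeTool(callable $handler): Closure
{
    return function (...$args) use ($handler): string {
        try {
            return $handler(...$args);
        } catch (Throwable $e) {
            // Surface the failure as data the model can read and react to
            return json_encode(['status' => 'failed', 'error' => $e->getMessage()]);
        }
    };
}

$lookup = safeTool(function (string $sku): string {
    if ($sku === '') {
        throw new InvalidArgumentException('SKU must not be empty.');
    }

    return json_encode(['status' => 'available', 'sku' => $sku]);
});

$ok = json_decode($lookup('ABC-1'), true);
$failed = json_decode($lookup(''), true);
```

Catching `Throwable` also covers type errors raised when the model supplies malformed arguments. Consider whether raw exception messages are safe to expose to the model before returning them verbatim.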
### 6. Synchronous RAG in Request Cycle
**Explanation:** Generating embeddings and querying vector stores synchronously adds 200-800 ms to every request. Under load, this creates connection pool exhaustion and timeout cascades.

**Fix:** Offload embedding generation to queued jobs. Pre-compute vectors during document ingestion. Use database-level vector search with connection pooling and query timeouts.
### 7. Missing System Prompt Context
**Explanation:** Agents without explicit behavioral constraints hallucinate tool parameters, ignore business rules, or return unstructured responses. The model defaults to generic training data.

**Fix:** Inject domain-specific system prompts that define tool usage rules, output formats, and failure handling. Version system prompts alongside code to track behavioral drift.
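Beyond keeping prompts in version control, one lightweight way to track drift is to log a fingerprint of the prompt with each response. The helper below is an illustrative sketch; the name `promptFingerprint` and the 12-character length are arbitrary choices.

```php
<?php

/**
 * Derive a short, stable fingerprint for a system prompt (hypothetical helper).
 * Logging it next to each response ties observed behavior to a prompt revision.
 */
function promptFingerprint(string $prompt): string
{
    // Normalize line endings and outer whitespace so cosmetic edits do not change the version
    $normalized = trim(str_replace("\r\n", "\n", $prompt));

    return substr(hash('sha256', $normalized), 0, 12);
}

$v1 = promptFingerprint("Always verify stock levels before confirming availability.");
$v2 = promptFingerprint("Always verify stock levels before confirming availability.\n");
$v3 = promptFingerprint("Never fabricate warehouse locations or quantities.");
```

Here `$v1` and `$v2` match because only trailing whitespace differs, while `$v3` differs because the wording changed.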
## Production Bundle

### Action Checklist

### Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Simple Q&A or content generation | Laravel laravel/ai SDK | Linear request/response, lower overhead, native framework integration | Low (single API call) |
| Multi-step workflows with external APIs | Prism PHP with tool calling | Native loop control, result injection, step limits | Medium (multiple steps, capped by maxSteps) |
| Semantic search over internal documents | Prism PHP + pgvector RAG | Reduces hallucination, leverages existing PostgreSQL infrastructure | Medium-High (embedding generation + vector queries) |
| High-availability production systems | Prism PHP with provider fallback | Circuit breaker routing, unified streaming, vendor independence | Variable (optimized via model switching) |
### Configuration Template

```env
# .env
PRISM_DEFAULT_PROVIDER=anthropic
PRISM_DEFAULT_MODEL=claude-sonnet-4-6
PRISM_MAX_TOKENS=1024
PRISM_MAX_STEPS=5
OPENAI_API_KEY=sk-proj-...
ANTHROPIC_API_KEY=sk-ant-api03-...
GEMINI_API_KEY=...
```
```php
// config/prism.php
return [
    'default_provider' => env('PRISM_DEFAULT_PROVIDER', 'anthropic'),
    'default_model' => env('PRISM_DEFAULT_MODEL', 'claude-sonnet-4-6'),
    // env() returns strings for values set in .env, so cast the numeric limits
    'max_tokens' => (int) env('PRISM_MAX_TOKENS', 1024),
    'max_steps' => (int) env('PRISM_MAX_STEPS', 5),
    'providers' => [
        'openai' => ['key' => env('OPENAI_API_KEY')],
        'anthropic' => ['key' => env('ANTHROPIC_API_KEY')],
        'gemini' => ['key' => env('GEMINI_API_KEY')],
    ],
];
```
### Quick Start Guide

- Install the package and publish configuration: `composer require prism-php/prism && php artisan vendor:publish --tag=prism-config`
- Register provider API keys in `.env` and set default provider/model values in `config/prism.php`
- Create a class-based tool definition with structured JSON return payloads and error handling
- Bind an orchestrator service to the container that enforces token ceilings and step limits
- Execute the agent loop via `Prism::text()->using()->withTools()->withMaxSteps()->asText()` and monitor token metrics