e execution semantics per attempt.
- Arbitrary time horizons: Workflows can pause for human approval or external callbacks without holding compute resources, using workflow.wait_condition() and signal handlers.
- Full observability: Every state transition is visible in the Temporal Web UI, enabling pause, inspect, and replay capabilities for debugging complex agentic loops.
Core Solution
Temporal operates as a durable execution runtime. Workflow code appears as standard Python async code, but under the hood, the Temporal service records every decision in an immutable event history. If a worker crashes, the next worker replays the history, bypasses completed activities, and resumes execution precisely where it stalled.
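The replay behavior described above can be illustrated with a toy model. This is a minimal sketch in plain Python, not Temporal's implementation: ToyRuntime, research_workflow, and the activity names are hypothetical, and the "history" is just a list of recorded results keyed by call order.

```python
from typing import Callable, List


class ToyRuntime:
    """Toy stand-in for a durable runtime: results are keyed by call order."""

    def __init__(self, history: List[str]) -> None:
        self.history = history          # results of activities that already completed
        self._cursor = 0                # position of the next activity call
        self.executed: List[str] = []   # activities actually run in this process

    def execute_activity(self, name: str, fn: Callable[[], str]) -> str:
        if self._cursor < len(self.history):
            # Replay: reuse the recorded result instead of running the activity
            result = self.history[self._cursor]
        else:
            # First execution: run the activity and record its result
            result = fn()
            self.history.append(result)
            self.executed.append(name)
        self._cursor += 1
        return result


def research_workflow(rt: ToyRuntime) -> str:
    a = rt.execute_activity("search", lambda: "sources")
    b = rt.execute_activity("summarize", lambda: f"summary of {a}")
    return b


# A first worker finished "search" and then crashed; only its history survives.
history = ["sources"]
second_worker = ToyRuntime(history)
final = research_workflow(second_worker)
print(final)                    # summary of sources
print(second_worker.executed)   # ['summarize']
```

The second worker re-runs the workflow function from the top, but only the activity with no recorded result actually executes. This is also why the workflow function has to issue the same calls in the same order on every run.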
Architecture: Workflows vs. Activities
Temporal enforces a strict separation of concerns:
Workflows: Deterministic Orchestrators
Workflows define control flow, handle signals/queries, and coordinate steps. The critical constraint is determinism: workflow code must produce identical decisions during every replay. Non-deterministic operations (datetime.now(), random, filesystem I/O, HTTP requests) are strictly prohibited.
import dataclasses
import datetime
from temporalio import workflow

@dataclasses.dataclass
class ResearchInput:
    query: str
    max_steps: int = 10

@workflow.defn
class ResearchAgentWorkflow:
    def __init__(self) -> None:
        self._paused = False

    @workflow.signal
    async def pause(self) -> None:
        self._paused = True

    @workflow.signal
    async def resume(self) -> None:
        # Without a way to clear the flag, a paused workflow would block forever
        self._paused = False

    @workflow.query
    def is_paused(self) -> bool:
        return self._paused

    @workflow.run
    async def run(self, inp: ResearchInput) -> str:
        results = []
        for step in range(inp.max_steps):
            # Check before every step so a pause signal sent mid-run takes effect
            await workflow.wait_condition(lambda: not self._paused)
            result = await workflow.execute_activity(
                run_research_step,
                args=[inp.query, step],
                start_to_close_timeout=datetime.timedelta(minutes=5),
            )
            results.append(result)
            if "[DONE]" in result:
                break
        return "\n".join(results)
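To see why the determinism constraint matters, it can help to compare two runs of the same workflow function. The sketch below is a toy model in plain Python, not Temporal code: record_commands, replay_matches, and both workflow functions are hypothetical names, and a seeded random.Random stands in for a runtime-provided deterministic source (the Python SDK offers workflow.random() and workflow.now() for this purpose).

```python
import random
from typing import Callable, List


def record_commands(workflow_fn: Callable[[random.Random], List[str]], seed: int) -> List[str]:
    """Run a toy workflow with a seeded RNG and return the commands it issued."""
    return workflow_fn(random.Random(seed))


def deterministic_workflow(rng: random.Random) -> List[str]:
    # All randomness flows through the runtime-provided, seeded generator
    branch = "deep_dive" if rng.random() < 0.5 else "quick_scan"
    return ["search", branch]


def nondeterministic_workflow(rng: random.Random) -> List[str]:
    # Ignores the provided generator and draws fresh OS entropy instead
    token = random.SystemRandom().getrandbits(64)
    return ["search", f"step-{token}"]


def replay_matches(workflow_fn: Callable[[random.Random], List[str]], seed: int = 7) -> bool:
    original = record_commands(workflow_fn, seed)   # first execution
    replayed = record_commands(workflow_fn, seed)   # replay after a crash
    return original == replayed


deterministic_ok = replay_matches(deterministic_workflow)
nondeterministic_ok = replay_matches(nondeterministic_workflow)
```

Here deterministic_ok is True because both runs make the same decisions from the same seed, while nondeterministic_ok is False because the second run issues a different command than the one recorded.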
Activities: Non-Deterministic Workers
All side effects (LLM API calls, database writes, file reads, HTTP requests) must reside in activities. Temporal runs each attempt at most once and manages retries automatically, so an activity that is retried can execute more than once overall and should be safe to repeat.
from temporalio import activity
from temporalio.common import RetryPolicy

@activity.defn
async def run_research_step(query: str, step: int) -> str:
    # Heartbeat keeps Temporal informed the activity is alive
    activity.heartbeat(f"Running step {step}")
    # Your LLM call goes here; if the worker crashes at this point, the activity is retried
    # call_llm is a placeholder for your own async LLM client wrapper
    response = await call_llm(f"Research step {step} for: {query}")
    return response
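Because Temporal retries activities automatically, a side effect inside an activity may run more than once. One common way to make that safe is an idempotency key. The following is a minimal sketch under stated assumptions: ToyStore and save_step_result are hypothetical, and the in-memory dict stands in for a database with a unique-key constraint. In a real activity the workflow ID could come from activity.info().workflow_id.

```python
from typing import Dict


class ToyStore:
    """In-memory stand-in for a database with a unique-key constraint."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}
        self.write_count = 0

    def put_if_absent(self, key: str, value: str) -> str:
        # Only the first write for a key is applied; later writes are no-ops
        if key not in self.rows:
            self.rows[key] = value
            self.write_count += 1
        return self.rows[key]


def save_step_result(store: ToyStore, workflow_id: str, step: int, text: str) -> str:
    # Key derived from stable identifiers, so a retried attempt maps to the same row
    key = f"{workflow_id}:{step}"
    return store.put_if_absent(key, text)


store = ToyStore()
first = save_step_result(store, "research-001", 0, "draft A")
# Simulated retry: the attempt runs again, possibly producing different text
second = save_step_result(store, "research-001", 0, "draft B")
print(first, second, store.write_count)  # draft A draft A 1
```

The retry returns the originally stored value instead of writing a second row, so downstream steps see one consistent result per step.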
Retry behavior is declarative. You can tune backoff, cap attempts, and exclude specific errors, then pass the policy to workflow.execute_activity() through its retry_policy argument:
retry_policy = RetryPolicy(
    initial_interval=datetime.timedelta(seconds=1),
    backoff_coefficient=2.0,
    maximum_interval=datetime.timedelta(seconds=60),
    maximum_attempts=5,
    non_retryable_error_types=["InvalidInputError", "RateLimitExceeded"],
)
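To get a feel for what these numbers mean in practice, the wait before each retry can be computed by hand. This is a rough sketch of capped exponential backoff using the same parameter names as the policy above; retry_delays is a hypothetical helper, not part of the Temporal SDK, and it ignores any jitter or server-side scheduling delay.

```python
import datetime
from typing import List


def retry_delays(
    initial_interval: datetime.timedelta,
    backoff_coefficient: float,
    maximum_interval: datetime.timedelta,
    maximum_attempts: int,
) -> List[datetime.timedelta]:
    """Delays before attempt 2, 3, ... under capped exponential backoff."""
    delays = []
    for retry in range(maximum_attempts - 1):  # the first attempt has no delay
        delay = initial_interval * (backoff_coefficient ** retry)
        delays.append(min(delay, maximum_interval))
    return delays


delays = retry_delays(
    initial_interval=datetime.timedelta(seconds=1),
    backoff_coefficient=2.0,
    maximum_interval=datetime.timedelta(seconds=60),
    maximum_attempts=5,
)
print([d.total_seconds() for d in delays])  # [1.0, 2.0, 4.0, 8.0]
```

With five attempts the policy waits about 15 seconds in total before giving up. The 60-second cap only starts to matter from the eighth attempt onward.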
Local Development & Worker Setup
Install the SDK and CLI, then launch the dev server:
pip install temporalio==1.10.0
brew install temporal # macOS
# Windows/Linux: download from github.com/temporalio/cli
temporal server start-dev
# Temporal Service: localhost:7233
# Web UI: http://localhost:8233
Connect and execute workflows via the Python worker:
import asyncio
from temporalio.client import Client
from temporalio.worker import Worker

async def main():
    client = await Client.connect("localhost:7233")
    async with Worker(
        client,
        task_queue="ai-agents",
        workflows=[ResearchAgentWorkflow],
        activities=[run_research_step],
    ):
        # Worker is running; start workflows via client
        handle = await client.start_workflow(
            ResearchAgentWorkflow.run,
            ResearchInput(query="transformer attention mechanisms"),
            id="research-001",
            task_queue="ai-agents",
        )
        result = await handle.result()
        print(result)

if __name__ == "__main__":
    asyncio.run(main())
OpenAI Agents SDK Integration
Temporal's native OpenAI Agents SDK integration (GA March 2026) bridges agentic tooling with durable execution. TemporalRunner wraps the OpenAI runner so every agent invocation executes as a Temporal Activity, while activity_as_tool automatically converts Temporal activities into OpenAI-compatible tool schemas:
from agents import Agent  # the openai-agents package is imported as "agents"
Pitfall Guide
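- Violating Workflow Determinism: Introducing non-deterministic calls (datetime.now(), random, requests.get()) inside workflow code breaks event history replay. Temporal raises a non-determinism error (NondeterminismError in the Python SDK). Use workflow.now() and workflow.random() when workflow code needs the time or random values, and always delegate side effects to activities.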
- Mixing I/O into Workflows: Placing LLM calls, database queries, or file operations directly in workflow functions couples orchestration with execution. This defeats durable execution guarantees and causes replay failures. Use workflow.execute_activity() for all external interactions.
- Omitting Activity Heartbeats: Long-running activities that configure a heartbeat_timeout but stop heartbeating are marked as timed out by the Temporal service, triggering unnecessary retries. Call activity.heartbeat() periodically to report progress; each call resets the heartbeat timeout (not start_to_close_timeout).
- Misconfiguring Retry Policies: Relying on defaults can cause infinite retry loops on non-recoverable errors (e.g., 429 Rate Limit, 400 Invalid Input). Explicitly define non_retryable_error_types and cap maximum_attempts to prevent token waste and downstream API throttling.
- Using Ephemeral Dev Storage in Production-Like Tests: temporal server start-dev runs entirely in memory. Without --db-filename temporal.db, workflow state vanishes on restart. Always persist the dev database when testing crash recovery or long-running workflows.
- Blocking the Async Event Loop: Temporal workflows run on Python's asyncio event loop. Synchronous blocking calls (time.sleep, requests.post, CPU-heavy loops) will stall the entire worker thread. Always use await with async-compatible libraries or offload CPU work to separate activity workers.
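The last pitfall is easy to reproduce outside Temporal. The sketch below uses only the standard library (Python 3.9+ for asyncio.to_thread); blocking_call, ticker, and run_demo are hypothetical names, and the ticker stands in for any other work sharing the loop, such as periodic heartbeats.

```python
import asyncio
import time


def blocking_call() -> str:
    # Stand-in for a synchronous SDK call or CPU-bound function
    time.sleep(0.2)
    return "done"


async def ticker(counter: list) -> None:
    # Stand-in for other work on the loop, such as periodic heartbeats
    while True:
        counter[0] += 1
        await asyncio.sleep(0.02)


async def run_demo(offload: bool) -> int:
    counter = [0]
    task = asyncio.create_task(ticker(counter))
    await asyncio.sleep(0)  # let the ticker start
    before = counter[0]
    if offload:
        await asyncio.to_thread(blocking_call)  # loop stays free while the thread sleeps
    else:
        blocking_call()                         # loop is frozen for the whole call
    ticks = counter[0] - before
    task.cancel()
    return ticks


ticks_blocked = asyncio.run(run_demo(offload=False))
ticks_offloaded = asyncio.run(run_demo(offload=True))
print(ticks_blocked, ticks_offloaded)
```

When the call runs directly on the loop, the ticker makes no progress at all; when it is offloaded, the ticker keeps running. Inside a Temporal worker, an alternative is to define the activity as a plain def function and give the Worker a thread pool through its activity_executor argument.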
Deliverables
- Durable AI Agent Architecture Blueprint: Complete reference architecture detailing workflow/activity boundary design, signal/query patterns for human-in-the-loop approval, and event history replay strategies for multi-step agentic research.
- Production Readiness Checklist: 12-point validation matrix covering determinism auditing, heartbeat implementation, retry policy tuning, idempotency verification, observability dashboard configuration, and fallback circuit breakers.
- Configuration Templates: Ready-to-deploy temporal_worker.py scaffold, retry_policy.json profiles (aggressive backoff vs. conservative token-saving), and openai_agent_integration.py boilerplate for seamless OpenAI Agents SDK adoption.