ation**: Arguments are converted into a Pydantic model. This ensures that the LLM's output is validated against strict types before execution. If the model generates invalid arguments, the framework can catch this early or provide structured feedback.
3. Docstring as Instruction: In this architecture, docstrings are not just documentation; they are routing instructions. The agent's tool selector uses semantic similarity and keyword matching against the description to decide invocation.
Implementation Example
The following example demonstrates a production-grade tool definition. It uses Annotated types to provide field-level descriptions, which improves the LLM's ability to populate arguments correctly. This is a critical enhancement over basic type hints.
import os
import logging
from typing import Annotated, Literal
from langchain_core.tools import tool
logger = logging.getLogger(__name__)
@tool
def compute_risk_index(
sector_code: Annotated[str, "Industry sector code (e.g., 'FIN', 'TECH', 'HEALTH'). Required."],
volatility_threshold: Annotated[float, "Minimum volatility score to trigger alert. Default: 0.75"] = 0.75,
mode: Annotated[Literal["fast", "deep"], "Analysis mode. 'fast' uses cached data; 'deep' queries live sources."] = "fast"
) -> str:
"""Calculates a proprietary risk index for a given sector.
Use this tool when the user requests a risk assessment, sector analysis,
or volatility check. Do not use for general market queries.
Returns a JSON string containing the risk score and alert status.
"""
try:
# Simulate secure constant injection
api_key = os.environ.get("RISK_ENGINE_API_KEY")
if not api_key:
return '{"error": "Configuration missing. Risk engine unavailable."}'
# Business logic simulation
base_score = 50.0
if sector_code == "FIN":
base_score += 20.0
elif sector_code == "TECH":
base_score += 10.0
risk_score = base_score * (1.0 + volatility_threshold)
is_alert = risk_score > 80.0
result = {
"sector": sector_code,
"risk_score": round(risk_score, 2),
"alert_triggered": is_alert,
"mode": mode
}
logger.info(f"Risk index computed for {sector_code}: {risk_score}")
return str(result)
except Exception as e:
logger.error(f"Risk computation failed: {e}")
return f'{{"error": "Execution failed. Details: {str(e)}"}}'
Rationale for Choices:
Annotated Types: Providing descriptions for individual arguments helps the LLM understand the expected format and constraints for each parameter, reducing argument hallucination.
- Literal Types: Restricting
mode to specific values prevents the model from inventing invalid modes.
- Structured Error Returns: The tool returns JSON-formatted error strings. This allows the agent to parse the failure and potentially retry with different arguments or inform the user, rather than crashing the loop.
- Environment Injection: Secrets are retrieved via
os.environ, preventing leakage into the tool definition or logs.
Agent Integration
Once defined, tools are injected into the ReAct agent's execution environment. The agent iteratively reasons, selects tools, executes them, and observes outputs until a final answer is generated.
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
# 1. Register tools
tools = [compute_risk_index]
# 2. Initialize model
model = ChatOpenAI(model="gpt-4o", temperature=0.0)
# 3. Build agent
agent = create_react_agent(model, tools)
# 4. Invoke with query
query = "What is the risk index for the TECH sector with high volatility?"
response = agent.invoke({"messages": [("human", query)]})
# Extract final response
final_output = response["messages"][-1].content
print(final_output)
Execution Flow:
- The agent receives the query and analyzes the available tools.
- Based on the docstring of
compute_risk_index, the agent determines this tool is relevant.
- The agent generates arguments:
sector_code="TECH", volatility_threshold=0.9 (inferred from "high"), mode="fast".
- The framework validates arguments against the Pydantic schema.
- The tool executes, returns a result, and the agent incorporates this into the final response.
Pitfall Guide
Production experience reveals recurring failure modes in agent tooling. Addressing these proactively prevents runtime instability.
-
Semantic Drift in Descriptions
- Explanation: Vague docstrings like "Calculates value" lack semantic boundaries. The agent may invoke the wrong tool or skip execution.
- Fix: Use conditional triggers. Structure descriptions as: "Use this tool when [condition]. Do not use for [exclusion]." Include examples of valid inputs in the description if necessary.
-
Schema Ambiguity via Loose Types
- Explanation: Using
Any, dict, or omitting type hints forces the LLM to guess argument structures. This leads to ValidationError exceptions that break the agent loop.
- Fix: Enforce strict typing. Use
str, int, float, bool, and Literal enums. Avoid generic containers unless the structure is explicitly documented in the argument description.
-
Concurrency Hazards in Stateful Tools
- Explanation: Tools relying on global variables, module-level caches, or unthread-safe database connections cause race conditions during parallel tool execution or retries.
- Fix: Design tools to be stateless. Pass all necessary context via arguments. If state is required, use dependency injection or external state management services.
-
Monolithic Tool Anti-Pattern
- Explanation: Combining database queries, external API calls, and complex calculations in a single tool obscures the ReAct reasoning chain. If one sub-step fails, the entire tool fails, and the agent cannot isolate the error.
- Fix: Adhere to the Single Responsibility Principle. Decompose complex workflows into granular tools. This allows the agent to chain tools and handle partial failures gracefully.
-
Unstructured Error Returns
- Explanation: Unhandled exceptions or raw tracebacks crash the execution pipeline. LLMs expect string or JSON-serializable outputs.
- Fix: Wrap tool logic in
try/except blocks. Catch domain-specific errors and return structured failure messages that the agent can interpret. Example: return '{"error": "Timeout exceeded. Retry with lower depth."}'.
-
Credential Leakage
- Explanation: Hardcoding API keys, secrets, or business constants in tool functions risks exposure through prompt injection or log leakage.
- Fix: Inject sensitive values via environment variables, secure vaults, or dependency injection. Never embed secrets in the function body or docstrings.
-
Idempotency Neglect
- Explanation: Agents may retry tool calls due to transient errors or reasoning loops. Non-idempotent tools can cause duplicate transactions or data corruption.
- Fix: Ensure tools are idempotent where possible. Use idempotency keys for external API calls. Document side effects clearly in the tool description.
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|
| Simple Python Function | @tool Decorator | Fastest integration, auto-schema, type-safe. | Low |
| Complex Async/External API | StructuredTool or Custom BaseTool | Requires custom validation, async handling, or complex serialization. | Medium |
| Legacy REST Endpoint | API Wrapper + @tool | Bridge needed to adapt legacy response formats to agent expectations. | Medium |
| High-Security Operation | @tool with Vault Injection | Ensures secrets are managed securely and audit trails are maintained. | Low |
| Multi-Step Workflow | Decomposed Tools + Agent Chain | Allows granular error handling and reasoning. Avoids monolithic failures. | Low |
Configuration Template
This template provides a robust foundation for production tools, including logging, error handling, and secure configuration.
import os
import logging
from typing import Annotated, Literal
from langchain_core.tools import tool
# Configure logging for observability
logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)
@tool
def execute_financial_audit(
account_id: Annotated[str, "Unique account identifier. Format: ACC-XXXX."],
audit_depth: Annotated[Literal[1, 2, 3], "Audit depth: 1=Summary, 2=Standard, 3=Full."] = 1,
include_history: Annotated[bool, "Include historical transactions in report."] = False
) -> str:
"""Performs a financial audit check on the specified account.
Use this tool when the user requests an audit, compliance check,
or financial review. Returns a JSON string with audit results.
"""
try:
# Secure configuration retrieval
api_key = os.environ.get("FIN_AUDIT_API_KEY")
if not api_key:
logger.error("Audit API key not configured.")
return '{"error": "Configuration missing. Audit cannot proceed."}'
# Simulate audit logic
logger.info(f"Audit requested for {account_id}, depth={audit_depth}")
# Business logic placeholder
findings = []
if audit_depth >= 2:
findings.append("Standard compliance checks passed.")
if audit_depth == 3:
findings.append("Deep transaction analysis completed.")
result = {
"account_id": account_id,
"status": "success",
"depth": audit_depth,
"findings": findings,
"history_included": include_history
}
return str(result)
except Exception as e:
logger.error(f"Audit execution failed for {account_id}: {e}")
return f'{{"error": "Audit failed. Details: {str(e)}"}}'
Quick Start Guide
- Install Dependencies: Ensure
langchain-core and langgraph are installed.
pip install langchain-core langgraph langchain-openai
- Define Tool: Create a Python function with type hints and a descriptive docstring. Apply the
@tool decorator.
- Create Agent: Initialize the model and pass the tool list to
create_react_agent.
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
model = ChatOpenAI(model="gpt-4o")
agent = create_react_agent(model, [execute_financial_audit])
- Invoke: Call the agent with a query. The agent will automatically select and execute the tool.
response = agent.invoke({"messages": [("human", "Audit account ACC-1234 with full depth.")]})
print(response["messages"][-1].content)
- Validate: Inspect
tool.args to verify schema generation. Test with edge cases to ensure error handling works as expected.
By adopting schema-driven tooling patterns, development teams can significantly reduce integration overhead, improve agent reliability, and maintain strict security boundaries. The @tool decorator serves as a critical abstraction layer, transforming Python functions into deterministic, LLM-ready execution units.