How to Inject Hidden Runtime Context into AI Agent Tools (LangChain + LangGraph)

By Codcompass Team·2026-05-22·7 min read

Current Situation Analysis

Multi-tenant AI applications face a critical architectural vulnerability: unauthorized cross-tenant data access. When engineering agents that query internal knowledge bases, CRMs, or document repositories, developers frequently rely on system prompts to instruct the model to respect tenant boundaries. This approach is fundamentally flawed. Large language models are probabilistic text generators, not deterministic access control engines. They cannot guarantee enforcement of isolation policies, especially when faced with prompt injection, context window manipulation, or ambiguous user intent.

The industry standard has historically been to pass tenant identifiers (e.g., organization_id, workspace_slug) as explicit parameters in tool schemas. This exposes authorization metadata directly to the model and, by extension, to the end-user. Attackers can manipulate these parameters through indirect prompt injection or by crafting queries that force the model to override default scoping. Security audits of early-stage agentic products consistently show that explicit parameter passing creates a mutable attack surface, leading to data leakage across organizational boundaries.

The correct architectural pattern requires decoupling authorization logic from the model's reasoning loop entirely. Tenant context must be injected at the framework execution layer, completely invisible to the LLM's function-calling schema. This shifts data isolation from a "suggestion" to a "hard constraint," ensuring that every tool invocation operates within verified boundaries regardless of what the model attempts to generate.
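The difference between the two approaches can be illustrated without any framework. The following is a minimal sketch, not LangChain code: the `DOCS` store, both search functions, and the `allowed` filter are hypothetical names invented for illustration. It shows how a model-generated tenant argument can widen the data scope, while a server-supplied tenant cannot be overridden.

```python
# Hypothetical in-memory store; tenant names are illustrative only.
DOCS = {
    "acme_corp": ["Q3 Financial Projections"],
    "globex_inc": ["Vendor Compliance Checklist"],
}


def search_explicit(organization_id: str, keyword: str) -> list[str]:
    """Anti-pattern: the tenant comes from model-generated arguments."""
    docs = DOCS.get(organization_id, [])
    return [d for d in docs if keyword.lower() in d.lower()]


def search_injected(keyword: str, *, verified_org: str) -> list[str]:
    """The tenant comes from the auth layer; the model only supplies keyword."""
    docs = DOCS.get(verified_org, [])
    return [d for d in docs if keyword.lower() in d.lower()]


# Arguments as a prompt-injected model might emit them.
model_args = {"organization_id": "globex_inc", "keyword": "vendor"}
verified_org = "acme_corp"  # what the auth layer knows about the caller

# Explicit parameter passing: the forged tenant is honoured.
leaked = search_explicit(**model_args)

# Server-side injection: only schema-declared arguments are forwarded,
# and the tenant is bound from the verified session.
allowed = {"keyword"}
safe_args = {k: v for k, v in model_args.items() if k in allowed}
scoped = search_injected(**safe_args, verified_org=verified_org)

print(leaked)  # another tenant's document
print(scoped)  # empty: acme_corp has no matching document
```

In the second call the forged `organization_id` is simply dropped, so the worst a manipulated model can do is search its own tenant with an odd keyword.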

WOW Moment: Key Findings

The following comparison demonstrates why server-side context injection fundamentally outperforms traditional agent scoping methods:

Approach	Security Surface	Data Isolation Guarantee	Audit Trail Capability	Implementation Overhead
Prompt-Enforced Scoping	High (LLM decides)	Weak (relies on instructions)	Low (no guaranteed tenant ID)	Low
Explicit Parameter Passing	Medium (visible to LLM)	Medium (can be overridden)	Medium (logged but mutable)	Medium
Server-Side Context Injection	Minimal (hidden from LLM)	Strong (enforced at runtime)	High (immutable tenant ID)	Low-Medium

This finding matters because it resolves the core tension between agent flexibility and security compliance. By stripping tenant identifiers from the tool schema and injecting them at execution time, you eliminate the model's ability to influence data scope. This supports the tenant-isolation controls expected in SOC 2, HIPAA, and GDPR-oriented deployments without rewriting core agent logic or implementing complex middleware filters, although it does not make a system compliant on its own. The pattern scales horizontally across any number of tools and graph nodes while maintaining a single source of truth for authorization.

Core Solution

The implementation relies on LangChain's ToolRuntime and context_schema primitives. These components work together to separate user intent from system authorization. The framework automatically strips runtime parameters from the OpenAI function-calling schema, ensuring the LLM only receives the parameters it needs to reason about user intent. At execution time, the graph runner injects the verified context directly into the tool's signature.

Architecture Decisions & Rationale

Schema Stripping: The runtime parameter is excluded from the JSON schema sent to the model. This prevents the LLM from attempting to generate or modify tenant identifiers.
Immutable Context: The context object is passed by reference but treated as immutable during tool execution. This guarantees auditability and prevents accidental state mutation.
Graph-Level Wiring: context_schema is declared at the agent creation level, not per-tool. This centralizes authorization configuration and reduces boilerplate across large toolsets.
Separation of Concerns: User messages flow through the messages array. Authorization data flows through the context argument. This strict boundary prevents prompt contamination.
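The "Immutable Context" decision can be enforced by the language rather than by convention. The following is a minimal sketch using only the standard library; `SessionContext` is a hypothetical name, and the example assumes the context is a dataclass declared with `frozen=True`.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class SessionContext:
    """Hypothetical context type; frozen so fields cannot be reassigned."""
    account_id: str
    workspace_slug: str


ctx = SessionContext(account_id="usr_8842", workspace_slug="acme_corp")

# A tool that tries to change the scope fails loudly instead of
# silently corrupting the context for later tool calls.
try:
    ctx.workspace_slug = "globex_inc"
    mutated = True
except FrozenInstanceError:
    mutated = False

# If a transformed value is needed, derive a local variable instead.
index_name = f"docs_{ctx.workspace_slug}"

print(mutated, ctx.workspace_slug, index_name)
```

Note that `frozen=True` only blocks attribute reassignment. A list or dict stored in a field can still be mutated in place, which is one reason to keep context fields to strings, ints, enums, and tuples.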

Implementation

from dataclasses import dataclass
from langchain.agents import create_agent
from langchain.chat_models import init_chat_model
from langchain.tools import ToolRuntime, tool

# ── 1. Define the Authorization Schema ────────────────────────────────────────
@dataclass(frozen=True)  # frozen so tools cannot reassign session fields
class WorkspaceSession:
    """Server-side payload extracted from verified authentication tokens."""
    account_id: str
    workspace_slug: str
    role_level: str

# ── 2. Mock Data Repository ───────────────────────────────────────────────────
class InternalKnowledgeBase:
    """Simulates a multi-tenant document store."""
    _store = {
        "acme_corp": [
            "Q3 Financial Projections",
            "Engineering Onboarding Playbook",
            "Client SLA Templates",
        ],
        "globex_inc": [
            "Supply Chain Logistics Map",
            "Vendor Compliance Checklist",
            "Regional Marketing Assets",
        ],
    }

    @classmethod
    def fetch(cls, workspace_slug: str, keyword: str) -> list[str]:
        workspace_docs = cls._store.get(workspace_slug, [])
        return [doc for doc in workspace_docs if keyword.lower() in doc.lower()]

# ── 3. Tool Definition with Runtime Injection ─────────────────────────────────
@tool
def query_internal_docs(search_term: str, runtime: ToolRuntime[WorkspaceSession]) -> str:
    """
    Retrieve internal documentation based on a keyword.
    """
    # Context is injected by the framework, never generated by the LLM
    session = runtime.context
    
    # Execute scoped query
    results = InternalKnowledgeBase.fetch(
        workspace_slug=session.workspace_slug,
        keyword=search_term
    )
    
    if not results:
        return f"No documents matched '{search_term}' in your workspace."
    
    # Format response with implicit audit trail
    formatted = "\n- ".join(results)
    return (
        f"Retrieved {len(results)} document(s) for workspace '{session.workspace_slug}':\n"
        f"- {formatted}"
    )

# ── 4. Agent Assembly ─────────────────────────────────────────────────────────
reasoning_model = init_chat_model("openai:gpt-4o-mini")

agent_graph = create_agent(
    model=reasoning_model,
    tools=[query_internal_docs],
    context_schema=WorkspaceSession,  # Declares expected context shape
)

# ── 5. Invocation with Server-Side Context ────────────────────────────────────
def handle_user_request(user_input: str, auth_payload: dict) -> str:
    """
    Entry point from API gateway. Auth payload comes from middleware.
    """
    verified_session = WorkspaceSession(
        account_id=auth_payload["sub"],
        workspace_slug=auth_payload["org_slug"],
        role_level=auth_payload["role"],
    )
    
    graph_response = agent_graph.invoke(
        {"messages": [{"role": "user", "content": user_input}]},
        context=verified_session,
    )
    
    return graph_response["messages"][-1].content

# Example execution
response = handle_user_request(
    user_input="Find the onboarding materials",
    auth_payload={
        "sub": "usr_8842",
        "org_slug": "acme_corp",
        "role": "engineer"
    }
)
print(response)

Why This Works

The ToolRuntime[WorkspaceSession] parameter acts as a framework-level hook. When create_agent prepares the tool for the LLM, it inspects the signature, identifies ToolRuntime as a special framework type, and excludes it from the generated tool schema, so the model receives only search_term: str. During graph execution, the tool-execution step receives the model's tool call, retrieves the context object passed at invocation, and binds it to the runtime parameter before executing the Python function. The model therefore never supplies or alters the tenant scope. It sees tenant data only if a tool echoes it in its return value, as the example does with workspace_slug, so keep sensitive fields out of tool output.
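The two steps described above, hiding a parameter from the model-facing schema and binding it at call time, can be approximated in a few lines of plain Python. This is a conceptual sketch only and not how LangChain implements it: `FakeRuntime`, `model_visible_params`, and `call_tool` are hypothetical helpers, and identifying the injected parameter by the name `runtime` is an assumption made to keep the example short.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable

INJECTED = "runtime"  # assumption: the injected parameter is found by name


@dataclass(frozen=True)
class FakeRuntime:
    """Hypothetical stand-in for the framework's runtime object."""
    context: Any


def model_visible_params(fn: Callable) -> list[str]:
    """Parameter names that would appear in the model-facing schema."""
    return [name for name in inspect.signature(fn).parameters if name != INJECTED]


def call_tool(fn: Callable, model_args: dict, context: Any) -> Any:
    """Drop anything the schema did not declare, then bind the context."""
    visible = set(model_visible_params(fn))
    clean = {k: v for k, v in model_args.items() if k in visible}
    return fn(**clean, runtime=FakeRuntime(context=context))


def query_docs(search_term: str, runtime: FakeRuntime) -> str:
    return f"{runtime.context['workspace_slug']}:{search_term}"


visible = model_visible_params(query_docs)
result = call_tool(
    query_docs,
    # The model tries to smuggle its own "runtime" value; it is discarded.
    {"search_term": "onboarding", "runtime": "forged"},
    context={"workspace_slug": "acme_corp"},
)

print(visible)
print(result)
```

The key property is that the set of model-supplied arguments and the injected argument never overlap, so there is no code path by which generated text reaches the `runtime` slot.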

Pitfall Guide

Pitfall	Explanation	Fix
Leaking context in docstrings	Including `workspace_slug` or `account_id` in the tool's description or docstring causes the LLM to see it in the schema, defeating the injection pattern.	Keep docstrings focused on user intent. Reference context only via `runtime.context` inside the function body.
Mutating context inside tools	Assigning new values to `runtime.context` properties breaks immutability guarantees and corrupts audit trails across subsequent tool calls.	Treat `runtime.context` as read-only. Create local variables if transformation is needed.
Omitting context at invocation	Calling `agent.invoke()` without the `context` argument does not fail at a predictable point: depending on the framework version, the call may be rejected or `runtime.context` may arrive as `None`, so tools either crash on attribute access or run without a tenant scope.	Always validate that `context` is provided before graph execution. Wrap invocation in a middleware guard.
Using context for business logic	Relying on `role_level` or `account_id` to drive conditional branching inside tools mixes authorization with domain logic, creating tight coupling.	Use context strictly for scoping and audit logging. Route business rules through separate policy engines or graph nodes.
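Assuming context availability in all nodes	`ToolRuntime` injection applies to tools. Standard LangGraph nodes and state reducers do not receive the context unless they explicitly ask for it.	In recent LangGraph versions, declare a `runtime: Runtime[YourContext]` parameter (from `langgraph.runtime`) on nodes that need tenant data, or pass the required fields through `RunnableConfig`. Verify the behavior against the version you deploy.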
Mixing user input with context parameters	Allowing the user to supply `workspace_slug` in the prompt while also injecting it via context creates conflicting sources of truth.	Strip tenant identifiers from user-facing prompts. Validate that `context` is the sole source of authorization data.
Ignoring checkpoint serialization	If `WorkspaceSession` contains non-serializable objects (e.g., database connections), LangGraph's checkpointing will fail.	Keep context dataclasses strictly serializable (strings, ints, enums). Resolve connections inside tools, not in context.
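The serialization pitfall can be caught before the graph ever runs. The following is a minimal sketch under one loud assumption: it uses a JSON round trip as a proxy for "serializable", which may be stricter or looser than the serializer your checkpointer actually uses. `ensure_serializable` and `BadSession` are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any


@dataclass(frozen=True)
class WorkspaceSession:
    account_id: str
    workspace_slug: str
    role_level: str


@dataclass(frozen=True)
class BadSession:
    """Hypothetical context that smuggles in a live object."""
    workspace_slug: str
    db_handle: Any


def ensure_serializable(ctx: Any) -> str:
    """Return the JSON form of a context, or raise ValueError if it has none."""
    try:
        return json.dumps(asdict(ctx), sort_keys=True)
    except TypeError as exc:
        raise ValueError(f"{type(ctx).__name__} is not serializable: {exc}") from exc


good = ensure_serializable(
    WorkspaceSession(account_id="usr_8842", workspace_slug="acme_corp", role_level="engineer")
)

try:
    ensure_serializable(BadSession(workspace_slug="acme_corp", db_handle=object()))
    rejected = False
except ValueError:
    rejected = True

print(good)
print(rejected)
```

Running a check like this in the middleware that builds the context keeps database handles and clients out of it, so they are resolved inside tools instead.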

Production Bundle

Action Checklist

Define a strict, serializable dataclass for tenant context containing only authorization metadata
Register context_schema at the agent creation level, not per-tool
Verify tool docstrings contain zero references to tenant identifiers or session data
Implement middleware that extracts context from verified JWTs or session tokens before graph invocation
Add runtime validation to reject invocations missing the context argument
Keep tenant context out of graph state so that checkpointers do not persist it, and confirm this against the checkpointer you use
Log account_id and workspace_slug at tool entry points for audit compliance
Unit test tools with mock ToolRuntime objects to verify context injection behavior
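Several checklist items (building context from verified claims, rejecting invocations without context, and audit logging) can live in one thin wrapper around the agent. The following is a minimal sketch: `session_from_claims` and `guarded_invoke` are hypothetical helpers, the claim names `sub`, `org_slug`, and `role` mirror the example payload used earlier, and `fake_invoke` stands in for `agent_graph.invoke` so the snippet runs without LangChain installed. Token verification itself is assumed to have happened upstream.

```python
import logging
from dataclasses import dataclass
from typing import Any, Callable, Mapping, Optional

logger = logging.getLogger("agent.audit")

REQUIRED_CLAIMS = ("sub", "org_slug", "role")


@dataclass(frozen=True)
class WorkspaceSession:
    account_id: str
    workspace_slug: str
    role_level: str


def session_from_claims(claims: Mapping[str, str]) -> WorkspaceSession:
    """Build the context from already-verified token claims."""
    missing = [key for key in REQUIRED_CLAIMS if not claims.get(key)]
    if missing:
        raise PermissionError(f"missing claims: {', '.join(missing)}")
    return WorkspaceSession(claims["sub"], claims["org_slug"], claims["role"])


def guarded_invoke(
    invoke: Callable[..., Any],
    user_input: str,
    session: Optional[WorkspaceSession],
) -> Any:
    """Refuse to run the agent unless a tenant context is present."""
    if session is None:
        raise PermissionError("refusing to run agent without tenant context")
    logger.info(
        "agent_invoke account=%s workspace=%s",
        session.account_id,
        session.workspace_slug,
    )
    payload = {"messages": [{"role": "user", "content": user_input}]}
    return invoke(payload, context=session)


def fake_invoke(payload: dict, context: WorkspaceSession) -> dict:
    """Stand-in for agent_graph.invoke, used only for this sketch."""
    return {
        "workspace": context.workspace_slug,
        "text": payload["messages"][0]["content"],
    }


session = session_from_claims(
    {"sub": "usr_8842", "org_slug": "acme_corp", "role": "engineer"}
)
outcome = guarded_invoke(fake_invoke, "Find the onboarding materials", session)
print(outcome)
```

In a real service, `guarded_invoke` would receive `agent_graph.invoke` as its first argument, which keeps the guard testable without a model.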

Decision Matrix

Scenario	Recommended Approach	Why	Cost Impact
Single-tenant internal tool	Explicit parameters in tool schema	Simpler implementation, no multi-tenant overhead	Low
Multi-tenant SaaS with strict compliance	Server-side context injection via `ToolRuntime`	Guarantees isolation, prevents prompt injection, audit-ready	Medium
Hybrid (public + private data)	Context injection + explicit public tool parameters	Separates authorized scoping from open queries	Medium-High
High-throughput inference pipeline	Context injection + async tool execution	Reduces schema payload size, improves token efficiency	Low

Configuration Template

# auth_context.py
from dataclasses import dataclass, field

@dataclass(frozen=True)
class TenantContext:
    """Immutable authorization payload for agent tool injection."""
    tenant_id: str
    user_id: str
    # A tuple keeps the permission set immutable; frozen=True alone only blocks reassignment
    permissions: tuple[str, ...] = ()
    # Still a mutable dict: treat it as read-only and keep its values serializable
    metadata: dict = field(default_factory=dict)

# agent_factory.py
from langchain.agents import create_agent
from langchain.chat_models import init_chat_model
from langchain.tools import ToolRuntime, tool
from .auth_context import TenantContext

@tool
def execute_scoped_query(query: str, runtime: ToolRuntime[TenantContext]) -> str:
    ctx = runtime.context
    # Authorization enforcement happens here
    if "read:documents" not in ctx.permissions:
        return "Access denied: insufficient permissions."
    # Proceed with scoped execution...
    return f"Query executed for tenant {ctx.tenant_id}"

def build_tenant_agent(model_name: str = "openai:gpt-4o-mini"):
    model = init_chat_model(model_name)
    return create_agent(
        model=model,
        tools=[execute_scoped_query],
        context_schema=TenantContext,
    )
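The permission check in the template is easy to unit test if the tool body is kept as a plain function and the runtime is replaced by a stand-in. The following is a sketch under stated assumptions: `scoped_query_logic` is a hypothetical extraction of the tool body, `fake_runtime` builds a `SimpleNamespace` that only mimics the `.context` attribute, and `TenantContext` is redeclared locally so the snippet runs on its own. It does not exercise the real `ToolRuntime` class.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass(frozen=True)
class TenantContext:
    tenant_id: str
    user_id: str
    permissions: tuple[str, ...] = ()


def scoped_query_logic(query: str, runtime) -> str:
    """Same checks as the tool body, without the @tool decorator."""
    ctx = runtime.context
    if "read:documents" not in ctx.permissions:
        return "Access denied: insufficient permissions."
    return f"Query executed for tenant {ctx.tenant_id}"


def fake_runtime(**fields) -> SimpleNamespace:
    """Stand-in exposing only the .context attribute the tool reads."""
    return SimpleNamespace(context=TenantContext(**fields))


denied = scoped_query_logic(
    "quarterly report",
    fake_runtime(tenant_id="t_1", user_id="u_1"),
)
granted = scoped_query_logic(
    "quarterly report",
    fake_runtime(tenant_id="t_1", user_id="u_1", permissions=("read:documents",)),
)

print(denied)
print(granted)
```

Keeping the logic in a plain function and having the decorated tool delegate to it means the authorization path can be tested without a model, a graph, or network access.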

Quick Start Guide

Install dependencies: pip install langchain langgraph langchain-openai
Define your context schema: Create a frozen dataclass containing only tenant identifiers and permission flags.
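Wire your tools: Add a runtime: ToolRuntime[YourContext] parameter to every @tool function that needs tenant data. Its position in the signature does not matter, because the framework recognizes it by its type and leaves it out of the schema.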
Initialize the agent: Pass context_schema=YourContext to create_agent().
Invoke with middleware: Extract verified session data from your auth layer, instantiate the context object, and pass it via the context argument during agent.invoke().
