Difficulty: Intermediate · Read Time: 8 min

When prompts become shells: the tool registry is the attack surface

By Codcompass Team · 8 min read

Agent Tool Registries: Hardening the Syscall Boundary in LLM Frameworks

Current Situation Analysis

The industry has largely treated Large Language Model (LLM) integration as a content generation problem. Security efforts focus on prompt injection for data exfiltration or model jailbreaking. This perspective is dangerously incomplete. Modern agent frameworks like Microsoft Semantic Kernel, LangChain, and AutoGen do not just generate text; they map LLM outputs to executable system functions. This mapping creates a bridge where untrusted text becomes privileged code execution.

Microsoft's security research team highlighted this paradigm shift in a retrospective published on May 7, 2026, detailing two Critical (CVSS 9.9) vulnerabilities in Semantic Kernel discovered in February 2026. These flaws demonstrate that when an AI framework acts as a foundational layer, a vulnerability in the tool registry becomes a systemic execution risk.

The vulnerabilities were not subtle logic errors; they were structural failures in how tools were registered and exposed:

  1. CVE-2026-26030: The InMemoryVectorStore component accepted user-supplied filter expressions and evaluated them using Python's eval(). The implementation attempted to mitigate risk via an Abstract Syntax Tree (AST) blocklist. However, attackers bypassed this blocklist using undocumented attribute traversal techniques involving __name__, load_module, and BuiltinImporter. This allowed execution of os.system without triggering the Import node detection, resulting in Remote Code Execution (RCE). The issue was patched in semantic-kernel Python version 1.39.4.
  2. CVE-2026-25592: The SessionsPythonPlugin exposed a method named DownloadFileAsync as a kernel function using the [KernelFunction] attribute. This attribute automatically registered the method as a callable tool for the LLM. The method accepted a localFilePath parameter with zero validation, canonicalization, or directory allowlisting. An attacker could craft a prompt causing the agent to write a malicious executable to C:\Windows\Start Menu\Programs\Startup\, achieving host-level persistence and sandbox escape with a single tool invocation. This was patched in Microsoft.SemanticKernel.Plugins.Core version 1.71.0.

These CVEs prove that the LLM is not a security boundary. Any string generated by the model must be treated as untrusted input at the system call level. The tool registry is the new attack surface; every registered function is a potential syscall that an attacker can trigger via prompt manipulation.

WOW Moment: Key Findings

Most security testing for AI agents relies on runtime probing: sending adversarial prompts and observing behavior. While useful, runtime testing has a critical blind spot regarding tool registration. It detects the symptom (a dangerous call occurred) but misses the structural cause (a dangerous function was registered).

The following comparison illustrates the efficacy gap between runtime-only testing and a registry-aware defense strategy.

| Defense Strategy | Detection Scope | Remediation Timing | Coverage of Structural Flaws | False Negative Risk |
|---|---|---|---|---|
| Runtime Probing Only | Invocation behavior | Post-deployment | Low (Misses registration-time flaws) | High |
| Registry-Aware Static Analysis | Decorator usage + Function body | CI/CD Pipeline | High (Catches dangerous primitives in tool bodies) | Medium (Requires whole-program analysis f |
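Registry-aware static analysis of the kind the table describes, as an added example that is not code from this article or from Semantic Kernel: it walks a module's AST, keeps only functions carrying a tool-registration decorator, and reports the dangerous primitives reachable from their bodies, flagging file sinks only when a tool parameter flows into them. The decorator names, the sink lists and the sample module are assumptions constructed for illustration.

```python
import ast

# Constructed sample module (not Semantic Kernel's code) mimicking the two CVE shapes:
# a filter evaluated with eval() and a file write to a caller-chosen path.
SAMPLE_MODULE = '''
import os
from semantic_kernel.functions import kernel_function
@kernel_function(name="filter_records")
def filter_records(expr: str, records: list):
    return [r for r in records if eval(expr, {}, {"r": r})]

@kernel_function(name="download_file")
async def download_file(local_file_path: str, data: bytes):
    with open(local_file_path, "wb") as fh:  # no canonicalization, no allowlist
        fh.write(data)

def helper(cmd: str):
    return os.system(cmd)  # dangerous, but never registered as a tool
'''

TOOL_DECORATORS = {"kernel_function", "KernelFunction", "tool"}  # assumed names
ALWAYS_DANGEROUS = {"eval", "exec", "os.system", "subprocess.run", "subprocess.Popen"}
PATH_SINKS = {"open", "os.remove", "shutil.copy"}  # only flagged when fed a tool parameter

def _call_name(node: ast.Call) -> str:
    f = node.func
    if isinstance(f, ast.Attribute) and isinstance(f.value, ast.Name):
        return f"{f.value.id}.{f.attr}"
    return f.id if isinstance(f, ast.Name) else ""

def _decorator_name(dec: ast.expr) -> str:
    dec = dec.func if isinstance(dec, ast.Call) else dec  # @tool(...) vs @tool
    return dec.attr if isinstance(dec, ast.Attribute) else getattr(dec, "id", "")

def scan_registry(source: str) -> list[tuple[str, str, int]]:
    """Return (tool_name, primitive, line) for each dangerous primitive in a registered tool."""
    findings = []
    for fn in ast.walk(ast.parse(source)):
        if not isinstance(fn, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if not any(_decorator_name(d) in TOOL_DECORATORS for d in fn.decorator_list):
            continue  # not on the registry: outside the LLM-reachable surface
        params = {a.arg for a in fn.args.args + fn.args.kwonlyargs}
        for node in (n for s in fn.body for n in ast.walk(s) if isinstance(n, ast.Call)):
            name = _call_name(node)
            tainted = any(isinstance(a, ast.Name) and a.id in params for a in node.args)
            if name in ALWAYS_DANGEROUS or (name in PATH_SINKS and tainted):
                findings.append((fn.name, name, node.lineno))
    return findings
```

Running it over the sample reports the eval() in filter_records and the parameter-driven open() in download_file, while the undecorated helper is ignored because nothing the model emits can reach it.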
