s |
| Supply Chain Audit | npm audit, pip-audit | None / Manual Review Only |
Why this matters: Treating configurations as data leads to a false sense of security. Hash checks on configuration files can pass even if the behavior changes dynamically via external fetches. Without scanning agent configs with the same rigor as source code, organizations are effectively running unverified instructions in their development and production environments.
Core Solution
Securing the agent context requires a "Config-as-Code" security approach. Configuration files must be inventoried, scanned, and gated within the CI/CD pipeline. The implementation involves integrating dedicated auditing tools, defining strict policies, and managing risk suppressions with accountability.
Step-by-Step Implementation
- Inventory Configuration Artifacts: Identify all files that influence agent behavior, including `mcp.json`, skill files (`.md`, `.yaml`), system prompts, and plugin manifests.
- Integrate Configuration Scanner: Deploy a scanner capable of parsing natural language directives and detecting injection patterns. The scanner should output results in SARIF format for integration with security dashboards.
- Enforce CI Gates: Configure the pipeline to fail on high-severity findings. This prevents malicious configurations from merging into the main branch.
- Implement Justified Suppressions: Establish a workflow for handling false positives or accepted risks. Suppressions must include justification, reviewer approval, and an expiry date to prevent technical debt.
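The inventory step can be sketched as a small script that walks the repository and groups files by artifact type. This is a minimal sketch, not part of the agent-audit tool: the function name `inventory_artifacts` and every filename pattern other than `mcp.json` and `*.skill.md` are assumptions chosen for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# Hypothetical filename patterns; adjust them to the project's own conventions.
ARTIFACT_PATTERNS = {
    "mcp_config": ["mcp.json"],
    "skill_file": ["*.skill.md", "*.skill.yaml"],
    "system_prompt": ["*.prompt.md", "system_prompt*.txt"],
    "plugin_manifest": ["plugin.json", "plugin.yaml"],
}


def inventory_artifacts(root):
    """Return {category: sorted relative paths} for matching files under root."""
    root = Path(root)
    inventory = {category: [] for category in ARTIFACT_PATTERNS}
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        for category, patterns in ARTIFACT_PATTERNS.items():
            # Match on the file name only, so nesting depth does not matter.
            if any(fnmatchcase(path.name, pattern) for pattern in patterns):
                inventory[category].append(path.relative_to(root).as_posix())
                break
    return {category: sorted(paths) for category, paths in inventory.items()}
```

Calling `inventory_artifacts(".")` from the repository root returns a dictionary that can be reviewed by hand or used to derive the scanner's target list.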
Code Examples
The following examples demonstrate a secure implementation using a hypothetical agent-audit tool. This replaces manual checks with automated gates.
CI Pipeline Integration
This workflow runs on every pull request, scanning agent configurations and uploading results to GitHub Security.
```yaml
# .github/workflows/agent-config-security.yml
name: Agent Configuration Security
on:
  pull_request:
    paths:
      - 'agents/**'
      - '**/mcp.json'
      - '**/*.skill.md'
# upload-sarif needs write access to code scanning results.
permissions:
  contents: read
  security-events: write
jobs:
  audit-configs:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4
      - name: Install Agent Audit Tool
        run: npm install -g @org/agent-audit
      - name: Run Configuration Scan
        run: |
          agent-audit scan ./agents/ \
            --recursive \
            --fail-on-severity high \
            --format sarif \
            --output audit-results.sarif
      - name: Upload SARIF to GitHub Security
        # Upload even when the scan step fails, so findings stay visible.
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: audit-results.sarif
```
Pre-Commit Hook
Local feedback prevents developers from committing insecure configurations.
```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/org/agent-audit
    rev: v2.1.0
    hooks:
      - id: agent-audit-local
        name: Scan agent configs
        entry: agent-audit scan --fail-on-severity critical
        language: node
        files: '\.(md|json|yaml)$'
        pass_filenames: false
```
Justified Suppression Format
Suppressions must be explicit and time-bound. This format ensures accountability and prevents silent acceptance of risks.
```markdown
<!-- agent-audit: ACCEPT-8821
  justification: Internal documentation endpoint managed by the security team.
  approved_by: sec-lead
  date: 2026-06-15
  review_due: 2026-12-15
-->
```
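To make the accountability rules checkable, a suppression comment can be parsed and validated automatically. The following is a minimal sketch that assumes the comment layout shown above. The helper names `parse_suppressions` and `validate_suppression` are hypothetical and are not part of the agent-audit tool.

```python
import re
from datetime import date

# Matches an HTML comment that starts with "agent-audit: <ID>".
SUPPRESSION_RE = re.compile(
    r"<!--\s*agent-audit:\s*(?P<id>\S+)(?P<body>.*?)-->", re.DOTALL
)
REQUIRED_FIELDS = ("justification", "approved_by", "date", "review_due")


def parse_suppressions(text):
    """Extract suppression blocks as dicts built from their `key: value` lines."""
    suppressions = []
    for match in SUPPRESSION_RE.finditer(text):
        fields = {"id": match.group("id")}
        for line in match.group("body").splitlines():
            key, sep, value = line.partition(":")
            if sep and key.strip():
                fields[key.strip()] = value.strip()
        suppressions.append(fields)
    return suppressions


def validate_suppression(fields, today):
    """Return a list of problems; an empty list means the suppression is valid."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not fields.get(name)
    ]
    review_due = fields.get("review_due")
    if review_due:
        try:
            # A suppression counts as expired once its review date has passed.
            if date.fromisoformat(review_due) < today:
                problems.append(f"expired on {review_due}")
        except ValueError:
            problems.append(f"invalid review_due: {review_due}")
    return problems
```

Running `validate_suppression(fields, date.today())` over every parsed block in CI turns an expired or incomplete suppression into a visible failure instead of a silent exception.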
Architecture Decisions
- SARIF Output: Using SARIF ensures compatibility with existing security tooling and dashboards, centralizing vulnerability management.
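- Fail-on-Severity: Blocking on `high` severity in CI balances security with developer velocity. High and critical findings both fail the pull request check, while the pre-commit hook blocks only on `critical`, so local commits are not held up by findings that still need review.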
- Expiry on Suppressions: Risks accepted today may become unacceptable tomorrow. Expiry dates force periodic re-evaluation, eliminating suppression rot.
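The first two decisions can be illustrated with a short sketch that converts scanner findings into a SARIF 2.1.0 document and applies a severity threshold. The finding fields (`rule_id`, `severity`, `message`, `path`, `line`), the severity scale, and the mapping from severities to SARIF levels are assumptions made for this example. Only the SARIF property names and the level values come from the SARIF specification.

```python
# Assumed severity scale, ordered from least to most severe.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]

# SARIF defines the levels "none", "note", "warning", and "error".
# How scanner severities map onto them is an assumption of this sketch.
SARIF_LEVEL = {"low": "note", "medium": "warning", "high": "error", "critical": "error"}


def to_sarif(findings, tool_name="agent-audit"):
    """Build a minimal SARIF 2.1.0 document from a list of finding dicts."""
    results = [
        {
            "ruleId": finding["rule_id"],
            "level": SARIF_LEVEL[finding["severity"]],
            "message": {"text": finding["message"]},
            "locations": [
                {
                    "physicalLocation": {
                        "artifactLocation": {"uri": finding["path"]},
                        "region": {"startLine": finding["line"]},
                    }
                }
            ],
        }
        for finding in findings
    ]
    return {
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": tool_name}}, "results": results}],
    }


def should_fail(findings, threshold):
    """Return True if any finding is at or above the threshold severity."""
    limit = SEVERITY_ORDER.index(threshold)
    return any(SEVERITY_ORDER.index(f["severity"]) >= limit for f in findings)
```

Keeping the gate decision (`should_fail`) separate from the report (`to_sarif`) means the full report can still be written and uploaded when the gate fails.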
Pitfall Guide
The following pitfalls are common in agent security implementations. Each includes a detailed explanation and a remediation strategy.
- The Markdown Mirage
  - Explanation: Developers often assume markdown files (`.md`) are safe because they are text-based. However, skill files written in markdown contain executable instructions for the LLM.
  - Fix: Treat all text-based configuration files as executable. Include `.md`, `.txt`, and `.yaml` files in the scan scope.
- Hash Pinning Fallacy
  - Explanation: Pinning the hash of a skill file ensures the file content hasn't changed. However, if the file contains a directive to fetch instructions from an external URL, the behavior can change without altering the file hash. This is known as a "rug pull."
  - Fix: Analyze external fetch directives within configurations. Block or sandbox external URL fetches unless explicitly allowlisted.
- RAG Backdoor Injection
  - Explanation: Indirect injection occurs when malicious instructions are embedded in documents ingested by a RAG pipeline. The agent retrieves these documents and executes the hidden instructions.
  - Fix: Scan ingested documents for instruction injection patterns. Implement context sanitization to strip or neutralize directive-like text before the agent processes it.
- Environment Variable Leakage
  - Explanation: Configuration files may set environment variables that agents use for authentication or configuration. Malicious configs can overwrite these variables to exfiltrate data or escalate privileges.
  - Fix: Validate environment variable assignments in configurations. Restrict which variables can be set by agent configs and enforce least-privilege access.
- Suppression Rot
  - Explanation: Teams often suppress findings without justification or expiry. Over time, these suppressions accumulate, creating invisible technical debt and masking real vulnerabilities.
  - Fix: Enforce a policy where every suppression requires a reason, reviewer approval, and an expiry date. Automate alerts when suppressions approach expiry.
- Registry Trust Assumption
  - Explanation: Teams often assume that packages from skill registries are vetted and safe. Attackers can compromise registry packages to distribute malicious skill files.
  - Fix: Scan registry packages before merging them into the project. Treat third-party configs with the same scrutiny as internal ones.
- Subprocess Blindness
  - Explanation: Agents may spawn subprocesses based on configuration directives. If the configuration is compromised, the agent can execute arbitrary commands.
  - Fix: Audit subprocess spawning directives. Implement sandboxing or allowlists for executable commands permitted by agent configurations.
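As an illustration of the fix for the Hash Pinning Fallacy, the sketch below flags URLs in a skill file whose host is not on an allowlist. It is deliberately conservative and treats every URL as a potential fetch target, since it does not try to interpret the surrounding instruction. The function name `find_external_fetches`, the URL pattern, and the exact-host matching rule are assumptions of this example.

```python
import re
from urllib.parse import urlsplit

# Rough pattern for http(s) URLs embedded in prose or markup.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def find_external_fetches(text, allowed_hosts):
    """Return (line_number, url) pairs for URLs whose host is not allowlisted."""
    allowed = {host.lower() for host in allowed_hosts}
    flagged = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for raw_url in URL_RE.findall(line):
            # Drop sentence punctuation that the pattern may have swallowed.
            url = raw_url.rstrip(".,;:")
            try:
                host = (urlsplit(url).hostname or "").lower()
            except ValueError:
                host = ""  # Malformed URLs are never allowlisted.
            if host not in allowed:
                flagged.append((line_number, url))
    return flagged
```

Hosts are compared exactly, so allowing `example.com` does not implicitly allow its subdomains. That keeps the allowlist explicit at the cost of a few more entries.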
Production Bundle
Action Checklist
Decision Matrix
Use this matrix to determine the appropriate security approach based on the scenario.
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Internal Team Configs | Strict CI Gate + Pre-Commit | Ensures high security with minimal latency for trusted developers. | Low |
| Third-Party Plugins | Sandbox + Scan | Balances risk mitigation with the need to use external tools. | Medium |
| RAG Ingestion | Pre-Ingestion Scan | Prevents indirect injection without impacting agent runtime performance. | Medium |
| Open Source Contribution | Automated Scan + Manual Review | Allows community contributions while maintaining security standards. | High |
| Legacy Configs | Gradual Migration + Suppression | Reduces disruption while addressing technical debt over time. | Low |
Configuration Template
This template provides a starting point for configuring the agent-audit tool. Customize paths and policies to match your project structure.
```json
{
  "agent_audit": {
    "version": "1.0",
    "scan_targets": [
      "./agents",
      "**/mcp.json",
      "**/*.skill.md"
    ],
    "policies": {
      "max_external_fetches": 1,
      "require_suppression_expiry": true,
      "allowed_env_vars": [
        "API_KEY",
        "REGION",
        "LOG_LEVEL"
      ],
      "blocked_directives": [
        "execute_shell",
        "write_file",
        "network_request"
      ]
    },
    "suppression_format": "markdown",
    "output_format": "sarif"
  }
}
```
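To show how one of these policies might be enforced, the sketch below checks the environment variables declared in an MCP configuration against `allowed_env_vars`. It assumes the configuration uses a top-level `mcpServers` object whose entries may carry an `env` mapping, which is a common layout but not guaranteed for every client. The function names are hypothetical and do not describe how agent-audit itself works.

```python
import json


def load_policies(template_text):
    """Read the `policies` object from a template shaped like the one above."""
    return json.loads(template_text)["agent_audit"]["policies"]


def disallowed_env_vars(mcp_config, policies):
    """Return sorted (server, variable) pairs that are not in `allowed_env_vars`."""
    allowed = set(policies.get("allowed_env_vars", []))
    violations = []
    # Assumed layout: {"mcpServers": {"<name>": {"env": {"VAR": "value"}}}}
    for server, spec in mcp_config.get("mcpServers", {}).items():
        for name in spec.get("env", {}):
            if name not in allowed:
                violations.append((server, name))
    return sorted(violations)
```

A server entry that sets a variable such as `LD_PRELOAD` would be reported here, because that name is not on the allowlist in the template.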
Quick Start Guide
Get your agent configuration security up and running in under five minutes.
- Install the Scanner: Run `npm install -g @org/agent-audit` to install the auditing tool globally.
- Add to CI: Copy the CI workflow example into `.github/workflows/agent-config-security.yml`.
- Run Initial Scan: Execute `agent-audit scan ./agents/ --format sarif` to generate a baseline report.
- Triage Findings: Review the SARIF output, address critical issues, and apply justified suppressions where necessary.
- Enforce Gates: Merge the CI workflow to enforce scanning on all future pull requests.
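For the triage step, a quick per-level count of the baseline report helps decide where to start. The helper below is a minimal sketch that reads a parsed SARIF document and relies only on the standard `runs`, `results`, and `level` properties. The function name `summarize_sarif` is an assumption.

```python
from collections import Counter


def summarize_sarif(sarif):
    """Count results per SARIF level across all runs in a parsed document."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # When a result has no explicit level, this sketch counts it as
            # "warning", which is SARIF's fallback when no rule default applies.
            counts[result.get("level", "warning")] += 1
    return dict(counts)
```

Loading the report with `json.load` and passing it to `summarize_sarif` gives a small dictionary such as a count of `error` and `warning` results, which is enough to size the initial triage effort.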