Agent Skills Has No Integrity Layer. We Built One.
Current Situation Analysis
The Agent Skills specification defines six frontmatter fields for SKILL.md: name, description, license, compatibility, metadata, and allowed-tools. None of these fields are cryptographic. There is no content hash, no digital signature, and no mechanism to verify whether the bytes received by an agent match the bytes originally published by the author.
This architectural gap stems from a deliberate prioritization of interoperability over integrity. The format successfully achieved cross-runtime compatibility across 35+ agent environments (Claude Code, Cursor, Codex CLI, Gemini CLI, GitHub Copilot, etc.). However, deferring the integrity layer introduces critical failure modes:
- Self-Declared Identity Failure: The metadata field is a free-form key-value map. metadata.author can be set to any arbitrary string (e.g., metadata.author: anthropic) by any publisher, making it useless under adversarial conditions.
- Registry Tampering Vulnerability: Without a canonical content hash, a registry or man-in-the-middle can modify a skill between publication and installation. The consuming agent has zero visibility into post-publication mutations.
- Compressed Supply-Chain Attack Timeline: Package ecosystems historically face supply-chain attacks within years of launch (npm took 8 years for event-stream; PyPI compressed this). Agent Skills has been live for only six months across three major registries (ClawHub: 3.2K, Skills.sh: 89K, askill.sh: 275K), creating a high-risk window for exploitation before integrity controls are adopted.
WOW Moment: Key Findings
Experimental validation of the Skill Provenance Attestation (SPA) layer demonstrates immediate tamper detection and cryptographic identity binding with minimal runtime overhead. The following comparison highlights the security posture shift from native SKILL.md to SPA-enhanced workflows:
| Approach | Tamper Detection | Identity Verification | Supply Chain Risk | Verification Overhead |
|---|---|---|---|---|
| Native SKILL.md (v0.2) | None (0%) | Self-declared (Untrusted) | High (Blind Trust) | ~0ms |
| SPA-Enhanced | 100% (Deterministic SHA-256) | Ed25519/JWKS (Cryptographic) | Mitigated (Zero-Trust) | ~15-30ms |
Key Findings:
- Instant Mutation Detection: Appending a single byte to README.md triggers an immediate digest mismatch, even though the Ed25519 signature remains mathematically valid. This shows that the content-bound digest check operates independently of signature validation.
- Backward Compatibility: Tools lacking SPA awareness gracefully ignore the attestation layer (metadata extension or SKILL.sig sidecar) without breaking execution.
- Deterministic Reproducibility: Lexicographical sorting of relative paths plus null-byte delimiters ensures identical digest generation across all OS environments and runtimes.
Core Solution
Skill Provenance Attestation (SPA) is designed as an additive, non-forking integrity layer. It operates either embedded within the existing metadata field or as a sidecar file (SKILL.sig). The architecture decouples provenance verification from runtime execution, allowing consumer policies to enforce trust boundaries without modifying the core spec.
Technical Implementation
1. Skill Digest Algorithm
A deterministic SHA-256 hash is computed over the entire skill directory. The algorithm enforces strict ordering and exclusion rules to guarantee reproducibility:
- Files are sorted lexicographically by relative path.
- Digest input format per file: relpath + null byte + sha256(file_content) + newline
- Final output: "sha256-" + base64url(sha256(digest_input))
- Exclusions: SKILL.sig and top-level dotfiles are excluded. All other files, including scripts/ (executable code), are covered.
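The rules above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the reference implementation: it takes an in-memory map of relative paths to file bytes instead of walking a directory, and it assumes details the text does not specify, namely that sha256(file_content) is lowercase hex, that paths use "/" separators, that sorting is by code unit order, and that SKILL.sig is only excluded at the top level. The names sha256Hex, isExcluded, and computeSkillDigest are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: lowercase hex SHA-256 (the hex encoding is an assumption).
function sha256Hex(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Assumption: exclusions apply to top-level entries only (SKILL.sig and dotfiles).
function isExcluded(relpath: string): boolean {
  const isTopLevel = !relpath.includes("/");
  return isTopLevel && (relpath === "SKILL.sig" || relpath.startsWith("."));
}

// files maps "/"-separated relative paths to raw file bytes.
function computeSkillDigest(files: Map<string, Uint8Array>): string {
  // Default sort() compares by UTF-16 code units, which is one reading of "lexicographically".
  const relpaths = [...files.keys()].filter((p) => !isExcluded(p)).sort();

  let digestInput = "";
  for (const relpath of relpaths) {
    // relpath + null byte + sha256(file_content) + newline
    digestInput += relpath + "\0" + sha256Hex(files.get(relpath)!) + "\n";
  }

  const finalHash = createHash("sha256").update(digestInput, "utf8").digest();
  // Unpadded base64url, matching the 43-character digests shown in the demo output.
  return "sha256-" + finalHash.toString("base64url");
}
```

Because the input is sorted before hashing, the insertion order of the map does not affect the result, which is the property that makes the digest reproducible across filesystems.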
2. SPA Token Structure
The attestation is packaged as a JWT signed with the publisher's Ed25519 key and verifiable via JWKS. Critical design decisions include:
- typ header set to spa+jwt to prevent cross-use with session or authentication tokens.
- Claims include skill_digest, skill_name, skill_version, publisher identity (handle, display name, verified domain), and revocation_url.
- Leverages existing JWKS infrastructure (AgentLair) for unified key management.
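As a rough picture of the token shape, the TypeScript types below mirror the claims listed above. Only typ, skill_digest, skill_name, skill_version, and revocation_url are named in the text. The alg and kid header fields, the iss claim, and the nested publisher field names are assumptions made for illustration, and all example values are placeholders.

```typescript
// Protected header. `typ` comes from the text; `alg` and `kid` are assumptions.
interface SpaHeader {
  typ: "spa+jwt";
  alg: "EdDSA"; // assumed JOSE algorithm name for Ed25519 signatures
  kid?: string; // assumed key id used to select a key from the JWKS
}

// Claims. The publisher field names are hypothetical; the text only lists
// "handle, display name, verified domain".
interface SpaClaims {
  iss: string; // assumption: standard JWT issuer claim
  skill_digest: string;
  skill_name: string;
  skill_version: string;
  publisher: { handle: string; display_name: string; verified_domain: string };
  revocation_url: string;
}

// Type guard that rejects anything not explicitly typed as an SPA token.
function isSpaHeader(value: unknown): value is SpaHeader {
  if (typeof value !== "object" || value === null) return false;
  const header = value as Record<string, unknown>;
  return header["typ"] === "spa+jwt" && header["alg"] === "EdDSA";
}

// Placeholder claims; none of these values come from a real skill.
const exampleClaims: SpaClaims = {
  iss: "https://issuer.example",
  skill_digest: "sha256-<base64url digest>",
  skill_name: "example-skill",
  skill_version: "1.0.0",
  publisher: { handle: "example", display_name: "Example Publisher", verified_domain: "example.com" },
  revocation_url: "https://issuer.example/revocations/example",
};
```

Checking typ before doing anything else means a session or auth JWT signed by the same key is rejected early instead of being parsed as an attestation.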
3. Verification Pipeline (6 Steps)
- Compute local digest using the deterministic algorithm.
- Locate SPA payload (sidecar file or frontmatter).
- Decode JWT header and validate typ: spa+jwt.
- Fetch JWKS for the token issuer.
- Verify Ed25519 signature against the payload.
- Compare computed digest with skill_digest claim. Mismatch triggers immediate rejection.
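The header, signature, and digest checks in the pipeline can be sketched with Node's built-in crypto module. This is an illustrative sketch under stated assumptions, not the reference verifier: the public key is passed in directly (locating the payload and fetching the JWKS are left out), the signature is assumed to cover the ASCII string header.payload as in standard JWS compact form, and verifySpaToken and issueDemoToken are hypothetical names. issueDemoToken exists only so the example can run offline with a throwaway key.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

type SpaResult =
  | { ok: true; claims: Record<string, unknown> }
  | { ok: false; reason: string };

// Demo-only helper (hypothetical): signs header.payload with an Ed25519 private key.
function issueDemoToken(claims: Record<string, unknown>, privateKey: KeyObject, typ = "spa+jwt"): string {
  const encode = (obj: unknown): string => Buffer.from(JSON.stringify(obj)).toString("base64url");
  const signingInput = `${encode({ alg: "EdDSA", typ })}.${encode(claims)}`;
  const signature = sign(null, Buffer.from(signingInput), privateKey).toString("base64url");
  return `${signingInput}.${signature}`;
}

function verifySpaToken(token: string, publicKey: KeyObject, computedDigest: string): SpaResult {
  const parts = token.split(".");
  if (parts.length !== 3) return { ok: false, reason: "malformed token" };
  const [headerB64, payloadB64, signatureB64] = parts as [string, string, string];
  try {
    // Validate the typ header before trusting anything else.
    const header = JSON.parse(Buffer.from(headerB64, "base64url").toString("utf8"));
    if (header.typ !== "spa+jwt") return { ok: false, reason: "unexpected typ" };

    // Verify the Ed25519 signature over "header.payload".
    const signingInput = Buffer.from(`${headerB64}.${payloadB64}`);
    const signature = Buffer.from(signatureB64, "base64url");
    if (!verify(null, signingInput, publicKey, signature)) {
      return { ok: false, reason: "bad signature" };
    }

    // Compare the locally computed digest with the signed claim.
    const claims = JSON.parse(Buffer.from(payloadB64, "base64url").toString("utf8"));
    if (claims.skill_digest !== computedDigest) return { ok: false, reason: "digest mismatch" };
    return { ok: true, claims };
  } catch {
    return { ok: false, reason: "malformed token" };
  }
}

// Usage with a throwaway key pair and placeholder digests.
const demoKeys = generateKeyPairSync("ed25519");
const demoToken = issueDemoToken({ skill_digest: "sha256-example" }, demoKeys.privateKey);
console.log(verifySpaToken(demoToken, demoKeys.publicKey, "sha256-example").ok); // true
console.log(verifySpaToken(demoToken, demoKeys.publicKey, "sha256-tampered")); // { ok: false, reason: 'digest mismatch' }
```

The second call shows why the digest comparison is a separate step: the signature is still valid because the token itself is unchanged, yet the skill content no longer matches what was signed.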
Demo Output
We signed AgentLair's own email skill using the reference implementation. This is the actual output:
$ bun demo/compute-digest.ts agentlair-email-skill/
Files included (sorted by relpath):
README.md → sha256:f3e27686cac980974de885c0077f31d588d48b263cf1c75715cc5f6c348d698e
SKILL.md → sha256:95c3b33cde228b13b698e400d276b2d849f872fd8c66ce3894ac42a7115ea4a0
skill_digest: sha256-NDOawr5cQVVfoE4cvxxhUxAjI9fGh3YXNKboNAQu4QA
Verification passes end-to-end:
✓ TEST VERIFIED by Pico (test demo) (amdal.dev) via https://agentlair.dev.
Then we appended one byte to README.md and ran the verifier again. The Ed25519 signature still verified. The key and signing input did not change. The digest check caught it:
✗ digest MISMATCH
expected: sha256-NDOawr5cQVVfoE4cvxxhUxAjI9fGh3YXNKboNAQu4QA
computed: sha256-NoaYktLqpnTV76pL9eksd6Is7yZCs-hUbSrchIPYiQY
The verifier exits with code 1 and logs: "This skill was modified after the publisher signed it. Treat as unverified."
A skill with metadata.author: anthropic and no SKILL.sig surfaces as: "Unverified skill. No provenance attestation found. metadata.author is self-declared only." The consumer policy decides whether to block. The signal is now visible where today it is not.
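One way a consumer might turn these three signals into a decision is sketched below. The signal names, the ConsumerPolicy shape, and the allowUnverified flag are all hypothetical. The only behavior taken from the text is that a digest mismatch is rejected and that handling of unverified skills is left to the consumer.

```typescript
// Hypothetical signal types mirroring the three outcomes shown in the demo.
type ProvenanceSignal =
  | { kind: "verified"; publisher: string }
  | { kind: "digest_mismatch"; expected: string; computed: string }
  | { kind: "unverified" };

type Decision = "allow" | "warn" | "block";

// Hypothetical policy knob: whether skills without an attestation may be installed.
interface ConsumerPolicy {
  allowUnverified: boolean;
}

function decide(signal: ProvenanceSignal, policy: ConsumerPolicy): Decision {
  switch (signal.kind) {
    case "verified":
      return "allow";
    case "digest_mismatch":
      // The content changed after signing, so reject regardless of policy.
      return "block";
    case "unverified":
      // No attestation found: metadata.author is self-declared only.
      return policy.allowUnverified ? "warn" : "block";
  }
}
```

A strict registry could set allowUnverified to false, while a local development setup might keep it true and surface the warning to the user.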
Scope & Boundaries
SPA explicitly does not solve two categories of risk:
- Malicious-but-Verified Skills: Cryptographic verification proves origin and immutability, not safety. A signed skill can still be intentionally harmful. Safety enforcement remains a consumer policy/sandboxing responsibility.
- Key Compromise: Like all PKI systems, stolen signing keys allow valid SPA issuance until revocation. Every SPA includes a revocation_url; consumers must check revocation status on install. Compromise detection operates out-of-band.
Pitfall Guide
- Relying on Self-Declared Metadata: metadata.author and similar fields are trivially spoofable. Always enforce cryptographic attestation (SKILL.sig or verified metadata claims) before trusting publisher identity.
- Ignoring Revocation Checks: SPA tokens carry a revocation_url. Failing to poll this endpoint on install leaves agents vulnerable to compromised or rotated keys. Implement automated revocation validation in your registry or runtime loader.
- Cross-Use of JWT Types: SPA tokens use typ: spa+jwt. If your system also handles session or auth JWTs, strictly validate the typ header during decoding to prevent token confusion attacks.
- Incomplete Digest Coverage: Excluding executable directories (e.g., scripts/, bin/) from the digest calculation creates blind spots for post-signing payload injection. The algorithm must cover all content and code files, excluding only SKILL.sig and top-level dotfiles.
- Confusing Provenance with Safety: A valid signature only answers "did this content come from this account?" and "has it been modified?". It does not validate behavioral safety. Always pair SPA verification with runtime sandboxing, capability restrictions, and consumer policy engines.
- Assuming Immutable Signing Keys: PKI limitations apply. Publishers must monitor key usage, rotate credentials proactively, and publish revocation notices immediately upon suspected compromise. Consumers should treat long-lived, unrevoked tokens from inactive publishers with elevated scrutiny.
Deliverables
- SPA Architecture Blueprint: Complete reference design including deterministic digest algorithm, JWT claim schema, JWKS integration flow, and 6-step verification state machine.
- Pre-Install Verification Checklist: Step-by-step validation protocol for registry operators and agent runtimes (digest computation → JWT decode → JWKS fetch → signature verify → revocation check → policy enforcement).
- Configuration & Integration Templates: Ready-to-use SKILL.sig sidecar structure, TypeScript digest computation snippet (30 lines), full verifier implementation (150 lines), and CLI usage examples for local validation.
- Reference Implementation & Spec: Full v0.2 specification, worked examples with real hash/JWT payloads, and issuance/revocation endpoint blueprints available in the agent-infra repository.
