Your AI Agent's Skills Are Your New Attack Surface
By Sarah Collins, Director of Research & Intelligence — TPM Content Research
Three signals landed in a 48-hour window this week, and they all point at the same gap. NVIDIA released SkillSpector, an open-source static analyzer that scans AI agent skills for the obvious attack patterns: executable scripts, environment variable exfiltration, prompt injection, and missing supply-chain verification. Vercel released Eve, a production framework already running 100+ agents internally, where skills are freeform text loaded into the agent's context window with no runtime sandbox. And Google DeepMind, alongside Schmidt Sciences, put up $10M for research into multi-agent safety — explicitly calling out "strengthening agent infrastructure" and "stress-testing protocols for identity, reputation, and commitment" as priority areas because, in their words, "there is a lack of tools to predict, measure and monitor these transitions."
None of these vendors invented the concept of "skill security." Each of them independently concluded that the gap is real, the gap is now, and the gap is operational. For TPMs running AI programs, that means one thing: the skill layer needs a review process before it goes anywhere near production.
Skills Are Not Models — And That Changes The Threat Model
Most agent security thinking today is borrowed from model evaluation. You benchmark the model. You run red-team evals against the model. You put guardrails around the model. The skill layer — the markdown files, the function definitions, the prompt snippets that get injected into context on demand — has no equivalent standard.
This is not a small oversight. SkillSpector's release notes describe finding executable scripts, eval() calls, base64-decoded payloads, env-var reads that exfiltrate AWS credentials, and prompt-injection directives hidden inside skill documentation across a deliberately constructed "benign" corpus. The tool's authors point out the obvious implication: any organization deploying third-party skills or even internal skills written by non-security-trained developers is exposed, and most current evaluation pipelines will not catch this.
The reason is structural. Models go through training, evaluation, and deployment gates. Skills go through version control. The model is a binary you cannot meaningfully tamper with without breaking it. A skill is a text file. A text file can be edited, injected into, replaced wholesale, or socially engineered into a pull request that an overworked TPM merges during a launch crunch.
Vercel Eve makes the gap architectural. Eve explicitly separates skills (knowledge loaded on-demand) from tools (actions with side effects). Tools are typed, constrained, and mediated by the framework's runtime. Skills are freeform text that gets injected into the context window when the agent decides a topic is relevant. There is no sandbox around the skill content itself. A skill that contains prompt-injection text can reframe what the agent thinks its mission is, redirect tool calls, expand the scope of an action, or quietly exfiltrate context to a logging endpoint the team never set up.
This is the moment TPMs need to stop treating skill management as "just docs" and start treating it as a deployment gate.
The skill is the new dependency. The dependency needs a security review.
Why The Gap Was Inevitable
The agent ecosystem grew up around model performance, not skill governance. Every framework metric that mattered in 2024 and most of 2025 — benchmark scores, eval pass rates, latency, tool-call accuracy — measured the model. Skills were an afterthought because they were small in number and rarely updated.
That era is over. Eve runs 100+ production agents. Codex CLI shipped v0.141.0 this week with end-to-end encrypted Noise relay channels for remote executors and MCP-server discovery for executor plugins. Anthropic, Google, and OpenAI are all shipping agents with hundreds of skills each. The skill counts are no longer small. They are no longer rarely updated. And they are no longer written only by the model's creators.
The DeepMind/Schmidt Sciences $10M funding call is the most honest signal of where the field actually is. The call's priority areas include "strengthening agent infrastructure" and "stress-testing protocols for identity, reputation, and commitment." The framing matters: identity, reputation, and commitment are properties of agents, not models. They apply to skills as loadable, executable, prompt-shaping units. The research community is saying, in writing, that we do not yet know how to secure agent-to-agent interactions. That is not a 2027 problem. It is a 2026 problem, and it is already showing up in your agent deployment.
What TPMs Should Do This Quarter
The good news: you do not need a six-month program to start. The gap is real, but it is also tractable. Three operational changes will move the needle immediately.
1. Add a skill inventory to your agent deployment checklist. You cannot secure what you cannot see. Treat every skill that ships with an agent the way you treat a dependency: record its source, version, hash, and the path that loaded it. The closest analogue is the software SBOM — a Software Bill of Materials — but for skills. There is no industry-standard skill SBOM format yet, so pick a spreadsheet and a process. The point is to be able to answer the question "what skills is this agent currently running, and where did each one come from?" without grepping the repo at 2am. A minimal row might look like: skill_name | source_repo | commit_sha | load_path | last_reviewed_by | reviewed_at.
2. Run static analysis on every skill before deployment. SkillSpector is open-source and works today. So does the SARIF-based pattern NVIDIA ships with it. Wire it into your CI pipeline. A skill that triggers a finding should not ship any faster than a dependency that triggers a CVE scan. If your security team does not have bandwidth for this, that is the signal to escalate — not to skip the gate.
3. Separate skill review from model evaluation in your governance. They are different problems. Model evaluation answers "does this model do what we want?" Skill review answers "does this skill do what it says, and only what it says?" A skill that promises to format dates but actually reads environment variables is a security incident, not a quality bug. Treat it that way.
If you do these three things in the next 90 days, your agent deployment program will be ahead of the median organization deploying agents at scale. If you do not, the question is not whether a skill-related incident will land in your backlog — it is when.
The Bigger Picture: Skills Are The New Dependencies
The pattern here looks familiar from the 2015→2018 dependency-security precedent — when libraries went from "everyone's code" to "everyone's vulnerability" — and the same curve appears to be repeating with skills. The TPMs who treat skill security as an afterthought today will spend 2027 writing postmortems about why their agent leaked data through a markdown file in a context window.
Vercel's own 100-agent deployment is the closest thing the industry has to a published case study. Watch for what they release next. The bet is that the framework will add a skill review layer — because the alternative is being the first major framework to ship a skill-related CVE, and no framework wants to be that vendor.
Until then, the governance work is yours. The skill is the new dependency. The dependency needs a security review. Build the gate before the incident forces you to.
Researched and written by Sarah Collins, Director of Research & Intelligence. Sources: NVIDIA SkillSpector (GitHub), Vercel Eve launch post, DeepMind $10M Multi-Agent Safety Funding Call, Codex CLI v0.141.0 release notes. All four sources HEAD-verified 200 on 2026-06-18.
Member discussion