CastKit

Since CastKit launched its Threat Grimoire — a comprehensive database of malicious agent skills — our security team has cataloged 341 confirmed threats. These are not theoretical risks. They are real skills, published to real registries, designed to compromise real systems. Here is what we have learned.

The Scale of the Problem

The agent skills ecosystem is growing at an extraordinary pace. With that growth comes an inevitable shadow: malicious actors who see skills as a new attack vector. Our research, conducted in partnership with ClawHub's ClawHavoc security initiative, has revealed a threat landscape that is both diverse and sophisticated.

Of the 341 malicious skills identified, the breakdown by threat type is sobering:

Data Exfiltration (68 skills): The most common category. These skills silently capture user inputs, API responses, clipboard contents, and cached data, transmitting them to external servers through various covert channels including DNS tunneling, steganography, and ultrasonic audio.
Credential Theft (52 skills): Skills designed to steal OAuth tokens, API keys, session cookies, and authentication credentials. Some use sophisticated techniques like SAML token forging and Kerberos ticket extraction.
Supply Chain Attacks (45 skills): Perhaps the most dangerous category. These compromise the software supply chain through dependency confusion, lockfile injection, npm script exploitation, and CDN poisoning. A single compromised dependency can affect thousands of downstream projects.
Backdoors (38 skills): Persistent access mechanisms ranging from reverse shells and service worker persistence to firmware-level backdoors on IoT devices. Some operate entirely in memory, leaving no disk artifacts for scanners to detect.
Typosquatting (35 skills): Malicious skills published with names similar to popular packages, such as "axois" instead of "axios", "expresss" instead of "express", and "chalck" instead of "chalk". These catch developers who make simple typos during installation.
Cryptomining (28 skills): Skills that hijack computing resources for cryptocurrency mining. Modern variants use WebAssembly, WebGL compute shaders, and distributed mining across Web Workers to avoid detection.
Other categories include privilege escalation, unauthorized network access, C2 callbacks, and obfuscated code. Together they account for the remaining 75 of the 341 skills.
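
Typosquatting is one of the few categories above that can be checked mechanically before installation. The following is a minimal sketch, not CastKit's scanner: it assumes a small hypothetical list of popular package names (POPULAR) and flags any name within a small Levenshtein edit distance of one of them. The function names and the distance threshold of 2 are illustrative choices.

```python
from typing import Optional

# Hypothetical list of popular names; a real check would use a much larger
# list, for example one drawn from registry download statistics.
POPULAR = {"axios", "express", "chalk"}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute (or keep) a character
            ))
        prev = curr
    return prev[-1]


def likely_typosquat(name: str, max_distance: int = 2) -> Optional[str]:
    """Return the popular name that `name` closely resembles, if any."""
    if name in POPULAR:
        return None  # exact match: this is the real package name
    for known in sorted(POPULAR):
        if edit_distance(name, known) <= max_distance:
            return known
    return None
```

With this list, likely_typosquat("axois") returns "axios", while an exact match such as "axios" returns None. A threshold of 2 is needed to catch swapped letters, because plain Levenshtein distance counts a transposition as two edits. The same threshold will produce false positives on short names.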

How They Evade Detection

The sophistication of these threats has increased dramatically. Early malicious skills were crude — obvious eval() calls, hardcoded C2 server addresses, and blatant data collection. Modern threats are far more subtle.

Time-delayed activation is increasingly common. Several skills we analyzed remain dormant for days or weeks after installation before activating their malicious payload. This defeats security scanners that only analyze behavior during initial testing.

Conditional execution is another growing trend. Some skills only activate in specific environments — CI/CD pipelines, enterprise networks, or particular cloud providers. They behave perfectly during development and testing, then strike in production.

Obfuscation techniques have evolved beyond simple string encoding. We have seen WebAssembly compilation to bypass JavaScript scanners, steganographic payloads hidden in image data, and Unicode homoglyph attacks that make malicious code visually identical to legitimate code.
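
Homoglyph attacks are hard to spot by eye but simple to surface with a script. As a rough sketch (a review aid of our own devising here, not a description of any particular scanner), the function below lists every non-ASCII letter in a source file along with its Unicode name, so a reviewer can see that an apparent "a" is actually a Cyrillic character. It will also flag legitimate non-ASCII text in comments and strings, so its output is a prompt for human review rather than a verdict.

```python
import unicodedata
from typing import List, Tuple


def suspicious_characters(source: str) -> List[Tuple[int, int, str, str]]:
    """Return (line, column, char, Unicode name) for each non-ASCII letter."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            # Unicode general categories starting with "L" are letters.
            if ord(ch) > 127 and unicodedata.category(ch).startswith("L"):
                name = unicodedata.name(ch, "UNKNOWN")
                findings.append((line_no, col, ch, name))
    return findings


# "\u0430" is CYRILLIC SMALL LETTER A, which renders like a Latin "a".
sample = "p\u0430yload = 1\nprint(payload)"
for line_no, col, ch, name in suspicious_characters(sample):
    print(f"line {line_no}, column {col}: {name}")
```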

What We Are Doing About It

CastKit's security infrastructure operates on multiple levels:

Static Analysis. Every skill submitted to the marketplace undergoes automated code analysis. We scan for known malicious patterns, suspicious API usage, obfuscated code, and unauthorized network access. Our scanner database is updated daily with new threat signatures.
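
To make the idea of signature-based scanning concrete, here is a deliberately small sketch. The three patterns are illustrative assumptions chosen to match the crude techniques mentioned earlier (eval() calls, hardcoded addresses, long encoded blobs). They are not CastKit's actual rule set, and a production scanner would parse the code rather than match lines with regular expressions.

```python
import re
from typing import List, Tuple

# Illustrative signatures only; the names and patterns are assumptions.
SIGNATURES = {
    "dynamic-eval": re.compile(r"\beval\s*\("),
    "hardcoded-ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "long-base64-blob": re.compile(r"[A-Za-z0-9+/]{120,}={0,2}"),
}


def scan_source(source: str) -> List[Tuple[int, str]]:
    """Return (line number, signature name) for every matching line."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.append((line_no, name))
    return hits


sample = (
    'const x = eval(input);\n'
    'fetch("http://203.0.113.7/c2");\n'
    'console.log("ok");'
)
print(scan_source(sample))
```

Line-by-line matching like this is exactly what the WebAssembly and steganographic techniques described above are designed to bypass, which is why it is paired with runtime monitoring.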

Behavioral Analysis. Skills are executed in sandboxed environments where we monitor their runtime behavior — network connections, file system access, resource usage, and API calls. Any deviation from declared capabilities triggers a review.
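
The "deviation from declared capabilities" check can be pictured as a set difference between what a skill says it will do and what the sandbox observed. The sketch below assumes a hypothetical data shape, where both the declaration and the observation are dictionaries mapping a capability kind (such as "network" or "files") to a set of resources. Actual manifest formats and sandbox logs will differ.

```python
from typing import Dict, Iterable, List


def undeclared_activity(
    declared: Dict[str, Iterable[str]],
    observed: Dict[str, Iterable[str]],
) -> Dict[str, List[str]]:
    """Return, per capability kind, whatever was observed but never declared."""
    deviations = {}
    for kind, seen in observed.items():
        extra = set(seen) - set(declared.get(kind, ()))
        if extra:
            deviations[kind] = sorted(extra)
    return deviations


# Hypothetical example values for illustration.
declared = {"network": {"api.example.com"}, "files": {"/tmp/skill-cache"}}
observed = {
    "network": {"api.example.com", "203.0.113.7"},
    "files": {"/tmp/skill-cache"},
    "processes": {"/bin/sh"},
}
print(undeclared_activity(declared, observed))
```

An empty result means the run stayed within its declaration. Anything else, such as the undeclared address and the spawned shell in this example, is what would be sent for review.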

Community Reporting. Our Threat Grimoire is not just a database — it is a community resource. Security researchers can submit threat reports, and our team verifies and catalogs each one. The ClawHavoc partnership has been instrumental in identifying threats across the broader ecosystem.

Verified Creators. Skills from verified organizations like Anthropic, OpenAI, Google, and Stripe receive additional trust signals. While verification does not guarantee safety, it provides an additional layer of accountability.

How to Protect Yourself

For developers and organizations using agent skills, we recommend these practices:

Only install skills from verified sources. Check the creator, review the source code, and verify the GitHub repository link.
Monitor skill behavior. Use network monitoring tools to detect unexpected connections from your agent runtime.
Keep skills updated. Security patches are released regularly, and outdated skills may contain known vulnerabilities.
Report suspicious behavior. If a skill behaves unexpectedly, report it to the marketplace immediately.
Review the Threat Grimoire. Stay informed about known threats and check whether any skills you use have been flagged.
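
Checking installed skills against a list of flagged names is easy to automate. The sketch below makes no assumption about how the Threat Grimoire exposes its data: it takes two plain lists of names, which you would need to obtain yourself, and compares them case-insensitively. The function name and inputs are hypothetical.

```python
from typing import Iterable, List


def flagged_installed(
    installed: Iterable[str],
    flagged_names: Iterable[str],
) -> List[str]:
    """Return installed skill names that appear in the flagged list."""
    flagged = {name.strip().lower() for name in flagged_names}
    return sorted(s for s in installed if s.strip().lower() in flagged)


# Hypothetical inputs for illustration.
installed = ["pdf-tools", "Axois", "calendar-sync"]
flagged = ["axois", "expresss"]
print(flagged_installed(installed, flagged))
```

Running a check like this on a schedule, and not only at install time, also catches skills that are flagged after you have adopted them.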

The agent skills ecosystem is powerful and transformative. But like any software ecosystem, it requires vigilance. The 341 malicious skills we have found are likely just the tip of the iceberg. Security scanning is not optional — it is the foundation on which trust in the entire ecosystem is built.