
Enterprise AI Security's Dual Crisis: Outbound Data Leaks + Inbound Agent Malware

77% employee data leakage via shadow AI + 11.9% malicious ClawHub skills create a compound attack surface. Existing EDR cannot detect AI-specific threats.

TL;DR
  • Outbound vector: 77% of employees leak proprietary data to public LLMs; 42% of violations involve source code; average enterprise sees 223 monthly incidents with only 50% DLP coverage
  • Inbound vector: 11.9% of ClawHub agent skills are malicious (341 out of 2,857); 100% of confirmed malicious skills use dual-vector attack combining code exploit + prompt injection
  • Agent identity theft is a new threat class: infostealers now exfiltrate AI agent configuration files (gateway tokens, cryptographic keys, personality files) alongside traditional credentials
  • Publishing barrier for malicious skills is near-zero: only a one-week-old GitHub account and a SKILL.md file—no code signing, no security review, no sandbox
  • Enterprise security architectures built for perimeter defense (firewall, DLP, EDR) are structurally unprepared for AI-specific attack vectors operating inside the application layer
Tags: AI security, shadow AI, agent security, supply chain attack, malware · 7 min read · Feb 17, 2026

Outbound Vector: The Shadow AI Data Leakage Crisis

The first attack vector is the most visible but the hardest to address: employees copying sensitive information into public LLMs via browser copy-paste, bypassing traditional network-based DLP entirely.

According to Netskope's 2026 Cloud and Threat Report, the scale is massive and accelerating:

  • 77% of employees share sensitive company data with AI tools like ChatGPT; 47% use personal (unmanaged) accounts for work tasks
  • 6.8 pastes per GenAI user per day; 3.8 contain sensitive corporate data
  • Average enterprise experiences 223 GenAI data policy violations per month; top-quartile organizations see 2,100 incidents monthly
  • Source code represents 42% of violations, followed by regulated data (32%) and IP (16%)
  • ChatGPT Free accounts cause 87% of sensitive data exposure incidents; 17% of exposures involve personal accounts with zero organizational visibility
  • Only 50% of organizations apply DLP to GenAI vs. 63% for traditional shadow IT—a critical 13-point governance gap

This is not a technology problem that firewalls or signature-based DLP can solve. An engineer debugging production code can paste entire repositories into ChatGPT within seconds, leaving no traditional network signature. Depending on account tier and settings, that data may then be retained and used for model training—potentially exposing proprietary source code, financial models, customer databases, and confidential strategies.

The compliance consequence is existential. GDPR permits fines up to 4% of global annual revenue for unauthorized data processing. For a $10B enterprise, that's $400M at risk from a single compliance failure category.

Inbound Vector: The Malicious Agent Skill Marketplace Attack Surface

The second attack vector is even more dangerous because it operates inside the trust boundary: malicious agent skills published to open agent marketplaces like ClawHub.

Koi Security audited 2,857 ClawHub skills and found 341 malicious entries (11.9%), with 335 traced to a single coordinated campaign (ClawHavoc) delivering Atomic Stealer (AMOS) malware. Snyk's ToxicSkills audit of 3,984 total skills found 13.4% (534) contain critical security issues.

The attack vector is novel: 100% of confirmed malicious skills simultaneously employ both malicious code AND prompt injection techniques. A traditional malware scanner detects the code exploit. An AI safety system might detect the prompt injection. But a compromised skill executes both—the code payload handles credential theft while the prompt injection bypasses safety guardrails and enables autonomous malicious actions.
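
To see why single-vector tooling misses these skills, consider a minimal dual-vector scan that must run both checks at once. The sketch below is illustrative only—the indicator patterns are invented examples, not signatures from any real scanner—but it captures the structural requirement: flag a skill when the code payload AND the prompt-injection vector both fire.

```python
import re

# Hypothetical indicator lists. Real scanners use far richer
# signatures plus behavioral analysis; these are illustration only.
CODE_PAYLOAD_PATTERNS = [
    r"curl\s+-s\s+https?://\S+\s*\|\s*(ba)?sh",  # pipe-to-shell download
    r"base64\s+(-d|--decode)",                   # encoded payload staging
    r"(\.aws/credentials|\.ssh/id_)",            # credential file access
]
PROMPT_INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"do not (mention|reveal|tell) (this|the user)",
    r"you must always comply",
]

def scan_skill(skill_md: str, code: str) -> dict:
    """Dual-vector scan: code indicators in the payload AND
    injection phrasing in the SKILL.md instructions."""
    code_hits = [p for p in CODE_PAYLOAD_PATTERNS if re.search(p, code)]
    prompt_hits = [p for p in PROMPT_INJECTION_PATTERNS
                   if re.search(p, skill_md, re.IGNORECASE)]
    return {
        "code_exploit": bool(code_hits),
        "prompt_injection": bool(prompt_hits),
        "dual_vector": bool(code_hits) and bool(prompt_hits),
    }
```

A scanner that reports only `code_exploit` or only `prompt_injection` replicates the blind spot described above; it is the conjunction that characterizes the confirmed malicious skills.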

The barrier to publishing a malicious skill is near-zero:

  • One-week-old GitHub account
  • A SKILL.md file with documentation
  • No code signing requirement
  • No security review before publication
  • No sandbox or runtime execution isolation
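
For contrast, a hypothetical publication gate enforcing the missing controls might look like the sketch below. All field names (`account_created`, `signature`, and so on) are assumptions for illustration; ClawHub performs no such checks today.

```python
from datetime import datetime, timedelta

# Hypothetical minimum bar a marketplace could enforce at publish time.
MIN_ACCOUNT_AGE = timedelta(days=90)

def publication_checks(submission: dict, now: datetime) -> list[str]:
    """Return the list of failed pre-publication checks for a skill."""
    failures = []
    if now - submission["account_created"] < MIN_ACCOUNT_AGE:
        failures.append("account too new")
    if not submission.get("signature"):
        failures.append("no code signature")
    if not submission.get("security_review_passed"):
        failures.append("no security review")
    if not submission.get("sandbox_profile"):
        failures.append("no runtime sandbox profile")
    return failures
```

Under this gate, the one-week-old-account-plus-SKILL.md submission described above fails every check.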

The typosquat campaign demonstrates attacker persistence. The original "clawhub" malicious skill accumulated 7,743 downloads before removal, and the attacker returned immediately after takedown with a "clawdhub1" variant.

New Threat: AI Agent Identity Theft

The most alarming finding is the first documented case of AI agent identity theft. Security researchers detected an infostealer (likely Vidar variant) successfully exfiltrating a victim's OpenClaw workspace including:

  • openclaw.json — gateway tokens (authentication credentials)
  • device.json — private cryptographic keys
  • soul.md — AI personality definition and behavioral instructions, plus daily conversation logs
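
A first defensive step is treating these files with the same posture checks applied to SSH private keys. The sketch below audits their filesystem permissions; the filenames come from the incident report, but the workspace location passed in is an assumption for illustration.

```python
import stat
from pathlib import Path

# Filenames from the incident report; workspace path is caller-supplied.
SENSITIVE_FILES = ["openclaw.json", "device.json", "soul.md"]

def audit_workspace(workspace: Path) -> list[str]:
    """Flag agent credential files readable by anyone other than
    the owner -- the same check applied to ~/.ssh private keys."""
    findings = []
    for name in SENSITIVE_FILES:
        path = workspace / name
        if not path.exists():
            continue
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode & (stat.S_IRGRP | stat.S_IROTH):
            findings.append(f"{name}: readable by group/other ({oct(mode)})")
    return findings
```

Permission hygiene does not stop an infostealer running as the victim user, but it removes the cheapest collection path and gives endpoint monitoring a concrete set of files to watch.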

This represents a fundamental expansion of the infostealer threat model. Existing credential stealers (Atomic Stealer, Vidar, Lumma) target browser credentials, email accounts, and banking logins. The next evolution is clear: specialized "AI-stealers" that specifically target agent configuration files, orchestration tokens, and personality definitions.

The implication is severe: a compromised agent doesn't just leak data—it can be impersonated, weaponized for lateral movement in enterprise networks, or used to autonomously extract information from connected systems.

Why Existing Security Architecture Fails

Enterprise security teams built their detection and response capabilities around three assumptions:

  1. Perimeter defense: Keep threats outside the network boundary (firewall, VPN, network segmentation)
  2. Endpoint detection: Detect malware execution on user devices and servers (EDR, antivirus)
  3. Data loss prevention: Detect and block sensitive data leaving the network (DLP, CASB)

AI introduces attack vectors that violate all three assumptions:

  • Shadow AI bypasses perimeter defense because data exfiltration happens via a user's browser (inside the trust boundary) using legitimate SaaS (ChatGPT, Claude) that the organization permits for productivity
  • Agent skills bypass endpoint detection because they execute code inside the agent runtime, not as traditional processes that EDR can observe. The code executes as part of the AI orchestration layer, where most EDR has no visibility
  • DLP cannot detect semantic exfiltration because copy-paste to LLMs leaves no traditional DLP signature (no file transfer, no email attachment, no USB drive)
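
AI-aware DLP therefore has to inspect the prompt content itself. The toy classifier below shows the shape of that semantic check; the detector patterns are deliberately simplistic assumptions, where production systems combine ML classifiers with pattern libraries.

```python
import re

# Toy detectors for sensitive content in an outbound LLM prompt.
DETECTORS = {
    "source_code": re.compile(r'\b(def|class|import)\s+\w+|#include\s*[<"]'),
    "api_key": re.compile(r"\b(sk-[A-Za-z0-9]{20,}|AKIA[0-9A-Z]{16})\b"),
    "email_pii": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def classify_paste(text: str) -> list[str]:
    """Return the sensitive-data categories a prompt paste triggers."""
    return [label for label, rx in DETECTORS.items() if rx.search(text)]
```

The key architectural point is where this runs: at the browser or endpoint, on the paste itself, because by the time the data is an HTTPS request to a permitted SaaS domain, network DLP sees nothing unusual.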

Existing security vendors (CrowdStrike, SentinelOne, Palo Alto Networks) have massive installed bases in EDR and cloud security, but their product architectures pre-date the AI security threat model.

Market Implications: Two Converging Security Categories

The dual-vector attack surface is creating two distinct (but eventually converging) security market categories:

Category 1: AI-Aware Data Loss Prevention

  • Semantic content analysis of GenAI prompts and responses
  • Detection of sensitive data types (PII, source code, financial data) being pasted to LLMs
  • Integration with generative AI models for false positive reduction
  • Vendors: Nightfall AI, Metomic, Prompt Security, traditional DLP incumbents (Forcepoint, Digital Guardian)

Category 2: Agent Supply Chain Security

  • Malicious skill/plugin detection across agent marketplaces
  • Code vulnerability scanning + prompt injection detection (dual-vector analysis)
  • Agent runtime sandboxing and behavior monitoring
  • Skill signing and provenance verification
  • Vendors: Snyk, Sonatype, new AI-security startups
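
Provenance verification can start with something as simple as content-hash pinning, sketched below in stdlib Python. Real deployments would use asymmetric signing (Sigstore-style keys and transparency logs) rather than bare digests; this minimal version only shows the pin-at-review, verify-at-load pattern.

```python
import hashlib

def skill_digest(skill_files: dict[str, bytes]) -> str:
    """Deterministic SHA-256 digest over a skill's files. Pin this at
    security-review time; refuse to load the skill if the installed
    content no longer matches."""
    h = hashlib.sha256()
    for name in sorted(skill_files):       # sorted => order-independent
        h.update(name.encode())
        h.update(b"\x00")                  # separator prevents collisions
        h.update(skill_files[name])
        h.update(b"\x00")
    return h.hexdigest()

def verify_skill(skill_files: dict[str, bytes], pinned: str) -> bool:
    return skill_digest(skill_files) == pinned
```

Even this minimal check would have forced the "clawdhub1"-style re-publication to go through review again instead of silently replacing trusted content.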

The combined TAM is substantial. If 223 monthly incidents per enterprise represent only detectable violations (and 50% of orgs lack GenAI DLP), true exposure is 3-5x higher. Add the agent marketplace supply chain risk, and the addressable market for AI security infrastructure rivals the $15B cloud security TAM.

Structural Comparison: AI Agent Attacks vs. Historical Supply Chain Incidents

The ClawHub supply chain attack is worth comparing to historical npm/PyPI incidents because the blast radius is structurally worse:

| Dimension | npm/PyPI malware (2018–2023) | AI agent skill malware (2026) |
| --- | --- | --- |
| Malware persistence | Installed as package dependency; executes once during build | Executes continuously as part of agent runtime; maintains persistent agent state and memory |
| Attack surface | Package code + build environment | Code execution + credential access + autonomous action capability + LLM prompt injection |
| Detection method | Static code analysis + signature matching | Must detect dual-vector attack (code + prompt injection) simultaneously; behavioral monitoring required |
| Dwell time (estimated) | Hours to days (developers spot unusual behavior) | Days to weeks (agent behavior is expected to be autonomous and unpredictable) |
| Data exfiltration scope | Build artifacts, source code, local files | Agent configuration, credentials, cryptographic keys, conversation history, connected system data |

The critical difference: AI agents are designed to be autonomous and unpredictable. Traditional malware detection relies on anomaly detection (unusual process execution, network traffic). An agent that suddenly exfiltrates data might be detected as "unusual," but how does a security team distinguish malicious autonomous action from legitimate agent behavior?

What This Means for Practitioners

If you're responsible for AI security or enterprise architecture:

  1. Conduct a shadow AI audit immediately. Work with your CISO and DLP team to instrument GenAI usage across your organization. If yours is among the 50% of enterprises lacking GenAI DLP, expect your first audit to surface 200+ monthly violations.
  2. Implement AI-aware DLP for outbound data governance. Deploy solutions like Nightfall AI or Metomic that can detect sensitive data types in LLM prompts. This is not optional—GDPR enforcement will accelerate once regulators understand the scale of shadow AI.
  3. If you deploy agent-based AI systems, implement agent skill verification. Require code signing for any agent skill published internally. Use Snyk's ToxicSkills-style scanning for both code vulnerabilities and prompt injection patterns before any skill is installed.
  4. Isolate agent runtimes with least-privilege access. Agents should not have direct access to production databases, credential stores, or payment systems. Route all critical operations through a capability-based security model (e.g., agents can invoke functions but cannot directly query systems).
  5. Monitor agent behavior anomalies. Set up runtime behavior monitoring that detects agent skills exfiltrating data, making unexpected API calls, or engaging in credential theft. This is the AI equivalent of detecting C2 communication.
  6. Plan for GDPR/CCPA enforcement of GenAI governance. Regulators will begin enforcement actions against enterprises with documented shadow AI data leakage within 12-18 months. Proactive remediation now avoids regulatory surprise and brand damage later.
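
Steps 4 and 5 can be combined in a capability broker: the agent requests named capabilities instead of holding credentials, and every invocation is logged for anomaly review. A minimal sketch, with all capability names hypothetical:

```python
from datetime import datetime, timezone

# Hypothetical allowlist: narrow, named operations the agent may request.
ALLOWED_CAPABILITIES = {
    "crm.read_contact",   # read-only, scoped
    "ticket.create",
}

audit_log: list[tuple] = []

def invoke(agent_id: str, capability: str, args: dict) -> dict:
    """Broker every agent action: log it, enforce the allowlist, and
    dispatch to a narrow server-side handler. The agent never sees
    connection strings, API keys, or raw query access."""
    audit_log.append((datetime.now(timezone.utc), agent_id, capability))
    if capability not in ALLOWED_CAPABILITIES:
        raise PermissionError(f"{agent_id}: capability '{capability}' denied")
    return {"status": "ok", "capability": capability}
```

The audit log is what makes step 5 tractable: instead of trying to decide whether raw agent behavior is "unusual," the security team reviews a bounded stream of denied and granted capability requests.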

The dual-vector AI security crisis is not a future risk—it's active now. Enterprises that address both the outbound shadow AI crisis and the inbound agent supply chain risk in the next 6 months will establish defensibility that competitors will struggle to replicate.
