
# The AI Governance Trilemma: You Cannot Have Capability, Openness, and Safety Together

Mythos leak reveals nation-state cyber capabilities, OpenMDW license standardizes open release, NIST RMF advances safety standards. Three simultaneous developments expose an impossible triangle: maximizing any two vertices necessarily constrains the third. The industry is fragmenting into three distinct strategies.

governance · safety · regulation · open-source · cybersecurity · 4 min read · Mar 30, 2026


The week of March 25-30, 2026 crystallized a fundamental tension in AI development that has been building for two years. Three seemingly unrelated events—an internal model leak, a licensing standard launch, and regulatory framework advancement—combine to reveal a structural trilemma that will define AI governance for the next decade.

## The Three Vertices

Vertex 1: Capability. Claude Mythos (internal codename: Capybara) represents what Anthropic describes as "by far the most powerful AI model we've ever developed," positioned as a new tier above Opus. Leaked internal documents warn it is "currently far ahead of any other AI model in cyber capabilities" and "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders."

Anthropic separately disclosed that Chinese state-sponsored actors had already used Claude Code to orchestrate the infiltration of approximately 30 organizations. Cybersecurity stocks plunged on the news.

Vertex 2: Openness. The same week, the Linux Foundation released OpenMDW v1.0, backed by Amazon, Meta, IBM, and Microsoft—the most significant attempt to create a standard permissive license for AI models. OpenMDW covers code, parameters, datasets, and artifacts under MIT-style terms with no copyleft restrictions.

Alibaba released Qwen3-VL (235B, Apache 2.0) and Qwen 3.5-9B (Apache 2.0), both rivaling frontier proprietary models. The open-source AI ecosystem is stronger than ever.

Vertex 3: Safety. NIST RMF 1.1 is advancing toward testable requirements that map to ISO 42001 international certification, creating the first potentially enforceable AI safety credential. But NIST operates in a voluntary compliance environment after the Trump administration rescinded Executive Order 14110, meaning safety standards have teeth only for government contractors and regulated industries.

## The Impossible Choice

Maximizing any two vertices forces a sacrifice of the third:

Capability + Openness (sacrifice Safety): Release Mythos-class models under Apache 2.0. Maximizes innovation and access but puts nation-state offensive cyber capabilities in the hands of anyone with a GPU. This is the path Alibaba and Meta are closest to—Apache 2.0 releases of models approaching GPT-5-class capabilities with no usage restrictions.

Capability + Safety (sacrifice Openness): Anthropic's current strategy with Mythos. Restrict early access to cyber defense organizations, brief government officials privately, keep the model behind API access controls. Preserves safety but creates a capability moat accessible only through a single vendor's terms.

Openness + Safety (sacrifice Capability): Release models under OpenMDW with NIST RMF-compliant safety evaluations, but cap capability at a level below Mythos-class offensive potential. This is approximately what the EU AI Act's high-risk classification mandates—open systems that meet safety requirements but are throttled below the frontier.

## The Real-World Test Case

Qwen3-VL's visual agent capabilities (comparable to Claude computer use) are released under Apache 2.0 with no usage restrictions. If combined with Mythos-class reasoning in a future open release, the result would be an autonomous cyber-attack agent deployable by anyone.

NIST RMF provides no mechanism to prevent this because it is voluntary for non-government entities. The governance infrastructure for maximizing openness (OpenMDW) is being built in the same month that the strongest evidence yet emerges for why unlimited openness is dangerous (Mythos). These two developments are on a collision course.

## Corporate Governance Strategies

With federal mandates removed, Anthropic is performing, on an ad hoc basis, the safety governance that NIST was supposed to standardize. Company-led safety governance fills a regulatory vacuum, but it is unilateral, unaudited, and dependent on the goodwill of a single commercial entity.

The documented exploitation of Claude Code by Chinese state-sponsored groups demonstrates that the governance gap is not theoretical. It is measured in compromised organizations and in the costs imposed on defenders.

## What Enterprise Architects Should Do

Explicitly choose which vertex of the trilemma to sacrifice based on regulatory context.

Government contractors: Prioritize safety (NIST RMF + gated models). The liability exposure of deploying open models with offensive capabilities is too high.

Startups in competitive markets: Consider the liability implications of deploying open models before defaulting to the capability + openness axis.

Regulated industries (finance, healthcare): Build on the capability + safety axis, using API-gated frontier models with auditable safety evaluations.

All organizations should document their governance rationale now. NIST and ISO 42001 convergence will eventually create audit requirements that penalize undocumented decisions.
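
Documenting that rationale does not require heavyweight tooling. Here is a minimal sketch, assuming nothing more than a structured record versioned alongside deployment configuration; the schema and field names are hypothetical illustrations, not a NIST AI RMF or ISO 42001 format.

```python
# Hypothetical governance decision record. Field names are illustrative,
# not drawn from any official NIST AI RMF or ISO/IEC 42001 schema.
from dataclasses import dataclass, asdict, field
from datetime import date
import json


@dataclass
class GovernanceDecision:
    system: str                  # the AI system or deployment this covers
    sacrificed_vertex: str       # "capability", "openness", or "safety"
    rationale: str               # why this trade-off was chosen
    regulatory_context: list = field(default_factory=list)  # frameworks considered
    reviewed_by: list = field(default_factory=list)         # sign-off parties
    decided_on: str = date.today().isoformat()


record = GovernanceDecision(
    system="customer-support-agent",
    sacrificed_vertex="openness",
    rationale="Regulated industry: API-gated frontier model with auditable safety evaluations.",
    regulatory_context=["NIST AI RMF", "ISO/IEC 42001"],
    reviewed_by=["security", "legal"],
)

# Persist next to deployment configs so a future audit can trace the decision.
print(json.dumps(asdict(record), indent=2))
```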

## Contrarian Perspective

The trilemma may be temporary. If safety evaluation becomes cheap and automated—red-teaming agents, automated vulnerability scanning—it could be applied to open models post-release, effectively making all three vertices achievable.

But Anthropic's own assessment is that Mythos-class models outpace defenders. Offensive capability scales faster than defensive evaluation. Until that changes, the trilemma is real.

