Key Takeaways
- OpenAI Agents SDK provides production-grade handoff-based orchestration supporting 100+ LLMs (not just OpenAI models)
- Pydantic's Monty (sub-microsecond sandboxed Python VM in Rust) is 50,000x faster than Docker for agent code execution
- Monty uses deny-by-default interpreter-level security: no filesystem, network, or system calls without explicit MCP connectors
- MCP protocol now has 75+ connectors, donated to Linux Foundation; A2A has 150+ supporting organizations
- Six competing production agent frameworks (OpenAI, Anthropic, Google, LangGraph, CrewAI, Mastra) now converge on MCP + A2A for interoperability
The Orchestration Layer Standardized
Agent frameworks have matured rapidly. OpenAI's Agents SDK (March 2026) evolved from the experimental Swarm framework with a production-grade handoff architecture: agents transfer execution control while carrying conversation context. Google's ADK followed with deeper multimodal capabilities. Anthropic's SDK provides the deepest MCP integration. Six production-grade frameworks now compete (OpenAI SDK, Anthropic SDK, Google ADK, LangGraph, CrewAI, Mastra), with 61% of business leaders already deploying agents and Gartner projecting 15% of daily business decisions automated by agents by 2028.
The Real Breakthrough: Safe Code Execution at Production Latency
The critical gap in agent systems has always been safe code execution. Pydantic's Monty, released February 6, 2026, solves this at the right abstraction layer: a from-scratch Python bytecode VM written in Rust that starts in 4 microseconds cold (sub-microsecond hot). Taking 4 microseconds as the baseline, that is roughly a 50,000x improvement over Docker (195ms), 250,000x over third-party sandbox services (1,000ms), and 700,000x over Pyodide (2,800ms).
For agentic workloads—thousands of short code executions per session—this latency difference is transformative. The security model is deny-by-default at the interpreter level: `open()`, `__import__()`, `eval()`, and `exec()` literally do not exist. No filesystem, no network, no system calls unless explicitly enabled through MCP connectors.
[Chart: Agent Code Execution Sandbox Startup Latency (ms) — Monty's sub-microsecond startup enables thousands of code executions per agent session at production latency. Source: Pydantic official benchmark comparison]
Why This Enables the Most Powerful Agent Pattern
The most powerful agentic pattern is 'code mode', where LLMs write Python that calls tools as functions rather than making sequential tool calls via API. This approach is dramatically more flexible and token-efficient. An agent could write `results = [analyze_paper(url) for url in arxiv_search("attention mechanisms")]` in a single step, rather than sequentially calling `arxiv_search`, iterating, then calling `analyze_paper` repeatedly.
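The contrast can be sketched with stub tools. `arxiv_search` and `analyze_paper` follow the article's own hypothetical example and are placeholders, not a real API; in practice they would be MCP connectors.

```python
# Stub tools standing in for real MCP connectors (hypothetical names
# borrowed from the article's example; the return values are fake).
def arxiv_search(query: str) -> list[str]:
    return [f"https://arxiv.org/abs/000{i}" for i in range(3)]

def analyze_paper(url: str) -> dict:
    return {"url": url, "summary": f"summary of {url}"}

# Sequential tool calling: one model round trip per call, with every
# intermediate result serialized back through the context window.
sequential = []
for url in arxiv_search("attention mechanisms"):
    sequential.append(analyze_paper(url))

# Code mode: the LLM emits a single snippet that composes the same tools
# locally; intermediate results never pass through the context window.
code_mode = [analyze_paper(url) for url in arxiv_search("attention mechanisms")]

assert code_mode == sequential
```

Same result either way; the difference is that code mode spends one model turn and zero tokens on intermediate results, which is where the flexibility and token savings come from.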
This pattern was previously untenable: safe execution demanded heavy sandboxing overhead. Monty eliminates that overhead while providing stronger security guarantees than OS-level virtualization. Running arbitrary LLM-generated code in production was a compliance nightmare; Monty makes it a solved problem.
The Protocol Layer: The Sleeper Story
Anthropic's MCP (Model Context Protocol) was donated to the Linux Foundation in December 2025 and now has 75+ connectors. Google's A2A (Agent-to-Agent) protocol has 150+ supporting organizations. Both are being adopted across competing SDKs. This means the framework war (OpenAI vs Anthropic vs Google) is a developer experience competition, not a protocol lock-in battle. Agents built on any SDK can interoperate at the tool layer (MCP) and the agent communication layer (A2A).
Six competing frameworks would normally fragment the market into incompatible ecosystems. Instead, protocol convergence creates a situation where the real competition is on orchestration simplicity, observability, and ecosystem breadth—not on whether your agents can talk to your tools.
The Assembled Stack for Q2 2026
Agent orchestration (any of 6 SDKs) provides the control plane. MCP provides the tool integration layer (databases, APIs, file systems). Monty provides the secure code execution layer. A2A provides inter-agent communication. Built-in tracing (all major SDKs) provides the observability layer. Monty's state serialization to bytes enables durable agent workflows that survive process restarts—a critical enterprise requirement.
This is the infrastructure that enterprises need before automating business decisions. The stack is real and shipping.
Market Context: $8.5B in 2026
The market sizing is aggressive but grounded: $8.5B in 2026, $35B by 2030. 56% of teams report improved scalability with agent orchestration. Google ADK alone has 17,800 GitHub stars and 3.3 million monthly downloads.
[Chart: AI Agent Market Adoption (2026) — Agent deployment has crossed from experimentation to production at enterprise scale. Source: Gartner / Deloitte / Google / Anthropic]
The Contrarian Case: Fragmentation and Limitations
Framework fragmentation is a genuine problem. Six competing production frameworks in 12 months create 'decision paralysis for enterprise buyers.' The interoperability story (MCP + A2A) is aspirational—in practice, switching costs between SDKs are real. Monty is at v0.0.3 and does not yet support class definitions, match statements, or most standard-library modules. Hacker News critics noted that 'the real power of terminal agents depends on network/filesystem access that Monty deliberately removes.'
Enterprise agents that cannot access databases, APIs, or filesystems are toys, not tools.
What Critics Are Missing: The Security-Capability Split
Monty is designed for the code-between-tool-calls pattern, not general-purpose computation. The agent SDK provides the tool layer (via MCP) for filesystem/API/database access. Monty provides the computation layer between tool calls. Together, they create a security model where the LLM can compute freely but can only interact with the outside world through whitelisted, logged, auditable MCP connectors.
This is actually a stronger security posture than Docker containers, where escape vulnerabilities are regularly discovered. The LLM's code is sandboxed. The LLM's tool access is audited. This is the compliance layer enterprises need.
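The split can be sketched in a few lines: the sandbox namespace exposes only wrapped tool functions, so every interaction with the outside world passes through an audit log while pure computation runs unconstrained. `query_db` and the audit format are hypothetical, and the builtins-stripping here is an illustration of the pattern, not Monty's actual mechanism.

```python
import functools

AUDIT_LOG: list[str] = []

def audited(tool):
    """Wrap a whitelisted connector so every call is logged for compliance."""
    @functools.wraps(tool)
    def wrapper(*args, **kwargs):
        AUDIT_LOG.append(f"{tool.__name__}{args}")
        return tool(*args, **kwargs)
    return wrapper

@audited
def query_db(sql: str) -> list[tuple]:
    # Stand-in for a real MCP database connector (hypothetical tool).
    return [("Q1", 1200)]

# The sandbox namespace contains only whitelisted builtins plus audited
# tools: the LLM's code computes freely, but reaches the outside world
# solely through logged connectors.
namespace = {"__builtins__": {"sum": sum}, "query_db": query_db}
exec("rows = query_db('SELECT quarter, revenue FROM sales')\n"
     "total = sum(r[1] for r in rows)", namespace)

assert namespace["total"] == 1200
assert AUDIT_LOG[0].startswith("query_db")
```

Every tool invocation leaves an audit entry while arithmetic over the results leaves none, which is exactly the compute-freely/interact-auditably split described above.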
What This Means for ML Engineers
If you're building agent systems, immediately evaluate Monty for the code-between-tool-calls pattern. Adopt MCP for tool integration regardless of which SDK you choose—protocol-level interoperability protects against framework switching costs. For enterprise deployments, Monty's state serialization + SDK tracing provides the audit trail compliance teams require before automating business decisions.
The framework war is a distraction. The real story is protocol convergence at the tool and communication layers. Build your systems expecting that you might switch agent frameworks in 2-3 years. MCP + A2A make that switching cost bearable.