The A.I. Beat

May 31, 2026 · 5 min read

Anthropic documents how it sandboxes Claude (finally)

The company published detailed specs on process isolation, VMs, and egress controls across its agent products.
Anthropic just did something AI companies almost never do: it published detailed documentation on how it actually contains its AI agents.

The new overview covers sandboxing across Claude.ai, Claude Code, and Cowork. It’s the kind of technical transparency that’s been sorely missing from the agent space.

Here’s why this matters. When you give an AI agent access to your terminal or filesystem, you’re trusting that the product has properly isolated what it can touch. But most sandbox implementations are black boxes. You get vague promises about “secure execution environments” and not much else. Without documentation, you’re left guessing whether the sandbox is actually robust or just security theater.

What they’re actually doing

Anthropic’s containment strategy uses multiple layers: process sandboxes, virtual machines, filesystem boundaries, and egress controls. The goal is to limit where and how Claude can act.

The specifics matter here. Process-level sandboxing is your first line of defense, keeping the agent from escaping its immediate execution context. VMs add another boundary, isolating the entire runtime environment. Filesystem restrictions prevent the agent from wandering into directories it shouldn’t touch. And egress controls limit what external services it can reach.
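To make the last two layers concrete, here is a minimal Python sketch of a filesystem boundary check and an egress allowlist. It is an illustration of the general idea only, not Anthropic’s implementation: the workspace path, the host list, and both function names are hypothetical, and a real sandbox would enforce these rules in the operating system or network layer rather than in application code.

```python
from pathlib import Path
from urllib.parse import urlsplit

# Hypothetical policy values, chosen only for illustration.
WORKSPACE_ROOT = Path("/tmp/agent-workspace")
ALLOWED_HOSTS = {"api.example.com", "pypi.org"}


def is_path_allowed(candidate: str, root: Path = WORKSPACE_ROOT) -> bool:
    """Return True only if candidate resolves to a location inside root."""
    resolved_root = root.resolve()
    # Join relative paths onto the root, then collapse ".." segments and symlinks.
    # An absolute candidate replaces the root entirely and is rejected below.
    resolved = (resolved_root / candidate).resolve()
    return resolved == resolved_root or resolved_root in resolved.parents


def is_egress_allowed(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only for HTTPS URLs whose host is on the allowlist."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in allowed_hosts
```

Under these assumptions, a request for "src/main.py" passes the path check while "../../etc/passwd" does not, and only HTTPS requests to the listed hosts pass the egress check. Both checks are deny-by-default: anything not explicitly inside the boundary is refused.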

This isn’t just theoretical. When you’re using Claude Code to refactor a codebase, these layers determine whether a mistake (or a deliberate probe) stays contained or escalates into something worse. Same goes for Cowork, where multiple agents might be collaborating on shared resources.
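For a rough feel of what the process-level layer means, the sketch below runs a snippet in a separate child process with an empty environment, a throwaway working directory, and a timeout. The function name run_isolated is hypothetical, and this is far weaker than a real process sandbox, which would rely on operating system mechanisms to restrict system calls, files, and network access. It only shows the basic shape: the agent’s code runs somewhere other than your own process, with less than your own privileges and context.

```python
import subprocess
import sys
import tempfile


def run_isolated(code: str, timeout_s: float = 5.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a child process with a scratch cwd and empty env."""
    with tempfile.TemporaryDirectory() as scratch:
        return subprocess.run(
            # -I is Python's isolated mode: ignores PYTHON* variables and user site-packages.
            [sys.executable, "-I", "-c", code],
            cwd=scratch,            # child starts in a temporary directory
            env={},                 # child inherits none of the parent's environment variables
            capture_output=True,
            text=True,
            timeout=timeout_s,      # raises subprocess.TimeoutExpired if the child runs too long
        )
```

A snippet that reads an environment variable such as HOME from inside run_isolated will not see the parent’s value, which is the point: secrets in the parent’s environment are not handed to the child by default.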

Why documentation matters

Simon Willison, who flagged this on his blog, made a point worth repeating: sandboxing products are rarely documented thoroughly. That’s a problem. In security, obscurity isn’t a feature. It’s a liability.

When a company documents its containment model, it does two things. First, it lets security researchers and users actually evaluate the approach. You can identify weak spots, suggest improvements, or at least make an informed decision about what you’re willing to trust.

Second, it sets a standard. If Anthropic publishes detailed sandbox specs, other AI companies look worse for not doing the same. That’s good pressure.

What’s still missing

Anthropic’s documentation is a solid start, but there are gaps. The overview describes the containment layers but doesn’t dive deep into implementation details. How exactly are filesystem boundaries enforced? What’s the VM escape surface? Which egress controls apply to which product tiers?

Those are the questions that matter when you’re doing a real threat model. And they’re the questions that, right now, you still can’t fully answer from public docs alone.

Still, this is progress. The agent space has been moving fast, with companies shipping tools that touch production systems, personal files, and private codebases. Containment can’t stay a black box forever. Anthropic’s documentation is a step toward the kind of transparency that should be table stakes.

If you’re building AI agents or evaluating them for your team, this is worth reading. Not because Anthropic’s approach is necessarily the best one, but because it’s one of the few you can actually examine.
