
Why isolated sandboxes are a hard requirement for AI agents

Running AI agents on real codebases without proper isolation leads to file collisions, secret leakage, and non-reproducible failures. Isolation isn't an optimization — it's a prerequisite.

Feb 21, 2026 · 6 min read

When you run a single agent against a repository, isolation seems optional. The agent reads files, makes edits, runs tests. Nothing obviously breaks.

Scale to a team — a researcher, an engineer, a reviewer — and the problems surface immediately. Two agents write to the same path at the same time; one overwrites the other's changes silently. An environment variable injected for the engineer leaks into a subprocess spawned by the reviewer. A test suite starts a service on port 8080; a second agent's integration test expects that port to be free and fails with a connection error that looks like a code bug. Secrets injected into a shared environment are visible to every agent regardless of role — the reviewer can read the engineer's deploy keys, the researcher can access production credentials it has no business touching.
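The secret-leakage failure above is just default process semantics. A minimal sketch (the variable name `DEPLOY_KEY` and its value are hypothetical, for illustration only): any subprocess spawned in a shared environment inherits every variable, whether or not that agent was granted it.

```python
import os
import subprocess
import sys

# Hypothetical secret injected into the shared host environment for one agent.
os.environ["DEPLOY_KEY"] = "hypothetical-secret"

# A different "agent" spawns a subprocess. By default it inherits the
# full host environment -- including secrets never meant for it.
leaked = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('DEPLOY_KEY'))"],
    capture_output=True,
    text=True,
).stdout.strip()

print(leaked)  # the subprocess can read the other agent's key
```

Nothing here is a bug; it is the documented behavior of environment inheritance, which is exactly why "just be careful" does not scale to multiple agents.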

These aren't edge cases. This is what happens by default when agents share an environment.

If agents share state — filesystem, secrets, network, processes — the system is neither reliable nor secure.

Worktrees vs. isolated sandboxes

Git worktrees are the obvious first approach. git worktree add checks out a branch into a separate directory in seconds — no image pull, no container startup, no orchestration. But the isolation is shallow: everything below the repository root is shared. System packages, /tmp, environment variables, running processes, network interfaces — all of it. Two agents installing conflicting dependencies stomp on each other. A secret in the host environment is visible to every subprocess any agent spawns. An agent that binds port 8080 blocks every other agent that needs it.
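The port-collision case is easy to reproduce without any agent framework at all. A minimal sketch in Python (the two sockets stand in for two agents' test services; the port number is whatever the OS assigns):

```python
import socket

# "Agent A" starts a service on some port of the shared host.
agent_a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
agent_a.bind(("127.0.0.1", 0))        # OS picks a free port for agent A
port = agent_a.getsockname()[1]
agent_a.listen()

# "Agent B", in its own worktree but the same network namespace,
# needs that same port for its integration test.
agent_b = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    agent_b.bind(("127.0.0.1", port))
    collided = False
except OSError:                       # EADDRINUSE: the namespace is shared
    collided = True

print(collided)
agent_a.close()
agent_b.close()
```

Separate worktree directories do nothing for agent B here: ports, like processes and environment variables, live above the repository root.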

Isolated containers flip the defaults. Each agent gets its own filesystem root, its own process namespace, its own network interface. There is no shared state unless you explicitly create it. An agent can install any dependency, bind any port, start a real service, run a full integration test suite against it, and tear it all down — without any awareness of what other agents are doing. That's what makes the feedback loop reliable: the agent can trust what it's measuring.
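Container runtimes enforce this at the OS level, but the least-privilege principle behind per-container secret injection can be sketched in miniature: construct the environment explicitly instead of inheriting it. (The variable names `PROD_CREDENTIAL` and `TASK_TOKEN` are hypothetical.)

```python
import os
import subprocess
import sys

# Hypothetical secret present on the host.
os.environ["PROD_CREDENTIAL"] = "hypothetical-secret"

# Inject only what this agent was granted, rather than the whole host env.
granted = {"PATH": os.environ.get("PATH", ""), "TASK_TOKEN": "scoped-token"}

out = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ.get('PROD_CREDENTIAL'), "
     "os.environ.get('TASK_TOKEN'))"],
    capture_output=True,
    text=True,
    env=granted,                     # explicit environment: nothing inherited
).stdout.strip()

print(out)  # the host secret never reaches the agent; its own token does
```

A container does the same thing for the filesystem, network, and process table simultaneously, which is why "no shared state unless you create it" holds across all four.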

The honest tradeoff is management complexity. Containers require image lifecycle management, cleanup policies, and orchestration across agents. Worktrees are just directories.

| Aspect | Git worktree | Isolated container |
| --- | --- | --- |
| Filesystem isolation | ✗ Shared host filesystem | ✓ Private root per agent |
| File collision risk | ✗ Real — concurrent writes to same path | ✓ None — each agent owns its workspace |
| Secret isolation | ✗ Host env inherited by all subprocesses | ✓ Per-container injection, least-privilege |
| Network isolation | ✗ Shared host interfaces and ports | ✓ Private namespace, no port collisions |
| Deploy & test services | ✗ Port conflicts, zombie processes | ✓ Agent owns its process and port space |
| Failure attribution | ✗ Hard — shared state obscures cause | ✓ Clear — everything scoped to one container |
| Management overhead | ✓ Simple — just a directory | ✗ Complex — image lifecycle, cleanup, orchestration |
| Best for | Sequential, lightweight, no service deps | Concurrent agents, real service workloads |

For sequential, lightweight tasks on a controlled host, worktrees are fine. For concurrent multi-agent workloads — especially those that deploy and test real services — containers are the only choice that keeps failures attributable.

Takeaway

Full isolation is a hard requirement for any multi-agent system that needs to be reliable and secure. That's why agyn runs every agent in its own sandbox — and handles the container management overhead so you don't have to.
