Sandboxing Agents with AI Workbench
Overview
- You can use AI Workbench to sandbox agents with the project container, file permissions, and the agent’s own controls.
The project container separates the agent from the host, and only the mounted project repository is shared. In the container the agent runs as the workbench user without sudo, so you can set file permissions at build time to limit access. You can combine this with the agent’s own settings, process sandboxing, and permissions.
- Versioned configuration files provide clarity and control over the agent’s behavior.
The Workbench build process uses versioned files that can be evaluated for risks introduced by an agent.
This helps you check what’s going into your environment before you use it.
- Agent process sandboxing has gaps in containers that you should be aware of.
Agent process sandboxes rely on OS-level mechanisms that may require privileges, kernel features, or profiles not available inside a container.
When prerequisites are missing, the sandbox may fall back silently or leave surfaces like environment variables unprotected. See the Claude Code vs Cursor section below for the specific gaps.
- Sandboxing is a matter of degree and risk tolerance.
Agent clients (e.g. Claude Code and Cursor) ship with process sandboxing and with behavioral controls based on command and file permissions. Behavioral controls generally work well.
However, process sandboxing is more involved and behaves differently across agents, operating systems, and containers. You should always evaluate a sandboxing approach for limitations and risks.
Nothing is perfect or guaranteed.
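One way to check the file-permission layer described above is to ask, from inside the container, which paths the agent's user can actually read and write. The following is a minimal Python sketch under stated assumptions, not part of AI Workbench: the helper names identity_summary and audit_access are hypothetical, the example paths are placeholders, and it assumes a Linux container.

```python
import os
import shutil


def identity_summary():
    """Describe the user this process runs as (Linux only)."""
    return {
        "uid": os.getuid(),
        "is_root": os.geteuid() == 0,
        # sudo being on PATH does not mean the user is allowed to use it;
        # this only shows whether the binary is present at all.
        "sudo_on_path": shutil.which("sudo") is not None,
    }


def audit_access(paths):
    """Report whether the current user can read and write each path."""
    report = {}
    for path in paths:
        report[path] = {
            "exists": os.path.exists(path),
            "readable": os.access(path, os.R_OK),
            "writable": os.access(path, os.W_OK),
        }
    return report


if __name__ == "__main__":
    # Hypothetical paths: adjust to the directories you restricted at build time.
    print(identity_summary())
    for path, result in audit_access(["/project", "/etc", "/usr/local"]).items():
        print(path, result)
```

Running this as the same user the agent runs as shows what the permissions set at build time allow in practice. It reports access for the current process only and says nothing about what the container runtime shares with the host.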
Key Concepts
- Container Boundary
The isolation layer the agent cannot override. The agent can only reach the mounted project directory on the host. It runs as the workbench user without sudo and cannot access anything else on the host filesystem.
- Behavioral Controls
Agent-level rules that gate tool calls: file permissions, command restrictions, hook scripts. Enforced by the agent itself, not the OS. Coverage varies — some controls gate the agent’s own tools but not arbitrary shell commands.
- Process Sandbox
Kernel-level enforcement that restricts what a process can access on the filesystem, network, or at the syscall level. Different agent clients use different mechanisms (e.g. bubblewrap vs Landlock/seccomp/AppArmor), and the guarantees are neither complete nor uniform.
- Sandbox Configuration Repository
A Git repository with the settings, hooks, skills, and scripts that define agent behavior in the container. Can be cloned at build time and optionally volume-mounted for persistence across restarts.
- Defense in Depth
No single layer is sufficient. The container boundary is the floor; agent-side controls are additive. Treat behavioral controls and process sandboxing as layers on top of the container, not replacements for it.
How It Works
- Host Isolation: The project container enforces isolation that the agent cannot override.
The agent runs as the workbench user without sudo within the container, and it cannot access anything on the host outside the mounted project directory. This is a property of the container runtime, so you don’t need to do anything to enforce it.
- Agent-Side Controls: Agents have their own frameworks for restricting behavior.
These typically include file access policies, command restrictions, event hooks and process sandboxing. Coverage and enforcement level vary by agent. Some controls gate the agent’s own tools but not arbitrary shell commands.
Others enforce at the OS level but may have reduced capability in container environments. See the how-to page for each agent for what its controls cover and where the gaps are.
- The Container Boundary Is the Floor: Agent-side controls are additive but not guaranteed.
An agent’s own restrictions can be misconfigured, have known limitations in containers, or differ between versions. The container boundary is the one layer the agent cannot override. Treat agent-side controls as defense in depth, not as the primary enforcement mechanism.
- Git-Based Monitoring: AI Workbench gives you a review layer.
The Project Tab > Git > Changes section shows file-by-file diffs so you can inspect what the agent changed before committing. Environment configuration lives in versioned files (spec.yaml, build scripts), so changes to the environment appear in the same diff view as code changes. Making a commit before running the agent gives you a clean baseline to diff against.
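The Git-based review step can also be approximated from a terminal. The sketch below is illustrative only: it assumes git is installed and that the working directory is the project repository, and it uses hypothetical helper names. It lists what changed since the last commit and flags environment files. The file names spec.yaml and postBuild.bash come from this page; treating them as the complete set of environment files is an assumption.

```python
import os
import subprocess


def parse_porcelain(output):
    """Parse `git status --porcelain` (v1) output into (status, path) pairs."""
    changes = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # too short to hold "XY path"
        status, path = line[:2].strip(), line[3:]
        changes.append((status, path))
    return changes


def changed_files(repo_dir="."):
    """Return uncommitted changes in repo_dir relative to the last commit."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)


def environment_changes(changes, env_files=("spec.yaml", "postBuild.bash")):
    """Return changed paths whose file name is a known environment file."""
    return [path for _, path in changes if os.path.basename(path) in env_files]


if __name__ == "__main__":
    changes = changed_files()
    for status, path in changes:
        print(status, path)
    flagged = environment_changes(changes)
    if flagged:
        print("Environment files changed, review before rebuilding:", flagged)
```

If you commit before starting the agent, everything this prints afterwards is attributable to the agent's session, which is the same baseline idea as the diff view described above.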
What Sandboxing Does Not Cover
- Process sandboxes in containers do not isolate environment variables.
An agent with shell access can read secrets stored in environment variables regardless of sandbox mode. If your project uses secrets, use hooks to block commands that reference secret variable names. See Configure Claude Code Sandboxing in a Project Container for an example hook script.
- Some agent sandboxes fail silently in containers.
When kernel prerequisites are not met, the sandbox may fall back to unsandboxed execution without warning. You cannot assume a process sandbox is active unless you have verified the prerequisites on the host. See Cursor Sandbox Limitations in Containers for the specific prerequisites.
- Agent guidance does not enforce anything, so use permissions and hooks.
MDC rules, skills, and CLAUDE.md files inform the agent about constraints and conventions, but the agent can ignore them. For constraints that must hold, use hooks or the process sandbox.
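As an illustration of the hook approach to secrets, here is a minimal Python sketch of the decision logic for a hook that blocks shell commands referencing secret variable names. It is not the example script from the linked how-to page. It assumes a Claude Code style pre-tool hook that receives a JSON payload on stdin with tool_name and tool_input.command fields and treats exit code 2 as "block"; verify those details against your agent's hook documentation. The variable names in SECRET_NAMES and all function names are hypothetical.

```python
import json
import re
import sys

# Hypothetical secret variable names; replace with the ones your project uses.
SECRET_NAMES = ("API_KEY", "DB_PASSWORD")


def find_secret_reference(command, names=SECRET_NAMES):
    """Return the first secret name mentioned in the command, or None."""
    for name in names:
        if re.search(rf"\b{re.escape(name)}\b", command):
            return name
    return None


def decide(payload):
    """Map a hook payload to (exit_code, message). Exit code 2 means block."""
    if not isinstance(payload, dict):
        return 2, "Blocked: unexpected hook payload"
    if payload.get("tool_name") != "Bash":
        return 0, ""  # this sketch inspects shell commands only
    tool_input = payload.get("tool_input")
    command = tool_input.get("command", "") if isinstance(tool_input, dict) else ""
    hit = find_secret_reference(str(command))
    if hit:
        return 2, f"Blocked: command references secret variable {hit}"
    return 0, ""


def run_hook(stdin_text):
    """Parse the payload, print any block reason to stderr, return the exit code."""
    try:
        payload = json.loads(stdin_text)
    except json.JSONDecodeError:
        print("Blocked: hook input was not valid JSON", file=sys.stderr)
        return 2  # fail closed when the input cannot be inspected
    code, message = decide(payload)
    if message:
        print(message, file=sys.stderr)
    return code


# In an actual hook script the entry point would be:
#     sys.exit(run_hook(sys.stdin.read()))
```

This is plain string matching, so it only catches commands that name the variable literally. Commands that print the whole environment, or that build the name indirectly, pass through, which is one reason to keep secrets out of the container's environment where you can.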
Claude Code vs Cursor
- Claude Code and Cursor take different approaches to process sandboxing in containers.
Claude Code uses bubblewrap, which runs in a reduced capability mode inside containers. Cursor uses kernel primitives (Landlock, seccomp, AppArmor), which require host-level prerequisites that containers may not provide. Neither approach provides full coverage on its own.
- Hooks are the most reliable agent-side control in containers for both agents.
Claude Code hooks fire before a tool call executes and can block it. Cursor hooks can also block actions, but only certain exit codes are enforced reliably. See Configure Claude Code Sandboxing in a Project Container and Agent Sandboxing in Containers for specifics.
- Store agent configurations in a separate repository to version, share, and swap them independently of project code.
Clone the configuration repository into the container at build time using postBuild.bash. This works for both agents and keeps sandbox settings visible and version-controlled.
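Because the mechanisms named above depend on what the kernel and container expose, a quick probe can show what is visible from where the agent runs. The following Python sketch uses hypothetical helper names and makes two assumptions: that the active Linux security modules are listed in /sys/kernel/security/lsm when securityfs is mounted, and that /proc/self/status carries a Seccomp field giving the mode applied to the current process. It does not prove that any agent's sandbox is active, and it does not replace the per-agent prerequisite checks in the linked pages.

```python
import shutil
from pathlib import Path


def parse_lsm_list(text):
    """Split the comma-separated LSM list into names."""
    return [name for name in text.strip().split(",") if name]


def parse_seccomp_mode(status_text):
    """Return the Seccomp mode from /proc/<pid>/status text, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("Seccomp:"):
            return int(line.split(":", 1)[1].strip())
    return None


def read_text_or_none(path):
    """Read a file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


def probe():
    """Collect what this process can see; None means 'could not determine'."""
    lsm_text = read_text_or_none("/sys/kernel/security/lsm")
    status_text = read_text_or_none("/proc/self/status")
    lsms = parse_lsm_list(lsm_text) if lsm_text is not None else None
    return {
        "bwrap_on_path": shutil.which("bwrap") is not None,
        "landlock_listed": None if lsms is None else "landlock" in lsms,
        "apparmor_listed": None if lsms is None else "apparmor" in lsms,
        "seccomp_mode": None if status_text is None else parse_seccomp_mode(status_text),
    }


if __name__ == "__main__":
    for key, value in probe().items():
        print(f"{key}: {value}")
```

A None result is itself informative: inside a container the LSM list is often not readable, which is the kind of missing prerequisite that can lead to a silent fallback. Run the same probe on the host to compare.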