Sandbox

All bash commands from the LLM run in isolated Docker containers. The sandbox is mandatory - there is no option to run commands directly on the host.

Overview

The sandbox protects against:

  • Malicious commands - LLM generating harmful commands (intentional or via prompt injection)
  • Accidental damage - Commands that could damage the host system
  • Resource exhaustion - Fork bombs, memory exhaustion, disk filling
  • Data exfiltration - Unauthorized access to host files or secrets
  • Privilege escalation - Attempts to gain root or host access

Architecture

┌────────────────────────────────────────────────────────────┐
│                        HOST SYSTEM                         │
│ ┌────────────────────────────────────────────────────────┐ │
│ │  Ash Agent                                             │ │
│ │  - Runs on host                                        │ │
│ │  - Has access to config (~/.ash/)                      │ │
│ │  - Has access to SQLite database                       │ │
│ │  - Communicates with LLM API                           │ │
│ └────────────────────────────────────────────────────────┘ │
│                             │                              │
│                       Tool Execution                       │
│                             ▼                              │
│ ┌────────────────────────────────────────────────────────┐ │
│ │  Docker Container (Sandbox)                            │ │
│ │ ┌────────────────────────────────────────────────────┐ │ │
│ │ │  Bash commands execute here                        │ │ │
│ │ │  - Isolated filesystem                             │ │ │
│ │ │  - Limited resources                               │ │ │
│ │ │  - Unprivileged user                               │ │ │
│ │ │  - Optional network access                         │ │ │
│ │ └────────────────────────────────────────────────────┘ │ │
│ └────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────┘

Configuration

[sandbox]
image = "ash-sandbox:latest"
timeout = 60
memory_limit = "512m"
cpu_limit = 1.0
runtime = "runc"
network_mode = "bridge"
dns_servers = []
http_proxy = ""
workspace_access = "rw"
sessions_access = "none"

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| image | string | "ash-sandbox:latest" | Docker image name |
| timeout | int | 60 | Command timeout in seconds |
| memory_limit | string | "512m" | Container memory limit |
| cpu_limit | float | 1.0 | CPU cores allowed |
| runtime | string | "runc" | Container runtime |
| network_mode | string | "bridge" | Network isolation mode |
| dns_servers | list | [] | Custom DNS servers |
| http_proxy | string | "" | HTTP proxy URL |
| workspace_access | string | "rw" | Workspace mount mode |
| sessions_access | string | "none" | Sessions directory mount mode |
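
Several of these options accept only a small set of values, which are listed in later sections of this page. The sketch below shows one way such values could be checked before a container is started. validate_sandbox_options is a hypothetical helper, and it assumes that no values beyond those documented here are accepted.

```python
# Allowed values as documented on this page (assumed to be exhaustive).
ALLOWED_VALUES = {
    "runtime": ("runc", "runsc"),
    "network_mode": ("none", "bridge"),
    "workspace_access": ("none", "ro", "rw"),
    "sessions_access": ("none", "ro"),
}


def validate_sandbox_options(options: dict) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    for key, allowed in ALLOWED_VALUES.items():
        value = options.get(key)
        if value is not None and value not in allowed:
            problems.append(f"{key} must be one of {list(allowed)}, got {value!r}")
    if options.get("timeout", 60) <= 0:
        problems.append("timeout must be a positive number of seconds")
    if options.get("cpu_limit", 1.0) <= 0:
        problems.append("cpu_limit must be greater than zero")
    return problems


for problem in validate_sandbox_options({"network_mode": "host", "timeout": 0}):
    print(problem)
```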

Security Controls

Container Isolation

| Control | Implementation | Purpose |
| --- | --- | --- |
| Read-only root filesystem | --read-only | Prevent persistent changes |
| Dropped capabilities | cap_drop: ALL | Remove Linux capabilities |
| No privilege escalation | no-new-privileges | Prevent setuid exploitation |
| Process limit | pids_limit: 100 | Fork bomb protection |
| Memory limit | mem_limit: 512m | Memory exhaustion protection |
| CPU limit | cpu_limit: 1.0 | CPU exhaustion protection |
| Non-root user | USER sandbox | Reduced privilege |
| Removed setuid binaries | Dockerfile cleanup | Prevent privilege escalation |
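
For readers who think in terms of the docker CLI, the controls above correspond roughly to the docker run flags in this sketch. Ash may well set them through the Docker API instead. isolation_args is a hypothetical helper used only for illustration.

```python
import shlex


def isolation_args(
    memory_limit: str = "512m",
    cpu_limit: float = 1.0,
    runtime: str = "runc",
    pids_limit: int = 100,
) -> list[str]:
    """Build docker run flags that mirror the isolation controls in the table."""
    return [
        "--read-only",                          # read-only root filesystem
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block setuid-style escalation
        "--pids-limit", str(pids_limit),        # fork bomb protection
        "--memory", memory_limit,               # memory exhaustion protection
        "--cpus", str(cpu_limit),               # CPU exhaustion protection
        "--runtime", runtime,                   # runc or runsc
    ]
    # The non-root user and setuid cleanup come from the image's Dockerfile,
    # so no flags are needed for them here.


command = ["docker", "run", "--rm", *isolation_args(), "ash-sandbox:latest", "whoami"]
print(shlex.join(command))
```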

Filesystem Access

| Path | Access | Notes |
| --- | --- | --- |
| / (root) | Read-only | Immutable base system |
| /etc, /usr, /bin | Read-only | System directories protected |
| /workspace | Configurable (none/ro/rw) | Mounted from host workspace |
| /tmp | Read-write (tmpfs, 64MB) | Temporary files, noexec |
| /home/sandbox | Read-write (tmpfs, 64MB) | User home, noexec |
| /var/tmp | Read-write (tmpfs, 32MB) | Temporary files, noexec |
| /run | Read-write (tmpfs, 16MB) | Runtime files, noexec |
| /root | No access | Root home inaccessible |
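
The writable paths in the table can be expressed as docker run mount flags. The following is a sketch under the assumption that the tmpfs sizes and noexec option are passed via --tmpfs and the workspace via --volume. mount_args and the host path in the example are hypothetical.

```python
# tmpfs mounts and sizes as listed in the table above.
TMPFS_MOUNTS = {
    "/tmp": "64m",
    "/home/sandbox": "64m",
    "/var/tmp": "32m",
    "/run": "16m",
}


def mount_args(workspace_dir: str, workspace_access: str = "rw") -> list[str]:
    """Build docker run flags for the tmpfs mounts and the optional workspace mount."""
    args = []
    for path, size in TMPFS_MOUNTS.items():
        args += ["--tmpfs", f"{path}:rw,noexec,size={size}"]
    if workspace_access in ("ro", "rw"):
        args += ["--volume", f"{workspace_dir}:/workspace:{workspace_access}"]
    elif workspace_access != "none":
        raise ValueError(f"unsupported workspace_access: {workspace_access!r}")
    return args


for flag, value in zip(*[iter(mount_args("/srv/ash/workspace", "ro"))] * 2):
    print(flag, value)
```

With workspace_access set to "none", only the four tmpfs mounts are produced and /workspace is not mounted at all.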

Network Isolation

| Mode | Behavior |
| --- | --- |
| none | Completely isolated, no network |
| bridge | Standard Docker networking, can reach internet |

[sandbox]
network_mode = "none" # Fully isolated

DNS Filtering

Use filtered DNS servers:

[sandbox]
network_mode = "bridge"
dns_servers = ["9.9.9.9", "149.112.112.112"] # Quad9 filtered DNS

HTTP Proxy

Route traffic through a proxy for monitoring:

[sandbox]
http_proxy = "http://localhost:8888"
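
Taken together, the network mode, DNS, and proxy options could translate into container settings along these lines. This is a sketch only: network_args is a hypothetical helper, and passing the proxy through HTTP_PROXY/HTTPS_PROXY environment variables is an assumption about how http_proxy is applied.

```python
def network_args(
    network_mode: str = "bridge",
    dns_servers: tuple[str, ...] = (),
    http_proxy: str = "",
) -> list[str]:
    """Build docker run flags for network mode, custom DNS, and an HTTP proxy."""
    args = ["--network", network_mode]
    if network_mode == "none":
        # No network at all, so DNS and proxy settings would have no effect.
        return args
    for server in dns_servers:
        args += ["--dns", server]
    if http_proxy:
        # Assumption: the proxy is exposed to tools via conventional env vars.
        for name in ("HTTP_PROXY", "HTTPS_PROXY"):
            args += ["--env", f"{name}={http_proxy}"]
    return args


print(network_args("bridge", ("9.9.9.9", "149.112.112.112")))
print(network_args("none", ("9.9.9.9",), "http://localhost:8888"))
```

In bridge mode, localhost inside a container normally refers to the container itself, so whether a proxy URL such as http://localhost:8888 reaches a proxy running on the host depends on how Ash wires it up, which this page does not describe.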

Workspace Access

Control how the workspace is mounted:

| Mode | Description |
| --- | --- |
| none | Workspace not mounted |
| ro | Read-only access |
| rw | Read-write access |

[sandbox]
workspace_access = "ro" # Read-only for safety

Sessions Access

Control whether the agent can read session history from the sandbox:

| Mode | Description |
| --- | --- |
| none | Sessions not mounted (default) |
| ro | Read-only access to session transcripts |

[sandbox]
sessions_access = "ro" # Allow reading past conversations

Container Runtime

runc (Default)

Standard OCI runtime:

[sandbox]
runtime = "runc"

gVisor (runsc)

Enhanced isolation with gVisor:

[sandbox]
runtime = "runsc"

Components

Sandbox Manager

Location: src/ash/sandbox/manager.py

Manages container lifecycle:

class SandboxManager:
    async def create(self) -> Container:
        """Create a new sandbox container."""

    async def execute(
        self,
        command: str,
        timeout: int = 60,
    ) -> ExecutionResult:
        """Execute command in sandbox."""

    async def cleanup(self) -> None:
        """Remove stopped containers."""

Sandbox Executor

Location: src/ash/sandbox/executor.py

Handles command execution:

class SandboxExecutor:
    async def run(
        self,
        command: str,
        *,
        timeout: int,
        working_dir: str = "/workspace",
    ) -> ExecutionResult:
        """Run command with resource limits."""

Expected Behaviors

MUST Allow

  1. Command execution - Bash commands run and return output
  2. Python execution - python3 available for scripting
  3. Common tools - git, curl, jq, vim, less, tree available
  4. Workspace access - Read/write to /workspace when configured
  5. Temp file creation - Write to /tmp for temporary files
  6. Network requests - HTTP/HTTPS when network_mode = "bridge"
  7. Exit codes - Non-zero exit codes preserved and reported
  8. Stderr capture - Error output captured and returned

MUST Block

  1. System modification - Writing to /etc, /usr, /bin, etc.
  2. Privilege escalation - sudo, su, setuid binaries
  3. Container escape - Access to host filesystem outside mounts
  4. Resource exhaustion - Fork bombs, memory bombs limited
  5. Persistent malware - Read-only filesystem prevents persistence
  6. Host secret access - No access to host environment variables
  7. Unlimited execution - Commands timeout after configured limit
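
Three of the behaviors above (exit codes preserved, stderr captured, commands cut off at the timeout) can be sketched with asyncio subprocesses. This is not the SandboxExecutor implementation. The ExecutionResult fields and run_with_timeout are hypothetical, and the example runs a local Python process where the real sandbox would run a command inside the container.

```python
import asyncio
import sys
from dataclasses import dataclass


@dataclass
class ExecutionResult:
    """Hypothetical result shape; the real fields are not shown on this page."""

    exit_code: int
    stdout: str
    stderr: str
    timed_out: bool = False


async def run_with_timeout(argv: list[str], timeout: float) -> ExecutionResult:
    """Run a process, capture stdout/stderr, and kill it if it exceeds the timeout."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        out, err = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()        # stop the runaway command
        await proc.wait()  # reap it so no zombie process is left behind
        return ExecutionResult(-1, "", "", timed_out=True)
    return ExecutionResult(proc.returncode, out.decode(), err.decode())


failing = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
sleeping = [sys.executable, "-c", "import time; time.sleep(30)"]

ok = asyncio.run(run_with_timeout(failing, timeout=10))
slow = asyncio.run(run_with_timeout(sleeping, timeout=0.5))
print(ok.exit_code, ok.stderr, slow.timed_out)
```

The first call reports the non-zero exit code and the stderr text unchanged. The second is killed once the timeout elapses and is flagged as timed out.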

CLI Commands

Building the Sandbox

uv run ash sandbox build

Managing Containers

# Clean up stopped containers
uv run ash sandbox clean

Verification

Automated Tests

Run the security verification suite with pytest:

uv run pytest tests/test_sandbox_verify.py -v

This runs tests across 5 categories:

  • SECURITY - User isolation, filesystem restrictions
  • RESOURCES - Timeouts, tmpfs, noexec
  • NETWORK - DNS, HTTP, HTTPS connectivity (use -m "not network" to skip)
  • FUNCTIONAL - Available tools and utilities
  • EDGE_CASES - Special characters, output handling

Manual Prompt Tests

Use the /test-sandbox skill for manual verification prompts. Key scenarios:

  1. rm -rf / -> “Read-only file system”
  2. sudo whoami -> “command not found” or “permission denied”
  3. Fork bomb :(){ :|:& };: -> Contained by pids limit
  4. Memory bomb -> Killed by memory limit

Troubleshooting

Container fails to start

Check Docker is running:

docker info

Command timeout

Increase the timeout:

[sandbox]
timeout = 120

Out of memory

Increase memory limit:

[sandbox]
memory_limit = "1g"
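
When choosing a new limit, it can help to see what the Docker-style size strings mean in bytes. The helper below is a small sketch for that purpose. parse_memory_limit is a hypothetical name, and it handles only the b, k, m, and g suffixes.

```python
import re

# Binary multipliers for Docker-style size suffixes.
_UNITS = {"": 1, "b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_memory_limit(value: str) -> int:
    """Convert a string such as "512m" or "1g" into a number of bytes."""
    match = re.fullmatch(r"(\d+)([bkmg]?)", value.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized memory limit: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


print(parse_memory_limit("512m"), parse_memory_limit("1g"))
```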

Incident Response

If a sandbox escape or security issue is discovered:

  1. Stop the service - ash sandbox clean removes all containers
  2. Review logs - Check what commands were executed
  3. Update image - ash sandbox build --force rebuilds with fixes
  4. Report issue - File security issue in repository