# Container Isolation
All agents run in isolated containers for security and consistency. Container isolation is always enabled.

## How it works

When `al start` runs:

- The base image (`al-agent:latest`) is built from `docker/Dockerfile` on first run
- Per-agent images are built for any agent that has a custom `Dockerfile`
- Each agent run launches a fresh container with:
  - Read-only root filesystem
  - Credentials mounted read-only at `/credentials/`
  - Writable tmpfs at `/tmp` (2GB)
  - All capabilities dropped, no-new-privileges
  - PID, memory, and CPU limits
  - Non-root user (uid 1000)
  - A unique shutdown secret for the anti-exfiltration kill switch
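These guarantees map roughly onto standard `docker run` flags. A sketch only: the PID limit value and the credential staging path are assumptions, and the launcher builds the real invocation programmatically, so the command is printed here rather than executed.

```shell
# Approximate docker run flags for one agent run (illustrative, not the
# exact launcher output). Memory/CPU defaults match the Configuration table.
cmd='docker run --rm \
  --read-only \
  --tmpfs /tmp:rw,size=2g \
  --cap-drop=ALL --security-opt=no-new-privileges \
  --pids-limit=256 --memory=4g --cpus=2 \
  --user=1000 \
  -v /path/to/staged-creds:/credentials:ro \
  al-agent:latest'
printf '%s\n' "$cmd"
```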
## Container runtime

Each agent run is a short-lived container that boots, runs a single LLM session, and exits. The entry point is `node /app/dist/agents/container-entry.js`.
## Environment

The container receives everything it needs via environment variables and mounts:

| Env var | Description |
|---|---|
| `AGENT_CONFIG` | JSON-serialized agent config (model, credentials, params) plus `ACTIONS.md` content |
| `PROMPT` | The fully assembled prompt (`<agent-config>` + `<credential-context>` + trigger text) |
| `TIMEOUT_SECONDS` | Max runtime in seconds (default: 3600). The container self-terminates if exceeded |
| `GATEWAY_URL` | HTTP URL of the host gateway (local Docker only — used for credential fetch and shutdown) |
| `SHUTDOWN_SECRET` | Unique per-run secret for the anti-exfiltration kill switch (local Docker only) |
Credential delivery depends on the runtime:

| Runtime | Strategy | How it works |
|---|---|---|
| Local Docker | Volume mount | Files staged to a temp dir, mounted read-only at `/credentials/<type>/<instance>/<field>` |
| Cloud Run | Gateway fetch | Container fetches credentials from `GATEWAY_URL/credentials/<secret>` on startup |
| ECS Fargate | Env vars | Secrets Manager values injected as `AL_SECRET_<type>__<instance>__<field>` env vars |
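The ECS naming scheme can be decoded with plain parameter expansion. A minimal sketch, with an illustrative (fake) secret value:

```shell
# An ECS-injected env var pairs a structured name with the secret value.
AL_SECRET_github_token__default__token='ghp_example'

# Decode the name back into <type>/<instance>/<field>.
name='AL_SECRET_github_token__default__token'
rest=${name#AL_SECRET_}          # github_token__default__token
cred_type=${rest%%__*}           # github_token
rest=${rest#*__}                 # default__token
cred_instance=${rest%%__*}       # default
cred_field=${rest#*__}           # token
echo "$cred_type/$cred_instance/$cred_field"
```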
## Startup sequence

- Set working directory — `chdir("/tmp")`
- Start self-termination timer — kills the process with exit code 124 if `TIMEOUT_SECONDS` is exceeded
- Parse config — reads `AGENT_CONFIG`, extracts `ACTIONS.md` content
- Load credentials — from volume, env vars, or gateway (see table above)
- Inject env vars from credentials:
  - `GITHUB_TOKEN`/`GH_TOKEN` from the `github_token` credential
  - `SENTRY_AUTH_TOKEN` from the `sentry_token` credential
  - `GIT_SSH_COMMAND` pointing to the mounted SSH key from the `git_ssh` credential
  - `GIT_AUTHOR_NAME`/`GIT_AUTHOR_EMAIL`/`GIT_COMMITTER_NAME`/`GIT_COMMITTER_EMAIL` from the `git_ssh` credential
  - Git HTTPS credential helper configured if `GITHUB_TOKEN` is set
- Create pi-coding-agent session — initializes the LLM model, tools, and settings
- Send prompt — delivers the pre-built prompt to the session
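The credential-injection step can be sketched for the volume-mount layout. The instance name `default` and the field name `token` are assumptions, and a throwaway fixture stands in for the real read-only mount:

```shell
# Build a fixture mimicking /credentials/<type>/<instance>/<field>.
CRED_ROOT=$(mktemp -d)
mkdir -p "$CRED_ROOT/github_token/default"
printf 'ghp_example' > "$CRED_ROOT/github_token/default/token"

# Export GITHUB_TOKEN/GH_TOKEN if the credential file is present.
tok="$CRED_ROOT/github_token/default/token"
if [ -f "$tok" ]; then
  GITHUB_TOKEN=$(cat "$tok")
  export GITHUB_TOKEN
  GH_TOKEN="$GITHUB_TOKEN"
  export GH_TOKEN
  # The real entry point also configures a git HTTPS credential helper here.
fi
echo "GITHUB_TOKEN=$GITHUB_TOKEN"
```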
## Agent session
The prompt is sent to the LLM with rate-limit retry (up to 5 attempts with exponential backoff, 30s to 5min). The LLM runs autonomously — reading files, executing commands, making API calls — until it finishes or hits an error.

Unrecoverable error detection: the container watches for repeated auth/permission failures (e.g. “bad credentials”, “permission denied”, “resource not accessible by personal access token”). After three such errors, it aborts early rather than burning through retries.

## Exit codes
| Code | Meaning |
|---|---|
| 0 | Success — agent completed its work |
| 1 | Error — missing config, credential failure, unrecoverable errors, or uncaught exception |
| 124 | Timeout — `TIMEOUT_SECONDS` exceeded, container self-terminated |
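A supervising script might map these codes to outcomes like so (the `handle_exit` helper is illustrative, not part of Action Llama):

```shell
# Map a container exit code to a human-readable outcome.
handle_exit() {
  case "$1" in
    0)   echo "success: agent completed" ;;
    124) echo "timeout: TIMEOUT_SECONDS exceeded" ;;
    *)   echo "error: exit code $1" ;;
  esac
}

handle_exit 0
handle_exit 124
handle_exit 1
```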
## Log protocol
The container communicates with the scheduler via structured JSON lines on stdout. This is how the scheduler tracks progress, surfaces errors in the TUI, and writes log files. Structured log lines are JSON objects; the `_log: true` field distinguishes them from plain text output. The scheduler parses these and forwards them to the logger at the appropriate level.
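An example structured line (field values illustrative, using the fields described in the table below):

```json
{"_log":true,"level":"info","msg":"bash","ts":1712345678901,"cmd":"git status"}
```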
| Field | Description |
|---|---|
| `_log` | Always `true` — marker for structured log lines |
| `level` | `"debug"`, `"info"`, `"warn"`, or `"error"` |
| `msg` | Log message (e.g. `"bash"`, `"tool error"`, `"credentials loaded from volume"`) |
| `ts` | Unix timestamp in milliseconds |
| ... | Additional fields vary by message (e.g. `cmd`, `tool`, `error`, `result`) |
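Scheduler-side splitting of structured lines from plain output can be sketched as follows; real parsing is JSON, so the substring match here is a simplification:

```shell
# Classify each stdout line as structured (has "_log":true) or plain text.
classify() {
  while IFS= read -r line; do
    case "$line" in
      *'"_log":true'*) echo "structured: $line" ;;
      *)               echo "plain: $line" ;;
    esac
  done
}

printf '%s\n' '{"_log":true,"level":"info","msg":"bash","cmd":"ls"}' 'All done.' | classify
```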
Common log messages:

| Message | Level | When |
|---|---|---|
"container starting" | info | Boot, includes agentName and modelId |
"credentials loaded from ..." | info | After credential loading (volume, env vars, or gateway) |
"SSH key configured for git" | info | After SSH key setup |
"creating agent session" | info | Before LLM session creation |
"session created, sending prompt" | info | Prompt delivery |
"bash" | info | Every bash tool call, with cmd field |
"tool error" | error | Failed tool call, with tool, cmd, and result fields |
"rate limited, retrying prompt" | warn | Rate limit hit, with attempt and delayMs |
"run completed" | info | Agent finished successfully |
"no work to do" | info | Agent found nothing to act on |
"container timeout reached, self-terminating" | error | Timeout exceeded |
## Signal commands

The container provides helper commands in `/tmp/bin/` that write to `$AL_SIGNAL_DIR`:
| Command | Description |
|---|---|
| `al-rerun` | Request an immediate rerun to drain remaining backlog. Without this, the scheduler treats the run as complete and waits for the next scheduled tick. |
| `al-status "<text>"` | Status update shown in the TUI. Example: `al-status "reviewing PR #42"` |
| `al-return "<value>"` | Return a value to the calling agent. Used when this agent was invoked via `al-call`. |
| `al-exit [code]` | Terminate with an exit code indicating an unrecoverable error. Defaults to 15. |
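As a sketch of the mechanism, a signal command amounts to writing a small file into `$AL_SIGNAL_DIR`. The `al_status` function and the `status` file name below are hypothetical stand-ins for the real `/tmp/bin/` scripts:

```shell
# A temp dir stands in for the real signal dir the scheduler watches.
AL_SIGNAL_DIR=$(mktemp -d)

# Hypothetical reimplementation of al-status: record the latest status text.
al_status() { printf '%s\n' "$1" > "$AL_SIGNAL_DIR/status"; }

al_status "reviewing PR #42"
cat "$AL_SIGNAL_DIR/status"
```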
## Inter-agent calls

- `al-call <agent>` — Call another agent. Pass context via stdin, get back a JSON response with a `callId`.
- `al-check <callId>` — Check the status of a call (never blocks). Returns `{"status":"pending|running|completed|error", ...}`.
- `al-wait <callId> [callId...] [--timeout N]` — Wait for one or more calls to complete (polls every 5s, default timeout 900s).
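The poll-until-done pattern looks roughly like this; `al_check` is a local stub standing in for the real `al-check` command so the loop runs standalone, and the call id is illustrative:

```shell
# Stub: always reports the call as completed (real al-check queries the gateway).
al_check() { echo '{"status":"completed"}'; }

call_id="c-123"   # in a real run, parsed from al-call's JSON response
status=""
attempts=0
while [ "$attempts" -lt 3 ]; do
  attempts=$((attempts + 1))
  status=$(al_check "$call_id")
  case "$status" in
    *'"status":"completed"'*|*'"status":"error"'*) break ;;
  esac
  # sleep 5   # the real al-wait polls every 5s
done
echo "finished after $attempts poll(s): $status"
```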
When an agent is invoked via `al-call`, its prompt includes an `<agent-call>` block with the caller name and context. To return a value, the called agent uses the `al-return` command (e.g. `al-return "approved"`).
- An agent cannot call itself (self-calls are rejected)
- If all runners for the target agent are busy, the call is queued (up to the `workQueueSize` limit in global config, default: 100)
- Call chains are allowed (agent A calls B, B calls C) up to a configurable depth limit (`maxCallDepth` in `config.toml`, default: 3)
- Called runs do not re-run — they respond to the single call
- These commands require the gateway; they return errors if `GATEWAY_URL` is not set
Any stdout line without `_log: true` is treated as plain agent output (the LLM’s final text response).
## Base image

The base image (`al-agent:latest`) is built automatically from the Action Llama package and includes the minimum needed for any agent:
| Package | Why |
|---|---|
| `node:20-slim` | Runs the container entry point and pi-coding-agent SDK |
| `git` | Clone repos, create branches, push commits |
| `curl` | API calls (Sentry, arbitrary HTTP), anti-exfiltration shutdown |
| `ca-certificates` | HTTPS for git, curl, npm |
| `openssh-client` | SSH for `GIT_SSH_COMMAND` — git clone/push over SSH |
On top of that, the image copies the compiled application (`dist/`) and installs its npm dependencies. The entry point is `node /app/dist/agents/container-entry.js`.
## Project base image

The project `Dockerfile` (at the project root) lets you customize the base image for all agents in the project. It is created by `al new` and checked into git. By default it contains only `FROM al-agent:latest` with no customizations; in this state it is skipped entirely — agents build directly on `al-agent:latest` with no overhead.
To customize, add `RUN`, `ENV`, or other instructions:
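For example (package names are illustrative; the base is `node:20-slim`, i.e. Debian, so `apt-get` is used here, and the `USER` switch follows the uid 1000 convention described below):

```dockerfile
# Illustrative project Dockerfile customization.
FROM al-agent:latest
USER root
RUN apt-get update && apt-get install -y --no-install-recommends jq ripgrep \
    && rm -rf /var/lib/apt/lists/*
USER node
ENV PAGER=cat
```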
Once the project Dockerfile contains more than the bare `FROM`, the build pipeline creates an intermediate image (`al-project-base:latest`) that all per-agent images layer on top of.
## Image build order

- `al-agent:latest`: built from `docker/Dockerfile` on first run
- `al-project-base:latest`: built only if the project Dockerfile has customizations
- `al-<agent-name>:latest`: built for each agent with its own Dockerfile, layered on the project base (or directly on `al-agent:latest` if the project Dockerfile is unmodified)
## Custom agent images

Agents that need extra tools beyond what the project base provides can add a `Dockerfile` to their own directory.
## Extending the base image

Use `FROM al-agent:latest` and add what you need. The build pipeline automatically rewrites the `FROM` line to point at the correct base (either `al-project-base:latest` or the cloud registry URI). Switch to root to install packages, then back to `node`:
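An illustrative agent Dockerfile (the installed package is an example only; `apt-get` matches the Debian-based `node:20-slim` foundation):

```dockerfile
# Illustrative per-agent Dockerfile extending the base image.
FROM al-agent:latest
USER root
RUN apt-get update && apt-get install -y --no-install-recommends python3 \
    && rm -rf /var/lib/apt/lists/*
USER node
```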
If several agents need the same tool, install it once in the project `Dockerfile` instead of duplicating it across agent Dockerfiles.
Common additions include extra language runtimes, CLIs, and build tools.
## Writing a standalone Dockerfile

If you need full control, you can write a Dockerfile from scratch. It must:

- Include Node.js 20+
- Copy the application code from the base image or install it
- Set `ENTRYPOINT ["node", "/app/dist/agents/container-entry.js"]`
- Use uid 1000 (`USER node` on node images) for compatibility with the container launcher
Whatever the approach, `/app/dist/agents/container-entry.js` must exist and be runnable. The entry point reads `AGENT_CONFIG`, `PROMPT`, `GATEWAY_URL`, and `SHUTDOWN_SECRET` from environment variables, and credentials from `/credentials/`.
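One way to satisfy the checklist, sketched under the assumption that copying `/app` out of the locally built base image is acceptable:

```dockerfile
# Illustrative standalone Dockerfile meeting the requirements above.
FROM node:20-slim
# Reuse the application code already baked into the base image.
COPY --from=al-agent:latest /app /app
USER node
ENTRYPOINT ["node", "/app/dist/agents/container-entry.js"]
```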
## Build behavior

- The base image (`al-agent:latest`) is only built if it doesn’t exist yet
- The project base image (`al-project-base:latest`) is rebuilt on every `al start` if the project Dockerfile has customizations
- Agent images are named `al-<agent-name>:latest` (e.g. `al-dev:latest`) and are rebuilt on every `al start` to pick up Dockerfile changes
- The build context is the Action Llama package root (not the project directory), so `COPY` paths in per-agent Dockerfiles reference the package’s `dist/`, `package.json`, etc.
## Configuration

| Key | Default | Description |
|---|---|---|
| `local.image` | `"al-agent:latest"` | Base Docker image name |
| `local.memory` | `"4g"` | Memory limit per container |
| `local.cpus` | `2` | CPU limit per container |
| `local.timeout` | `3600` | Max container runtime in seconds |
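As a sketch, assuming these keys live under a `[local]` table in `config.toml` (the override values are examples):

```toml
# Illustrative container overrides; defaults are shown in the table above.
[local]
image = "al-agent:latest"
memory = "8g"
cpus = 4
timeout = 7200
```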
## Container filesystem layout

| Path | Mode | Contents |
|---|---|---|
| `/app` | read-only | Action Llama application + node_modules |
| `/credentials` | read-only | Mounted credential files (`/<type>/<instance>/<field>`) |
| `/tmp` | read-write (tmpfs, 2GB) | Agent working directory — repos, scratch files, SSH keys |
## Troubleshooting

**“Docker is not running”** — Start Docker Desktop or the Docker daemon before running `al start`.

**Base image build fails** — Run `docker build -t al-agent:latest -f docker/Dockerfile .` from the Action Llama package directory to see the full build output.
**Project base image build fails** — Check that the project Dockerfile starts with `FROM al-agent:latest` and that package names are spelled correctly for the base image’s package manager (the base builds on `node:20-slim`).
**Agent image build fails** — Check that your agent’s Dockerfile starts with `FROM al-agent:latest` (the build pipeline rewrites this to the correct base) and that any package install commands are correct.

**Container exits immediately** — Check `al logs <agent>` for the error. Common causes: missing credentials, missing `ACTIONS.md`, invalid model config.