By default, each agent runs one instance at a time. This guide shows how to scale up and use resource locks to prevent duplicate work.

The Problem

With scale = 1, a single agent instance handles all work sequentially. If 5 GitHub issues arrive via webhook while the agent is working on one, those 5 events queue up and wait. For high-volume workloads, this creates a bottleneck.

Increase Scale

In the agent’s config.toml:
# agents/dev/config.toml
scale = 3    # Run up to 3 instances concurrently
Now when 5 issues arrive, up to 3 are processed simultaneously. The remaining 2 wait in the work queue.

Add Locking

With multiple instances running, two of them might pick up the same issue. Add a lock/skip/work/unlock pattern to your SKILL.md:
## Workflow

1. List open issues labeled "agent" in repos from `<agent-config>`
2. For each issue:
   - rlock "github://owner/repo/issues/123"
   - If the lock fails, skip this issue — another instance is handling it
   - Clone the repo, create a branch, implement the fix
   - Open a PR and link it to the issue
   - runlock "github://owner/repo/issues/123"
3. If you completed work and there may be more issues, run `al-rerun`
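The workflow above can be sketched as a self-contained simulation. The in-memory `locks` dict and the `rlock`/`runlock` functions here are stand-ins for the real commands, and appending to `completed` stands in for the clone/branch/fix/PR steps:

```python
# Illustrative sketch of the lock/skip/work/unlock pattern.
# `locks` is an in-memory stand-in for the real lock service.

locks = {}  # resource URI -> instance id currently holding the lock

def rlock(resource, instance):
    """Try to acquire the lock; mirrors the ok/holder response shape."""
    if resource in locks:
        return {"ok": False, "holder": locks[resource]}
    locks[resource] = instance
    return {"ok": True}

def runlock(resource, instance):
    """Release the lock if this instance holds it."""
    if locks.get(resource) == instance:
        del locks[resource]

def process_issues(issues, instance):
    completed = []
    for issue in issues:
        resource = f"github://owner/repo/issues/{issue}"
        if not rlock(resource, instance)["ok"]:
            continue  # another instance is handling it -- skip
        completed.append(issue)  # clone, branch, implement, open PR...
        runlock(resource, instance)
    return completed
```

If two instances run this loop over the same issue list, each issue is completed exactly once: whichever instance acquires the lock first does the work, and the other skips.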

How lock commands work

When the agent runs rlock "github://owner/repo/issues/123":
  • Lock acquired: {"ok": true} — proceed with work
  • Already held: {"ok": false, "holder": "dev-abc123", ...} — skip this resource
When done: runlock "github://owner/repo/issues/123" releases the lock. If the agent crashes or times out, locks are auto-released.
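In a script, the JSON responses above can be checked before proceeding. A minimal sketch, using the two response shapes documented here:

```python
import json

def should_proceed(rlock_output: str) -> bool:
    """Return True when rlock reports the lock was acquired."""
    return json.loads(rlock_output).get("ok", False)

# The two responses documented above:
assert should_proceed('{"ok": true}') is True
assert should_proceed('{"ok": false, "holder": "dev-abc123"}') is False
```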

Monitor with al stat

Check queue depth and running instances:
al stat
al stat -E production
The queue column shows how many events are waiting. If it’s consistently high, consider increasing scale.

Resource Considerations

Each parallel instance:
  • Uses a separate Docker container
  • Consumes memory (local.memory per container, default 4GB)
  • Consumes CPU (local.cpus per container, default 2)
  • Makes independent LLM API calls (watch your rate limits and quota)
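With the documented defaults, the footprint scales linearly with instance count. A quick back-of-envelope check, using scale = 3 from this guide:

```python
# Back-of-envelope resource math for scale = 3 with the
# documented defaults (4 GB memory, 2 CPUs per container).
scale = 3
memory_gb_per_container = 4   # local.memory default
cpus_per_container = 2        # local.cpus default

total_memory_gb = scale * memory_gb_per_container   # 12 GB
total_cpus = scale * cpus_per_container             # 6 CPUs
print(total_memory_gb, total_cpus)
```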

Tune work queue size

If events arrive faster than agents can process them, the queue buffers them:
# config.toml
workQueueSize = 200    # default: 100 per agent
When the queue is full, the oldest items are dropped.
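The drop-oldest behavior can be modeled with a bounded deque. This is illustrative only; the real queue is internal to the runtime:

```python
from collections import deque

# A bounded queue that evicts the oldest item when full,
# mirroring the documented drop-oldest behavior.
queue = deque(maxlen=3)  # stand-in for workQueueSize = 3
for event in ["e1", "e2", "e3", "e4", "e5"]:
    queue.append(event)  # when full, the oldest item is evicted

print(list(queue))  # e1 and e2 were dropped
```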

Default agent scale

Set the default scale for all agents that don’t have an explicit scale in their config.toml:
# config.toml
defaultAgentScale = 3    # each agent gets 3 runners unless overridden
Without this setting, agents default to 1 runner each.

Project-wide scale cap

Limit total concurrent runners across all agents:
# config.toml
scale = 10    # max 10 runners total across all agents
If defaultAgentScale * agentCount exceeds scale, agents are throttled at startup and a warning is shown.
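The startup check described above amounts to the following sketch; the exact throttling behavior is up to the runtime, and the agent count here is a made-up example:

```python
# Sketch of the startup capacity check described above.
default_agent_scale = 3
agent_count = 4              # hypothetical: four agents in the project
project_scale_cap = 10       # the project-wide `scale` setting

requested = default_agent_scale * agent_count  # 12 runners requested
if requested > project_scale_cap:
    # agents are throttled at startup and a warning is shown
    print(f"warning: {requested} runners requested, "
          f"capped at {project_scale_cap}")
```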

Example Configuration

Agent runtime config in agents/dev/config.toml:
credentials = ["github_token", "git_ssh"]
schedule = "*/5 * * * *"
models = ["sonnet"]
scale = 3

[[webhooks]]
source = "my-github"
events = ["issues"]
actions = ["labeled"]
labels = ["agent"]

[params]
repos = ["acme/app", "acme/api"]
triggerLabel = "agent"
