When you set `scale > 1` on an agent, multiple instances run concurrently. Without coordination, two instances might pick up the same GitHub issue, review the same PR, or deploy the same service at the same time. Resource locks prevent this.

## Why Locks Exist

Locks let concurrent agent instances claim exclusive ownership of a resource before working on it. If another instance already holds the lock, the agent skips that resource and moves on.

## How It Works

  1. Before working on a shared resource, the agent runs `rlock "github://acme/app/issues/42"`.
  2. If the lock is free, the agent gets it and proceeds.
  3. If another instance already holds the lock, the agent gets back the holder's name and skips that resource.
  4. When done, the agent runs `runlock "github://acme/app/issues/42"`.

The agent learns the lock commands from a preamble injected before the session starts. Agent authors just reference the commands in their SKILL.md workflow — no need to think about HTTP endpoints or authentication.
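The acquire-or-skip flow above can be sketched as a toy in-memory model. All names here are illustrative, and the dict stands in for real lock state, which lives in the gateway behind the `rlock`/`runlock` commands:

```python
locks = {}  # uri -> name of the holding instance (illustrative model only)

def rlock(uri, instance):
    """Try to claim the lock. Returns (acquired, holder)."""
    holder = locks.get(uri)
    if holder is not None and holder != instance:
        return False, holder      # step 3: held elsewhere, caller skips
    locks[uri] = instance
    return True, instance         # step 2: lock is free, caller proceeds

def runlock(uri, instance):
    """Release the lock, but only if we hold it (step 4)."""
    if locks.get(uri) == instance:
        del locks[uri]

ok_a, _ = rlock("github://acme/app/issues/42", "agent-1")
ok_b, holder = rlock("github://acme/app/issues/42", "agent-2")
# agent-2 sees ok_b == False and holder == "agent-1", so it skips the issue
runlock("github://acme/app/issues/42", "agent-1")
```

Returning the holder's name on failure is what lets an instance decide to skip and move on rather than wait.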

## Commands

| Command | Description |
| --- | --- |
| `rlock "<uri>"` | Acquire an exclusive lock. Fails if another instance holds it. |
| `runlock "<uri>"` | Release a lock. Only the holder can release. |
| `rlock-heartbeat "<uri>"` | Reset the TTL on a held lock. |
See Agent Commands — Locks for the full command reference with response JSON.

## Resource Key URIs

Lock keys are URIs. Use a scheme that identifies the resource type and a path that uniquely identifies the specific resource:
| Pattern | Example |
| --- | --- |
| `github://owner/repo/issues/number` | `rlock "github://acme/app/issues/42"` |
| `github://owner/repo/pr/number` | `rlock "github://acme/app/pr/17"` |
| `deploy://service-name` | `rlock "deploy://api-prod"` |
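If a skill builds lock keys programmatically, a small helper keeps them consistent. The helper name is hypothetical; only the scheme-plus-path convention comes from the patterns above:

```python
def lock_key(scheme, *parts):
    """Hypothetical helper: join a scheme and path parts into a lock key."""
    return f"{scheme}://" + "/".join(str(p) for p in parts)

issue_key = lock_key("github", "acme", "app", "issues", 42)
deploy_key = lock_key("deploy", "api-prod")
# issue_key == "github://acme/app/issues/42"
# deploy_key == "deploy://api-prod"
```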

## TTL and Expiry

Locks expire automatically after 30 minutes by default. This prevents deadlocks if an agent crashes or hangs without releasing its lock. The timeout is configurable via `resourceLockTimeout` in `config.toml` (value in seconds). For work that takes longer than the timeout, use `rlock-heartbeat` to extend the TTL. Each heartbeat resets the clock to another full TTL period. If the agent forgets to heartbeat and the lock expires, another instance can claim it.
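The expiry semantics can be modeled in a few lines. This is an illustrative sketch, not the gateway's implementation: `TTLLock` is a made-up name, the default 1800-second TTL comes from this page, and `now` is injected so the example is deterministic:

```python
import time

TTL = 1800  # seconds, matching the default resourceLockTimeout

class TTLLock:
    """Toy model of a lock with automatic expiry."""
    def __init__(self):
        self.holder = None
        self.expires_at = 0.0

    def acquire(self, instance, now=None):
        now = time.monotonic() if now is None else now
        if self.holder is not None and now < self.expires_at:
            return self.holder == instance   # still held (possibly by us)
        # free, or expired without an explicit release: claimable
        self.holder = instance
        self.expires_at = now + TTL
        return True

    def heartbeat(self, instance, now=None):
        now = time.monotonic() if now is None else now
        if self.holder != instance or now >= self.expires_at:
            return False                     # not the holder, or already expired
        self.expires_at = now + TTL          # reset to a full TTL from now
        return True

lock = TTLLock()
lock.acquire("agent-1", now=0)               # expires at t=1800
lock.heartbeat("agent-1", now=1500)          # expiry moves to t=3300
stolen_early = lock.acquire("agent-2", now=3200)  # still held -> False
stolen_late = lock.acquire("agent-2", now=3400)   # expired -> True
```

Note the takeover path: an expired lock is simply treated as free, which is how a crashed holder stops blocking everyone else.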

## Heartbeat

During long-running work, periodically run `rlock-heartbeat` to keep the lock alive:
```markdown
## Workflow

1. rlock "deploy://api-prod"
2. Run the deployment (may take 45+ minutes)
   - Every 10 minutes, run rlock-heartbeat "deploy://api-prod"
3. runlock "deploy://api-prod"
```
Each heartbeat resets the expiry to a full TTL period from the current time.
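In a scripted environment, the periodic heartbeat could run from a background thread. This sketch uses a callback in place of the actual `rlock-heartbeat` invocation, and all names are hypothetical:

```python
import threading

def start_heartbeat(beat, interval_s, stop):
    """Call `beat()` every `interval_s` seconds until the `stop` event is set.
    In a skill, `beat` would run: rlock-heartbeat "deploy://api-prod"
    """
    def loop():
        # Event.wait returns False on timeout (time to beat again)
        # and True once `stop` is set (time to exit).
        while not stop.wait(interval_s):
            beat()
    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

With the numbers from the example above, a 10-minute interval against a 30-minute TTL leaves room for two missed beats before the lock can expire.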

## Multiple Locks and Deadlock Detection

An agent instance can hold multiple locks simultaneously when working across related resources. However, this introduces the possibility of circular waits — agent A holds lock X and waits for lock Y, while agent B holds lock Y and waits for lock X. The gateway detects these cycles automatically. When an `rlock` request would create a circular wait in the wait-for graph, it returns a possible-deadlock error with the cycle path instead of blocking forever. The agent can then release its held locks and retry.
```
# Example deadlock cycle:
# Agent A holds "github://acme/app/pr/10", wants "deploy://api-prod"
# Agent B holds "deploy://api-prod", wants "github://acme/app/pr/10"
# → rlock "deploy://api-prod" returns: possible deadlock detected
```
**Note:** The agent preamble constrains agents to one lock at a time for simplicity. Multi-lock is available for advanced use cases where the agent is explicitly instructed to hold multiple locks.
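The cycle check can be pictured as a walk over the wait-for graph. This is an illustrative model of the idea, not the gateway's actual implementation:

```python
def would_deadlock(holds, waits, requester, uri):
    """Does `requester` waiting on `uri` close a cycle in the wait-for graph?
    holds: uri -> holding instance; waits: instance -> uri it is blocked on.
    """
    seen = set()
    current = holds.get(uri)            # who we would be waiting for
    while current is not None and current not in seen:
        if current == requester:
            return True                 # the chain leads back to us: deadlock
        seen.add(current)
        next_uri = waits.get(current)
        current = holds.get(next_uri) if next_uri else None
    return False

# The cycle from the example: B is already blocked on A's lock,
# and now A asks for B's lock.
holds = {"github://acme/app/pr/10": "A", "deploy://api-prod": "B"}
waits = {"B": "github://acme/app/pr/10"}
cycle = would_deadlock(holds, waits, "A", "deploy://api-prod")
```

Detecting the cycle at request time, rather than blocking, is what lets the agent back off, release its locks, and retry.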

## Authentication

Each container gets a unique per-run secret (the same one used for the shutdown API). Lock requests are authenticated with this secret, so only the container that acquired a lock can release or heartbeat it. There is no way for one agent instance to release another’s lock — it must wait for the TTL to expire.
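The holder-only rule amounts to comparing secrets on release. A minimal sketch, assuming per-run secrets are opaque strings (all names hypothetical):

```python
locks = {}  # uri -> per-run secret of the acquiring container (toy model)

def acquire(uri, secret):
    if uri in locks:
        return False
    locks[uri] = secret
    return True

def release(uri, secret):
    if locks.get(uri) != secret:
        return False        # not the holder: must wait for the TTL to expire
    del locks[uri]
    return True

acquire("deploy://api-prod", "run-secret-A")
denied = release("deploy://api-prod", "run-secret-B")   # another run's secret
allowed = release("deploy://api-prod", "run-secret-A")  # the holder's secret
```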

## Auto-release on Exit

When a container exits — whether it finishes successfully, hits an error, or times out — all of its locks are released automatically by the scheduler. You don’t need to worry about cleanup in error paths.
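Conceptually, the scheduler's cleanup is a sweep over the lock table. A sketch with hypothetical names, not the scheduler's actual code:

```python
def release_all(locks, container_id):
    """On container exit (success, error, or timeout alike),
    drop every lock that container held."""
    for uri in [u for u, holder in locks.items() if holder == container_id]:
        del locks[uri]

locks = {"github://acme/app/issues/42": "run-1", "deploy://api-prod": "run-2"}
release_all(locks, "run-1")
# only run-2's lock remains
```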

## Example in SKILL.md

```markdown
## Workflow

1. List open issues labeled "agent" in repos from `<agent-config>`
2. For each issue:
   - rlock "github://owner/repo/issues/123"
   - If the lock fails, skip this issue — another instance is handling it
   - Clone the repo, create a branch, implement the fix
   - Open a PR and link it to the issue
   - runlock "github://owner/repo/issues/123"
3. If you completed work and there may be more issues, run `al-rerun`
```

## Configuration

| Setting | Location | Default | Description |
| --- | --- | --- | --- |
| `resourceLockTimeout` | `config.toml` | `1800` (30 min) | Default lock TTL, in seconds |
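For example, to raise the default TTL to one hour (a sketch showing only the one documented setting; the rest of the file is whatever your deployment already has):

```toml
# config.toml
resourceLockTimeout = 3600  # lock TTL in seconds (default: 1800)
```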

## See Also