Action Llama supports 8 LLM providers. Define named models in the project’s config.toml under [models.<name>], then reference them by name in each agent’s config.toml. Agents list models in priority order: the first is the primary, and the rest are fallbacks that are tried automatically on rate limits when using the default pi harness.
[models.<name>] Fields
| Field | Type | Required | Description |
|---|---|---|---|
| provider | string | Yes | Provider name (see Providers below) |
| model | string | Yes | Model ID |
| authType | string | Yes | "api_key", "oauth_token", or "pi_auth" |
| thinkingLevel | string | No | Reasoning budget (Anthropic only) |
Providers
Anthropic
Claude models with optional extended thinking.
[models.sonnet]
provider = "anthropic"
model = "claude-sonnet-4-20250514"
thinkingLevel = "medium"
authType = "api_key"
| Model | Description |
|---|---|
| claude-opus-4-20250514 | Most capable, best for complex multi-step tasks |
| claude-sonnet-4-20250514 | Balanced performance and cost (recommended) |
| claude-haiku-3-5-20241022 | Fastest and cheapest |
Credential: anthropic_key (field: token)
Auth types:
| authType | Token format | Description |
|---|---|---|
| api_key | sk-ant-api-... | Standard Anthropic API key |
| oauth_token | sk-ant-oat-... | OAuth token from claude setup-token |
| pi_auth | (none) | Uses existing pi auth credentials (~/.pi/agent/auth.json). No credential file needed. |
Notes:
- pi_auth is only available with the pi harness.
- pi_auth is not supported in Docker mode. Switch to api_key or oauth_token for containerized runs.
- When using the claude harness, Action Llama passes Anthropic credentials to Claude CLI as ANTHROPIC_API_KEY or CLAUDE_CODE_AUTH_TOKEN depending on the configured auth type.
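The notes above amount to a small set of compatibility rules, sketched below. Both function names are hypothetical. The pairing of api_key with ANTHROPIC_API_KEY and oauth_token with CLAUDE_CODE_AUTH_TOKEN is an assumption: the text names both variables but does not state which auth type maps to which.

```python
from typing import Dict, List

# ASSUMPTION: this pairing is inferred from the variable names, not confirmed.
CLAUDE_CLI_ENV = {
    "api_key": "ANTHROPIC_API_KEY",
    "oauth_token": "CLAUDE_CODE_AUTH_TOKEN",
}


def auth_problems(auth_type: str, harness: str = "pi", docker: bool = False) -> List[str]:
    """List the documented restrictions that a given setup violates."""
    problems = []
    if auth_type == "pi_auth":
        if harness != "pi":
            problems.append("pi_auth is only available with the pi harness")
        if docker:
            problems.append("pi_auth is not supported in Docker mode")
    return problems


def claude_cli_env(auth_type: str, token: str) -> Dict[str, str]:
    """Build the environment entry handed to Claude CLI (hypothetical helper)."""
    if auth_type not in CLAUDE_CLI_ENV:
        raise ValueError(f"authType '{auth_type}' cannot be passed to Claude CLI")
    return {CLAUDE_CLI_ENV[auth_type]: token}


print(auth_problems("pi_auth", harness="claude", docker=True))
print(claude_cli_env("api_key", "sk-ant-api-EXAMPLE"))
```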
Thinking level: Anthropic is the only provider that supports thinkingLevel. Valid values:
| Level | Description |
|---|---|
| off | No extended thinking |
| minimal | Minimal reasoning |
| low | Light reasoning |
| medium | Balanced (recommended) |
| high | Deep reasoning |
| xhigh | Maximum reasoning budget |
If omitted, thinking is not explicitly configured. For other providers, thinkingLevel is ignored.
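As a rough illustration of these rules, the sketch below resolves the thinking level that would apply to a model definition. `effective_thinking_level` is a hypothetical helper, not an Action Llama function, and returning `None` for "not configured or ignored" is a choice made for this example.

```python
from typing import Optional

THINKING_LEVELS = ("off", "minimal", "low", "medium", "high", "xhigh")


def effective_thinking_level(model: dict) -> Optional[str]:
    """Return the level that applies, or None when it is unset or ignored."""
    if model.get("provider") != "anthropic":
        return None  # thinkingLevel is ignored for other providers
    level = model.get("thinkingLevel")
    if level is None:
        return None  # omitted: thinking is not explicitly configured
    if level not in THINKING_LEVELS:
        raise ValueError(f"invalid thinkingLevel '{level}'")
    return level


print(effective_thinking_level({"provider": "anthropic", "thinkingLevel": "medium"}))
print(effective_thinking_level({"provider": "openai", "thinkingLevel": "high"}))
```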
OpenAI
[models.gpt4o]
provider = "openai"
model = "gpt-4o"
authType = "api_key"
| Model | Description |
|---|---|
| gpt-4o | Flagship multimodal model (recommended) |
| gpt-4o-mini | Smaller, faster, cheaper |
| gpt-4-turbo | Previous generation |
| o1-preview | Reasoning model |
| o1-mini | Smaller reasoning model |
Credential: openai_key (field: token)
Groq
[models.groq-llama]
provider = "groq"
model = "llama-3.3-70b-versatile"
authType = "api_key"
| Model | Description |
|---|---|
| llama-3.3-70b-versatile | Llama 3.3 70B on Groq inference |
Groq runs open-source models at high speed. Check Groq’s docs for the full list of available model IDs.
Credential: groq_key (field: token)
Google Gemini
[models.gemini]
provider = "google"
model = "gemini-2.0-flash-exp"
authType = "api_key"
| Model | Description |
|---|---|
| gemini-2.0-flash-exp | Fast experimental model |
Check Google AI Studio for the full list of available model IDs.
Credential: google_key (field: token)
xAI
[models.grok]
provider = "xai"
model = "grok-beta"
authType = "api_key"
| Model | Description |
|---|---|
| grok-beta | Grok beta |
Credential: xai_key (field: token)
Mistral
[models.mistral]
provider = "mistral"
model = "mistral-large-2411"
authType = "api_key"
| Model | Description |
|---|---|
| mistral-large-2411 | Mistral Large (November 2024) |
Check Mistral’s docs for the full list of available model IDs.
Credential: mistral_key (field: token)
OpenRouter
OpenRouter provides access to models from many providers through a single API.
[models.or-sonnet]
provider = "openrouter"
model = "anthropic/claude-3.5-sonnet"
authType = "api_key"
Model IDs use the provider/model format. See OpenRouter’s model list for all available models.
Credential: openrouter_key (field: token)
Custom
Use the custom provider for any provider not listed above. The pi harness handles the model ID and API routing.
[models.my-model]
provider = "custom"
model = "your-model-name"
authType = "api_key"
Credential: custom_key (field: token)
Mixing Models
Each agent can use a different model. Define all models in the project’s config.toml, then reference them by name in each agent’s config.toml:
config.toml → [models.sonnet], [models.gpt4o], [models.groq-llama]
agents/dev/config.toml → models = ["sonnet"]
agents/reviewer/config.toml → models = ["gpt4o"]
agents/devops/config.toml → models = ["groq-llama"]
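The name-based lookup described here can be pictured as a dictionary resolution. The sketch below mirrors the three configs above as plain Python dicts. `resolve_agent_models` is a hypothetical helper, and raising on an undefined name is an assumption about how such a mistake might be reported.

```python
from typing import Dict, List

# Mirrors the [models.*] tables in the project's config.toml.
PROJECT_MODELS: Dict[str, dict] = {
    "sonnet": {"provider": "anthropic", "model": "claude-sonnet-4-20250514", "authType": "api_key"},
    "gpt4o": {"provider": "openai", "model": "gpt-4o", "authType": "api_key"},
    "groq-llama": {"provider": "groq", "model": "llama-3.3-70b-versatile", "authType": "api_key"},
}

# Mirrors the models list in each agents/<name>/config.toml.
AGENT_MODELS: Dict[str, List[str]] = {
    "dev": ["sonnet"],
    "reviewer": ["gpt4o"],
    "devops": ["groq-llama"],
}


def resolve_agent_models(names: List[str], project_models: Dict[str, dict]) -> List[dict]:
    """Turn an agent's model names into full definitions, keeping priority order."""
    missing = [name for name in names if name not in project_models]
    if missing:
        raise KeyError(f"undefined model name(s): {', '.join(missing)}")
    return [project_models[name] for name in names]


for agent, names in AGENT_MODELS.items():
    resolved = resolve_agent_models(names, PROJECT_MODELS)
    print(agent, [m["model"] for m in resolved])
```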
Model Fallback
Agents can list multiple models to create a fallback chain. With the pi harness, when the primary model is rate-limited or unavailable, Action Llama automatically tries the next model in the list:
# agents/<name>/config.toml
models = ["sonnet", "haiku", "gpt4o"]
Fallback switching is instant: there is no delay when moving to the next model. Exponential backoff applies only after every model in the chain has been exhausted.
A circuit breaker tracks model availability in memory. When a model returns a rate limit or overload error, it is marked unavailable for 60 seconds. After the cooldown expires, the model is retried.
The claude harness does not currently use the fallback chain. It runs the agent’s primary model directly via Claude CLI.
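The fallback behavior described in this section can be sketched as a loop over the chain plus an in-memory cooldown table. This is an illustration, not Action Llama's implementation: `CircuitBreaker`, `run_with_fallback`, and both exception classes are hypothetical names. The 60-second cooldown comes from the text above, and the clock is injectable so the logic can be tested without waiting.

```python
import time
from typing import Callable, Dict, List

COOLDOWN_SECONDS = 60  # cooldown stated in the text above


class RateLimited(Exception):
    """Hypothetical stand-in for a rate-limit or overload error."""


class AllModelsUnavailable(Exception):
    """Raised when every model in the chain failed or is cooling down."""


class CircuitBreaker:
    """Tracks, in memory, until when each model is considered unavailable."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._unavailable_until: Dict[str, float] = {}

    def is_available(self, name: str) -> bool:
        return self._clock() >= self._unavailable_until.get(name, float("-inf"))

    def trip(self, name: str) -> None:
        self._unavailable_until[name] = self._clock() + COOLDOWN_SECONDS


def run_with_fallback(models: List[str], call: Callable[[str], str],
                      breaker: CircuitBreaker) -> str:
    """Try each model in priority order, switching without delay on failure."""
    for name in models:
        if not breaker.is_available(name):
            continue  # still cooling down: skip without waiting
        try:
            return call(name)
        except RateLimited:
            breaker.trip(name)  # mark unavailable, move straight to the next model
    # Only at this point would a caller apply exponential backoff and retry.
    raise AllModelsUnavailable("every model in the chain is unavailable")
```

A caller would catch `AllModelsUnavailable`, back off, and call `run_with_fallback` again. By then some cooldowns may have expired, so the primary model gets another chance.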
Credential Setup
Each provider requires a corresponding credential in ~/.action-llama/credentials/. Run al doctor to configure them interactively.
LLM credentials are loaded automatically based on the models referenced in the agent’s models list in config.toml — they do not need to be listed in the agent’s credentials array. The credentials array is for runtime credentials the agent uses during execution (GitHub tokens, SSH keys, etc.).
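To make the automatic loading concrete, the sketch below derives the LLM credentials an agent needs from its models list. The provider-to-credential names are the ones listed under each provider on this page. `llm_credentials` is a hypothetical helper, and the "haiku" definition using pi_auth is made up for the example.

```python
from typing import Dict, List

# Credential names as listed under each provider on this page.
CREDENTIAL_FOR_PROVIDER = {
    "anthropic": "anthropic_key",
    "openai": "openai_key",
    "groq": "groq_key",
    "google": "google_key",
    "xai": "xai_key",
    "mistral": "mistral_key",
    "openrouter": "openrouter_key",
    "custom": "custom_key",
}

# Example project models; the "haiku" entry is illustrative only.
PROJECT_MODELS: Dict[str, dict] = {
    "sonnet": {"provider": "anthropic", "authType": "api_key"},
    "haiku": {"provider": "anthropic", "authType": "pi_auth"},
    "gpt4o": {"provider": "openai", "authType": "api_key"},
}


def llm_credentials(agent_models: List[str], project_models: Dict[str, dict]) -> List[str]:
    """Return the credential names implied by an agent's models list."""
    needed: List[str] = []
    for name in agent_models:
        model = project_models[name]
        if model["authType"] == "pi_auth":
            continue  # pi_auth needs no credential file
        credential = CREDENTIAL_FOR_PROVIDER[model["provider"]]
        if credential not in needed:
            needed.append(credential)
    return needed


print(llm_credentials(["sonnet", "haiku", "gpt4o"], PROJECT_MODELS))
```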
See Credentials for the full credential reference.