Define models in the project's config.toml under `[models.<name>]`, then reference them by name in each agent's config.toml. Agents list models in priority order: the first is the primary, and the rest are fallbacks tried automatically on rate limits.
`[models.<name>]` Fields
| Field | Type | Required | Description |
|---|---|---|---|
| `provider` | string | Yes | Provider name (see table below) |
| `model` | string | Yes | Model ID |
| `authType` | string | Yes | `"api_key"`, `"oauth_token"`, or `"pi_auth"` |
| `thinkingLevel` | string | No | Reasoning budget (Anthropic only) |
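As a sketch, a model entry using the fields above might look like this (the name `sonnet` is arbitrary):

```toml
# Hypothetical entry in the project's config.toml.
[models.sonnet]
provider = "anthropic"
model = "claude-sonnet-4-20250514"
authType = "api_key"
thinkingLevel = "medium"   # optional; Anthropic only
```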
Providers
Anthropic
Claude models with optional extended thinking.

| Model | Description |
|---|---|
| `claude-opus-4-20250514` | Most capable, best for complex multi-step tasks |
| `claude-sonnet-4-20250514` | Balanced performance and cost (recommended) |
| `claude-haiku-3-5-20241022` | Fastest and cheapest |
Credential: `anthropic_key` (field: `token`)
Auth types:
| authType | Token format | Description |
|---|---|---|
| `api_key` | `sk-ant-api-...` | Standard Anthropic API key |
| `oauth_token` | `sk-ant-oat-...` | OAuth token from `claude setup-token` |
| `pi_auth` | (none) | Uses existing pi auth credentials (`~/.pi/agent/auth.json`). No credential file needed. |
`pi_auth` is not supported in Docker mode. Switch to `api_key` or `oauth_token` for containerized runs.
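For illustration, a model entry that reuses existing pi credentials (assuming a non-Docker run; the name `sonnet-pi` is arbitrary) could be written as:

```toml
[models.sonnet-pi]
provider = "anthropic"
model = "claude-sonnet-4-20250514"
authType = "pi_auth"   # reads ~/.pi/agent/auth.json; no credential file needed
```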
Thinking level: Anthropic is the only provider that supports thinkingLevel. Valid values:
| Level | Description |
|---|---|
| `off` | No extended thinking |
| `minimal` | Minimal reasoning |
| `low` | Light reasoning |
| `medium` | Balanced (recommended) |
| `high` | Deep reasoning |
| `xhigh` | Maximum reasoning budget |
For all other providers, `thinkingLevel` is ignored.
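One way to put the levels to use, sketched with hypothetical entry names: define two entries at different reasoning budgets and pick between them per agent.

```toml
# Same provider, two reasoning budgets (entry names are arbitrary).
[models.opus-deep]
provider = "anthropic"
model = "claude-opus-4-20250514"
authType = "api_key"
thinkingLevel = "xhigh"    # maximum reasoning budget

[models.haiku-quick]
provider = "anthropic"
model = "claude-haiku-3-5-20241022"
authType = "api_key"
thinkingLevel = "off"      # no extended thinking
```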
OpenAI
| Model | Description |
|---|---|
| `gpt-4o` | Flagship multimodal model (recommended) |
| `gpt-4o-mini` | Smaller, faster, cheaper |
| `gpt-4-turbo` | Previous generation |
| `o1-preview` | Reasoning model |
| `o1-mini` | Smaller reasoning model |
Credential: `openai_key` (field: `token`)
Groq
| Model | Description |
|---|---|
| `llama-3.3-70b-versatile` | Llama 3.3 70B on Groq inference |
Credential: `groq_key` (field: `token`)
Google Gemini
| Model | Description |
|---|---|
| `gemini-2.0-flash-exp` | Fast experimental model |
Credential: `google_key` (field: `token`)
xAI
| Model | Description |
|---|---|
| `grok-beta` | Grok beta |
Credential: `xai_key` (field: `token`)
Mistral
| Model | Description |
|---|---|
| `mistral-large-2411` | Mistral Large (November 2024) |
Credential: `mistral_key` (field: `token`)
OpenRouter
OpenRouter provides access to models from many providers through a single API. Model IDs use the `provider/model` format. See OpenRouter's model list for all available models.
Credential: `openrouter_key` (field: `token`)
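A sketch of an OpenRouter-backed entry, assuming the provider key is `openrouter`; the model ID shown is one example of the `provider/model` format and should be checked against OpenRouter's model list.

```toml
[models.or-gpt4o]
provider = "openrouter"
model = "openai/gpt-4o"   # provider/model format
authType = "api_key"
```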
Custom
For any provider not listed above. The model ID and API routing are handled by the underlying pi.dev agent harness.

Credential: `custom_key` (field: `token`)
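A minimal sketch of a custom entry, assuming the provider key is `custom`; the model ID here is hypothetical.

```toml
[models.my-endpoint]
provider = "custom"
model = "my-model-id"   # hypothetical ID; routing is handled by the pi.dev harness
authType = "api_key"
```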
Mixing Models
Each agent can use a different model. Define all models in the project's config.toml, then reference them by name in each agent's config.toml:
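A sketch of what that might look like. The entry names are arbitrary, and the agent-side layout is an assumption based on the `models` list mentioned under Credential Setup below; the order of the list doubles as the fallback chain.

```toml
# Project config.toml: define each model once.
[models.sonnet]
provider = "anthropic"
model = "claude-sonnet-4-20250514"
authType = "api_key"

[models.gpt4o]
provider = "openai"
model = "gpt-4o"
authType = "api_key"
```

Then each agent picks its models by name, primary first:

```toml
# A hypothetical agent config.toml.
models = ["sonnet", "gpt4o"]
```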
Model Fallback
Agents can list multiple models to create a fallback chain. When the primary model is rate-limited or unavailable, Action Llama automatically tries the next model in the list.

Credential Setup
Each provider requires a corresponding credential in `~/.action-llama/credentials/`. Run `al doctor` to configure them interactively.
LLM credentials are loaded automatically based on the models referenced in the agent’s models list in config.toml — they do not need to be listed in the agent’s credentials array. The credentials array is for runtime credentials the agent uses during execution (GitHub tokens, SSH keys, etc.).
See Credentials for the full credential reference.