Skip to main content

Model Selection Guide

OpenClaw is model-agnostic — it works with 30+ bundled provider plugins covering both cloud and local models. This page helps you choose the right model for your setup and configure routing across multiple providers.

tip

Model pricing changes frequently. Check OpenRouter pricing or your provider's dashboard for current rates. Prices below are approximate as of June 2026.


Quick Pick

Your priorityModelProviderApprox. cost
Best qualityClaude Opus 4.8Anthropic / OpenRouter~$15/$75 per M tokens
Best balanceClaude Sonnet 4.6Anthropic / OpenRouter~$3/$15 per M tokens
Cheapest cloudDeepSeek V3.2DeepSeek / OpenRouter~$0.27/$1.10 per M tokens
Near-free cloudGemini 2.5 FlashGoogle / OpenRouter~$0.15/$0.60 per M tokens
Free (local)Qwen3 32BOllama / LM Studio$0 (your hardware)
Free (hosted)OpenRouter Free tierOpenRouter$0 (rate-limited)

By Use Case

Heartbeat (runs every 30 min — cost adds up)

Use the cheapest model that can follow instructions reliably:

ModelWhyMonthly Cost (48 cycles/day)
Local model (Qwen3 14B)Zero cost$0
Gemini 2.5 FlashVery cheap, good instruction following$15-60
DeepSeek V3.2Budget cloud, decent quality$15-60
Claude Haiku 4.5More capable but pricier$30-90
~/.openclaw/openclaw.json
{
"heartbeat": {
"model": "claude-haiku-4-5-20251001"
}
}

Complex Reasoning & Planning

ModelWhyCost
Claude Opus 4.8Best reasoning, most capable~$15/$75 per M tokens
Claude Opus 4.6Previous gen, still excellent~$15/$75 per M tokens
Claude Sonnet 4.680% of Opus quality at lower cost~$3/$15 per M tokens

Coding Tasks

ModelWhyCost
Claude Opus 4.8Best code generation and debugging~$15/$75 per M tokens
Claude Sonnet 4.6Great for most coding, much cheaper~$3/$15 per M tokens
DeepSeek V3.2Surprisingly good at code, very cheap~$0.27/$1.10 per M tokens
Qwen3 32B (local)Best local coding model$0
Qwen 2.5 Coder (local)Optimized for coding$0

General Chat & Daily Tasks

ModelWhyCost
DeepSeek V3.2Best quality-per-dollar for general use~$0.27/$1.10 per M tokens
Gemini 2.5 FlashFast, cheap, good for summaries~$0.15/$0.60 per M tokens
Claude Sonnet 4.6Premium quality when needed~$3/$15 per M tokens

Long Context (large files, codebases)

ModelContext WindowCost
Gemini 2.5 Flash1M tokens~$0.15/$0.60 per M tokens
Gemini 2.5 Pro1M tokens~$1.25/$10 per M tokens
Claude Sonnet 4.6200K tokens~$3/$15 per M tokens

Tool Use (Agent Workloads)

Not all models handle OpenClaw's tool-call protocol reliably. Requirements:

  • Function calling / tool use support in the model's training
  • Streaming tool call delta emission
  • Reliable JSON argument formatting
ModelTool Use ReliabilityNotes
Claude Opus/SonnetExcellentPurpose-built for tool use
GPT-5.3-Codex / GPT-4oExcellentStrong function calling
Gemini 2.5 Flash/ProGoodImproving rapidly
DeepSeek V3.2GoodGood for the price
Qwen3 32B (local)GoodBest local option
Llama 3.3 70B (local)GoodNeeds big GPU
Models under 14B (local)UnreliableOften fails multi-step tool chains
caution

Local models smaller than ~14B parameters often struggle with complex multi-step tool calling. For reliable agent behavior, use 30B+ local models or cloud providers.


By Budget

$0/month (Local Models Only)

Run models on your own hardware via Ollama, LM Studio, or vLLM. No API key needed.

Your VRAMRecommended ModelQuality
8 GBQwen3 8B, Llama 3.3 8BBasic tasks
12-16 GBQwen3 14BGood for most tasks
24 GB (RTX 4090)Qwen3 32B (Q4_K_M)Excellent daily driver
40-80 GB (A100)Llama 3.3 70BNear-cloud quality
~/.openclaw/openclaw.json
{
"brain": {
"provider": "local",
"model": "qwen3:32b",
"endpoint": "http://localhost:11434"
}
}

See Local Models Guide for full setup instructions.

$5-30/month (Budget Cloud)

Use cheap cloud models via OpenRouter or direct:

~/.openclaw/openclaw.json
{
"brain": {
"provider": "openrouter",
"model": "deepseek/deepseek-v3.2"
},
"heartbeat": {
"model": "google/gemini-2.5-flash"
}
}

$30-150/month (Premium Cloud)

Use Anthropic models directly for best quality, with a cheap fallback for heartbeat:

~/.openclaw/openclaw.json
{
"brain": {
"provider": "anthropic",
"model": "claude-sonnet-4-6"
},
"heartbeat": {
"model": "claude-haiku-4-5-20251001"
}
}

Hybrid (Best of Both)

Route expensive tasks to cloud, cheap tasks to local:

~/.openclaw/openclaw.json
{
"brain": {
"provider": "anthropic",
"model": "claude-sonnet-4-6",
"fallback": {
"provider": "local",
"model": "qwen3:32b"
}
},
"heartbeat": {
"provider": "local",
"model": "qwen3:14b"
}
}

See Cost Management for advanced routing strategies.


Model Routing

OpenClaw routes different tasks to different models based on your configuration.

Per-Task Routing

Config KeyControlsTypical Choice
brain.modelDefault model for chat and reasoningSonnet 4.6
heartbeat.modelHeartbeat cyclesHaiku 4.5 or local
agents.list[].modelPer-agent modelVaries by agent role
brain.fallbackFallback when primary is downDifferent provider
~/.openclaw/openclaw.json
{
"brain": {
"provider": "anthropic",
"model": "claude-sonnet-4-6",
"fallback": {
"provider": "openrouter",
"model": "deepseek/deepseek-v3.2"
}
},
"heartbeat": {
"model": "claude-haiku-4-5-20251001"
},
"agents": {
"list": [
{ "id": "researcher", "model": "claude-opus-4-6" },
{ "id": "monitor", "model": "ollama/qwen3:14b" },
{ "id": "worker", "model": "claude-haiku-4-5-20251001" }
]
}
}

Three Hybrid Strategies

StrategyHow It WorksBest For
Primary with fallbacksCloud primary, local kicks in when cloud is down or rate-limitedReliability
Local-firstLocal primary, cloud safety net for complex tasksCost savings
Merge modeBoth cloud and local models available, route per-taskMaximum flexibility

Merge Mode

Use models.mode: "merge" to add local providers without losing cloud defaults:

{
"models": {
"mode": "merge",
"providers": {
"ollama": {
"models": {
"qwen3:32b": { "contextWindow": 32768 }
}
}
}
}
}

Without merge mode, custom providers replace the defaults entirely.


Provider Comparison

Cloud Providers

ProviderKey ModelsInput/Output CostContextSetup
AnthropicOpus 4.8, Sonnet 4.6, Haiku 4.5$0.25-$15 / $1.25-$75 per M200Kconsole.anthropic.com
OpenAIGPT-5.3-Codex, GPT-4o$2.50-$15 / $10-$60 per M128Kplatform.openai.com
GoogleGemini 2.5 Flash, 2.5 Pro$0.15-$1.25 / $0.60-$10 per M1Maistudio.google.com
DeepSeekV3.2, R1$0.27-$0.55 / $1.10-$2.19 per M128Kplatform.deepseek.com
xAIGrok$5 / $15 per M128Kconsole.x.ai
OpenRouter200+ modelsVaries (pass-through)Variesopenrouter.ai

Local Providers

ProviderSetupGUIMulti-GPUBest For
LM StudioDownload appYesNoBeginners, quick setup
OllamaOne commandNoLimitedCLI users, auto-discovery
vLLMpip installNoYes (tensor parallel)Production, high throughput
SGLangpip installNoYesHigh throughput, RadixAttention

OpenRouter

OpenRouter is a meta-provider — one API key gives you access to 200+ models across all major providers with usage-based billing.

Why Use OpenRouter

  • Single API key for Claude, GPT, Gemini, DeepSeek, open-source models
  • Cost arbitrage — find the cheapest provider for any model
  • Rate limit pooling — automatic fallback across providers
  • Free tier — rate-limited access to select models at $0
  • Usage tracking — detailed per-model spending dashboard

Configuration

~/.openclaw/openclaw.json
{
"brain": {
"provider": "openrouter",
"model": "anthropic/claude-sonnet-4-6",
"api_key": "${OPENROUTER_API_KEY}"
}
}

Advanced Routing Metadata

OpenRouter supports 13 routing fields that OpenClaw passes through (via PR #17148):

{
"models": {
"providers": {
"openrouter": {
"providerRouting": {
"sort": "price",
"allow_fallbacks": true,
"require_parameters": true,
"data_collection": "deny",
"quantizations": ["fp16", "bf16"],
"max_price": { "prompt": 0.001, "completion": 0.005 },
"preferred_max_latency": 10000,
"preferred_min_throughput": 50
}
}
}
}
}
FieldDescription
sortSort providers by price, latency, or throughput
onlyRestrict to specific providers
ignoreExclude specific providers
orderExplicit provider priority order
allow_fallbacksAllow fallback to other providers
require_parametersOnly use providers that support all parameters
data_collectiondeny to opt out of training data
quantizationsPreferred quantization levels
max_priceMaximum price per token (prompt/completion)
preferred_max_latencyTarget latency in milliseconds
preferred_min_throughputTarget tokens per second

Cost Optimization Tips

The 97% Reduction Strategy

Combine five changes to cut costs from ~$1,200/month to ~$36/month:

ChangeSavingsHow
Switch heartbeat from Opus to Haiku~90% of heartbeat costheartbeat.model: "claude-haiku-4-5-20251001"
Increase heartbeat interval to 60 min50% of remaining heartbeatheartbeat.interval: 3600
Enable quiet hours (8h/night)33% of remaining heartbeatheartbeat.quiet_hours
Use local model for heartbeat100% of heartbeat costheartbeat.model: "ollama/qwen3:14b"
Route sub-agents to Haiku~80% of sub-agent costagents.defaults.model

General Tips

  1. Heartbeat is the biggest cost driver — it fires every 30 min, 24/7. Use the cheapest model that works
  2. Increase heartbeat interval — 60 min instead of 30 cuts heartbeat costs in half
  3. Set quiet hours — no heartbeat while you sleep (saves ~33%)
  4. Keep sessions short — context accumulates; each message gets more expensive
  5. Route by task — use Opus only for complex work, cheap models for everything else
  6. Use OpenRouter — compare prices, leverage free tiers, pool rate limits
  7. Use max_context_tokens — limit memory loaded per message (default 2,000 tokens)
  8. Monitor spending — check provider dashboards regularly

See Cost Management for real-world case studies and the Performance Tuning guide for advanced optimization.


Known Issues

IssueStatusWorkaround
Ollama /v1 streaming breaks tool callsFixed (v2026.3.2+)Use native /api/chat endpoint (default)
Fallback permanently overwrites primary config (#47705)OpenUpdate to latest, report if persists
Stale merge data in provider config (#30395)OpenRestart gateway after config changes
Missing api field gives vague error (#6054)Won't fixAlways include "api": "openai-completions"
Timeout ignored for slow modelsIntermittentSet timeoutSeconds: 300 as safety net

See Also