
Models

This agent supports multiple model providers. By default, it uses models from the OpenCode Zen subscription service. Additionally, you can use models directly from providers like Groq by setting the appropriate API key.

Supported Providers

| Provider | Format | API Key Env Variable | Documentation |
| --- | --- | --- | --- |
| OpenCode Zen | `opencode/<model-id>` | N/A (public for free models) | OpenCode Zen |
| Kilo Gateway | `kilo/<model-id>` | N/A (public for free models) | Kilo Gateway Documentation |
| Anthropic | `anthropic/<model-id>` | `ANTHROPIC_API_KEY` | Anthropic Docs |
| Claude OAuth | `claude-oauth/<model-id>` | `CLAUDE_CODE_OAUTH_TOKEN` | Claude OAuth Documentation |
| Groq | `groq/<model-id>` | `GROQ_API_KEY` | Groq Documentation |
| OpenRouter | `openrouter/<provider>/<model>` | `OPENROUTER_API_KEY` | OpenRouter Documentation |

Claude OAuth: The `claude-oauth` provider lets you use your Claude Pro/Max subscription. Authenticate with `agent auth claude`, or reuse existing Claude Code CLI credentials with `--use-existing-claude-oauth`.
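
Put together, a Claude OAuth session looks roughly like the sketch below. The exact model ID depends on your subscription, so `<model-id>` is left as a placeholder:

```shell
# One-time authentication with your Claude Pro/Max subscription:
agent auth claude

# Or reuse credentials from an existing Claude Code CLI login:
echo "hello" | agent --use-existing-claude-oauth --model claude-oauth/<model-id>
```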

Available Models

All models are accessed using the format `<provider>/<model-id>`. Use the `--model` option to specify which model to use:

```shell
echo "hi" | agent --model opencode/gpt-5-nano
```
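
The `<provider>/<model-id>` split happens at the first slash, which matters for OpenRouter IDs that embed a second provider segment. A minimal shell sketch of the parsing rule (illustrative only, not the agent's actual implementation):

```shell
# Split a model string at the first slash: everything before it is the
# provider, everything after it is the model ID (which may itself contain "/").
model="openrouter/anthropic/claude-sonnet-4"
provider="${model%%/*}"   # strip everything from the first "/" onward
model_id="${model#*/}"    # strip up to and including the first "/"
echo "$provider $model_id"   # -> openrouter anthropic/claude-sonnet-4
```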

OpenCode Zen Pricing

Below are the prices per 1M tokens for OpenCode Zen models, sorted by output price (lowest first) so the cheapest options appear first.

| Model | Model ID | Input | Output | Cached Read | Cached Write |
| --- | --- | --- | --- | --- | --- |
| **Free Models (Output: $0.00)** | | | | | |
| Nemotron 3 Super Free (default) | `opencode/nemotron-3-super-free` | Free | Free | Free | - |
| MiniMax M2.5 Free | `opencode/minimax-m2.5-free` | Free | Free | Free | - |
| GPT 5 Nano | `opencode/gpt-5-nano` | Free | Free | Free | - |
| Big Pickle | `opencode/big-pickle` | Free | Free | Free | - |
| **Discontinued Free Models** | | | | | |
| Qwen 3.6 Plus Free | `opencode/qwen3.6-plus-free` | Free | Free | Free | - |
| Kimi K2.5 Free | `opencode/kimi-k2.5-free` | Free | Free | Free | - |
| Grok Code Fast 1 | `opencode/grok-code` | Free | Free | Free | - |
| MiniMax M2.1 Free | `opencode/minimax-m2.1-free` | Free | Free | Free | - |
| GLM 4.7 Free | `opencode/glm-4.7-free` | Free | Free | Free | - |
| **Paid Models (sorted by output price)** | | | | | |
| Qwen3 Coder 480B | `opencode/qwen3-coder-480b` | $0.45 | $1.50 | - | - |
| GLM 4.6 | `opencode/glm-4-6` | $0.60 | $2.20 | $0.10 | - |
| Kimi K2 | `opencode/kimi-k2` | $0.60 | $2.50 | $0.36 | - |
| Claude Haiku 3.5 | `opencode/claude-haiku-3-5` | $0.80 | $4.00 | $0.08 | $1.00 |
| Claude Haiku 4.5 | `opencode/haiku` | $1.00 | $5.00 | $0.10 | $1.25 |
| GPT 5.1 | `opencode/gpt-5-1` | $1.25 | $10.00 | $0.125 | - |
| GPT 5.1 Codex | `opencode/gpt-5-1-codex` | $1.25 | $10.00 | $0.125 | - |
| GPT 5 | `opencode/gpt-5` | $1.25 | $10.00 | $0.125 | - |
| GPT 5 Codex | `opencode/gpt-5-codex` | $1.25 | $10.00 | $0.125 | - |
| Gemini 3 Pro (≤ 200K tokens) | `opencode/gemini-3-pro` | $2.00 | $12.00 | $0.20 | - |
| Claude Sonnet 4.5 (≤ 200K tokens) | `opencode/sonnet` | $3.00 | $15.00 | $0.30 | $3.75 |
| Claude Sonnet 4 (≤ 200K tokens) | `opencode/claude-sonnet-4` | $3.00 | $15.00 | $0.30 | $3.75 |
| Gemini 3 Pro (> 200K tokens) | `opencode/gemini-3-pro` | $4.00 | $18.00 | $0.40 | - |
| Claude Sonnet 4.5 (> 200K tokens) | `opencode/sonnet` | $6.00 | $22.50 | $0.60 | $7.50 |
| Claude Sonnet 4 (> 200K tokens) | `opencode/claude-sonnet-4` | $6.00 | $22.50 | $0.60 | $7.50 |
| Claude Opus 4.1 | `opencode/opus` | $15.00 | $75.00 | $1.50 | $18.75 |
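
Since all prices are quoted per 1M tokens, the cost of a request is input_tokens / 1,000,000 × input rate plus output_tokens / 1,000,000 × output rate. A minimal sketch for Gemini 3 Pro, assuming the pricing tier is selected by prompt size:

```shell
# Estimate the cost of one Gemini 3 Pro request. Prompts <= 200K tokens are
# billed at $2.00/$12.00 per 1M input/output tokens; larger prompts at $4.00/$18.00.
input_tokens=250000
output_tokens=4000
if [ "$input_tokens" -le 200000 ]; then
  in_rate=2.00; out_rate=12.00
else
  in_rate=4.00; out_rate=18.00
fi
cost=$(awk -v i="$input_tokens" -v o="$output_tokens" -v ir="$in_rate" -v orate="$out_rate" \
  'BEGIN { printf "%.2f", i / 1e6 * ir + o / 1e6 * orate }')
echo "\$$cost"   # -> $1.07 (that is, 0.25 * $4.00 + 0.004 * $18.00)
```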

Default Model

The default model is Nemotron 3 Super Free (opencode/nemotron-3-super-free), which is completely free and offers strong reasoning capabilities with a ~262K token context window (NVIDIA hybrid Mamba-Transformer architecture).

Note: Qwen 3.6 Plus Free (opencode/qwen3.6-plus-free) was previously the default free model, but OpenCode Zen ended the free promotion in April 2026. The model now requires an OpenCode Go subscription. See issue #242.

Note: MiniMax M2.5 Free (opencode/minimax-m2.5-free) was previously the default free model. See issue #232.

Note: Kimi K2.5 Free (opencode/kimi-k2.5-free) was previously the default free model, but it was removed from the OpenCode Zen provider in March 2026. See Case Study #208 for details.

Note: Grok Code Fast 1 (opencode/grok-code) was previously the default free model, but xAI ended its free tier on OpenCode Zen in January 2026; grok-code is no longer included as a free model in the OpenCode Zen subscription. See Case Study #133 for details.

Free Models (in order of recommendation)

  1. Nemotron 3 Super Free (opencode/nemotron-3-super-free) - Default free model, NVIDIA hybrid Mamba-Transformer (~262K context, strong reasoning)
  2. MiniMax M2.5 Free (opencode/minimax-m2.5-free) - Strong general-purpose performance (~200K context)
  3. GPT 5 Nano (opencode/gpt-5-nano) - Reliable OpenAI-powered free option (~400K context)
  4. Big Pickle (opencode/big-pickle) - Stealth model, free during evaluation (~200K context)

Note: opencode/qwen3.6-plus-free, opencode/kimi-k2.5-free, opencode/minimax-m2.1-free, and opencode/glm-4.7-free are no longer available as free models on OpenCode Zen. See OpenCode Zen Documentation for the current list of free models.

Usage Examples

Using the Default Model (Free)

```shell
# Uses opencode/nemotron-3-super-free by default
echo "hello" | agent
```

Using Other Free Models

```shell
# Nemotron 3 Super Free (default)
echo "hello" | agent --model opencode/nemotron-3-super-free

# MiniMax M2.5 Free
echo "hello" | agent --model opencode/minimax-m2.5-free

# GPT 5 Nano
echo "hello" | agent --model opencode/gpt-5-nano

# Big Pickle
echo "hello" | agent --model opencode/big-pickle
```

Using Paid Models

```shell
# Claude Sonnet 4.5 (best quality)
echo "hello" | agent --model opencode/sonnet

# Claude Haiku 4.5 (fast and affordable)
echo "hello" | agent --model opencode/haiku

# Claude Opus 4.1 (most capable)
echo "hello" | agent --model opencode/opus

# Gemini 3 Pro
echo "hello" | agent --model opencode/gemini-3-pro

# GPT 5.1
echo "hello" | agent --model opencode/gpt-5-1

# Qwen3 Coder (specialized for coding)
echo "hello" | agent --model opencode/qwen3-coder-480b
```

More Information

For complete details about OpenCode Zen subscription and pricing, visit the OpenCode Zen Documentation.

Notes

  • All prices are per 1 million tokens
  • Cache pricing applies when using prompt caching features
  • Token context limits vary by model
  • Free models have no token costs but may have rate limits
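
Because free models may be rate-limited, one common workaround is to wrap agent calls in a retry loop with exponential backoff. A minimal sketch; the `retry` helper below is illustrative, not part of the agent CLI:

```shell
# Retry a command up to $max times, doubling the delay after each failure.
retry() {
  local attempt=1 max=3 delay=1
  while ! "$@"; do
    [ "$attempt" -ge "$max" ] && return 1
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# e.g. retry sh -c 'echo "hello" | agent --model opencode/gpt-5-nano'
retry true && echo "succeeded"
```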

Groq Provider

Groq provides ultra-fast inference for open-source large language models. To use Groq models, set your API key:

```shell
export GROQ_API_KEY=your_api_key_here
```

Groq Models

| Model | Model ID | Context Window | Tool Use |
| --- | --- | --- | --- |
| Llama 3.3 70B Versatile | `groq/llama-3.3-70b-versatile` | 131,072 tokens | Yes |
| Llama 3.1 8B Instant | `groq/llama-3.1-8b-instant` | 131,072 tokens | Yes |
| GPT-OSS 120B | `groq/openai/gpt-oss-120b` | 131,072 tokens | Yes |
| GPT-OSS 20B | `groq/openai/gpt-oss-20b` | 131,072 tokens | Yes |
| Qwen3 32B | `groq/qwen/qwen3-32b` | 131,072 tokens | Yes |
| Compound | `groq/groq/compound` | 131,072 tokens | Yes |
| Compound Mini | `groq/groq/compound-mini` | 131,072 tokens | Yes |

Groq Usage Examples

```shell
# Using Llama 3.3 70B (recommended)
echo "hello" | agent --model groq/llama-3.3-70b-versatile

# Using faster Llama 3.1 8B
echo "hello" | agent --model groq/llama-3.1-8b-instant

# Using Compound (agentic with server-side tools)
echo "hello" | agent --model groq/groq/compound
```

For more details, see the Groq Documentation.


OpenRouter Provider

OpenRouter provides unified access to hundreds of AI models from multiple providers including OpenAI, Anthropic, Google, Meta, and more. To use OpenRouter models, set your API key:

```shell
export OPENROUTER_API_KEY=your_api_key_here
```

OpenRouter Models

| Model | Model ID | Context Window | Tool Use |
| --- | --- | --- | --- |
| Claude Sonnet 4 | `openrouter/anthropic/claude-sonnet-4` | 200,000 tokens | Yes |
| Claude Sonnet 4.5 | `openrouter/anthropic/claude-sonnet-4-5` | 200,000 tokens | Yes |
| GPT-4o | `openrouter/openai/gpt-4o` | 128,000 tokens | Yes |
| GPT-4o Mini | `openrouter/openai/gpt-4o-mini` | 128,000 tokens | Yes |
| Llama 3.3 70B | `openrouter/meta-llama/llama-3.3-70b` | 131,072 tokens | Yes |
| Gemini 2.0 Flash | `openrouter/google/gemini-2.0-flash` | 1,000,000 tokens | Yes |
| DeepSeek V3 | `openrouter/deepseek/deepseek-chat` | 64,000 tokens | Yes |

OpenRouter Usage Examples

```shell
# Using Claude Sonnet 4 via OpenRouter
echo "hello" | agent --model openrouter/anthropic/claude-sonnet-4

# Using GPT-4o via OpenRouter
echo "hello" | agent --model openrouter/openai/gpt-4o

# Using Llama 3.3 70B via OpenRouter
echo "hello" | agent --model openrouter/meta-llama/llama-3.3-70b

# Using free models (with rate limits)
echo "hello" | agent --model openrouter/meta-llama/llama-3.1-8b:free
```

For more details, see the OpenRouter Documentation.


Kilo Gateway Provider

Kilo is an open-source AI coding agent platform providing access to 500+ AI models through the Kilo Gateway. The gateway uses an OpenAI-compatible API, making it easy to integrate with existing tools.
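
Because the gateway speaks the OpenAI chat-completions dialect, any OpenAI-compatible client can talk to it directly. A hedged sketch of a raw request; the `KILO_BASE_URL` variable and the `/v1/chat/completions` path are assumptions based on the OpenAI convention, not confirmed Kilo endpoints, so check the Kilo docs for the real base URL:

```shell
# Build an OpenAI-style chat-completions payload for a Kilo free model.
payload='{"model": "glm-5-free", "messages": [{"role": "user", "content": "hello"}]}'
echo "$payload"

# Hypothetical request -- substitute the real base URL from the Kilo docs:
# curl -s "$KILO_BASE_URL/v1/chat/completions" \
#   -H "Content-Type: application/json" \
#   -H "Authorization: Bearer $KILO_API_KEY" \
#   -d "$payload"
```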

Free Models (No API Key Required)

Kilo offers several free models that work without setting up an API key:

| Model | Model ID | Context Window | Description |
| --- | --- | --- | --- |
| GLM-5 (recommended) | `kilo/glm-5-free` | 202,752 tokens | Z.AI flagship model, matches Opus 4.5 on many tasks |
| GLM 4.5 Air | `kilo/glm-4.5-air-free` | 131,072 tokens | Free Z.AI model with agent-centric capabilities |
| MiniMax M2.5 | `kilo/minimax-m2.5-free` | 204,800 tokens | Strong general-purpose performance (upgraded from M2.1) |
| DeepSeek R1 | `kilo/deepseek-r1-free` | 163,840 tokens | Advanced reasoning model |
| Giga Potato | `kilo/giga-potato-free` | 256,000 tokens | Free evaluation model |
| Trinity Large Preview | `kilo/trinity-large-preview` | 131,000 tokens | Arcee AI preview model |

Note: kilo/glm-4.7-free and kilo/minimax-m2.1-free are no longer the recommended free models. Use kilo/glm-4.5-air-free and kilo/minimax-m2.5-free instead.

Note: GLM-5 is currently free for a limited time. See GLM-5 Announcement for details.

GLM-5 Specifications

GLM-5 is the flagship model from Z.AI (Zhipu AI), with enhanced reasoning and coding capabilities:

| Property | Value |
| --- | --- |
| Model ID | `kilo/glm-5-free` |
| Context Window | 202,752 tokens |
| Max Output Tokens | 131,072 tokens |
| Function Calling | Yes |
| Tool Choice | Yes |
| Structured Outputs | Yes (JSON schema) |
| Reasoning Tokens | Yes |

Using Paid Models

For paid models, set your Kilo API key:

```shell
export KILO_API_KEY=your_api_key_here
```

Get your API key at app.kilo.ai.

Kilo Usage Examples

```shell
# Using GLM-5 (recommended free model)
echo "hello" | agent --model kilo/glm-5-free

# Using GLM 4.5 Air (free, agent-centric)
echo "hello" | agent --model kilo/glm-4.5-air-free

# Using MiniMax M2.5 (free)
echo "hello" | agent --model kilo/minimax-m2.5-free

# Using DeepSeek R1 (free, reasoning)
echo "hello" | agent --model kilo/deepseek-r1-free

# Using Giga Potato (free evaluation)
echo "hello" | agent --model kilo/giga-potato-free
```

For more details, see the Kilo Gateway Documentation.