Validate OpenRouter-compatible custom base_url reasoning/cache support #1978

## Full feature parity comparison (tested 2026-05-24)

I tested four configurations (DeepSeek native, OpenRouter, and ZenMux with two key types) against `deepseek/deepseek-v4-flash` and `deepseek/deepseek-v4-pro`:

| Feature | DeepSeek native | OpenRouter | ZenMux (sk-ai-v1) | ZenMux (sk-ss-v1) |
|---|---|---|---|---|
| Chat (v4-pro) | ✅ | ✅ | ✅ | ✅ |
| Chat (v4-flash) | ✅ | ✅ | ✅ | ✅ |
| Reasoning in `message.reasoning` | ✅ (`reasoning_content`) | ✅ | ✅ | ✅ |
| Reasoning tokens in usage | ✅ | ✅ `reasoning_tokens` | ✅ | ✅ |
| Streaming (SSE) | ✅ | ✅ | ✅ | ✅ |
| Reasoning in stream | ✅ | ✅ | ✅ | ✅ |
| `prompt_cache_hit_tokens` | ✅ | ✅ | ✅ | ❌ absent |
| `prompt_cache_miss_tokens` | ✅ | ✅ | ✅ | ❌ absent |
| `prompt_tokens_details.cached_tokens` | — | — | ✅ | ❌ absent |
| Model naming | `deepseek-v4-pro` | `deepseek/deepseek-v4-pro` | `deepseek/deepseek-v4-pro` | `deepseek/deepseek-v4-pro` |

### Key findings

1. **ZenMux IS OpenRouter-compatible.** ZenMux uses identical model naming (`deepseek/` prefix), the same `reasoning` field in messages, and the same `reasoning_tokens` in usage. Streaming includes reasoning in the same format.

2. **Cache metrics are key-type-dependent on ZenMux.** The `sk-ai-v1-...` key format returns `prompt_cache_hit_tokens`/`prompt_cache_miss_tokens`. The `sk-ss-v1-...` format does not. This suggests cache reporting may be an account-tier feature, not a provider-level limitation.

3. **The gap is purely in codewhale's provider routing.** ZenMux returns the reasoning fields the OpenRouter provider needs, plus cache metrics on `sk-ai-v1` keys. That data is dropped only because codewhale routes ZenMux through the `openai` provider (which doesn't parse reasoning/cache) instead of the `openrouter` provider (which does).
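
To make the field differences in the table concrete, here is a minimal Python sketch of a parser that tolerates all four response shapes. It is not codewhale's actual parser: the function names `extract_reasoning` and `extract_cache_usage` and the normalized keys `cache_hit_tokens`/`cache_miss_tokens` are hypothetical, and only the JSON field names come from the test results above.

```python
from typing import Any, Optional


def extract_reasoning(message: dict[str, Any]) -> Optional[str]:
    """Return the reasoning text from whichever field the provider used."""
    # OpenRouter and ZenMux put it in `reasoning`;
    # DeepSeek native uses `reasoning_content`.
    return message.get("reasoning") or message.get("reasoning_content")


def extract_cache_usage(usage: dict[str, Any]) -> dict[str, Optional[int]]:
    """Normalize cache counters; None means the field was not reported."""
    hit = usage.get("prompt_cache_hit_tokens")
    miss = usage.get("prompt_cache_miss_tokens")
    if hit is None:
        # Fall back to the nested field seen on ZenMux sk-ai-v1 keys.
        details = usage.get("prompt_tokens_details") or {}
        hit = details.get("cached_tokens")
    # Hypothetical normalized key names, chosen for this sketch only.
    return {"cache_hit_tokens": hit, "cache_miss_tokens": miss}
```

Returning `None` rather than `0` for absent counters keeps the `sk-ss-v1` case ("not reported") distinguishable from a real cache miss.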

### Sharpened proposal

Since ZenMux matches OpenRouter's API format on every field tested above (model naming, `reasoning`, `reasoning_tokens`, streaming), the simplest and most impactful change is:

**Make the `openrouter` provider accept a custom `base_url`.**

```toml
[providers.openrouter]
api_key = "sk-ss-v1-..."
base_url = "https://zenmux.ai/api/v1"     # ← one new config key
```

This single change unlocks ZenMux and any future OpenRouter-compatible proxy (there are several already) with zero new provider code. The reasoning parser, cache parser, model lister, and stream handler all stay the same — only the HTTP base URL changes.
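
As a rough illustration of how small the change could be, the following Python sketch resolves the endpoint from an optional `base_url`. It is written under assumptions, not taken from codewhale: `OpenRouterConfig` and `chat_completions_url` are hypothetical names, and the sketch assumes the proxy serves the same `/chat/completions` path under its base URL as OpenRouter does.

```python
from dataclasses import dataclass
from typing import Optional

# OpenRouter's public API base, used when no override is configured.
DEFAULT_OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


@dataclass
class OpenRouterConfig:
    """Hypothetical mirror of the [providers.openrouter] TOML table."""
    api_key: str
    base_url: Optional[str] = None  # the one new config key


def chat_completions_url(config: OpenRouterConfig) -> str:
    """Build the chat endpoint, falling back to OpenRouter's default base."""
    base = (config.base_url or DEFAULT_OPENROUTER_BASE_URL).rstrip("/")
    # Assumption: compatible proxies expose the same path suffix.
    return f"{base}/chat/completions"
```

With `base_url` unset the provider behaves exactly as before, so existing OpenRouter users would be unaffected.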

This is also simpler than the existing workaround of hijacking the `openai` provider with an `OPENAI_BASE_URL` override, because:
- It uses the correct response parser (reasoning + cache)
- It doesn't confuse users about which provider they're using
- It keeps the `openai` provider available for actual OpenAI API usage

### ZenMux DeepSeek model list

```
deepseek/deepseek-v4-pro
deepseek/deepseek-v4-flash
deepseek/deepseek-v3.2
deepseek/deepseek-chat
deepseek/deepseek-reasoner
deepseek/deepseek-r1-0528
deepseek/deepseek-chat-v3.1
deepseek/deepseek-v3.2-exp
```
