[meta] Expand LLM provider catalog — Grok / Deepseek / OpenRouter / Gemini (in-core)

## Decision (locked)

Expand the LLM provider catalog in core (not as separate `atomic-agents-<provider>` packages) to add **Grok / xAI, Deepseek, OpenRouter, Gemini**. Reasons:

- Home-user experience stays "single `pip install atomic-agents-stack` and pick a provider." Splitting providers across N packages multiplies install friction for the audience the wizard was built for.
- The wizard `--provider` flag enum (per #324) needs all supported providers to be real at import time so env-var autodetect + Q0 question can enumerate them. Out-of-core packages would require extension-discovery machinery (probably entry-points) that doesn't exist today.
- Conformance suite at `tests/test_llm_protocol_conformance.py` catches breakage across all backends in CI. Out-of-core packages would need their own CI matrices and would lag the framework's own contract changes.

**Counter-argument acknowledged:** spec/31 §"Reference implementations" explicitly framed the framework as "ships three reference backends; everything else is a 200-line third-party package." That position is preserved for **future** providers (Bedrock, Vertex, Ollama, vLLM, Azure OpenAI, etc.) where the user base is more specialized. The 4 in this issue are sufficiently common in 2026 LLM workflows that core hosting is justified.

## Architectural shape

The 4 providers split into two work classes:

### OpenAI-compatible (Grok / Deepseek / OpenRouter)

All three expose `POST /v1/chat/completions` matching OpenAI's request/response shape. Implementation pattern: thin factory over `atomic_agents/llm/openai_compat.py`, same as `moonshot.py`. Each is ~50-150 LOC:

- `make_<provider>_backend(base_url, model_catalog) -> OpenAICompatibleLLMBackend` factory
- Pricing table in `_costs.py`
- `_get_<provider>_key()` resolver in `_llm.py`
- Conformance test parametrization
- spec/31 amendment

### Native protocol (Gemini)

Gemini has its own API shape, tool-use format, content-block structure, and streaming semantics. Implementation pattern: full `atomic_agents/llm/gemini.py` mirroring `anthropic.py`. ~300-500 LOC with translation at the `call()` boundary. Deserves full adversarial review cadence (3 rounds Opus minimum per the project's discipline rule 11).

### OpenRouter is special

OpenAI-compatible at the wire level, but strategically a **meta-provider** — one API key unlocks Anthropic + OpenAI + Gemini + Llama + many others via prefixed model strings (`anthropic/claude-opus-4-7`, `google/gemini-2.0-flash`, `meta-llama/llama-3.3-70b-instruct`).

**Open design question for the OpenRouter arc's plan-eng-review:** how does `model.md` represent meta-routing? Options:

- **A. Single field.** `provider: openrouter` + `model: anthropic/claude-opus-4-7`. Simpler. Aligns with OpenRouter's native model-string convention.
- **B. Split fields.** `provider: openrouter` + `target_provider: anthropic` + `model: claude-opus-4-7`. More explicit routing intent; tooling can introspect target.
- **C. Both supported.** Single field is primary, split is opt-in for operators who want routing visibility.

Recommendation: A for v1 simplicity; revisit if operators ask for B. Decide in the OpenRouter arc, not here.

**Pricing strategy for OpenRouter:** options are (1) hardcode every routed upstream model's price in `_costs.py` (high maintenance, accurate), (2) defer to OpenRouter's billing-side accounting and emit cost events with `cost_usd: unknown` (low maintenance, breaks the cost-guardrail discipline rule 4 from CLAUDE.md), (3) cache OpenRouter's `/models` endpoint pricing once per session (medium maintenance, accurate). Recommendation: option 3 with manual override hook. Decide in OpenRouter arc.

## Cross-cutting work per arc

Every new provider arc touches:

- **Pricing table** in `atomic_agents/_costs.py` — input/output USD per 1M tokens per model variant
- **Key resolver** `_get_<provider>_key()` in `atomic_agents/_llm.py` mirroring the existing three (`ATOMIC_AGENTS_<UPPER>_KEY` env var first, then `<UPPER>_API_KEY` fallback, then Keychain, then `~/.config/atomic_agents/keys.json`)
- **Doctor check** — `check_provider_keys` in `atomic_agents/doctor.py` extends to enumerate the registered backend + verify its credential resolves with PASS/WARN/FAIL ladder + redacted URL output
- **Conformance suite parametrization** at `tests/test_llm_protocol_conformance.py` so the SyncLLMBackend Protocol contract is enforced across all 7 backends
- **spec/31 amendment** — update §"Reference implementations" to enumerate the new shipped backend + §"Default model" if recommendation changes
- **CHANGELOG entry** per arc

## Ordering (recommended)

1. **Grok + Deepseek paired (or in parallel).** OpenAI-shape, low risk. Warms up the cross-cutting work (pricing tables, key resolvers, doctor extension, conformance parametrization) before harder arcs.
2. **OpenRouter.** Strategic value is highest (one key, many models). Wire impl is OpenAI-shape so medium effort. Load-bearing design question is the `model.md` shape resolution.
3. **Gemini.** Biggest scope. Full adversarial cadence including tool-use translation correctness verification.
4. **#324 wizard update.** After all 4 land. The `--provider` enum becomes real; env-var autodetect scans all 7 credential patterns; Q0 question grows from 3 options to 7 (or splits into "common providers" / "all providers" if 7 feels overwhelming — decide in #324's plan-eng-review).
5. **#142 OPERATOR_NOTES.md amendment.** Folds in alongside #324.

## Issues filed under this meta

- **#326** — Grok / xAI LLMBackend
- **#327** — Deepseek LLMBackend
- **#328** — OpenRouter LLMBackend
- **#329** — Gemini LLMBackend

## Amendments queued for after the 4 arcs land

- **#324** — wizard `--provider` flag enum 3 → 7; env-var autodetect logic; Q0 question expansion.
- **#142** — OPERATOR_NOTES.md "LLM provider" line accepts expanded enum.

## What this does NOT do

- Does NOT add Bedrock, Vertex, Azure OpenAI, Ollama, vLLM, LiteLLM-as-core, or any other provider beyond the 4 named.
- Does NOT change the SyncLLMBackend Protocol surface itself (that's #87's territory; locked at v0.13).
- Does NOT change `agent.call()` flow, cost-gate dispatch, or tool-loop semantics — backends translate at their own boundaries per spec/31.
- Does NOT add streaming or async backend variants (those are reserved Protocols per spec/31 §"Reserved future Protocols").

## References

- spec/31 — LLMBackend Protocol; the architectural contract every new backend satisfies.
- #87 — original LLMBackend arc that established the Protocol.
- #324 — wizard multi-provider support (downstream consumer of this meta).
- #142 — OPERATOR_NOTES.md (downstream amendment).
- CLAUDE.md taste rule 2 (Protocols not subclassing) and rule 4 (cost is first-class) — both binding on every per-provider arc.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[meta] Expand LLM provider catalog — Grok / Deepseek / OpenRouter / Gemini (in-core) #325

Decision (locked)

Architectural shape

OpenAI-compatible (Grok / Deepseek / OpenRouter)

Native protocol (Gemini)

OpenRouter is special

Cross-cutting work per arc

Ordering (recommended)

Issues filed under this meta

Amendments queued for after the 4 arcs land

What this does NOT do

References

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

[meta] Expand LLM provider catalog — Grok / Deepseek / OpenRouter / Gemini (in-core) #325

Description

Decision (locked)

Architectural shape

OpenAI-compatible (Grok / Deepseek / OpenRouter)

Native protocol (Gemini)

OpenRouter is special

Cross-cutting work per arc

Ordering (recommended)

Issues filed under this meta

Amendments queued for after the 4 arcs land

What this does NOT do

References

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions