[meta] Expand LLM provider catalog — Grok / Deepseek / OpenRouter / Gemini (in-core) #325

@dep0we

Decision (locked)

Expand the LLM provider catalog in core (not as separate atomic-agents-<provider> packages) to add Grok / xAI, Deepseek, OpenRouter, Gemini. Reasons:

  • Home-user experience stays "single pip install atomic-agents-stack and pick a provider." Splitting providers across N packages multiplies install friction for the audience the wizard was built for.
  • The wizard --provider flag enum (see #324, "init wizard: multi-provider support") needs every supported provider to be present at import time so that env-var autodetect and the Q0 question can enumerate them. Out-of-core packages would require extension-discovery machinery (probably entry points) that doesn't exist today.
  • Conformance suite at tests/test_llm_protocol_conformance.py catches breakage across all backends in CI. Out-of-core packages would need their own CI matrices and would lag the framework's own contract changes.

Counter-argument acknowledged: spec/31 §"Reference implementations" explicitly framed the framework as "ships three reference backends; everything else is a 200-line third-party package." That position is preserved for future providers (Bedrock, Vertex, Ollama, vLLM, Azure OpenAI, etc.) where the user base is more specialized. The 4 in this issue are sufficiently common in 2026 LLM workflows that core hosting is justified.

Architectural shape

The 4 providers split into two work classes:

OpenAI-compatible (Grok / Deepseek / OpenRouter)

All three expose POST /v1/chat/completions matching OpenAI's request/response shape. Implementation pattern: thin factory over atomic_agents/llm/openai_compat.py, same as moonshot.py. Each is ~50-150 LOC:

  • make_<provider>_backend(base_url, model_catalog) -> OpenAICompatibleLLMBackend factory
  • Pricing table in _costs.py
  • _get_<provider>_key() resolver in _llm.py
  • Conformance test parametrization
  • spec/31 amendment
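
A minimal sketch of what one of these thin factories could look like. The OpenAICompatibleLLMBackend class below is a stand-in (the real one in atomic_agents/llm/openai_compat.py may take different arguments), and the default base URL and model name are assumptions to verify against the provider's documentation:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenAICompatibleLLMBackend:
    """Stand-in for the class in atomic_agents/llm/openai_compat.py (shape assumed)."""

    provider: str
    base_url: str
    model_catalog: tuple[str, ...]

    def chat_completions_url(self) -> str:
        # OpenAI-compatible providers accept POST <base_url>/chat/completions.
        return self.base_url.rstrip("/") + "/chat/completions"


def make_deepseek_backend(
    base_url: str = "https://api.deepseek.com/v1",  # assumed default, verify before use
    model_catalog: tuple[str, ...] = ("deepseek-chat",),  # illustrative catalog
) -> OpenAICompatibleLLMBackend:
    """Hypothetical factory following the make_<provider>_backend pattern."""
    return OpenAICompatibleLLMBackend(
        provider="deepseek",
        base_url=base_url,
        model_catalog=model_catalog,
    )
```

The point of the shape is that a new provider only supplies a name, a base URL, and a model catalog; everything on the wire stays in the shared OpenAI-compatible backend.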

Native protocol (Gemini)

Gemini has its own API shape, tool-use format, content-block structure, and streaming semantics. Implementation pattern: full atomic_agents/llm/gemini.py mirroring anthropic.py. ~300-500 LOC with translation at the call() boundary. Deserves full adversarial review cadence (3 rounds Opus minimum per the project's discipline rule 11).
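
To illustrate what "translation at the call() boundary" means for plain text messages, here is a rough sketch. It assumes the framework's canonical messages look like OpenAI-style role/content dicts (an assumption, the real canonical types live in spec/31) and maps them to a Gemini-style request body with a contents list and a separate systemInstruction. Tool-use and streaming translation are out of scope for this sketch:

```python
from typing import Any


def to_gemini_request(messages: list[dict[str, str]]) -> dict[str, Any]:
    """Translate role/content messages into a Gemini-style generateContent body.

    Hypothetical helper: the input message shape is assumed, not taken from spec/31.
    """
    system_parts: list[dict[str, str]] = []
    contents: list[dict[str, Any]] = []
    for msg in messages:
        role, text = msg["role"], msg["content"]
        if role == "system":
            # Gemini carries system text outside the turn list.
            system_parts.append({"text": text})
        else:
            # Gemini names the assistant role "model"; everything else is "user".
            gemini_role = "model" if role == "assistant" else "user"
            contents.append({"role": gemini_role, "parts": [{"text": text}]})
    body: dict[str, Any] = {"contents": contents}
    if system_parts:
        body["systemInstruction"] = {"parts": system_parts}
    return body
```

Even this text-only case shows why Gemini does not fit the thin-factory pattern: roles are renamed, system text moves out of the message list, and content becomes a list of parts.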

OpenRouter is special

OpenAI-compatible at the wire level, but strategically a meta-provider — one API key unlocks Anthropic + OpenAI + Gemini + Llama + many others via prefixed model strings (anthropic/claude-opus-4-7, google/gemini-2.0-flash, meta-llama/llama-3.3-70b-instruct).

Open design question for the OpenRouter arc's plan-eng-review: how does model.md represent meta-routing? Options:

  • A. Single field. provider: openrouter + model: anthropic/claude-opus-4-7. Simpler. Aligns with OpenRouter's native model-string convention.
  • B. Split fields. provider: openrouter + target_provider: anthropic + model: claude-opus-4-7. More explicit routing intent; tooling can introspect target.
  • C. Both supported. Single field is primary, split is opt-in for operators who want routing visibility.

Recommendation: A for v1 simplicity; revisit if operators ask for B. Decide in the OpenRouter arc, not here.
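
One observation in favor of A: the split view of option B can be derived from the single field whenever tooling wants it. A small sketch, where RoutedModel and resolve_routed_model are hypothetical names and not existing framework API:

```python
from typing import NamedTuple, Optional


class RoutedModel(NamedTuple):
    provider: str
    target_provider: Optional[str]  # None when the provider is not a meta-router
    model: str


def resolve_routed_model(provider: str, model: str) -> RoutedModel:
    """Derive option B's split fields from option A's single model string."""
    if provider == "openrouter" and "/" in model:
        # OpenRouter model strings are "<upstream>/<model>"; split on the first slash.
        target, _, name = model.partition("/")
        return RoutedModel(provider, target, name)
    return RoutedModel(provider, None, model)
```

For example, resolve_routed_model("openrouter", "anthropic/claude-opus-4-7") yields target_provider "anthropic" and model "claude-opus-4-7", while a direct provider passes through with target_provider None.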

Pricing strategy for OpenRouter. Options:

  • 1. Hardcode every routed upstream model's price in _costs.py (high maintenance, accurate).
  • 2. Defer to OpenRouter's billing-side accounting and emit cost events with cost_usd: unknown (low maintenance, but breaks cost-guardrail discipline rule 4 from CLAUDE.md).
  • 3. Cache OpenRouter's /models endpoint pricing once per session (medium maintenance, accurate).

Recommendation: option 3 with a manual override hook. Decide in the OpenRouter arc.
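
A rough sketch of the session cache with a manual override hook. SessionPricing is a hypothetical name, the fetch function is injected so the sketch makes no network calls, and the assumed response shape (entries with an id and a pricing dict holding per-token USD prices as strings under prompt and completion) should be checked against OpenRouter's API documentation before relying on it:

```python
from typing import Callable, Optional

# model id -> (input, output) in USD per 1M tokens, matching the _costs.py convention
PriceTable = dict[str, tuple[float, float]]


class SessionPricing:
    """Fetch pricing at most once per session; manual overrides always win."""

    def __init__(
        self,
        fetch_models: Callable[[], list[dict]],
        overrides: Optional[PriceTable] = None,
    ) -> None:
        self._fetch_models = fetch_models
        self._overrides: PriceTable = dict(overrides or {})
        self._cache: Optional[PriceTable] = None

    def _load(self) -> PriceTable:
        if self._cache is None:
            table: PriceTable = {}
            for entry in self._fetch_models():
                pricing = entry.get("pricing") or {}
                if "prompt" not in pricing or "completion" not in pricing:
                    continue  # skip entries without usable pricing
                # Assumed: prices are per-token USD strings; convert to per 1M tokens.
                table[entry["id"]] = (
                    float(pricing["prompt"]) * 1_000_000,
                    float(pricing["completion"]) * 1_000_000,
                )
            self._cache = table
        return self._cache

    def price(self, model: str) -> Optional[tuple[float, float]]:
        if model in self._overrides:
            return self._overrides[model]
        return self._load().get(model)
```

Returning None for an unknown model leaves the caller to decide how the cost gate should react, which is the part that needs a decision in the OpenRouter arc.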

Cross-cutting work per arc

Every new provider arc touches:

  • Pricing table in atomic_agents/_costs.py — input/output USD per 1M tokens per model variant
  • Key resolver _get_<provider>_key() in atomic_agents/_llm.py mirroring the existing three (ATOMIC_AGENTS_<UPPER>_KEY env var first, then <UPPER>_API_KEY fallback, then Keychain, then ~/.config/atomic_agents/keys.json)
  • Doctor check check_provider_keys in atomic_agents/doctor.py, extended to enumerate the registered backend and verify that its credential resolves, with a PASS/WARN/FAIL ladder and redacted URL output
  • Conformance suite parametrization at tests/test_llm_protocol_conformance.py so the SyncLLMBackend Protocol contract is enforced across all 7 backends
  • spec/31 amendment — update §"Reference implementations" to enumerate the new shipped backend + §"Default model" if recommendation changes
  • CHANGELOG entry per arc
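
The key-resolver lookup order described above can be sketched as follows. This is an illustration, not the code in atomic_agents/_llm.py: the function name, the injected keychain_lookup callable, and the flat provider-to-key layout of keys.json are all assumptions:

```python
import json
import os
from pathlib import Path
from typing import Callable, Mapping, Optional

DEFAULT_KEYS_FILE = Path.home() / ".config" / "atomic_agents" / "keys.json"


def resolve_provider_key(
    provider: str,
    env: Mapping[str, str] = os.environ,
    keychain_lookup: Optional[Callable[[str], Optional[str]]] = None,
    keys_file: Path = DEFAULT_KEYS_FILE,
) -> Optional[str]:
    """Return the first credential found, or None if nothing resolves."""
    upper = provider.upper()
    # 1 and 2: framework-specific env var first, then the generic fallback.
    for name in (f"ATOMIC_AGENTS_{upper}_KEY", f"{upper}_API_KEY"):
        value = env.get(name)
        if value:
            return value
    # 3: Keychain, injected as a callable so the sketch stays platform-neutral.
    if keychain_lookup is not None:
        value = keychain_lookup(provider)
        if value:
            return value
    # 4: keys.json, assumed here to be a flat {"<provider>": "<key>"} mapping.
    if keys_file.is_file():
        try:
            data = json.loads(keys_file.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            return None
        if isinstance(data, dict):
            return data.get(provider) or None
    return None
```

Passing env and keys_file explicitly keeps the resolver testable without touching the real environment, which also suits the conformance suite and the doctor check.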

Ordering (recommended)

  1. Grok + Deepseek paired (or in parallel). OpenAI-shape, low risk. Warms up the cross-cutting work (pricing tables, key resolvers, doctor extension, conformance parametrization) before harder arcs.
  2. OpenRouter. Strategic value is highest (one key, many models). Wire impl is OpenAI-shape so medium effort. Load-bearing design question is the model.md shape resolution.
  3. Gemini. Biggest scope. Full adversarial cadence including tool-use translation correctness verification.
  4. Wizard update (#324, "init wizard: multi-provider support"). After all 4 land. The --provider enum becomes real; env-var autodetect scans all 7 credential patterns; the Q0 question grows from 3 options to 7 (or splits into "common providers" / "all providers" if 7 feels overwhelming; decide in #324's plan-eng-review).
  5. OPERATOR_NOTES.md amendment (#142, "[init] OPERATOR_NOTES.md + opt-in judges/mandates/audits scaffolding + operating mode + location questions"). Folds in alongside #324.
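
For step 4, env-var autodetect across all 7 providers could look roughly like this. The provider identifiers and the function name are assumptions; the two env-var patterns are the ones listed under the key resolver above:

```python
from typing import Mapping

# Assumed identifiers: the three existing backends plus the four added by this meta.
PROVIDERS = (
    "anthropic", "openai", "moonshot",
    "grok", "deepseek", "openrouter", "gemini",
)


def detect_providers(env: Mapping[str, str]) -> list[str]:
    """Return providers with a non-empty credential env var, in catalog order."""
    found = []
    for provider in PROVIDERS:
        upper = provider.upper()
        if env.get(f"ATOMIC_AGENTS_{upper}_KEY") or env.get(f"{upper}_API_KEY"):
            found.append(provider)
    return found
```

If exactly one provider is detected the wizard can skip Q0; with several it can offer only the detected ones first, which may also soften the "7 feels overwhelming" concern.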

Issues filed under this meta

Amendments queued for after the 4 arcs land

What this does NOT do

  • Does NOT add Bedrock, Vertex, Azure OpenAI, Ollama, vLLM, LiteLLM-as-core, or any other provider beyond the 4 named.
  • Does NOT change the SyncLLMBackend Protocol surface itself (that is the territory of #87, "[backend] LLMBackend Protocol + canonical types + native reference implementations"; locked at v0.13).
  • Does NOT change agent.call() flow, cost-gate dispatch, or tool-loop semantics — backends translate at their own boundaries per spec/31.
  • Does NOT add streaming or async backend variants (those are reserved Protocols per spec/31 §"Reserved future Protocols").

References

Metadata

Labels: backend (Protocol-pattern backend abstractions: memory, logs, locks, etc.), enhancement (New feature or request), spec (Implementation of an Atomic Agents spec doc)
