Skip to content

[feat] Enable OpenAnt to work with non-Claude LLMs (Qwen, Kimi, local models, etc.) #65

@ar7casper

Description

@ar7casper

Problem

OpenAnt is hardcoded to Anthropic's Claude end-to-end. Every step in the scan pipeline — Stage 1 detect, context enhancement (agentic and single-shot), Stage 1 consistency check, Stage 2 verification, JSON correction, and report remediation — calls a hardcoded claude-... model ID against the default Anthropic API endpoint. There is no supported way to:

  • Point OpenAnt at a non-Anthropic provider (OpenRouter, a self-hosted proxy, etc.)
  • Use an open-weights model (Qwen, Kimi, MiniMax, DeepSeek, Llama, GLM, ...) for any step in the pipeline
  • Run OpenAnt against a local LLM server (Ollama, LM Studio, vLLM)

Why this matters

  • Cost. Open-weights models on OpenRouter are 10-50x cheaper than Opus for comparable code-analysis tasks. Users who scan large monorepos hit the price wall fast.
  • Quality comparison. Several open-weights coding models (Qwen 3 Coder, Kimi K2, DeepSeek V3) now rival Claude on code understanding benchmarks. Users want to compare verdict quality before committing to a provider.
  • Privacy / on-prem. Some teams cannot send code to a hosted LLM at all. Local model support unlocks those users entirely.
  • Vendor independence. Hardcoding a single LLM vendor into a security tool is a long-term liability — model deprecations, pricing changes, and regional availability all become outage risks.

End state

A user should be able to run:

export OPENANT_LLM_BASE_URL=https://openrouter.ai/api/v1
export OPENANT_LLM_API_KEY=sk-or-v1-...
openant scan /path/to/repo --model qwen/qwen-3-coder-480b

…and have every LLM call in the pipeline route to that endpoint with that model. Same shape for local servers (OPENANT_LLM_BASE_URL=http://localhost:11434/v1) or for any other Anthropic-compatible provider.

Acceptance criteria

  • Single --model flag controls every LLM step end-to-end (detect, enhance, consistency, verify, JSON correction, report remediation). No silent fallback to a hardcoded Claude ID.
  • Provider/endpoint is configurable via environment variables (and/or a CLI flag) without touching code. Default behavior for users who set neither is unchanged — they keep talking to Anthropic the way they do today.
  • Cost tracking is honest for non-Claude models: known pricing from a built-in or override table, unknown models report $0 with a clear warning rather than silently estimating with Claude rates.
  • Documented in README with at least three concrete examples: OpenRouter, a local LLM server, and a custom Anthropic-compat proxy.
  • Test coverage proves end-to-end model propagation — i.e., a scan with --model X actually sends model X to every messages.create() call site.
  • Existing Claude users see zero behavior change. Same defaults, same model IDs, same costs.

Out of scope

  • Multi-provider per scan (e.g. Opus for verify, Qwen for detect). The current goal is one provider per scan.
  • A second SDK (OpenAI-SDK code path). The Anthropic-compat endpoint approach via base_url keeps the dependency surface small; an OpenAI path can be a follow-up if a non-compat provider becomes important.
  • Routing logic (smart provider selection based on cost/load). Out of scope for this issue.

Notes

This issue intentionally describes the goal and acceptance criteria only — it does not prescribe an implementation. There are several in-flight PRs touching adjacent slices of this surface; coordination on those happens on the PRs themselves once this umbrella is filed.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions