You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
OpenAnt is hardcoded to Anthropic's Claude end-to-end. Every step in the scan pipeline — Stage 1 detect, context enhancement (agentic and single-shot), Stage 1 consistency check, Stage 2 verification, JSON correction, and report remediation — calls a hardcoded claude-... model ID against the default Anthropic API endpoint. There is no supported way to:
Point OpenAnt at a non-Anthropic provider (OpenRouter, a self-hosted proxy, etc.)
Use an open-weights model (Qwen, Kimi, MiniMax, DeepSeek, Llama, GLM, ...) for any step in the pipeline
Run OpenAnt against a local LLM server (Ollama, LM Studio, vLLM)
Why this matters
Cost. Open-weights models on OpenRouter are 10-50x cheaper than Opus for comparable code-analysis tasks. Users who scan large monorepos hit the price wall fast.
Quality comparison. Several open-weights coding models (Qwen 3 Coder, Kimi K2, DeepSeek V3) now rival Claude on code understanding benchmarks. Users want to compare verdict quality before committing to a provider.
Privacy / on-prem. Some teams cannot send code to a hosted LLM at all. Local model support unlocks those users entirely.
Vendor independence. Hardcoding a single LLM vendor into a security tool is a long-term liability — model deprecations, pricing changes, and regional availability all become outage risks.
…and have every LLM call in the pipeline route to that endpoint with that model. Same shape for local servers (OPENANT_LLM_BASE_URL=http://localhost:11434/v1) or for any other Anthropic-compatible provider.
Acceptance criteria
Single --model flag controls every LLM step end-to-end (detect, enhance, consistency, verify, JSON correction, report remediation). No silent fallback to a hardcoded Claude ID.
Provider/endpoint is configurable via environment variables (and/or a CLI flag) without touching code. Default behavior for users who set neither is unchanged — they keep talking to Anthropic the way they do today.
Cost tracking is honest for non-Claude models: known pricing from a built-in or override table, unknown models report $0 with a clear warning rather than silently estimating with Claude rates.
Documented in README with at least three concrete examples: OpenRouter, a local LLM server, and a custom Anthropic-compat proxy.
Test coverage proves end-to-end model propagation — i.e., a scan with --model X actually sends model X to every messages.create() call site.
Existing Claude users see zero behavior change. Same defaults, same model IDs, same costs.
Out of scope
Multi-provider per scan (e.g. Opus for verify, Qwen for detect). The current goal is one provider per scan.
A second SDK (OpenAI-SDK code path). The Anthropic-compat endpoint approach via base_url keeps the dependency surface small; an OpenAI path can be a follow-up if a non-compat provider becomes important.
Routing logic (smart provider selection based on cost/load). Out of scope for this issue.
Notes
This issue intentionally describes the goal and acceptance criteria only — it does not prescribe an implementation. There are several in-flight PRs touching adjacent slices of this surface; coordination on those happens on the PRs themselves once this umbrella is filed.
Problem
OpenAnt is hardcoded to Anthropic's Claude end-to-end. Every step in the scan pipeline — Stage 1 detect, context enhancement (agentic and single-shot), Stage 1 consistency check, Stage 2 verification, JSON correction, and report remediation — calls a hardcoded
claude-...model ID against the default Anthropic API endpoint. There is no supported way to:Why this matters
End state
A user should be able to run:
…and have every LLM call in the pipeline route to that endpoint with that model. Same shape for local servers (
OPENANT_LLM_BASE_URL=http://localhost:11434/v1) or for any other Anthropic-compatible provider.Acceptance criteria
--modelflag controls every LLM step end-to-end (detect, enhance, consistency, verify, JSON correction, report remediation). No silent fallback to a hardcoded Claude ID.--model Xactually sends modelXto everymessages.create()call site.Out of scope
base_urlkeeps the dependency surface small; an OpenAI path can be a follow-up if a non-compat provider becomes important.Notes
This issue intentionally describes the goal and acceptance criteria only — it does not prescribe an implementation. There are several in-flight PRs touching adjacent slices of this surface; coordination on those happens on the PRs themselves once this umbrella is filed.