Dynamic model fetching, pricing scraping, and context window metadata #5

Open: manmeet0409 wants to merge 3 commits into dev2k6:main from manmeet0409:feat/context-windows-reasoning

manmeet0409 commented Jun 18, 2026

Summary

This PR makes the Command Code proxy self-maintaining: it discovers models and pricing dynamically, so changes upstream don't require code updates.

What's new

  1. Dynamic model fetching — On startup and every 6 hours, the proxy fetches the live model catalog from https://api.commandcode.ai/provider/v1/models. Replaces the original 12-model hardcoded list with the real, up-to-date catalog.

  2. Model probing — Each upstream model is probed with a minimal chat request to verify it actually exists. Models returning 403 ("not recognized") are filtered out. This catches stale entries that upstream doesn't actually serve.

  3. Pricing/deal scraping — The public pricing page (https://commandcode.ai/docs/resources/pricing-limits) is scraped every 24 hours for active deals. Each model can carry a pricing field showing the deal multiplier, description, and status (permanent / expiration date).

  4. Context window metadata — Each model reports its context_length (131K to 1M tokens) so clients like Hermes can determine compression thresholds.

  5. Tomorrow-proof design:

    • Model list refreshes every 6 hours
    • Deals refresh every 24 hours
    • Lenient HTML parser with fallback warnings
    • Fuzzy matching handles name normalization (dots, dashes, prefixes)
    • Graceful fallback to static list when upstream is unreachable
    • Default context window (128K) for unknown models
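The refresh-and-fallback behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code in modelfetch.go: the `Model`, `FetchFunc`, `NewModelCache`, `DefaultTTL`, and `ErrEmptyCatalog` names are assumptions made for the example, and treating an empty catalog as a failed refresh is a design choice of the sketch rather than something the PR states.

```go
package main

import (
	"errors"
	"sync"
	"time"
)

// DefaultTTL mirrors the 6-hour model refresh interval described in the PR.
const DefaultTTL = 6 * time.Hour

// ErrEmptyCatalog is a hypothetical sentinel: the sketch treats an empty
// upstream response as a failed refresh so it never replaces good data.
var ErrEmptyCatalog = errors.New("upstream returned an empty model catalog")

// Model is a trimmed-down, hypothetical catalog entry.
type Model struct {
	ID string
}

// FetchFunc abstracts the upstream call so the cache can be exercised offline.
type FetchFunc func() ([]Model, error)

// ModelCache keeps the last successful fetch behind a read/write mutex.
type ModelCache struct {
	mu        sync.RWMutex
	models    []Model
	fetchedAt time.Time
	ttl       time.Duration
	fetch     FetchFunc
	static    []Model // fallback used until a fetch succeeds
}

// NewModelCache builds an empty cache that serves the static list at first.
func NewModelCache(ttl time.Duration, fetch FetchFunc, static []Model) *ModelCache {
	return &ModelCache{ttl: ttl, fetch: fetch, static: static}
}

// Refresh fetches the catalog; on any failure the previous contents are kept.
func (c *ModelCache) Refresh() error {
	models, err := c.fetch()
	if err != nil {
		return err
	}
	if len(models) == 0 {
		return ErrEmptyCatalog
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.models = models
	c.fetchedAt = time.Now()
	return nil
}

// Stale reports whether the cache was never filled or is older than its TTL.
func (c *ModelCache) Stale() bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.fetchedAt.IsZero() || time.Since(c.fetchedAt) > c.ttl
}

// Models returns the cached catalog, or the static list if none was fetched.
func (c *ModelCache) Models() []Model {
	c.mu.RLock()
	defer c.mu.RUnlock()
	if len(c.models) == 0 {
		return c.static
	}
	return c.models
}
```

A handler built on this would call `Models()` on every request and start `Refresh()` in a goroutine whenever `Stale()` reports true, so a slow or unreachable upstream never blocks a `/v1/models` response.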

Files changed

  • internal/proxy/contextmap.go (new) — local override map for context windows
  • internal/proxy/modelfetch.go (rewritten) — dynamic fetch + probe + pricing integration
  • internal/proxy/pricing.go (new) — pricing page scraper + fuzzy model-deal matching
  • internal/proxy/proxy.go: HandleModels now reads from cache with fallback
  • internal/api/openai.go: OpenAIModel includes context_length and pricing fields
  • internal/api/commandcode.go — added reasoning field for effort passthrough
  • internal/proxy/model.go — added short aliases (sonnet, opus, haiku, glm-5.2, etc.)
  • .gitignore — ignore local-only scripts with API tokens
  • bin/command-code-proxy.exe — rebuilt Windows binary

Removed

  • Static injection of taste-1 (CLI-only, not on Provider API)
  • Static injection of MiniMax-M3-Promo (not served upstream)
  • Hardcoded 12-model list (replaced by dynamic fetch)

Total models exposed

33 models matching the full Command Code catalog, with up to 6 active deals visible in /v1/models responses.

Example response

{
  "id": "deepseek/deepseek-v4-pro",
  "object": "model",
  "owned_by": "deepseek",
  "context_length": 1000000,
  "pricing": {
    "multiplier": "4x",
    "description": "4× usage. Every dollar of credit goes 4× further on this model.",
    "status": "permanent"
  }
}
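
For reference, a response of this shape could be produced by structs along the following lines. The type names `OpenAIModel` and `ModelPricing` come from this PR, but the field layout, the pointer-with-`omitempty` choice for models without a deal, and the `EncodeModel` helper are assumptions made for the sketch.

```go
package main

import "encoding/json"

// ModelPricing carries the scraped deal for a model (field layout assumed).
type ModelPricing struct {
	Multiplier  string `json:"multiplier"`
	Description string `json:"description"`
	Status      string `json:"status"` // "permanent" or an expiration date
}

// OpenAIModel is one entry in a /v1/models response (field layout assumed).
type OpenAIModel struct {
	ID            string `json:"id"`
	Object        string `json:"object"`
	OwnedBy       string `json:"owned_by"`
	ContextLength int    `json:"context_length"`
	// A nil pointer with omitempty drops the field for models without a deal.
	Pricing *ModelPricing `json:"pricing,omitempty"`
}

// EncodeModel is a hypothetical helper that renders one entry as compact JSON.
func EncodeModel(m OpenAIModel) (string, error) {
	b, err := json.Marshal(m)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

With this layout, clients that ignore unknown fields keep working, and a model with no active deal simply has no `pricing` key.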

Test plan

  • /v1/models returns all models with context_length and pricing
  • Probing filters non-existent models
  • Pricing scrape finds all 6 current deals
  • Fuzzy matching handles name variations (Qwen3.7-Max ↔ qwen-3.7-max)
  • Graceful fallback when upstream is down
  • No API tokens leaked in commit
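
The name variation in the test plan (Qwen3.7-Max ↔ qwen-3.7-max) can be handled by normalizing both sides before comparing. The sketch below is one possible approach, not necessarily what pricing.go does: `normalizeModelName` and `MatchDeal` are hypothetical names, and the rule (drop the vendor prefix, lowercase, keep only letters and digits) is an assumption.

```go
package main

import "strings"

// normalizeModelName drops any "vendor/" prefix, lowercases the rest, and
// keeps only ASCII letters and digits, so dots and dashes no longer matter.
func normalizeModelName(name string) string {
	if i := strings.LastIndex(name, "/"); i >= 0 {
		name = name[i+1:]
	}
	var b strings.Builder
	for _, r := range strings.ToLower(name) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// MatchDeal returns the first deal key whose normalized form equals the
// normalized upstream model ID.
func MatchDeal(modelID string, dealKeys []string) (string, bool) {
	want := normalizeModelName(modelID)
	for _, key := range dealKeys {
		if normalizeModelName(key) == want {
			return key, true
		}
	}
	return "", false
}
```

One trade-off of stripping every separator is that distinct names such as `glm-5.2` and `glm-52` would collapse to the same key, so a stricter rule may be needed if the catalog ever contains such pairs.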

- Add context_length to OpenAIModel struct so Hermes can detect context windows and trigger compression at 50%
- Add reasoning effort passthrough (low/medium/high) to OpenAIChatRequest and CCChatParams
- Expand HandleModels from 12 to 33 Command Code models with accurate context windows
- Add short aliases (sonnet, opus, haiku, etc.) to MapModel
- Add missing models: GLM-5.2, claude-opus-4-6, MiniMax-M3-Promo

This enables Hermes to use the proxy with proper context awareness and reasoning control.
- Add contextmap.go: local override map for context windows (upstream API
  doesn't expose them). Single source of truth used by both dynamic and
  static paths.
- Add modelfetch.go: ModelCache with thread-safe access. Fetches from
  https://api.commandcode.ai/provider/v1/models on startup and every 6 hours.
  Graceful fallback to static list when upstream is unreachable.
- Update proxy.go: HandleModels now reads from cache; HandleModels kicks off
  background refresh if cache is stale. Extract getStaticModels() as fallback.
  Inject Command Code-internal models not in upstream response: taste-1,
  claude-haiku-4-5 (bare name), claude-opus-4-6, MiniMax-M3-Promo.
- Update .gitignore: ignore locally-built Linux ELF (command-code-proxy-updated)
  and dev-only launcher scripts (run-proxy.bat, run-proxy.vbs) that contain
  API tokens.
- Rebuild bin/command-code-proxy.exe with -ldflags "-s -w" (stripped symbols,
  ~30% smaller).

Total models exposed: 36 (matches full Command Code catalog).
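
The context window override map from this commit can be pictured as a lookup with a default. This is a sketch only: `contextOverrides`, `ContextLength`, and `DefaultContextLength` are hypothetical names, the "128K" default is read here as 128,000 tokens (it may equally mean 131,072), and only the deepseek entry is taken from the example response in this PR.

```go
package main

import "strings"

// DefaultContextLength is used for models missing from the override map.
// Assumption: "128K" is interpreted as 128,000 tokens.
const DefaultContextLength = 128_000

// contextOverrides maps lowercase model IDs to context windows in tokens.
var contextOverrides = map[string]int{
	"deepseek/deepseek-v4-pro": 1_000_000, // value from the example response
	"example/small-model":      131_072,   // hypothetical entry for illustration
}

// ContextLength returns the override for a model ID, compared
// case-insensitively, or the default when the model is unknown.
func ContextLength(modelID string) int {
	if n, ok := contextOverrides[strings.ToLower(modelID)]; ok {
		return n
	}
	return DefaultContextLength
}
```

Keeping the map in one place lets both the dynamic path and the static fallback report the same `context_length` for a given model.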
- Add internal/proxy/pricing.go: scrapes Command Code's public pricing page
  for active deals (multiplier, description, status). Updates every 24 hours.
  Fuzzy matching connects deal keys (e.g. 'qwen-3.7-max') to upstream model
  IDs (e.g. 'Qwen/Qwen3.7-Max') despite naming differences.

- Update internal/proxy/modelfetch.go: probe each upstream model with a
  minimal chat request to verify it exists. Models returning 403
  ('not recognized') are filtered out. This catches stale entries like
  MiniMax-M3-Promo that were injected but don't actually work.

- Add ModelPricing struct to api.OpenAIModel so /v1/models responses include
  deal information per model.

- Remove static injections of 'taste-1', 'claude-haiku-4-5' (bare),
  'claude-opus-4-6', and 'MiniMax-M3-Promo' — these were either CLI-only
  or upstream-inaccessible. The dynamic fetch + probe handles this now.

- Rebuild bin/command-code-proxy.exe with stripped symbols.
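
The probe-and-filter step in this commit could look roughly like the sketch below. `ProbeFunc` and `FilterServed` are hypothetical names; the probe is injected so the filter can be tested without network access. Keeping a model when the probe itself errors (for example on a timeout) is an assumption of the sketch, chosen so that a transient outage does not empty the catalog.

```go
package main

import "net/http"

// ProbeFunc sends a minimal chat request for a model and reports the HTTP
// status code, or an error if the request could not be completed.
type ProbeFunc func(modelID string) (status int, err error)

// FilterServed keeps every model except those whose probe completed and
// returned 403, which upstream uses for unrecognized models.
func FilterServed(ids []string, probe ProbeFunc) []string {
	kept := make([]string, 0, len(ids))
	for _, id := range ids {
		status, err := probe(id)
		if err == nil && status == http.StatusForbidden {
			continue // upstream does not recognize this model
		}
		kept = append(kept, id)
	}
	return kept
}
```

In the proxy, the probe would wrap a real chat request against the Provider API; in tests it can be a closure returning canned status codes.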

Tomorrow-proof: model list refreshes every 6h, deals every 24h. If upstream
changes HTML structure, the parser is lenient and falls back to logging
a warning rather than failing.
manmeet0409 changed the title from "Add dynamic model fetching, taste-1 model, and context window metadata" to "Dynamic model fetching, pricing scraping, and context window metadata" on Jun 18, 2026