Dynamic model fetching, pricing scraping, and context window metadata #5
Open
manmeet0409 wants to merge 3 commits into
- Add context_length to OpenAIModel struct so Hermes can detect context windows and trigger compression at 50%
- Add reasoning effort passthrough (low/medium/high) to OpenAIChatRequest and CCChatParams
- Expand HandleModels from 12 to 33 Command Code models with accurate context windows
- Add short aliases (sonnet, opus, haiku, etc.) to MapModel
- Add missing models: GLM-5.2, claude-opus-4-6, MiniMax-M3-Promo

This enables Hermes to use the proxy with proper context awareness and reasoning control.
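A minimal sketch of how the context_length field, the 50% compression threshold, and the short aliases could fit together. Only context_length, MapModel, and the alias names come from the commit message; the other struct fields, the alias targets, and the helper names mapModel and compressionThreshold are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// OpenAIModel is one entry of a /v1/models response. Only context_length is
// named in the commit message; the other fields follow the usual OpenAI
// model object shape and are assumed here.
type OpenAIModel struct {
	ID            string `json:"id"`
	Object        string `json:"object"`
	OwnedBy       string `json:"owned_by"`
	ContextLength int    `json:"context_length,omitempty"`
}

// aliases is a hypothetical short-name table; the targets are illustrative.
var aliases = map[string]string{
	"opus":  "claude-opus-4-6",
	"haiku": "claude-haiku-4-5",
}

// mapModel resolves a short alias to a full model ID and passes any other
// name through unchanged.
func mapModel(name string) string {
	if full, ok := aliases[strings.ToLower(name)]; ok {
		return full
	}
	return name
}

// compressionThreshold is the token count at which a client would start
// compressing, taken as 50% of the advertised context window.
func compressionThreshold(m OpenAIModel) int {
	return m.ContextLength / 2
}

func main() {
	m := OpenAIModel{
		ID:            "deepseek/deepseek-v4-pro",
		Object:        "model",
		OwnedBy:       "deepseek",
		ContextLength: 1000000,
	}
	out, err := json.Marshal(m)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	fmt.Println("compress at", compressionThreshold(m), "tokens")
	fmt.Println("opus resolves to", mapModel("opus"))
}
```

With omitempty, a model whose context window is unknown simply omits the field, so a client can tell "unknown" apart from a real value.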
- Add contextmap.go: local override map for context windows (upstream API doesn't expose them). Single source of truth used by both dynamic and static paths.
- Add modelfetch.go: ModelCache with thread-safe access. Fetches from https://api.commandcode.ai/provider/v1/models on startup and every 6 hours. Graceful fallback to static list when upstream is unreachable.
- Update proxy.go: HandleModels now reads from cache and kicks off a background refresh if the cache is stale. Extract getStaticModels() as fallback. Inject Command Code-internal models not in upstream response: taste-1, claude-haiku-4-5 (bare name), claude-opus-4-6, MiniMax-M3-Promo.
- Update .gitignore: ignore locally-built Linux ELF (command-code-proxy-updated) and dev-only launcher scripts (run-proxy.bat, run-proxy.vbs) that contain API tokens.
- Rebuild bin/command-code-proxy.exe with -ldflags "-s -w" (stripped symbols, ~30% smaller).

Total models exposed: 36 (matches full Command Code catalog).
- Add internal/proxy/pricing.go: scrapes Command Code's public pricing page
for active deals (multiplier, description, status). Updates every 24 hours.
Fuzzy matching connects deal keys (e.g. 'qwen-3.7-max') to upstream model
IDs (e.g. 'Qwen/Qwen3.7-Max') despite naming differences.
- Update internal/proxy/modelfetch.go: probe each upstream model with a
minimal chat request to verify it exists. Models returning 403
('not recognized') are filtered out. This catches stale entries like
MiniMax-M3-Promo that were injected but don't actually work.
- Add ModelPricing struct to api.OpenAIModel so /v1/models responses include
deal information per model.
- Remove static injections of 'taste-1', 'claude-haiku-4-5' (bare),
'claude-opus-4-6', and 'MiniMax-M3-Promo' — these were either CLI-only
or upstream-inaccessible. The dynamic fetch + probe handles this now.
- Rebuild bin/command-code-proxy.exe with stripped symbols.
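One simple way to connect a deal key such as 'qwen-3.7-max' to an upstream ID such as 'Qwen/Qwen3.7-Max' is to normalize both sides before comparing. The sketch below is an assumption about how such matching might work, not the algorithm in pricing.go; normalize and matchDeal are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize drops any vendor prefix ("Qwen/"), lowercases the rest, and
// keeps only letters and digits, so punctuation differences disappear.
func normalize(s string) string {
	if i := strings.LastIndex(s, "/"); i >= 0 {
		s = s[i+1:]
	}
	var b strings.Builder
	for _, r := range strings.ToLower(s) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// matchDeal returns the first model ID whose normalized form equals the
// normalized deal key.
func matchDeal(dealKey string, modelIDs []string) (string, bool) {
	want := normalize(dealKey)
	for _, id := range modelIDs {
		if normalize(id) == want {
			return id, true
		}
	}
	return "", false
}

func main() {
	ids := []string{"deepseek/deepseek-v4-pro", "Qwen/Qwen3.7-Max"}
	id, ok := matchDeal("qwen-3.7-max", ids)
	fmt.Println(id, ok)
}
```

Equality after normalization is stricter than substring or edit-distance matching, which lowers the risk of attaching a deal to the wrong model at the cost of missing keys that differ by more than punctuation and case.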
Tomorrow-proof: the model list refreshes every 6h and deals every 24h. If
upstream changes its HTML structure, the lenient parser logs a warning
instead of failing.
Summary
This PR makes the Command Code proxy self-maintaining: it discovers models and pricing dynamically, so changes upstream don't require code updates.
What's new
- Dynamic model fetching: On startup and every 6 hours, the proxy fetches the live model catalog from https://api.commandcode.ai/provider/v1/models. Replaces the original 12-model hardcoded list with the real, up-to-date catalog.
- Model probing: Each upstream model is probed with a minimal chat request to verify it actually exists. Models returning 403 ("not recognized") are filtered out. This catches stale entries that upstream doesn't actually serve.
- Pricing/deal scraping: The public pricing page (https://commandcode.ai/docs/resources/pricing-limits) is scraped every 24 hours for active deals. Each model can carry a pricing field showing the deal multiplier, description, and status (permanent / expiration date).
- Context window metadata: Each model reports its context_length (131K to 1M tokens) so clients like Hermes can determine compression thresholds.

Tomorrow-proof design:
Files changed
- internal/proxy/contextmap.go (new): local override map for context windows
- internal/proxy/modelfetch.go (rewritten): dynamic fetch + probe + pricing integration
- internal/proxy/pricing.go (new): pricing page scraper + fuzzy model-deal matching
- internal/proxy/proxy.go: HandleModels now reads from cache with fallback
- internal/api/openai.go: OpenAIModel includes context_length and pricing fields
- internal/api/commandcode.go: added reasoning field for effort passthrough
- internal/proxy/model.go: added short aliases (sonnet, opus, haiku, glm-5.2, etc.)
- .gitignore: ignore local-only scripts with API tokens
- bin/command-code-proxy.exe: rebuilt Windows binary

Removed
- taste-1 (CLI-only, not on Provider API)
- MiniMax-M3-Promo (not served upstream)

Total models exposed
33 models matching the full Command Code catalog, with up to 6 active deals visible in /v1/models responses.

Example response
```json
{
  "id": "deepseek/deepseek-v4-pro",
  "object": "model",
  "owned_by": "deepseek",
  "context_length": 1000000,
  "pricing": {
    "multiplier": "4x",
    "description": "4× usage. Every dollar of credit goes 4× further on this model.",
    "status": "permanent"
  }
}
```

Test plan
- /v1/models returns all models with context_length and pricing
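The test-plan item could be automated with a small check like the one below. It is a sketch under assumptions: the struct fields mirror the example response, the {"object":"list","data":[...]} envelope is the usual OpenAI list shape and is assumed for this proxy, checkModels is a hypothetical helper, and the second model in the sample body is invented to show a failure. Because pricing is attached only to models with an active deal, the check requires context_length everywhere and merely counts pricing entries.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ModelPricing mirrors the pricing object in the example response.
type ModelPricing struct {
	Multiplier  string `json:"multiplier"`
	Description string `json:"description"`
	Status      string `json:"status"`
}

// OpenAIModel mirrors one model entry in the example response.
type OpenAIModel struct {
	ID            string        `json:"id"`
	Object        string        `json:"object"`
	OwnedBy       string        `json:"owned_by"`
	ContextLength int           `json:"context_length"`
	Pricing       *ModelPricing `json:"pricing,omitempty"`
}

// checkModels decodes a /v1/models body and returns the IDs of models that
// lack a positive context_length, plus the number of models carrying a deal.
func checkModels(body []byte) (missing []string, deals int, err error) {
	var list struct {
		Data []OpenAIModel `json:"data"`
	}
	if err = json.Unmarshal(body, &list); err != nil {
		return nil, 0, err
	}
	for _, m := range list.Data {
		if m.ContextLength <= 0 {
			missing = append(missing, m.ID)
		}
		if m.Pricing != nil {
			deals++
		}
	}
	return missing, deals, nil
}

func main() {
	body := []byte(`{"object":"list","data":[
		{"id":"deepseek/deepseek-v4-pro","object":"model","owned_by":"deepseek",
		 "context_length":1000000,
		 "pricing":{"multiplier":"4x","description":"4× usage.","status":"permanent"}},
		{"id":"example/no-context","object":"model","owned_by":"example"}
	]}`)

	missing, deals, err := checkModels(body)
	if err != nil {
		panic(err)
	}
	fmt.Println("missing context_length:", missing)
	fmt.Println("models with deals:", deals)
}
```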