🤖 feat: add Fast/Slow service-tier control to chat input #3476
ammar-agent wants to merge 5 commits into
Add a lightning-bolt service-tier control to the chat input (right of the thinking slider) for models that support service_tier (OpenAI/GPT today). The bolt glows orange for Fast (priority), turns blue for Slow (flex), and is neutral grey for Auto/default. Clicking opens a per-chat override menu. Also add /fast and /slow one-shot slash commands that reuse the existing model-oneshot send path, with TipsCarousel discovery. Uses Fast/Slow wording to stay provider-agnostic for future models.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 23d1497908
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Addresses Codex P2: non-passthrough gateway-routed OpenAI models (e.g. openrouter:openai/gpt-5) canonicalize to 'openai' but the backend drops providerOptions.openai.serviceTier on those routes. supportsServiceTier now only returns true for direct openai:* models and passthrough gateways (mux-gateway), so the UI never advertises a no-op Fast/Slow override.
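The gating described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in serviceTier.ts: the passthrough-gateway set, the `gateway:openai/<model>` string shape for mux-gateway, and the parsing are all assumptions made for the example.

```typescript
// Hypothetical sketch of the route-aware gate. Only the function name and the
// "direct openai:* or passthrough gateway" rule come from the PR; the set
// contents and the string parsing below are assumed for illustration.
const PASSTHROUGH_GATEWAYS: ReadonlySet<string> = new Set(["mux-gateway"]);

function supportsServiceTier(model: string): boolean {
  const sep = model.indexOf(":");
  if (sep === -1) return false;

  const provider = model.slice(0, sep);
  const rest = model.slice(sep + 1);

  // Direct OpenAI route: the backend forwards providerOptions.openai.serviceTier.
  if (provider === "openai") return rest.length > 0;

  // Passthrough gateway (assumed shape "gateway:openai/<model>"): the field survives.
  if (PASSTHROUGH_GATEWAYS.has(provider)) return rest.startsWith("openai/");

  // Any other gateway (e.g. openrouter) drops the field, so do not advertise it.
  return false;
}
```

With a gate like this, the picker stays hidden for openrouter:openai/gpt-5 even though the model canonicalizes to 'openai'.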
@codex review Addressed the P2 (commit 8c7fdf9):
Non-passthrough gateways like
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8c7fdf93cb
Addresses Codex P2: syncCreationPreferences() copied model/agent/thinking from the project scope to the new workspace but not the service-tier override, so a Fast/Slow choice made during creation reverted to Auto after the workspace opened. Now copies getServiceTierKey(projectScope) -> getServiceTierKey(workspace), mirroring thinking level. Extends the creation success test to assert the sync.
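A rough sketch of the copy step this commit adds. Only the name getServiceTierKey comes from the PR; the key format, the `KeyValueStore` interface, the in-memory store, and the helper name `syncServiceTierPreference` are hypothetical stand-ins so the example runs outside a browser.

```typescript
// Minimal storage interface (assumption); the PR persists to localStorage.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in for localStorage, used only for this sketch.
class MemoryStore implements KeyValueStore {
  private readonly data = new Map<string, string>();
  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// Assumed key format; the real helper may build the key differently.
function getServiceTierKey(scopeId: string): string {
  return `serviceTier:${scopeId}`;
}

// Copy a tier chosen during creation (project scope) into the new workspace
// scope. If nothing was chosen, the workspace key stays unset (Auto).
function syncServiceTierPreference(
  store: KeyValueStore,
  projectScopeId: string,
  workspaceId: string,
): void {
  const chosen = store.getItem(getServiceTierKey(projectScopeId));
  if (chosen !== null) {
    store.setItem(getServiceTierKey(workspaceId), chosen);
  }
}
```

Copying the raw stored string avoids depending on how the persisted value is serialized.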
@codex review Addressed the creation-flow P2 (commit 615aa48):
Codex Review: Didn't find any major issues. Bravo.
The /fast and /slow slash commands were standalone service-tier-only one-shots that could not compose with the existing /<model>[+thinking] one-shot syntax. Remove them entirely (parser, suggestions, tips, the ParsedCommand serviceTier field, the index.tsx one-shot guard/merge, and the command-key helpers) while keeping the bolt UI control and the per-chat service-tier override plumbing.
@codex review Removed the
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6f75162084
Re-merge the persisted Fast/Slow tier against the effective (one-shot) model at send time. useSendMessageOptions bakes the tier against the saved model, so a /<model> one-shot to an OpenAI model previously dropped a tier set while the saved model was non-OpenAI. withServiceTierOverride is now authoritative: it attaches the tier for supported models and strips any stale tier when the effective model can't honor it.
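The authoritative merge described here might look roughly like the sketch below. The Fast → priority and Slow → flex mapping and the (providerOptions, tier, model) parameter order come from the PR text; the `ProviderOptions` type, the "auto" sentinel, and the simplified `supportsServiceTier` (direct openai: models only) are assumptions for illustration.

```typescript
type ServiceTierSpeed = "fast" | "slow" | "auto";
type ProviderOptions = Record<string, Record<string, unknown> | undefined>;

// Fast/Slow to wire values, as stated in the PR description.
const SPEED_TO_WIRE: Record<Exclude<ServiceTierSpeed, "auto">, string> = {
  fast: "priority",
  slow: "flex",
};

// Simplified gate for this sketch (assumption): direct OpenAI models only.
function supportsServiceTier(model: string): boolean {
  return model.startsWith("openai:");
}

function withServiceTierOverride(
  providerOptions: ProviderOptions,
  tier: ServiceTierSpeed,
  model: string,
): ProviderOptions {
  const openai = providerOptions.openai;

  if (!supportsServiceTier(model)) {
    // Effective model cannot honor a tier: strip any stale value.
    if (openai === undefined || !("serviceTier" in openai)) return providerOptions;
    const rest: Record<string, unknown> = { ...openai };
    delete rest.serviceTier;
    return { ...providerOptions, openai: rest };
  }

  // Auto means no override; the provider/global default applies.
  if (tier === "auto") return providerOptions;

  // Attach the tier on fresh objects so the caller's input is never mutated.
  return {
    ...providerOptions,
    openai: { ...openai, serviceTier: SPEED_TO_WIRE[tier] },
  };
}
```

Calling this with the effective (one-shot) model rather than the saved model is what lets a tier set under a non-OpenAI saved model still apply to a /<model> one-shot that targets OpenAI.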
@codex review Addressed the composition gap. The per-chat Fast/Slow tier is now re-merged against the effective (one-shot) model at send time in
@codex review Friendly re-ping: the previous P2 (per-chat Fast/Slow tier not composing with
Codex Review: Didn't find any major issues. Swish!
Summary
Adds a service-tier (speed) control to the chat input: a lightning bolt to the right of the thinking slider that appears only for models supporting service_tier (OpenAI/GPT today). The bolt glows orange for Fast, turns blue for Slow, and is neutral grey for Auto/default. Clicking it opens a small menu to set a chat-specific override.
The UI deliberately says Fast/Slow (not provider wire values) so it generalizes to future providers, even though only OpenAI GPT is supported for now.
Background
service_tier already existed end-to-end as a global, per-provider config setting (config.openai.serviceTier), threaded through providerOptions.openai.serviceTier. There was no per-chat override and no way to set it from the chat input. This adds that affordance.
Mapping (single source of truth in serviceTier.ts): Fast → priority (low latency), Slow → flex (cheaper, higher latency), Auto → no override (provider/global default applies).
Implementation
- src/common/utils/ai/serviceTier.ts: supportsServiceTier(model) (OpenAI-gated, and only for direct/passthrough routes since non-passthrough gateways drop the field), Fast/Slow ↔ wire mappings, getServiceTierSpeed, and withServiceTierOverride(providerOptions, tier, model), the one place every send path merges the tier (returns options unchanged for no-override or unsupported models, never mutating input).
- useServiceTier(scopeId) backed by usePersistedState (localStorage, keyed by workspace/project scope, cross-component synced). Like the existing Anthropic 1M-context toggle, it is intentionally not persisted to workspace metadata; it rides along per request via providerOptions.
- ServiceTierPicker (lightning bolt + keyboard-navigable menu, conditionally rendered for happy-dom testability). Colors use new --color-service-tier-fast / -slow tokens in globals.css. Returns null (occupies no layout) for unsupported models.
- useSendMessageOptions and the non-React getSendOptionsFromStorage both apply the override via the shared helper. The creation flow carries a tier chosen pre-workspace into the new workspace scope.
Validation
- serviceTier helpers (mapping/gating/merge/no-mutation, including passthrough-gateway routing).
- ServiceTierPicker gating, open, Fast/Slow/Auto selection + persistence (5 cases).
- make static-check green; targeted suites pass.
Risks
Low. The change is additive and OpenAI-gated. The shared withServiceTierOverride never attaches a tier for unsupported models, so non-OpenAI requests are unaffected. The bolt only renders for supporting models, so existing layouts/snapshots for non-OpenAI models are unchanged; ChatInput stories using a GPT model will now show the bolt (expected).
Generated with mux • Model: anthropic:claude-opus-4-8 • Thinking: xhigh • Cost: $41.61