fix(reasoning): don't persist request-scoped reasoning_effort as an operator disable (#10622) #10623

Open
Anai-Guo wants to merge 1 commit into mudler:master from Anai-Guo:fix/reasoning-effort-none-persist

@Anai-Guo Anai-Guo commented Jul 1, 2026

What

Fixes #10622.

A model that sets reasoning_effort: none (or any effort default) in its YAML without an explicit reasoning.disable loses the ability to enable thinking on a per-request basis after the first request.

Root cause

  1. On a request, ApplyReasoningEffort resolves the effective effort (config default none when the request omits it) and sets ReasoningConfig.DisableReasoning = true on the request-scoped config copy (core/config/model_config.go).
  2. After the backend loads, the thinking/media-marker probe runs and calls UpdateModelConfig, which copied c.ReasoningConfig.DisableReasoning back into the loader's persistent config (core/backend/llm.go).
  3. DetectThinkingSupportFromBackend only fills reasoning slots that are still nil, so the probe never actually produced that value — it was the request-time none default. But it is now persisted as if the operator had explicitly set reasoning.disable: true.
  4. On subsequent requests, ApplyReasoningEffort sees the (now non-nil) persisted disable and treats it as an operator's explicit disable, so a request-level reasoning_effort can no longer re-enable thinking.
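The four steps above can be replayed in a small, self-contained Go sketch. It is a simplified model, not the LocalAI code: the `modelConfig` and `reasoningConfig` types, the `applyEffort` and `thinks` helpers, and the exact resolution rules are assumptions made for illustration. Only the shape of the bug (a request-scoped value written back unconditionally) is taken from the description.

```go
package main

import "fmt"

// reasoningConfig is a simplified stand-in for the real ReasoningConfig.
// A nil DisableReasoning means neither the operator nor the probe set it.
type reasoningConfig struct {
	DisableReasoning *bool
}

// modelConfig is a hypothetical, trimmed-down model config.
type modelConfig struct {
	DefaultEffort string // e.g. "none" from the model YAML
	Reasoning     reasoningConfig
}

// applyEffort mimics the described request-time resolution and returns a
// request-scoped copy. The exact rules are an assumption for illustration.
func applyEffort(persisted modelConfig, requestEffort string) modelConfig {
	c := persisted
	if d := c.Reasoning.DisableReasoning; d != nil && *d {
		return c // a non-nil disable is treated as an explicit operator choice
	}
	effort := requestEffort
	if effort == "" {
		effort = c.DefaultEffort
	}
	if effort == "none" {
		disabled := true
		c.Reasoning.DisableReasoning = &disabled
	}
	return c
}

// thinks reports whether thinking is enabled for a request-scoped config.
func thinks(c modelConfig) bool {
	d := c.Reasoning.DisableReasoning
	return d == nil || !*d
}

// runTwoRequests replays the sequence with the old, unconditional write-back.
func runTwoRequests() (first, second bool) {
	persisted := modelConfig{DefaultEffort: "none"}

	req1 := applyEffort(persisted, "") // request omits reasoning_effort
	// Buggy write-back: the request-scoped value leaks into the persistent config.
	persisted.Reasoning.DisableReasoning = req1.Reasoning.DisableReasoning

	req2 := applyEffort(persisted, "high") // client asks to think
	return thinks(req1), thinks(req2)
}

func main() {
	first, second := runTwoRequests()
	fmt.Println("first request thinks:", first)
	fmt.Println("second request thinks:", second)
}
```

In this model the second request reports no thinking even though it asked for `high`, because the leaked value is indistinguishable from an operator-set disable.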

Fix

Snapshot which reasoning slots were still nil before the probe. Only persist a slot if the probe was actually allowed to fill it (i.e. it was nil). This keeps the probe's genuine backend detection (and the media marker) persisted, while request-time reasoning_effort values never leak into the persistent config.
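A minimal sketch of the snapshot-then-persist idea, assuming a single reasoning slot. The `probe` and `probeAndPersist` functions and both types are hypothetical stand-ins; the real change lives around `UpdateModelConfig` and `DetectThinkingSupportFromBackend` and may cover more slots. The media marker is left out here, and the way the probe maps backend support onto the slot is an assumption.

```go
package main

import "fmt"

// reasoningConfig is a simplified stand-in for the real ReasoningConfig.
type reasoningConfig struct {
	DisableReasoning *bool // nil means "not set yet"
}

// modelConfig is a hypothetical, trimmed-down model config.
type modelConfig struct {
	Reasoning reasoningConfig
}

// probe stands in for backend detection. Like the probe described above,
// it only fills a slot that is still nil.
func probe(c *modelConfig, backendSupportsThinking bool) {
	if c.Reasoning.DisableReasoning == nil {
		disabled := !backendSupportsThinking // assumed mapping, for illustration
		c.Reasoning.DisableReasoning = &disabled
	}
}

// probeAndPersist snapshots whether the slot was nil before the probe and
// writes it back to the persistent config only in that case.
func probeAndPersist(persisted, req *modelConfig, backendSupportsThinking bool) {
	disableWasNil := req.Reasoning.DisableReasoning == nil
	probe(req, backendSupportsThinking)
	if disableWasNil {
		// The value can only have come from the probe, so it is safe to keep.
		persisted.Reasoning.DisableReasoning = req.Reasoning.DisableReasoning
	}
}

func main() {
	// Case A: the request-scoped copy already carries a disable set at request time.
	disabled := true
	persistedA := &modelConfig{}
	reqA := &modelConfig{Reasoning: reasoningConfig{DisableReasoning: &disabled}}
	probeAndPersist(persistedA, reqA, true)
	fmt.Println("request-time value persisted:", persistedA.Reasoning.DisableReasoning != nil)

	// Case B: the slot is nil, so the probe's own detection is persisted.
	persistedB := &modelConfig{}
	reqB := &modelConfig{}
	probeAndPersist(persistedB, reqB, false)
	fmt.Println("probe value persisted:", persistedB.Reasoning.DisableReasoning != nil)
}
```

Taking the snapshot before the probe runs is what makes the distinction possible: after the probe, a request-time value and a detected value look identical.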

Result: reasoning_effort: none remains the per-request default, but clients can still request extra thinking via the reasoning_effort request param — exactly the expected behavior from the issue. An operator's explicit reasoning.disable is unaffected (it starts non-nil, so it is preserved and still wins).
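The intended precedence can be written as a small truth function. This is a sketch under stated assumptions, not the project's code: `thinkingEnabled` is a hypothetical helper, the effort strings other than `none` are examples, and treating an empty effort as "thinking allowed" is an assumption.

```go
package main

import "fmt"

// thinkingEnabled models the expected precedence:
// explicit operator disable > request-level effort > model default effort.
func thinkingEnabled(operatorDisable *bool, defaultEffort, requestEffort string) bool {
	if operatorDisable != nil && *operatorDisable {
		return false // an explicit reasoning.disable always wins
	}
	effort := requestEffort
	if effort == "" {
		effort = defaultEffort // fall back to the YAML default
	}
	return effort != "none"
}

func main() {
	explicit := true
	cases := []struct {
		name            string
		operatorDisable *bool
		defaultEffort   string
		requestEffort   string
	}{
		{"default none, request omits effort", nil, "none", ""},
		{"default none, request asks for high", nil, "none", "high"},
		{"operator disable, request asks for high", &explicit, "none", "high"},
	}
	for _, c := range cases {
		fmt.Printf("%s: %v\n", c.name, thinkingEnabled(c.operatorDisable, c.defaultEffort, c.requestEffort))
	}
}
```

With the fix, the persistent config keeps its nil slot across requests, so the first two cases stay independent of request order.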

Notes

Minimal change: behavior is unchanged for the explicit-disable and no-effort paths. No new probe calls are added; the gRPC detection still runs exactly once, outside the loader lock, as before.

🤖 Generated with Claude Code

…del config

When a model sets `reasoning_effort: none` (or any default) in its YAML
without an explicit `reasoning.disable`, ApplyReasoningEffort resolves that
default at request time and sets ReasoningConfig.DisableReasoning on the
request-scoped config copy. The post-load thinking/marker probe then wrote
that request-scoped value back into the loader's persistent config via
UpdateModelConfig, making it look as though the operator had explicitly set
reasoning.disable=true. From then on, per-request `reasoning_effort` overrides
were silently ignored (an explicit operator disable wins over a request
asking to think).

DetectThinkingSupportFromBackend only fills reasoning slots that are still
nil, so a slot already set here came from ApplyReasoningEffort, not the probe.
Snapshot which slots were nil before the probe and only persist those, so the
probe's genuine backend detection is still saved while request-time reasoning
effort never leaks into the persistent config.

Fixes mudler#10622

Signed-off-by: Tai An <antai12232931@outlook.com>
@Anai-Guo Anai-Guo force-pushed the fix/reasoning-effort-none-persist branch from 106efda to 30fc379 Compare July 1, 2026 07:08
Development

Successfully merging this pull request may close these issues.

Model default reasoning_effort: none does not work