Skip to content

[Bugfix] Default Qwen3 reasoning parser to prompt-has-open-think#52

Open
rivetphilbot wants to merge 1 commit into
1CatAI:mainfrom
rivetphilbot:fix/qwen3-default-prompt-has-open-think
Open

[Bugfix] Default Qwen3 reasoning parser to prompt-has-open-think#52
rivetphilbot wants to merge 1 commit into
1CatAI:mainfrom
rivetphilbot:fix/qwen3-default-prompt-has-open-think

Conversation

@rivetphilbot
Copy link
Copy Markdown

Summary

The standard Qwen3 chat template injects <think>\n into the assistant turn opener whenever enable_thinking is not explicitly False. That means completion tokens only contain </think> followed by the answer — never an opening <think>.

The current parser default (prompt_has_open_think=False) forces every client to pass chat_template_kwargs={"enable_thinking": True} on each request. When they don't (the common case for OpenAI-compatible clients), the parser sees no opening <think>, falls through to the both-tags-required branch, and dumps the entire completion — reasoning prose, the close tag, and the answer — into content with reasoning left null.

Flip the default to True so the parser matches the chat template's actual behavior out of the box. An explicit enable_thinking=False still works correctly: in that case the template emits a closed <think></think> pair into the prompt, so neither tag appears in the completion and the existing no-prompt-open-think branch is the right code path.

Reproduction

Live on V100 TP=2 with a Qwen3-derived model (deckard-40b-w4a16-hermes), serving config:

--reasoning-parser qwen3 --tool-call-parser qwen3_coder --enable-auto-tool-choice

Before (no chat_template_kwargs):

reasoning: null
content: "\n\n # Train Meeting Problem\n## Step 1: ...\n## Final Answer\nThey meet ..."

The CoT framing leaks into content because the parser couldn't find an opening <think>.

With explicit chat_template_kwargs={"enable_thinking": true} (workaround):

reasoning: "This is a classic relative speed problem. Let me work through it..."
content: "\n\n # Train Meeting Problem\n..."

After this patch (no chat_template_kwargs):

reasoning: "This is a classic relative speed problem. Let me work through it..."  (1377 chars)
content: "\n\n # Train Meeting Problem\n..."  (clean)

Test plan

  • Live A/B on V100 TP=2, deckard-40b-w4a16-hermes, before/with-kwargs/after — see reproduction above
  • enable_thinking=False path: still hits the no-prompt-open-think branch (template emits closed <think></think>, neither tag in completion)
  • Streaming path verification (uses same flag)

Co-Authored-By: RivetOS Claude noreply@rivetos.dev

The standard Qwen3 chat template injects <think>\n into the assistant
turn opener whenever enable_thinking is not explicitly False. That means
completion tokens only contain </think> followed by the answer, never an
opening <think>.

The current parser default (prompt_has_open_think=False) means clients
must explicitly pass chat_template_kwargs={"enable_thinking": True} in
every request, otherwise the parser sees no opening <think>, falls
through to the fully-tagged branch, and dumps the entire completion
(reasoning + close-tag + answer) into the content field with reasoning
left null.

Flip the default to True so the parser matches the template's actual
behavior out of the box. An explicit enable_thinking=False still
correctly disables open-think handling: the template injects a closed
<think></think> pair in that case, so neither token appears in the
completion and the no-prompt-open-think branch is the right code path.

Verified live on V100 TP=2 with Qwen3-based model: without the fix,
reasoning=null and 700+ chars of CoT leaked into content; with the fix,
reasoning correctly contains the CoT and content is the clean answer,
all without any client-side chat_template_kwargs.

Co-Authored-By: RivetOS Claude <noreply@rivetos.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant