Name and Version:
Current master (983df14), regression introduced in 566059a
$ llama-cli --version
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 126976 MiB):
Device 0: Radeon 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, VRAM: 126976 MiB (49626 MiB free)
version: 8324 (983df14)
built with GNU 15.2.0 for Linux x86_64
Operating systems:
All? (template logic, not platform-specific)
GGML backends:
All? (not backend-specific). I'm using HIP/ROCm 7.2.
Hardware:
AMD Ryzen AI Max+ 395, Radeon 8060S (gfx1151), 128GB Unified Memory
Models:
gpt-oss-120b (any gpt-oss model using
openai-gpt-oss-120b.jinja)
Problem description:
PR #19704 (39e4b1d) fixed a Jinja template crash by adding
adjusted_message.erase("content") in
common_chat_params_init_gpt_oss(). This fix was lost when #18675
(566059a, "Autoparser refactoring") rewrote the function without
carrying over the erase call.
Additionally, the Anthropic Messages API path
(convert_anthropic_to_oai() in server-common.cpp) was never fixed
— it sets content = "" on assistant messages with
reasoning_content + tool_calls, triggering the same crash.
Reproducer:
Send a multi-turn /v1/messages request to llama-server
running a gpt-oss model, where assistant history contains
thinking + tool_use blocks.
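For illustration, here is a minimal Python sketch (a simplification, not the actual `convert_anthropic_to_oai()` code; the block shapes follow the Anthropic Messages API) of the kind of assistant-history turn that triggers the crash, and how the current conversion ends up with `content = ""` alongside `reasoning_content` and `tool_calls` — the combination the gpt-oss template rejects:

```python
# Hypothetical, simplified stand-in for convert_anthropic_to_oai()
# (assumption: not the real server code, just the behavior described above).

def convert_assistant_turn(blocks):
    msg = {"role": "assistant"}
    text_parts = []
    tool_calls = []
    for block in blocks:
        if block["type"] == "text":
            text_parts.append(block["text"])
        elif block["type"] == "thinking":
            msg["reasoning_content"] = block["thinking"]
        elif block["type"] == "tool_use":
            tool_calls.append({
                "id": block["id"],
                "type": "function",
                "function": {"name": block["name"], "arguments": "{}"},
            })
    if tool_calls:
        msg["tool_calls"] = tool_calls
    # Problematic part: even with no text blocks, content is set to "",
    # which the gpt-oss Jinja template still treats as "content present"
    # and then rejects together with thinking + tool_calls.
    msg["content"] = "".join(text_parts)
    return msg

# An assistant turn with thinking + tool_use but no text block:
turn = [
    {"type": "thinking", "thinking": "need to call the tool"},
    {"type": "tool_use", "id": "toolu_1", "name": "get_weather"},
]
msg = convert_assistant_turn(turn)
# msg now carries content == "" plus reasoning_content plus tool_calls
```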
First Bad Commit: 566059a (Autoparser - complete refactoring of
parser architecture #18675)
Relevant log output:
Cannot pass both content and thinking in an assistant message with
tool calls! Put the analysis message in one or the other, but not
both.
Both issues were patched locally and verified working with Claude Code CLI and OpenCode agentic workflows.
Edit: here is a quick summary of my local changes.
**common/chat.cpp line 943 — add `adjusted_message.erase("content");`:**

```cpp
if (has_reasoning_content && has_tool_calls) {
    auto adjusted_message = msg;
    adjusted_message["thinking"] = msg.at("reasoning_content");
    adjusted_message.erase("content"); // <- added
    adjusted_messages.push_back(adjusted_message);
    [...]
```
**tools/server/server-common.cpp ~line 1553 — skip setting content when tool_calls + reasoning_content:**

```cpp
if (!converted_content.empty()) {
    new_msg["content"] = converted_content;
} else if (has_tool_calls && !reasoning_content.empty()) {
    // don't set content - gpt-oss rejects content + thinking with tool_calls
} else if (has_tool_calls || !reasoning_content.empty()) {
    new_msg["content"] = "";
}
```
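The same branch logic as a small Python sketch (illustrative only, not the actual server code; variable names mirror the C++ above), showing that the `content` key is now omitted entirely when `tool_calls` and `reasoning_content` are both present, instead of being set to `""`:

```python
# Simplified model of the patched branch in convert_anthropic_to_oai()
# (assumption: a sketch of the described fix, not the real implementation).

def set_content(new_msg, converted_content, has_tool_calls, reasoning_content):
    if converted_content:
        new_msg["content"] = converted_content
    elif has_tool_calls and reasoning_content:
        # don't set content - gpt-oss rejects content + thinking with tool_calls
        pass
    elif has_tool_calls or reasoning_content:
        new_msg["content"] = ""
    return new_msg

# thinking + tool_calls and no text: content key is left out entirely
assert "content" not in set_content({}, "", True, "some reasoning")
# tool_calls without thinking: content stays an empty string, as before
assert set_content({}, "", True, "")["content"] == ""
```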