
Eval bug: reasoning off gives reasoning medium for gpt-oss #20458

@thomas-0816

Description


Name and Version

./llama-cli --version
load_backend: loaded RPC backend from /home/tb/code/llama-b8287/libggml-rpc.so
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon 780M Graphics (RADV PHOENIX) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
load_backend: loaded Vulkan backend from /home/tb/code/llama-b8287/libggml-vulkan.so
load_backend: loaded CPU backend from /home/tb/code/llama-b8287/libggml-cpu-zen4.so
version: 8287 (acb7c7906)
built with GNU 11.4.0 for Linux x86_64

Using --reasoning off for gpt-oss gives "Reasoning: medium"; shouldn't it give "Reasoning: low"?

Using --reasoning-budget 0 for gpt-oss also gives "Reasoning: medium"; shouldn't it give "Reasoning: low"?

Using --chat-template-kwargs '{"reasoning_effort": "low"}' for gpt-oss gives "Reasoning: low" (ok).

./llama-server -hf unsloth/gpt-oss-20b-GGUF:F16 --threads -1 --parallel 1 --ctx-size 16384 --temp 1.0 --min-p 0.0 --top-p 1.0 --top-k 0 --n_predict 4096 --reasoning off --direct-io 2>&1 | grep -i reasoning
Reasoning: medium
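For reference, a minimal workaround sketch until the flags are fixed: pass the effort level through the chat-template kwargs directly, which is the invocation that reported "Reasoning: low" above (trimmed here to the relevant flags; model reference and output redirection are taken from the command above):

```shell
# Workaround sketch: --reasoning off / --reasoning-budget 0 currently leave
# the gpt-oss template at "medium", so set the kwarg explicitly instead.
./llama-server -hf unsloth/gpt-oss-20b-GGUF:F16 \
  --ctx-size 16384 \
  --chat-template-kwargs '{"reasoning_effort": "low"}' \
  2>&1 | grep -i reasoning
```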

Possibly related to #20297. cc @pwilkin

Operating systems

Linux

GGML backends

Vulkan

Hardware

AMD Ryzen 7 PRO 7840U

Models

No response

Problem description & steps to reproduce

Run llama-server with "--reasoning off" (or "--reasoning-budget 0") and the model "unsloth/gpt-oss-20b-GGUF:F16"; the startup log reports "Reasoning: medium".

First Bad Commit

No response

Relevant log output

