Eval bug: SYCL: Qwen3.5 spitting garbage on the second prompt #21589

@WizardlyBump17

Description

Name and Version

root@cb8f619a68ce:/app# ./llama-cli --version
load_backend: loaded SYCL backend from /app/libggml-sycl.so
load_backend: loaded CPU backend from /app/libggml-cpu-haswell.so
version: 8699 (4eb19514d)
built with IntelLLVM 2025.3.2 for Linux x86_64

Operating systems

Linux

GGML backends

SYCL

Hardware

Ryzen 7 5700X3D + B580

Models

Any Qwen3.5 quantization other than F16 or BF16

Problem description & steps to reproduce

Send two prompts in the same session; the first is answered normally, but the second is answered with garbage tokens.

First Bad Commit

No response

Relevant log output

Logs
root@cb8f619a68ce:/app# ./llama-cli --model /models/Qwen3.5-35B-A3B-Q4_K_M.gguf
load_backend: loaded SYCL backend from /app/libggml-sycl.so
load_backend: loaded CPU backend from /app/libggml-cpu-haswell.so

Loading model...  


▄▄ ▄▄
██ ██
██ ██  ▀▀█▄ ███▄███▄  ▀▀█▄    ▄████ ████▄ ████▄
██ ██ ▄█▀██ ██ ██ ██ ▄█▀██    ██    ██ ██ ██ ██
██ ██ ▀█▄██ ██ ██ ██ ▀█▄██ ██ ▀████ ████▀ ████▀
                                    ██    ██
                                    ▀▀    ▀▀

build      : b8699-4eb19514d
model      : Qwen3.5-35B-A3B-Q4_K_M.gguf
modalities : text

available commands:
  /exit or Ctrl+C     stop or exit
  /regen              regenerate the last response
  /clear              clear the chat history
  /read <file>        add a text file
  /glob <pattern>     add text files using globbing pattern


> hi

[Start thinking]
Thinking Process:

1.  **Analyze the Request:**
    *   Input: "hi"
    *   Intent: Greeting.
    *   Context: Start of a conversation.
    *   Tone: Friendly, helpful, casual.

2.  **Determine the appropriate response:**
    *   Acknowledge the greeting.
    *   Offer assistance.
    *   Keep it concise but warm.

3.  **Draft potential responses:**
    *   "Hello! How can I help you today?"
    *   "Hi there! What's on your mind?"
    *   "Hello! Nice to meet you. How are you doing?"
    *   "Hi! 👋 Need any help with something specific?"

4.  **Select the best option:** A combination of friendly greeting and offer of help is standard and effective. "Hello! How can I help you today?" is clear and professional yet friendly. Adding an emoji can make it warmer.

5.  **Final Polish:** "Hello! 👋 How can I help you today?" or "Hi there! How's it going? What can I do for you?" Let's go with a simple, welcoming response.

6.  **Final Output Generation:** (Matches the selected draft)
    "Hello! 👋 How can I help you today?" or similar.

    *Self-Correction during thought process:* Keep it natural. Don't overthink a simple "hi".

    *Final Decision:* "Hello! 👋 How can I help you today?" or "Hi there! How's it going?" Let's use a friendly greeting + offer help.

    "Hello! 👋 How can I assist you today?" (Professional yet friendly)
    "Hi! How's it going? Anything I can help you with?" (More casual)

    Let's go with: "Hello! 👋 How can I help you today?" (Balanced)

    *Wait, checking system instructions:* I am an AI assistant. I should be helpful.

    *Revised Plan:* Just say hi back and ask how to help.

    *Final Output:* "Hello! 👋 How can I help you today?" or "Hi there! What can I do for you?"

    Let's pick: "Hello! 👋 How can I help you today?" (Simple and effective).

    *Wait, one more consideration:* Sometimes users just want to chat. So "How are you?" is also good.
    "Hi there! How's it going? Is there something specific you'd like to talk about or work on?"

    Let's keep it open-ended but inviting.

    "Hello! 👋 How's it going? Is there anything I can help you with today?"

    Actually, brevity is often better for a "hi".
    "Hi there! 👋 How can I help you today?"

    Okay, ready to output.cw

[End thinking]

Hello! 👋 How's it going? Is there anything I can help you with today?

[ Prompt: 48.2 t/s | Generation: 28.1 t/s ]

> hi

[Start thinking]
-s55=G,G9+125H89-*C"2@'8+=H53CAG%&4H8E<>6'<<.%F."4-<"-B)G1E8/'8+%D*@/2!F7>*C1DD2,(4=)CF;*E;D<CA6AHGA;F(HH<+&$&0GF+4H@'(C0>!%G55:,$E2GB2$'6=9,:,#"/E3G':+$+BBE+5C7B24.4E+-:6$>/E)F,:D63)F(19>AGB(4:88;C/&$5H$68C+50&56:<:!(":<D2A*<0H1&<G4.1E*0HF!4@A9/(BFG1G>!/!7/A#'-*CB,7:60;H"72+;6';#$$(#C#1:4E-*5'#&E)+E!72B6*,$5.G.6$*-H#F!2=:E.E09=!;#$>ABA,2:2G'HE;</)@=-#=*&@C'5DHHE:!5&/+&+))6$F(@-G,/H7/1;%E>$EB(7>#0+D=-C;@<<E,B=7-8#"G2+9DG($<!<*0.<''@&$'5B76"B-9*>8H@CH0C1+H6>7)E9;<A6&<;8(83(!%@15B,8-FEB87=3CG,#4+8>$$&:7H+91&6!F2%DF8>9+8D-88)-)>*!.*6-$!D#4HC4>&>DH7FA:9:&-186-,9&%C&;='/4E>E!7.C3"5%5<'5CEC=*;"!H"'+6%%/<3/>,>2*7='5<*E+,!G@AD4@5H"B2<C,AGBH9G"(#%1E,$6*

[ Prompt: 51.1 t/s | Generation: 29.1 t/s ]

> 

Exiting...
llama_memory_breakdown_print: | memory breakdown [MiB]                     | total   free     self   model   context   compute    unaccounted |
llama_memory_breakdown_print: |   - SYCL0 (Intel(R) Arc(TM) B580 Graphics) | 12216 = 1023 + (10922 = 10282 +     142 +     497) +         270 |
llama_memory_breakdown_print: |   - Host                                   |                 19830 = 19814 +       0 +      16                |
root@cb8f619a68ce:/app# 
