Name and Version
$ ./llama-server --version
version: 8455 (58c81f7)
built with AppleClang 17.0.0.17000319 for Darwin arm64
Operating systems
Windows, BSD, Mac, Linux
Which llama.cpp modules do you know to be affected?
libllama (core library), llama-server, llama-cli
Command line
# Repro 1: Llama 3.2
llama-server -m Llama-3.2-3B-Instruct-Q4_K_M.gguf --jinja
curl http://localhost:8080/v1/chat/completions -d '{"messages":[{"role":"user","content":"Write a hello world C program. Just the code, no explanation."}],"tools":[{"type":"function","function":{"name":"get_weather","description":"Get weather","parameters":{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}}}],"temperature":0,"max_tokens":200}'
# Repro 2: GPT-OSS
llama-server -m gpt-oss-20b-mxfp4.gguf --jinja
curl http://localhost:8080/v1/chat/completions -d '{"messages":[{"role":"user","content":"Give me a person"}],"response_format":{"type":"json_schema","json_schema":{"name":"p","schema":{"type":"object","properties":{"name":{"type":"string"},"age":{"type":"integer"}},"required":["name","age"]}}},"temperature":0,"max_tokens":200}'
Problem description & steps to reproduce
Since #18675 (autoparser refactoring), common_chat_peg_parse throws std::runtime_error (chat.cpp L1792) when the PEG parser cannot consume the entire model output. The models in the repro commands above produce well-formed, useful output, and the parser successfully extracts it during streaming (partial parse). Yet on the final parse of the same text, result.fail() is true and the server throws.
result.fail() here does not mean the output is malformed. It means the parser stopped consuming input before reaching end-of-input. The content up to that point was already successfully parsed and, in streaming mode, delivered to the client. The throw discards a valid response and returns HTTP 500.
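To make the distinction concrete, here is a toy, self-contained sketch (not llama.cpp's actual parser; the names parse_until_marker and ParseResult are invented for illustration) of how "stopped before end-of-input" differs from "malformed output":

```python
from dataclasses import dataclass

@dataclass
class ParseResult:
    consumed: int      # bytes successfully consumed
    content: str       # extracted, valid content
    total: int         # total input length

    def fail(self) -> bool:
        # "fail" only means we did not reach end-of-input --
        # it says nothing about whether `content` is malformed.
        return self.consumed < self.total

def parse_until_marker(text: str) -> ParseResult:
    # Consume everything up to a trailing token the grammar does not cover.
    pos = text.find("<|extra|>")
    if pos == -1:
        pos = len(text)
    return ParseResult(consumed=pos, content=text[:pos], total=len(text))

r = parse_until_marker('printf("hello");<|extra|>')
print(r.content)   # valid, complete content was extracted
print(r.fail())    # True -- yet only because trailing bytes were left over
```

Throwing whenever fail() is true discards the already-extracted content, which is exactly what happens on the final parse.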
In streaming mode, this is a major server malfunction: SSE chunks are delivered to the client, then the stream terminates without a finish_reason chunk or data: [DONE]. Clients expecting a well-formed SSE sequence get a raw JSON error object injected into the event stream instead.
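A small checker makes the client-visible breakage easy to verify against the attached streaming transcripts. This is a hypothetical helper, not part of llama.cpp; it encodes the usual OpenAI-compatible expectation that a stream ends with a finish_reason chunk followed by data: [DONE]:

```python
import json

def sse_stream_is_well_formed(transcript: str) -> bool:
    """Return True if an SSE transcript ends the way OpenAI-compatible
    clients expect: a chunk carrying finish_reason, then 'data: [DONE]'."""
    saw_finish_reason = False
    saw_done = False
    for line in transcript.splitlines():
        if not line.startswith("data: "):
            continue  # a raw JSON error object injected mid-stream is skipped here
        payload = line[len("data: "):]
        if payload == "[DONE]":
            saw_done = True
            continue
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            return False
        for choice in chunk.get("choices", []):
            if choice.get("finish_reason"):
                saw_finish_reason = True
    return saw_finish_reason and saw_done

# A healthy stream passes; the truncated streams in the attached logs do not.
good = ('data: {"choices":[{"delta":{},"finish_reason":"stop"}]}\n'
        'data: [DONE]\n')
bad = ('data: {"choices":[{"delta":{"content":"hi"},"finish_reason":null}]}\n'
       '{"error":{"message":"Failed to parse..."}}\n')
print(sse_stream_is_well_formed(good))  # True
print(sse_stream_is_well_formed(bad))   # False
```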
This also affects library consumers calling common_chat_templates_apply / common_chat_parse with tools directly (e.g. fllama, which calls the same library code path, not HTTP).
First Bad Commit
566059a
Relevant log output
gptoss_repro_server.log
llama_server_log.txt
gptoss_streaming.txt
llama_streaming.txt