
fix(opencode): filter empty assistant content for @ai-sdk/openai-compatible #27914

Open

feanor5555 wants to merge 1 commit into anomalyco:dev from feanor5555:pr1-empty-content-filter

Conversation


feanor5555 commented May 16, 2026

Issue for this PR

Closes #27920

Related upstream: ggml-org/llama.cpp#20861, ggml-org/llama.cpp#21889, mastra-ai/mastra#15234 (same symptom in a different framework).

Type of change

  • [x] Bug fix
  • [ ] New feature
  • [ ] Refactor / code improvement
  • [ ] Documentation

What does this PR do?

llama.cpp returns HTTP 400 "Assistant response prefill is incompatible with enable_thinking" whenever the last message is role:"assistant" while the chat template has enable_thinking on. message-v2.toModelMessagesEffect sometimes emits an assistant UIMessage whose only parts are [step-start, reasoning("")]; convertToModelMessages collapses that to content:"" and triggers the 400.
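
For concreteness, here is roughly what the failing trailing message looks like (a hand-written sketch; field names follow AI SDK conventions, not exact opencode internals):

```ts
// Sketch of the problematic trailing UIMessage (values illustrative).
const trailing = {
  role: "assistant",
  parts: [
    { type: "step-start" },          // no renderable content
    { type: "reasoning", text: "" }, // empty thinking block
  ],
}

// convertToModelMessages finds no text to emit, so the model message
// collapses to { role: "assistant", content: "" }. With that as the
// last message and enable_thinking on, llama.cpp rejects the request
// with the HTTP 400 above.
```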

opencode already drops empty assistant content for @ai-sdk/anthropic and @ai-sdk/amazon-bedrock (two near-identical map+filter chains in transform.ts). This PR:

  • Extracts a single filterEmptyContent(msgs, signatureKey?) helper driven by two lookup tables (EMPTY_CONTENT_FILTER_NPM set, REASONING_SIGNATURE_KEY map); a sketch follows this list. A single-pass for/of replaces the old map().filter() chain, saving two array allocations and one iteration per call.
  • Adds @ai-sdk/openai-compatible to the filter set, fixing the 400.
  • Preserves the existing signature/redactedData round-trip for anthropic/bedrock (reasoning parts with empty text but an opaque token are kept).
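
A rough sketch of the helper's shape (the names filterEmptyContent, EMPTY_CONTENT_FILTER_NPM, REASONING_SIGNATURE_KEY, and signatureKey come from this PR; the message types and the exact keep-conditions are simplified assumptions, not the real transform.ts code):

```ts
// Simplified stand-ins for the AI SDK message types.
type Part =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string; [key: string]: unknown }
  | { type: "tool-call"; toolCallId: string; toolName: string; input: unknown }
type Msg = { role: "system" | "user" | "assistant" | "tool"; content: string | Part[] }

// Providers whose empty assistant content must be dropped.
const EMPTY_CONTENT_FILTER_NPM = new Set([
  "@ai-sdk/anthropic",
  "@ai-sdk/amazon-bedrock",
  "@ai-sdk/openai-compatible", // added by this PR
])

// Providers whose reasoning parts carry an opaque token that must round-trip.
const REASONING_SIGNATURE_KEY: Record<string, string> = {
  "@ai-sdk/anthropic": "signature",
  "@ai-sdk/amazon-bedrock": "redactedData",
}

function filterEmptyContent(msgs: Msg[], signatureKey?: string): Msg[] {
  const out: Msg[] = [] // single pass, no intermediate arrays
  for (const msg of msgs) {
    if (msg.role !== "assistant") {
      out.push(msg)
      continue
    }
    if (typeof msg.content === "string") {
      if (msg.content !== "") out.push(msg) // drop content: ""
      continue
    }
    const parts = msg.content.filter((p) => {
      // Empty reasoning survives only if it carries the opaque token.
      if (p.type === "reasoning" && p.text === "")
        return signatureKey !== undefined && p[signatureKey] !== undefined
      if (p.type === "text") return p.text !== ""
      return true // tool calls etc. always survive
    })
    if (parts.length > 0) out.push({ ...msg, content: parts })
  }
  return out
}
```

At the call site, the provider's npm name would drive both tables: when EMPTY_CONTENT_FILTER_NPM.has(npm), call filterEmptyContent(msgs, REASONING_SIGNATURE_KEY[npm]).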

This is one class of trailing-assistant 400. The non-empty case (e.g. the MAX_STEPS wrap-up prefill in session/prompt.ts) still trips the same template check and is covered by a separate PR introducing a Model.prefill capability.

How did you verify your code works?

  • bun test test/provider/transform.test.ts: 231 pass, 0 fail.
  • Replay against a live llama-server with 8 unique 400 bodies extracted from a real opencode session (a minimal replay harness is sketched below): raw replay → 8/8 still 400 (control); with the filter applied → 5/8 return 200 (all the empty-trailing cases). The remaining 3 are non-empty trailing-assistant, out of scope here.
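
The replay step amounts to something like the following (the endpoint, directory name, and one-body-per-file layout are assumptions for illustration; the actual harness may differ):

```ts
// Replay captured /v1/chat/completions bodies against a local
// llama-server and report the status of each. Run with: bun replay.ts
import { readdir, readFile } from "node:fs/promises"

const ENDPOINT = "http://127.0.0.1:8080/v1/chat/completions" // assumed port

const files = await readdir("./captured-400s") // one JSON body per file
let ok = 0
for (const file of files) {
  const body = await readFile(`./captured-400s/${file}`, "utf8")
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body,
  })
  console.log(`${file}: HTTP ${res.status}`)
  if (res.status === 200) ok++
}
console.log(`${ok}/${files.length} returned 200`)
```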

Screenshots / recordings

N/A — backend change.

Checklist

  • [x] I have tested my changes locally
  • [x] I have not included unrelated changes in this PR

llama.cpp's /v1/chat/completions returns HTTP 400 "Assistant response
prefill is incompatible with enable_thinking" whenever the last message
is role:assistant while the chat template has enable_thinking set.
opencode's message-v2.toModelMessagesEffect sometimes emits an assistant
UIMessage whose only parts are [step-start, reasoning("")];
convertToModelMessages collapses that to content:"", which triggers the
400.

This is a class problem, not a Qwen quirk:

- ggml-org/llama.cpp#20861 (exact symptom, opencode + Qwen3.5)
- ggml-org/llama.cpp#21889
- mastra-ai/mastra#15234 (same issue in a different framework)

Every 2025-2026 open-weight thinking family (Qwen3 hybrid/3.5/3.6/
Thinking-2507, QwQ, DeepSeek-R1, GLM-4.6/4.7 thinking, Kimi-K2-Thinking,
MiniMax-M2) has the same incompatibility.

The existing empty-content filter already covers @ai-sdk/anthropic and
@ai-sdk/amazon-bedrock. The two near-identical map().filter() branches
are unified into filterEmptyContent(msgs, signatureKey?), driven by two
small lookup tables (EMPTY_CONTENT_FILTER_NPM set, REASONING_SIGNATURE_KEY
map). Single-pass for/of replaces map().filter(), saving two array
allocations per call. @ai-sdk/openai-compatible is added to the filter
set.

Signature/redactedData round-trip preserved for anthropic/bedrock
(reasoning parts with empty text but opaque token are kept).

Tests:
  - 5 new openai-compatible cases (string-empty, parts-empty,
    reasoning-only, keep tool-call, keep non-empty reasoning)
  - 2 regression guards for anthropic-signature + bedrock-redactedData
    round-trip
@github-actions bot added the needs:compliance ("This means the issue will auto-close after 2 hours.") and needs:title labels on May 16, 2026
@github-actions
Contributor

Hey! Your PR title "transform: filter empty assistant content for @ai-sdk/openai-compatible" doesn't follow conventional commit format.

Please update it to start with one of:

  • feat: or feat(scope): new feature
  • fix: or fix(scope): bug fix
  • docs: or docs(scope): documentation changes
  • chore: or chore(scope): maintenance tasks
  • refactor: or refactor(scope): code refactoring
  • test: or test(scope): adding or updating tests

Where scope is the package name (e.g., app, desktop, opencode).

See CONTRIBUTING.md for details.

@feanor5555 changed the title from "transform: filter empty assistant content for @ai-sdk/openai-compatible" to "fix(opencode): filter empty assistant content for @ai-sdk/openai-compatible" on May 16, 2026
@github-actions bot added the needs:issue label and removed the needs:compliance and needs:title labels on May 16, 2026
@github-actions
Contributor

Thanks for updating your PR! It now meets our contributing guidelines. 👍

@github-actions
Contributor

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

  1. Open an issue describing the bug/feature (if one doesn't exist)
  2. Add Fixes #<number> or Closes #<number> to this PR description

See CONTRIBUTING.md for details.



Development

Successfully merging this pull request may close these issues.

Trailing-assistant 400 on llama.cpp/vLLM with thinking-on templates (Qwen3, DeepSeek-R1, GLM-thinking, Kimi-K2-Thinking, MiniMax-M2)
