feat(core): add Model.prefill capability for trailing-assistant support #27915
Open
feanor5555 wants to merge 1 commit into
Conversation
Anthropic-style providers accept (and rely on) an assistant message as
the last turn in a conversation ("response continuation" / "prefill"
for tool-use continuation). Most other thinking-on-by-default templates
reject it outright — llama.cpp returns HTTP 400 "Assistant response
prefill is incompatible with enable_thinking" on Qwen3-family templates,
and vLLM/TGI have equivalent behaviour for DeepSeek-R1, GLM-4.6 thinking,
Kimi-K2-Thinking, etc.
A first-class `prefill: boolean` on Model lets every host (opencode,
mastra, others) consult one canonical source of truth instead of
guessing from npm package + reasoning flag.
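For context, a minimal sketch of the trailing-assistant ("prefill") request shape described above, using a plain OpenAI-style message array (the types and content here are illustrative, not from the PR's code):

```typescript
// Sketch: a chat request whose final turn is an assistant message.
// Anthropic-style providers continue this text; thinking-on-by-default
// templates may reject it (e.g. llama.cpp's HTTP 400 on Qwen3 templates).
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

const messages: ChatMessage[] = [
  { role: "user", content: "Summarize the release notes." },
  // Trailing assistant turn: the model is asked to continue this draft.
  { role: "assistant", content: "Here is a summary:" },
];

const isPrefillRequest =
  messages.length > 0 && messages[messages.length - 1].role === "assistant";
```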
- packages/core/src/models.ts: add optional prefill field on Model
with a per-family list of templates known to reject prefill
(Qwen3 hybrid/3.5/3.6/Thinking-2507/VL, QwQ, DeepSeek-R1/R1-0528/V4,
GLM-4.6/4.7-thinking, Kimi-K2-Thinking, MiniMax-M2).
- packages/opencode/src/config/provider.ts: mirror the field on the
user-facing config schema with an annotation describing when to set
it (and what the auto-default is for openai-compatible+reasoning).
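A rough plain-TypeScript stand-in for the field these two files add (the actual declaration uses `Schema.optional(Schema.Boolean)`; the interface and helper below are illustrative only):

```typescript
// Illustrative stand-in for the Model schema addition.
interface Model {
  id: string;
  reasoning?: boolean;
  /** Whether the chat template accepts a trailing assistant turn.
   *  Left undefined on existing entries, which is treated as true. */
  prefill?: boolean;
}

// How a consumer would resolve the effective capability.
const effectivePrefill = (m: Model): boolean => m.prefill ?? true;

const legacy: Model = { id: "qwen2.5-coder" }; // no field set
const thinking: Model = { id: "qwen3-thinking", reasoning: true, prefill: false };
```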
Default (undefined) is treated as `true` to keep all existing models
unaffected. Consumer-side logic lives in a follow-up PR.
Sister PR to an sst/models.dev data PR that will populate `prefill: false`
on the affected per-model entries.
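As a hedged sketch of how the follow-up consumer might gate on the field (the helper name `canAcceptTrailingAssistant` comes from this PR's description; the routing logic itself is illustrative, not the actual implementation):

```typescript
// Illustrative only: the real consumer lands in a follow-up PR.
interface ModelInfo {
  prefill?: boolean; // undefined defaults to true
}

interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

function canAcceptTrailingAssistant(model: ModelInfo): boolean {
  // Only an explicit `false` disables trailing-assistant prefill.
  return model.prefill !== false;
}

// If the template rejects prefill, fold the trailing assistant turn
// into a final user turn instead of sending it as-is.
function routeTrailing(model: ModelInfo, messages: Message[]): Message[] {
  const last = messages[messages.length - 1];
  if (!last || last.role !== "assistant" || canAcceptTrailingAssistant(model)) {
    return messages;
  }
  return [
    ...messages.slice(0, -1),
    { role: "user", content: `Continue this draft: ${last.content}` },
  ];
}
```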
Contributor
Hey! Your PR title doesn't follow our conventions. Please update it to start with one of: … See CONTRIBUTING.md for details.
Contributor
Thanks for updating your PR! It now meets our contributing guidelines. 👍
Issue for this PR
Closes #27920
Type of change
What does this PR do?
Adds an optional `prefill: boolean` on the `Model` schema (and on the user-facing provider config schema) without any consumer-side wiring. A first-class capability lets every host consult one canonical source instead of guessing from npm package + reasoning flag.

Anthropic-style providers accept (and rely on) an assistant message as the last turn in a conversation ("response continuation" / "prefill"). Most thinking-on-by-default templates reject it outright — llama.cpp returns HTTP 400 "Assistant response prefill is incompatible with enable_thinking" on Qwen3-family templates, and vLLM/TGI have equivalent behaviour for DeepSeek-R1, GLM-4.6 thinking, Kimi-K2-Thinking, etc.

Default (`undefined`) is treated as `true`, so all existing models are unaffected by this PR alone.

Models intended to carry `prefill: false` in a follow-up sst/models.dev data PR: Qwen3 hybrid/3.5/3.6/Thinking-2507/VL, QwQ, DeepSeek-R1/R1-0528/V4, GLM-4.6/4.7-thinking, Kimi-K2-Thinking, MiniMax-M2.

Unchanged (still allow prefill): Qwen2.5, Qwen3-Coder, Qwen3-Instruct-2507, all Anthropic / OpenAI / Azure / Google / Bedrock-Anthropic.

Files:
- packages/core/src/models.ts — add `prefill: Schema.optional(Schema.Boolean)` with a per-family list in the schema comment.
- packages/opencode/src/config/provider.ts — mirror the field on the user-facing config schema with an annotation describing when to set it.

The consumer is split into a separate PR (`ProviderTransform.canAcceptTrailingAssistant`, MAX_STEPS role routing in `session/prompt.ts`, and a runtime probe of llama.cpp's `/props` endpoint).

How did you verify your code works?

`bun run typecheck` clean. No runtime behaviour change in this PR — the field is read by code that arrives in the follow-up.

Screenshots / recordings

N/A — schema-only change.
Checklist