Pull requests: vllm-project/vllm
Pull requests list
[Bugfix][CI] Fix tests/distributed/test_torchrun_example_moe.py
bug
Something isn't working
ready
ONLY add when PR is ready to merge/full CI is needed
#40349
opened Apr 20, 2026 by
NickLucche
Collaborator
[Bugfix][Reasoning] Strip grouped think markers from streaming deltas
bug
Something isn't working
#40348
opened Apr 20, 2026 by
wuyingjun-lucky
Contributor
[Bugfix][Gemma4] Fix vision fp16 overflow causing <pad> output
bug
Something isn't working
#40347
opened Apr 20, 2026 by
wenqiangire-commits
[KV Offload] Offload all KV blocks when doing prefill in P/D
kv-connector
#40346
opened Apr 20, 2026 by
omerpaz95
Contributor
[Fix] Resolve MoRI connector hangs at high concurrency
kv-connector
v1
#40344
opened Apr 20, 2026 by
simondanielsson
Contributor
Draft
4 tasks
[Docs] [Misc] add sig list table in community governance process
documentation
Improvements or additions to documentation
#40342
opened Apr 20, 2026 by
pacoxu
Contributor
1 of 4 tasks
[Perf][MoE][ROCm][Kimi-K2.5] Remove a redundant per-forward-pass dtype conversion of the routing bias parameter in DeepSeek-V2/V3 MoE
deepseek
#40341
opened Apr 20, 2026 by
xaguilar-amd
Contributor
[Bugfix] Normalize malformed dict prompts that carry token IDs in prompt
bug
Something isn't working
verified
Run pre-commit for new contributors without triggering other tests
#40339
opened Apr 20, 2026 by
Alchuang22-dev
[Perf] Integrate flash-maxsim Triton kernels for late-interaction scoring
v1
verified
Run pre-commit for new contributors without triggering other tests
#40337
opened Apr 20, 2026 by
roipony
5 of 6 tasks
Qwen 3 VL: Track and use buffer correctly
qwen
Related to Qwen models
#40336
opened Apr 20, 2026 by
wdhongtw
Contributor
3 of 4 tasks
[MM][Misc] Support image+video mixed inputs (per prompt) for VLM examples
documentation
Improvements or additions to documentation
#40335
opened Apr 20, 2026 by
shen-shanshan
Contributor
3 of 4 tasks
[Model] fix(dflash): dtype mismatch in combine_hidden_states
qwen
Related to Qwen models
#40334
opened Apr 20, 2026 by
ciphernaut
3 of 4 tasks
[ROCm] Allow Triton MXFP4 MoE support checks on gfx11xx
gpt-oss
Related to GPT-OSS models
rocm
Related to AMD ROCm
#40333
opened Apr 20, 2026 by
wangrui6
3 of 4 tasks
[Feat][KVConnector] Prepend offloaded blocks on offloading complete for lazy mode in simple cpu offloader
v1
#40332
opened Apr 20, 2026 by
cblmemo
4 tasks
[Startup] Parallelize torch/transformers import + weight prefetch + forkserver prewarm
frontend
#40331
opened Apr 20, 2026 by
simon-mo
Collaborator
4 tasks
[Startup] Import hygiene for api_server hot path
frontend
#40328
opened Apr 20, 2026 by
simon-mo
Collaborator
5 tasks
attention: add USE_TD constexpr for tensor descriptor Q/K/V load/store
v1
#40327
opened Apr 20, 2026 by
afierka-intel
[Doc] Sync CLI guide with actual help modes and launch subcommand
documentation
Improvements or additions to documentation
#40326
opened Apr 20, 2026 by
wangrui6
4 tasks
[vLLM IR] Update the pre commit to enforce imports of vllm
#40325
opened Apr 20, 2026 by
R3hankhan123
Contributor
4 tasks
Fix Gemma 4 + BitsAndBytes startup failure reported in #38884
#40321
opened Apr 20, 2026 by
SouthWest7
Contributor
5 tasks
[vLLM IR] Add vllm ir lowering pass e2e test
#40319
opened Apr 20, 2026 by
Alex-ai-future
Draft
4 tasks
[Docs] [QeRL] Layerwise Reloading Documentation
documentation
Improvements or additions to documentation
#40317
opened Apr 20, 2026 by
kylesayrs
Contributor
Revert "Fix MoE backend selection for LoRA (unquantized MoE)" (#40273)
#40313
opened Apr 20, 2026 by
vllm-agent
Draft