Pull requests: vllm-project/vllm
[Bugfix][ROCm] Fix gemm_a4w4 call to use updated AITER API signature
bug (Something isn't working) · rocm (Related to AMD ROCm)
#40754 opened Apr 24, 2026 by chelnnexy
[vllm IR] Enable rms norm gate IR on XPU
intel-gpu (Related to Intel GPU) · needs-rebase · nvidia · rocm
#40753 opened Apr 24, 2026 by chaojun-zhang (Contributor) · Draft
Enable Expert Parallel Load Balancing (EPLB) for Kimi K2.5/2.6 and Marlin Kernel
#40752 opened Apr 24, 2026 by 0xjunhao (Contributor)
[Attention] Enable TRITON_MLA MTP full CUDA graphs for Kimi on Blackwell
nvidia · speculative-decoding · v1
#40750 opened Apr 24, 2026 by voipmonitor (Contributor) · Draft
fix amd Basic Models Tests (Other)
rocm
#40745 opened Apr 23, 2026 by Concurrensee (Contributor) · Draft
[Frontend] Delegate to vLLM Omni When --omni Passed
frontend
#40744 opened Apr 23, 2026 by alex-jw-brooks (Contributor)
[Test] Fix test_dynamic_shapes_compilation for torch 2.12
#40743 opened Apr 23, 2026 by angelayi (Contributor)
[Bugfix] Fix device mismatch triggering in testing
bug
#40739 opened Apr 23, 2026 by Lucaskabela (Contributor)
[Bugfix] Fix degenerate KV cache stride causing TMA cudaErrorIllegalInstruction
bug · nvidia · v1
#40737 opened Apr 23, 2026 by the-david-oy
[MoE Refactor] Introduce RoutedExperts alias for FusedMoE and don't store SharedExperts in MK
nvidia · ready (ONLY add when PR is ready to merge/full CI is needed)
#40735 opened Apr 23, 2026 by bnellnm (Collaborator)
[Bugfix] Fix max_num_batched_token not captured in cuda graph
bug · nvidia
#40734 opened Apr 23, 2026 by wzhao18 (Contributor)
[RFC][EPLB][#32028] Remove dead torch.accelerator.synchronize() from sync path
#40733 opened Apr 23, 2026 by SandishKumarHN (Contributor)
[Doc] fix capitalization consistency in README (vLLM, Hugging Face)
documentation (Improvements or additions to documentation)
#40729 opened Apr 23, 2026 by VinayakMishra95
[Perf][Bugfix] Update dflash aux layer indexing
bug · v1
#40727 opened Apr 23, 2026 by benchislett (Collaborator)
[Bugfix] Fix codegen for unqualified names
bug
#40726 opened Apr 23, 2026 by Lucaskabela (Contributor)
Fix Nano Nemotron VL static image inputs
ready
#40724 opened Apr 23, 2026 by milesial (Contributor)
[torch.compile] Add hierarchical module trace dump for FX graphs
#40721 opened Apr 23, 2026 by LeoYangXY
feat: Enable prompt_embeds Content Part Support in vLLM Chat Completions API
documentation · frontend · v1
#40720 opened Apr 23, 2026 by LuisRobaina
[fix] mismatch dim during capture graph if with --gpu-memory-utilization
v1
#40719 opened Apr 23, 2026 by ir1ka (Contributor)