Pull requests: vllm-project/vllm

Pull requests list

[KV Transfer] Add MooncakeStoreConnector for KV cache offloading via Mooncake distributed store
Labels: documentation, kv-connector, v1
#40900 opened Apr 26, 2026 by LCAIZJ Contributor
DeepSeek V4 support on SM12x with Triton sparse MLA fallback
Labels: ci/build, deepseek, documentation, frontend, gpt-oss, kv-connector, new-model, nvidia, speculative-decoding, tool-calling, v1
#40899 opened Apr 26, 2026 by jasl Contributor
[Core] Added Sleep 3
Labels: frontend, v1
#40897 opened Apr 26, 2026 by kaleox-dev
4 tasks done
[Metrics] Export parallel config info
Labels: v1
#40895 opened Apr 26, 2026 by voipmonitor Contributor Draft
feat: integrate builtin_structural_tag to support more models' tool-calling.
Labels: deepseek, needs-rebase, qwen, tool-calling
#40894 opened Apr 26, 2026 by Seven-Streams
3 of 5 tasks
[Bugfix] Size FlashInfer NVLink MNNVL workspace to EP group
Labels: bug, deepseek, kv-connector, ready, tool-calling
#40893 opened Apr 26, 2026 by Dao007forever
[ROCm][DSv4] Make AITER sparse MLA decode cudagraph-clean (follow-up to #40889)
Labels: ci/build, deepseek, documentation, gpt-oss, kv-connector, needs-rebase, new-model, nvidia, performance, rocm, speculative-decoding, tool-calling, v1
#40892 opened Apr 26, 2026 by ChuanLi1101 Collaborator
[Core] Avoid using extra thread in UniProcExecutor
Labels: v1
#40891 opened Apr 26, 2026 by njhill Member
[ROCm] Add AITER-accelerated MLA decode for DeepSeek V4 on MI355X
Labels: ci/build, deepseek, documentation, gpt-oss, kv-connector, needs-rebase, new-model, nvidia, performance, rocm, speculative-decoding, tool-calling, v1
#40889 opened Apr 25, 2026 by ChuanLi1101 Collaborator
3 of 5 tasks
[Bugfix][Model] Qwen3-VL-MoE NVFP4 (ModelOpt) per-expert weight loading
Labels: bug, qwen
#40888 opened Apr 25, 2026 by Code4me2 Contributor
5 tasks done
[Bugfix] Run FlashInfer autotuning before KV cache allocation
Labels: bug, v1
#40887 opened Apr 25, 2026 by bhoomit Contributor
[Doc] Clarify Qwen3-Omni OpenAI transcription client and docs (#29405)
Labels: documentation, qwen
#40884 opened Apr 25, 2026 by happybhati
[vLLM IR] Fixes for Triton implementations
#40883 opened Apr 25, 2026 by ProExpertProg Collaborator Draft
elastic_ep: stage/commit MoE prepare/finalize on reconfigure
#40881 opened Apr 25, 2026 by itayalroy Contributor
[V1][Scheduler] Use list-slice compare in _has_repeating_pattern
Labels: v1
#40879 opened Apr 25, 2026 by aaronagent
3 tasks done
[New Model][ROCm] Add AMD support for DeepSeek V4
Labels: ci/build, deepseek, documentation, gpt-oss, kv-connector, needs-rebase, new-model, nvidia, performance, rocm, speculative-decoding, tool-calling, v1
#40871 opened Apr 25, 2026 by whx-sjtu Contributor Draft
4 tasks
[LoRA] Initial EP support for LoRA
Labels: gpt-oss, qwen
#40867 opened Apr 25, 2026 by jeejeelee Collaborator Draft
4 tasks