Pull requests: vllm-project/tpu-inference

Pull requests list

Fix XLA Compilation warning [ready: ONLY add when PR is ready to merge/full CI is needed]
#2402 opened Apr 25, 2026 by kyuyeunk Collaborator
Add gather allreduce pipeline [ready]
#2401 opened Apr 25, 2026 by kyuyeunk Collaborator Draft
Add DCP sharding axis and KV cache support
#2398 opened Apr 25, 2026 by weiyu0824 Collaborator Draft
[Qwen3.5] Enable jittable vision tower for Qwen3.5 [ready]
#2396 opened Apr 24, 2026 by lk-chen Collaborator
Optimize GDN conv1d [ready]
#2394 opened Apr 24, 2026 by helloworld1 Collaborator
[DEBUG ]Bisect vllm
#2393 opened Apr 24, 2026 by patrickji2014 Collaborator
Extend attn_dp_expert to emulate attn_dp.
#2392 opened Apr 24, 2026 by NicoGrande Collaborator
feat: custom traces, flow events, kv cache metadata
#2391 opened Apr 24, 2026 by rushabh-46
[TPU KV Offloading] [Feat] KV cache offloading to host memory
#2390 opened Apr 24, 2026 by juncgu-google Collaborator
[CI] Enhance pipeline metadata validation logic
#2389 opened Apr 24, 2026 by meiyeh123 Collaborator Draft
[CI] Implement interactive wizard for CI model and feature onboarding [ready]
#2385 opened Apr 24, 2026 by boe20211 Collaborator
unify dp inputs [ready]
#2364 opened Apr 22, 2026 by pv97 Collaborator Draft
Fix forward n-d buffer with jitted unpack [ready]
#2362 opened Apr 22, 2026 by pv97 Collaborator
update libs - fix sc kernel [ready]
#2356 opened Apr 22, 2026 by clee1994 Collaborator
[Kernel][Batched RPA] Increase prefill batch size [ready]
#2355 opened Apr 22, 2026 by kyuyeunk Collaborator Draft
[DeepSeek] Adding torchax e2e MMLU test [ready]
#2350 opened Apr 21, 2026 by gpolovets1 Collaborator
create and opensource kernel tuning infra [ready]
#2346 opened Apr 21, 2026 by patrickji2014 Collaborator
Append MoE expert IDs when enable_return_routed_experts is enabled [ready]
#2343 opened Apr 21, 2026 by pv97 Collaborator
Add env variable for overriding rpa block sizes [ready]
#2338 opened Apr 20, 2026 by wenxindongwork Collaborator
[Disagg/qwen3.5] disagg support for qwen3.5 (4/n bench script) [ready]
#2337 opened Apr 20, 2026 by wyzhang Collaborator
[Draft] qwen25 vl refactor [ready]
#2320 opened Apr 19, 2026 by lk-chen Collaborator Draft
Add support for Int4-CompressedTensors MoE [ready]
#2306 opened Apr 17, 2026 by dmmolitor Contributor