forked from vllm-project/vllm
Pull requests: doublewordai/vllm
Pull requests list
feat: in-VRAM compressed weight storage (VLLM_COMPRESS_WEIGHTS)
#5 opened Apr 17, 2026 by fergusfinn
4 of 7 tasks
perf: add is_reasoning_end_streaming() override to GptOssReasoningParser
#4 opened Mar 2, 2026 by fergusfinn
4 tasks done
perf: faster float embedding serialization (~1.7x)
#3 opened Feb 15, 2026 by fergusfinn
3 of 4 tasks
fix: preserve parameter attrs during weight reload (FP8 block + MXFP4 MoE)
#2 opened Feb 11, 2026 by fergusfinn
4 tasks
NCCL suspend/resume for cuda-checkpoint at TP>1
#1 opened Feb 10, 2026 by fergusfinn
5 tasks done