docker: rebase vllm.patch onto vLLM 0.24.0 (drop sleep hunk, add abort+usage) by aoshen02 · Pull Request #308 · vllm-project/vime

aoshen02 · 2026-07-01T11:50:17Z

Summary

Base image moved to vllm/vllm-openai:latest-ubuntu2404 (v0.24.0). Rebase docker/patch/latest/vllm.patch accordingly:

- Drop the `vllm/v1/engine/core.py` partial-wake / dummy-batch hunk: the `if not self.model_executor.is_sleeping` guard is now upstream in 0.24.0, so patching it is unnecessary and would fail to apply.
- Keep the `all2all_utils.py` (`max_num_tokens = moe.max_num_tokens`) hunk.
- Add the `/abort_requests` endpoint (`api_router.py`, from #296, "fix(rollout): abort vLLM rollout via delete-type /abort_requests") and the `GenerateResponse.usage` field (`disagg/protocol.py`, from #303, "[Bugfix][Rollout] Wire prefix_cache_hit_rate through vLLM usage").
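
One way to sanity-check the list above is to read the `diff --git` headers out of the rebased patch and confirm that `vllm/v1/engine/core.py` is gone while the three remaining files are present. The following is a minimal sketch, not part of this PR: the helper name `files_touched`, the sample patch text, and the `pkg/` directory prefix are all hypothetical, since the PR description gives only the file basenames.

```python
import re

# Matches the header git writes at the start of each per-file section of a patch.
DIFF_HEADER = re.compile(r"^diff --git a/(\S+) b/(\S+)$")


def files_touched(patch_text: str) -> list[str]:
    """Return the post-image paths named in a git-style patch, in first-seen order."""
    paths: list[str] = []
    for line in patch_text.splitlines():
        match = DIFF_HEADER.match(line)
        if match and match.group(2) not in paths:
            paths.append(match.group(2))
    return paths


# Hypothetical, abbreviated patch text. The "pkg/" prefix is a placeholder:
# the real directories of these files are not stated in the PR description.
SAMPLE_PATCH = """\
diff --git a/pkg/all2all_utils.py b/pkg/all2all_utils.py
--- a/pkg/all2all_utils.py
+++ b/pkg/all2all_utils.py
@@ -1 +1 @@
-old line
+new line
diff --git a/pkg/api_router.py b/pkg/api_router.py
--- a/pkg/api_router.py
+++ b/pkg/api_router.py
@@ -1 +1 @@
-old line
+new line
diff --git a/pkg/disagg/protocol.py b/pkg/disagg/protocol.py
--- a/pkg/disagg/protocol.py
+++ b/pkg/disagg/protocol.py
@@ -1 +1 @@
-old line
+new line
"""

touched = files_touched(SAMPLE_PATCH)
# The dropped hunk's file should no longer appear in the patch.
core_still_patched = "vllm/v1/engine/core.py" in touched
```

On the real `docker/patch/latest/vllm.patch`, the same helper could be fed `Path(...).read_text()` instead of the sample string.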

Verification

All three hunks were checked with `git apply --check --allow-empty` against a real `vllm/vllm-openai:v0.24.0-ubuntu2404` container: RC=0 (minor line offsets absorbed by git apply's context matching).
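
The check described above could be scripted roughly as follows. This is a sketch under assumptions rather than the command sequence actually used: the function names and the `source_dir` argument are hypothetical, and the location of the vLLM sources inside the container is not given in this PR. Only the `git apply --check --allow-empty` invocation itself comes from the description.

```python
import subprocess
from pathlib import Path


def build_check_command(patch_path: str) -> list[str]:
    """Build the dry-run command: --check validates the patch without writing files."""
    return ["git", "apply", "--check", "--allow-empty", patch_path]


def patch_applies(patch_path: str, source_dir: str) -> bool:
    """Return True when git reports exit code 0 for the patch in source_dir.

    source_dir is a hypothetical argument: point it at the directory the
    patch paths are relative to inside the container.
    """
    # Resolve the patch path first, because the command runs with cwd=source_dir.
    absolute_patch = str(Path(patch_path).resolve())
    result = subprocess.run(
        build_check_command(absolute_patch),
        cwd=source_dir,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # git prints the failing file and hunk to stderr.
        print(result.stderr.strip())
    return result.returncode == 0
```

Because `--check` never modifies the tree, the function is safe to run repeatedly, for example once per candidate base image tag.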

…t+usage)

The base image moved to vllm/vllm-openai:latest-ubuntu2404 (v0.24.0).

- Drop the vllm/v1/engine/core.py partial-wake/dummy-batch hunk: the `if not self.model_executor.is_sleeping` guard is now upstream in 0.24.0, so patching it is unnecessary (and would fail to apply).
- Keep all2all_utils.py (max_num_tokens) hunk.
- Add the /abort_requests endpoint (api_router.py, vllm-project#296) and the GenerateResponse.usage field (disagg/protocol.py, vllm-project#303).

All three hunks verified with `git apply --check` against a real vllm/vllm-openai:v0.24.0-ubuntu2404 container (RC=0; minor line offsets absorbed by git apply's context matching).

Signed-off-by: aoshen02 <aoshen@inferact.ai>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

read-the-docs-community · 2026-07-01T11:51:17Z

Documentation build overview

📚 vime | 🛠️ Build #33392245 | 📁 Comparing 26feba4 against latest (5532763)

🔍 Preview build

31 files changed · ± 17 modified · - 14 deleted


gemini-code-assist · 2026-07-01T12:04:10Z

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

aoshen02 wants to merge 1 commit into vllm-project:main from aoshen02:docker/vllm-patch-024
