docker: rebase vllm.patch onto vLLM 0.24.0 (drop sleep hunk, add abort+usage)#308

Open
aoshen02 wants to merge 1 commit into vllm-project:main from aoshen02:docker/vllm-patch-024

aoshen02 (Collaborator) commented Jul 1, 2026

Summary

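The base image moved to vllm/vllm-openai:latest-ubuntu2404 (v0.24.0). This PR rebases docker/patch/latest/vllm.patch accordingly: it drops the vllm/v1/engine/core.py hunk, keeps the all2all_utils.py hunk, and adds the /abort_requests endpoint and the GenerateResponse.usage field.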

Verification

All three hunks were checked with `git apply --check --allow-empty` against a real vllm/vllm-openai:v0.24.0-ubuntu2404 container and returned RC=0 (minor line offsets were absorbed by git apply's context matching).
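
For reviewers who want to repeat the dry run, here is a minimal Python sketch of the check described above. It is not part of this PR: the helper names `build_check_command` and `check_patch` are hypothetical, and the directory the patch should be applied from inside the container is an assumption you need to supply yourself.

```python
import subprocess


def build_check_command(patch_path: str) -> list[str]:
    """Build the dry-run command; --check validates the patch without touching files."""
    return ["git", "apply", "--check", "--allow-empty", patch_path]


def check_patch(patch_path: str, source_root: str) -> int:
    """Run the dry run from source_root and return git's exit code (0 means it applies).

    source_root is an assumption: point it at the directory the patch paths
    are relative to inside the container.
    """
    result = subprocess.run(
        build_check_command(patch_path),
        cwd=source_root,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # git reports the failing hunk and file on stderr.
        print(result.stderr.strip())
    return result.returncode
```

Calling `check_patch` with the patch file and the chosen source directory should return 0 when every hunk applies; any other value means at least one hunk no longer matches its context.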

…t+usage)

The base image moved to vllm/vllm-openai:latest-ubuntu2404 (v0.24.0).

- Drop the vllm/v1/engine/core.py partial-wake/dummy-batch hunk: the
  `if not self.model_executor.is_sleeping` guard is now upstream in
  0.24.0, so patching it is unnecessary (and would fail to apply).
- Keep all2all_utils.py (max_num_tokens) hunk.
- Add the /abort_requests endpoint (api_router.py, vllm-project#296) and the
  GenerateResponse.usage field (disagg/protocol.py, vllm-project#303).

All three hunks verified with `git apply --check` against a real
vllm/vllm-openai:v0.24.0-ubuntu2404 container (RC=0; minor line offsets
absorbed by git apply's context matching).

Signed-off-by: aoshen02 <aoshen@inferact.ai>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

@gemini-code-assist

Warning: Gemini encountered an error creating the review. You can try again by commenting /gemini review.
