[Frontend] Allow users to modify the scheduler configuration online in dev mode. #30316

noooop · 2025-12-09T08:06:23Z

Purpose

Allow users to modify part of the scheduler configuration online, which will greatly simplify the benchmark process.

For example, with vllm bench sweep the server only needs to be started once.

Test Plan

benchmark demo:
offline: https://github.com/noooop/snippet/blob/main/benchmarks/embed5/v1_offline.py
online: https://github.com/noooop/snippet/blob/main/benchmarks/embed5/v1_online.py

Test Result

nan

Known Issues

The cudagraph_capture_sizes differ, so the results will be slightly different.

[1, 2]
[1, 2, 4, 8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 88, 96, 104, 112, 120, 128, 136, 144, 152, 160, 168, 176, 184, 192, 200, 208, 216, 224, 232, 240, 248, 256]
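The two lists above fit a simple pattern: 1, 2, 4, then multiples of 8 up to a cap. The sketch below reproduces both lists under that assumed pattern. The function name `capture_sizes` and the cap values 2 and 256 are illustrative assumptions, and this is not claimed to be how vLLM derives cudagraph_capture_sizes.

```python
def capture_sizes(max_size: int) -> list[int]:
    """Return 1, 2, 4 and then multiples of 8, keeping only sizes <= max_size.

    Hypothetical helper: the pattern is inferred from the two lists in this
    PR description, not taken from the vLLM source.
    """
    small = [1, 2, 4]
    multiples_of_8 = list(range(8, max_size + 1, 8))
    return [size for size in small + multiples_of_8 if size <= max_size]


if __name__ == "__main__":
    print(capture_sizes(2))    # the short list
    print(capture_sizes(256))  # the long list
```

If the list is fixed when the engine starts, changing the scheduler limits afterwards would leave it unchanged, which is one possible reading of why results differ slightly. That reading is an inference from the lists, not something the PR states.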

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

gemini-code-assist

Code Review

The pull request introduces functionality to dynamically reconfigure scheduler parameters (max_num_seqs and max_num_batched_tokens) online. This is a valuable addition for benchmarking and dynamic resource management. The changes involve defining a SchedulerReconfigure data structure, adding reconfigure_scheduler methods across the engine components (LLM, EngineCoreClient, EngineCore), and implementing the reconfigure method in the Scheduler class. I've identified a critical bug in the fallback logic for max_num_batched_tokens and a high-severity design constraint regarding the modification limits.
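A minimal sketch of the shape this review describes, not the PR's actual code. `SchedulerReconfigure` is named in the review, but its fields, the `ToyScheduler` class, and the validation rules below are assumptions made for illustration. `None` is taken to mean "keep the current value".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SchedulerReconfigure:
    # Field names follow the two parameters named in the review;
    # the real definition in the PR may differ.
    max_num_seqs: int | None = None
    max_num_batched_tokens: int | None = None


class ToyScheduler:
    """Hypothetical stand-in for a scheduler, for illustration only."""

    def __init__(self, max_num_seqs: int, max_num_batched_tokens: int) -> None:
        self.max_num_seqs = max_num_seqs
        self.max_num_batched_tokens = max_num_batched_tokens

    def reconfigure(self, update: SchedulerReconfigure) -> None:
        # Resolve both fields before assigning, so that a rejected update
        # leaves the scheduler in its previous state.
        new_seqs = (
            self.max_num_seqs
            if update.max_num_seqs is None
            else update.max_num_seqs
        )
        new_tokens = (
            self.max_num_batched_tokens
            if update.max_num_batched_tokens is None
            else update.max_num_batched_tokens
        )
        if new_seqs < 1 or new_tokens < 1:
            raise ValueError("scheduler limits must be positive integers")
        self.max_num_seqs = new_seqs
        self.max_num_batched_tokens = new_tokens


if __name__ == "__main__":
    scheduler = ToyScheduler(max_num_seqs=8, max_num_batched_tokens=1024)
    scheduler.reconfigure(SchedulerReconfigure(max_num_seqs=4))
    print(scheduler.max_num_seqs, scheduler.max_num_batched_tokens)
```

The sketch uses explicit `is None` checks so that an omitted field and an invalid value such as 0 are handled separately.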

vllm/v1/engine/core.py

DarkLight1337 · 2025-12-09T12:07:52Z

Supporting this will introduce some constraints on the scheduler since it cannot assume that these parameters are constant anymore. That being said, I definitely see the value of being able to adjust the scheduling parameters on the fly as we don't have to restart the server each time the parameters change during benchmarking.

@WoosukKwon @njhill WDYT about this?

mergify · 2025-12-11T08:22:49Z

Hi @noooop, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?

mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

lengrongfu · 2025-12-11T14:24:54Z

vllm/engine/protocol.py

        raise NotImplementedError
+
+    def reconfigure(
+        self, max_num_seqs: int | None, max_num_batched_tokens: int | None


Could this parameter be made more general? If other parameters need to be modified in the future, there would be no need to add new arguments. I suggest passing a structure or dict instead.

noooop · 2025-12-12

I agree that passing a structure or dict would be better. The first version did define a structure, but since only two parameters can be modified, adding a structure just for that made the code overly verbose, so it was changed to the current approach.
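To make the trade-off in this thread concrete, here is a rough sketch of the dict-based alternative the reviewer suggests. Everything in it is hypothetical: the `RECONFIGURABLE_KEYS` allow-list and the `validate_update` helper are not part of the PR, and the only parameter names taken from the source are max_num_seqs and max_num_batched_tokens.

```python
from __future__ import annotations

from typing import Any

# Hypothetical allow-list; the source only names these two parameters.
RECONFIGURABLE_KEYS = frozenset({"max_num_seqs", "max_num_batched_tokens"})


def validate_update(update: dict[str, Any]) -> dict[str, int]:
    """Check a dict-style update and return only the values to apply.

    Unknown keys are rejected, and None values are dropped because they
    are treated as "leave unchanged" in this sketch.
    """
    unknown = set(update) - RECONFIGURABLE_KEYS
    if unknown:
        raise KeyError(f"not reconfigurable: {sorted(unknown)}")

    cleaned: dict[str, int] = {}
    for key, value in update.items():
        if value is None:
            continue
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{key} must be a positive integer, got {value!r}")
        cleaned[key] = value
    return cleaned


if __name__ == "__main__":
    print(validate_update({"max_num_seqs": 4, "max_num_batched_tokens": None}))
```

A dict keeps the method signature stable when new parameters are added, but key and type checks move to runtime. Explicit keyword parameters, as in the diff above, let a type checker catch misspelled names at the cost of a signature change for each new field.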

offline part

e4c50d9

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

mergify bot added frontend v1 labels Dec 9, 2025

gemini-code-assist bot reviewed Dec 9, 2025

vllm/v1/engine/core.py (outdated, resolved)

vllm/v1/engine/core.py (outdated, resolved)

noooop and others added 2 commits December 9, 2025 16:25

Update vllm/v1/engine/core.py

8432f09

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: wang.yuqi <noooop@126.com>

+ comment

fb3019f

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

noooop changed the title ~~[Frontend] Allow users to modify part of the scheduler configuration online.~~ [Frontend] Allow users to modify the scheduler configuration online. Dec 9, 2025

noooop changed the title ~~[Frontend] Allow users to modify the scheduler configuration online.~~ [Frontend] Allow users to modify the scheduler configuration online in dev mode. Dec 9, 2025

noooop added 5 commits December 11, 2025 10:42

Merge branch 'main' into online_modify_config

cd2dc95

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

- SchedulerReconfigure

4be9154

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

Merge branch 'main' into online_modify_config

5406b5a

+ online part

df83ded

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

Merge branch 'main' into online_modify_config

0c17b7d

noooop marked this pull request as ready for review December 11, 2025 07:57

noooop requested review from ApostaC, ProExpertProg, WoosukKwon, aarnphm, alexm-redhat, heheda12345, hmellor, houseroad, mgoin, njhill, robertgshaw2-redhat, tlrmchlsmth, yewentao256, youkaichao and ywang96 as code owners December 11, 2025 07:57

noooop requested a review from chaunceyjiang as a code owner December 11, 2025 07:57

mypy

70b5c0b

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

lengrongfu reviewed Dec 11, 2025
