feat: Add SGLang rollout backend and tests #1674

RolaoDenthu · 2025-12-21T02:58:50Z

What does this PR do?

Add comprehensive test coverage for SGLang generation backend, including functional tests, unit tests, and nightly tests.

- Functional Test (tests/functional/grpo_sglang.sh): Quick validation of SGLang-based GRPO training
- Unit Tests (tests/unit/models/generation/test_sglang_generation.py): unit tests covering:
  - Basic configuration validation
  - Policy generation and tensor parallelism
  - Worker seed behavior for RLHF diversity
  - HTTP server direct API access
  - Weight updates with DTensor policy (colocated mode)
  - Prefix cache reset after weight updates
- Nightly Test (tests/test_suites/llm/grpo-qwen3-0.6b-1n8g-sglang.sh): End-to-end convergence test for SGLang backend

Usage

# Run functional test
uv add coverage
bash tests/functional/grpo_sglang.sh

# Run unit tests
uv sync --extra sglang --group test
uv run python -m pytest tests/unit/models/generation/test_sglang_generation.py -v --sglang-only

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Summary by CodeRabbit

Release Notes

New Features
- Distributed generation engine using SGLang backend with HTTP weight streaming and multi-GPU support.
Configuration
- New YAML configuration templates for SGLang-based experiments with customizable generation parameters.
Tests
- Comprehensive test coverage for SGLang generation, including tensor parallelism, batching, and dynamic weight updates.

- Convert SGLangConfig from regular class to TypedDict inheriting GenerationConfig
- Align structure with VllmConfig pattern for consistency
- Mark all fields as NotRequired for backward compatibility
- Add sglang_kwargs field for additional ServerArgs parameters
- Add type casting in grpo.py for type safety

This maintains backward compatibility while aligning with the existing generation config structure pattern.

Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
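For readers unfamiliar with the pattern this commit message describes, here is a minimal sketch of a TypedDict config that inherits a base generation config and marks its own fields as NotRequired. Only the names SGLangConfig, GenerationConfig, and sglang_kwargs come from the commit message; every other field (backend, max_new_tokens, mem_fraction) is a hypothetical placeholder, and the real definitions in the repository may differ.

```python
from typing import Any, cast

try:  # NotRequired lives in typing from Python 3.11 onward
    from typing import NotRequired, TypedDict
except ImportError:  # older interpreters: assumes typing_extensions is installed
    from typing_extensions import NotRequired, TypedDict


class GenerationConfig(TypedDict):
    """Stand-in for the shared base config; both fields are hypothetical."""

    backend: str
    max_new_tokens: int


class SGLangConfig(GenerationConfig):
    """Illustrative SGLang config: every added field is optional."""

    mem_fraction: NotRequired[float]  # hypothetical field name
    sglang_kwargs: NotRequired[dict[str, Any]]  # extra ServerArgs parameters


# A config dict that omits the optional keys is still a valid SGLangConfig,
# which is what keeps older configs working.
raw = {"backend": "sglang", "max_new_tokens": 128}
cfg = cast(SGLangConfig, raw)  # cast only informs the type checker
extra = cfg.get("sglang_kwargs", {})  # optional keys are read with a default
```

Because TypedDict is a plain dict at runtime, the cast performs no validation; it only tells a static type checker which shape to expect.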

github-actions · 2026-01-19T23:29:55Z

⚠️ File Consistency Check

Check based on commit: d037f71 (PR #1674 from add-tests)

⚠️ DTensor Policy Worker Synchronization Warning

The file nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py was modified in this PR, but nemo_rl/models/policy/workers/dtensor_policy_worker.py was not updated.

Why this matters:
These files contain related DTensor policy worker implementations that should be kept synchronized to ensure consistency across different versions.

Action required:

Please review if the changes in nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py should also be applied to nemo_rl/models/policy/workers/dtensor_policy_worker.py
Update nemo_rl/models/policy/workers/dtensor_policy_worker.py if necessary to maintain consistency
If the files are intentionally different, please add a comment in the PR explaining why

Files to check:

Modified: nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py
Not modified: nemo_rl/models/policy/workers/dtensor_policy_worker.py

This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.

RolaoDenthu · 2026-01-20T03:34:39Z

SGLang supports the weight update function only for DTensor v2, while the original DTensor worker does not. Therefore, this change is intentionally applied only to dtensor_policy_worker_v2.py.

guyueh1 · 2026-01-20T03:37:55Z

I think this is fine to ignore. The API is defined in the base worker as "not implemented", so calling this method on a DTensor (v1) object cannot fail in an unexpected way; it will simply raise a not-implemented error.
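A minimal sketch of the pattern described here, assuming a base class whose default method raises NotImplementedError and a v2 worker that overrides it. All class and method names below are hypothetical stand-ins, not the actual nemo_rl API.

```python
class BasePolicyWorker:
    """Hypothetical base worker: the weight-update hook is unimplemented by default."""

    def stream_weights_to_server(self) -> str:
        raise NotImplementedError(
            f"{type(self).__name__} does not support SGLang weight updates"
        )


class DTensorWorkerV1(BasePolicyWorker):
    """Stand-in for the v1 worker: inherits the unimplemented default."""


class DTensorWorkerV2(BasePolicyWorker):
    """Stand-in for the v2 worker: overrides the hook."""

    def stream_weights_to_server(self) -> str:
        return "weights sent"


def try_update(worker: BasePolicyWorker) -> str:
    """Call the hook and report an unsupported worker instead of crashing."""
    try:
        return worker.stream_weights_to_server()
    except NotImplementedError as err:
        return f"unsupported: {err}"
```

With this layout, a v1 worker fails with an explicit, catchable error at the call site rather than with an AttributeError or silently stale weights.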

PrinsYin and others added 30 commits December 6, 2025 21:12

sglang support:initial commit

d9cf489

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

sglang:manually set cuda visible to let localran=0 to manage gpus of …

3eace5f

…a server Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

sglang: add sglang setup in grpo.py, add find available port to set u…

6fbbbb7

…p servers Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

sglang: add shutdown

242612c

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

sglang server: fix gpu allocation when tp =1

a3d8ad6

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

generate only first request

88971e3

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

fix : choose the correct gpu using base gpu id

db8b07b

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

asyncio to roolout all saples

dd0e54f

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

fix new event loop for rollout

21c54e3

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

added mem_fraction

5e24fab

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

modified build_sampling_paras and stop token handling

50189a9

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

temp: prevent server overlaod with semaphore

ec35b6b

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

sglang: refactor, move async loop position

f099caa

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

sglang: fix total length in generate

a03eba8

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

sglang: env setup

e08cfd6

sglang: add 1B example Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

from tensor:

ccc66f6

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

sglang refit: fix sglang import

2ce928b

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

fix: match fsdp ranks correctly with sglang

4aa1e74

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

flush cache before update begins

9098077

Signed-off-by: Ryan <yzr1914001753@gmail.com> Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

Fix SGLang compatibility: add hasattr checks for vLLM-specific methods

9900a33

Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

sglang: modified config (increase mem_fration, enable wandb)

5cb78e3

Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

refactor(grpo): extract init logic for generation backends

03d9d0c

Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

refactor: generalize logger metrics for all generation backends

f1c26dd

Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

refactor sglang config loading to make it consistent with other backendw

255dcc6

Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

resolved ai comments

ee01f91

Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

changed print to using loging

e25e573

Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>

Merge branch 'main' into sglang_server

e93699f

Update nemo_rl/models/generation/sglang/sglang_worker.py

85d6a92

Co-authored-by: Terry Kong <terrycurtiskong@gmail.com> Signed-off-by: Night <32424487+PrinsYin@users.noreply.github.com>

Merge branch 'main' into sglang_server

be1ae27

guyueh1 added the CI:L2 Run doctests, unit tests, functional tests, and convergence tests label Jan 18, 2026

guyueh1 had a problem deploying to nemo-ci January 18, 2026 22:32 — with GitHub Actions Failure

update uv.lock

9118b1d

Signed-off-by: RolaoDenthu <xinyis10@illinois.edu>

guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 19, 2026

guyueh1 temporarily deployed to nemo-ci January 19, 2026 01:11 — with GitHub Actions Inactive

guyueh1 temporarily deployed to nemo-ci January 19, 2026 03:43 — with GitHub Actions Inactive

guyueh1 temporarily deployed to nemo-ci January 19, 2026 05:37 — with GitHub Actions Inactive

add more tests

556c42c

Signed-off-by: RolaoDenthu <xinyis10@illinois.edu>

guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 19, 2026

guyueh1 temporarily deployed to nemo-ci January 19, 2026 18:13 — with GitHub Actions Inactive

guyueh1 temporarily deployed to nemo-ci January 19, 2026 18:17 — with GitHub Actions Inactive

add copyright

72b8fd1

Signed-off-by: RolaoDenthu <xinyis10@illinois.edu>

guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 19, 2026

guyueh1 temporarily deployed to nemo-ci January 19, 2026 20:47 — with GitHub Actions Inactive

guyueh1 temporarily deployed to nemo-ci January 19, 2026 20:50 — with GitHub Actions Inactive

chtruong814 added the needs-follow-up Issue needs follow-up label Jan 19, 2026

fix

d037f71

Signed-off-by: RolaoDenthu <xinyis10@illinois.edu>

guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 19, 2026

guyueh1 temporarily deployed to nemo-ci January 19, 2026 23:37 — with GitHub Actions Inactive

guyueh1 temporarily deployed to nemo-ci January 19, 2026 23:55 — with GitHub Actions Inactive

guyueh1 temporarily deployed to nemo-ci January 20, 2026 01:59 — with GitHub Actions Inactive

guyueh1 approved these changes Jan 20, 2026

chtruong814 removed the needs-follow-up Issue needs follow-up label Jan 20, 2026

5 participants
