Skip to content

Conversation

@RolaoDenthu
Copy link

@RolaoDenthu RolaoDenthu commented Dec 21, 2025

What does this PR do ?

Add comprehensive test coverage for SGLang generation backend, including functional tests, unit tests, and nightly tests.

  • Functional Test (tests/functional/grpo_sglang.sh): Quick validation of SGLang-based GRPO training
  • Unit Tests (tests/unit/models/generation/test_sglang_generation.py): unit tests covering:
    • Basic configuration validation
    • Policy generation and tensor parallelism
    • Worker seed behavior for RLHF diversity
    • HTTP server direct API access
    • Weight updates with DTensor policy (colocated mode)
    • Prefix cache reset after weight updates
  • Nightly Test (tests/test_suites/llm/grpo-qwen3-0.6b-1n8g-sglang.sh): End-to-end convergence test for SGLang backend

Usage

  • You can potentially add a usage example below
# Run functional test
uv add coverage
bash tests/functional/grpo_sglang.sh

# Run unit tests
uv sync --extra sglang --group test
uv run python -m pytest tests/unit/models/generation/test_sglang_generation.py -v --sglang-only

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Summary by CodeRabbit

Release Notes

  • New Features

    • Distributed generation engine using SGLang backend with HTTP weight streaming and multi-GPU support.
  • Configuration

    • New YAML configuration templates for SGLang-based experiments with customizable generation parameters.
  • Tests

    • Comprehensive test coverage for SGLang generation, including tensor parallelism, batching, and dynamic weight updates.

✏️ Tip: You can customize this high-level summary in your review settings.

PrinsYin and others added 30 commits December 6, 2025 21:12
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
…a server

Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
…p servers

Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
sglang: add 1B example
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Ryan <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
- Convert SGLangConfig from regular class to TypedDict inheriting GenerationConfig
- Align structure with VllmConfig pattern for consistency
- Mark all fields as NotRequired for backward compatibility
- Add sglang_kwargs field for additional ServerArgs parameters
- Add type casting in grpo.py for type safety

This maintains backward compatibility while aligning with the existing
generation config structure pattern.

Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Signed-off-by: Zhuoran Yin <yzr1914001753@gmail.com>
Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
Signed-off-by: Night <32424487+PrinsYin@users.noreply.github.com>
@guyueh1 guyueh1 added the CI:L2 Run doctests, unit tests, functional tests, and convergence tests label Jan 18, 2026
Signed-off-by: RolaoDenthu <xinyis10@illinois.edu>
@guyueh1 guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 19, 2026
Signed-off-by: RolaoDenthu <xinyis10@illinois.edu>
@guyueh1 guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 19, 2026
Signed-off-by: RolaoDenthu <xinyis10@illinois.edu>
@guyueh1 guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 19, 2026
@chtruong814 chtruong814 added the needs-follow-up Issue needs follow-up label Jan 19, 2026
Signed-off-by: RolaoDenthu <xinyis10@illinois.edu>
@github-actions
Copy link

⚠️ File Consistency Check

Check based on commit: d037f71 (PR #1674 from add-tests)

⚠️ DTensor Policy Worker Synchronization Warning

The file nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py was modified in this PR, but nemo_rl/models/policy/workers/dtensor_policy_worker.py was not updated.

Why this matters:
These files contain related DTensor policy worker implementations that should be kept synchronized to ensure consistency across different versions.

Action required:

  • Please review if the changes in nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py should also be applied to nemo_rl/models/policy/workers/dtensor_policy_worker.py
  • Update nemo_rl/models/policy/workers/dtensor_policy_worker.py if necessary to maintain consistency
  • If the files are intentionally different, please add a comment in the PR explaining why

Files to check:

  • Modified: nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py
  • Not modified: nemo_rl/models/policy/workers/dtensor_policy_worker.py

This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.

@guyueh1 guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 19, 2026
@RolaoDenthu
Copy link
Author

⚠️ File Consistency Check

Check based on commit: d037f71 (PR #1674 from add-tests)

⚠️ DTensor Policy Worker Synchronization Warning

The file nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py was modified in this PR, but nemo_rl/models/policy/workers/dtensor_policy_worker.py was not updated.

Why this matters: These files contain related DTensor policy worker implementations that should be kept synchronized to ensure consistency across different versions.

Action required:

  • Please review if the changes in nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py should also be applied to nemo_rl/models/policy/workers/dtensor_policy_worker.py
  • Update nemo_rl/models/policy/workers/dtensor_policy_worker.py if necessary to maintain consistency
  • If the files are intentionally different, please add a comment in the PR explaining why

Files to check:

  • Modified: nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py
  • Not modified: nemo_rl/models/policy/workers/dtensor_policy_worker.py

This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.

SGLang supports the weight update function only for DTensor v2, while the original DTensor worker does not. Therefore, this change is intentionally applied only to dtensor_policy_worker_v2.py.

@guyueh1
Copy link
Contributor

guyueh1 commented Jan 20, 2026

⚠️ File Consistency Check

Check based on commit: d037f71 (PR #1674 from add-tests)

⚠️ DTensor Policy Worker Synchronization Warning

The file nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py was modified in this PR, but nemo_rl/models/policy/workers/dtensor_policy_worker.py was not updated.
Why this matters: These files contain related DTensor policy worker implementations that should be kept synchronized to ensure consistency across different versions.
Action required:

  • Please review if the changes in nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py should also be applied to nemo_rl/models/policy/workers/dtensor_policy_worker.py
  • Update nemo_rl/models/policy/workers/dtensor_policy_worker.py if necessary to maintain consistency
  • If the files are intentionally different, please add a comment in the PR explaining why

Files to check:

  • Modified: nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py
  • Not modified: nemo_rl/models/policy/workers/dtensor_policy_worker.py

This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.

SGLang supports the weight update function only for DTensor v2, while the original DTensor worker does not. Therefore, this change is intentionally applied only to dtensor_policy_worker_v2.py.

I think this is fine to ignore, the API is defined in the base worker as "not implemented", so there is no risk that calling this method with dtensor (v1) object will cause a crash, but it will be caught by a not implemented error.

@chtruong814 chtruong814 removed the needs-follow-up Issue needs follow-up label Jan 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI:L2 Run doctests, unit tests, functional tests, and convergence tests community-request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants