feat: Support lora in dtensor grpo workflow[3/3]: async vllm #1752
base: ruit/lora_grpo_sync_non_colocated
What does this PR do?
Adds support for the async vLLM engine in the DTensor LoRA GRPO workflow.
TODOs
Issues
[3/3] of #1597
closes #1597
Usage
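A minimal sketch of how this might be enabled, assuming the usual YAML-override style of the GRPO example configs; the key names and values below (`policy.dtensor_cfg.lora_cfg`, `policy.generation.vllm_cfg.async_engine`, the example ranks) are illustrative assumptions, not confirmed by this PR.

```yaml
# Hypothetical GRPO config overrides -- key names are illustrative.
policy:
  dtensor_cfg:
    enabled: true
    lora_cfg:             # LoRA support from parts [1/3] and [2/3] of #1597 (assumed key)
      enabled: true
      lora_rank: 32       # example rank
      lora_alpha: 64      # example scaling factor
  generation:
    backend: vllm
    vllm_cfg:
      async_engine: true  # this PR: route generation through the async vLLM engine
```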
Result
Async mode verified with:
- Qwen/Qwen3-0.6B
- Llama-3.2-3B-Instruct
- Llama-3.1-8B
Before your PR is "Ready for review"
Pre checks:
Additional Information