Commit dba4208
feat: Add PRESHARDED LoadFormat for zero-disk P2P RDMA weight loading
Add LoadFormat.PRESHARDED for loading model weights that are already
sharded per TP rank, enabling zero-disk P2P RDMA weight transfers
where each MPI worker receives only its own shard directly into GPU
memory via ModelExpress.
Changes:
- llm_args.py: Add PRESHARDED = 3 to LoadFormat enum
- model_loader.py: PRESHARDED branch with _weights_presharded flag,
publish hook before post_load_weights (auto-detect via MODEL_EXPRESS_URL)
- linear.py: Override tp_size to 1 when _weights_presharded=True
- worker.py: publish_from_worker hook in setup_engine (auto-detect)
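The first three changes can be sketched roughly as follows. This is a minimal, stdlib-only illustration and not the code from this commit: only `PRESHARDED = 3` and the `_weights_presharded` flag name come from the message above. The other enum members and the `local_rows` helper are assumptions made for the example, and weights are modeled as plain lists of rows rather than tensors.

```python
from enum import Enum
from typing import List


class LoadFormat(Enum):
    # Members 0-2 are assumed here for illustration only; the commit
    # message confirms just PRESHARDED = 3.
    AUTO = 0
    DUMMY = 1
    VISION_ONLY = 2
    PRESHARDED = 3


def local_rows(weight: List[int], tp_size: int, tp_rank: int,
               weights_presharded: bool) -> List[int]:
    """Hypothetical helper: return the rows this TP rank should keep.

    When the weights are already sharded per rank, behave as if
    tp_size == 1 and keep the received rows unchanged. Otherwise take
    this rank's contiguous slice of the full weight.
    """
    if weights_presharded:
        # Already this rank's shard: slicing again would drop data.
        return weight
    rows_per_rank = len(weight) // tp_size
    start = tp_rank * rows_per_rank
    return weight[start:start + rows_per_rank]


full_weight = list(range(8))          # stand-in for a full weight matrix
shard_from_disk = local_rows(full_weight, tp_size=4, tp_rank=1,
                             weights_presharded=False)
shard_from_peer = local_rows(shard_from_disk, tp_size=4, tp_rank=1,
                             weights_presharded=True)
```

The point of the override is that a rank which receives its own shard must not slice it a second time. Treating the layer as if `tp_size` were 1 makes the existing slicing logic a no-op.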
Source publishes weights before post_load_weights so targets receive
pre-processed weights and run their own transforms independently.
Auto-detects source role when MODEL_EXPRESS_URL is set and
MODEL_EXPRESS_TARGET is not set.
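A minimal sketch of the role detection and hook ordering described above, assuming nothing about the real ModelExpress client. The environment variable names come from this message. `is_source_role`, `run_load`, and the three callables are hypothetical stand-ins, and treating an empty value as "not set" is an assumption of the sketch.

```python
import os
from typing import Any, Callable, Mapping, Optional


def is_source_role(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when MODEL_EXPRESS_URL is set and MODEL_EXPRESS_TARGET is not.

    Empty strings are treated as unset (an assumption of this sketch).
    """
    env = os.environ if env is None else env
    return bool(env.get("MODEL_EXPRESS_URL")) and not env.get("MODEL_EXPRESS_TARGET")


def run_load(load_weights: Callable[[], Any],
             publish: Callable[[Any], None],
             post_load_weights: Callable[[Any], Any],
             env: Optional[Mapping[str, str]] = None) -> Any:
    """Hypothetical load pipeline: publish before post-load transforms."""
    weights = load_weights()
    if is_source_role(env):
        # Publish first, so targets receive weights that have not yet been
        # through post_load_weights and can run that step themselves.
        publish(weights)
    return post_load_weights(weights)
```

Publishing before the transform step means source and targets each run the post-load transforms once on identical inputs, rather than targets receiving weights that have already been transformed for the source.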
Validated: Kimi K2.5 (TP=8, MoE, nvfp4) on GCP GB200 at 365-509 Gbps.
Signed-off-by: Kavin Krishnan <kavink@nvidia.com>
Made-with: Cursor
1 parent 889b81c commit dba4208
4 files changed
Lines changed: 81 additions & 14 deletions
File tree
- tensorrt_llm
  - _torch
    - modules
    - pyexecutor
  - executor
  - llmapi