You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
@@ -144,8 +144,8 @@ A schedule is defined as a list of tuples, where each tuple consists of:
144
144
In the example above, `add_z3_gather_release` is applied at step 0 to minimize memory usage. After a warmup phase (e.g., after the first few training iterations), additional optimizations such as prefetching and selective unsharding are applied based on profiled memory usage.
145
145
Each optimization pass takes a standardized set of arguments provided by DeepCompile. For details, please refer to the implementation of each pass:
146
146
147
-
-[ZeRO3 (All-gather and reduce-scatter insertion)](https://github.com/deepspeedai/DeepSpeed/blob/tohtana/deepcompile/deepspeed/compile/passes/zero3_compile.py)
0 commit comments