Add Cosmos3-Super inference support to transfer cookbook #206

Open
trungtpham wants to merge 1 commit into NVIDIA:main from trungtpham:feature/cosmos3-transfer-super-clean

@trungtpham

Contributor
  • Update notebook to support both Nano (single GPU) and Super (multi-GPU, 32B) via COSMOS3_MODEL env-var; consolidate §9-13 inference cells with if/else bash logic that switches launcher (python vs torchrun) and checkpoint path accordingly.
  • Add user-editable config cell for HF_TOKEN, cache root, and output root; auto-detect available GPUs when COSMOS3_NUM_GPUS is unset.
  • Route outputs to model-namespaced sub-dirs (e.g. outputs/.../Cosmos3-Super/) to prevent Nano and Super results from overwriting each other.
  • Update preview_helpers.py to resolve output path from COSMOS3_MODEL.
  • Update README with Super quickstart commands, model comparison table, and notebook usage instructions.
  • Add .gitignore to exclude generated previews and outputs/.
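
The launcher switch and GPU auto-detection described above could be sketched in Python roughly as follows. This is only an illustration, not the notebook's actual bash cells: the function names, the `infer.py` script name, and the use of `nvidia-smi -L` for detection are assumptions made for the example. Only `COSMOS3_NUM_GPUS`, the two model names, and the python vs torchrun split come from the description.

```python
import subprocess

# Model names come from the PR description; every function name below is hypothetical.
SINGLE_GPU_MODEL = "Cosmos3-Nano"
MULTI_GPU_MODEL = "Cosmos3-Super"


def detect_gpu_count():
    """Count GPUs via `nvidia-smi -L`, which prints one 'GPU N: ...' line per device."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return 0  # no driver or no nvidia-smi on PATH
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


def resolve_num_gpus(env, detected):
    """Prefer an explicit COSMOS3_NUM_GPUS; fall back to the detected count when unset."""
    raw = env.get("COSMOS3_NUM_GPUS", "").strip()
    return int(raw) if raw else detected


def build_launch_command(model, num_gpus, script, script_args=()):
    """Return the argv list: plain python for Nano, torchrun for Super."""
    if model == SINGLE_GPU_MODEL:
        return ["python", script, *script_args]
    if model == MULTI_GPU_MODEL:
        if num_gpus < 1:
            raise ValueError("Cosmos3-Super needs at least one visible GPU")
        return ["torchrun", f"--nproc_per_node={num_gpus}", script, *script_args]
    raise ValueError(f"unknown model: {model!r}")
```

For example, `build_launch_command("Cosmos3-Super", resolve_num_gpus(os.environ, detect_gpu_count()), "infer.py")` (after `import os`) would yield a torchrun command sized to the machine, while the Nano branch ignores the GPU count entirely.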

@trungtpham force-pushed the feature/cosmos3-transfer-super-clean branch 8 times, most recently from c91a838 to 4635682, on June 11, 2026 20:21
Restructure the transfer notebook to follow the audiovisual cookbook
pattern — dedicated sections for each model instead of a single
if/else-gated flag:

- §9–§13 Cosmos3-Nano: python launcher, latency preset, single GPU
- §14–§18 Cosmos3-Super: torchrun launcher, throughput preset, multi-GPU

Other changes:
- COSMOS3_NUM_GPUS auto-detects available GPUs (no hardcoded default).
- CUDA_VISIBLE_DEVICES defaults to all detected GPUs so both model
  sections work without reconfiguration.
- Each preview cell passes model= explicitly (no env-var dependency).
- Update preview_helpers.py to route outputs to model-namespaced dirs
  (Cosmos3-Nano/ vs Cosmos3-Super/).
- Update README with Super quickstart, model comparison table, and
  notebook instructions.
- Add .gitignore to exclude generated previews and outputs/.
- Clear cell outputs.
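
A minimal sketch of the model-namespaced routing, assuming the model name is simply appended as the last path component under whatever base directory the caller already uses. `namespaced_output_dir` and `KNOWN_MODELS` are hypothetical names, and this is not the code in preview_helpers.py. Taking `model` as an explicit argument mirrors the "passes model= explicitly" change, so the result does not depend on any environment variable.

```python
from pathlib import Path

# The two directory names come from the commit message; the helper itself is hypothetical.
KNOWN_MODELS = ("Cosmos3-Nano", "Cosmos3-Super")


def namespaced_output_dir(base_dir, model):
    """Return base_dir/<model> so Nano and Super results never share a directory."""
    if model not in KNOWN_MODELS:
        raise ValueError(f"model must be one of {KNOWN_MODELS}, got {model!r}")
    return Path(base_dir) / model
```

Because the two models always resolve to different directories under the same base, a Super run cannot overwrite Nano results produced from the same input.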
@trungtpham force-pushed the feature/cosmos3-transfer-super-clean branch from 4635682 to 925d94c on June 11, 2026 20:23

@lfengad left a comment

Collaborator

overall LGTM


*model* selects which output directory to read (``Cosmos3-Nano`` uses
``<output_root>/<control>/…``; ``Cosmos3-Super`` uses
``<output_root>/<control>_super/…``). Defaults to the

Collaborator

Is this name consistent with the name in the code? It seems no <control>_super pattern appears in the paths used in the code.

@trungtpham

Contributor Author

@lfengad, please help merge the PR if there are no more concerns. Thanks.

@rwagwani

  • Setup: Lepton node with 8 x A100 80GB.
  • Merged PR #206 (Add Cosmos3-Super inference support to transfer cookbook) locally with https://github.com/NVIDIA/cosmos-framework/ and checked Super inference with the default samples and commands for each of the 5 provided modalities individually, then with inputs not present in the repo for a few control nets.
  • We did not see bug 6304872 after the above fix was merged, and Super runs without error. The bug is closed.
  • Attempt 1:
    • Edge and blur worked as expected, but seg, depth, and WSM generated completely black videos. QA input videos also produced the same black video. A rerun of the previously working edge and vis gave an unexpected result: edge generated a black video while vis worked.
    • HTML: http://picasso-tests-vm.nvidia.com:8236/
  • Attempt 2:
    • The black-video outputs were resolved after flushing the stale torch-compile (inductor + triton) caches and rerunning on a clean 8-GPU setup with re-verified Cosmos3-Super weights (seed 2026). The default inputs for all 5 control nets and 2 user input videos (depth and seg) worked. On-the-fly conversion of RGB video to edge video failed with the error “Video data is not in the range [-1, 1]. get data range [nan, nan]”. This error was seen in both attempts. We will check a few more videos early next week and file a bug if the issue shows up across them.
    • HTML: http://picasso-tests-vm.nvidia.com:8237/
  • We will start a detailed test run with QA input videos next week.
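
For anyone reproducing the cache flush from Attempt 2, a rough sketch is below. It is not the exact procedure used in the run above. It assumes the caches live wherever `TORCHINDUCTOR_CACHE_DIR` and `TRITON_CACHE_DIR` point, and otherwise falls back to what we understand to be the usual defaults (`<tmp>/torchinductor_<user>` and `~/.triton/cache`); please verify those paths on your node before deleting anything. The function names are hypothetical.

```python
import getpass
import os
import shutil
import tempfile
from pathlib import Path


def compile_cache_dirs(env=None):
    """Resolve the inductor and triton cache dirs (fallback defaults are an assumption)."""
    env = os.environ if env is None else env
    inductor = env.get("TORCHINDUCTOR_CACHE_DIR") or os.path.join(
        tempfile.gettempdir(), f"torchinductor_{getpass.getuser()}"
    )
    triton = env.get("TRITON_CACHE_DIR") or os.path.join(
        os.path.expanduser("~"), ".triton", "cache"
    )
    return [Path(inductor), Path(triton)]


def flush_compile_caches(env=None, dry_run=True):
    """Return the cache dirs that exist; delete them only when dry_run is False."""
    found = []
    for cache_dir in compile_cache_dirs(env):
        if cache_dir.is_dir():
            if not dry_run:
                shutil.rmtree(cache_dir)
            found.append(cache_dir)
    return found
```

Calling `flush_compile_caches()` first with the default `dry_run=True` only lists what would be removed; `flush_compile_caches(dry_run=False)` then deletes those directories so the next run recompiles from scratch.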
