SwinUNETR + HyenaND: subquadratic alternative to windowed self-attention #8958
farhadrgh wants to merge 15 commits
Signed-off-by: Farhad Ramezanghorbani <farhadr@nvidia.com>
📝 Walkthrough
Adds HyenaND blocks to MONAI, wires them into
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 7
🧹 Nitpick comments (6)
docs/source/networks.rst (1)
132-148: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Consider documenting `is_nvsubquadratic_available()`.
The function is public API (`__all__`) and lets users detect the optional dependency at runtime. Add an `autofunction` entry near the Hyena block classes:
    `Hyena Utilities`
    ~~~~~~~~~~~~~~~~~
    .. autofunction:: monai.networks.blocks.hyena.is_nvsubquadratic_available
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/source/networks.rst` around lines 132 - 148, Add documentation for the public helper is_nvsubquadratic_available() in the Hyena networks docs, since it is part of the exposed API and users need a runtime way to detect the optional dependency. Update the Hyena section in networks.rst near HyenaMixer and HyenaTransformerBlock by adding a Hyena Utilities subsection with an autofunction entry for monai.networks.blocks.hyena.is_nvsubquadratic_available, keeping it grouped with the other Hyena-related APIs.
CHANGELOG.md (1)
7-11: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Consider adding `is_nvsubquadratic_available()` and `load_from` skip behavior to the changelog.
The entry is accurate but omits two user-visible behaviors:
- `monai.networks.blocks.hyena.is_nvsubquadratic_available()` for runtime optional-dependency detection.
- `SwinUNETR.load_from` now skips Hyena stages with a warning rather than failing.
Both are worth noting for users integrating Hyena conditionally.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@CHANGELOG.md` around lines 7 - 11, The Hyena changelog entry is missing two user-visible behaviors that should be called out for users integrating the optional dependency conditionally. Update the `CHANGELOG.md` “Added” section to mention `monai.networks.blocks.hyena.is_nvsubquadratic_available()` as the runtime availability check, and note that `SwinUNETR.load_from` now skips Hyena stages with a warning instead of failing. Keep the wording concise and tie both items to the existing Hyena additions so readers can find the relevant APIs easily.
monai/networks/blocks/hyena.py (1)
95-135: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Add the required Google-style docstrings.
Several new definitions lack docstrings or Args/Returns/Raises sections. As per path instructions, “Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings.”
Also applies to: 149-177, 189-217, 389-413, 517-548
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@monai/networks/blocks/hyena.py` around lines 95 - 135, Add Google-style docstrings to the newly introduced definitions in hyena.py, especially the forward methods and related helpers referenced by the diff, so each function/class clearly documents its purpose, all parameters/variables, return value, and any raised exceptions. Update the affected symbols in the Hyena block implementations and any other new definitions called out in the review to include the required Args and Returns/Raises sections, matching the repository’s docstring conventions.
Source: Path instructions
tests/networks/blocks/test_hyena_block.py (1)
327-337: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Assert the FFT short-conv wiring.
These tests still pass if `HyenaMixer` ignores `use_fft_short_conv` or `short_conv_fft_chunk_size`, because plain `nn.Conv3d` preserves the same shape. Add a type/config assertion on the constructed short-conv module so the branch in `monai/networks/blocks/hyena.py:274-368` is actually covered. As per path instructions, “Ensure new or modified definitions will be covered by existing or new unit tests.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/networks/blocks/test_hyena_block.py` around lines 327 - 337, The HyenaMixer 3D tests only verify output shape, so they still pass even if the FFT short-conv path is never used. Update the tests around HyenaMixer construction to assert the short-conv module type/config when use_fft_short_conv and short_conv_fft_chunk_size are set, so the branch in HyenaMixer’s short-conv wiring is explicitly exercised and covered by unit tests.
Source: Path instructions
tests/networks/nets/test_hyena_nd_unetr.py (1)
40-95: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Add the missing no-dependency constructor test.
Every constructor-contract case is skipped when `nvsubquadratic` is absent, but `HyenaNDUNETR` documents an `ImportError` path in `monai/networks/nets/hyena_nd_unetr.py:59-165`. Add a `@skipUnless(not HAS_NVSUBQ, ...)` case so that contract stays covered. As per path instructions, “Ensure new or modified definitions will be covered by existing or new unit tests.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/networks/nets/test_hyena_nd_unetr.py` around lines 40 - 95, Add a missing constructor-contract test for the no-dependency path in TestHyenaNDUNETRConstructorContract so coverage still exists when HAS_NVSUBQ is false. Create a new test alongside the existing HyenaNDUNETR constructor tests that is skipped unless nvsubquadratic is absent and asserts the documented ImportError behavior from HyenaNDUNETR/__init__. Keep the focus on the constructor symbols HyenaNDUNETR and TestHyenaNDUNETRConstructorContract, and ensure this path is exercised by a unit test rather than only the dependency-present cases.
Source: Path instructions
monai/networks/nets/swin_unetr.py (1)
331-340: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Add Google-style sections to `load_from`.
This public method now has custom checkpoint-loading behavior, but `weights`, return value, and expected exceptions are not documented.
Proposed docstring update
     def load_from(self, weights):
         """Load pretrained Swin weights into the matching submodules.

         When a stage uses :class:`HyenaTransformerBlock` instead of
         :class:`SwinTransformerBlock`, the per-block ``load_from`` call is
         skipped for that stage and a warning is issued -- HyenaND has a
         different parameter layout and there are no compatible attention
         weights to copy. PatchMerging downsample weights are still loaded for
         all stages (the downsample layer is the same in both code paths).
    +
    +    Args:
    +        weights: Checkpoint mapping containing a ``"state_dict"`` with Swin
    +            pretrained parameter tensors.
    +
    +    Returns:
    +        None.
    +
    +    Raises:
    +        KeyError: If an expected checkpoint key is absent.
    +        RuntimeError: If a checkpoint tensor shape is incompatible.
         """
As per path instructions, “Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@monai/networks/nets/swin_unetr.py` around lines 331 - 340, The public method load_from in SwinUNETR needs a full Google-style docstring update because it now has custom checkpoint-loading behavior without documented parameters, return value, or exceptions. Expand the existing docstring to add Args for weights (and any other inputs used by load_from), a Returns section if it returns a value, and a Raises section for any expected exceptions, while keeping the current summary about Swin and HyenaTransformerBlock loading behavior.
Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/source/installation.md`:
- Around line 263-266: Update the `hyena` extra documentation to point the
`nvsubquadratic` hyperlink at the correct NVIDIA-BioNeMo repository instead of
the dead NVIDIA URL. Locate the `hyena` installation description in the docs and
replace the existing link target so readers using `HyenaNDUNETR`, `HyenaMixer`,
and `HyenaTransformerBlock` are directed to the valid project page.
In `@monai/networks/blocks/hyena.py`:
- Around line 51-56: The availability check in the Hyena module currently
depends only on the LazyConfig import, so it can report nvsubquadratic as
available even when instantiate, Hyena, CKConvND, SIRENKernelND, or
GaussianModulationND fail to import. Update the import setup in hyena.py so each
optional_import contributes a flag, and change is_nvsubquadratic_available() to
require all of those symbols to be present before returning true.
- Around line 95-135: `Hyena.forward` currently hardcodes the FFT crop around
`kernel_shape // 2`, so `DepthwiseFFTConv{2,3}d` ignores its `self.padding`
setting and can produce incorrect outputs for non-default padding or even
kernels. Update the crop logic in `forward()` (including the chunked path) to
use the module’s padding value when building `slices`, and keep the FFT result
aligned with Conv{2,3}d semantics for both output shape and values.
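The padding comment above can be illustrated with a minimal pure-Python 1D sketch. This is not the PR's `DepthwiseFFTConv` code: it assumes the FFT path produces the full linear convolution of length N + K - 1, and it computes that sequence by direct summation only to stay dependency-free. All helper names are hypothetical. Under those assumptions, the crop offset that reproduces Conv-style output is `K - 1 - padding`, which equals `K // 2` only for odd kernels with `padding = K // 2`.

```python
def conv_reference(x, w, padding):
    """Cross-correlation with zero padding (the semantics of a Conv1d layer)."""
    k = len(w)
    xp = [0.0] * padding + list(x) + [0.0] * padding
    return [sum(w[j] * xp[i + j] for j in range(k)) for i in range(len(xp) - k + 1)]


def full_linear_conv(x, w):
    """Full linear convolution of x with the flipped kernel (length N + K - 1).

    An FFT implementation would yield the same sequence; direct summation is
    used here only so the sketch has no dependencies.
    """
    flipped = w[::-1]
    out = [0.0] * (len(x) + len(w) - 1)
    for i, xi in enumerate(x):
        for j, wj in enumerate(flipped):
            out[i + j] += xi * wj
    return out


def crop_like_conv(full, n, k, padding):
    """Crop the full result to match conv_reference (valid for 0 <= padding <= K - 1)."""
    start = k - 1 - padding
    length = n + 2 * padding - k + 1
    return full[start : start + length]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
w_even = [1.0, -1.0, 2.0, 0.5]  # even kernel, K = 4
padding = 2

full = full_linear_conv(x, w_even)
reference = conv_reference(x, w_even, padding)
padding_aware = crop_like_conv(full, len(x), len(w_even), padding)
# Crop hardcoded at K // 2, as the review comment describes.
hardcoded = full[len(w_even) // 2 : len(w_even) // 2 + len(reference)]
```

With this even kernel, `padding_aware` lines up with `reference` while `hardcoded` is shifted by one sample, which is the kind of misalignment the comment warns about.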
In `@monai/networks/nets/swin_unetr.py`:
- Around line 363-367: The warning emitted in `load_from` for
`HyenaTransformerBlock` should set an explicit stacklevel so the caller sees the
warning at their call site instead of inside this helper. Update the existing
warnings.warn call in `swin_unetr.py` to include a stacklevel argument, keeping
the same message and using the `load_from` path where `layer_name` and
`HyenaTransformerBlock` are handled.
- Around line 956-1001: The Hyena path still pays the cost of attention-mask
construction even though HyenaTransformerBlock.forward ignores mask_matrix.
Update the stage forward logic that uses self.use_hyena and compute_mask(...) so
mask generation is skipped entirely for Hyena stages, while preserving the
existing SwinTransformerBlock path for non-Hyena stages. Keep the change
localized to the stage module that initializes self.blocks and dispatches to the
block forward calls.
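For the `stacklevel` comment on `load_from` above, a minimal standard-library sketch follows. The function name, stage names, and message are hypothetical stand-ins rather than MONAI code; the only real API used is `warnings.warn`, whose `stacklevel=2` argument attributes the warning to the caller of the function that emits it.

```python
import warnings


def load_from_sketch(stage_uses_hyena):
    """Hypothetical loader that skips stages built from Hyena blocks."""
    for layer_name, uses_hyena in stage_uses_hyena.items():
        if uses_hyena:
            warnings.warn(
                f"Skipping pretrained attention weights for {layer_name}: stage uses Hyena blocks.",
                UserWarning,
                stacklevel=2,  # report the caller's line, not this helper's
            )


# Record warnings instead of printing them so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    load_from_sketch({"layers1": True, "layers2": False})

skipped_messages = [str(w.message) for w in caught]
```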
In `@tests/networks/nets/test_hyena_nd_unetr.py`:
- Around line 76-84: The duplicate hyena_stages kwarg assertion in
test_duplicate_hyena_stages_kwarg_rejected is unreachable because Python raises
TypeError at the call site before HyenaNDUNETR.__init__ can inspect kwargs.
Remove this test or rewrite it to cover a reachable duplicate-argument path, and
if keeping a kwargs validation check in HyenaNDUNETR.__init__, add a separate
test that passes hyena_stages only through kwargs so the branch can actually
execute.
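The unreachable-branch point above is a general Python behavior and can be reproduced without MONAI. In this sketch, `build_model` and `build_variant` are hypothetical stand-ins, not the PR's constructors. Passing the same keyword both explicitly and through `**` unpacking raises `TypeError` while the call is being bound, so a check inside the function body never runs. A check is only reachable when the name arrives solely through `**kwargs`.

```python
def build_model(hyena_stages=None, **kwargs):
    """Hypothetical constructor with an in-body duplicate check."""
    if "hyena_stages" in kwargs:
        # Never reached: a named parameter absorbs the keyword, so it cannot
        # also appear in **kwargs.
        raise ValueError("hyena_stages passed twice")
    return hyena_stages


try:
    build_model(hyena_stages=(True,), **{"hyena_stages": (False,)})
    outcome = "constructed"
except TypeError as exc:
    outcome = f"TypeError: {exc}"
except ValueError as exc:
    outcome = f"ValueError: {exc}"


def build_variant(**kwargs):
    """Hypothetical factory that sets hyena_stages itself.

    Here the name can only arrive through **kwargs, so the check is reachable.
    """
    if "hyena_stages" in kwargs:
        raise ValueError("hyena_stages is set by the variant; do not pass it")
    return build_model(hyena_stages=(True, True, True, True), **kwargs)
```

A test that targets the validation branch would call something shaped like `build_variant(hyena_stages=...)` and expect the `ValueError`.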
In `@tests/networks/nets/test_swin_unetr.py`:
- Around line 170-187: The golden check in test_default_path_unchanged is too
strict because it hashes raw CUDA bytes, so replace the SHA256 comparison with a
tolerance-based validation such as torch.testing.assert_close against a stored
reference tensor or another numeric invariant. Keep the test focused on the
SwinUNETR default forward path and preserve the existing setup in
test_default_path_unchanged, but avoid byte-level equality that can fail from
harmless GPU/PyTorch drift.
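The difference between a byte-level golden hash and a tolerance check can be shown with a dependency-free sketch. It uses `math.isclose` as a standard-library stand-in for `torch.testing.assert_close`, and the values and the size of the drift are made up for illustration.

```python
import hashlib
import math
import struct


def digest(values):
    """SHA256 over raw float32 bytes, as a byte-level golden check would compute."""
    return hashlib.sha256(struct.pack(f"{len(values)}f", *values)).hexdigest()


def all_close(actual, expected, rel_tol=1e-5, abs_tol=1e-6):
    """Element-wise tolerance comparison."""
    return len(actual) == len(expected) and all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol) for a, e in zip(actual, expected)
    )


reference = [0.125, -1.5, 3.25]
drifted = [0.125 + 1e-7, -1.5, 3.25]  # illustrative drift of a few float32 ulps

hashes_match = digest(reference) == digest(drifted)
within_tolerance = all_close(drifted, reference)
```

Here the hashes differ even though the values agree to well within the tolerance, which is why the comment prefers a numeric comparison for GPU outputs.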
📒 Files selected for processing (14)
- CHANGELOG.md
- docs/source/installation.md
- docs/source/networks.rst
- monai/networks/blocks/__init__.py
- monai/networks/blocks/hyena.py
- monai/networks/nets/__init__.py
- monai/networks/nets/hyena_nd_unetr.py
- monai/networks/nets/swin_unetr.py
- requirements-dev.txt
- setup.cfg
- tests/min_tests.py
- tests/networks/blocks/test_hyena_block.py
- tests/networks/nets/test_hyena_nd_unetr.py
- tests/networks/nets/test_swin_unetr.py
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/cicd_tests.yml:
- Around line 240-266: The hyena-dep job is checking out branch-controlled code
with the default checkout credential behavior, which can leave the token
persisted in the repo config. Update the workflow to explicitly restrict the
job’s permissions to contents: read and configure actions/checkout so it does
not persist credentials; make this change in the hyena-dep job around the
existing actions/checkout step.
📒 Files selected for processing (5)
- .github/workflows/cicd_tests.yml
- .github/workflows/pythonapp-hyena-gpu.yml
- requirements-dev.txt
- setup.cfg
- tests/networks/nets/test_hyena_nd_unetr.py
✅ Files skipped from review due to trivial changes (1)
- setup.cfg
🚧 Files skipped from review as they are similar to previous changes (2)
- requirements-dev.txt
- tests/networks/nets/test_hyena_nd_unetr.py
…n CI

Two classes of fix, both unblocking checks that previously died at the nvsubquadratic install step and so never ran:

CI install (hyena-dep + hyena-gpu jobs):
* nvsubquadratic 0.1.0 over-declares install_requires (megatron-core, subquadratic-ops-torch-cu12 [CUDA sdist], nvidia-dali, wandb, datasets, pytorch-lightning, ...), none of which the HyenaND operators import at runtime (only torch + einops + omegaconf). Install with `pip install --no-deps nvsubquadratic` + omegaconf; drop the bare dep from requirements-dev.txt (kept in setup.cfg's [hyena] extra).

codeformat (latent, now that the 3.10 install succeeds and the job runs):
* isort: move the hyena_nd_unetr import to its alphabetical slot in nets/__init__.py; drop a double blank line in blocks/hyena.py.
* black: reflow 5 HyenaND source/test files to MONAI's 120-char width.

No behavior change: 57 pass / 28 skip across the three Hyena test files with CUDA hidden.

Signed-off-by: Farhad Ramezanghorbani <farhadr@nvidia.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pending on v0.1.1 release NVIDIA-BioNeMo/nvSubquadratic#136
SwinUNETR + HyenaND: subquadratic alternative to windowed self-attention
Upstreams the four-variant PanTS matrix from NeurIPS 2026 paper id 26539
(Native Multi-Dimensional Subquadratic Operators via Input Dependent Long
Convolutions). HyenaND replaces windowed self-attention with a gated long
convolution evaluated via FFT, which gives a global receptive field at
O(N log N) cost instead of attention's O(N²) cost within each window.
Optional dependency: nvsubquadratic
The Hyena operators are backed by `nvsubquadratic` (PyPI), NVIDIA's PyTorch-native library of subquadratic attention alternatives. It is optional: gated via `optional_import` and installed through the new `hyena` extra:
`pip install 'monai[hyena]'`
`monai` core, `requirements-dev.txt`, and `pip install monai[all]` are unaffected. Without the package, `import monai` and `SwinUNETR(use_hyena=False)` behave exactly as before; the Hyena classes raise a clear `ImportError` only when constructed, and the Hyena tests skip via `@skipUnless(is_nvsubquadratic_available(), ...)`.
Public surface
- `monai.networks.blocks`: `HyenaMixer`, `HyenaTransformerBlock`, `DepthwiseFFTConv{2,3}d`.
- `monai.networks.nets.SwinUNETR`: new `use_hyena` / `hyena_stages` / `hyena_*` kwargs threaded through `SwinTransformer` → `BasicLayer`.
- `monai.networks.nets.HyenaNDUNETR`: thin `SwinUNETR` subclass with `from_paper_variant("HHHH" | "HAHA" | "HHAA")`.
- `[hyena]` extras_require → `pip install 'monai[hyena]'` (`nvsubquadratic` 0.1.0 on PyPI).
nvsubquadratic is gated through optional_import; SwinUNETR(use_hyena=False) never imports it.
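The gating pattern described above, combined with the stricter all-symbols availability check raised in review, might be sketched as follows. This is a standard-library stand-in, not MONAI's code: `try_import` imitates the (object, flag) pair that `optional_import` returns, `all_available` and `GatedBlock` are hypothetical names, and stdlib modules are used in place of the real `nvsubquadratic` symbols because their import paths are not given here.

```python
import importlib


def try_import(module_name, attr=None):
    """Return (object_or_None, available_flag) without raising at import time."""
    try:
        module = importlib.import_module(module_name)
        return (getattr(module, attr) if attr else module), True
    except (ImportError, AttributeError):
        return None, False


def all_available(required):
    """True only if every (module, attribute) pair can be imported."""
    return all(try_import(module_name, attr)[1] for module_name, attr in required)


class GatedBlock:
    """Hypothetical block that fails at construction time, not at import time."""

    def __init__(self, required):
        if not all_available(required):
            raise ImportError("optional dependency missing; install the extra to use this block")
        self.required = list(required)


# Stand-in symbol lists: one fully importable, one with a missing attribute.
present = all_available([("json", "dumps"), ("math", "sqrt")])
partial = all_available([("json", "dumps"), ("json", "no_such_symbol")])
```

Requiring every symbol keeps the availability flag from reporting success when only part of the optional package imports cleanly, and a test suite can pass the same flag to `unittest.skipUnless`.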
Tests
72 new tests across test_hyena_block.py (40), test_swin_unetr.py (15 Hyena classes), test_hyena_nd_unetr.py (17).
Types of changes
- `./runtests.sh -f -u --net --coverage`
- `./runtests.sh --quick --unittests --disttests`
- `make html` command in the `docs/` folder