Add GPU-accelerated dcm image decoding#8954
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds env-driven DICOM reader selection, an nvImageCodec-backed pydicom reader and plugin wrapper, SHA-256-based cache hashing, and matching docs, requirements, and tests. Estimated code review effort: 4 (Complex) | ~60 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
monai/utils/misc.py (1)
577-583: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winConsolidate the DICOM reader env lookup
MONAIEnvVars.dicom_reader()has no callers, whileget_preferred_dicom_reader_key()readsMONAI_DICOM_READERdirectly and applies its own fallback/warning path. Drop the duplicate accessor or make the reader-key lookup use it so the behavior lives in one place.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@monai/utils/misc.py` around lines 577 - 583, The DICOM reader environment lookup is duplicated between MONAIEnvVars.dicom_reader() and get_preferred_dicom_reader_key(), so consolidate the logic in one place. Update get_preferred_dicom_reader_key() to use MONAIEnvVars.dicom_reader() for reading MONAI_DICOM_READER and handling the default, then keep any warning/fallback behavior there. If dicom_reader() is not needed elsewhere, remove the redundant accessor to avoid drift.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@monai/data/image_reader.py`:
- Around line 1150-1156: verify_suffix in ImageReader currently blocks DICOM
selection whenever _nvimgcodec_available is false, which prevents the documented
pydicom fallback from ever being used. Update verify_suffix to accept DICOM
paths based on is_dicom_path(filename) and has_pydicom alone, without gating on
_nvimgcodec_available, so ImageReader can still be selected and decode through
the pydicom fallback path when nvImageCodec is unavailable.
In `@tests/data/test_nvimgcodec_pydicom_reader.py`:
- Around line 49-54: The current test coverage in
test_get_default_reader_registration_order only checks the pydicom override path
and misses the failing nvimgcodec fallback path. Add a test around
get_default_reader_registration_order and LoadImage auto-selection that sets
MONAI_DICOM_READER=nvimgcodec while simulating is_nvimgcodec_available() as
false, then assert the documented DICOM behavior still works (or assert the
intended fallback outcome). Use the existing LoadImage and
get_default_reader_registration_order symbols so the new case covers the
verify_suffix-related failure mode.
---
Nitpick comments:
In `@monai/utils/misc.py`:
- Around line 577-583: The DICOM reader environment lookup is duplicated between
MONAIEnvVars.dicom_reader() and get_preferred_dicom_reader_key(), so consolidate
the logic in one place. Update get_preferred_dicom_reader_key() to use
MONAIEnvVars.dicom_reader() for reading MONAI_DICOM_READER and handling the
default, then keep any warning/fallback behavior there. If dicom_reader() is not
needed elsewhere, remove the redundant accessor to avoid drift.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e7cf914b-5fce-455f-b4c6-6b64b67c3e5e
📒 Files selected for processing (12)
CONTRIBUTING.mddocs/source/data.rstmonai/data/__init__.pymonai/data/image_reader.pymonai/data/nvimgcodec_pydicom_plugin.pymonai/transforms/io/array.pymonai/transforms/io/dictionary.pymonai/utils/misc.pypyproject.tomlrequirements-dev.txttests/data/test_init_reader.pytests/data/test_nvimgcodec_pydicom_reader.py
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
monai/transforms/io/array.py (1)
193-195: 🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy liftKeep fallback DICOM readers registered.
get_default_reader_registration_order()now registers only the preferred DICOM reader. WithMONAI_DICOM_READER=nvimgcodec,NvImgCodecPydicomReader.verify_suffix()returnsFalsewhen nvimgcodec/CUDA is unavailable, so auto-selection skips the only DICOM reader and DICOM loads fail outright. Keep the other DICOM readers registered behind the preferred one so this path still falls back cleanly.Also applies to: 268-269
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@monai/transforms/io/array.py` around lines 193 - 195, The default reader registration in the reader setup path only registers the preferred DICOM reader, which prevents fallback when that reader is unavailable; update the registration logic in the reader initialization flow (the code that iterates over get_default_reader_registration_order() and calls self.register with SUPPORTED_READERS) so the preferred DICOM reader is registered first but the other DICOM readers remain registered afterward. Ensure NvImgCodecPydicomReader stays preferred when available, while keeping the remaining DICOM readers available for auto-selection if verify_suffix() rejects the preferred one.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@monai/transforms/io/array.py`:
- Around line 193-195: The default reader registration in the reader setup path
only registers the preferred DICOM reader, which prevents fallback when that
reader is unavailable; update the registration logic in the reader
initialization flow (the code that iterates over
get_default_reader_registration_order() and calls self.register with
SUPPORTED_READERS) so the preferred DICOM reader is registered first but the
other DICOM readers remain registered afterward. Ensure NvImgCodecPydicomReader
stays preferred when available, while keeping the remaining DICOM readers
available for auto-selection if verify_suffix() rejects the preferred one.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7af1b16a-deec-4586-bafb-b0f0c73f5e06
📒 Files selected for processing (1)
monai/transforms/io/array.py
…mpression Signed-off-by: M Q <mingmelvinq@nvidia.com>
…ression Signed-off-by: M Q <mingmelvinq@nvidia.com>
Signed-off-by: M Q <mingmelvinq@nvidia.com>
Signed-off-by: M Q <mingmelvinq@nvidia.com>
Signed-off-by: M Q <mingmelvinq@nvidia.com>
for more information, see https://pre-commit.ci
though the underlying bug is in LoadImage’s auto-select path.
Root cause
test_nibabel_reader_5 is TEST_CASE_4_1 with {"mmap": False} — not the NibabelReader(mmap=False) case.
What happens:
1. mmap=False is passed to all default readers, including ITKReader.
2. Your branch registers the DICOM reader last, so it is tried first in reverse auto-select order.
3. ITKReader.verify_suffix() returns True whenever ITK is installed (any file type).
4. LoadImage calls itk.imread(..., mmap=False), which ITK does not support → TypeError.
5. The auto-select path had no try/except, so it never fell back to NibabelReader.
On dev branch, ITK was tried later (dict order), so Nibabel usually won first.
Fix
Unified reader selection in LoadImage.__call__ so auto-select also catches read failures and tries
the next reader (same as the explicit-reader path)
Verification
- test_nibabel_reader_5 — passed
- All 7 test_nibabel_reader cases — passed
Signed-off-by: M Q <mingmelvinq@nvidia.com>
Signed-off-by: M Q <mingmelvinq@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (4)
tests/data/test_persistentdataset.py (1)
281-285: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick winAvoid shelling out from this test.
PersistentDataset._cachecheck()deep-copies the item before caching, so this__reduce__runsos.system(...)during the test itself. That makes the test side-effectful and shell-dependent for no extra coverage. Use a benign module-level picklable sentinel instead and lettorch.load(..., weights_only=True)reject it.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/data/test_persistentdataset.py` around lines 281 - 285, The _BadType.__reduce__ test helper currently shells out through os.system, making the test side-effectful and dependent on the shell during PersistentDataset._cachecheck() deep-copying. Replace that inline reducer with a benign module-level picklable sentinel object/function and keep the assertion focused on torch.load(..., weights_only=True) rejecting it, so the test exercises the same failure path without executing commands.tests/losses/deform/test_bending_energy.py (1)
70-72: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueAdd
device=devicefor consistency.Lines 70/72 build CPU tensors while the sibling case (L68) uses
device=device. Harmless here—theValueErrorfires during shape validation before any device compute—but inconsistent with the rest.♻️ Match device usage
with self.assertRaisesRegex(ValueError, "All spatial dimensions"): - loss.forward(torch.ones((1, 3, 5, 2, 5))) + loss.forward(torch.ones((1, 3, 5, 2, 5), device=device)) with self.assertRaisesRegex(ValueError, "All spatial dimensions"): - loss.forward(torch.ones((1, 3, 5, 5, 2))) + loss.forward(torch.ones((1, 3, 5, 5, 2), device=device))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/losses/deform/test_bending_energy.py` around lines 70 - 72, The test cases in the bending energy shape-validation checks are creating default CPU tensors, unlike the surrounding case that already passes device=device. Update the tensor construction in the affected test inputs in test_bending_energy to include device=device so all cases use consistent device placement while still exercising the same ValueError path in loss.forward.monai/losses/deform.py (1)
76-89: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick winReturn a tuple from
_idx.
x[_idx(...)]currently passes a Pythonlistof slices into tensor indexing. Returntuple(out)here so each slice is applied positionally and the intent stays explicit across PyTorch versions.♻️ Return a tuple
- def _idx(overrides: dict) -> list: - out: list = [slice_all, slice_all] + def _idx(overrides: dict) -> tuple: + out: list = [slice_all, slice_all] for d in range(2, x.ndim): out.append(overrides.get(d, slice_inner)) - return out + return tuple(out)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@monai/losses/deform.py` around lines 76 - 89, The helper _idx in deform.py currently builds a list of slices and that list is used directly for tensor indexing; change _idx to return a tuple of slices instead so x[_idx(...)] applies positional indexing explicitly and consistently. Update the return value in _idx and keep the existing callers in the same function using _idx({dim_1: slice_plus}), _idx({}), and the paired dim_1/dim_2 cases unchanged.tests/metrics/test_surface_distance.py (1)
192-201: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winDocument
_edge_masks.This helper carries the core test setup for the new KD-tree path, but it has no docstring. Please add a short Google-style docstring covering the
seedargument and the returned edge masks. As per path instructions, "Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/metrics/test_surface_distance.py` around lines 192 - 201, Add a short Google-style docstring to `_edge_masks` describing the `seed` parameter and the two boolean edge-mask arrays it returns. Keep the docstring aligned with the existing test helpers in `tests/metrics/test_surface_distance.py`, and include a brief Returns section for `edges_pred` and `edges_gt`; no functional changes are needed.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@monai/apps/nnunet/nnunetv2_runner.py`:
- Line 768: The validate_single_model path is forwarding a duplicate val
argument into train_single_model, causing a TypeError when callers pass val
explicitly. Update validate_single_model to either remove val from kwargs before
calling train_single_model or pass it only once, and make sure the call site
stays consistent with train_single_model’s signature. Add or update a regression
test covering validate_single_model(..., val=False) to verify the conflict is
handled.
In `@monai/auto3dseg/utils.py`:
- Around line 285-289: The report_format validator in verify_report_format is
still recursing after detecting an invalid schema shape, so malformed list
schemas can incorrectly pass validation. Update the list-handling branch to fail
immediately after the len(v_fmt) != 1 warning in verify_report_format, returning
False instead of continuing into v_fmt[0]. Add a regression test for the
malformed report_format list case (for example, a multi-item list where the
first item matches) to ensure it is rejected.
In `@monai/metrics/utils.py`:
- Around line 309-312: The spacing handling in the coordinate-scaling path
currently allows length-1 sequences like [1.0] to broadcast across all axes,
which makes it inconsistent with the EDT validation. Update the spacing
validation in the relevant utility in utils.py to explicitly reject non-scalar
spacing values unless their length matches the number of spatial dimensions, and
keep the scalar case working. Add a regression test covering the mismatched
1-element spacing case to ensure it raises the same validation error.
In `@tests/apps/nnunet/test_nnunetv2_runner_command.py`:
- Around line 47-63: The command-building logic in train_single_model_command
handles both pretrained_weights and p, but the tests only cover
pretrained_weights today. Add a unit test in test_nnunetv2_runner_command that
passes {"p": "custom_plan"} to _make_runner().train_single_model_command and
asserts the generated cmd includes -p with custom_plan, so the p branch in the
command builder cannot regress unnoticed.
---
Nitpick comments:
In `@monai/losses/deform.py`:
- Around line 76-89: The helper _idx in deform.py currently builds a list of
slices and that list is used directly for tensor indexing; change _idx to return
a tuple of slices instead so x[_idx(...)] applies positional indexing explicitly
and consistently. Update the return value in _idx and keep the existing callers
in the same function using _idx({dim_1: slice_plus}), _idx({}), and the paired
dim_1/dim_2 cases unchanged.
In `@tests/data/test_persistentdataset.py`:
- Around line 281-285: The _BadType.__reduce__ test helper currently shells out
through os.system, making the test side-effectful and dependent on the shell
during PersistentDataset._cachecheck() deep-copying. Replace that inline reducer
with a benign module-level picklable sentinel object/function and keep the
assertion focused on torch.load(..., weights_only=True) rejecting it, so the
test exercises the same failure path without executing commands.
In `@tests/losses/deform/test_bending_energy.py`:
- Around line 70-72: The test cases in the bending energy shape-validation
checks are creating default CPU tensors, unlike the surrounding case that
already passes device=device. Update the tensor construction in the affected
test inputs in test_bending_energy to include device=device so all cases use
consistent device placement while still exercising the same ValueError path in
loss.forward.
In `@tests/metrics/test_surface_distance.py`:
- Around line 192-201: Add a short Google-style docstring to `_edge_masks`
describing the `seed` parameter and the two boolean edge-mask arrays it returns.
Keep the docstring aligned with the existing test helpers in
`tests/metrics/test_surface_distance.py`, and include a brief Returns section
for `edges_pred` and `edges_gt`; no functional changes are needed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b7c6533c-5868-4100-8327-5574b49624af
📒 Files selected for processing (29)
docs/source/installation.mdmonai/apps/nnunet/nnunetv2_runner.pymonai/apps/nnunet/utils.pymonai/apps/vista3d/transforms.pymonai/auto3dseg/utils.pymonai/data/dataset.pymonai/data/image_reader.pymonai/data/utils.pymonai/inferers/merger.pymonai/losses/deform.pymonai/losses/image_dissimilarity.pymonai/losses/nacl_loss.pymonai/metrics/froc.pymonai/metrics/generalized_dice.pymonai/metrics/utils.pymonai/networks/blocks/text_embedding.pymonai/networks/layers/filtering.pymonai/networks/layers/simplelayers.pymonai/networks/nets/basic_unet.pymonai/networks/nets/basic_unetplusplus.pymonai/transforms/utils.pymonai/utils/profiling.pytests/apps/nnunet/__init__.pytests/apps/nnunet/test_nnunetv2_runner_command.pytests/data/test_persistentdataset.pytests/losses/deform/test_bending_energy.pytests/metrics/test_compute_froc.pytests/metrics/test_surface_distance.pytests/utils/enums/test_wsireader.py
💤 Files with no reviewable changes (11)
- monai/networks/nets/basic_unet.py
- monai/networks/layers/simplelayers.py
- monai/networks/blocks/text_embedding.py
- monai/networks/nets/basic_unetplusplus.py
- monai/transforms/utils.py
- monai/metrics/generalized_dice.py
- monai/apps/vista3d/transforms.py
- monai/losses/image_dissimilarity.py
- monai/utils/profiling.py
- monai/networks/layers/filtering.py
- monai/inferers/merger.py
✅ Files skipped from review due to trivial changes (2)
- tests/apps/nnunet/init.py
- monai/apps/nnunet/utils.py
dfa67df to
7fbef1b
Compare
ericspod
left a comment
There was a problem hiding this comment.
Hi @MMelQin thanks for this addition for faster dicom loading, it's something we've wanted for a while. I had a number of comments about implementation specifics and the requirements.
One important thing is that our CI system currently doesn't have GPU testing support, we've moved off blossom and don't have a replacement ready yet. Our release tests were being run locally so I don't know how the added tests here will go. We want to test against multiple OS, Python, and PyTorch configurations and identify which ones are a problem, I'm afraid you'd have to do that locally yourself to some degree at least and report here on any issues.
I also mentioned we weren't doing the macOS full tests for time reasons (they were taking hours, far longer than Windows), you can re-enable these temporarily for this PR by editing the action but I may want to update the action to do an installation under macOS to test that part. Since we don't the issue I flagged with the requirements wasn't caught.
| break | ||
| # Unified reader selection so auto-select also catches read failures and tries the next reader | ||
| # (same as the explicit-reader path) | ||
| if self.auto_select and not reader.verify_suffix(filename): |
There was a problem hiding this comment.
I think the original intent of the code this replaced was to fallback to the reader method argument (if given) when the selected reader fails. This didn't work since the reader variable was shadowed by the loop variable. Do you think implementing this here would be beneficial?
|
|
||
| self.readers: list[ImageReader] = [] | ||
| for r in SUPPORTED_READERS: # set predefined readers as default | ||
| for r in get_default_reader_registration_order(): # set predefined readers as default |
There was a problem hiding this comment.
With this change SUPPORTED_READERS now doesn't define what readers are available or their order. Perhaps get_default_reader_registration_order should be in this file, or removed and code added to this file to modify SUPPORTED_READERS appropriately to have readers in the right order. The intent with SUPPORTED_READERS was to allow users to add their own readers to the dictionary, the current change breaks that behaviour I think.
|
|
||
| class TestNvImgCodecPydicomReaderIntegration(unittest.TestCase): | ||
| @SkipIfNoModule("pydicom") | ||
| def test_load_dicom_with_pydicom_env(self): |
There was a problem hiding this comment.
For this test I think we need @SkipIfNoModule("nvidia.nvimgcodec.tools.dicom.pydicom_plugin") as well.
| @SkipIfNoModule("pydicom") | ||
| @patch("monai.data.nvimgcodec_pydicom_plugin.register_as_decoder_plugin", return_value=False) | ||
| @patch("monai.data.nvimgcodec_pydicom_plugin.is_nvimgcodec_available", return_value=False) | ||
| def test_load_dicom_with_nvimgcodec_reader_fallback(self, _mock_available, _mock_register): |
There was a problem hiding this comment.
I think one as well but rely on the patches to force the fallback.
| onnx_graphsurgeon | ||
| polygraphy | ||
| pytest # FIXME: added to get around cupy 14.1.0 creating the requirement through polygraphy and trt_compiler somehow | ||
| cupy-cuda13x |
There was a problem hiding this comment.
We currently only run the min tests for macOS, not the full tests since these took a long time. This was done here which you might want to re-enable to run that test for this PR at least once. I think these requirements will break the macOS installation process since these won't be available on that platform. You'll need to add ; platform_system != "Darwin" (I think) to your added requirements if these work under Windows, otherwise it will be like the cucim requirements with ; platform_system == "Linux".
There was a problem hiding this comment.
Thanks for the reminder that MONAI Core has more supported platforms!
Description
Add GPU accelerated DICOM compressed image decoding using Pydicom and a custom decoder plugin based on NVIDIA nvimgcodec.
Implementation is done by subclassing the existing PydicomReader for its existing (and presumed verified) functionalities of loading dcm file(s) and parsing pixel data into image data array purely using pydicom library, and not its GPU direct load option which had been known to have serious deficiencies (omitting the proper and required interpretation of raw bytes!). This known issue needs to be addressed in a separate PR, so the new reader in this PR simply discards the GPU direct load option.
Introduced a mechanism to select DICOM reader at runtime by the use of a new env var, with the default being the ITK based reader.
It is also worth noting that the GPU accelerated decoder plugin for Pydicom was first developed, tested, and used in MONAI Deploy App SDK in this module, and then replicated in nvimgcodec for broader distribution. Full test of the decoder using all pydicom embedded test files of the supported transfer syntax is in MONAI Deploy App SDK. Future plan is to upstream this custom decoder plugin to pydicom to make the use even simpler.
Types of changes
./runtests.sh -f -u --net --coverage../runtests.sh --quick --unittests --disttests.make htmlcommand in thedocs/folder.