fix(ZarrAvgMerger): use separate tmpdir attrs to prevent value_store tmpdir GC on zarr v3#8816
Conversation
…tmpdir from being garbage collected Previously, when both `value_store` and `count_store` were None, `self.tmpdir` was assigned twice sequentially — causing the first TemporaryDirectory (for value_store) to be immediately garbage collected when the second one (for count_store) overwrote the reference. This silently deleted the value store's backing directory. Fix: introduce `self.value_tmpdir` and `self.count_tmpdir` as separate attributes so both temporary directories remain alive for the lifetime of the ZarrAvgMerger instance. Fixes Project-MONAI#8476
📝 WalkthroughWalkthrough
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Fixes a Zarr v3-specific lifecycle bug in ZarrAvgMerger where a temporary directory backing value_store could be prematurely garbage-collected when both value_store and count_store were None, causing runtime failures during array creation/usage.
Changes:
- Split the single
self.tmpdirintoself.value_tmpdirandself.count_tmpdirto keep bothTemporaryDirectoryinstances alive for the merger’s lifetime on Zarr v3. - Remove the now-unneeded
self.tmpdirhandling in the Zarr v2 branch. - Minor refactor/simplification of codec/compressor assignment logic while preserving behavior across Zarr v2/v3.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@monai/inferers/merger.py`:
- Around line 345-347: The current v2 fallback uses codecs[0], value_codecs[0],
and count_codecs[0] which will IndexError on empty sequences; update the logic
in the Merger initializer (the assignment to self.codecs, self.value_codecs,
self.count_codecs) to treat an empty list/tuple as None (i.e., if isinstance(x,
(list,tuple)) and len(x) > 0 then use x[0], elif x is not None use x, else use
the corresponding compressor/value_compressor/count_compressor fallback), or
alternatively raise a clear ValueError when an explicit empty sequence is
provided; also add a regression unit test that passes empty lists for
codecs/value_codecs/count_codecs and asserts the correct fallback behavior or
the expected ValueError.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ce0d16fe-45cb-4e1e-a4db-03756ea00f72
📒 Files selected for processing (1)
monai/inferers/merger.py
| self.codecs = codecs[0] if isinstance(codecs, (list, tuple)) else codecs if codecs is not None else compressor # type: ignore[assignment] | ||
| self.value_codecs = value_codecs[0] if isinstance(value_codecs, (list, tuple)) else value_codecs if value_codecs is not None else value_compressor # type: ignore[assignment] | ||
| self.count_codecs = count_codecs[0] if isinstance(count_codecs, (list, tuple)) else count_codecs if count_codecs is not None else count_compressor # type: ignore[assignment] |
There was a problem hiding this comment.
Guard empty codec lists in the v2 fallback.
codecs[0], value_codecs[0], and count_codecs[0] will raise IndexError for []. Please treat an empty sequence as None or raise a targeted ValueError, and add a regression test for that path. As per coding guidelines, "Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Ensure new or modified definitions will be covered by existing or new unit tests."
Suggested fix
+ def _unwrap_v2_codec_arg(arg):
+ if isinstance(arg, (list, tuple)):
+ return arg[0] if arg else None
+ return arg
+
else:
# For zarr v2, use compressors
- self.codecs = codecs[0] if isinstance(codecs, (list, tuple)) else codecs if codecs is not None else compressor # type: ignore[assignment]
- self.value_codecs = value_codecs[0] if isinstance(value_codecs, (list, tuple)) else value_codecs if value_codecs is not None else value_compressor # type: ignore[assignment]
- self.count_codecs = count_codecs[0] if isinstance(count_codecs, (list, tuple)) else count_codecs if count_codecs is not None else count_compressor # type: ignore[assignment]
+ self.codecs = _unwrap_v2_codec_arg(codecs) if codecs is not None else compressor # type: ignore[assignment]
+ self.value_codecs = _unwrap_v2_codec_arg(value_codecs) if value_codecs is not None else value_compressor # type: ignore[assignment]
+ self.count_codecs = _unwrap_v2_codec_arg(count_codecs) if count_codecs is not None else count_compressor # type: ignore[assignment]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@monai/inferers/merger.py` around lines 345 - 347, The current v2 fallback
uses codecs[0], value_codecs[0], and count_codecs[0] which will IndexError on
empty sequences; update the logic in the Merger initializer (the assignment to
self.codecs, self.value_codecs, self.count_codecs) to treat an empty list/tuple
as None (i.e., if isinstance(x, (list,tuple)) and len(x) > 0 then use x[0], elif
x is not None use x, else use the corresponding
compressor/value_compressor/count_compressor fallback), or alternatively raise a
clear ValueError when an explicit empty sequence is provided; also add a
regression unit test that passes empty lists for
codecs/value_codecs/count_codecs and asserts the correct fallback behavior or
the expected ValueError.
Description
Fixes #8476
Root Cause
In
ZarrAvgMerger.__init__, when bothvalue_storeandcount_storeareNoneand zarr v3 is detected, a singleself.tmpdirattribute was assigned twice sequentially:When
self.tmpdiris overwritten by the second assignment, Python's reference count for the firstTemporaryDirectorydrops to zero and it is immediately garbage-collected — deleting the directory thatvalue_storewas pointing to. This causes the zarrValueErrorseen in the issue.Fix
Introduce two separate attributes
self.value_tmpdirandself.count_tmpdirso bothTemporaryDirectoryobjects remain alive for the full lifetime of theZarrAvgMergerinstance:Also cleaned up the old
self.tmpdir: TemporaryDirectory | Nonedeclaration which is no longer needed.Testing
The existing
tests/inferers/test_zarr_avg_merger.pytest suite covers this path (the failingtest_zarr_avg_merger_patches_13andtest_zarr_avg_merger_patches_14cases from the issue).Checklist