fix(ZarrAvgMerger): use separate tmpdir attrs to prevent value_store tmpdir GC on zarr v3#8816

Open
Zeesejo wants to merge 1 commit into Project-MONAI:dev from Zeesejo:fix/zarr-avg-merger-tmpdir-overwrite
@Zeesejo Zeesejo commented Apr 10, 2026

Description

Fixes #8476

Root Cause

In ZarrAvgMerger.__init__, when both value_store and count_store are None and zarr v3 is detected, a single self.tmpdir attribute was assigned twice sequentially:

# Before (buggy)
if value_store is None:
    self.tmpdir = TemporaryDirectory()           # ← assigned here
    self.value_store = zarr.storage.LocalStore(self.tmpdir.name)

if count_store is None:
    self.tmpdir = TemporaryDirectory()           # ← OVERWRITES the above!
    self.count_store = zarr.storage.LocalStore(self.tmpdir.name)

When the second assignment overwrites self.tmpdir, the first TemporaryDirectory loses its only reference. On CPython its reference count drops to zero, the object is finalized immediately, and its finalizer deletes the directory that value_store points to. This causes the zarr ValueError seen in the issue.
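
The failure mode can be reproduced with the standard library alone. The following is a minimal sketch, assuming CPython's reference counting; `BuggyHolder` is a hypothetical stand-in for the merger and is not MONAI code.

```python
import os
import warnings
from tempfile import TemporaryDirectory


class BuggyHolder:
    """Hypothetical stand-in that reuses one attribute for two temp dirs."""

    def __init__(self) -> None:
        self.tmpdir = TemporaryDirectory()
        self.value_path = self.tmpdir.name
        # Rebinding the attribute drops the only reference to the first object.
        self.tmpdir = TemporaryDirectory()
        self.count_path = self.tmpdir.name


with warnings.catch_warnings():
    # Implicit cleanup emits a ResourceWarning; silence it for this demo.
    warnings.simplefilter("ignore", ResourceWarning)
    holder = BuggyHolder()

value_dir_exists = os.path.isdir(holder.value_path)
count_dir_exists = os.path.isdir(holder.count_path)
print(value_dir_exists, count_dir_exists)
```

On CPython the first directory is expected to be gone by the time the constructor returns, while the second still exists. Interpreters without reference counting may delay the cleanup, which would make the bug intermittent rather than absent.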

Fix

Introduce two separate attributes self.value_tmpdir and self.count_tmpdir so both TemporaryDirectory objects remain alive for the full lifetime of the ZarrAvgMerger instance:

# After (fixed)
self.value_tmpdir: TemporaryDirectory | None = None
self.count_tmpdir: TemporaryDirectory | None = None

if value_store is None:
    self.value_tmpdir = TemporaryDirectory()
    self.value_store = zarr.storage.LocalStore(self.value_tmpdir.name)

if count_store is None:
    self.count_tmpdir = TemporaryDirectory()
    self.count_store = zarr.storage.LocalStore(self.count_tmpdir.name)

Also removed the old self.tmpdir: TemporaryDirectory | None declaration, which is no longer needed.
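
The effect of keeping two attributes can be checked without zarr. This is a sketch only: `FixedHolder` and its `close` method are hypothetical names used for illustration, and the PR itself does not add an explicit cleanup method.

```python
import os
from tempfile import TemporaryDirectory
from typing import Optional


class FixedHolder:
    """Hypothetical stand-in that keeps one attribute per temp dir."""

    def __init__(self) -> None:
        self.value_tmpdir: Optional[TemporaryDirectory] = TemporaryDirectory()
        self.count_tmpdir: Optional[TemporaryDirectory] = TemporaryDirectory()

    def close(self) -> None:
        # Explicit cleanup (an assumption of this sketch, not part of the PR).
        for tmp in (self.value_tmpdir, self.count_tmpdir):
            if tmp is not None:
                tmp.cleanup()


holder = FixedHolder()
value_path = holder.value_tmpdir.name
count_path = holder.count_tmpdir.name

# Both directories exist and are distinct while the holder is alive.
both_alive = os.path.isdir(value_path) and os.path.isdir(count_path)
distinct = value_path != count_path

holder.close()
both_removed = not os.path.isdir(value_path) and not os.path.isdir(count_path)
```

Because each TemporaryDirectory has its own owning attribute, neither is finalized until the holder releases it.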

Testing

The existing tests/inferers/test_zarr_avg_merger.py test suite covers this path (the failing test_zarr_avg_merger_patches_13 and test_zarr_avg_merger_patches_14 cases from the issue).

Checklist

  • New/modified code has proper docstrings
  • No new dependencies introduced
  • Backward compatible — no API changes

…tmpdir from being garbage collected

Previously, when both `value_store` and `count_store` were None,
`self.tmpdir` was assigned twice sequentially — causing the first
TemporaryDirectory (for value_store) to be immediately garbage
collected when the second one (for count_store) overwrote the
reference. This silently deleted the value store's backing directory.

Fix: introduce `self.value_tmpdir` and `self.count_tmpdir` as
separate attributes so both temporary directories remain alive for
the lifetime of the ZarrAvgMerger instance.

Fixes Project-MONAI#8476
Copilot AI review requested due to automatic review settings April 10, 2026 07:27

coderabbitai bot commented Apr 10, 2026

Walkthrough

ZarrAvgMerger.__init__ in monai/inferers/merger.py refactors temporary directory and codec handling logic. For zarr v3 with unset value_store and/or count_store, independent TemporaryDirectory instances replace a single reusable instance. Zarr-version detection is performed once before use. Codec and compressor normalization is simplified: zarr v3 converts these parameters to lists, while zarr v2 selects first elements from sequences with fallbacks to compressor variants. No method signatures or aggregate/finalize control flow change.
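
The normalization described in this walkthrough might look roughly like the sketch below. The helper names `is_zarr_v3` and `normalize_codecs` are hypothetical, and the v3 branch is an assumption based only on the sentence "converts these parameters to lists". The v2 branch mirrors the one-line expression quoted later in this review.

```python
from typing import Any


def is_zarr_v3(version: str) -> bool:
    """Return True when the major component of a version string is >= 3."""
    return int(version.split(".")[0]) >= 3


def normalize_codecs(codecs: Any, compressor: Any, v3: bool) -> Any:
    """Hypothetical helper; not the actual MONAI implementation."""
    if v3:
        # Assumed v3 behavior: pass a list through, wrap a single codec.
        if codecs is None:
            return None
        return list(codecs) if isinstance(codecs, (list, tuple)) else [codecs]
    # v2 behavior: first element of a sequence, else the value, else compressor.
    if isinstance(codecs, (list, tuple)):
        return codecs[0]
    return codecs if codecs is not None else compressor


# Detect the version once, then reuse the flag for every parameter.
v3 = is_zarr_v3("3.0.8")
value = normalize_codecs("codec-a", None, v3)
```

Computing the flag once and passing it along keeps the three codec parameters on the same branch.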

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | Title clearly summarizes the main fix: using separate tmpdir attributes instead of reusing a single one that gets overwritten in zarr v3. |
| Description check | ✅ Passed | Description covers root cause, fix, testing strategy, and completed checklist items. Issue reference included but checklist section shows only non-breaking change marked. |
| Linked Issues check | ✅ Passed | PR directly addresses #8476 by fixing the temporary directory garbage collection issue that causes the ValueError in zarr v3 array creation. |
| Out of Scope Changes check | ✅ Passed | Changes are scoped to ZarrAvgMerger.__init__ refactoring (tmpdir attribute separation, codec normalization cleanup) with no unrelated modifications. |

Copilot AI left a comment

Pull request overview

Fixes a Zarr v3-specific lifecycle bug in ZarrAvgMerger where a temporary directory backing value_store could be prematurely garbage-collected when both value_store and count_store were None, causing runtime failures during array creation/usage.

Changes:

  • Split the single self.tmpdir into self.value_tmpdir and self.count_tmpdir to keep both TemporaryDirectory instances alive for the merger’s lifetime on Zarr v3.
  • Remove the now-unneeded self.tmpdir handling in the Zarr v2 branch.
  • Minor refactor/simplification of codec/compressor assignment logic while preserving behavior across Zarr v2/v3.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@monai/inferers/merger.py`:
- Around line 345-347: The current v2 fallback uses codecs[0], value_codecs[0],
and count_codecs[0] which will IndexError on empty sequences; update the logic
in the Merger initializer (the assignment to self.codecs, self.value_codecs,
self.count_codecs) to treat an empty list/tuple as None (i.e., if isinstance(x,
(list,tuple)) and len(x) > 0 then use x[0], elif x is not None use x, else use
the corresponding compressor/value_compressor/count_compressor fallback), or
alternatively raise a clear ValueError when an explicit empty sequence is
provided; also add a regression unit test that passes empty lists for
codecs/value_codecs/count_codecs and asserts the correct fallback behavior or
the expected ValueError.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ce0d16fe-45cb-4e1e-a4db-03756ea00f72

📥 Commits

Reviewing files that changed from the base of the PR and between cc92126 and fe4e11e.

📒 Files selected for processing (1)
  • monai/inferers/merger.py

Comment on lines +345 to +347
self.codecs = codecs[0] if isinstance(codecs, (list, tuple)) else codecs if codecs is not None else compressor # type: ignore[assignment]
self.value_codecs = value_codecs[0] if isinstance(value_codecs, (list, tuple)) else value_codecs if value_codecs is not None else value_compressor # type: ignore[assignment]
self.count_codecs = count_codecs[0] if isinstance(count_codecs, (list, tuple)) else count_codecs if count_codecs is not None else count_compressor # type: ignore[assignment]
⚠️ Potential issue | 🟡 Minor

Guard empty codec lists in the v2 fallback.

codecs[0], value_codecs[0], and count_codecs[0] will raise IndexError when given an empty sequence such as []. Please treat an empty sequence as None or raise a targeted ValueError, and add a regression test for that path. As per coding guidelines, "Examine code for logical error or inconsistencies, and suggest what may be changed to address these. Ensure new or modified definitions will be covered by existing or new unit tests."

Suggested fix
         else:
             # For zarr v2, use compressors
+            def _unwrap_v2_codec_arg(arg):
+                # Return the first element of a sequence, or None if it is empty.
+                if isinstance(arg, (list, tuple)):
+                    return arg[0] if arg else None
+                return arg
+
-            self.codecs = codecs[0] if isinstance(codecs, (list, tuple)) else codecs if codecs is not None else compressor  # type: ignore[assignment]
-            self.value_codecs = value_codecs[0] if isinstance(value_codecs, (list, tuple)) else value_codecs if value_codecs is not None else value_compressor  # type: ignore[assignment]
-            self.count_codecs = count_codecs[0] if isinstance(count_codecs, (list, tuple)) else count_codecs if count_codecs is not None else count_compressor  # type: ignore[assignment]
+            self.codecs = _unwrap_v2_codec_arg(codecs) if codecs is not None else compressor  # type: ignore[assignment]
+            self.value_codecs = _unwrap_v2_codec_arg(value_codecs) if value_codecs is not None else value_compressor  # type: ignore[assignment]
+            self.count_codecs = _unwrap_v2_codec_arg(count_codecs) if count_codecs is not None else count_compressor  # type: ignore[assignment]
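
The difference between the one-line expression and a guarded version can be exercised in isolation. This is a sketch under assumptions: `unwrap_unguarded` and `unwrap_guarded` are hypothetical names, and the guarded variant picks the "fall back to the compressor" option for an empty sequence, whereas the suggested diff returns None in that case.

```python
from typing import Any


def unwrap_unguarded(arg: Any, fallback: Any) -> Any:
    """Mirror of the one-line v2 expression under review."""
    return arg[0] if isinstance(arg, (list, tuple)) else arg if arg is not None else fallback


def unwrap_guarded(arg: Any, fallback: Any) -> Any:
    """Variant that sends both None and empty sequences to the fallback."""
    if isinstance(arg, (list, tuple)):
        return arg[0] if arg else fallback
    return arg if arg is not None else fallback


# The unguarded expression indexes into an empty list and fails.
try:
    unwrap_unguarded([], "compressor")
    raised_index_error = False
except IndexError:
    raised_index_error = True

# The guarded variant falls back instead of raising.
guarded_result = unwrap_guarded([], "compressor")
```

A regression test for this path could assert the same behavior against the real initializer; raising a ValueError for an explicit empty sequence is the alternative the review mentions.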
Development

Successfully merging this pull request may close these issues.

ValueError: compressor cannot be used for arrays with zarr_format 3

2 participants