Skip to content

Conversation

@yarikoptic
Copy link
Member

implemented by adding "subpath" within an Asset record and adding support to assess it for .zarrs.

Note also that it adjusts logic from try/except to explicit checking -- much better since avoids swallowing unrelated exceptions.

Tried with

dandi -l debug download -f debug --preserve-tree dandi://dandi/000108/sub-MITU01/ses-20210521h17m17s06/micr/sub-MITU01_ses-20210521h17m17s06_sample-178_stain-LEC_run-1_chunk-1_SPIM.ome.zarr/0/0/0/0/0

which seems succeeded downloading, we seems still need to

  • disable checksumming for such partial downloads (since there is no partial checksums ATM at API level)
  • add tests and verify that works on full file paths
  • verify behavior upon redownloads

implemented by adding "subpath" within an Asset record and adding support
to assess it for .zarrs.

Note also that it adjusts logic from try/except to explicit checking -- much
better since avoids swallowing unrelated exceptions.

Tried with

   dandi -l debug download -f debug --preserve-tree dandi://dandi/000108/sub-MITU01/ses-20210521h17m17s06/micr/sub-MITU01_ses-20210521h17m17s06_sample-178_stain-LEC_run-1_chunk-1_SPIM.ome.zarr/0/0/0/0/0

which seems succeeded downloading, we seems still need to

- [ ] disable checksumming for such partial downloads (since there is no
      partial checksums ATM at API level)
- [ ] add tests and verify that works on full file paths
@codecov
Copy link

codecov bot commented Mar 24, 2025

Codecov Report

Attention: Patch coverage is 36.84211% with 12 lines in your changes missing coverage. Please review.

Project coverage is 62.82%. Comparing base (386e50d) to head (3849b7d).

Files with missing lines Patch % Lines
dandi/dandiapi.py 37.50% 10 Missing ⚠️
dandi/download.py 33.33% 2 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (386e50d) and HEAD (3849b7d). Click for more details.

HEAD has 140 uploads less than BASE
Flag BASE (386e50d) HEAD (3849b7d)
unittests 150 10
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #1596       +/-   ##
===========================================
- Coverage   88.58%   62.82%   -25.76%     
===========================================
  Files          78       78               
  Lines       10872    10885       +13     
===========================================
- Hits         9631     6839     -2792     
- Misses       1241     4046     +2805     
Flag Coverage Δ
unittests 62.82% <36.84%> (-25.76%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

)
asset = assets[0]
elif not assets:
zarr_suf = ".zarr/"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about .ngff/?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, should also be supported

Comment on lines 1255 to 1260
if (zarr_suf in path) and (zarr_suffix_idx := path.index(zarr_suf)):
# If path ends with .zarr/, we might have a zarr asset without
# a trailing slash in the path. Try again:
zarr_path_len = zarr_suffix_idx + len(zarr_suf)
zarr_path = path[: zarr_path_len - 1] # -1 for trailing /
subpath = path[len(zarr_path) :]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have several issues with this:

  • This code accepts paths containing a component that's just the Zarr suffix, e.g., foo/.zarr/bar.
  • The behavior differs from dandidav, which tests each .zarr/.ngff-suffixed leading portion for Zarr-ness.
  • Instead of futzing around with strings, I think it'd better to convert path to a pathlib.PurePosixPath and check path.parts to see if any part ends with .zarr (or .ngff, and also that the suffix is not the whole of the part).

Copy link
Member Author

@yarikoptic yarikoptic Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I agree that dealing with Path instance might be more robust

#: The asset's (forward-slash-separated) path
path: str
#: The path within the asset, as e.g. path in a zarr/
subpath: str | None = Field(default=None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the idea of a BaseRemoteAsset object pointing to a resource inside an asset; the class is supposed to be for assets only. It'd be better to introduce a separate class for Zarr prefixes (and possibly also add a variant of get_asset_by_path() for returning this separate class).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I also disliked it a little but it is already quite a hierarchy of classes there leading to nearly combinatorial explosion. If you see a sensible way, e.g. through adding a single extra class - please go ahead.

yarikoptic and others added 2 commits March 24, 2025 15:04
Co-authored-by: John T. Wodder II <jwodder@users.noreply.github.com>
@jwodder jwodder added minor Increment the minor version when merged cmd-download cmd-delete zarr and removed cmd-delete labels Mar 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cmd-download minor Increment the minor version when merged zarr

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants