Skip to content

Conversation

@kabilar
Copy link
Member

@kabilar kabilar commented Sep 19, 2025

I uploaded Zarr assets yesterday and the DANDI Hub terminated overnight, so I re-ran the upload this morning to make sure everything transferred. It turns out that all the data was already transferred but the terminal output was a bit confusing:

12:08 $ DANDI_DEVEL=1 dandi upload --allow-any-path --validation ignore .
2025-09-19 12:08:48,388 [    INFO] Found 15 files to consider
PATH                                              SIZE     ERRORS PROGRESS STATUS                         MESSAGE                  
dandiset.yaml                                     1.3 kB                   skipped                        should be edited online  
.../micr/sub-I74_sample-IC_acq-overview_XPCT.json 77 Bytes   0             skipped                        file exists              
...r/sub-I74_sample-IC_acq-overview_XPCT.ome.zarr 89.9 GB    0             done                           exists - reuploading     
.../sub-I74_sample-IC_acq-zoom_chunk-01_XPCT.json 74 Bytes   0             skipped                        file exists              
...-I74_sample-IC_acq-zoom_chunk-01_XPCT.ome.zarr            0             comparing against remote Zarr  exists - reuploading     
.../sub-I74_sample-IC_acq-zoom_chunk-02_XPCT.json 74 Bytes   0             skipped                        file exists              
...-I74_sample-IC_acq-zoom_chunk-02_XPCT.ome.zarr            0             comparing against remote Zarr  exists - reuploading     
.../sub-I74_sample-IC_acq-zoom_chunk-03_XPCT.json 74 Bytes   0             skipped                        file exists              
...-I74_sample-IC_acq-zoom_chunk-03_XPCT.ome.zarr 134.7 GB   0             done                           exists - reuploading     
...icr/sub-I74_sample-hemi_acq-overview_XPCT.json 77 Bytes   0             skipped                        file exists              
...sub-I74_sample-hemi_acq-overview_XPCT.ome.zarr            0             comparing against remote Zarr  exists - reuploading     
...ub-I74_sample-hemi_acq-zoom_chunk-01_XPCT.json 74 Bytes   0             skipped                        file exists              
...74_sample-hemi_acq-zoom_chunk-01_XPCT.ome.zarr            0             comparing against remote Zarr  exists - reuploading     
...ub-I74_sample-hemi_acq-zoom_chunk-02_XPCT.json 74 Bytes   0             skipped                        file exists              
...74_sample-hemi_acq-zoom_chunk-02_XPCT.ome.zarr                          pre-validating                                          
Summary:                                          224.6 GB                 8 skipped                      1 should be edited online
                                                                           2 done                         7 file exists            
                                                                           4 comparing against remote ... 6 exists - reuploading   
                                                                           1 pre-validating

Specifically I would suggest the following changes:

  • exists - reuploading should be changed to file exists. I have submitted the change here.
  • done should be changed to skipped. I am not sure where the most appropriate place in the zarr.py module would be for the change to take place.

@kabilar kabilar requested a review from yarikoptic September 19, 2025 14:37
@kabilar kabilar self-assigned this Sep 19, 2025
@kabilar kabilar added the UX label Sep 19, 2025
@codecov
Copy link

codecov bot commented Sep 19, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.82%. Comparing base (2827711) to head (54bc515).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1712   +/-   ##
=======================================
  Coverage   74.82%   74.82%           
=======================================
  Files          84       84           
  Lines       11693    11693           
=======================================
  Hits         8749     8749           
  Misses       2944     2944           
Flag Coverage Δ
unittests 74.82% <100.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@yarikoptic
Copy link
Member

TWICE I tried to compose reply and then laptop was crashing due to too many displays etc... Friday it was !

I re-ran the upload this morning to make sure everything transferred. It turns out that all the data was already transferred but the terminal output was a bit confusing:

this is incomplete display, i.e. process didn't finish -- how do you know that all was transferred? do you have access to the log? (path is printed at the end)

  • exists - reuploading should be changed to file exists. I have submitted the change here.

well -- 'zarr' is not a file per se. Could be a "fileset" or "catalog". We should unify naming. Which one you prefer?

That "reuploading" part -- in principle it is consistent with how we would report for a regular blob asset if decide to reupload: see the end of that check_replace_asset - https://github.com/dandi/dandi-cli/blob/HEAD/dandi/upload.py#L510 but we are taking a "shortcut" and trying to "reupload" entire zarr because (I think) we decided in db7c834 that it is not to pre-compute checksum on it so we check during reupload if anything to be reuploaded for it -- we do not just skip.

  • done should be changed to skipped. I am not sure where the most appropriate place in the zarr.py module would be for the change to take place.

I would need to comb the logic there... It is either the "done" from
https://github.com/dandi/dandi-cli/blob/HEAD/dandi/files/zarr.py#L888 where we could indeed add analysis if anything was actually found different/uploaded and change to "skipped", but may be also came from a a generic handling of upload of any asset in

❯ git grep -np '"done"' dandi/upload.py
dandi/upload.py=94=def upload(
dandi/upload.py:401:                yield {"status": "done"}

? needs analysis...

@yarikoptic yarikoptic added the patch Increment the patch version when merged label Sep 20, 2025
@yarikoptic yarikoptic marked this pull request as draft October 13, 2025 17:24
@yarikoptic
Copy link
Member

changed to draft since still needs discussion etc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

patch Increment the patch version when merged UX

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants