Conversation

@frankblubaugh

@frankblubaugh frankblubaugh commented Nov 27, 2025

Closes #123

This PR adds ZFP as a compression filter. This is my first contribution to a Rust project, so I would appreciate any review.

ZFP is exposed as an optional feature that can be used like this:

    file.new_dataset_builder()
        .zfp_lossless()
        .with_data(&data)
        .chunk((10000,))
        .create("zfp_lossless_1d")
        .unwrap();

or

    file.new_dataset_builder()
        .zfp_accuracy(1e-4)
        .with_data(&data)
        .chunk((10000,))
        .create("zfp_accuracy_1d")
        .unwrap();
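
For reference, reading one of these datasets back goes through the usual hdf5 crate read path (a sketch, not part of this PR's diff; decompression is transparent as long as the ZFP filter is available at read time):

    // Sketch: read back the ZFP-compressed dataset written above,
    // assuming `data` is the same f64 array used in the write example.
    let ds = file.dataset("zfp_accuracy_1d").unwrap();
    let restored = ds.read_1d::<f64>().unwrap();
    assert_eq!(restored.len(), data.len());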

@frankblubaugh
Author

I reran the current build on my own fork with the same CI.yml as this repo, and every test passed. I don't know why it didn't pass here.

@mulimoen
Collaborator

That is just the buggy 1.8 release, which is rather flaky.

Collaborator

@mulimoen mulimoen left a comment

This is a big undertaking and you seem to have gotten it mostly right! There is a missing entry in src/hl/dataset.rs:1003 and a missing entry in the changelog.

Frank Blubaugh added 4 commits November 29, 2025 07:24
…d the filter calls with feature-dependent macro implementation like the other features
…ot natively take the number of bytes needed for processing; instead, it uses the dimensionality of the compressed array to do the data compression. This information is captured by the ZfpConfig struct, which takes care of allocating the appropriate buffer size for the compression downstream from here.
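
The thread never shows ZfpConfig itself, so here is a minimal sketch of what such a struct could look like based on the commit message above; the field names and the helper are illustrative assumptions, not the PR's actual definition:

    // Hypothetical sketch: ZFP compresses over the chunk's dimensionality
    // rather than a raw byte count, so the chunk shape and element size
    // must travel with the filter configuration.
    struct ZfpConfig {
        chunk_dims: Vec<usize>, // as-written chunk shape (assumed field)
        element_bytes: usize,   // bytes per element, e.g. 8 for f64 (assumed field)
    }

    impl ZfpConfig {
        // Upper bound on the uncompressed chunk buffer to allocate downstream.
        fn chunk_bytes(&self) -> usize {
            self.chunk_dims.iter().product::<usize>() * self.element_bytes
        }
    }
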
@frankblubaugh
Author

Something strange is happening: this is producing files 2x bigger than the reference MATLAB HDF5 implementation.

@mulimoen
Collaborator

mulimoen commented Dec 3, 2025

Are you using the same chunk size?

@mulimoen
Collaborator

mulimoen commented Dec 5, 2025

Did you figure out the difference from the MATLAB-produced file? I can see the filter is significantly compressing some dummy data (hdf5/examples/chunking.rs).

@frankblubaugh
Author

I did find the answer. The filters were working fine as implemented in the crate. However, if you wrote a file out and then tried to parse it via h5py/MATLAB, the filter header information was not packed the same way, causing a malformed-read error. I think it's figured out, but I want to mull over the inputs.

I'm on my phone, but the call on the builder is currently:

    zfp_accuracy(acc, chunk_dims, element_bytes)

I couldn't find a robust way to get the as-built chunk dims AND the element type from the plist parameters alone. Updating the filter call signature would be a much more significant refactor, I think, since every other filter's arguments would need to change.

Currently I'm inclined to require the user to redundantly specify the data type and chunk dims, but it would be cleaner without that.
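
Under that design, the accuracy example from the PR description would become something like the following (a sketch of the signature described above; the exact argument types are assumptions):

    // Hypothetical shape of the revised call: the user redundantly supplies
    // the chunk dims and bytes per element so the filter header can be
    // packed the way h5py/MATLAB expect.
    file.new_dataset_builder()
        .zfp_accuracy(1e-4, &[10_000], 8) // accuracy, chunk dims, element bytes
        .with_data(&data)
        .chunk((10000,))
        .create("zfp_accuracy_1d")
        .unwrap();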

@frankblubaugh frankblubaugh marked this pull request as draft December 19, 2025 18:47
@frankblubaugh
Author

This should be ready to go, assuming it passes the right tests.

@frankblubaugh frankblubaugh marked this pull request as ready for review December 19, 2025 18:56