
Add a libpressio codec wrapper#23

Draft
juntyr wants to merge 43 commits into main from pressio


@juntyr
Owner

@juntyr juntyr commented Apr 24, 2025

  • build libpressio successfully on the host
  • build libpressio successfully on WASM
  • add the codec wrapper

@juntyr juntyr force-pushed the main branch 2 times, most recently from 5e17bf8 to 33abcb5 Compare July 22, 2025 07:28
@juntyr juntyr force-pushed the main branch 2 times, most recently from 5873c6a to 078499b Compare August 19, 2025 09:43
@juntyr juntyr force-pushed the pressio branch 5 times, most recently from 860975f to 7e9dfe8 Compare February 18, 2026 13:09
@juntyr
Owner Author

juntyr commented Mar 2, 2026

@robertu94 If I compress with the noop compressor, the compressed data returned says that it has no data (but it does have the correct dtype, ndim, and number of elements). Am I doing something wrong? I do need to verify that the output has data to ensure that we're not reading out uninitialised data (UB in Rust), right?

@robertu94

The noop compressor simply copies the input to the output. If the input has no data (e.g. pressio_data_new_empty), then the decompressed data will not have any either.

@juntyr
Owner Author

juntyr commented Mar 2, 2026

The input does have data in my local test, and debug printing the compressed data before and after compression shows that the other metadata (dtype, dims, length) are filled in - yet the has_data call still returns false afterwards

@robertu94

Can you please share a link to the test code, and I'll take a look (likely next week).

@juntyr
Owner Author

juntyr commented Mar 3, 2026

@robertu94 Ok, I identified the issue.

https://github.com/robertu94/libpressio/blob/868a3a70d6ebf55ad67509fbca03bdd0bc1bc246/src/plugins/compressors/noop.cc#L59-L62

While the noop compressor clones the data on compression, it copies into the decompressed data on decompression. Since I was passing in an array with no data for the decompressed output, it copied out no data, and still saw the array as empty.

Fun fact - the noop compressor is unsound (in the Rust world)! I can give it a compressed array with 5xi64 values and an empty decompressed array with shape 10xi64, and it will read 10 values from the compressed array, i.e. 5 uninitialised data values.

Is this behaviour intended? Rust cannot expose any safe code that would expose uninitialised memory, so no compressor should do this.
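To illustrate the kind of check a safe wrapper needs before it reads an output buffer, here is a minimal sketch. It does not use libpressio's API: `RawOutput`, `ViewError`, and `read_i64s` are hypothetical names, and the struct only models what a C library might report (the bytes that were actually written, plus the element count claimed by the dimension metadata). The 5xi64 versus 10xi64 case above is rejected instead of being read.

```rust
use std::mem::size_of;

/// Hypothetical model of an output buffer reported by a C library.
/// This is NOT a libpressio type; it exists only for illustration.
pub struct RawOutput<'a> {
    /// Bytes that were actually written; empty models `has_data == false`.
    pub initialised: &'a [u8],
    /// Element count claimed by the dimension metadata.
    pub claimed_elements: usize,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ViewError {
    /// The metadata promises elements, but no bytes are backing them.
    NoData,
    /// The metadata and the backing bytes disagree about the size.
    SizeMismatch { claimed_bytes: usize, actual_bytes: usize },
}

/// Decode the buffer as i64 values only if the metadata matches the bytes
/// that are really there, so no uninitialised memory is ever read.
pub fn read_i64s(out: &RawOutput<'_>) -> Result<Vec<i64>, ViewError> {
    if out.claimed_elements > 0 && out.initialised.is_empty() {
        return Err(ViewError::NoData);
    }
    // Saturating multiplication: an overflowing claim can never match a real length.
    let claimed_bytes = out.claimed_elements.saturating_mul(size_of::<i64>());
    if claimed_bytes != out.initialised.len() {
        return Err(ViewError::SizeMismatch {
            claimed_bytes,
            actual_bytes: out.initialised.len(),
        });
    }
    let values: Vec<i64> = out
        .initialised
        .chunks_exact(size_of::<i64>())
        .map(|chunk| {
            let mut raw = [0u8; size_of::<i64>()];
            raw.copy_from_slice(chunk);
            i64::from_ne_bytes(raw)
        })
        .collect();
    Ok(values)
}
```

With 40 bytes of data, a claim of 5 elements decodes successfully, while a claim of 10 elements returns `SizeMismatch` rather than reading past the initialised bytes.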

@robertu94

robertu94 commented Mar 3, 2026

You're right, that's a bug. What compressors are supposed to do in the decompression case is 1) if output.has_data() == false, allocate memory to hold the decompressed data, or 2) if output.has_data() == true, check if output->size_in_bytes() < output_size and, if so, modify the output's dimension metadata (but not capacity) to fix that.

What probably needs to happen in this case is that a new function be added to the domain_manager class to handle this kind of "copy/move if it fits" mechanism, so it doesn't need to be reimplemented everywhere.
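As a rough sketch of what such a "copy if it fits" helper could look like, here is a toy model in Rust. It is not libpressio code and not the domain_manager API: `Output`, `FitError`, and `copy_if_fits` are hypothetical names. The model allocates when the output has no storage, and otherwise copies into the existing storage and updates only the length metadata, leaving the capacity alone. Returning an error when the existing storage is too small is an assumption of this sketch; the comment above does not say what should happen in that case.

```rust
/// Toy model of an output buffer (hypothetical; not libpressio's data type).
#[derive(Debug, Clone, PartialEq)]
pub struct Output {
    /// Backing storage; `None` models `has_data() == false`.
    pub storage: Option<Vec<i64>>,
    /// Number of elements the dimension metadata currently describes.
    pub len: usize,
}

#[derive(Debug, PartialEq, Eq)]
pub enum FitError {
    /// Assumed behaviour: refuse to copy when the existing storage is too small.
    TooSmall { capacity: usize, needed: usize },
}

/// Copy `src` into `out`, allocating only when `out` has no storage yet.
pub fn copy_if_fits(src: &[i64], out: &mut Output) -> Result<(), FitError> {
    match out.storage.as_mut() {
        // Case 1: no data yet, so allocate exactly what the source needs.
        None => {
            out.storage = Some(src.to_vec());
        }
        // Case 2: data exists, so reuse it if the source fits.
        Some(buf) => {
            if buf.len() < src.len() {
                return Err(FitError::TooSmall {
                    capacity: buf.len(),
                    needed: src.len(),
                });
            }
            // Only as many elements as the source holds are copied,
            // so nothing is ever read beyond the end of `src`.
            buf[..src.len()].copy_from_slice(src);
        }
    }
    // Update the metadata to the real element count; capacity is untouched.
    out.len = src.len();
    Ok(())
}
```

In this model, copying 5 values into a preallocated 10-element output leaves the storage at 10 elements but sets `len` to 5, which is the reverse of the noop bug where 10 values were read from a 5-element source.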

@juntyr
Owner Author

juntyr commented Mar 10, 2026

Ok, now let's try to match https://github.com/robertu94/libpressio/blob/master/tools/swig/libpressio.py as much as possible ...

@juntyr juntyr mentioned this pull request Mar 23, 2026
2 participants