stdlib: std::compression::zip and std::compression::deflate #2930
lerno merged 28 commits into c3lang:master from
Book-reader left a comment:
It's great to see zip support being added to the stdlib! I don't have much knowledge of the zip spec, so I can't comment on the implementation, but I do have this one small style nitpick.
Figured out the other problem: with this change and #2930 (comment), writing files of 0xFFFFFFFF bytes and larger works.
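For readers unfamiliar with why 0xFFFFFFFF is the boundary: in the ZIP format the 32-bit size fields use 0xFFFFFFFF as a sentinel meaning "look in the ZIP64 extra field", so a file of exactly that size can no longer be stored directly. The following is a minimal Python sketch of that rule for a local file header, not the C3 code from this PR; the helper name `encode_sizes` is hypothetical.

```python
import struct

ZIP32_MAX = 0xFFFFFFFF    # sentinel stored in the 32-bit size fields
ZIP64_EXTRA_ID = 0x0001   # header ID of the ZIP64 extended information extra field


def encode_sizes(uncompressed: int, compressed: int):
    """Return (32-bit size fields, ZIP64 extra field) for a local file header.

    Hypothetical helper for illustration only.
    """
    if uncompressed >= ZIP32_MAX or compressed >= ZIP32_MAX:
        # Both 32-bit fields carry the sentinel; the real 64-bit sizes go
        # into the extra field (original size first, then compressed size).
        fields = struct.pack("<II", ZIP32_MAX, ZIP32_MAX)
        extra = struct.pack("<HHQQ", ZIP64_EXTRA_ID, 16, uncompressed, compressed)
        return fields, extra
    # Small entries: compressed size, then uncompressed size, no extra field.
    return struct.pack("<II", compressed, uncompressed), b""
```

A size of 0xFFFFFFFE still fits the plain header, while 0xFFFFFFFF already needs the extra field.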
I added a specific test case for this scenario. If you would like to extend the testing further, that would be appreciated.

A few more observations:

Hi @konimarti, excellent feedback!
I also fixed a small bug in the STORE read method.
…ate`
- C3 implementation of DEFLATE (RFC 1951) and ZIP archive handling.
- Support for reading and writing archives using STORE and DEFLATE methods.
- Decompression supports both fixed and dynamic Huffman blocks.
- Compression using greedy LZ77 matching.
- Zero dependencies on libc.
- Stream-based entry reading and writing.
- Full unit test coverage.
NOTE: This is an initial implementation. Future improvements could be:
- Optimization of the LZ77 matching (lazy matching).
- Support for dynamic Huffman blocks in compression.
- ZIP64 support for large files/archives.
- Support for encryption and additional compression methods.
deflate:
- replace linear search with hash-based match finding.
- implement support for dynamic Huffman blocks using the Package-Merge algorithm.
- add streaming decompression.
- add buffered StreamBitReader.
zip:
- add ZIP64 support.
- add CP437 and UTF-8 filename encoding detection.
- add DOS date/time conversion and timestamp preservation.
- add ZipEntryReader for streaming entry reads.
- implement ZipArchive.extract and ZipArchive.recover helpers.
other:
- Add `set_modified_time` to std::io.
- Add benchmarks and a few more unit tests.
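As an illustration of the DOS date/time conversion item above, here is a minimal Python sketch of the 16-bit packed fields that ZIP headers use. It is not the PR's C3 implementation, and the function names `to_dos` and `from_dos` are hypothetical.

```python
from datetime import datetime


def to_dos(dt: datetime):
    """Pack a datetime into the (date, time) 16-bit fields used by ZIP headers."""
    # Date: bits 15-9 = years since 1980, bits 8-5 = month, bits 4-0 = day.
    date = ((dt.year - 1980) << 9) | (dt.month << 5) | dt.day
    # Time: bits 15-11 = hour, bits 10-5 = minute, bits 4-0 = seconds / 2.
    time = (dt.hour << 11) | (dt.minute << 5) | (dt.second // 2)
    return date, time


def from_dos(date: int, time: int) -> datetime:
    """Unpack the 16-bit DOS fields back into a datetime."""
    return datetime(
        1980 + (date >> 9), (date >> 5) & 0x0F, date & 0x1F,
        time >> 11, (time >> 5) & 0x3F, (time & 0x1F) * 2,
    )
```

The format only has two-second resolution and covers the years 1980 to 2107, so odd seconds are rounded down on a round trip.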
add tests
fix method not passed to open_writer
- detect encrypted zip
- `ZipArchive.open_writer` default to DEFLATE
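For context on encrypted-entry detection: the ZIP general purpose bit flag marks encryption in bit 0 and UTF-8 filenames in bit 11. The Python sketch below shows a check based only on those two flag bits. Whether the PR's detection relies solely on these bits is an assumption, and `describe_flags` is a hypothetical name.

```python
FLAG_ENCRYPTED = 0x0001  # bit 0: entry is encrypted
FLAG_UTF8 = 0x0800       # bit 11: filename and comment are UTF-8


def describe_flags(flags: int) -> dict:
    """Interpret the two general purpose flag bits relevant here."""
    return {
        "encrypted": bool(flags & FLAG_ENCRYPTED),
        # Without bit 11 the spec's default filename encoding is CP437.
        "filename_encoding": "utf-8" if flags & FLAG_UTF8 else "cp437",
    }
```

A reader could then decode the raw name bytes with the reported encoding, for example `raw_name.decode(info["filename_encoding"])`.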
Update ZipLFH, ZipCDH, ZipEOCD, Zip64EOCD, and Zip64Locator structs to use little-endian bitstruct types from std::core::bitorder
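To show what little-endian layout means for one of these records, here is a Python sketch that parses the 22-byte fixed part of the end of central directory record (EOCD). It illustrates the on-disk format only, not the C3 bitstruct definitions in the PR; the dictionary keys and the name `parse_eocd` are hypothetical.

```python
import struct

# signature, disk number, disk with central directory, entries on this disk,
# total entries, central directory size, central directory offset, comment length
EOCD = struct.Struct("<IHHHHIIH")
EOCD_SIGNATURE = 0x06054B50  # bytes "PK\x05\x06" on disk


def parse_eocd(buf: bytes) -> dict:
    """Parse the fixed-size part of an EOCD record from the start of buf."""
    sig, disk, cd_disk, n_disk, n_total, cd_size, cd_offset, comment_len = (
        EOCD.unpack_from(buf)
    )
    if sig != EOCD_SIGNATURE:
        raise ValueError("not an end of central directory record")
    return {
        "entries": n_total,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
        "comment_len": comment_len,
    }
```

Because every multi-byte field is little-endian, the signature 0x06054B50 appears in the file as the bytes 50 4B 05 06.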
added a test to track this

@lerno I've pushed the final optimizations and fixes. All tests and benchmarks are passing. One last change would be to follow what @Book-reader said: "The style in the stdlib is usually to have the allocator as the first parameter instead of using a default parameter", yes?

Would it be possible to compress the data written to a

Yes, it's definitely possible, but it would require a streaming deflate compressor which the
fn void? compress_stream(Allocator allocator = mem, InStream input, OutStream output)
This is indeed a more complex implementation than the current one and would be a good future improvement. I had some issues implementing this last week, but I will definitely give it another go when I have the time.
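To make the idea of a streaming compressor concrete, here is a rough Python analogue built on zlib's incremental compressor. It reads the input in chunks and writes raw DEFLATE data as it goes, so the whole input never has to be held in memory. This is only a sketch of the concept, not the proposed C3 `compress_stream`; the name `compress_stream_sketch` and the chunk size are assumptions.

```python
import io
import zlib


def compress_stream_sketch(src, dst, chunk_size=64 * 1024):
    """Copy src to dst as a raw DEFLATE stream, one chunk at a time."""
    compressor = zlib.compressobj(wbits=-15)  # negative wbits: raw DEFLATE, no zlib header
    while chunk := src.read(chunk_size):
        dst.write(compressor.compress(chunk))
    dst.write(compressor.flush())  # emit the final block


# Usage: compress an in-memory stream and check the round trip.
source = io.BytesIO(b"streaming " * 1000)
target = io.BytesIO()
compress_stream_sketch(source, target, chunk_size=256)
assert zlib.decompress(target.getvalue(), wbits=-15) == b"streaming " * 1000
```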
* style changes
* update tests
* style changes in `deflate.c3`
Book-reader left a comment:
Great! Most of the feedback I have now is incredibly minor code style things I missed before.
I do think the allocator should be changed to be the first parameter in deflate::compress etc. The stdlib style changed in 0.7 to be like that so that it is clear which functions cause allocations that need to be freed, because those functions must be explicitly given an allocator.

There's an issue in the
The following test highlights the issue and fails with an io::EOF error in the last read_byte call:
fn void test_deflate_embedded_stream() => @pool()
{
String base = "This is a streaming test for DEFLATE. ";
char[] compressed = deflate::compress(mem, base[..])!!;
defer free(compressed.ptr);
usz append_len = compressed.len + 1;
char[] append = mem::malloc(append_len)[:append_len];
defer free(append.ptr);
append[:compressed.len] = compressed[..];
append[^1] = 'c';
ByteReader reader;
reader.init(append);
ByteWriter writer;
writer.tinit();
deflate::decompress_stream(&reader, &writer)!!;
assert(writer.str_view() == base);
assert(reader.read_byte()!! == 'c');
}

Maybe something I introduced?

I dropped the last two commits and re-tested, and it still fails. It does not seem to be related to your changes. I think it's related to the refill logic of the bitreader.

Yeah, I looked through it and this happens due to the internal buffering.
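The behaviour the test expects can be sketched in Python with zlib, which tracks how many buffered bytes lie past the end of the DEFLATE stream. The reader below pulls input in chunks, so it over-reads just like a buffered bit reader, and then seeks the source back by the number of unused bytes. This is an illustration of one possible approach, not the fix applied in the PR; `decompress_embedded` is a hypothetical name.

```python
import io
import zlib


def decompress_embedded(src, chunk_size=16):
    """Decompress a raw DEFLATE stream embedded in src, leaving src
    positioned on the first byte after the stream."""
    d = zlib.decompressobj(wbits=-15)
    out = bytearray()
    while not d.eof:
        chunk = src.read(chunk_size)  # buffered read, may go past the stream end
        if not chunk:
            raise EOFError("truncated DEFLATE stream")
        out += d.decompress(chunk)
    # Hand back the bytes that were buffered but do not belong to the stream.
    src.seek(-len(d.unused_data), io.SEEK_CUR)
    return bytes(out)


# Mirror of the failing test: compressed data followed by one extra byte.
base = b"This is a streaming test for DEFLATE. "
c = zlib.compressobj(wbits=-15)
reader = io.BytesIO(c.compress(base) + c.flush() + b"c")
assert decompress_embedded(reader) == base
assert reader.read(1) == b"c"
```

For a source that cannot seek, the unused bytes would have to be kept and served before any further reads instead.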
…ailable.
- `instream.seek` is replaced by `set_cursor` and `cursor`.
- `instream.available`, `cursor` etc are long/ulong rather than isz/usz to be correct on 32-bit.