Summary
Update the compression tooling to produce models that use DECODE
operators instead of the metadata-based compression scheme.
The interpreter already supports DECODE operators for decompression.
The previous tooling stored compression metadata in the flatbuffer,
requiring the interpreter to detect compressed tensors and requiring
each kernel to implement decompression internally. The updated tooling
inserts explicit DECODE custom operators into the graph, handling
decompression before kernels execute and eliminating per-kernel
decompression logic.
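As a rough illustration of the graph rewrite described above, the following sketch inserts a DECODE operator ahead of the first consumer of each compressed tensor and rewires consumers to read the decoded output. The `Op` and `Graph` classes, the `insert_decode_ops` function, and the pairing of each encoded tensor with one ancillary tensor are all assumptions made for this example; they are not the tooling's actual data structures or the real TFLite schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Op:
    """Hypothetical, simplified operator: a name plus tensor indices."""
    name: str
    inputs: list[int]
    outputs: list[int]


@dataclass
class Graph:
    """Hypothetical, simplified subgraph (not the real TFLite schema)."""
    num_tensors: int
    ops: list[Op] = field(default_factory=list)


def insert_decode_ops(graph: Graph, compressed: dict[int, int]) -> Graph:
    """Return a new graph with DECODE ops feeding consumers of compressed tensors.

    `compressed` maps an encoded tensor index to the index of its ancillary
    tensor (for example, a lookup table). This mapping is an assumption of
    the sketch.
    """
    new_ops: list[Op] = []
    decoded: dict[int, int] = {}  # encoded tensor index -> decoded tensor index
    num_tensors = graph.num_tensors
    for op in graph.ops:
        rewired: list[int] = []
        for t in op.inputs:
            if t in compressed:
                if t not in decoded:
                    # Allocate a fresh tensor for the decoded values and emit
                    # one DECODE op before the first consumer.
                    decoded[t] = num_tensors
                    num_tensors += 1
                    new_ops.append(Op("DECODE", [t, compressed[t]], [decoded[t]]))
                rewired.append(decoded[t])
            else:
                rewired.append(t)
        new_ops.append(Op(op.name, rewired, list(op.outputs)))
    return Graph(num_tensors, new_ops)


# Tensors: 0 = activations, 1 = encoded weights, 2 = output, 3 = ancillary data.
original = Graph(num_tensors=4, ops=[Op("FULLY_CONNECTED", [0, 1], [2])])
rewritten = insert_decode_ops(original, {1: 3})
for op in rewritten.ops:
    print(op.name, op.inputs, op.outputs)
```

In this toy graph the kernel never sees the encoded tensor, which is the property that lets per-kernel decompression logic be dropped.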
Additionally, refactor the tooling to use a plugin architecture,
enabling multiple compression methods (LUT, Huffman, Pruning) to be
implemented independently.
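A minimal sketch of what such a plugin boundary could look like, using `typing.Protocol`. The method name `compress`, the `Compressed` result type, the `REGISTRY` dict, and the `decode_lut` helper are hypothetical names chosen for illustration; the actual `Compressor` protocol in this change may have a different shape.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass
class Compressed:
    """Hypothetical result: per-element indices plus ancillary data."""
    indices: list[int]
    ancillary: list[int]


class Compressor(Protocol):
    """Hypothetical plugin interface; the real protocol may differ."""
    name: str

    def compress(self, values: Sequence[int]) -> Compressed: ...


class LutCompressor:
    """Lookup-table compression: store each distinct value once, then index it."""
    name = "lut"

    def compress(self, values: Sequence[int]) -> Compressed:
        table = sorted(set(values))
        position = {v: i for i, v in enumerate(table)}
        return Compressed([position[v] for v in values], table)


def decode_lut(c: Compressed) -> list[int]:
    """Reverse LUT compression by looking each index up in the table."""
    return [c.ancillary[i] for i in c.indices]


# Plugins are looked up by name, so new methods can be added independently.
REGISTRY: dict[str, Compressor] = {"lut": LutCompressor()}

weights = [7, -3, 7, 7, 0, -3]
packed = REGISTRY["lut"].compress(weights)
print(packed.indices, packed.ancillary)
assert decode_lut(packed) == weights
```

Because the protocol is structural, a Huffman or Pruning plugin would only need to provide the same attributes; it would not have to inherit from a shared base class.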
Changes
- Replace model_facade with model_editor for TFLite model manipulation
- Add DECODE operator insertion logic for compressed tensors
- Add Compressor protocol for compression plugins
- Implement LUT compression as a plugin (Huffman and Pruning are stubs)
- Add integration tests verifying compressed models produce correct
inference results through the TFLM interpreter
Testing
All changes include unit tests. Integration tests run with
--//:with_compression and verify:
- Compressed model outputs match uncompressed model outputs
- DECODE operators are inserted
- Compressed models are smaller than originals