feat(compression): update tooling to use DECODE operators by rkuester · Pull Request #3574 · tensorflow/tflite-micro

rkuester · 2026-05-26T14:05:39Z

This is a draft PR for running CI, gathering review, and seeing the
commits in context. The commits along this branch will be submitted
individually for merge.

This replaces #3400, whose branch had fallen far behind main. Rather
than rebasing, the tooling was rebuilt fresh on current main,
commit-by-commit, with TensorFlow-free tests that build fixtures via
model_editor instead of TensorFlow.

The first commit is the tflm_py_* shim-wrapper change submitted
separately as #3573; the remaining commits are the DECODE tooling.

See the linked issue for a description of the change.

BUG=implements #3256

Implement a unified module for creating, reading, and modifying TFLite models with a clean API. The module eliminates manual index tracking and buffer management through automatic bookkeeping, and supports both declarative and imperative construction styles.

Wrapper classes (Tensor, Operator, Subgraph, Model) hold the underlying flatbuffer T objects as backing storage rather than copying fields into dataclasses. This preserves all schema fields during read-modify-write cycles, including fields not explicitly handled by model_editor, so future schema additions are preserved automatically.

Add test coverage, including field preservation tests that verify unhandled schema fields survive read-modify-write. BUG=part of tensorflow#3256
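To illustrate the backing-storage idea, here is a minimal sketch. `RawTensorT` and `TensorWrapper` are hypothetical stand-ins invented for this example; they are not the generated flatbuffer classes or the actual model_editor API. The point is only that a wrapper which delegates to the original object cannot lose fields it does not know about.

```python
class RawTensorT:
    """Hypothetical stand-in for a generated flatbuffer object-API class."""

    def __init__(self, name, shape, is_variable=False):
        self.name = name
        self.shape = shape
        # A field the wrapper below never reads or writes.
        self.isVariable = is_variable


class TensorWrapper:
    """Exposes a few fields; everything else stays on the backing object."""

    def __init__(self, backing):
        self._backing = backing

    @property
    def name(self):
        return self._backing.name

    @name.setter
    def name(self, value):
        self._backing.name = value

    def to_backing(self):
        # No copying: the same object that was read is the one written back.
        return self._backing


raw = RawTensorT(name="weights", shape=[4, 4], is_variable=True)
wrapped = TensorWrapper(raw)
wrapped.name = "weights_renamed"
result = wrapped.to_backing()
```

Had the wrapper copied `name` and `shape` into a dataclass and rebuilt the tensor on write, `isVariable` would have been dropped unless it was handled explicitly.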

…_editor Replace model_facade with model_editor in compress.py and tests. model_editor provides a cleaner API with better buffer and metadata handling. Update BUILD dependencies accordingly. BUG=part of tensorflow#3256

Remove model_facade module and its tests, now superseded by model_editor. BUG=part of tensorflow#3256

…ess_test Replace dictionary-based test_models.build() with model_editor's declarative API for building test models. BUG=part of tensorflow#3256

Remove test_models module and its tests, now superseded by model_editor. BUG=part of tensorflow#3256

Add decode module with DecodeType constants and DecodeCommonMetadata, per the TFLM DECODE Operator Design document. BUG=part of tensorflow#3256

Define the plugin interface for compression methods. Each compressor implements the Compressor protocol with a compress() method that returns encoded data and ancillary data. BUG=part of tensorflow#3256

Implement LutCompressor using the Compressor protocol. Lookup table compression replaces tensor values with indices into a table of unique values, producing packed indices and ancillary data in the format expected by the TFLM DECODE kernel. Supports per-tensor and per-channel compression, sizes value tables to actual unique count, and handles unquantized tensors. BUG=part of tensorflow#3256
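The core idea of lookup-table compression can be sketched in a few lines of plain Python. This is an illustration only: the function names are hypothetical, and the bit order and padding used here are assumptions, not the layout the TFLM DECODE kernel expects.

```python
def lut_compress(values, index_bitwidth):
    """Return (value_table, packed_indices) for a flat list of integers."""
    # The table holds only the unique values actually present.
    table = sorted(set(values))
    if len(table) > (1 << index_bitwidth):
        raise ValueError("too many unique values for the index bitwidth")
    position = {v: i for i, v in enumerate(table)}

    # Assumed layout: indices written MSB-first, zero-padded to a whole byte.
    bits = "".join(format(position[v], f"0{index_bitwidth}b") for v in values)
    bits += "0" * (-len(bits) % 8)
    packed = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return table, packed


def lut_decompress(table, packed, index_bitwidth, count):
    """Invert lut_compress for the first `count` elements."""
    bits = "".join(format(b, "08b") for b in packed)
    return [
        table[int(bits[i * index_bitwidth:(i + 1) * index_bitwidth], 2)]
        for i in range(count)
    ]


values = [-3, 7, 7, -3, 0, 7]
table, packed = lut_compress(values, index_bitwidth=2)
restored = lut_decompress(table, packed, index_bitwidth=2, count=len(values))
```

Here six values with three unique entries are stored as a three-entry table plus two bytes of 2-bit indices. Per-channel compression would, under the same assumptions, build one such table per channel.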

Add spec types, YAML parser support, and plugin stubs for Huffman and Pruning compression methods. The plugins raise CompressionError when invoked, to be replaced with working implementations later. BUG=part of tensorflow#3256

Add alt_decompression_memory_size parameter to the Python interpreter API. When non-zero, allocates a separate memory region for DECODE operator outputs and calls SetDecompressionMemory before AllocateTensors. BUG=part of tensorflow#3256

Insert DECODE operators before consumers of compressed tensors. Each consumer gets its own DECODE operator to support alternate decompression memory, which resets allocations between DECODE invocations. After insertion, compressed tensors are rewritten to hold encoded data as UINT8 with shape matching byte count. BUG=part of tensorflow#3256

Replace monolithic compression logic with a dispatch table that routes compression requests to plugin modules based on the spec's compression method type. After compressing tensors, insert DECODE operators into the model graph. Warn when compression expands data, helping users identify tensors that don't benefit from compression. BUG=part of tensorflow#3256
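The dispatch approach might look roughly like the sketch below. `LutMethod`, `HuffmanMethod`, the plugin functions, and `compress_tensor` are hypothetical names, and the toy LUT plugin produces a placeholder encoding rather than real LUT output. Only `CompressionError` is a name taken from the commit messages, and its definition here is assumed.

```python
import warnings
from dataclasses import dataclass


class CompressionError(Exception):
    """Raised when a tensor cannot be compressed."""


@dataclass
class LutMethod:
    index_bitwidth: int


@dataclass
class HuffmanMethod:
    pass


def _compress_lut(data, method):
    # Placeholder encoding: a 4-byte tag followed by the unique byte values.
    return b"LUT0" + bytes(sorted(set(data)))


def _compress_huffman(data, method):
    # Mirrors the stub behavior described for not-yet-implemented plugins.
    raise CompressionError("Huffman compression is not implemented")


# The dispatch table: compression method type -> plugin function.
_DISPATCH = {LutMethod: _compress_lut, HuffmanMethod: _compress_huffman}


def compress_tensor(data, method):
    plugin = _DISPATCH.get(type(method))
    if plugin is None:
        raise CompressionError(f"no plugin for {type(method).__name__}")
    encoded = plugin(data, method)
    if len(encoded) > len(data):
        warnings.warn("compression expanded the tensor data")
    return encoded


small = compress_tensor(bytes([5] * 16), LutMethod(index_bitwidth=1))
```

Adding a new method under this design means registering one more entry in the table; the caller and the expansion check stay unchanged.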

Add tests that compress models with LUT compression, run them through the TFLM Python interpreter, and verify outputs match uncompressed originals. Cover per-tensor and per-channel quantization, various index bitwidths, unquantized weights, and alternate decompression memory. BUG=part of tensorflow#3256

Add a manual test for verifying compression on proprietary models that can't be checked into the repository. See the module docstring for usage instructions. BUG=part of tensorflow#3256

Explicit inheritance from Protocol enables static type checking at definition time and makes the interface self-documenting. BUG=part of tensorflow#3256

An upcoming change registers the DECODE operator unconditionally in the Python ops resolver, after which compress() emits DECODE-based models that load successfully. That breaks this test's original approach, which ran a model through compress() and expected the load to fail. Rewrite it to instead inject a raw COMPRESSION_METADATA entry into the flatbuffer via model_editor, directly exercising the HasCompressionMetadata() detection path for legacy-compressed models. Decoupling the test from compress() output lets it verify the legacy-rejection behavior independently of whether the DECODE operator is registered, so it passes both before and after that upcoming change. BUG=part of tensorflow#3256

The DECODE kernel and its dependencies are already compiled unconditionally -- none are guarded by USE_TFLM_COMPRESSION. Remove the #ifdef around AddDecode() in PythonOpsResolver so DECODE-based compressed models work in a default Python build. Remove the with_compression_enabled gating from compression and proprietary integration tests, since they use DECODE-based models that no longer require the flag. BUG=part of tensorflow#3256

Add test_multiple_compressed_inputs_batched: a CONCATENATION with two compressed tensor inputs, each with a different bitwidth, should produce a single DECODE with 4 inputs and 2 outputs, each ancillary tensor carrying its own distinct data. Marked expectedFailure until the implementation lands. Add test_mixed_compressed_and_uncompressed_inputs: a CONCATENATION with one compressed and one plain input leaves the plain input untouched. This already passes with the current code. BUG=part of tensorflow#3256

When a single operator (e.g., CONCATENATION) has multiple compressed tensor inputs, group them into one DECODE instead of creating a separate DECODE for each. Grouping is per-consumer, so a tensor shared across different consumers still gets a separate DECODE before each one to avoid clobbering the alternate decompression memory. BUG=part of tensorflow#3256
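The per-consumer grouping can be sketched on a toy graph representation. `Op`, `insert_decodes`, and the tensor naming scheme are hypothetical and exist only for this example; the sketch assumes each compressed input contributes an (encoded, ancillary) input pair and one output to its DECODE, which matches the "4 inputs and 2 outputs" described for two compressed inputs.

```python
from dataclasses import dataclass


@dataclass
class Op:
    opcode: str
    inputs: list
    outputs: list


def insert_decodes(ops, ancillary_for):
    """Return a new op list with one DECODE ahead of each consumer.

    `ancillary_for` maps a compressed tensor name to its ancillary tensor.
    """
    result = []
    for n, op in enumerate(ops):
        decode_inputs, decode_outputs, rewritten = [], [], []
        decoded_name = {}
        for name in op.inputs:
            if name in ancillary_for:
                if name not in decoded_name:
                    # A fresh output per consumer, never shared across ops.
                    decoded_name[name] = f"{name}_decoded_{n}"
                    decode_inputs += [name, ancillary_for[name]]
                    decode_outputs.append(decoded_name[name])
                rewritten.append(decoded_name[name])
            else:
                rewritten.append(name)
        if decode_inputs:
            result.append(Op("DECODE", decode_inputs, decode_outputs))
        result.append(Op(op.opcode, rewritten, list(op.outputs)))
    return result


graph = [
    Op("CONCATENATION", ["a", "b", "x"], ["y"]),
    Op("ADD", ["a", "y"], ["z"]),
]
new_graph = insert_decodes(graph, {"a": "a_anc", "b": "b_anc"})
```

In this example the CONCATENATION gets a single DECODE covering both `a` and `b`, the plain input `x` is untouched, and the later ADD gets its own DECODE for `a` rather than reusing the earlier output.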

An empty spec list passed to compress() previously returned an unmodified model silently. Fail early with a clear error instead, since an empty spec is almost certainly a mistake. BUG=part of tensorflow#3256

This was referenced May 26, 2026

feat(compression): update tooling to use DECODE operators #3400

Closed

feat(compression): implement model_editor for TFLite model manipulation #3439

Closed

rkuester force-pushed the feat-decode branch from d9b9b1f to 21dae01 Compare May 26, 2026 16:49

rkuester added 21 commits May 26, 2026 16:05

chore(compression): remove model_facade.py

e1afb8c

Remove model_facade module and its tests, now superseded by model_editor. BUG=part of tensorflow#3256

refactor(compression): replace test_models with model_editor in compr…

f384f98

…ess_test Replace dictionary-based test_models.build() with model_editor's declarative API for building test models. BUG=part of tensorflow#3256

chore(compression): remove test_models.py

2901f55

Remove test_models module and its tests, now superseded by model_editor. BUG=part of tensorflow#3256

feat(compression): add DECODE operator types and metadata

31434d4

Add decode module with DecodeType constants and DecodeCommonMetadata, per the TFLM DECODE Operator Design document. BUG=part of tensorflow#3256

feat(compression): add Compressor protocol

5be61ce

Define the plugin interface for compression methods. Each compressor implements the Compressor protocol with a compress() method that returns encoded data and ancillary data. BUG=part of tensorflow#3256

test(compression): add proprietary model integration test

6791fba

Add a manual test for verifying compression on proprietary models that can't be checked into the repository. See the module docstring for usage instructions. BUG=part of tensorflow#3256

refactor(compression): compressors inherit from Compressor protocol

b264245

Explicit inheritance from Protocol enables static type checking at definition time and makes the interface self-documenting. BUG=part of tensorflow#3256

feat(compression): reject empty compression spec

92300fa

An empty spec list passed to compress() previously returned an unmodified model silently. Fail early with a clear error instead, since an empty spec is almost certainly a mistake. BUG=part of tensorflow#3256

docs(python): explain env vars in test runner

8001af0

rkuester force-pushed the feat-decode branch 2 times, most recently from 15d857c to 8001af0 Compare May 26, 2026 21:18

rkuester wants to merge 21 commits into tensorflow:main from rkuester:feat-decode