
felixgalindo/TinyMLDelta


TinyMLDelta

Incremental model updates for TinyML and embedded AI devices.

Instead of shipping a full TensorFlow Lite Micro model (20–200+ KB) over the air, TinyMLDelta ships a tiny binary patch that mutates the existing model in flash into a new one — safe, atomic, and guardrail-checked.


Results

POSIX simulation (no hardware required)

| Metric | Value |
|---|---|
| Base model size | 67,440 bytes |
| Target model size | 67,440 bytes |
| Patch size | 475 bytes |
| Diff payload | 382 bytes (1 chunk) |
| Bandwidth reduction | 99.3% |
| Integrity | CRC32 per chunk |
| Slot strategy | A/B atomic swap |
| Journal | Crash-safe (power-loss recovery) |

A weight update to a real TFLite sensor model produces a 475-byte patch instead of a 67 KB re-flash. The entire sequence (update, verification, and slot flip) runs in under a second on a simulated flash image.

[run_demo] Patch size       :      475 bytes
[run_demo] Base model size  :    67440 bytes
[run_demo] Target model size:    67440 bytes
...
TinyMLDelta: chunk[0]: off=62728 len=382 enc=0 has_crc=1
TinyMLDelta: patch applied OK, new active slot=0
[verify_flash] SUCCESS: target model found at offset 131072 in flash image.
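
The bandwidth-reduction figure in the table follows directly from the patch and model sizes. A minimal sketch of that arithmetic (the helper name `bandwidth_reduction` is illustrative and not part of the repository):

```python
def bandwidth_reduction(patch_bytes: int, full_bytes: int) -> float:
    """Percentage of bytes saved by sending the patch instead of the full model."""
    return 100.0 * (1.0 - patch_bytes / full_bytes)


# Sizes taken from the POSIX demo results above.
reduction = bandwidth_reduction(475, 67_440)
print(f"{reduction:.1f}%")  # prints 99.3%
```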

Arduino UNO Q — live thermocouple anomaly demo

| Metric | Value |
|---|---|
| MCU | STM32U585 (Cortex-M33, 2 MB flash, 512 KB RAM) |
| Sensor | SparkFun MCP9600 (Qwiic, K-type thermocouple) |
| Patch transport | Monitor serial (USB CDC → Linux bridge → STM32) |
| Patch buffer | 4 KB RAM (A/B slots in SRAM) |
| Anomaly method | Welford Z-score (TFLM autoencoder optional) |
| Update | Live, no re-flash, CRC32 verified |

================================================
 TinyMLDelta Qwiic Thermo Anomaly Demo
 Arduino UNO Q | MCP9600 | TinyMLDelta
================================================
[TRAIN] 200/200  23.44 C
[TRAIN] Baseline: mean=23.45 std=0.12 C
[UPDATE] Receiving 475 bytes...
[UPDATE] Done (patch applied, slot flipped).
[INFER]  23.44 C  score=0.009  OK
[INFER]  41.00 C  score=4.21   *** ANOMALY ***
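
The "Welford Z-score" method named in the table can be illustrated with a short sketch. This is not the sketch's firmware code: the class name `RunningStats`, the sample readings, and the threshold of 3.0 are assumptions for illustration, and the demo's own score scaling is not documented here, so the values will not match the log above.

```python
import math


class RunningStats:
    """Welford's online algorithm: mean and variance in one pass, O(1) memory."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        # Sample standard deviation; 0.0 until at least two readings exist.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def z_score(self, x: float) -> float:
        s = self.std
        return abs(x - self.mean) / s if s > 0 else 0.0


# Training phase: feed baseline readings (made-up values).
stats = RunningStats()
for reading in [23.3, 23.4, 23.5, 23.6]:
    stats.update(reading)

# Inference phase: flag readings far from the baseline (threshold is an assumption).
is_anomaly = stats.z_score(41.0) > 3.0
```

Welford's update avoids storing the training samples, which suits a device with a few kilobytes of spare RAM.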

What it solves

| Problem | TinyMLDelta's answer |
|---|---|
| OTA bandwidth cost | Ship diffs, not full models |
| Flash wear | One write per changed byte, not the whole model |
| Update latency | Seconds to transfer a patch vs minutes for a full image |
| Fleet fragmentation | Guardrails enforce ABI/opset/arena compatibility before applying |
| Bootloader complexity | No custom bootloader needed; just the C runtime in your firmware |

Supported today

  • TensorFlow Lite Micro models
  • POSIX / macOS simulated flash environment
  • CRC32 per-chunk integrity
  • A/B slot atomic updates
  • Crash-safe journaling (power-loss recovery)
  • Arduino UNO Q (STM32U585 + Zephyr)
  • RAW and RLE chunk encoding

On the roadmap

  • Edge Impulse frontend
  • SHA-256 and AES-CMAC signatures
  • Model versioning TLVs
  • Zephyr RTOS port
  • Arduino UNO R4 WiFi port
  • ESP32 / Tachyon ports

When to use a patch vs a full firmware update

TinyMLDelta safely updates models when the firmware remains compatible. Compatibility is enforced by metadata TLVs generated by PatchGen and validated by the MCU runtime.

✔ Patch-friendly (no firmware update needed)

  • Weight and bias updates
  • Quantization parameter changes
  • Re-training the same architecture on new data
  • Minor graph edits with no operator changes
  • Same opset, ABI, arena size, and I/O schema

✖ Requires a full firmware update

| Change | Why |
|---|---|
| New operators | Firmware must link the new kernels |
| Opset version change | Operator implementations differ |
| TFLM ABI change | Interpreter ABI mismatch |
| Larger arena requirement | Arena is fixed at compile time |
| Different I/O shapes or dtypes | Application code depends on these |

TinyMLDelta automatically rejects incompatible patches.
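
The rejection rules above amount to comparing what the patch requires against what the firmware was built with. A rough sketch of that comparison, assuming the four values have already been extracted; `ModelContract` and `rejection_reasons` are hypothetical names, and the real check lives in the C runtime:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelContract:
    """For firmware: what it provides. For a patch: what the target model requires."""
    arena_bytes: int
    tflm_abi: int
    opset_hash: int
    io_hash: int


def rejection_reasons(firmware: ModelContract, patch: ModelContract) -> list[str]:
    """Return why a patch must be rejected; an empty list means it may be applied."""
    reasons = []
    if firmware.arena_bytes < patch.arena_bytes:
        reasons.append("arena too small")
    if firmware.tflm_abi != patch.tflm_abi:
        reasons.append("TFLM ABI mismatch")
    if firmware.opset_hash != patch.opset_hash:
        reasons.append("opset changed")
    if firmware.io_hash != patch.io_hash:
        reasons.append("I/O schema changed")
    return reasons


# Made-up values for illustration.
firmware = ModelContract(arena_bytes=16_384, tflm_abi=1,
                         opset_hash=0xA1B2C3D4, io_hash=0x11223344)
retrained = replace(firmware, arena_bytes=12_288)       # same graph, new weights
new_ops = replace(firmware, arena_bytes=32_768, opset_hash=0xDEADBEEF)

ok = rejection_reasons(firmware, retrained)   # []
bad = rejection_reasons(firmware, new_ops)    # arena and opset both fail
```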


Architecture

   PC / CI                          MCU (device)
   ────────────────────             ──────────────────────────────
   base.tflite  ──┐
   target.tflite ─┤
                  ▼
           PatchGen (Python)
           • byte-level diff
           • RLE compression           flash slot A  [active model]
           • CRC32 per chunk           flash slot B  [inactive]
           • metadata TLVs
                  │
                  │  OTA (serial / BLE / MQTT / …)
                  ▼
           TinyMLDelta Core (C)
           • parse header + TLVs
           • enforce guardrails
           • copy A → B
           • apply diff chunks → B
           • verify CRC32
           • atomic slot flip: B → active
                  │
                  ▼
                         flash slot B  [new active model]

PatchGen is stateless and runs off-target (laptop, CI server). TinyMLDelta Core is platform-agnostic C that lives in your firmware.
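
The two halves of the diagram can be modelled in a few lines of Python. This is a simplified in-memory sketch, not the repository's code: `make_chunks` and `apply_patch` are hypothetical names, slots are plain `bytearray` objects, models are assumed to be the same length, and the real PatchGen may merge nearby runs into fewer chunks.

```python
import zlib


def make_chunks(base: bytes, target: bytes):
    """Byte-level diff: one (offset, payload, crc32) tuple per run of differing bytes."""
    if len(base) != len(target):
        raise ValueError("this sketch only handles equal-length models")
    chunks, i, n = [], 0, len(base)
    while i < n:
        if base[i] == target[i]:
            i += 1
            continue
        start = i
        while i < n and base[i] != target[i]:
            i += 1
        payload = target[start:i]
        chunks.append((start, payload, zlib.crc32(payload)))
    return chunks


def apply_patch(slots, active: int, chunks) -> int:
    """Copy active -> inactive, patch the inactive slot, return the new active index."""
    inactive = 1 - active
    slots[inactive][:] = slots[active]            # copy A -> B
    for off, payload, crc in chunks:
        if zlib.crc32(payload) != crc:
            return active                         # reject: old model stays active
        slots[inactive][off:off + len(payload)] = payload
    return inactive                               # slot flip: B becomes active


base = bytes([1, 2, 3, 4, 5, 6])
target = bytes([1, 2, 9, 9, 5, 7])
slots = [bytearray(base), bytearray(len(base))]
chunks = make_chunks(base, target)
active = apply_patch(slots, 0, chunks)
```

Because the active slot is never written, a failed or interrupted update leaves the previous model untouched, which is the point of the A/B scheme.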


Quickstart

Option A — No hardware (POSIX simulation)

cd examples/posix
./setup.sh --run

Runs the full flow — model generation → patch generation → simulated flash apply → verification — entirely on your Mac or Linux machine.

Option B — Arduino UNO Q + MCP9600 thermocouple

cd examples/UnoQ_TinyMLDeltaDemo
./setup.sh --upload        # install deps, compile, flash
python3 run_demo.py        # train → update → infer

Examples

| Example | Platform | What it shows |
|---|---|---|
| examples/posix/ | macOS / Linux | Full update flow, no hardware. Model gen → patch → simulated flash → verify. |
| examples/UnoQ_TinyMLDeltaDemo/ | Arduino UNO Q | Live thermocouple anomaly demo. Train on-device → generate patch on PC → push OTA → infer. |
| examples/UnoQ_EI_TinyMLDeltaDemo/ | Arduino UNO Q | Edge Impulse variant (stub, Z-score fallback runs today). |
| examples/modelgen/ | PC | Standalone TFLite model generator used by the POSIX demo. |

Wire format

Patch header (tmd_hdr_t, packed, little-endian)

typedef struct __attribute__((packed)) {
    uint8_t  v;               // format version (always 1)
    uint8_t  algo;            // 0=NONE, 1=CRC32, 2=SHA256, 3=CMAC
    uint16_t chunks_n;        // number of diff chunks
    uint32_t base_len;        // expected base model size
    uint32_t target_len;      // expected target model size
    uint8_t  base_chk[32];    // integrity digest of base
    uint8_t  target_chk[32];  // integrity digest of target
    uint16_t meta_len;        // bytes of metadata TLVs that follow
    uint16_t flags;
} tmd_hdr_t;
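
Because the struct is packed and little-endian, it maps directly onto a Python `struct` format string, which is handy for inspecting `.tmd` files on a PC. A sketch under that assumption; `PatchHeader` and `parse_header` are illustrative names, not part of the CLI:

```python
import struct
from typing import NamedTuple

# "<" = little-endian with no padding, mirroring the packed C struct field by field.
HDR_FMT = "<BBHII32s32sHH"
HDR_SIZE = struct.calcsize(HDR_FMT)  # 80 bytes


class PatchHeader(NamedTuple):
    v: int
    algo: int
    chunks_n: int
    base_len: int
    target_len: int
    base_chk: bytes
    target_chk: bytes
    meta_len: int
    flags: int


def parse_header(buf: bytes) -> PatchHeader:
    """Decode the fixed-size header at the start of a patch buffer."""
    if len(buf) < HDR_SIZE:
        raise ValueError("buffer shorter than patch header")
    hdr = PatchHeader(*struct.unpack_from(HDR_FMT, buf, 0))
    if hdr.v != 1:
        raise ValueError(f"unsupported format version {hdr.v}")
    return hdr


# Round-trip a header built from made-up digests and the demo's sizes.
raw = struct.pack(HDR_FMT, 1, 1, 1, 67_440, 67_440,
                  bytes(32), bytes(32), 0, 0)
hdr = parse_header(raw)
```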

Metadata TLVs ([tag][len][value...])

| Tag | Name | Type | Purpose |
|---|---|---|---|
| 0x01 | REQ_ARENA_BYTES | u32 | Reject if firmware arena < this value |
| 0x02 | TFLM_ABI | u16 | Reject on ABI version mismatch |
| 0x03 | OPSET_HASH | u32 | Reject if op-set changed |
| 0x04 | IO_HASH | u32 | Reject if I/O tensor schema changed |
| ≥0x80 | vendor | any | Ignored by core; application-defined |
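
A TLV region like this can be walked with a simple loop. The sketch below assumes a one-byte tag, a one-byte length, and little-endian values; the field widths are not stated above, so treat them as assumptions. `parse_tlvs` is a hypothetical helper, and skipping unknown non-vendor tags is a choice made for this sketch.

```python
import struct

# Known tags from the table: tag -> (name, struct format of the value).
TAG_FORMATS = {
    0x01: ("REQ_ARENA_BYTES", "<I"),
    0x02: ("TFLM_ABI", "<H"),
    0x03: ("OPSET_HASH", "<I"),
    0x04: ("IO_HASH", "<I"),
}


def parse_tlvs(meta: bytes) -> dict:
    """Decode known TLVs into a dict; vendor (>= 0x80) and unknown tags are skipped."""
    out = {}
    pos = 0
    while pos < len(meta):
        if pos + 2 > len(meta):
            raise ValueError("truncated TLV header")
        tag, length = meta[pos], meta[pos + 1]
        pos += 2
        value = meta[pos:pos + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        pos += length
        if tag in TAG_FORMATS:
            name, fmt = TAG_FORMATS[tag]
            if length != struct.calcsize(fmt):
                raise ValueError(f"bad length for {name}")
            out[name] = struct.unpack(fmt, value)[0]
    return out


# Made-up metadata: arena requirement, ABI version, and one vendor tag.
meta = (bytes([0x01, 4]) + struct.pack("<I", 16_384)
        + bytes([0x02, 2]) + struct.pack("<H", 3)
        + bytes([0x80, 1, 0xAA]))
fields = parse_tlvs(meta)
```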

Chunk header (tmd_chunk_hdr_t)

typedef struct __attribute__((packed)) {
    uint32_t off;      // byte offset into the model
    uint16_t len;      // payload length in bytes
    uint8_t  enc;      // 0 = RAW, 1 = RLE
    uint8_t  has_crc;  // 1 = CRC32 appended after payload
} tmd_chunk_hdr_t;
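
Reading one chunk means decoding this 8-byte header, slicing out the payload, and checking the trailing CRC when `has_crc` is set. The sketch below assumes the CRC is the standard CRC-32 (as computed by `zlib.crc32`), covers the payload only, and is stored little-endian; none of these details are confirmed above. `read_chunk` is an illustrative name, and only RAW chunks are handled because the RLE layout is not described here.

```python
import struct
import zlib

CHUNK_FMT = "<IHBB"                       # off, len, enc, has_crc
CHUNK_SIZE = struct.calcsize(CHUNK_FMT)   # 8 bytes


def read_chunk(buf: bytes, pos: int):
    """Return (offset, payload, next_pos) for the chunk starting at pos."""
    off, length, enc, has_crc = struct.unpack_from(CHUNK_FMT, buf, pos)
    pos += CHUNK_SIZE
    payload = buf[pos:pos + length]
    if len(payload) != length:
        raise ValueError("truncated chunk payload")
    pos += length
    if has_crc:
        (crc,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        if zlib.crc32(payload) != crc:
            raise ValueError("chunk CRC mismatch")
    if enc != 0:
        raise NotImplementedError("only RAW (enc=0) is handled in this sketch")
    return off, payload, pos


# Build one RAW chunk with a trailing CRC, then read it back.
payload = b"\x10\x20\x30"
blob = (struct.pack(CHUNK_FMT, 62_728, len(payload), 0, 1)
        + payload + struct.pack("<I", zlib.crc32(payload)))
off, data, end = read_chunk(blob, 0)
```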

Installation (CLI / PatchGen only)

cd cli/
./install.sh                    # creates .tinyenv + installs tensorflow
source .tinyenv/bin/activate
python3 tinymldelta_patchgen.py base.tflite target.tflite patch.tmd

Directory layout

TinyMLDelta/
├── cli/
│   ├── install.sh                   Create .tinyenv + install CLI deps
│   ├── requirements.txt             Python dependencies
│   ├── tinymldelta_patchgen.py      PatchGen: diff engine, TLV writer, .tmd output
│   └── tinymldelta_meta_compute.py  Optional: extract TFLite metadata for TLVs
│
├── examples/
│   ├── posix/                       No-hardware simulation (macOS/Linux)
│   │   ├── setup.sh                 Install deps + build
│   │   ├── run_demo.sh              End-to-end: generate → patch → apply → verify
│   │   ├── README.md
│   │   ├── demo_apply.c             POSIX patch applier
│   │   ├── tinymldelta_ports_posix.c POSIX flash/journal/log port
│   │   ├── flash_layout.h           Simulated A/B flash geometry
│   │   ├── make_flash.py            Build flash.bin with A/B slots
│   │   └── verify_flash.py          Confirm target model in flash after update
│   │
│   ├── modelgen/
│   │   ├── make_models.py           Generate base.tflite + target.tflite for demos
│   │   └── README.md
│   │
│   ├── UnoQ_TinyMLDeltaDemo/        Arduino UNO Q + MCP9600 thermocouple
│   │   ├── setup.sh                 Install deps + compile (+ --upload to flash)
│   │   ├── run_demo.py              Interactive demo runner (train→update→infer)
│   │   ├── UnoQ_TinyMLDeltaDemo.ino Main Arduino sketch
│   │   ├── tinymldelta_ports_arduino.cpp  RAM-backed port for UNO Q
│   │   ├── flash_layout_uno_q.h     Virtual A/B slot map
│   │   ├── make_model.py            Train autoencoder from CSV, generate patch.tmd
│   │   ├── send_patch.py            Send patch to device over serial
│   │   ├── model.h                  Placeholder model (regenerated by make_model.py)
│   │   └── README.md
│   │
│   └── UnoQ_EI_TinyMLDeltaDemo/    Arduino UNO Q + Edge Impulse (stub)
│       ├── UnoQ_EI_TinyMLDeltaDemo.ino
│       ├── tinymldelta_ports_arduino.cpp
│       ├── flash_layout_uno_q.h
│       └── README.md
│
├── runtime/
│   ├── include/
│   │   ├── tinymldelta.h            Public C API (tmd_apply_patch_from_memory)
│   │   ├── tinymldelta_config.h     Build-time flags + firmware guardrail config
│   │   ├── tinymldelta_internal.h   Wire format: tmd_hdr_t, tmd_chunk_hdr_t, TLVs
│   │   └── tinymldelta_ports.h      Platform abstraction: flash, digest, slots, journal
│   └── src/
│       └── tinymldelta_core.c       Platform-agnostic patch engine
│
├── CONTRIBUTING.md
├── SECURITY.md
├── LICENSE                          Apache-2.0
└── README.md                        This file

Contributing

Contributions welcome — see CONTRIBUTING.md for guidelines.

Areas of highest interest:

  • New MCU ports (Zephyr, ESP32, STM32 bare-metal, Tachyon)
  • Edge Impulse frontend
  • SHA-256 / AES-CMAC signing pipeline
  • LZ4 or bsdiff compression backend
  • CI test harness for the POSIX demo

License

Apache-2.0 © 2024–2025 Felix Galindo
