fix: accept valid ExecuTorch FlatBuffers binaries by yash2998chhabria · Pull Request #715 · promptfoo/modelaudit

yash2998chhabria · 2026-03-16T18:08:44Z

Summary

- Recognize valid ExecuTorch FlatBuffers binaries in `.pte` files instead of treating every non-ZIP file as an invalid archive.
- Teach shared file-type detection to identify the `ET12` ExecuTorch binary signature so valid public models do not emit S901 mismatch noise.
- Add regression coverage for scanner handling and file-type validation, and add a changelog entry for the false-positive reduction.
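
For readers following along, here is a minimal sketch of the signature check the summary above describes. It is not the code from this PR. It relies on the FlatBuffers convention that an optional 4-byte file identifier sits at bytes 4..7, right after the 4-byte root-table offset, and on the `ET12` value named in the summary. The helper name `looks_like_executorch` and the sample prefixes are made up for illustration.

```python
EXECUTORCH_MAGIC = b"ET12"  # signature named in the PR summary


def looks_like_executorch(prefix: bytes) -> bool:
    """Return True if bytes 4..7 of the file prefix hold the ET12 identifier."""
    # Bytes 0..3 are the FlatBuffers root-table offset; the identifier follows.
    return len(prefix) >= 8 and prefix[4:8] == EXECUTORCH_MAGIC


# Illustrative prefixes (assumed values, not taken from real models).
zip_prefix = b"PK\x03\x04" + b"\x00" * 4
pte_prefix = b"\x1c\x00\x00\x00" + b"ET12"

print(looks_like_executorch(pte_prefix), looks_like_executorch(zip_prefix))
```

The length guard matters: a truncated file shorter than 8 bytes should be rejected rather than compared against a partial slice.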

Validation

/Users/yashchhabria/projects/modelauditing/modelaudit/.venv/bin/ruff format modelaudit/ tests/
/Users/yashchhabria/projects/modelauditing/modelaudit/.venv/bin/ruff check --fix modelaudit/ tests/
/Users/yashchhabria/projects/modelauditing/modelaudit/.venv/bin/mypy modelaudit/
/Users/yashchhabria/projects/modelauditing/modelaudit/.venv/bin/pytest -n auto -m "not slow and not integration" --maxfail=1
10-model Hugging Face rerun in the worktree: executorch flagged 10/10 before the fix and 0/10 after the fix

Summary by CodeRabbit

Bug Fixes
- Eliminated false positives for valid ExecuTorch FlatBuffers binaries by recognizing ExecuTorch binaries and short-circuiting archive checks.
- Improved file-type validation for .pte model files to accept genuine ExecuTorch binaries alongside archive formats.
Tests
- Added tests covering ExecuTorch binary detection, acceptance, and rejection scenarios.
- Enabled ExecuTorch scanner tests to run in the test matrix.
Documentation
- Added a changelog entry noting the fix.
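
A rough sketch of what "accept genuine ExecuTorch binaries alongside archive formats" could look like for a `.pte` file follows. The function name `pte_header_is_acceptable` and the constants are hypothetical; the real `validate_file_type` in `modelaudit/utils/file/detection.py` may be structured differently.

```python
import tempfile
from pathlib import Path

ZIP_MAGIC = b"PK\x03\x04"      # local file header of a ZIP archive
EXECUTORCH_MAGIC = b"ET12"     # FlatBuffers file identifier named in this PR


def pte_header_is_acceptable(path: Path) -> bool:
    """Hypothetical check: a .pte file passes if it is a ZIP or an ET12 binary."""
    with path.open("rb") as fh:
        header = fh.read(8)
    is_zip = header.startswith(ZIP_MAGIC)
    is_executorch = len(header) >= 8 and header[4:8] == EXECUTORCH_MAGIC
    return is_zip or is_executorch


with tempfile.TemporaryDirectory() as tmp:
    binary = Path(tmp) / "model.pte"
    binary.write_bytes(b"\x1c\x00\x00\x00ET12" + b"\x00" * 8)
    junk = Path(tmp) / "junk.pte"
    junk.write_bytes(b"not a model file")
    results = (pte_header_is_acceptable(binary), pte_header_is_acceptable(junk))
    print(results)
```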

Recognize valid ExecuTorch FlatBuffers programs in .pte files, prevent file-type validation noise for those binaries, and add regression coverage for scanner and detection helpers. Co-Authored-By: Codex <noreply@openai.com>

coderabbitai · 2026-03-16T18:09:03Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.


Walkthrough

Detects ExecuTorch FlatBuffers binaries by inspecting header bytes and adds an early-accept path in the ExecuTorch scanner. File-type detection and validation now recognize ExecuTorch binaries; tests and CHANGELOG updated accordingly.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Changelog `CHANGELOG.md` | Added an Unreleased "Fixed" entry documenting elimination of false positives for ExecuTorch FlatBuffers binaries and `.pte` file-type validation. |
| ExecuTorch scanner `modelaudit/scanners/executorch_scanner.py` | Imported `_is_valid_executorch_binary`; scanner now reads an 8‑byte header, short-circuits to accept valid ExecuTorch binaries, records a passing "ExecuTorch Binary Format Validation", sets `bytes_scanned` to file size, finishes result, and returns early for valid binaries. |
| File detection utilities `modelaudit/utils/file/detection.py` | Added `_is_executorch_binary_signature` and `_is_valid_executorch_binary`; `detect_file_format_from_magic` returns `"executorch"` for valid binaries; `validate_file_type` treats ExecuTorch binaries as valid even if not ZIP. |
| Tests & config `tests/conftest.py`, `tests/scanners/test_executorch_scanner.py`, `tests/utils/file/test_filetype.py` | Enabled scanner tests in conftest; added helper to create minimal ExecuTorch binaries and tests for accepting versioned/valid headers and rejecting invalid signatures; extended filetype tests for detection and validation of ExecuTorch binaries. |
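
The scanner flow summarized in the table can be sketched roughly as below. This is an illustration only: `SketchResult` and `scan_pte` are hypothetical stand-ins for modelaudit's real result object and scanner method, and the fallback branch is a placeholder for the existing archive checks.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SketchResult:
    """Hypothetical stand-in for the scanner's result object."""
    checks: list = field(default_factory=list)  # (check name, passed) pairs
    bytes_scanned: int = 0
    finished: bool = False


def scan_pte(path: Path) -> SketchResult:
    result = SketchResult()
    with path.open("rb") as fh:
        header = fh.read(8)  # the walkthrough mentions an 8-byte header read

    if len(header) >= 8 and header[4:8] == b"ET12":
        # Early-accept path: record a passing check and skip the archive checks.
        result.checks.append(("ExecuTorch Binary Format Validation", True))
        result.bytes_scanned = path.stat().st_size
        result.finished = True
        return result

    # Placeholder for the pre-existing ZIP archive handling (omitted here).
    result.checks.append(("ZIP archive check (placeholder)", header.startswith(b"PK")))
    result.finished = True
    return result


with tempfile.TemporaryDirectory() as tmp:
    model = Path(tmp) / "model.pte"
    model.write_bytes(b"\x1c\x00\x00\x00ET12" + b"\x00" * 8)
    accepted = scan_pte(model)
    print(accepted.checks, accepted.bytes_scanned)
```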

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble bytes, four to eight I peep,
ET12 winks — the scanner's calm and deep.
No zip to chase, no false alarm in sight,
I hop, I mark, the binary sleeps tight.
Hooray — the pte is properly right! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 31.25% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix: accept valid ExecuTorch FlatBuffers binaries' directly and accurately summarizes the main change: modifying the scanner to recognize and accept valid ExecuTorch FlatBuffers binaries instead of treating them as invalid archives. |


coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CHANGELOG.md`:
- Around line 84-87: Remove the duplicate "### Fixed" heading and merge its
following content ("eliminate false positives for valid ExecuTorch FlatBuffers
binaries and file-type validation on public `.pte` models") into the existing
"### Fixed" section above; specifically, delete the second "### Fixed" header
and append its bullet/text to the first "### Fixed" block so the changelog
follows the Keep a Changelog format and avoids the MD024 duplicate-heading lint
warning.

In `@modelaudit/scanners/executorch_scanner.py`:
- Around line 40-44: The function _is_executorch_binary currently checks only
header[4:6] against b"ET" which contradicts the comment ("bytes 4..7") and is
looser than the 4-byte signature used elsewhere; update _is_executorch_binary to
check header length >= 8 and compare header[4:8] to the full 4-byte signature
used in detection.py (refer to _is_executorch_binary_signature) — also update
the inline comment to say "bytes 4..7" or "bytes 4-7" to reflect the 4-byte
check.

In `@modelaudit/utils/file/detection.py`:
- Around line 192-195: The detection helpers are inconsistent:
_is_executorch_binary_signature(prefix) checks for a 4-byte identifier b"ET12"
while ExecuTorchScanner._is_executorch_binary(header) only checks header[4:6] ==
b"ET"; align them by using the same 4-byte check everywhere. Update
ExecuTorchScanner._is_executorch_binary to check header[4:8] == b"ET12" (or
conversely change _is_executorch_binary_signature to match the 2-byte check if
you prefer the looser rule) so both _is_executorch_binary_signature and
ExecuTorchScanner._is_executorch_binary use the identical byte-range and value
for detection.

In `@tests/scanners/test_executorch_scanner.py`:
- Around line 60-68: The test function
test_executorch_scanner_accepts_binary_program_header is missing a return type
hint; update its definition to include the standard test annotation "-> None"
(i.e., change the def signature for
test_executorch_scanner_accepts_binary_program_header to include the return type
hint) so it follows the project coding guidelines for tests.
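
To illustrate why the review asks for the two checks to be aligned, here is a small sketch comparing the 2-byte and 4-byte variants. The function names `loose_check` and `strict_check` are hypothetical; they only mirror the byte ranges quoted in the comments above.

```python
def loose_check(header: bytes) -> bool:
    """Two-byte variant flagged in the review: only looks at bytes 4..5."""
    return len(header) >= 6 and header[4:6] == b"ET"


def strict_check(header: bytes) -> bool:
    """Four-byte variant: compares bytes 4..7 against the full identifier."""
    return len(header) >= 8 and header[4:8] == b"ET12"


# Assumed sample headers: one genuine identifier, one lookalike.
genuine = b"\x00\x00\x00\x00" + b"ET12"
lookalike = b"\x00\x00\x00\x00" + b"ETxx"

for name, header in (("genuine", genuine), ("lookalike", lookalike)):
    print(name, loose_check(header), strict_check(header))
```

The lookalike passes the loose check but fails the strict one, so a scanner using the 2-byte rule could accept a file that the shared detector would reject.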

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 35ef307b-6091-42f6-b321-ac4cf2bdf625

📥 Commits

Reviewing files that changed from the base of the PR and between d9fe283 and da0125e.

📒 Files selected for processing (6)

CHANGELOG.md
modelaudit/scanners/executorch_scanner.py
modelaudit/utils/file/detection.py
tests/conftest.py
tests/scanners/test_executorch_scanner.py
tests/utils/file/test_filetype.py


Align ExecuTorch binary signature checks with the shared detector, remove the duplicate changelog heading, and add the missing test return annotation. Co-Authored-By: Codex <noreply@openai.com>

coderabbitai

♻️ Duplicate comments (1)

tests/scanners/test_executorch_scanner.py (1)
60-60: 🧹 Nitpick | 🔵 Trivial

Add type hint for tmp_path parameter.

Per coding guidelines, test parameters should have type hints: tmp_path: Path.
Proposed fix
-def test_executorch_scanner_accepts_binary_program_header(tmp_path) -> None:
+def test_executorch_scanner_accepts_binary_program_header(tmp_path: Path) -> None:
As per coding guidelines: "Use type hints -> None on all test methods and tmp_path: Path / monkeypatch: pytest.MonkeyPatch on test parameters"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/scanners/test_executorch_scanner.py` at line 60, Add a type hint for
the test parameter by changing the function signature of
test_executorch_scanner_accepts_binary_program_header to accept tmp_path: Path
(i.e., tmp_path: Path) while preserving the existing -> None return annotation;
also ensure Path is imported (from pathlib import Path) at the top of the test
module if it isn't already imported.
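
A minimal sketch of a test that follows the quoted annotation guideline (`tmp_path: Path` and `-> None`) is shown below. The helper `write_minimal_pte` and the test body are assumptions made for illustration; the real test in `tests/scanners/test_executorch_scanner.py` exercises the scanner itself.

```python
from pathlib import Path


def write_minimal_pte(path: Path) -> Path:
    """Hypothetical helper: write a tiny file whose bytes 4..7 are ET12."""
    path.write_bytes(b"\x08\x00\x00\x00" + b"ET12" + b"\x00" * 8)
    return path


def test_accepts_binary_program_header(tmp_path: Path) -> None:
    # pytest injects tmp_path as a per-test temporary directory.
    pte = write_minimal_pte(tmp_path / "model.pte")
    header = pte.read_bytes()[:8]
    assert header[4:8] == b"ET12"
```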

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@tests/scanners/test_executorch_scanner.py`:
- Line 60: Add a type hint for the test parameter by changing the function
signature of test_executorch_scanner_accepts_binary_program_header to accept
tmp_path: Path (i.e., tmp_path: Path) while preserving the existing -> None
return annotation; also ensure Path is imported (from pathlib import Path) at
the top of the test module if it isn't already imported.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 956755eb-cc0c-464a-9114-557c55ecd183

📥 Commits

Reviewing files that changed from the base of the PR and between da0125e and 6c5ba7b.

📒 Files selected for processing (3)

CHANGELOG.md
modelaudit/scanners/executorch_scanner.py
tests/scanners/test_executorch_scanner.py

yash2998chhabria · 2026-03-16T19:12:03Z

Testing Metadata Update

Timestamp: 2026-03-16T12:11:49.545724-07:00
Validation method: one external worktree per changed scanner, public Hugging Face models relevant to each scanner, target 10 models per scanner where available
Parallelism: 5 workers, chosen after resource monitoring on this machine (14 CPU, 24 GiB RAM, tight disk headroom)
Broad changed-scanner sweep status: 39/39 scanners completed
Pending broad-sweep scanner: none
Note: the broad table below is the baseline run on main before PR-specific fixes; PR-focused reruns are listed separately

Branch Validation

ruff format modelaudit/ tests/
ruff check --fix modelaudit/ tests/
mypy modelaudit/
pytest -n auto -m "not slow and not integration" --maxfail=1

Broad Hugging Face Sweep (baseline on main)

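| scanner | discovered | flagged | clean |
| --- | --- | --- | --- |
| catboost | 10 | 0 | 10 |
| cntk | 0 | 0 | 0 |
| compressed | 0 | 0 | 0 |
| coreml | 10 | 0 | 10 |
| executorch | 10 | 10 | 0 |
| flax_msgpack | 10 | 0 | 10 |
| gguf | 10 | 0 | 10 |
| jax_checkpoint | 2 | 0 | 2 |
| joblib | 10 | 3 | 7 |
| keras_h5 | 10 | 1 | 9 |
| keras_zip | 10 | 7 | 3 |
| lightgbm | 10 | 0 | 10 |
| llamafile | 5 | 0 | 5 |
| manifest | 10 | 0 | 10 |
| mxnet | 0 | 0 | 0 |
| numpy | 10 | 1 | 9 |
| oci_layer | 0 | 0 | 0 |
| onnx | 10 | 0 | 10 |
| openvino | 10 | 0 | 10 |
| paddle | 3 | 0 | 3 |
| pickle | 10 | 10 | 0 |
| pmml | 0 | 0 | 0 |
| pytorch_binary | 10 | 0 | 10 |
| pytorch_zip | 10 | 4 | 6 |
| r_serialized | 1 | 1 | 0 |
| rknn | 10 | 1 | 9 |
| safetensors | 10 | 0 | 10 |
| skops | 4 | 0 | 4 |
| tar | 1 | 1 | 0 |
| tensorrt | 10 | 10 | 0 |
| text | 10 | 0 | 10 |
| tf_metagraph | 7 | 3 | 4 |
| tf_savedmodel | 10 | 9 | 1 |
| tflite | 10 | 1 | 9 |
| torch7 | 0 | 0 | 0 |
| torchserve_mar | 0 | 0 | 0 |
| weight_distribution | 10 | 0 | 10 |
| xgboost | 10 | 0 | 10 |
| zip | 8 | 0 | 8 |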

Focused Rerun For This PR

Scanner: executorch
Baseline on main: discovered=10, flagged=10, clean=0
Post-fix rerun in PR worktree: discovered=10, flagged=0, clean=10
Regression outcome: public ExecuTorch .pte binaries that previously failed as invalid archives now scan cleanly with no residual file-type mismatch noise

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelaudit/utils/file/detection.py`:
- Around line 1011-1013: The comment above the ExecuTorch format check is out of
date; update the comment that precedes the conditional that checks ext_format ==
"executorch" (the block that returns header_format == "zip" or
_is_valid_executorch_binary(path)) to state that ExecuTorch files may be either
ZIP archives or valid ExecuTorch FlatBuffers binaries (and not only ZIPs).
Locate the ext_format/header_format check and adjust the single-line comment to
reflect both accepted formats, referencing the function
_is_valid_executorch_binary and the variables ext_format, header_format, and
path.
- Around line 475-476: The code calls _is_valid_executorch_binary(file_path) for
every file which triggers extra I/O; instead, first perform the cheap in-memory
signature/magic-byte check (the existing lightweight check used elsewhere in
this module—e.g. the function/constant that verifies the ExecuTorch magic bytes)
and only if that quick check passes call _is_valid_executorch_binary(file_path);
update the current if block that returns "executorch" to gate the expensive
validation behind the in-memory signature check, avoiding open/stat/seek on
files that don't match the signature.
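
The second suggestion above (gate the disk-backed validation behind the cheap in-memory check) can be sketched as follows. All names here (`has_signature`, `validate_on_disk`, `detect`, `io_calls`) are hypothetical and are not the functions in `detection.py`; the counter exists only to show how often the expensive path runs.

```python
import tempfile
from pathlib import Path

io_calls = 0  # counts expensive validations, for illustration only


def has_signature(magic: bytes) -> bool:
    """Cheap check on bytes that are already in memory."""
    return len(magic) >= 8 and magic[4:8] == b"ET12"


def validate_on_disk(path: Path) -> bool:
    """Hypothetical expensive validation: re-opens the file and stats it."""
    global io_calls
    io_calls += 1
    with path.open("rb") as fh:
        header = fh.read(8)
    return has_signature(header) and path.stat().st_size >= 8


def detect(path: Path, magic: bytes) -> str:
    # `and` short-circuits: disk validation runs only if the cheap check passes.
    if has_signature(magic) and validate_on_disk(path):
        return "executorch"
    return "unknown"


with tempfile.TemporaryDirectory() as tmp:
    pte = Path(tmp) / "model.pte"
    pte.write_bytes(b"\x1c\x00\x00\x00ET12" + b"\x00" * 8)
    other = Path(tmp) / "weights.bin"
    other.write_bytes(b"\x00" * 16)
    formats = [detect(p, p.read_bytes()[:8]) for p in (other, pte)]
    print(formats, io_calls)
```

With this ordering, files whose prefix does not match never trigger the extra open/stat, which is the cost the review comment is concerned about.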

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d5a46e68-0236-4681-bf6d-138cd46f3ce8

📥 Commits

Reviewing files that changed from the base of the PR and between 74a8bd6 and 73cf4a5.

📒 Files selected for processing (3)

CHANGELOG.md
modelaudit/utils/file/detection.py
tests/utils/file/test_filetype.py


# Conflicts: # CHANGELOG.md

fix: accept executorch flatbuffer binaries

da0125e


coderabbitai bot reviewed Mar 16, 2026


fix: address executorch review feedback

6c5ba7b


coderabbitai bot reviewed Mar 16, 2026


yash2998chhabria and others added 3 commits March 16, 2026 12:17

fix: annotate executorch regression test

3cf0108

fix(executorch): validate versioned flatbuffer binaries

74a8bd6

Merge remote-tracking branch 'origin/main' into audit/pr715

73cf4a5

coderabbitai bot reviewed Mar 18, 2026


Merge branch 'main' into fix/executorch-binary-validation

5c4ed6d

# Conflicts: # CHANGELOG.md

yash2998chhabria wants to merge 6 commits into main from fix/executorch-binary-validation
