fix: accept valid ExecuTorch FlatBuffers binaries #715
yash2998chhabria wants to merge 6 commits into `main`
Conversation
Recognize valid ExecuTorch FlatBuffers programs in .pte files, prevent file-type validation noise for those binaries, and add regression coverage for scanner and detection helpers. Co-Authored-By: Codex <noreply@openai.com>
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings.
Walkthrough: Detects ExecuTorch FlatBuffers binaries by inspecting header bytes and adds an early-accept path in the ExecuTorch scanner. File-type detection and validation now recognize ExecuTorch binaries; tests and CHANGELOG updated accordingly.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Around line 84-87: Remove the duplicate "### Fixed" heading and merge its
following content ("eliminate false positives for valid ExecuTorch FlatBuffers
binaries and file-type validation on public `.pte` models") into the existing
"### Fixed" section above; specifically, delete the second "### Fixed" header
and append its bullet/text to the first "### Fixed" block so the changelog
follows the Keep a Changelog format and avoids the MD024 duplicate-heading lint
warning.
In `@modelaudit/scanners/executorch_scanner.py`:
- Around line 40-44: The function _is_executorch_binary currently checks only
header[4:6] against b"ET" which contradicts the comment ("bytes 4..7") and is
looser than the 4-byte signature used elsewhere; update _is_executorch_binary to
check header length >= 8 and compare header[4:8] to the full 4-byte signature
used in detection.py (refer to _is_executorch_binary_signature) — also update
the inline comment to say "bytes 4..7" or "bytes 4-7" to reflect the 4-byte
check.
In `@modelaudit/utils/file/detection.py`:
- Around line 192-195: The detection helpers are inconsistent:
_is_executorch_binary_signature(prefix) checks for a 4-byte identifier b"ET12"
while ExecuTorchScanner._is_executorch_binary(header) only checks header[4:6] ==
b"ET"; align them by using the same 4-byte check everywhere. Update
ExecuTorchScanner._is_executorch_binary to check header[4:8] == b"ET12" (or
conversely change _is_executorch_binary_signature to match the 2-byte check if
you prefer the looser rule) so both _is_executorch_binary_signature and
ExecuTorchScanner._is_executorch_binary use the identical byte-range and value
for detection.
In `@tests/scanners/test_executorch_scanner.py`:
- Around line 60-68: The test function
test_executorch_scanner_accepts_binary_program_header is missing a return type
hint; update its definition to include the standard test annotation "-> None"
(i.e., change the def signature for
test_executorch_scanner_accepts_binary_program_header to include the return type
hint) so it follows the project coding guidelines for tests.
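The requested signature change is small; here is a sketch with the `-> None` annotation in place — the test body is hypothetical, shown only so the example is self-contained:

```python
from pathlib import Path


def test_executorch_scanner_accepts_binary_program_header(tmp_path: Path) -> None:
    # Hypothetical body: write a file whose bytes 4..7 carry the ET12 magic.
    model = tmp_path / "model.pte"
    model.write_bytes(b"\x00\x00\x00\x00ET12" + b"\x00" * 8)
    assert model.read_bytes()[4:8] == b"ET12"
```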
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 35ef307b-6091-42f6-b321-ac4cf2bdf625
📒 Files selected for processing (6)
- CHANGELOG.md
- modelaudit/scanners/executorch_scanner.py
- modelaudit/utils/file/detection.py
- tests/conftest.py
- tests/scanners/test_executorch_scanner.py
- tests/utils/file/test_filetype.py
Align ExecuTorch binary signature checks with the shared detector, remove the duplicate changelog heading, and add the missing test return annotation. Co-Authored-By: Codex <noreply@openai.com>
♻️ Duplicate comments (1)
tests/scanners/test_executorch_scanner.py (1)
60-60: 🧹 Nitpick | 🔵 Trivial — Add type hint for `tmp_path` parameter. Per coding guidelines, test parameters should have type hints: `tmp_path: Path`.
Proposed fix:

```diff
-def test_executorch_scanner_accepts_binary_program_header(tmp_path) -> None:
+def test_executorch_scanner_accepts_binary_program_header(tmp_path: Path) -> None:
```

As per coding guidelines: "Use type hints `-> None` on all test methods and `tmp_path: Path` / `monkeypatch: pytest.MonkeyPatch` on test parameters"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/scanners/test_executorch_scanner.py` at line 60, Add a type hint for the test parameter by changing the function signature of test_executorch_scanner_accepts_binary_program_header to accept tmp_path: Path (i.e., tmp_path: Path) while preserving the existing -> None return annotation; also ensure Path is imported (from pathlib import Path) at the top of the test module if it isn't already imported.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@tests/scanners/test_executorch_scanner.py`:
- Line 60: Add a type hint for the test parameter by changing the function
signature of test_executorch_scanner_accepts_binary_program_header to accept
tmp_path: Path (i.e., tmp_path: Path) while preserving the existing -> None
return annotation; also ensure Path is imported (from pathlib import Path) at
the top of the test module if it isn't already imported.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 956755eb-cc0c-464a-9114-557c55ecd183
📒 Files selected for processing (3)
- CHANGELOG.md
- modelaudit/scanners/executorch_scanner.py
- tests/scanners/test_executorch_scanner.py
Testing Metadata Update
Branch Validation
Broad Hugging Face Sweep (baseline on main)
Focused Rerun For This PR
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/utils/file/detection.py`:
- Around line 1011-1013: The comment above the ExecuTorch format check is out of
date; update the comment that precedes the conditional that checks ext_format ==
"executorch" (the block that returns header_format == "zip" or
_is_valid_executorch_binary(path)) to state that ExecuTorch files may be either
ZIP archives or valid ExecuTorch FlatBuffers binaries, not only ZIPs. Locate the
ext_format/header_format check and adjust the single-line comment to reflect
both accepted formats, referencing the function _is_valid_executorch_binary and
the variables ext_format, header_format, and path.
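The either/or acceptance rule reads roughly like the following sketch — function names other than `_is_valid_executorch_binary` are hypothetical stand-ins for the code under review:

```python
from pathlib import Path


def _is_valid_executorch_binary(path: Path) -> bool:
    """Hypothetical stand-in: checks the ET12 magic at bytes 4..7 on disk."""
    try:
        with path.open("rb") as fh:
            header = fh.read(8)
    except OSError:
        return False
    return len(header) == 8 and header[4:8] == b"ET12"


def extension_matches(ext_format: str, header_format: str, path: Path) -> bool:
    # ExecuTorch .pte files may be ZIP archives *or* raw FlatBuffers binaries,
    # so accept either form before flagging a mismatch.
    if ext_format == "executorch":
        return header_format == "zip" or _is_valid_executorch_binary(path)
    return ext_format == header_format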
- Around line 475-476: The code calls _is_valid_executorch_binary(file_path) for
every file which triggers extra I/O; instead, first perform the cheap in-memory
signature/magic-byte check (the existing lightweight check used elsewhere in
this module—e.g. the function/constant that verifies the ExecuTorch magic bytes)
and only if that quick check passes call _is_valid_executorch_binary(file_path);
update the current if block that returns "executorch" to gate the expensive
validation behind the in-memory signature check, avoiding open/stat/seek on
files that don't match the signature.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: d5a46e68-0236-4681-bf6d-138cd46f3ce8
📒 Files selected for processing (3)
CHANGELOG.mdmodelaudit/utils/file/detection.pytests/utils/file/test_filetype.py
# Conflicts:
#	CHANGELOG.md
Summary
- Recognize valid ExecuTorch FlatBuffers programs in `.pte` files instead of treating every non-ZIP file as an invalid archive
- Check the `ET12` ExecuTorch binary signature so valid public models do not emit `S901` mismatch noise

Validation
- `/Users/yashchhabria/projects/modelauditing/modelaudit/.venv/bin/ruff format modelaudit/ tests/`
- `/Users/yashchhabria/projects/modelauditing/modelaudit/.venv/bin/ruff check --fix modelaudit/ tests/`
- `/Users/yashchhabria/projects/modelauditing/modelaudit/.venv/bin/mypy modelaudit/`
- `/Users/yashchhabria/projects/modelauditing/modelaudit/.venv/bin/pytest -n auto -m "not slow and not integration" --maxfail=1`
- `executorch` flagged 10/10 before the fix and 0/10 after the fix

Summary by CodeRabbit
Bug Fixes
Tests
Documentation