Skip to content

fix: stop suppressing mixed-case pickle modules#698

Open
mldangelo wants to merge 16 commits intomainfrom
feat/pickle-module-plausibility
Open

fix: stop suppressing mixed-case pickle modules#698
mldangelo wants to merge 16 commits intomainfrom
feat/pickle-module-plausibility

Conversation

@mldangelo
Copy link
Member

@mldangelo mldangelo commented Mar 13, 2026

Summary

  • replace the lowercase-only pickle module plausibility check with a valid-identifier check so mixed-case modules like PIL and attacker-chosen names like EvilPkg are no longer auto-suppressed
  • add scanner-level regressions for GLOBAL, STACK_GLOBAL, memoized STACK_GLOBAL, and later-stream mixed-case REDUCE targets
  • keep malformed names like foo..bar or whitespace-containing labels on the implausible path instead of widening to junk data strings

Testing

  • uv run ruff format modelaudit/ tests/
  • uv run ruff check --fix modelaudit/ tests/
  • uv run ruff check modelaudit/ tests/
  • uv run ruff format --check modelaudit/ tests/
  • uv run mypy modelaudit/
  • uv run pytest tests/scanners/test_pickle_scanner.py -q -k "plausible or mixed_case or import_only or PIL or later_stream"
  • uv run pytest -n auto -m "not slow and not integration" --maxfail=1

QA

  • on untouched main, scanning /tmp/modelaudit-plausibility-qa.vd7LZf/mixed_case_reduce.pkl (EvilPkg.thing via REDUCE) was clean
  • on this branch, the same payload reports Found REDUCE opcode with non-allowlisted global: EvilPkg.thing
  • scanning /tmp/modelaudit-plausibility-qa.vd7LZf/malformed_reduce.pkl (foo..bar) stays clean on this branch, preserving the malformed-name guardrail

Scope note

  • mixed-case dangerous builtins such as Builtins.eval were already detected before this change
  • broader import-only coverage for non-dangerous mixed-case unknown modules remains separate work from the import-only findings task

Summary by CodeRabbit

  • Bug Fixes

    • Improved module-name validation in pickle scanning to accept valid mixed-case Python module identifiers (e.g., PIL), preventing incorrect bypasses of import/reduce checks.
  • Behavior Changes

    • Streaming scan result semantics adjusted so operational error flags align with operation success, keeping critical findings recorded without marking the operation as failed.
  • Tests

    • Expanded test coverage for many pickle opcode scenarios and streaming scan exit-code/behavior cases.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Replaces per-segment lowercase/regex checks for pickle module-name plausibility with per-segment Python identifier validation (allowing mixed-case names like PIL); updates CHANGELOG security bullet; adjusts streaming scan has_errors derivation to use scan_result.success; expands tests for pickle payloads, streaming semantics, and CLI streaming exit behavior.

Changes

Cohort / File(s) Summary
Changelog
CHANGELOG.md
Replaced ONNX external_data CVE bullet with a new security note: mixed-case valid pickle module names are no longer treated as implausible, so import/REDUCE checks won't bypass on names like PIL or attacker-chosen EvilPkg.
Pickle scanner
modelaudit/scanners/pickle_scanner.py
Reworked import-plausibility logic: removed regex/lowercasing and re usage; each dotted segment now validated with str.isidentifier() and a short case-sensitive allowlist introduced. _is_plausible_import_only_module updated to reuse the new check.
Streaming/core aggregation
modelaudit/core.py, modelaudit/models.py
Streaming aggregate logic altered: has_errors is now derived from scan_result.success (set when success is False) rather than scan_result.has_errors, aligning operational error flag with scan success.
Tests — pickle scanner
tests/scanners/test_pickle_scanner.py
Imported _is_plausible_python_module; added helpers to craft protocol-4 payloads and expanded tests covering mixed-case modules, GLOBAL/STACK_GLOBAL/REDUCE (including memoized variants), import-only cases, and malformed payloads.
Tests — streaming & CLI
tests/test_streaming_scan.py, tests/test_cli.py
Added tests asserting streaming semantics: critical findings don't set operational errors (success remains True) but record failed checks; added CLI streaming test for local directory exit-code and file cleanup.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 I nibble bytes beneath the moon,
Mixed-case names now hum a tune.
PIL and friends hop in pursuit,
Identifiers keep paths astute.
A tiny hop — the scanner's root. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 64.52% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely describes the primary change: stopping the suppression of mixed-case pickle modules. It is directly related to the main objective of replacing the lowercase-only validation with a valid-identifier check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/pickle-module-plausibility
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/scanners/test_pickle_scanner.py`:
- Around line 721-731: The test test_pil_global_reduce_is_not_suppressed should
not depend on CheckStatus.FAILED; instead assert that a REDUCE Opcode Safety
Check mentions "PIL.Image" (regardless of status) and still assert that no
message contains "implausible module name 'PIL'". Update the first assertion to:
assert any("PIL.Image" in c.message for c in reduce_checks) and keep the second
assertion as-is to ensure PIL is not classified implausible; reference the
test_pil_global_reduce_is_not_suppressed function and the "REDUCE Opcode Safety
Check" checks when making the change.
- Around line 751-758: The test
test_mixed_case_import_only_payload_still_flags_import currently only checks
that a mixed-case dangerous global ("Builtins.eval") is flagged, which doesn't
exercise the new mixed-case plausibility logic; update the test to add a second
payload using _craft_global_import_only_pickle that represents the benign
mixed-case counterpart (e.g., a name that should not be considered the dangerous
global under the new rules) and assert that no suspicious import issue is
present for that payload, so the test covers both the flagged and non-flagged
mixed-case paths and prevents regressions in the mixed-case rules.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b0a6fa3b-73b8-4cc1-a9c3-272601d67dcc

📥 Commits

Reviewing files that changed from the base of the PR and between 698859d and c311b25.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • modelaudit/scanners/pickle_scanner.py
  • tests/scanners/test_pickle_scanner.py

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/scanners/test_pickle_scanner.py`:
- Around line 864-875: The test
test_mixed_case_unknown_import_only_remains_non_failing currently only asserts
there is no FAILED "Global Module Reference Check" for "EvilPkg.thing"; tighten
it to also assert there are no WARNING or CRITICAL statuses for any check whose
name == "Global Module Reference Check" and
check.details.get("import_reference") == "EvilPkg.thing". Update the assertion
over result.checks (or add a new one) to ensure no check for that
import_reference has status in {CheckStatus.FAILED, CheckStatus.WARNING,
CheckStatus.CRITICAL}, referencing the result variable and
check.name/check.status/check.details keys used in the test.
- Around line 753-781: The two helpers _craft_global_import_only_pickle and
_craft_global_only_pickle produce identical GLOBAL-only payloads; consolidate by
having one delegate to the other to avoid duplication and future drift (e.g.,
implement _craft_global_import_only_pickle to simply return
_craft_global_only_pickle(module, func) or vice versa), keep the public
signatures and docstring intact, and ensure any callers still receive bytes as
before.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c4b9f6b2-4053-4dfd-a6a2-5078630bf651

📥 Commits

Reviewing files that changed from the base of the PR and between c311b25 and 549be18.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • modelaudit/scanners/pickle_scanner.py
  • tests/scanners/test_pickle_scanner.py

…ausibility

# Conflicts:
#	CHANGELOG.md
#	tests/scanners/test_pickle_scanner.py
… into feat/pickle-module-plausibility

# Conflicts:
#	CHANGELOG.md
#	tests/scanners/test_pickle_scanner.py
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
tests/scanners/test_pickle_scanner.py (2)

783-787: 🧹 Nitpick | 🔵 Trivial

Consolidate duplicate GLOBAL-only payload helpers.

_craft_global_import_only_pickle and _craft_global_only_pickle currently emit identical bytes. Delegate one to the other to avoid future drift.

♻️ Proposed refactor
 `@staticmethod`
 def _craft_global_import_only_pickle(module: str, func: str) -> bytes:
     """Craft minimal pickle that only imports a GLOBAL and stops."""
-
-    return b"\x80\x02" + b"c" + f"{module}\n{func}\n".encode() + b"."
+    return TestPickleScannerBlocklistHardening._craft_global_only_pickle(module, func)

Also applies to: 789-792

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/scanners/test_pickle_scanner.py` around lines 783 - 787, The two helper
functions _craft_global_import_only_pickle and _craft_global_only_pickle emit
identical bytes; refactor by delegating one to the other to avoid duplication
and future drift: pick one implementation (e.g., keep _craft_global_only_pickle
as the canonical builder) and change the other (e.g.,
_craft_global_import_only_pickle) to call and return that function with the same
parameters (module, func), ensuring both names remain available for tests that
reference either helper.

975-985: ⚠️ Potential issue | 🟡 Minor

Decouple the PIL plausibility regression from current allowlist status.

This test is about “not implausible anymore”, but it currently hard-requires CheckStatus.FAILED. If PIL.Image is later allowlisted, this regression would fail for the wrong reason.

🧪 Suggested assertion adjustment
         reduce_checks = [c for c in result.checks if c.name == "REDUCE Opcode Safety Check"]
-        assert any(c.status == CheckStatus.FAILED and "PIL.Image" in c.message for c in reduce_checks), (
-            f"Expected failed REDUCE check for PIL.Image, got: {[c.message for c in reduce_checks]}"
+        assert any("PIL.Image" in c.message for c in reduce_checks), (
+            f"Expected REDUCE analysis to resolve PIL.Image, got: {[c.message for c in reduce_checks]}"
         )
         assert not any("implausible module name 'PIL'" in c.message for c in reduce_checks), (
             "PIL should no longer be classified as an implausible module"
         )
Based on learnings: Preserve or strengthen security detections; test both benign and malicious samples when adding scanner/feature changes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/scanners/test_pickle_scanner.py` around lines 975 - 985, Update the
test_pil_global_reduce_is_not_suppressed to stop hard-requiring
CheckStatus.FAILED for the REDUCE Opcode Safety Check: locate the reduce_checks
filtered by name "REDUCE Opcode Safety Check" and assert that at least one
check.message references "PIL.Image" (regardless of check.status), and
separately assert that none of the reduce_checks contain the "implausible module
name 'PIL'" text; if desired, also keep an optional assertion that if any
check.status == CheckStatus.FAILED then its message contains "PIL.Image" to
preserve the original failure expectation without coupling to allowlist changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@tests/scanners/test_pickle_scanner.py`:
- Around line 783-787: The two helper functions _craft_global_import_only_pickle
and _craft_global_only_pickle emit identical bytes; refactor by delegating one
to the other to avoid duplication and future drift: pick one implementation
(e.g., keep _craft_global_only_pickle as the canonical builder) and change the
other (e.g., _craft_global_import_only_pickle) to call and return that function
with the same parameters (module, func), ensuring both names remain available
for tests that reference either helper.
- Around line 975-985: Update the test_pil_global_reduce_is_not_suppressed to
stop hard-requiring CheckStatus.FAILED for the REDUCE Opcode Safety Check:
locate the reduce_checks filtered by name "REDUCE Opcode Safety Check" and
assert that at least one check.message references "PIL.Image" (regardless of
check.status), and separately assert that none of the reduce_checks contain the
"implausible module name 'PIL'" text; if desired, also keep an optional
assertion that if any check.status == CheckStatus.FAILED then its message
contains "PIL.Image" to preserve the original failure expectation without
coupling to allowlist changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3c09abff-f86f-4458-aca5-8934906eba7e

📥 Commits

Reviewing files that changed from the base of the PR and between 09ddc2f and 7c699d9.

📒 Files selected for processing (1)
  • tests/scanners/test_pickle_scanner.py

Keep mixed-case module names like PIL analyzable without reintroducing uppercase data-label false positives such as PEDRA_2020. Also preserve streaming security exit codes by separating operational errors from security findings.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Line 1906: Add a type annotation to the new constant
_CASE_SENSITIVE_IMPORT_SEGMENTS so its type is explicit (e.g.
_CASE_SENSITIVE_IMPORT_SEGMENTS: FrozenSet[str] = frozenset({"PIL", "Cython"}));
import FrozenSet from typing if not already present or use built-in
frozenset[str] for Python 3.9+ to satisfy the coding guideline requiring type
hints.

In `@tests/scanners/test_pickle_scanner.py`:
- Around line 988-1004: Update the two tests
test_mixed_case_stack_global_reduce_is_not_suppressed and
test_mixed_case_memoized_stack_global_reduce_is_not_suppressed to also assert
that the REDUCE failure message does not include the implausible-module text;
specifically, after locating reduce_checks and asserting a failed REDUCE for
"EvilPkg.thing", add an assertion that no reduce_check.message contains
"implausible module name 'EvilPkg'" (i.e., ensure the implausible-module warning
is absent while the REDUCE failure is present).

In `@tests/test_cli.py`:
- Around line 1423-1438: Update the
test_exit_code_security_issues_streaming_local_directory function to add type
hints (tmp_path: Path and return -> None) and assert a concrete malicious
finding in the CLI output instead of relying solely on exit code; specifically,
after invoking cli(["scan", "--stream", str(streamed_dir)]) check result.output
(or result.stdout) contains a known indicator such as the malicious check name
or exact issue message present in malicious_system_call.pkl (reference the test
name test_exit_code_security_issues_streaming_local_directory and the sample
file malicious_system_call.pkl) and keep the existing exit_code and streamed_dir
deletion assertion.

In `@tests/test_streaming_scan.py`:
- Around line 209-238: The test
test_scan_model_streaming_critical_findings_do_not_set_operational_errors should
assert the exact malicious finding instead of using the broad failed_checks >=
1; update it to check the ScanResult payload returned from scan_model_streaming
(e.g., inspect result.failed_checks/details or result.checks) to assert the
check name "Dangerous Pickle Check", severity IssueSeverity.CRITICAL, message
"Detected malicious payload", and location matching temp_test_files[0], and keep
the original success/has_errors assertions; also add a type annotation to the
test signature (e.g., def
test_scan_model_streaming_critical_findings_do_not_set_operational_errors(temp_test_files:
Sequence[Path]) -> None:) so the test is properly typed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 866f94f8-744b-4ca2-a327-d5e959f9bad6

📥 Commits

Reviewing files that changed from the base of the PR and between 7c699d9 and 22b88aa.

📒 Files selected for processing (6)
  • modelaudit/core.py
  • modelaudit/models.py
  • modelaudit/scanners/pickle_scanner.py
  • tests/scanners/test_pickle_scanner.py
  • tests/test_cli.py
  • tests/test_streaming_scan.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant