fix: stop suppressing mixed-case pickle modules#698
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughReplaces per-segment lowercase/regex checks for pickle module-name plausibility with per-segment Python identifier validation (allowing mixed-case names like Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/scanners/test_pickle_scanner.py`:
- Around line 721-731: The test test_pil_global_reduce_is_not_suppressed should
not depend on CheckStatus.FAILED; instead assert that a REDUCE Opcode Safety
Check mentions "PIL.Image" (regardless of status) and still assert that no
message contains "implausible module name 'PIL'". Update the first assertion to:
assert any("PIL.Image" in c.message for c in reduce_checks) and keep the second
assertion as-is to ensure PIL is not classified implausible; reference the
test_pil_global_reduce_is_not_suppressed function and the "REDUCE Opcode Safety
Check" checks when making the change.
- Around line 751-758: The test
test_mixed_case_import_only_payload_still_flags_import currently only checks
that a mixed-case dangerous global ("Builtins.eval") is flagged, which doesn't
exercise the new mixed-case plausibility logic; update the test to add a second
payload using _craft_global_import_only_pickle that represents the benign
mixed-case counterpart (e.g., a name that should not be considered the dangerous
global under the new rules) and assert that no suspicious import issue is
present for that payload, so the test covers both the flagged and non-flagged
mixed-case paths and prevents regressions in the mixed-case rules.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b0a6fa3b-73b8-4cc1-a9c3-272601d67dcc
📒 Files selected for processing (3)
CHANGELOG.mdmodelaudit/scanners/pickle_scanner.pytests/scanners/test_pickle_scanner.py
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/scanners/test_pickle_scanner.py`:
- Around line 864-875: The test
test_mixed_case_unknown_import_only_remains_non_failing currently only asserts
there is no FAILED "Global Module Reference Check" for "EvilPkg.thing"; tighten
it to also assert there are no WARNING or CRITICAL statuses for any check whose
name == "Global Module Reference Check" and
check.details.get("import_reference") == "EvilPkg.thing". Update the assertion
over result.checks (or add a new one) to ensure no check for that
import_reference has status in {CheckStatus.FAILED, CheckStatus.WARNING,
CheckStatus.CRITICAL}, referencing the result variable and
check.name/check.status/check.details keys used in the test.
- Around line 753-781: The two helpers _craft_global_import_only_pickle and
_craft_global_only_pickle produce identical GLOBAL-only payloads; consolidate by
having one delegate to the other to avoid duplication and future drift (e.g.,
implement _craft_global_import_only_pickle to simply return
_craft_global_only_pickle(module, func) or vice versa), keep the public
signatures and docstring intact, and ensure any callers still receive bytes as
before.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: c4b9f6b2-4053-4dfd-a6a2-5078630bf651
📒 Files selected for processing (3)
CHANGELOG.mdmodelaudit/scanners/pickle_scanner.pytests/scanners/test_pickle_scanner.py
…ausibility # Conflicts: # CHANGELOG.md # tests/scanners/test_pickle_scanner.py
… into feat/pickle-module-plausibility # Conflicts: # CHANGELOG.md # tests/scanners/test_pickle_scanner.py
There was a problem hiding this comment.
♻️ Duplicate comments (2)
tests/scanners/test_pickle_scanner.py (2)
783-787: 🧹 Nitpick | 🔵 TrivialConsolidate duplicate GLOBAL-only payload helpers.
_craft_global_import_only_pickleand_craft_global_only_picklecurrently emit identical bytes. Delegate one to the other to avoid future drift.♻️ Proposed refactor
`@staticmethod` def _craft_global_import_only_pickle(module: str, func: str) -> bytes: """Craft minimal pickle that only imports a GLOBAL and stops.""" - - return b"\x80\x02" + b"c" + f"{module}\n{func}\n".encode() + b"." + return TestPickleScannerBlocklistHardening._craft_global_only_pickle(module, func)Also applies to: 789-792
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/scanners/test_pickle_scanner.py` around lines 783 - 787, The two helper functions _craft_global_import_only_pickle and _craft_global_only_pickle emit identical bytes; refactor by delegating one to the other to avoid duplication and future drift: pick one implementation (e.g., keep _craft_global_only_pickle as the canonical builder) and change the other (e.g., _craft_global_import_only_pickle) to call and return that function with the same parameters (module, func), ensuring both names remain available for tests that reference either helper.
975-985:⚠️ Potential issue | 🟡 MinorDecouple the
PILplausibility regression from current allowlist status.This test is about “not implausible anymore”, but it currently hard-requires
CheckStatus.FAILED. IfPIL.Imageis later allowlisted, this regression would fail for the wrong reason.Based on learnings: Preserve or strengthen security detections; test both benign and malicious samples when adding scanner/feature changes.🧪 Suggested assertion adjustment
reduce_checks = [c for c in result.checks if c.name == "REDUCE Opcode Safety Check"] - assert any(c.status == CheckStatus.FAILED and "PIL.Image" in c.message for c in reduce_checks), ( - f"Expected failed REDUCE check for PIL.Image, got: {[c.message for c in reduce_checks]}" + assert any("PIL.Image" in c.message for c in reduce_checks), ( + f"Expected REDUCE analysis to resolve PIL.Image, got: {[c.message for c in reduce_checks]}" ) assert not any("implausible module name 'PIL'" in c.message for c in reduce_checks), ( "PIL should no longer be classified as an implausible module" )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/scanners/test_pickle_scanner.py` around lines 975 - 985, Update the test_pil_global_reduce_is_not_suppressed to stop hard-requiring CheckStatus.FAILED for the REDUCE Opcode Safety Check: locate the reduce_checks filtered by name "REDUCE Opcode Safety Check" and assert that at least one check.message references "PIL.Image" (regardless of check.status), and separately assert that none of the reduce_checks contain the "implausible module name 'PIL'" text; if desired, also keep an optional assertion that if any check.status == CheckStatus.FAILED then its message contains "PIL.Image" to preserve the original failure expectation without coupling to allowlist changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@tests/scanners/test_pickle_scanner.py`:
- Around line 783-787: The two helper functions _craft_global_import_only_pickle
and _craft_global_only_pickle emit identical bytes; refactor by delegating one
to the other to avoid duplication and future drift: pick one implementation
(e.g., keep _craft_global_only_pickle as the canonical builder) and change the
other (e.g., _craft_global_import_only_pickle) to call and return that function
with the same parameters (module, func), ensuring both names remain available
for tests that reference either helper.
- Around line 975-985: Update the test_pil_global_reduce_is_not_suppressed to
stop hard-requiring CheckStatus.FAILED for the REDUCE Opcode Safety Check:
locate the reduce_checks filtered by name "REDUCE Opcode Safety Check" and
assert that at least one check.message references "PIL.Image" (regardless of
check.status), and separately assert that none of the reduce_checks contain the
"implausible module name 'PIL'" text; if desired, also keep an optional
assertion that if any check.status == CheckStatus.FAILED then its message
contains "PIL.Image" to preserve the original failure expectation without
coupling to allowlist changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3c09abff-f86f-4458-aca5-8934906eba7e
📒 Files selected for processing (1)
tests/scanners/test_pickle_scanner.py
Keep mixed-case module names like PIL analyzable without reintroducing uppercase data-label false positives such as PEDRA_2020. Also preserve streaming security exit codes by separating operational errors from security findings.
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Line 1906: Add a type annotation to the new constant
_CASE_SENSITIVE_IMPORT_SEGMENTS so its type is explicit (e.g.
_CASE_SENSITIVE_IMPORT_SEGMENTS: FrozenSet[str] = frozenset({"PIL", "Cython"}));
import FrozenSet from typing if not already present or use built-in
frozenset[str] for Python 3.9+ to satisfy the coding guideline requiring type
hints.
In `@tests/scanners/test_pickle_scanner.py`:
- Around line 988-1004: Update the two tests
test_mixed_case_stack_global_reduce_is_not_suppressed and
test_mixed_case_memoized_stack_global_reduce_is_not_suppressed to also assert
that the REDUCE failure message does not include the implausible-module text;
specifically, after locating reduce_checks and asserting a failed REDUCE for
"EvilPkg.thing", add an assertion that no reduce_check.message contains
"implausible module name 'EvilPkg'" (i.e., ensure the implausible-module warning
is absent while the REDUCE failure is present).
In `@tests/test_cli.py`:
- Around line 1423-1438: Update the
test_exit_code_security_issues_streaming_local_directory function to add type
hints (tmp_path: Path and return -> None) and assert a concrete malicious
finding in the CLI output instead of relying solely on exit code; specifically,
after invoking cli(["scan", "--stream", str(streamed_dir)]) check result.output
(or result.stdout) contains a known indicator such as the malicious check name
or exact issue message present in malicious_system_call.pkl (reference the test
name test_exit_code_security_issues_streaming_local_directory and the sample
file malicious_system_call.pkl) and keep the existing exit_code and streamed_dir
deletion assertion.
In `@tests/test_streaming_scan.py`:
- Around line 209-238: The test
test_scan_model_streaming_critical_findings_do_not_set_operational_errors should
assert the exact malicious finding instead of using the broad failed_checks >=
1; update it to check the ScanResult payload returned from scan_model_streaming
(e.g., inspect result.failed_checks/details or result.checks) to assert the
check name "Dangerous Pickle Check", severity IssueSeverity.CRITICAL, message
"Detected malicious payload", and location matching temp_test_files[0], and keep
the original success/has_errors assertions; also add a type annotation to the
test signature (e.g., def
test_scan_model_streaming_critical_findings_do_not_set_operational_errors(temp_test_files:
Sequence[Path]) -> None:) so the test is properly typed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 866f94f8-744b-4ca2-a327-d5e959f9bad6
📒 Files selected for processing (6)
modelaudit/core.pymodelaudit/models.pymodelaudit/scanners/pickle_scanner.pytests/scanners/test_pickle_scanner.pytests/test_cli.pytests/test_streaming_scan.py
…ausibility # Conflicts: # modelaudit/core.py # tests/test_cli.py # tests/test_streaming_scan.py
# Conflicts: # CHANGELOG.md
Summary
PILand attacker-chosen names likeEvilPkgare no longer auto-suppressedREDUCEtargetsfoo..baror whitespace-containing labels on the implausible path instead of widening to junk data stringsTesting
uv run ruff format modelaudit/ tests/uv run ruff check --fix modelaudit/ tests/uv run ruff check modelaudit/ tests/uv run ruff format --check modelaudit/ tests/uv run mypy modelaudit/uv run pytest tests/scanners/test_pickle_scanner.py -q -k "plausible or mixed_case or import_only or PIL or later_stream"uv run pytest -n auto -m "not slow and not integration" --maxfail=1QA
main, scanning/tmp/modelaudit-plausibility-qa.vd7LZf/mixed_case_reduce.pkl(EvilPkg.thingviaREDUCE) was cleanFound REDUCE opcode with non-allowlisted global: EvilPkg.thing/tmp/modelaudit-plausibility-qa.vd7LZf/malformed_reduce.pkl(foo..bar) stays clean on this branch, preserving the malformed-name guardrailScope note
Builtins.evalwere already detected before this changeSummary by CodeRabbit
Bug Fixes
Behavior Changes
Tests