1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -81,6 +81,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Fixed

- **security:** stop auto-applying local `.modelaudit.toml` and `pyproject.toml` rule config during scans unless a human explicitly trusts that config in an interactive text run; remembered trust is stored securely under the local ModelAudit cache and invalidated when the config changes

🧹 Nitpick | 🔵 Trivial

Optional: Consider clarifying "interactive text run" terminology.

The entry accurately describes the security fix. However, the phrase "interactive text run" could be more idiomatic. Consider alternatives like "interactive scan", "interactive mode", or "interactive session" for better clarity.

Example:

-- **security:** stop auto-applying local `.modelaudit.toml` and `pyproject.toml` rule config during scans unless a human explicitly trusts that config in an interactive text run; remembered trust is stored securely under the local ModelAudit cache and invalidated when the config changes
+- **security:** stop auto-applying local `.modelaudit.toml` and `pyproject.toml` rule config during scans unless a human explicitly trusts that config in an interactive scan; remembered trust is stored securely under the local ModelAudit cache and invalidated when the config changes
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CHANGELOG.md` at line 84, The changelog line uses the phrase "interactive
text run," which is unclear; update the sentence in CHANGELOG.md replacing
"interactive text run" with a clearer term such as "interactive scan" or
"interactive mode" (e.g., change the bullet text under **security:** to read
"...unless a human explicitly trusts that config in an interactive scan;
remembered trust..."), keeping the rest of the sentence intact and preserving
the note about remembered trust and invalidation when the config changes.

- **security:** remove `dill.load` / `dill.loads` from the pickle safe-global allowlist so recursive dill deserializers stay flagged as dangerous loader entry points
- **security:** add exact dangerous helper coverage for validated torch and NumPy refs such as `numpy.f2py.crackfortran.getlincoef`, `torch._dynamo.guards.GuardBuilder.get`, and `torch.utils.collect_env.run`
- **security:** add exact dangerous-global coverage for `numpy.load`, `site.main`, `_io.FileIO`, `test.support.script_helper.assert_python_ok`, `_osx_support._read_output`, `_aix_support._read_cmd_output`, `_pyrepl.pager.pipe_pager`, `torch.serialization.load`, and `torch._inductor.codecache.compile_file` (9 PickleScan-only loader and execution primitives)
7 changes: 7 additions & 0 deletions docs/user/security-model.md
@@ -27,6 +27,13 @@ ModelAudit is a static security scanner for model artifacts. It analyzes files a
- `modelaudit metadata` defaults to non-deserializing extraction for untrusted inputs.
- `--trust-loaders` may deserialize model content and should only be used on trusted artifacts in isolated environments.

## Local scan policy files

- Local `.modelaudit.toml` or `pyproject.toml` policy files are not applied implicitly during scans.
- Interactive text scans may offer to trust a detected local policy file for future runs on that same config directory.
- Remembered trust is stored in the local ModelAudit cache and is invalidated automatically if the config file changes.
- CI and other non-interactive scans should use explicit configuration rather than relying on remembered local trust.
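
The policy above can be sketched as a small pure function. This is an illustrative sketch only, not ModelAudit's implementation: `TrustRecord`, `sha256_hex`, and `should_apply_local_config` are hypothetical names, and the `interactive` flag stands in for whatever TTY and CI checks the CLI performs.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustRecord:
    """Hypothetical remembered-trust entry: which file was trusted, and its hash."""

    config_path: str
    config_sha256: str


def sha256_hex(content: bytes) -> str:
    """Return the SHA-256 hex digest of the config file contents."""
    return hashlib.sha256(content).hexdigest()


def should_apply_local_config(
    *,
    interactive: bool,
    record: TrustRecord | None,
    config_path: str,
    content: bytes,
) -> bool:
    """Apply remembered trust only in interactive runs and only for unchanged files."""
    if not interactive:
        # CI and other non-interactive scans never rely on remembered trust.
        return False
    if record is None or record.config_path != config_path:
        return False
    # Any edit to the file changes the digest, which invalidates the trust.
    return record.config_sha256 == sha256_hex(content)
```

Because the record stores a content hash rather than a timestamp, editing the policy file in any way forces a fresh trust decision on the next interactive scan.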

## Interpreting scan results

- `CRITICAL`: High-confidence risk indicator. Block release/use by default.
170 changes: 170 additions & 0 deletions modelaudit/cache/trusted_config_store.py
@@ -0,0 +1,170 @@
"""Secure persistence for trusted local ModelAudit configuration files."""

from __future__ import annotations

import hashlib
import json
import os
from contextlib import suppress
from dataclasses import dataclass
from pathlib import Path
from uuid import uuid4

from ..config.local_config import LocalConfigCandidate

TRUST_STORE_VERSION = 1


@dataclass(frozen=True)
class TrustedConfigRecord:
    """Persisted trust metadata for a local config directory."""

    config_path: str
    config_sha256: str


class TrustedConfigStore:
    """Read and write trusted local config state under the cache directory."""

    def __init__(self, store_path: Path | None = None):
        self.store_path = store_path or (Path.home() / ".modelaudit" / "cache" / "trusted_local_configs.json")
Comment on lines +29 to +30

🛠️ Refactor suggestion | 🟠 Major

Add the missing constructor return type hint.

TrustedConfigStore.__init__ should explicitly annotate -> None to comply with the repo typing rule.

♻️ Proposed fix
-class TrustedConfigStore:
+class TrustedConfigStore:
@@
-    def __init__(self, store_path: Path | None = None):
+    def __init__(self, store_path: Path | None = None) -> None:
         self.store_path = store_path or (Path.home() / ".modelaudit" / "cache" / "trusted_local_configs.json")

As per coding guidelines, **/*.py: Always include type hints in Python code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelaudit/cache/trusted_config_store.py` around lines 29 - 30, Add an
explicit return type hint to the TrustedConfigStore constructor: update the
TrustedConfigStore.__init__ signature to annotate the return type as -> None
(e.g., def __init__(self, store_path: Path | None = None) -> None:), leaving the
body unchanged so it complies with the repo typing rule.


    def is_trusted(self, candidate: LocalConfigCandidate) -> bool:
        """Return True when a candidate matches a previously trusted config hash."""
        records = self._load_records()
        key = str(candidate.config_dir)
        record = records.get(key)
        if record is None:
            return False

        if record.config_path != str(candidate.config_path):
            return False

        current_hash = self._hash_config(candidate.config_path)
        return current_hash is not None and current_hash == record.config_sha256

    def trust(self, candidate: LocalConfigCandidate) -> None:
        """Persist trust for a resolved local config candidate."""
        config_hash = self._hash_config(candidate.config_path)
        if config_hash is None:
            return

        records = self._load_records()
        records[str(candidate.config_dir)] = TrustedConfigRecord(
            config_path=str(candidate.config_path),
            config_sha256=config_hash,
        )
        self._write_records(records)

    def _load_records(self) -> dict[str, TrustedConfigRecord]:
        """Load trusted config records from disk."""
        if not self._is_secure_target(self.store_path):
            return {}

        try:
            if not self.store_path.exists():
                return {}
            if self.store_path.is_symlink() or not self.store_path.is_file():
                return {}

            with self.store_path.open(encoding="utf-8") as handle:
                payload = json.load(handle)
        except Exception:
            return {}

        if not isinstance(payload, dict) or payload.get("version") != TRUST_STORE_VERSION:
            return {}

        repos = payload.get("repos", {})
        if not isinstance(repos, dict):
            return {}

        records: dict[str, TrustedConfigRecord] = {}
        for key, value in repos.items():
            if not isinstance(key, str) or not isinstance(value, dict):
                continue
            config_path = value.get("config_path")
            config_sha256 = value.get("config_sha256")
            if isinstance(config_path, str) and isinstance(config_sha256, str):
                records[key] = TrustedConfigRecord(config_path=config_path, config_sha256=config_sha256)
        return records

    def _write_records(self, records: dict[str, TrustedConfigRecord]) -> None:
        """Write the current trust records atomically with private permissions."""
        parent = self.store_path.parent
        if not _ensure_secure_directory(parent):
            return

        payload = {
            "version": TRUST_STORE_VERSION,
            "repos": {
                key: {"config_path": record.config_path, "config_sha256": record.config_sha256}
                for key, record in records.items()
            },
        }
        temp_path = parent / f".trusted_local_configs.{uuid4().hex}.tmp"
        flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
        if hasattr(os, "O_NOFOLLOW"):
            flags |= os.O_NOFOLLOW

        try:
            fd = os.open(temp_path, flags, 0o600)
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                json.dump(payload, handle, indent=2, sort_keys=True)
            _tighten_permissions(temp_path, 0o600)
            os.replace(temp_path, self.store_path)
            _tighten_permissions(self.store_path, 0o600)
        except Exception:
            with suppress(OSError):
                temp_path.unlink()

    def _hash_config(self, config_path: Path) -> str | None:
        """Return a stable hash for the config file contents."""
        try:
            return hashlib.sha256(config_path.read_bytes()).hexdigest()
        except Exception:
            return None

    def _is_secure_target(self, path: Path) -> bool:
        """Return True when the parent path is suitable for reads and writes."""
        return not _has_symlink_component(path)


def _tighten_permissions(path: Path, mode: int) -> None:
    """Best-effort permission hardening for cache trust paths."""
    if os.name == "nt":
        return

    with suppress(OSError):
        path.chmod(mode)


def _has_symlink_component(path: Path) -> bool:
    """Return True when path or an ancestor is a symlink."""
    current = path
    while True:
        try:
            if current.is_symlink():
                return True
        except OSError:
            return True
        if current == current.parent:
            return False
        current = current.parent
Comment on lines +142 to +153

🧹 Nitpick | 🔵 Trivial

Consider simplifying symlink detection.

The _has_symlink_component function manually walks parent directories to detect symlinks. This is functionally correct and fail-safe (returns True on OSError).

An alternative approach would be to compare path.resolve() with the original path, which would detect symlinks anywhere in the path. However, the current implementation is more explicit and handles edge cases like broken symlinks more predictably.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelaudit/cache/trusted_config_store.py` around lines 142 - 153, Replace the
manual parent-walk in _has_symlink_component with a simpler resolve-based check:
use Path.resolve(strict=False) and return True when path !=
path.resolve(strict=False) (this detects any symlink component anywhere in the
path while avoiding raising on broken symlinks); update the function to call
Path.resolve(strict=False) on the input Path and return the comparison result
instead of the loop and OSError handling.
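
To weigh the two approaches discussed in this comment, here is a hedged side-by-side sketch. `has_symlink_component_walk` mirrors the loop in the diff, while `has_symlink_component_resolve` is a hypothetical rendering of the resolve-based proposal; neither name exists in ModelAudit. One behavioral difference worth noting: because `Path.resolve()` also makes a path absolute, the resolve-based comparison reports any relative path as differing from its resolved form, even when no symlink is involved.

```python
from __future__ import annotations

from pathlib import Path


def has_symlink_component_walk(path: Path) -> bool:
    """Parent-walk check: inspect the path and each ancestor with is_symlink()."""
    current = path
    while True:
        try:
            if current.is_symlink():
                return True
        except OSError:
            # Fail closed: treat an unreadable component as unsafe.
            return True
        if current == current.parent:
            return False
        current = current.parent


def has_symlink_component_resolve(path: Path) -> bool:
    """Resolve-based check (hypothetical variant of the review proposal).

    Caveat: resolve() also absolutizes and normalizes the path, so a relative
    path compares unequal to its resolved form even without any symlink.
    """
    return path != path.resolve(strict=False)
```

For an absolute path with no symlinked components the two checks agree. The store path in this PR is built from `Path.home()`, which is normally absolute, but a caller-supplied relative `store_path` would be rejected by the resolve-based variant.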



def _ensure_secure_directory(path: Path) -> bool:
    """Create a directory when possible and reject symlinked targets."""
    if _has_symlink_component(path):
        return False

    try:
        path.mkdir(parents=True, mode=0o700, exist_ok=True)
    except OSError:
        return False

    if not path.is_dir() or _has_symlink_component(path):
        return False

    _tighten_permissions(path, 0o700)
    return True
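
The write path in `_write_records` pairs exclusive temp-file creation with `os.replace`. Below is a minimal standalone sketch of that pattern, assuming the parent directory already exists and is private. `write_private_json` is a hypothetical helper name used only for illustration; it is not part of this PR.

```python
from __future__ import annotations

import json
import os
from pathlib import Path
from uuid import uuid4


def write_private_json(target: Path, payload: dict) -> bool:
    """Write JSON to target via a private temp file and an atomic rename."""
    temp_path = target.parent / f".{target.name}.{uuid4().hex}.tmp"
    # O_EXCL refuses to reuse an existing file; O_NOFOLLOW (where available)
    # refuses to open the temp path if it is a symlink.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, "O_NOFOLLOW", 0)
    try:
        fd = os.open(temp_path, flags, 0o600)
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2, sort_keys=True)
        # os.replace swaps the file in one step, so readers see either the
        # old contents or the new contents, never a partial write.
        os.replace(temp_path, target)
        return True
    except (OSError, TypeError, ValueError):
        # Best-effort cleanup of the temp file on any failure.
        try:
            temp_path.unlink()
        except OSError:
            pass
        return False
```

Creating the temp file with mode `0o600` from the start means the contents are never readable by other users, even briefly, which a write-then-`chmod` sequence cannot guarantee.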
118 changes: 106 additions & 12 deletions modelaudit/cli.py
@@ -23,7 +23,9 @@
is_delegated_from_promptfoo,
set_user_email,
)
from .cache.trusted_config_store import TrustedConfigStore
from .config import ModelAuditConfig, set_config
from .config.local_config import find_local_config_for_paths
from .core import determine_exit_code, scan_model_directory_or_file
from .integrations.jfrog import scan_jfrog_artifact
from .integrations.sarif_formatter import format_sarif_output
@@ -41,7 +43,12 @@
record_scan_started,
)
from .utils import resolve_dvc_file
from .utils.helpers.auto_defaults import apply_auto_overrides, generate_auto_defaults, parse_size_string
from .utils.helpers.auto_defaults import (
    apply_auto_overrides,
    detect_ci_environment,
    generate_auto_defaults,
    parse_size_string,
)
from .utils.helpers.interrupt_handler import interruptible_scan
from .utils.sources.cloud_storage import download_from_cloud, is_cloud_url
from .utils.sources.huggingface import (
@@ -78,6 +85,83 @@ def style_text(text: str, **kwargs: Any) -> str:
return text


def get_trusted_config_store() -> TrustedConfigStore:
    """Return the persistent store used for trusted local configs."""
    return TrustedConfigStore()


def can_use_trusted_local_config(output_format: str) -> bool:
    """Return True when the current scan mode supports trusted local configs."""
    return (
        output_format == "text"
        and sys.stdin.isatty()
        and sys.stdout.isatty()
        and not detect_ci_environment()
        and not is_delegated_from_promptfoo()
    )


def maybe_load_trusted_local_config(
    paths: list[str],
    output_format: str,
    *,
    quiet: bool,
) -> tuple[ModelAuditConfig | None, bool, Path | None]:
    """Load a trusted local config for interactive text scans when available."""
    if not can_use_trusted_local_config(output_format):
        return None, False, None

    candidate = find_local_config_for_paths(paths)
    if candidate is None:
        return None, False, None

    store = get_trusted_config_store()
    if store.is_trusted(candidate):
        return ModelAuditConfig.load(candidate.config_path), True, candidate.config_path

    if quiet:
        return None, False, None

    click.echo(style_text(f"Found local ModelAudit config at {candidate.config_path}", fg="cyan"))
    click.echo("It can suppress findings or change severities.")
    choice = click.prompt(
        "Use it? [y] once, [a] always, [n] no",
        type=click.Choice(["y", "a", "n"], case_sensitive=False),
        default="n",
        show_choices=False,
    ).lower()

    if choice == "n":
        return None, False, None

    if choice == "a":
        store.trust(candidate)

    return ModelAuditConfig.load(candidate.config_path), True, candidate.config_path


def build_scan_rule_config(
    paths: list[str],
    suppress: tuple[str, ...],
    severity_overrides: dict[str, str],
    *,
    output_format: str,
    quiet: bool,
) -> tuple[ModelAuditConfig, bool, Path | None]:
    """Build the effective scan rule config, including trusted local policy when enabled."""
    base_config, local_config_applied, local_config_path = maybe_load_trusted_local_config(
        paths,
        output_format,
        quiet=quiet,
    )
    cli_config = ModelAuditConfig.from_cli_args(
        suppress=list(suppress) if suppress else None,
        severity=severity_overrides if severity_overrides else None,
        base_config=base_config,
    )
    return cli_config, local_config_applied, local_config_path


def expand_paths(paths: tuple[str, ...]) -> tuple[list[str], list[str]]:
"""Expand and validate input paths with type safety."""
expanded: list[str] = []
@@ -770,17 +854,6 @@ def scan_command(
flush_telemetry()
sys.exit(2)

    # Apply rule configuration from CLI and config files
    severity_overrides = parse_severity_overrides(severity)
    try:
        cli_config = ModelAuditConfig.from_cli_args(
            suppress=list(suppress) if suppress else None,
            severity=severity_overrides if severity_overrides else None,
        )
    except ValueError as exc:
        raise click.BadParameter(str(exc)) from exc
    set_config(cli_config)

    # Generate defaults based on input analysis
    auto_defaults = generate_auto_defaults(expanded_paths)

@@ -846,6 +919,27 @@ def scan_command(
    final_skip_files = config.get("skip_non_model_files", True)
    final_strict_license = config.get("strict_license", False)

    # Apply rule configuration from CLI and any trusted local config for this scan mode.
    severity_overrides = parse_severity_overrides(severity)
    try:
        cli_config, local_config_applied, local_config_path = build_scan_rule_config(
            expanded_paths,
            suppress,
            severity_overrides,
            output_format=final_format,
            quiet=quiet,
        )
    except ValueError as exc:
        raise click.BadParameter(str(exc)) from exc
    set_config(cli_config)

    if local_config_applied:
        if final_cache:
            final_cache = False
        if not quiet and show_styled_output and local_config_path is not None:
            click.echo(style_text(f"Using local ModelAudit config: {local_config_path}", fg="cyan"))
            click.echo(style_text("Scan result cache disabled for this run.", fg="yellow"))

    # Handle max download size from automatic defaults or max_size override
    max_download_bytes = None
    if max_size is not None: