Security hardening: webhook validation, adapter path traversal, API key hygiene, and input validation #1246

@planetf1

A set of small, independent hardening items identified during a security audit. None require architectural changes; each is a one-PR fix. Grouping here for tracking — they can be split into separate PRs or tackled together.

Jinja2 bare Environment in TemplateFormatter

mellea/formatters/template_formatter.py:181 constructs a bare jinja2.Environment() and loads template source from repr.template:

if repr.template:
    return jinja2.Environment().from_string(repr.template)  # type: ignore

repr.template is currently developer-only (the only two sites that set it use literal strings), but the template= argument is public API with no sandbox. If a downstream caller passed model output there, it would be unsandboxed server-side template injection (SSTI). The safe path at lines 244-245 uses jinja2.select_autoescape(), but that only protects against XSS, not SSTI.

Mitigation: replace with jinja2.sandbox.SandboxedEnvironment(undefined=jinja2.StrictUndefined) — one-line change.
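
A minimal sketch of the sandboxed variant, to show what the change buys. The helper names render_untrusted and is_blocked are hypothetical and not part of mellea; the sketch assumes Jinja2 is installed and only demonstrates that dunder attribute access is refused.

```python
import jinja2
from jinja2.exceptions import SecurityError
from jinja2.sandbox import SandboxedEnvironment

# Hypothetical module-level environment, replacing a bare jinja2.Environment().
_SANDBOX = SandboxedEnvironment(undefined=jinja2.StrictUndefined)


def render_untrusted(source: str, **context: object) -> str:
    """Render template source with unsafe attribute access blocked."""
    return _SANDBOX.from_string(source).render(**context)


def is_blocked(source: str) -> bool:
    """Return True if the sandbox refuses to render the template."""
    try:
        render_untrusted(source)
    except SecurityError:
        return True
    return False
```

Ordinary templates such as "Hello {{ name }}" still render as before, while a probe like "{{ ''.__class__ }}" raises SecurityError instead of exposing Python internals.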

H1 — Webhook URL has no validation or timeout

In mellea/core/utils.py, _resolve_webhook_url() returns the MELLEA_LOGS_WEBHOOK environment variable verbatim with no scheme enforcement, no host allowlist, and no validation. RESTHandler.emit() posts to this URL with no timeout= argument, so a non-responsive server hangs the caller indefinitely.

Mitigations: validate HTTPS-only, add a hostname allowlist or deny-list (at minimum 169.254.0.0/16), add timeout= to requests.request() in emit().
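
One possible shape for the validation step, using only the standard library. validate_webhook_url is a hypothetical helper, not existing mellea code, and the deny-list holds only the range named above. It checks literal IP hosts only, so a DNS name that resolves into a denied range is not caught by this sketch.

```python
import ipaddress
from urllib.parse import urlsplit

# Networks refused outright; 169.254.0.0/16 is the stated minimum.
_DENIED_NETWORKS = (ipaddress.ip_network("169.254.0.0/16"),)


def validate_webhook_url(url: str) -> str:
    """Return url unchanged if it passes basic checks, else raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError(f"webhook URL must use https, got {parts.scheme!r}")
    host = parts.hostname
    if not host:
        raise ValueError("webhook URL has no host")
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return url  # not an IP literal; nothing more to check in this sketch
    if any(addr in net for net in _DENIED_NETWORKS):
        raise ValueError(f"webhook host {host} is in a denied range")
    return url
```

The timeout is a separate change: passing timeout= (for example a few seconds) to requests.request() in emit() bounds how long a silent server can block the caller.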

M2 — Adapter io_config path not canonicalised

In mellea/backends/adapters/adapter.py, from_model_directory() opens model_path / io_config_rel, where io_config_rel comes from a JSON file, without calling Path.resolve() or checking that the resolved path stays inside model_path. A ../ traversal in the JSON index or a symlink in the model directory can therefore read arbitrary files.

Mitigation: Path.resolve(strict=True) on the joined path; assert it is a subpath of model_path.
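
The containment check could look roughly like this. resolve_inside is a hypothetical helper name, and the sketch assumes Python 3.9+ for Path.is_relative_to().

```python
from pathlib import Path


def resolve_inside(model_path: Path, io_config_rel: str) -> Path:
    """Resolve model_path / io_config_rel and require it to stay inside model_path.

    strict=True raises FileNotFoundError for missing files and follows
    symlinks, so a link that points outside the directory is rejected too.
    """
    root = model_path.resolve(strict=True)
    candidate = (root / io_config_rel).resolve(strict=True)
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise ValueError(f"{io_config_rel!r} escapes {root}")
    return candidate
```

Resolving the root as well as the candidate matters: if model_path itself contains a symlink, comparing a resolved candidate against an unresolved root would reject legitimate files.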

M3 — Server detection makes an outbound HTTP request with no host validation or timeout

In mellea/helpers/server_type.py:73, is_vllm_server_with_structured_output() constructs a URL from base_url and calls requests.get() with no hostname validation and no timeout=. A base_url pointing at a cloud metadata endpoint (169.254.169.254) or a slow server hangs the call.

Mitigation: deny 169.254.0.0/16 before making the request; add timeout=5.
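
Because base_url may carry a hostname rather than an IP literal, one way to apply the deny rule is to check the addresses the host resolves to. This is a sketch under that assumption; resolves_to_link_local is a hypothetical name, and the resolver parameter exists only so the check can be exercised without network access.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

_LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def resolves_to_link_local(base_url: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if any address the host resolves to is in 169.254.0.0/16."""
    host = urlsplit(base_url).hostname
    if host is None:
        raise ValueError(f"no host in {base_url!r}")
    # Each getaddrinfo entry ends with a sockaddr tuple whose first item is the IP.
    for *_, sockaddr in resolver(host, None):
        addr = ipaddress.ip_address(sockaddr[0].split("%")[0])  # drop IPv6 zone id
        if addr in _LINK_LOCAL:
            return True
    return False
```

A caller would refuse the request when this returns True and otherwise call requests.get(..., timeout=5). The check and the request resolve the name separately, so this narrows the exposure rather than closing it completely.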

M4 — API key stored as a public single-underscore attribute with no repr masking

In mellea/backends/openai.py, self._api_key is private by Python convention only (not enforced). There is no __repr__ override and no __slots__. The key appears in debug output and object dumps. mcp.py constructs h["Authorization"] = f"Bearer {api_key}" as a plain dict.

Mitigation: add __repr__ returning "sk-***"; consider __slots__ or double-underscore mangling for the attribute.
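
A small illustration of the masking idea. ExampleBackend and its model_id field are hypothetical stand-ins, not the real mellea backend class.

```python
class ExampleBackend:
    """Hypothetical stand-in for a backend that holds an API key."""

    __slots__ = ("model_id", "_api_key")  # no instance __dict__ to dump

    def __init__(self, model_id: str, api_key: str) -> None:
        self.model_id = model_id
        self._api_key = api_key

    def __repr__(self) -> str:
        # Never interpolate the real key; show a fixed mask instead.
        return f"{type(self).__name__}(model_id={self.model_id!r}, api_key='sk-***')"
```

This only guards against accidental exposure through repr(), logging, or vars(); the attribute is still readable as obj._api_key by any code that asks for it.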

M5 — Temp file has no explicit chmod; mkdtemp has no umask guard

In mellea/stdlib/tools/interpreter.py, NamedTemporaryFile(delete=False) at line 384 is not followed by os.chmod(temp_file, 0o600). mkdtemp() at line 1176 has no explicit umask guard. On systems with a permissive default umask the temp directory may be world-readable.

Mitigation: os.chmod(temp_file, 0o600) after creation; set umask explicitly around mkdtemp().
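
Both steps can be sketched as follows, assuming a POSIX system. The function names are hypothetical. Note that the tempfile documentation describes mkstemp() and mkdtemp() as creating entries accessible only to the creating user, so these calls may be best read as making the intended permissions explicit rather than as the only line of defence.

```python
import os
import tempfile


def make_private_temp_file(suffix: str = ".py") -> str:
    """Create a temp file and set owner-only permissions explicitly."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as handle:
        path = handle.name
    os.chmod(path, 0o600)
    return path


def make_private_temp_dir() -> str:
    """Create a temp directory under a restrictive umask, then restore it."""
    previous = os.umask(0o077)
    try:
        return tempfile.mkdtemp()
    finally:
        os.umask(previous)
```

os.umask() changes the mask for the whole process, so in multi-threaded code the guard can briefly affect files created by other threads.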

N3 — Message.role not validated at runtime

In mellea/stdlib/components/chat.py, Message.__init__ accepts any string for role; the Literal["system","user","assistant","tool"] type hint is not enforced at runtime. The _parse classmethod extracts the role directly from backend API response metadata without checking the value.

Mitigation: add a runtime check in __init__:

_VALID_ROLES = {"system", "user", "assistant", "tool"}
if role not in _VALID_ROLES:
    raise ValueError(f"Invalid role {role!r}")
