Marg security notes

This file is the v1.0 self-led security review. It is the artefact that satisfies the P07 exit criterion for a security audit on the Marg-only paths. Kavach-specific paths are reviewed in P09.

Threat model

Marg is a self-hosted proxy between client applications and large language model providers. The interesting trust boundaries are:

Client app to Marg. Apps authenticate with a Marg-issued API key. Marg enforces budgets, rate limits, and route policy.
Marg to provider. Marg holds the real provider credentials. The client app must never see them.
Operator to Marg. Operators authenticate with an admin Bearer token. The admin token can mint API keys, change budgets, and reload routing policy.
Marg to its backends. Postgres and Redis hold durable and hot state. Compromise of either means trust in everything downstream is compromised. Same with the local request log on disk.
Operator workstation to Marg Console. The console is a static single-page app served by the admin port. It uses the same admin Bearer token as the JSON API.

Out of scope: TLS termination (assumed handled by a reverse proxy or ALB in front of Marg), host-level kernel exploits, supply-chain compromise of crates.io.

Authentication

Client API keys

Format: marg_live_<32-byte-base32>. Generated with a CSPRNG.
Surfaced once at creation; stored as a SHA-256 hash.
Lookup is hash(token) -> key_id. No timing oracle on the hash itself (the lookup is a single indexed query).
Cache TTL on the auth path is 60 seconds. A revoke explicitly invalidates the cache; the revoke endpoint also clears the budget gauge for the key so stale Prometheus series do not accumulate.

Admin tokens

Same format and storage as client keys.
5-second cache TTL on the admin port (admin operations are infrequent and need fast revoke).
A 0600-mode bootstrap token is written on first boot to a configurable path. Idempotent: subsequent boots do not mint another token if any active admin token exists.
marg admin bootstrap mints an explicit token when the bootstrap path is empty (or set to "").

Privilege separation

There is exactly one privilege level inside Marg: admin. Either you have an admin token or you do not. v1.0 intentionally does not ship per-route ACLs or RBAC; that complexity belongs in the operator's IAM, not in Marg.

Provider credentials

Provider API keys live only in marg.toml (or in env vars / files referenced by the env: / file: secret shapes).
They never appear in:
- logs (every error path scrubs the Authorization header before logging),
- response bodies,
- the admin API,
- the request log on disk,
- any Prometheus metric label.
Provider responses can echo prompt content but never our provider key (we strip the Authorization header on the outbound side too, in case a provider ever decided to reflect headers).
marg keys only deals with Marg-issued keys. There is no endpoint that returns a provider key.

Body and connection limits

Configurable [server].max_body_bytes cap on chat request bodies. Default 1 MiB. Requests larger than the limit are refused with a 413 by the tower-http RequestBodyLimitLayer before the chat handler ever runs.
HTTP/1.1 keep-alive and HTTP/2 supported, with axum's default per-connection request limit. Backpressure on streaming is end-to-end; Marg never buffers a full response.
Connect timeout and read timeout per provider, with sensible defaults (120 s read timeout).
File descriptors raised to 1,048,576 in the shipped systemd unit; documented as the production minimum.

Streaming safety

SSE upstream chunks pass through unchanged. The only operation Marg performs is extracting usage from the final chunk (or Anthropic's stop event, or Google's UsageMetadata, or Bedrock's event-stream metadata) for budget settlement.
The streaming pipeline is a token-by-token tokio channel; there is no buffering of the response body.
A dropped client connection cancels the upstream call. When the inner mpsc send to the client fails, the streaming task drops the reqwest byte-stream, which aborts the underlying HTTP request to the provider so further tokens are never generated. marg_provider_errors_total{kind="client_disconnect"} increments and the request is logged with status 499 (client closed request). This closes the cost-amplification path that v1.0 pre-P08 carried as a known issue.

Failover and fail-closed

4xx from upstream surfaces directly to the client. Never retried, never failed over. This avoids "user prompt was malformed" silently retrying against a different provider that responds differently.
5xx, connect timeout, read timeout, and network errors trigger the fallback chain. Each fallback runs at most once per request.
Hot store unreachable (Redis down) returns 503 with x-marg-reason: hot_store_unreachable. No silent permits.
Durable backend unreachable (Postgres / SQLite) or any other unexpected internal failure on the chat path returns 503 with x-marg-reason: internal_error. Same fail-closed rule.
Asynchronous write batcher full (queue depth has reached [storage.write_batcher].channel_depth) returns 503 with x-marg-reason: storage_overloaded. The request never silently drops its spend or audit row. The marg_write_batcher_overflow_total counter increments per refusal.

Cross-origin and console

The admin Bearer token authentication is header-based, not cookie-based. CSRF is therefore not applicable to the admin port.
[admin.cors] defaults to disabled. Enable only when serving the console from a different origin (e.g. the Vite dev server). When the console is served same-origin from the admin port (the default), CORS is not needed.
The console's DOM is constructed via a small h() helper that uses textContent, never innerHTML, so user-controlled strings (key names, request log content) cannot inject HTML. XSS via innerHTML is structurally unreachable.

Server-Side Request Forgery (SSRF)

Provider base_url is operator-controlled config. There is no client-controlled URL anywhere in the request path. Apps cannot redirect Marg to an arbitrary host.
Provider names referenced in routes are validated at config-load against the providers block. An unknown name is a startup error, not a runtime ambiguity.

Secret references

plain:, env:, file: shapes are resolved at startup, not at request time. A file: reference returns the trimmed file contents.
A missing env var or unreadable file is a fatal startup error, not a silent empty string. We surface the missing reference before accepting traffic.

Audit and request log

v1.0 request log records: timestamp, request_id, key_id, team, model, provider, input_tokens, output_tokens, cost_usd, status, failover count, and per-attempt provider plus outcome.
No prompt or response body is ever logged in v0.1.0. The [security].log_prompts and log_responses config keys are parsed for forward compatibility but currently have no effect on the running binary.
The log lives in the durable backend (SQLite or Postgres). On a disk-full event the request log surfaces 503 and never silently drops entries.

The full tamper-evident, post-quantum signed audit chain is a mandatory v1.0 feature backed by Kavach (ADR-010 / ADR-011). Marg appends one signed entry per request to a SignedAuditChain and ships permit tokens that are ML-DSA-65 (+ optional Ed25519) signed end-to-end. The operational request log lives alongside it for metrics and admin UI consumption.

Cryptography in v1.0

Token hashing uses SHA-256 with no salt because tokens are high-entropy CSPRNG output (32 bytes of base32). Salting offers no protection here and would prevent the indexed hash(token) -> key_id lookup.
TLS termination is delegated. Run Marg behind a reverse proxy or ALB that handles TLS. Marg itself binds plain HTTP.
Bedrock SigV4 signing is implemented locally with hmac and sha2. The signer is feature-by-feature compatible with the AWS docs; no rolling-your-own primitives.

Post-quantum signatures (ML-DSA-65 + Ed25519 hybrid via Kavach) are baked into the v1.0 binary. Every audit chain entry and every permit token carries a signature; the [kavach].audit_hybrid and [kavach].permit_signer_hybrid knobs control whether the hybrid Ed25519 companion runs alongside ML-DSA-65 or whether the signer runs PQ-only.

Dependencies

All Rust dependencies pinned in Cargo.lock. Periodic cargo audit runs documented under "release process" in CHANGELOG.md.
Console dependencies: zero runtime, build-time only (Vite + TypeScript). The shipped bundle is hand-written DOM and a minimal h() helper; no React, no framework, no extra attack surface.

Things v1.0 deliberately does NOT do

No per-route ACL. Use IAM to control who has an admin token.
No request-body content filtering. Drishti / content moderation is a separate, optional layer that lands later.
No automatic provider failover on 4xx. 4xx surfaces directly.
No self-signed TLS. Front Marg with a real terminator.
No "demo mode" with bundled keys. Every deployment must provision its own provider credentials and admin token.

Reporting a vulnerability

Email security@<your-domain> with the request_id, the affected version, and the smallest reproducing case you can share. We acknowledge within 48 hours and patch within 14 days for any issue that is exploitable against a default-configuration deployment.

Self-audit log

Each row is one explicit check performed during the v1.0 review. This list is part of the P07 exit criteria; subsequent releases add rows as the surface grows.

Path	Verified	Evidence and notes
Authorization header never reaches logs	yes	`marg-server::observability::record_outcome` (`marg-server/src/observability.rs:77`) writes only method, path, status, latency, request_id, key_id, model, provider. tower-http's `TraceLayer` defaults at `marg-server/src/lib.rs:134` do not log request headers.
Provider key not in `/admin/keys` response	yes	`admin/handlers/keys.rs:79-108` returns only `MargKey` plus `BudgetSpec`. Create at the same file returns only the freshly minted Marg token. No upstream provider field anywhere.
Auth cache invalidated on revoke	yes	`admin/handlers/keys.rs:110-121` calls `state.metrics.clear_budget_remaining(&id)` then `state.key_cache.invalidate_all()`. Coarse but correct: every key's cache entry drops, so the revoke takes effect inside the next request.
Admin token file mode 0600	yes	`admin/server.rs:106-120` `write_bootstrap_file` uses `OpenOptions::mode(0o600)` under `#[cfg(unix)]` at line 113.
Console uses `textContent`, never `innerHTML`	yes	`console/src/dom.ts:10-58` `h()` helper wraps strings via `document.createTextNode` at line 52. Zero `innerHTML` occurrences anywhere in `console/src/`.
4xx upstream not failed over	yes	`marg-core/src/request_log.rs:47-52` `is_retriable` matches only `Timeout
Body size enforced before allocation	yes	`marg-server/src/lib.rs:135` installs `tower_http::limit::RequestBodyLimitLayer::new(cfg.server.max_body_bytes)`. A secondary hardcoded 8 MiB ceiling lives in `chat.rs:30` (`MAX_REQUEST_BYTES`) as a defence-in-depth bound.
`file:` secret missing is fatal at startup	yes	`marg-core/src/secret.rs:39-49` returns `Err(ConfigError::Validation)` on a read failure; the server start path propagates.
`env:` secret missing is fatal at startup	yes	Same file, lines 25-37, returns `Err` on `std::env::var` failure.
Bootstrap idempotency	yes	`admin/server.rs:59-67` `count_active_admin_tokens` first; mints only when the count is zero.
Admin auth middleware uniform	yes	`admin/router.rs:13-39` mounts every `/admin/` JSON route inside the `protected` group with `require_admin_token`. The only public paths on the admin port are `/`, `/console`, `/admin/openapi.json`, and `/metrics`.
Streaming: client drop cancels upstream	yes	`chat.rs::stream_response` breaks the loop on the first failed `tx.send` and drops `byte_stream`, which aborts the reqwest streaming request to the upstream provider. `marg_provider_errors_total{kind="client_disconnect"}` increments. The request is logged with status 499.
Write batcher overflow refuses (never silently drops)	yes	`chat.rs::non_stream_response` and `chat.rs::stream_response` route every `add_spend` and `append_request_log` through `state.write_batcher.enqueue(WriteJob::...)`. On `Err(Overflow)` the non-stream path returns `ChatError::StorageOverloaded` (503 + `x-marg-reason: storage_overloaded`); the stream path logs the overflow at warn. The `marg_write_batcher_overflow_total` counter increments per refusal.
Strict-mode rate limit is opt-in only	yes	`quota::check` passes `state.rate_limits.strict_mode` into `hot.allow_request`. The default in `marg-core::config::RateLimitsConfig::default` is `strict_mode = false`, i.e. the documented token-bucket convention. Enabling it requires an explicit `[rate_limits].strict_mode = true` line in marg.toml.
Permit tokens are signed	yes	`marg-server/src/kavach/runtime.rs::build_runtime` constructs `PqTokenSigner::from_keypair_hybrid` (or `_pq_only` when `[kavach].permit_signer_hybrid = false`) from the same `KavachKeyPair` loaded for the audit chain and attaches it via `Gate::with_token_signer`. Every `Verdict::Permit` carries an ML-DSA-65 (+ Ed25519) signature over `PermitToken::canonical_bytes()`. Verification recipe is in `docs/kavach.md`. Walkthrough scenario 11 covers a verify + byte-flip cycle.
Drift invalidations drop the local cache	yes	`chat.rs` `Verdict::Invalidate` branch calls `state.kavach.session_store.invalidate(session_id)` (flips `invalidated = true` on the persisted session row), `state.key_cache.invalidate_all()` (drops the moka auth cache so the next request re-resolves against storage), and `kavach::emit_key_event(.., KeyEventKind::Invalidated, ..)` (appends `marg.key_event.v1` to the chain). Subsequent requests on the same key 401 because the cache miss path re-validates and Kavach refuses on the still-invalidated session row.
Drift detector tuning is hot-reloadable, not restart-only	yes	`marg-server::policy::reload` calls `kavach::reload_policy`, which re-reads `[kavach.drift]` from `marg.toml` via `build_drift_evaluator` and stores the result into `runtime.drift_evaluator: Arc<SwappableDriftEvaluator>`. The gate's evaluator list is constructed once at boot with this wrapper attached; the wrapper's `evaluate` re-reads the inner `Option<Arc<DriftEvaluator>>` on every call, same `ArcSwap` pattern as `SwappableInvariantSet`.
Cross-restart audit chain documented as parked	yes	ADR-016 records the v1.0 limitation: each Marg process boots a fresh `SignedAuditChain` because Kavach 0.1.2's API does not expose `resume_from(prev_head_hash, prev_index)`. Per-lifetime JSONL files are independently verifiable; a unified-history walk requires the operator to verify each file separately. `docs/kavach.md` "Cross-restart audit chain (parked: ADR-016)" surfaces the same information to operators.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Security

SECURITY.md

Marg security notes

Threat model

Authentication

Client API keys

Admin tokens

Privilege separation

Provider credentials

Body and connection limits

Streaming safety

Failover and fail-closed

Cross-origin and console

Server-Side Request Forgery (SSRF)

Secret references

Audit and request log

Cryptography in v1.0

Dependencies

Things v1.0 deliberately does NOT do

Reporting a vulnerability

Self-audit log

There aren't any published security advisories

Security: SarthiAI/Marg

Security

SECURITY.md

Marg security notes

Threat model

Authentication

Client API keys

Admin tokens

Privilege separation

Provider credentials

Body and connection limits

Streaming safety

Failover and fail-closed

Cross-origin and console

Server-Side Request Forgery (SSRF)

Secret references

Audit and request log

Cryptography in v1.0

Dependencies

Things v1.0 deliberately does NOT do

Reporting a vulnerability

Self-audit log

There aren't any published security advisories