fix(proxy): return 403 for non-CONNECT requests, add deny logging, and revise error messages (#79)
Closes NVIDIA#42
- Change non-CONNECT proxy response from 405 to 403 to align with
how CONNECT denials are surfaced
- Add structured deny logging for non-CONNECT requests with hostname
extraction from absolute-form URIs
- Revise 7 user-facing error messages across proxy.rs and
dev-sandbox-policy.rego to follow consistent principle: generic
policy-deny messages for non-inference requests, descriptive messages
for recognized inference endpoints
- Update E2E test assertion to match new error message
- Update architecture docs to reflect new behavior
Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
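
The hostname extraction mentioned in the commit message could look roughly like the following. This is a minimal sketch under stated assumptions, not the code in `proxy.rs`: the function names `host_from_request_line` and `deny_non_connect` and the log field names are hypothetical, and only the 403 status and the use of absolute-form URIs come from the commit message.

```rust
/// Hypothetical helper: extract the host from a request line whose target is
/// in absolute form, e.g. "GET http://example.com:8080/path HTTP/1.1".
/// Returns None for origin-form targets such as "/path".
fn host_from_request_line(request_line: &str) -> Option<&str> {
    let mut parts = request_line.split_whitespace();
    let _method = parts.next()?;
    let target = parts.next()?;
    // Absolute-form targets contain "://"; origin-form targets do not.
    let (_scheme, rest) = target.split_once("://")?;
    // The authority ends at the first '/', '?' or '#'.
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    // Drop optional userinfo ("user:pass@host").
    let host_port = authority.rsplit_once('@').map_or(authority, |(_, hp)| hp);
    // Handle bracketed IPv6 literals, otherwise strip an optional ":port".
    let host = if let Some(stripped) = host_port.strip_prefix('[') {
        stripped.split_once(']')?.0
    } else {
        host_port.rsplit_once(':').map_or(host_port, |(h, _)| h)
    };
    if host.is_empty() {
        None
    } else {
        Some(host)
    }
}

/// Hypothetical deny path for non-CONNECT requests: returns the HTTP status
/// and a structured log line. The field names are assumptions for illustration.
fn deny_non_connect(request_line: &str) -> (u16, String) {
    let method = request_line.split_whitespace().next().unwrap_or("");
    let host = host_from_request_line(request_line).unwrap_or("unknown");
    let log = format!("action=deny method={} host={}", method, host);
    (403, log)
}
```

Falling back to `"unknown"` when the target is not in absolute form keeps the deny log well-formed even for malformed requests; the real proxy may handle that case differently.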
architecture/inference-routing.md (+5 -5: 5 additions & 5 deletions)
@@ -142,7 +142,7 @@ File mode does not spawn a refresh task -- routes are static for the sandbox lif
 Both route source modes degrade gracefully when routes are unavailable:
 
 - **Empty routes in file mode**: If `routes: []` in the file, `build_inference_context()` returns `None` and inference routing is disabled. This is confirmed by the `build_inference_context_empty_route_file_returns_none` test.
-- **Empty routes in cluster mode**: If the initial cluster bundle has zero routes, the sandbox still creates `InferenceContext` with an empty cache and starts background refresh. Intercepted inference requests return `503` (`{"error": "no inference routes configured"}`) until a later refresh provides routes.
+- **Empty routes in cluster mode**: If the initial cluster bundle has zero routes, the sandbox still creates `InferenceContext` with an empty cache and starts background refresh. Intercepted inference requests return `503` (`{"error": "inference endpoint detected without matching inference route"}`) until a later refresh provides routes.
 - **Cluster mode errors**: `PermissionDenied` or `NotFound` errors (detected via string matching on the gRPC error message) indicate no inference policy is configured for this sandbox. The sandbox logs this and proceeds without inference routing. Other gRPC errors also result in graceful degradation: inference routing is disabled, but the sandbox starts normally.
 - **File mode errors**: Parse failures or missing files in standalone mode are fatal -- `build_inference_context()` propagates the error and the sandbox refuses to start. Only an empty-but-valid routes list is gracefully disabled.
 
@@ -253,11 +253,11 @@ Built at sandbox startup in `crates/navigator-sandbox/src/lib.rs` by `build_infe
 - If `detect_inference_pattern()` matches:
   - Strip credential and framing/hop-by-hop headers (`Authorization`, `x-api-key`, `host`, `content-length`, and all hop-by-hop headers)
   - Acquire a read lock on the route cache
-  - If routes are empty, return `503` JSON: `{"error": "no inference routes configured"}`
+  - If routes are empty, return `503` JSON: `{"error": "inference endpoint detected without matching inference route"}`
   - Call `Router::proxy_with_candidates()` to select a route and forward the request locally
   - Return the backend's response to the client (response hop-by-hop and framing headers are stripped before formatting)
 - If no pattern matches:
-  - Return a `403` JSON error: `{"error": "only inference API calls are allowed on this connection"}`
+  - Return a `403` JSON error: `{"error": "connection not allowed by policy"}`
 - If the router call fails:
   - Map the `RouterError` to an HTTP status via `router_error_to_http()` and return a JSON error
 
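
The two early-exit branches in the hunk above can be sketched as a small decision function. This is an illustrative sketch only: `JsonReply` and `early_reply` are hypothetical names, and the boolean and route count stand in for the result of `detect_inference_pattern()` and the route cache lookup. The status codes and error bodies are the new ones introduced by this change.

```rust
/// Hypothetical reply type: an HTTP status plus a JSON error body.
#[derive(Debug, PartialEq)]
struct JsonReply {
    status: u16,
    body: &'static str,
}

/// Decide whether the request is answered locally with an error.
/// `is_inference` stands in for the pattern-detection result and
/// `route_count` for the number of cached routes (both are assumptions).
/// Returns None when the request should be forwarded to the router.
fn early_reply(is_inference: bool, route_count: usize) -> Option<JsonReply> {
    if !is_inference {
        // Generic policy-deny message for non-inference requests.
        return Some(JsonReply {
            status: 403,
            body: r#"{"error": "connection not allowed by policy"}"#,
        });
    }
    if route_count == 0 {
        // Descriptive message for a recognized inference endpoint.
        return Some(JsonReply {
            status: 503,
            body: r#"{"error": "inference endpoint detected without matching inference route"}"#,
        });
    }
    None
}
```

The split mirrors the principle stated in the commit message: requests that are not recognized as inference calls get the generic policy message, while recognized inference endpoints get a descriptive one.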
@@ -634,8 +634,8 @@ The inference routing migration is a breaking protocol change. The `ProxyInferen
architecture/sandbox.md (+9 -9: 9 additions & 9 deletions)
@@ -270,7 +270,7 @@ Uses the same input JSON shape as `evaluate_network()`. Evaluates the `data.navi
 
 - `"allow"` -- endpoint + binary explicitly matched in a network policy
 - `"inspect_for_inference"` -- no policy match but `inference.allowed_routes` is non-empty
-- `"deny"` -- no matching policy and no inference routing configured
+- `"deny"` -- network connections not allowed by policy
 
 The Rego logic:
 1. If `network_policy_for_request` exists (endpoint + binary match), return `"allow"`
@@ -582,7 +582,7 @@ Startup steps:
 
 ### Request parsing
 
-The proxy reads up to 8192 bytes (`MAX_HEADER_BYTES`) looking for `\r\n\r\n`. It validates the method is `CONNECT` (returning 405 for anything else) and parses the `host:port` target.
+The proxy reads up to 8192 bytes (`MAX_HEADER_BYTES`) looking for `\r\n\r\n`. It validates the method is `CONNECT` (returning 403 for anything else with a structured log) and parses the `host:port` target.
 
 ### Control-plane bypass
 
@@ -632,7 +632,7 @@ The `action` field carries the matched policy name (for `Allow` and `InspectForI
 
 Every CONNECT request produces an `info!()` log line with all context: source/destination addresses, binary path, PID, ancestor chain, cmdline paths, action (`allow`, `inspect_for_inference`, or `deny`), engine, matched policy, and deny reason.
 
-For `InspectForInference` connections, the initial log records `action=inspect_for_inference`. If the subsequent inference interception fails (TLS handshake failure, client disconnect, non-inference request, payload too large, missing context, or I/O error), a second `CONNECT` log is emitted with `action=deny` and a `reason` describing the failure. Successfully routed connections produce no second log. This two-log pattern gives operators visibility into why an `inspect_for_inference` decision ultimately resulted in a denial.
+For `InspectForInference` connections, the initial log records `action=inspect_for_inference`. If the subsequent inference interception fails (TLS handshake failure, client disconnect, request not allowed by policy, payload too large, missing context, or I/O error), a second `CONNECT` log is emitted with `action=deny` and a `reason` describing the failure. Successfully routed connections produce no second log. This two-log pattern gives operators visibility into why an `inspect_for_inference` decision ultimately resulted in a denial.
 
 ### SSRF protection (internal IP rejection)
 
@@ -651,7 +651,7 @@ enum InferenceOutcome {
 }
 ```
 
-Every exit path in `handle_inference_interception` produces an explicit outcome. The `Denied` variant carries a human-readable reason describing the failure. At the call site in `handle_tcp_connection`, `Denied` outcomes (and `Err` results) trigger a structured CONNECT deny log with the same fields as the initial decision log (see [Unified logging](#unified-logging)). The `route_inference_request` helper returns `Result<bool>` where `true` means the request was routed and `false` means it was a non-inference request that was denied inline.
+Every exit path in `handle_inference_interception` produces an explicit outcome. The `Denied` variant carries a human-readable reason describing the failure. At the call site in `handle_tcp_connection`, `Denied` outcomes (and `Err` results) trigger a structured CONNECT deny log with the same fields as the initial decision log (see [Unified logging](#unified-logging)). The `route_inference_request` helper returns `Result<bool>` where `true` means the request was routed and `false` means the request was not allowed by policy and was denied inline.
 
 The interception steps:
 
@@ -677,10 +677,10 @@ The interception steps:
 6. **Response handling**:
    - On success: the router's response (status code, headers, body) is formatted as an HTTP/1.1 response and sent back to the client after stripping response framing/hop-by-hop headers (`transfer-encoding`, `content-length`, `connection`, etc.)
    - On router failure: the error is mapped to an HTTP status code via `router_error_to_http()` and returned as a JSON error body (see error table below)
+   - Non-inference requests: returns `403 Forbidden` with a JSON error body (`{"error": "connection not allowed by policy"}`)
 
-7. **Connection lifecycle**: The handler loops to process multiple HTTP requests on the same connection (HTTP keep-alive). The loop ends when the client closes the connection or an unrecoverable error occurs. Once at least one request has been successfully routed (`routed_any` flag), subsequent failures (client disconnect, I/O error, payload too large, non-inference request) are treated as clean termination (`InferenceOutcome::Routed`) rather than denials.
+7. **Connection lifecycle**: The handler loops to process multiple HTTP requests on the same connection (HTTP keep-alive). The loop ends when the client closes the connection or an unrecoverable error occurs. Once at least one request has been successfully routed (`routed_any` flag), subsequent failures (client disconnect, I/O error, payload too large, request not allowed by policy) are treated as clean termination (`InferenceOutcome::Routed`) rather than denials.
 
 ### Router error to HTTP mapping
 
### Router error to HTTP mapping
686
686
@@ -1118,8 +1118,8 @@ The sandbox uses `miette` for error reporting and `thiserror` for typed errors.
 | Inference interception: no compatible route | 400 Bad Request with JSON error body |
 | Inference interception: backend timeout/unavailable | 503 Service Unavailable with JSON error body |
 | Inference interception: backend protocol error | 502 Bad Gateway with JSON error body |
-| Inference interception: non-inference request (no prior routing) | 403 Forbidden with JSON error body + structured CONNECT deny log |
-| Inference interception: non-inference request (after prior routing) | 403 Forbidden with JSON error body (no deny log, connection counts as routed) |
+| Inference interception: request not allowed by policy (no prior routing) | 403 Forbidden with JSON error body + structured CONNECT deny log |
+| Inference interception: request not allowed by policy (after prior routing) | 403 Forbidden with JSON error body (no deny log, connection counts as routed) |
 | Log push gRPC connection fails | Task prints to stderr and exits; logs not pushed for sandbox lifetime |
 | Log push mpsc channel full (1024 lines) | Event dropped silently; logging never blocks |
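
The "no prior routing" versus "after prior routing" distinction in the table rows above follows from the `routed_any` flag described in the connection lifecycle step. The sketch below models that rule in isolation. It is a simplified illustration, not the handler in the sandbox crate: `Outcome`, `RequestResult`, `classify_connection`, and `emits_deny_log` are hypothetical names, and a connection that ends without any requests is assumed to count as a client disconnect.

```rust
/// Simplified stand-in for the connection-level outcome.
#[derive(Debug, PartialEq)]
enum Outcome {
    Routed,
    Denied { reason: String },
}

/// Hypothetical per-request result on a keep-alive connection.
#[derive(Debug, Clone, Copy)]
enum RequestResult {
    Routed,
    Failed(&'static str),
}

/// Walk the requests on one connection and apply the `routed_any` rule:
/// the first failure ends the loop, and it only counts as a denial if
/// nothing was routed before it.
fn classify_connection(results: &[RequestResult]) -> Outcome {
    let mut routed_any = false;
    for result in results {
        match *result {
            RequestResult::Routed => routed_any = true,
            RequestResult::Failed(reason) => {
                return if routed_any {
                    Outcome::Routed
                } else {
                    Outcome::Denied { reason: reason.to_string() }
                };
            }
        }
    }
    // Assumption: running out of requests means the client disconnected.
    if routed_any {
        Outcome::Routed
    } else {
        Outcome::Denied { reason: "client disconnect".to_string() }
    }
}

/// Only denied connections produce the second, structured deny log.
fn emits_deny_log(outcome: &Outcome) -> bool {
    matches!(outcome, Outcome::Denied { .. })
}
```

For example, a connection whose first request is rejected yields `Denied` and a deny log, while the same rejection after one routed request yields `Routed` and no second log, even though the client sees a 403 in both cases.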