Commit c06117e

fix(proxy): return 403 for non-CONNECT requests, add deny logging, and revise error messages (#79)
Closes NVIDIA#42

- Change the non-CONNECT proxy response from 405 to 403 to align with how CONNECT denials are surfaced
- Add structured deny logging for non-CONNECT requests, with hostname extraction from absolute-form URIs
- Revise 7 user-facing error messages across proxy.rs and dev-sandbox-policy.rego to follow a consistent principle: generic policy-deny messages for non-inference requests, descriptive messages for recognized inference endpoints
- Update the E2E test assertion to match the new error message
- Update the architecture docs to reflect the new behavior

Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
Parent: 9099bc3

5 files changed: 107 additions & 24 deletions


architecture/inference-routing.md

Lines changed: 5 additions & 5 deletions
@@ -142,7 +142,7 @@ File mode does not spawn a refresh task -- routes are static for the sandbox lif
 Both route source modes degrade gracefully when routes are unavailable:

 - **Empty routes in file mode**: If `routes: []` in the file, `build_inference_context()` returns `None` and inference routing is disabled. This is confirmed by the `build_inference_context_empty_route_file_returns_none` test.
-- **Empty routes in cluster mode**: If the initial cluster bundle has zero routes, the sandbox still creates `InferenceContext` with an empty cache and starts background refresh. Intercepted inference requests return `503` (`{"error": "no inference routes configured"}`) until a later refresh provides routes.
+- **Empty routes in cluster mode**: If the initial cluster bundle has zero routes, the sandbox still creates `InferenceContext` with an empty cache and starts background refresh. Intercepted inference requests return `503` (`{"error": "inference endpoint detected without matching inference route"}`) until a later refresh provides routes.
 - **Cluster mode errors**: `PermissionDenied` or `NotFound` errors (detected via string matching on the gRPC error message) indicate no inference policy is configured for this sandbox. The sandbox logs this and proceeds without inference routing. Other gRPC errors also result in graceful degradation: inference routing is disabled, but the sandbox starts normally.
 - **File mode errors**: Parse failures or missing files in standalone mode are fatal -- `build_inference_context()` propagates the error and the sandbox refuses to start. Only an empty-but-valid routes list is gracefully disabled.

@@ -253,11 +253,11 @@ Built at sandbox startup in `crates/navigator-sandbox/src/lib.rs` by `build_infe
 - If `detect_inference_pattern()` matches:
   - Strip credential and framing/hop-by-hop headers (`Authorization`, `x-api-key`, `host`, `content-length`, and all hop-by-hop headers)
   - Acquire a read lock on the route cache
-  - If routes are empty, return `503` JSON: `{"error": "no inference routes configured"}`
+  - If routes are empty, return `503` JSON: `{"error": "inference endpoint detected without matching inference route"}`
   - Call `Router::proxy_with_candidates()` to select a route and forward the request locally
   - Return the backend's response to the client (response hop-by-hop and framing headers are stripped before formatting)
 - If no pattern matches:
-  - Return a `403` JSON error: `{"error": "only inference API calls are allowed on this connection"}`
+  - Return a `403` JSON error: `{"error": "connection not allowed by policy"}`
 - If the router call fails:
   - Map the `RouterError` to an HTTP status via `router_error_to_http()` and return a JSON error

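The header-stripping step in the hunk above can be illustrated in isolation. This is a minimal sketch, not the crate's actual code: the function name `strip_sensitive_headers`, the header representation, and the exact header list are assumptions for illustration.

```rust
// Sketch of stripping credential and hop-by-hop headers before forwarding.
// NOTE: hypothetical helper; the real logic lives in proxy.rs and may differ.
fn strip_sensitive_headers(headers: &mut Vec<(String, String)>) {
    const STRIP: &[&str] = &[
        // credential and framing headers named in the docs above
        "authorization",
        "x-api-key",
        "host",
        "content-length",
        // standard hop-by-hop headers
        "connection",
        "keep-alive",
        "proxy-authenticate",
        "proxy-authorization",
        "te",
        "trailer",
        "transfer-encoding",
        "upgrade",
    ];
    // Header names are case-insensitive, so compare lowercased.
    headers.retain(|(name, _)| !STRIP.contains(&name.to_ascii_lowercase().as_str()));
}

fn main() {
    let mut headers = vec![
        ("Authorization".to_string(), "Bearer secret".to_string()),
        ("Content-Type".to_string(), "application/json".to_string()),
        ("Transfer-Encoding".to_string(), "chunked".to_string()),
    ];
    strip_sensitive_headers(&mut headers);
    // Only the non-sensitive header survives.
    assert_eq!(headers.len(), 1);
    assert_eq!(headers[0].0, "Content-Type");
}
```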
@@ -634,8 +634,8 @@ The inference routing migration is a breaking protocol change. The `ProxyInferen
 | `InferenceContext` missing | Error: "InspectForInference requires inference context (router + routes)" |
 | TLS state not configured | Error: "InspectForInference requires TLS state for client termination" |
 | Request exceeds 10 MiB buffer | `413` Payload Too Large response to client |
-| Non-inference request on intercepted connection | `403` JSON error: `{"error": "only inference API calls are allowed on this connection"}` |
-| No routes in cache | `503` JSON error: `{"error": "no inference routes configured"}` |
+| Non-inference request on intercepted connection | `403` JSON error: `{"error": "connection not allowed by policy"}` |
+| No routes in cache | `503` JSON error: `{"error": "inference endpoint detected without matching inference route"}` |
 | Router returns `NoCompatibleRoute` | `400` JSON error |
 | Backend timeout or connection failure | `503` JSON error |
 | Backend protocol error or internal error | `502` JSON error |
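The last three table rows imply a small total mapping from router errors to HTTP statuses. A hedged sketch follows; the `RouterError` variant names and messages here are assumptions for illustration, not the crate's actual definitions.

```rust
// Hypothetical sketch of the RouterError -> HTTP mapping implied by the table
// above. The real `router_error_to_http()` in proxy.rs may use different
// variant names and messages.
#[derive(Debug)]
enum RouterError {
    NoCompatibleRoute,
    BackendTimeout,
    BackendConnectionFailed,
    BackendProtocolError,
    Internal(String),
}

// Maps a router error to (HTTP status, user-facing message).
fn router_error_to_http(err: &RouterError) -> (u16, &'static str) {
    match err {
        // No route is compatible with the request: client-side problem -> 400.
        RouterError::NoCompatibleRoute => (400, "no compatible inference route"),
        // Backend unreachable or timed out: service unavailable -> 503.
        RouterError::BackendTimeout | RouterError::BackendConnectionFailed => {
            (503, "inference backend unavailable")
        }
        // Backend misbehaved or internal failure: bad gateway -> 502.
        RouterError::BackendProtocolError | RouterError::Internal(_) => {
            (502, "inference backend error")
        }
    }
}

fn main() {
    assert_eq!(router_error_to_http(&RouterError::NoCompatibleRoute).0, 400);
    assert_eq!(router_error_to_http(&RouterError::BackendTimeout).0, 503);
    assert_eq!(router_error_to_http(&RouterError::Internal("boom".into())).0, 502);
}
```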

architecture/sandbox.md

Lines changed: 9 additions & 9 deletions
@@ -270,7 +270,7 @@ Uses the same input JSON shape as `evaluate_network()`. Evaluates the `data.navi

 - `"allow"` -- endpoint + binary explicitly matched in a network policy
 - `"inspect_for_inference"` -- no policy match but `inference.allowed_routes` is non-empty
-- `"deny"` -- no matching policy and no inference routing configured
+- `"deny"` -- network connections not allowed by policy

 The Rego logic:
 1. If `network_policy_for_request` exists (endpoint + binary match), return `"allow"`
@@ -582,7 +582,7 @@ Startup steps:

 ### Request parsing

-The proxy reads up to 8192 bytes (`MAX_HEADER_BYTES`) looking for `\r\n\r\n`. It validates the method is `CONNECT` (returning 405 for anything else) and parses the `host:port` target.
+The proxy reads up to 8192 bytes (`MAX_HEADER_BYTES`) looking for `\r\n\r\n`. It validates the method is `CONNECT` (returning 403 for anything else with a structured log) and parses the `host:port` target.

 ### Control-plane bypass

@@ -632,7 +632,7 @@ The `action` field carries the matched policy name (for `Allow` and `InspectForI

 Every CONNECT request produces an `info!()` log line with all context: source/destination addresses, binary path, PID, ancestor chain, cmdline paths, action (`allow`, `inspect_for_inference`, or `deny`), engine, matched policy, and deny reason.

-For `InspectForInference` connections, the initial log records `action=inspect_for_inference`. If the subsequent inference interception fails (TLS handshake failure, client disconnect, non-inference request, payload too large, missing context, or I/O error), a second `CONNECT` log is emitted with `action=deny` and a `reason` describing the failure. Successfully routed connections produce no second log. This two-log pattern gives operators visibility into why an `inspect_for_inference` decision ultimately resulted in a denial.
+For `InspectForInference` connections, the initial log records `action=inspect_for_inference`. If the subsequent inference interception fails (TLS handshake failure, client disconnect, request not allowed by policy, payload too large, missing context, or I/O error), a second `CONNECT` log is emitted with `action=deny` and a `reason` describing the failure. Successfully routed connections produce no second log. This two-log pattern gives operators visibility into why an `inspect_for_inference` decision ultimately resulted in a denial.

 ### SSRF protection (internal IP rejection)

@@ -651,7 +651,7 @@ enum InferenceOutcome {
 }
 ```

-Every exit path in `handle_inference_interception` produces an explicit outcome. The `Denied` variant carries a human-readable reason describing the failure. At the call site in `handle_tcp_connection`, `Denied` outcomes (and `Err` results) trigger a structured CONNECT deny log with the same fields as the initial decision log (see [Unified logging](#unified-logging)). The `route_inference_request` helper returns `Result<bool>` where `true` means the request was routed and `false` means it was a non-inference request that was denied inline.
+Every exit path in `handle_inference_interception` produces an explicit outcome. The `Denied` variant carries a human-readable reason describing the failure. At the call site in `handle_tcp_connection`, `Denied` outcomes (and `Err` results) trigger a structured CONNECT deny log with the same fields as the initial decision log (see [Unified logging](#unified-logging)). The `route_inference_request` helper returns `Result<bool>` where `true` means the request was routed and `false` means the request was not allowed by policy and was denied inline.

 The interception steps:

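The call-site handling of `InferenceOutcome` described in the hunk above can be sketched as follows. The shapes here are assumptions for illustration (the actual enum and logging calls in proxy.rs differ; `final_deny_log` is a hypothetical helper that models only the decision of whether a second deny log is emitted).

```rust
// Sketch: a `Denied` outcome produces a second CONNECT log with action=deny;
// a `Routed` outcome produces no second log. Hypothetical shapes, not the
// actual proxy.rs types.
enum InferenceOutcome {
    Routed,
    Denied { reason: String },
}

// Returns the deny log line to emit, if any.
fn final_deny_log(outcome: &InferenceOutcome) -> Option<String> {
    match outcome {
        // Successfully routed connections produce no second log.
        InferenceOutcome::Routed => None,
        // Denied outcomes carry a human-readable reason into the deny log.
        InferenceOutcome::Denied { reason } => {
            Some(format!("CONNECT action=deny reason={reason}"))
        }
    }
}

fn main() {
    assert!(final_deny_log(&InferenceOutcome::Routed).is_none());
    let line = final_deny_log(&InferenceOutcome::Denied {
        reason: "connection not allowed by policy".to_string(),
    })
    .unwrap();
    assert!(line.contains("action=deny"));
}
```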
@@ -677,10 +677,10 @@ The interception steps:
 6. **Response handling**:
    - On success: the router's response (status code, headers, body) is formatted as an HTTP/1.1 response and sent back to the client after stripping response framing/hop-by-hop headers (`transfer-encoding`, `content-length`, `connection`, etc.)
    - On router failure: the error is mapped to an HTTP status code via `router_error_to_http()` and returned as a JSON error body (see error table below)
-   - Empty route cache: returns `503` JSON error (`{"error": "no inference routes configured"}`)
-   - Non-inference requests: returns `403 Forbidden` with a JSON error body (`{"error": "only inference API calls are allowed on this connection"}`)
+   - Empty route cache: returns `503` JSON error (`{"error": "inference endpoint detected without matching inference route"}`)
+   - Non-inference requests: returns `403 Forbidden` with a JSON error body (`{"error": "connection not allowed by policy"}`)

-7. **Connection lifecycle**: The handler loops to process multiple HTTP requests on the same connection (HTTP keep-alive). The loop ends when the client closes the connection or an unrecoverable error occurs. Once at least one request has been successfully routed (`routed_any` flag), subsequent failures (client disconnect, I/O error, payload too large, non-inference request) are treated as clean termination (`InferenceOutcome::Routed`) rather than denials.
+7. **Connection lifecycle**: The handler loops to process multiple HTTP requests on the same connection (HTTP keep-alive). The loop ends when the client closes the connection or an unrecoverable error occurs. Once at least one request has been successfully routed (`routed_any` flag), subsequent failures (client disconnect, I/O error, payload too large, request not allowed by policy) are treated as clean termination (`InferenceOutcome::Routed`) rather than denials.

 ### Router error to HTTP mapping

@@ -1118,8 +1118,8 @@ The sandbox uses `miette` for error reporting and `thiserror` for typed errors.
 | Inference interception: no compatible route | 400 Bad Request with JSON error body |
 | Inference interception: backend timeout/unavailable | 503 Service Unavailable with JSON error body |
 | Inference interception: backend protocol error | 502 Bad Gateway with JSON error body |
-| Inference interception: non-inference request (no prior routing) | 403 Forbidden with JSON error body + structured CONNECT deny log |
-| Inference interception: non-inference request (after prior routing) | 403 Forbidden with JSON error body (no deny log, connection counts as routed) |
+| Inference interception: request not allowed by policy (no prior routing) | 403 Forbidden with JSON error body + structured CONNECT deny log |
+| Inference interception: request not allowed by policy (after prior routing) | 403 Forbidden with JSON error body (no deny log, connection counts as routed) |
 | Log push gRPC connection fails | Task prints to stderr and exits; logs not pushed for sandbox lifetime |
 | Log push mpsc channel full (1024 lines) | Event dropped silently; logging never blocks |
 | Log push gRPC stream breaks | Push loop exits, flushes remaining batch |

crates/navigator-sandbox/src/proxy.rs

Lines changed: 88 additions & 8 deletions
@@ -229,7 +229,13 @@ async fn handle_tcp_connection(
     let target = parts.next().unwrap_or("");

     if method != "CONNECT" {
-        respond(&mut client, b"HTTP/1.1 405 Method Not Allowed\r\n\r\n").await?;
+        let target_host = extract_host_from_uri(target);
+        info!(
+            method = %method,
+            target_host = %target_host,
+            "Non-CONNECT proxy request denied"
+        );
+        respond(&mut client, b"HTTP/1.1 403 Forbidden\r\n\r\n").await?;
         return Ok(());
     }
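The decision in the hunk above can be modeled in isolation. This is a hypothetical sketch: `check_method` is an illustrative helper (the real handler calls `respond` on the client socket, which is elided here); it shows only that non-CONNECT methods now yield 403 rather than 405.

```rust
// Hypothetical model of the request-line method check: anything other than
// CONNECT is rejected with 403 Forbidden.
fn check_method(request_line: &str) -> Result<(), &'static str> {
    let mut parts = request_line.split_whitespace();
    let method = parts.next().unwrap_or("");
    if method != "CONNECT" {
        // Mirrors the new behavior: 403 instead of the old 405.
        return Err("HTTP/1.1 403 Forbidden\r\n\r\n");
    }
    Ok(())
}

fn main() {
    assert!(check_method("CONNECT example.com:443 HTTP/1.1").is_ok());
    assert!(check_method("GET http://example.com/ HTTP/1.1").is_err());
    assert_eq!(
        check_method("POST http://example.com/ HTTP/1.1").unwrap_err(),
        "HTTP/1.1 403 Forbidden\r\n\r\n"
    );
}
```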

@@ -748,7 +754,7 @@ async fn handle_inference_interception(

     let Some(ctx) = inference_ctx else {
         return Ok(InferenceOutcome::Denied {
-            reason: "missing inference context".to_string(),
+            reason: "connection not allowed by policy".to_string(),
         });
     };

@@ -805,7 +811,7 @@ async fn handle_inference_interception(
         routed_any = true;
     } else if !routed_any {
         return Ok(InferenceOutcome::Denied {
-            reason: "non-inference request".to_string(),
+            reason: "connection not allowed by policy".to_string(),
         });
     }

@@ -861,7 +867,7 @@ async fn route_inference_request(
     let routes = ctx.routes.read().await;

     if routes.is_empty() {
-        let body = serde_json::json!({"error": "no inference routes configured"});
+        let body = serde_json::json!({"error": "inference endpoint detected without matching inference route"});
         let body_bytes = body.to_string();
         let response = format_http_response(
             503,
@@ -891,7 +897,7 @@ async fn route_inference_request(
             write_all(tls_client, &response).await?;
         }
         Err(e) => {
-            warn!(error = %e, "Local inference routing failed");
+            warn!(error = %e, "inference endpoint detected but upstream service failed");
             let (status, msg) = router_error_to_http(&e);
             let body = serde_json::json!({"error": msg});
             let body_bytes = body.to_string();
@@ -909,10 +915,9 @@ async fn route_inference_request(
         info!(
             method = %request.method,
             path = %request.path,
-            "Non-inference request denied (inference-only mode)"
+            "connection not allowed by policy"
         );
-        let body =
-            serde_json::json!({"error": "only inference API calls are allowed on this connection"});
+        let body = serde_json::json!({"error": "connection not allowed by policy"});
         let body_bytes = body.to_string();
         let response = format_http_response(
             403,
@@ -1226,6 +1231,33 @@ fn query_allowed_ips(
     }
 }

+/// Extract the hostname from an absolute-form URI used in plain HTTP proxy requests.
+///
+/// For example, `"http://example.com/path"` yields `"example.com"` and
+/// `"http://example.com:8080/path"` yields `"example.com"`. Returns `"unknown"`
+/// if the URI cannot be parsed.
+fn extract_host_from_uri(uri: &str) -> String {
+    // Absolute-form URIs look like "http://host[:port]/path"
+    // Strip the scheme prefix, then extract the authority (host[:port]) before the first '/'.
+    let after_scheme = uri.find("://").map(|i| &uri[i + 3..]).unwrap_or(uri);
+    let authority = after_scheme.split('/').next().unwrap_or(after_scheme);
+    // Strip port if present (handle IPv6 bracket notation)
+    let host = if authority.starts_with('[') {
+        // IPv6: [::1]:port
+        authority
+            .find(']')
+            .map(|i| &authority[..=i])
+            .unwrap_or(authority)
+    } else {
+        authority.split(':').next().unwrap_or(authority)
+    };
+    if host.is_empty() {
+        "unknown".to_string()
+    } else {
+        host.to_string()
+    }
+}
+
 fn parse_target(target: &str) -> Result<(String, u16)> {
     let (host, port_str) = target
         .split_once(':')
@@ -1698,4 +1730,52 @@ mod tests {
             "expected 'not in allowed_ips' in error: {err}"
         );
     }
+
+    // --- extract_host_from_uri tests ---
+
+    #[test]
+    fn test_extract_host_from_http_uri() {
+        assert_eq!(
+            extract_host_from_uri("http://example.com/path"),
+            "example.com"
+        );
+    }
+
+    #[test]
+    fn test_extract_host_from_https_uri() {
+        assert_eq!(
+            extract_host_from_uri("https://api.openai.com/v1/chat/completions"),
+            "api.openai.com"
+        );
+    }
+
+    #[test]
+    fn test_extract_host_from_uri_with_port() {
+        assert_eq!(
+            extract_host_from_uri("http://example.com:8080/path"),
+            "example.com"
+        );
+    }
+
+    #[test]
+    fn test_extract_host_from_uri_ipv6() {
+        assert_eq!(extract_host_from_uri("http://[::1]:8080/path"), "[::1]");
+    }
+
+    #[test]
+    fn test_extract_host_from_uri_no_path() {
+        assert_eq!(extract_host_from_uri("http://example.com"), "example.com");
+    }
+
+    #[test]
+    fn test_extract_host_from_uri_empty() {
+        assert_eq!(extract_host_from_uri(""), "unknown");
+    }
+
+    #[test]
+    fn test_extract_host_from_uri_malformed() {
+        // Gracefully handles garbage input
+        let result = extract_host_from_uri("not-a-uri");
+        assert!(!result.is_empty());
+    }
 }

dev-sandbox-policy.rego

Lines changed: 1 addition & 1 deletion
@@ -54,7 +54,7 @@ deny_reason := reason if {
     reason := concat("; ", all_reasons)
 }

-deny_reason := "no matching policy and no inference routing configured" if {
+deny_reason := "network connections not allowed by policy" if {
     input.network
     input.exec
     not network_policy_for_request

e2e/python/test_inference_routing.py

Lines changed: 4 additions & 1 deletion
@@ -125,7 +125,10 @@ def call_chat_completions() -> str:
     assert initial.exit_code == 0, f"stderr: {initial.stderr}"
     initial_output = initial.stdout.strip()
     assert initial_output.startswith("http_error_503"), initial_output
-    assert "no inference routes configured" in initial_output, initial_output
+    assert (
+        "inference endpoint detected without matching inference route"
+        in initial_output
+    ), initial_output

     inference_client.create(
         name=route_name,
