Commit f37b69b

feat(sandbox): auto-detect TLS and terminate unconditionally for credential injection (#544)
Closes #533

The proxy now auto-detects TLS by peeking the first bytes of each connection. When TLS is detected, it terminates unconditionally, enabling credential injection and optional L7 inspection without requiring explicit 'tls: terminate' in the policy.
1 parent 79c1ce1 commit f37b69b

File tree

20 files changed

+1051
-232
lines changed

.agents/skills/generate-sandbox-policy/examples.md

Lines changed: 8 additions & 18 deletions
Original file line number | Diff line number | Diff line change
@@ -2,6 +2,13 @@
22

33
Examples organized by detail tier — from minimal (just host + intent) to full (complete API docs).
44

5+
> **TLS note:** TLS termination is automatic. The proxy auto-detects TLS by
6+
> peeking the first bytes of each connection, so there is no need to specify
7+
> `tls: terminate` in policies. The `tls: terminate` and `tls: passthrough`
8+
> values are deprecated. If you have an edge case where auto-detection must
9+
> be bypassed, you can set `tls: skip` to disable TLS interception for that
10+
> endpoint.
11+
512
---
613

714
## Minimal Tier Examples (host + intent, no API docs)
@@ -23,7 +30,7 @@ network_policies:
2330
- { path: /usr/local/bin/claude }
2431
```
2532
26-
No `protocol`, `tls`, `rules`, or `access` — this is pure L4 (host:port + binary identity check).
33+
No `protocol`, `rules`, or `access` — this is pure L4 (host:port + binary identity check).
2734

2835
---
2936

@@ -41,7 +48,6 @@ network_policies:
4148
- host: api.github.com
4249
port: 443
4350
protocol: rest
44-
tls: terminate
4551
enforcement: enforce
4652
access: read-only
4753
binaries:
@@ -64,7 +70,6 @@ network_policies:
6470
- host: integrate.api.nvidia.com
6571
port: 443
6672
protocol: rest
67-
tls: terminate
6873
enforcement: enforce
6974
access: full
7075
binaries:
@@ -109,13 +114,11 @@ network_policies:
109114
- host: api.github.com
110115
port: 443
111116
protocol: rest
112-
tls: terminate
113117
enforcement: enforce
114118
access: read-only
115119
- host: api.gitlab.com
116120
port: 443
117121
protocol: rest
118-
tls: terminate
119122
enforcement: enforce
120123
access: read-only
121124
binaries:
@@ -155,7 +158,6 @@ network_policies:
155158
- host: api.openai.com
156159
port: 443
157160
protocol: rest
158-
tls: terminate
159161
enforcement: enforce
160162
rules:
161163
- allow:
@@ -202,7 +204,6 @@ network_policies:
202204
- host: integrate.api.nvidia.com
203205
port: 443
204206
protocol: rest
205-
tls: terminate
206207
enforcement: enforce
207208
rules:
208209
- allow:
@@ -236,7 +237,6 @@ network_policies:
236237
- host: api.github.com
237238
port: 443
238239
protocol: rest
239-
tls: terminate
240240
enforcement: enforce
241241
rules:
242242
- allow:
@@ -291,7 +291,6 @@ Endpoints:
291291
- Methods: GET, HEAD, OPTIONS only
292292
- Paths: All paths (user wants to browse freely)
293293
- This maps exactly to the `read-only` preset
294-
- Port 443 + L7 rules → needs `tls: terminate`
295294
296295
### Output
297296
@@ -303,7 +302,6 @@ network_policies:
303302
- host: api.github.com
304303
port: 443
305304
protocol: rest
306-
tls: terminate
307305
enforcement: enforce
308306
access: read-only
309307
binaries:
@@ -343,7 +341,6 @@ Endpoints:
343341
- Scope: `integrate.api.nvidia.com:443`
344342
- Methods: POST on `/v1/chat/completions`, GET on `/v1/models` and `/v1/models/*`
345343
- No preset fits — need explicit rules
346-
- Port 443 + L7 → `tls: terminate`
347344
- Two binaries
348345

349346
### Output
@@ -356,7 +353,6 @@ network_policies:
356353
- host: integrate.api.nvidia.com
357354
port: 443
358355
protocol: rest
359-
tls: terminate
360356
enforcement: enforce
361357
rules:
362358
- allow:
@@ -490,7 +486,6 @@ paths:
490486
- Tasks: GET, POST, PUT, DELETE on `/projects/*/tasks` and `/projects/*/tasks/*`
491487
- Members: GET only on `/projects/*/members`
492488
- Admin: No rules = denied by default
493-
- Port 443 + L7 → `tls: terminate`
494489

495490
### Output
496491

@@ -502,7 +497,6 @@ network_policies:
502497
- host: pm-api.example.com
503498
port: 443
504499
protocol: rest
505-
tls: terminate
506500
enforcement: enforce
507501
rules:
508502
# Projects — full CRUD
@@ -606,7 +600,6 @@ network_policies:
606600
- host: metrics.corp.com
607601
port: 443
608602
protocol: rest
609-
tls: terminate
610603
enforcement: enforce
611604
rules:
612605
- allow:
@@ -747,7 +740,6 @@ An exact IP is treated as `/32` — only that specific address is permitted.
747740
- host: api.github.com
748741
port: 443
749742
protocol: rest
750-
tls: terminate
751743
enforcement: enforce
752744
access: read-only
753745
binaries:
@@ -849,7 +841,6 @@ network_policies:
849841
- host: api.github.com
850842
port: 443
851843
protocol: rest
852-
tls: terminate
853844
enforcement: enforce
854845
access: read-only
855846
binaries:
@@ -861,7 +852,6 @@ network_policies:
861852
- host: api.anthropic.com
862853
port: 443
863854
protocol: rest
864-
tls: terminate
865855
enforcement: enforce
866856
access: full
867857
binaries:

architecture/gateway-security.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -420,7 +420,7 @@ This section defines the primary attacker profiles, what the current design prot
420420

421421
Separate from the cluster mTLS infrastructure, each sandbox has an independent TLS capability for inspecting outbound HTTPS traffic. This is documented here for completeness because it involves a distinct, per-sandbox PKI.
422422

423-
When a sandbox policy configures `tls: terminate` on an endpoint, the sandbox proxy performs TLS man-in-the-middle inspection:
423+
The sandbox proxy automatically detects and terminates TLS on outbound HTTPS connections by peeking the first bytes of each tunnel. This enables credential injection and L7 inspection without requiring explicit policy configuration. The proxy performs TLS man-in-the-middle inspection:
424424

425425
1. **Ephemeral sandbox CA**: a per-sandbox CA (`CN=OpenShell Sandbox CA, O=OpenShell`) is generated at sandbox startup. This CA is completely independent of the cluster mTLS CA.
426426
2. **Trust injection**: the sandbox CA is written to the sandbox filesystem and injected via `NODE_EXTRA_CA_CERTS` and `SSL_CERT_FILE` so processes inside the sandbox trust it.

architecture/policy-advisor.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -59,7 +59,7 @@ The `mechanistic_mapper` module (`crates/openshell-sandbox/src/mechanistic_mappe
5959
- Port recognition (well-known ports like 443, 5432 get a boost)
6060
- SSRF origin (SSRF denials get lower confidence)
6161
6. Generates security notes for private IPs, database ports, and ephemeral port ranges
62-
7. If L7 request samples are present, generates specific L7 rules (method + path) with `protocol: rest` and `tls: terminate` (plumbed but not yet fed data — see issue #205)
62+
7. If L7 request samples are present, generates specific L7 rules (method + path) with `protocol: rest` (TLS termination is automatic — no `tls` field needed). Plumbed but not yet fed data — see issue #205.
6363

6464
The mapper runs in `flush_proposals_to_gateway` after the aggregator drains. It produces `PolicyChunk` protos that are sent alongside the raw `DenialSummary` protos to the gateway.
6565

architecture/sandbox.md

Lines changed: 82 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -27,10 +27,10 @@ All paths are relative to `crates/openshell-sandbox/src/`.
2727
| `sandbox/linux/seccomp.rs` | Syscall filtering via BPF on `SYS_socket` |
2828
| `bypass_monitor.rs` | Background `/dev/kmsg` reader for iptables bypass detection events |
2929
| `sandbox/linux/netns.rs` | Network namespace creation, veth pair setup, bypass detection iptables rules, cleanup on drop |
30-
| `l7/mod.rs` | L7 types (`L7Protocol`, `TlsMode`, `EnforcementMode`, `L7EndpointConfig`), config parsing, validation, access preset expansion |
30+
| `l7/mod.rs` | L7 types (`L7Protocol`, `TlsMode`, `EnforcementMode`, `L7EndpointConfig`), config parsing, validation, access preset expansion, deprecated `tls` value handling |
3131
| `l7/inference.rs` | Inference API pattern detection (`detect_inference_pattern()`), HTTP request/response parsing and formatting for intercepted inference connections |
32-
| `l7/tls.rs` | Ephemeral CA generation (`SandboxCa`), per-hostname leaf cert cache (`CertCache`), TLS termination/connection helpers |
33-
| `l7/relay.rs` | Protocol-aware bidirectional relay with per-request OPA evaluation |
32+
| `l7/tls.rs` | Ephemeral CA generation (`SandboxCa`), per-hostname leaf cert cache (`CertCache`), TLS termination/connection helpers, `looks_like_tls()` auto-detection |
33+
| `l7/relay.rs` | Protocol-aware bidirectional relay with per-request OPA evaluation, credential-injection-only passthrough relay |
3434
| `l7/rest.rs` | HTTP/1.1 request/response parsing, body framing (Content-Length, chunked), deny response generation |
3535
| `l7/provider.rs` | `L7Provider` trait and `L7Request`/`BodyLength` types |
3636

@@ -674,11 +674,26 @@ sequenceDiagram
674674
else All IPs public
675675
P->>U: TCP connect (resolved addrs)
676676
P-->>S: HTTP/1.1 200 Connection Established
677-
alt L7 config present
678-
P->>P: TLS termination / protocol detection
679-
P->>P: Per-request L7 evaluation
680-
else L4-only
677+
alt tls: skip
681678
P->>P: copy_bidirectional (raw tunnel)
679+
else Auto-detect
680+
P->>P: Peek first bytes
681+
alt TLS detected
682+
P->>P: TLS terminate (MITM)
683+
alt L7 config present
684+
P->>P: relay_with_inspection (per-request L7 evaluation)
685+
else No L7 config
686+
P->>P: relay_passthrough_with_credentials (credential injection)
687+
end
688+
else HTTP detected
689+
alt L7 config present
690+
P->>P: relay_with_inspection
691+
else No L7 config
692+
P->>P: relay_passthrough_with_credentials
693+
end
694+
else Neither TLS nor HTTP
695+
P->>P: copy_bidirectional (raw tunnel)
696+
end
682697
end
683698
end
684699
end
@@ -876,20 +891,45 @@ flowchart TD
876891

877892
`ResolvedRoute` has a custom `Debug` implementation in `crates/openshell-router/src/config.rs` that redacts the `api_key` field, printing `[REDACTED]` instead of the actual value. This prevents key leakage in log output and debug traces.
878893

879-
### Post-decision: L7 dispatch or raw tunnel (`Allow` path)
894+
### Post-decision: auto-TLS detection, L7 dispatch, or raw tunnel (`Allow` path)
880895

881-
After a CONNECT is allowed, the SSRF check passes, and the upstream TCP connection is established:
896+
After a CONNECT is allowed, the SSRF check passes, and the upstream TCP connection is established, the proxy determines how to handle the tunnel traffic. TLS detection is automatic — the proxy peeks the first bytes of the client stream to decide.
882897

883898
1. **Query L7 config**: `query_l7_config()` asks the OPA engine for `matched_endpoint_config`. If the endpoint has a `protocol` field, parse it into `L7EndpointConfig`.
884899

885-
2. **L7 inspection** (if config present):
886-
- Clone the OPA engine for per-tunnel evaluation (`clone_engine_for_tunnel()`)
887-
- Build `L7EvalContext` with host, port, policy name, binary path, ancestors, cmdline paths
888-
- Branch on TLS mode:
889-
- `TlsMode::Terminate`: MITM via `tls_terminate_client()` + `tls_connect_upstream()`, then `relay_with_inspection()`
890-
- `TlsMode::Passthrough`: Peek first bytes on raw TCP; if `looks_like_http()` matches, run `relay_with_inspection()`; reject on protocol mismatch
900+
2. **Check for `tls: skip`**: If the endpoint has `tls: skip`, bypass all auto-detection and relay raw bytes via `copy_bidirectional()`. This is the escape hatch for client-cert mTLS or non-standard protocols.
891901

892-
3. **L4-only** (no L7 config): `tokio::io::copy_bidirectional()` for a raw tunnel
902+
3. **Peek and auto-detect**: Read up to 8 bytes from the client stream via `TcpStream::peek()`. Classify the traffic using `looks_like_tls()` (checks for TLS ClientHello record: byte 0 = `0x16`, bytes 1-2 = TLS version `0x03xx`) and `looks_like_http()` (checks for HTTP method prefix).
903+
904+
4. **TLS detected** (`is_tls = true`):
905+
- Terminate TLS unconditionally via `tls_terminate_client()` + `tls_connect_upstream()`. This happens for all HTTPS endpoints, not just those with L7 config.
906+
- If L7 config is present: clone the OPA engine (`clone_engine_for_tunnel()`), run `relay_with_inspection()` for per-request policy evaluation.
907+
- If no L7 config: run `relay_passthrough_with_credentials()` — parses HTTP minimally to inject credentials (via `SecretResolver`) and log requests, but does not evaluate L7 OPA rules. This enables credential injection on all HTTPS endpoints without requiring `protocol` in the policy.
908+
- If TLS state is not configured: fall back to raw `copy_bidirectional()` with a warning.
909+
910+
5. **Plaintext HTTP detected** (`is_http = true`, `is_tls = false`):
911+
- If L7 config present: clone OPA engine, run `relay_with_inspection()` directly on the plaintext streams.
912+
- If no L7 config: run `relay_passthrough_with_credentials()` for credential injection and observability.
913+
914+
6. **Neither TLS nor HTTP**: Raw `copy_bidirectional()` tunnel (binary protocols, SSH-over-CONNECT, etc.).
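
The branching in steps 2 through 6 can be summarised as a pure decision function. The Rust sketch below is illustrative only: `TunnelPlan` and `plan_tunnel` are hypothetical names that do not come from the crate, and the boolean inputs stand in for the endpoint's `tls: skip` flag, the results of `looks_like_tls()` and `looks_like_http()`, the presence of L7 config, and whether TLS state is configured.

```rust
/// Hypothetical summary of how an allowed CONNECT tunnel is handled.
#[derive(Debug, PartialEq, Eq)]
enum TunnelPlan {
    /// Relay raw bytes in both directions (`copy_bidirectional`).
    RawTunnel,
    /// Per-request L7 evaluation (`relay_with_inspection`).
    Inspect { terminate_tls: bool },
    /// Credential injection and logging only
    /// (`relay_passthrough_with_credentials`).
    CredentialsOnly { terminate_tls: bool },
}

/// Mirrors the documented order of checks; names and signature are assumptions.
fn plan_tunnel(
    tls_skip: bool,
    is_tls: bool,
    is_http: bool,
    has_l7_config: bool,
    tls_state_configured: bool,
) -> TunnelPlan {
    // Step 2: `tls: skip` bypasses all auto-detection.
    if tls_skip {
        return TunnelPlan::RawTunnel;
    }
    // Step 4, last bullet: TLS seen but no TLS state, so fall back to a raw
    // tunnel (the documented behaviour also logs a warning here).
    if is_tls && !tls_state_configured {
        return TunnelPlan::RawTunnel;
    }
    // Step 6: neither TLS nor HTTP, e.g. a binary protocol.
    if !is_tls && !is_http {
        return TunnelPlan::RawTunnel;
    }
    // Steps 4 and 5: the relay choice depends only on L7 config;
    // TLS termination happens first when a ClientHello was detected.
    if has_l7_config {
        TunnelPlan::Inspect { terminate_tls: is_tls }
    } else {
        TunnelPlan::CredentialsOnly { terminate_tls: is_tls }
    }
}
```

Keeping the decision separate from the I/O in this way makes each branch easy to unit test without opening sockets; whether the real proxy is structured like this is not stated in the document.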
915+
916+
```mermaid
917+
flowchart TD
918+
A["CONNECT allowed + upstream connected"] --> B["Query L7 config"]
919+
B --> C{"tls: skip?"}
920+
C -- Yes --> D["Raw copy_bidirectional"]
921+
C -- No --> E["Peek first bytes"]
922+
E --> F{"looks_like_tls?"}
923+
F -- Yes --> G["TLS terminate client + upstream"]
924+
G --> H{"L7 config?"}
925+
H -- Yes --> I["relay_with_inspection"]
926+
H -- No --> J["relay_passthrough_with_credentials<br/>(credential injection, no L7 rules)"]
927+
F -- No --> K{"looks_like_http?"}
928+
K -- Yes --> L{"L7 config?"}
929+
L -- Yes --> M["relay_with_inspection"]
930+
L -- No --> N["relay_passthrough_with_credentials"]
931+
K -- No --> O["Raw copy_bidirectional<br/>(binary protocol)"]
932+
```
893933

894934
## L7 Protocol-Aware Inspection
895935

@@ -918,7 +958,7 @@ flowchart LR
918958
| Type | Definition | Purpose |
919959
|------|-----------|---------|
920960
| `L7Protocol` | `Rest`, `Sql` | Supported application protocols |
921-
| `TlsMode` | `Passthrough`, `Terminate` | TLS handling strategy |
961+
| `TlsMode` | `Auto` (default), `Skip` | TLS handling strategy: `Auto` peeks first bytes and terminates if TLS is detected; `Skip` bypasses detection entirely |
922962
| `EnforcementMode` | `Audit`, `Enforce` | What to do on L7 deny (log-only vs block) |
923963
| `L7EndpointConfig` | `{ protocol, tls, enforcement }` | Per-endpoint L7 configuration |
924964
| `L7Decision` | `{ allowed, reason, matched_rule }` | Result of L7 evaluation |
@@ -943,39 +983,58 @@ Expansion happens in `expand_access_presets()` before the Rego engine loads the
943983
**Errors** (block startup):
944984
- `rules` and `access` both specified on same endpoint
945985
- `protocol` specified without `rules` or `access`
946-
- `tls: terminate` without a `protocol`
947986
- `protocol: sql` with `enforcement: enforce` (SQL parsing not available in v1)
948987
- Empty `rules` array (would deny all traffic)
949988

950989
**Warnings** (logged):
951-
- `protocol: rest` on port 443 without `tls: terminate` (L7 rules ineffective on encrypted traffic)
990+
- `tls: terminate` or `tls: passthrough` on any endpoint (deprecated — TLS termination is now automatic; use `tls: skip` to disable)
991+
- `tls: skip` with L7 rules on port 443 (L7 inspection cannot work on encrypted traffic)
952992
- Unknown HTTP method in rules
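
The deprecation warning for `tls: terminate` and `tls: passthrough` could be produced at parse time. The following Rust sketch shows one way to do that. It is an assumption-laden illustration: `parse_tls_mode` is a hypothetical helper, the literal `auto` value is assumed to be accepted, and mapping the deprecated values to `Auto` is inferred from the statement that termination is now automatic rather than confirmed by the source.

```rust
/// The two modes named in the L7 types table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TlsMode {
    Auto,
    Skip,
}

/// Hypothetical parser: returns the mode plus an optional warning to log.
/// Unknown values are rejected (an assumption; the source does not say).
fn parse_tls_mode(raw: Option<&str>) -> Result<(TlsMode, Option<String>), String> {
    match raw {
        // A missing field defaults to auto-detection.
        None | Some("auto") => Ok((TlsMode::Auto, None)),
        Some("skip") => Ok((TlsMode::Skip, None)),
        // Deprecated values still load, but produce a warning.
        Some(old @ ("terminate" | "passthrough")) => Ok((
            TlsMode::Auto,
            Some(format!(
                "`tls: {}` is deprecated; TLS termination is now automatic (use `tls: skip` to disable)",
                old
            )),
        )),
        Some(other) => Err(format!("unknown tls value: {}", other)),
    }
}
```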
953993

954-
### TLS termination
994+
### TLS termination (auto-detect)
955995

956996
**File:** `crates/openshell-sandbox/src/l7/tls.rs`
957997

958-
TLS termination enables the proxy to inspect HTTPS traffic by performing MITM decryption.
998+
TLS termination is automatic. The proxy peeks the first bytes of every CONNECT tunnel and terminates TLS whenever a ClientHello is detected. This enables credential injection and L7 inspection on all HTTPS endpoints without requiring explicit `tls: terminate` in the policy. The `tls` field defaults to `Auto`; use `tls: skip` to opt out entirely (e.g., for client-cert mTLS to upstream).
959999

9601000
**Ephemeral CA lifecycle:**
9611001
1. At sandbox startup, `SandboxCa::generate()` creates a self-signed CA (CN: "OpenShell Sandbox CA") using `rcgen`
9621002
2. The CA cert PEM and a combined bundle (system CAs + sandbox CA) are written to `/etc/openshell-tls/`
9631003
3. The sandbox CA cert path is set as `NODE_EXTRA_CA_CERTS` (additive for Node.js)
9641004
4. The combined bundle is set as `SSL_CERT_FILE`, `REQUESTS_CA_BUNDLE`, `CURL_CA_BUNDLE` (replaces defaults for OpenSSL, Python requests, curl)
9651005

1006+
**TLS auto-detection** (`looks_like_tls()`):
1007+
- Peeks up to 8 bytes from the client stream
1008+
- Checks for TLS ClientHello pattern: byte 0 = `0x16` (ContentType::Handshake), byte 1 = `0x03` (TLS major version), byte 2 ≤ `0x04` (minor version, covering SSL 3.0 through TLS 1.3)
1009+
- Returns `false` for plaintext HTTP, SSH, or other binary protocols
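
The byte checks listed above fit in a few lines. This is a minimal sketch written from the description in this section, not a copy of the function in `l7/tls.rs`; the signature taking a byte slice is an assumption.

```rust
/// Sketch of the documented check: a TLS handshake record starts with
/// content type 0x16, major version 0x03, and a minor version of at most 0x04.
fn looks_like_tls(peeked: &[u8]) -> bool {
    // Three bytes are needed: content type plus the two version bytes.
    if peeked.len() < 3 {
        return false;
    }
    peeked[0] == 0x16 && peeked[1] == 0x03 && peeked[2] <= 0x04
}
```

With this check, a peeked prefix such as `16 03 01 02 00` is classified as TLS, while `GET / HT` or `SSH-2.0-` is not, because their first byte is an ASCII letter rather than `0x16`.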
1010+
9661011
**Per-hostname leaf cert generation:**
9671012
- `CertCache` maps hostnames to `CertifiedLeaf` structs (cert chain + private key)
9681013
- First request for a hostname generates a leaf cert signed by the sandbox CA via `rcgen`
9691014
- Cache has a hard limit of 256 entries; on overflow, the entire cache is cleared (sufficient for sandbox scale)
9701015
- Each leaf cert chain contains two certs: the leaf and the CA
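
The clear-on-overflow policy described above trades eviction precision for simplicity. The sketch below illustrates the idea with a generic value type; `LeafCache` and its methods are hypothetical and stand in for `CertCache`, whose real API is not shown in this document.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a per-hostname leaf cert cache.
struct LeafCache<V> {
    max_entries: usize,
    entries: HashMap<String, V>,
}

impl<V: Clone> LeafCache<V> {
    fn new(max_entries: usize) -> Self {
        Self { max_entries, entries: HashMap::new() }
    }

    fn len(&self) -> usize {
        self.entries.len()
    }

    /// Return the cached value for `host`, generating it on first use.
    /// When the cache is full, everything is dropped before inserting.
    fn get_or_insert_with(&mut self, host: &str, generate: impl FnOnce() -> V) -> V {
        if let Some(existing) = self.entries.get(host) {
            return existing.clone();
        }
        if self.entries.len() >= self.max_entries {
            // Hard limit reached: clear the whole cache instead of tracking LRU order.
            self.entries.clear();
        }
        let value = generate();
        self.entries.insert(host.to_string(), value.clone());
        value
    }
}
```

In this sketch the limit would be 256 to match the documented value. The cost of a full clear is that previously seen hostnames need a fresh leaf cert on their next connection.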
9711016

972-
**Connection flow:**
1017+
**Connection flow (when TLS is detected):**
9731018
1. `tls_terminate_client()`: Accept TLS from the sandboxed client using a `ServerConfig` with the hostname-specific leaf cert. ALPN: `http/1.1`.
9741019
2. `tls_connect_upstream()`: Connect TLS to the real upstream using a `ClientConfig` with Mozilla root CAs (`webpki_roots`). ALPN: `http/1.1`.
975-
3. Proxy now holds plaintext on both sides and runs `relay_with_inspection()`.
1020+
3. Proxy now holds plaintext on both sides. If L7 config is present, runs `relay_with_inspection()`. Otherwise, runs `relay_passthrough_with_credentials()` for credential injection without L7 evaluation.
9761021

9771022
System CA bundles are searched at well-known paths: `/etc/ssl/certs/ca-certificates.crt` (Debian/Ubuntu), `/etc/pki/tls/certs/ca-bundle.crt` (RHEL), `/etc/ssl/ca-bundle.pem` (openSUSE), `/etc/ssl/cert.pem` (Alpine/macOS).
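
A search over those well-known paths might look like the following sketch. The function name, the injected `exists` predicate, and the first-match-wins order are assumptions made for illustration; only the four paths come from the text above.

```rust
use std::path::{Path, PathBuf};

/// Candidate bundle locations, in the order listed in the documentation.
const SYSTEM_CA_CANDIDATES: [&str; 4] = [
    "/etc/ssl/certs/ca-certificates.crt", // Debian/Ubuntu
    "/etc/pki/tls/certs/ca-bundle.crt",   // RHEL
    "/etc/ssl/ca-bundle.pem",             // openSUSE
    "/etc/ssl/cert.pem",                  // Alpine/macOS
];

/// Return the first candidate for which `exists` reports true.
/// Passing the predicate in keeps the function testable without touching disk.
fn find_system_ca_bundle(exists: impl Fn(&Path) -> bool) -> Option<PathBuf> {
    for candidate in SYSTEM_CA_CANDIDATES {
        let path = Path::new(candidate);
        if exists(path) {
            return Some(path.to_path_buf());
        }
    }
    None
}
```

In real use the predicate would typically be `|p| p.exists()`.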
9781023

1024+
### Credential-injection-only relay
1025+
1026+
**File:** `crates/openshell-sandbox/src/l7/relay.rs` (`relay_passthrough_with_credentials()`)
1027+
1028+
When TLS is auto-terminated but no L7 policy (`protocol` + `access`/`rules`) is configured on the endpoint, the proxy enters a passthrough mode that still provides value: it parses HTTP requests minimally to rewrite credential placeholders (via `SecretResolver`) and logs each request for observability. This relay:
1029+
1030+
1. Reads each HTTP request from the client via `RestProvider::parse_request()`
1031+
2. Logs the request method, path, host, and port at `info!()` level (tagged `"HTTP relay (credential injection)"`)
1032+
3. Forwards the request to upstream via `relay_http_request_with_resolver()`, which rewrites headers containing `openshell:resolve:env:*` placeholders with actual provider credential values
1033+
4. Relays the upstream response back to the client
1034+
5. Loops for HTTP keep-alive; exits on client close or non-reusable response
1035+
1036+
This enables credential injection on all HTTPS endpoints automatically, without requiring the policy author to add `protocol: rest` and `access: full` just to get credentials injected.
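
The placeholder rewriting in step 3 can be pictured as a string substitution over header values. The sketch below is hypothetical: it does not reproduce `SecretResolver` or `relay_http_request_with_resolver()`, it assumes the text after `openshell:resolve:env:` is a name made of ASCII letters, digits, and underscores, and it leaves unresolved placeholders untouched, which may differ from the real behaviour.

```rust
use std::collections::HashMap;

const PLACEHOLDER_PREFIX: &str = "openshell:resolve:env:";

/// Replace every `openshell:resolve:env:NAME` in `value` with the secret
/// stored under NAME; unknown names are left as-is (an assumption).
fn resolve_header_value(value: &str, secrets: &HashMap<String, String>) -> String {
    let mut out = String::with_capacity(value.len());
    let mut rest = value;
    while let Some(pos) = rest.find(PLACEHOLDER_PREFIX) {
        // Copy everything before the placeholder unchanged.
        out.push_str(&rest[..pos]);
        let after = &rest[pos + PLACEHOLDER_PREFIX.len()..];
        // The name ends at the first character outside [A-Za-z0-9_].
        let name_len = after
            .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
            .unwrap_or(after.len());
        let name = &after[..name_len];
        match secrets.get(name) {
            Some(secret) => out.push_str(secret),
            None => {
                out.push_str(PLACEHOLDER_PREFIX);
                out.push_str(name);
            }
        }
        rest = &after[name_len..];
    }
    out.push_str(rest);
    out
}
```

For example, with a secret stored under the illustrative name `API_KEY`, a header value of `Bearer openshell:resolve:env:API_KEY` would be forwarded upstream with the real key in place of the placeholder, so the sandboxed process never needs to see the credential.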
1037+
9791038
### REST protocol provider
9801039

9811040
**File:** `crates/openshell-sandbox/src/l7/rest.rs`
