Skip to content

Commit 4f7321c

Browse files
committed
fix(scan): degrade proxy batch validation 400s to per-package GETs
The proxy batch endpoint validates its component list all-or-nothing, so a single crawled PURL of a type the server doesn't recognize (e.g. pkg:jsr/... from the Deno crawler) rejected the whole chunk and, when every chunk contained one, flipped the entire scan to exit 1. The docker e2e deno JSR test caught this against the production proxy: scan crawled 4 jsr packages, the one batch POST 400'd, and the scan errored where the old per-package GET path had quietly tolerated the unresolvable PURLs one by one. A validation 400 on the proxy batch POST now degrades the chunk to the legacy per-package GET path, whose per-PURL error swallowing is the scan semantic that predates the batch optimization: the valid subset still resolves, the exotic PURLs stay patch-less, and the scan succeeds. Rate limits, auth statuses, and over-capacity 503s still surface. is_batch_unsupported keeps detecting deployment-level absence only; the two fallback causes are logged distinctly. Verified against the real patches-api.socket.dev with a synthetic JSR layout: scan now exits 0 with scannedPackages=4 and the debug trace shows batch POST -> validation 400 -> 4 tolerated per-package GETs. Assisted-by: Claude Code:claude-fable-5
1 parent 625188e commit 4f7321c

4 files changed

Lines changed: 120 additions & 40 deletions

File tree

CHANGELOG.md

Lines changed: 7 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -51,10 +51,13 @@ in this file — see `.github/workflows/release.yml` (`version` job).
5151
- **Token-less `scan` now batch-queries the public proxy.** Proxy-mode scans
5252
POST `{proxy}/patch/batch` (one request per `--batch-size` chunk, mirroring
5353
the authenticated `/v0/orgs/{slug}/patches/batch` endpoint) instead of
54-
issuing one `GET /patch/by-package/:purl` per package. Against proxies that
55-
predate the batch endpoint, the client transparently degrades to the legacy
56-
per-package GET path; rate limits and over-capacity 503s still surface
57-
instead of silently degrading. (MINOR)
54+
issuing one `GET /patch/by-package/:purl` per package. The client
55+
transparently degrades to the legacy per-package GET path against proxies
56+
that predate the batch endpoint, and when the all-or-nothing batch
57+
validation rejects a chunk (e.g. a crawled PURL type the server doesn't
58+
recognize, such as `pkg:jsr/…` — per-package queries tolerate those
59+
individually, so one exotic package can't fail a whole scan). Rate limits
60+
and over-capacity 503s still surface instead of silently degrading. (MINOR)
5861

5962
## [3.2.0] — 2026-05-29
6063

crates/socket-patch-cli/CLI_CONTRACT.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -69,7 +69,7 @@ Beyond the globals above, each subcommand defines a small set of local arguments
6969

7070
`scan --prune` opts into garbage collection. When set, `scan` removes manifest entries for packages no longer present in the crawl, then deletes orphan blob, diff, and package-archive files from `.socket/`. Off by default (v3.0) so a temporary uninstall doesn't silently destroy manifest state.
7171

72-
`scan` queries the patch API in `--batch-size` chunks. Authenticated runs POST `/v0/orgs/{slug}/patches/batch`; token-less runs POST `{proxy}/patch/batch` on the public proxy and degrade to per-package `GET /patch/by-package/:purl` requests only when the deployed proxy predates the batch endpoint (legacy proxies answer the POST with their `400 "Unsupported endpoint"` catch-all). Validation errors, rate limits, and over-capacity 503s surface instead of silently degrading.
72+
`scan` queries the patch API in `--batch-size` chunks. Authenticated runs POST `/v0/orgs/{slug}/patches/batch`; token-less runs POST `{proxy}/patch/batch` on the public proxy and degrade to per-package `GET /patch/by-package/:purl` requests in two cases: the deployed proxy predates the batch endpoint (legacy proxies answer the POST with their `400 "Unsupported endpoint"` catch-all), or the all-or-nothing batch validation rejects the chunk (e.g. a crawled PURL type the server doesn't recognize, such as `pkg:jsr/…` — the per-package path tolerates those individually, preserving the pre-batch scan semantics). Rate limits and over-capacity 503s surface instead of silently degrading.
7373

7474
`scan --sync` is sugar for `--apply --prune` — the canonical single-flag bot invocation. `scan --json --sync --yes` discovers, applies, and reconciles state in one pass.
7575

crates/socket-patch-core/src/api/client.rs

Lines changed: 50 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -297,12 +297,12 @@ impl ApiClient {
297297
}
298298

299299
// Public proxy: prefer the POST /patch/batch endpoint; degrade to
300-
// individual per-package GET requests against proxies that predate
301-
// it (see `is_batch_unsupported`).
300+
// individual per-package GET requests when the deployed proxy
301+
// predates it or when batch validation rejects the chunk (see
302+
// `proxy_batch_post` for the decision table).
302303
match self.proxy_batch_post(purls).await? {
303304
Some(response) => Ok(response),
304305
None => {
305-
debug_log("proxy batch endpoint unavailable; falling back to individual queries");
306306
self.search_patches_batch_via_individual_queries(purls)
307307
.await
308308
}
@@ -312,12 +312,24 @@ impl ApiClient {
312312
/// Internal: POST the batch search to the public proxy's
313313
/// `/patch/batch` endpoint.
314314
///
315-
/// Returns `Ok(None)` when the deployed proxy predates the batch
316-
/// endpoint (see [`is_batch_unsupported`]) so the caller can degrade to
317-
/// the legacy per-package GET path. Auth / rate-limit statuses are
318-
/// classified via `classify_auth_error` exactly like the JSON
319-
/// transport — 401/403 keep feeding `is_fallback_candidate` and 429
320-
/// stays visible — and any other failure surfaces as an error.
315+
/// Returns `Ok(None)` when the caller should degrade to the legacy
316+
/// per-package GET path, in two situations:
317+
///
318+
/// 1. The deployed proxy predates the batch endpoint (see
319+
/// [`is_batch_unsupported`]).
320+
/// 2. The batch endpoint rejected the chunk with a validation `400`.
321+
/// Batch validation is all-or-nothing, so a single crawled PURL of
322+
/// a type the server doesn't recognize (e.g. `pkg:jsr/…` from the
323+
/// Deno crawler) rejects every package in the chunk. The
324+
/// per-package GET path tolerates such PURLs individually — each
325+
/// failure is swallowed per-package — which is the scan semantic
326+
/// that predates the batch optimization and must be preserved: one
327+
/// exotic package must not turn a whole scan into an error.
328+
///
329+
/// Auth / rate-limit statuses are classified via `classify_auth_error`
330+
/// exactly like the JSON transport — 401/403 keep feeding
331+
/// `is_fallback_candidate` and 429 stays visible — and any other
332+
/// failure (including over-capacity 503s) surfaces as an error.
321333
async fn proxy_batch_post(
322334
&self,
323335
purls: &[String],
@@ -356,7 +368,22 @@ impl ApiClient {
356368
}
357369

358370
let text = resp.text().await.unwrap_or_default();
359-
if is_batch_unsupported(status, &text) {
371+
let fallback_reason = if is_batch_unsupported(status, &text) {
372+
Some("proxy batch endpoint unavailable")
373+
} else if status == StatusCode::BAD_REQUEST {
374+
// All-or-nothing batch validation rejected the chunk; the
375+
// per-package path resolves the valid subset (see doc above).
376+
Some("proxy batch validation rejected the chunk")
377+
} else {
378+
None
379+
};
380+
if let Some(reason) = fallback_reason {
381+
debug_log(&format!(
382+
"{} (status {}: {}); falling back to individual queries",
383+
reason,
384+
status.as_u16(),
385+
text
386+
));
360387
return Ok(None);
361388
}
362389
Err(ApiError::Other(format!(
@@ -870,12 +897,14 @@ fn classify_auth_error(status: StatusCode, use_public_proxy: bool) -> Option<Api
870897
///
871898
/// The `"Unsupported endpoint"` marker is a cross-repo contract with the
872899
/// depscan firewall-api-proxy: its catch-all answers unknown routes with
873-
/// `400 {"error":"Unsupported endpoint",...}`, while genuine batch
874-
/// validation failures use different wording and must surface to the
875-
/// operator instead of silently degrading. Likewise for 503, only the
876-
/// "Patch API is not configured" body (patch endpoints disabled) degrades —
877-
/// an over-capacity 503 ("Service temporarily over capacity") surfaces
878-
/// rather than amplifying load tenfold via the per-package fallback.
900+
/// `400 {"error":"Unsupported endpoint",...}`. Batch validation failures
901+
/// use different wording and are deliberately NOT matched here — the
902+
/// caller (`proxy_batch_post`) still degrades them to the per-package
903+
/// path, but logs them as a chunk-validation rejection rather than a
904+
/// missing endpoint. For 503, only the "Patch API is not configured"
905+
/// body (patch endpoints disabled) degrades — an over-capacity 503
906+
/// ("Service temporarily over capacity") surfaces rather than amplifying
907+
/// load tenfold via the per-package fallback.
879908
fn is_batch_unsupported(status: StatusCode, body: &str) -> bool {
880909
match status {
881910
StatusCode::BAD_REQUEST => body.contains("Unsupported endpoint"),
@@ -1613,7 +1642,11 @@ mod tests {
16131642
}
16141643

16151644
#[test]
1616-
fn is_batch_unsupported_surfaces_genuine_validation_400() {
1645+
fn is_batch_unsupported_does_not_match_validation_400() {
1646+
// Validation 400s are not "endpoint missing" — the caller still
1647+
// degrades them to the per-package path (all-or-nothing batch
1648+
// validation must not fail a whole scan over one exotic PURL),
1649+
// but via the chunk-validation branch with its own log line.
16171650
assert!(!is_batch_unsupported(
16181651
StatusCode::BAD_REQUEST,
16191652
r#"{"error":"Invalid PURL format"}"#,

crates/socket-patch-core/tests/proxy_batch_e2e.rs

Lines changed: 62 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -143,34 +143,78 @@ async fn proxy_batch_degrades_when_patch_api_unconfigured_503() {
143143
}
144144

145145
#[tokio::test]
146-
async fn proxy_batch_validation_400_surfaces_without_fallback() {
146+
async fn proxy_batch_validation_400_degrades_to_per_package_gets() {
147+
// The batch endpoint validates the component list all-or-nothing, so a
148+
// chunk mixing a supported PURL with one the server doesn't recognize
149+
// (e.g. pkg:jsr/… from the Deno crawler) is rejected wholesale. The
150+
// valid subset must still resolve via the per-package path instead of
151+
// failing the scan.
147152
let server = MockServer::start().await;
148153
Mock::given(method("POST"))
149154
.and(path("/patch/batch"))
150-
.respond_with(
151-
ResponseTemplate::new(400).set_body_json(json!({ "error": "Invalid PURL format" })),
152-
)
155+
.respond_with(ResponseTemplate::new(400).set_body_json(json!({
156+
"error": {
157+
"message": "Invalid PURL format. Must include ecosystem type and package name.",
158+
"details": null
159+
}
160+
})))
153161
.expect(1)
154162
.mount(&server)
155163
.await;
156-
mount_by_package(&server, by_package_response_body(), 0).await;
164+
mount_by_package(&server, by_package_response_body(), 1).await;
157165

158166
let client = proxy_client(&server.uri());
159-
let err = client
167+
let resp = client
160168
.search_patches_batch(None, &[PURL.to_string()])
161169
.await
162-
.expect_err("a genuine validation 400 must surface, not silently degrade");
163-
164-
match &err {
165-
ApiError::Other(msg) => {
166-
assert!(msg.contains("400"), "must embed the status; got: {msg}");
167-
assert!(
168-
msg.contains("Invalid PURL format"),
169-
"must embed the body; got: {msg}"
170-
);
171-
}
172-
other => panic!("validation 400 must be Other; got: {other:?}"),
173-
}
170+
.expect("validation 400 must degrade to per-package GETs, not error");
171+
172+
assert_eq!(resp.packages.len(), 1, "fallback results must be assembled");
173+
assert_eq!(resp.packages[0].purl, PURL);
174+
}
175+
176+
#[tokio::test]
177+
async fn proxy_batch_validation_400_with_failing_gets_yields_empty_ok() {
178+
// Regression shape of the Deno JSR docker e2e: every crawled PURL is a
179+
// type the server rejects (pkg:jsr/…), so the batch 400s AND each
180+
// per-package GET 400s. The per-package path swallows those individual
181+
// failures, so the scan-level result is an empty success — not an
182+
// error that flips the whole scan's exit code.
183+
let server = MockServer::start().await;
184+
Mock::given(method("POST"))
185+
.and(path("/patch/batch"))
186+
.respond_with(ResponseTemplate::new(400).set_body_json(json!({
187+
"error": {
188+
"message": "Invalid PURL format. Must include ecosystem type and package name.",
189+
"details": null
190+
}
191+
})))
192+
.expect(1)
193+
.mount(&server)
194+
.await;
195+
Mock::given(method("GET"))
196+
.and(path_regex(r"^/patch/by-package/.*$"))
197+
.respond_with(ResponseTemplate::new(400).set_body_json(json!({
198+
"error": {
199+
"message": "Invalid PURL format. Must include ecosystem type and package name.",
200+
"details": null
201+
}
202+
})))
203+
.expect(1)
204+
.mount(&server)
205+
.await;
206+
207+
let client = proxy_client(&server.uri());
208+
let resp = client
209+
.search_patches_batch(None, &["pkg:jsr/@std/path@0.220.0".to_string()])
210+
.await
211+
.expect("per-purl failures are swallowed; the batch call must not error");
212+
213+
assert!(
214+
resp.packages.is_empty(),
215+
"an unresolvable PURL yields no packages, not an error"
216+
);
217+
assert!(!resp.can_access_paid_patches);
174218
}
175219

176220
#[tokio::test]

0 commit comments

Comments
 (0)