test(integration): harden post-create exec test vs echo-proxy bring-up race (bd openlock-eh8)#55
Merged
Merged
Conversation
…p race (bd openlock-eh8) post-create-exec-proxy.test.ts was the one echo-mode integration test still on bare `curl -sf`, missing the `--retry 5 --retry-all-errors` hardening its siblings (harness-cred-inject, openrouter-opencode-cred-inject) received in #38. That gap is why it became the recurring exit-56 flake locus on #52/#53/#54. exit 56 = curl CURLE_RECV_ERROR from the in-container echo proxy on first egress, relayed faithfully through ssh (ssh's own transport failures are 255, never 56) — NOT an ssh transport drop as previously suspected. `curl -s` was muting curl's error, which presented as "empty stdout/stderr". - add `--retry 5 --retry-all-errors --retry-delay 1` to the post-create exec test - switch all three flaking tests `-sf` -> `-sSf` so a retries-exhausted failure surfaces curl's real error instead of being silently muted bd openlock-eh8
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Fixes the recurring exit-56 live-integration flake (bd
openlock-eh8) that went red on three consecutivemainpushes (#52, #53, #54), each time passing on retry.Root cause (prior diagnosis was wrong)
The earlier theory was an "ssh transport delivered no data / supervisor stdout-over-ssh delivery race." That's incorrect:
ssh exited with status 56⇒ the remotecurlexited 56 =CURLE_RECV_ERROR(and the historical 35 =CURLE_SSL_CONNECT_ERROR).curl -smuting curl's error message — not a dropped transport. The supervisor/ssh path worked fine; it carried curl's exit code back faithfully.The real cause is curl racing the in-container echo-proxy + CA-bundle bring-up on the first egress. This was already diagnosed and mitigated in the two sibling echo-mode tests (
harness-cred-inject,openrouter-opencode-cred-inject) via--retry 5 --retry-all-errors --retry-delay 1(added in #38) — their comment literally says "transient TLS/network failures (exit 35/56) … races the supervisor's CA-bundle + echo-proxy bring-up."post-create-exec-proxy.test.ts(added in #34 foropenlock-hnp, the post-createsandbox execpath) never received that hardening — it was the one test still on barecurl -sf, which is exactly why it became the new flake locus. It even waits forwaitForSandboxReadyfirst, confirming that readiness (/bin/trueexecs) ≠ egress/proxy ready.Change
--retry 5 --retry-all-errors --retry-delay 1topost-create-exec-proxy.test.ts, matching its siblings. (Doesn't weaken the routing assertion — a real proxy-bypass regression still fails after retries.)-sf→-sSfso a retries-exhausted failure surfaces curl's real error instead of muting it.Verification
bun run lint✅ (exit 0) ·bun run typecheck✅ ·bun testunit delta zero (the three edited tests areskipIf(!LIVE)).Follow-up
Filed
openlock-g4x(P3): the test--retrymasks the bring-up race, but the production attach path may expose a real harness to the same first-egress transient. Worth an egress-aware readiness probe — but pin whether prod is actually exposed before adding product-side retries.