Skip to content

Resolve URLs by using multiple services#542

Open
9seconds wants to merge 1 commit into
masterfrom
multiple-ip-detectors
Open

Resolve URLs by using multiple services#542
9seconds wants to merge 1 commit into
masterfrom
multiple-ip-detectors

Conversation

@9seconds
Copy link
Copy Markdown
Owner

This PR has an intention of resolving URLs by using multiple endpoints that identify an IP address of the service. This is handy if one service is blocked for some reason.

The detection mechanism follows this logic:

  1. It tries to access all services in parallel
  2. If service respond with some error (like, no route to host for IPv6), then we accurately collect those errors and return a merged one
  3. In case of the first IP resolved, we immediately return it.

Also, this PR refactors how access and SNI check are performed.

Closes #529

This PR has an intention of resolving URLs by using multiple endpoints
that identify an IP address of the service. This is handy if one service
is blocked for some reason.

The detection mechanism follows this logic:

1. It tries to access all services in parallel
2. If service respond with some error (like, no route to host for IPv6),
   then we accurately collect those errors and return a merged one
3. In case of the first IP resolved, we immediately return it.

Also, this PR refactors how access and SNI check are performed.
@9seconds 9seconds requested a review from dolonet May 27, 2026 14:13
@9seconds 9seconds force-pushed the multiple-ip-detectors branch 2 times, most recently from 069fb08 to 2145159 Compare May 27, 2026 14:18
@dolonet
Copy link
Copy Markdown
Collaborator

dolonet commented May 29, 2026

CI is green across the board and this Closes #529 as intended — multiple detection endpoints with merged-error collection is exactly what that issue asked for. The mtg access and getIP parts read well.

One behavioral regression I want to flag before this lands, in runSNICheck (internal/cli/sni_check.go), because it bites precisely the users #529 targets — those who never set public-ipv4/public-ipv6.

Both family goroutines now share a single ctx from context.WithCancelCause (sni_check.go:28), and a per-family detection failure calls cancelCause(err) on it (:58 for tcp4, :76 for tcp6):

ip, err = getIP(ctx, ntw, "tcp6")
if err != nil {
    cancelCause(err)
}

Consider a secret host with both A and AAAA records on a server with IPv4-only egress (a common setup). The tcp6 goroutine's getIP fails and calls cancelCause(err). The load-bearing consequence is unconditional:

wg.Wait(); return res, context.Cause(ctx) (:88) then returns a non-nil error regardless of whether the tcp4 detection succeeded — any single family's failure poisons the whole result. (Secondarily, since the ctx is shared, the cancel can also abort the still-in-flight tcp4 HTTP request — but the error return alone is enough.)

Both callers treat that error as fatal: checkSecretHost (doctor.go:396) prints cannot resolve DNS name of <host> and returns false, and warnSNIMismatch (run_proxy.go:221) logs SNI-DNS check: cannot resolve secret hostname and skips the check. So a server that is actually fine on IPv4 now fails the SNI check outright.

On master this was graceful per-family degradation (the behavior #505/#528 settled on): getIP returned nil rather than erroring, an undetected family was simply non-blocking — warnSNIMismatch had v6Match := res.OurIPv6 == nil || res.IPv6Match, and checkSecretHost keyed off IPv4Match || IPv6Match. This PR turns "one family undetectable" into a hard failure.

I checked this against both branches with a small harness driving the real runSNICheck (faked resolver + Network): for a dual-stack host whose A record matches the server's public IPv4 on IPv4-only egress, master returns a clean ✅ match while this branch returns the hard cannot resolve DNS name failure. It also fires when only public-ipv4 is set (the unset v6 family still trips cancelCause), so it is not limited to the no-config case.

Minimal fix, your call on the shape: give each family its own derived context (so one family's getIP failure can't cancel the other), or simply don't let a per-family getIP error call cancelCause on the shared ctx — and only return an error when both families fail to produce an IP. The OurIP4 == "" && OurIP6 == "" condition the callers already have is the natural place to land that.

Two smaller notes:

  • The doctor message is misleading on this path. When DNS resolved fine and only public-IP detection failed, it still prints cannot resolve DNS name of <host>. Even with the fix above, "cannot detect public IP" would be the accurate framing there.

  • Non-blocking nit, your design call: the new per-family tplEDNSSNIMatch ({{ .ip }}) drops the combined IPv4+IPv6 context that doctor: surface both public IPs in SNI-DNS mismatch message #505 added (... public IP is <v4> (IPv4) / <v6> (IPv6) ...). Reasonable to drop if you prefer one line per family — just flagging it's an intentional-looking output change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Public IP detection depends on a single hardcoded endpoint (ifconfig.co)

2 participants