Continuation of a side note in #2 from @famewolf.
Context
v1.17.2 improved the healthcheck logic in three ways: a longer default wait (600s), respect for the image's own Healthcheck.StartPeriod, and no auto-rollback while the container is state=running, health=starting. That covers the slow-startup case.
The remaining edge case: containers that report health=unhealthy even when they're working fine — usually because the healthcheck command itself is brittle, not because the app is broken.
@famewolf noticed this in his VPN-sidecar stack:
"some of the dependents do NOT work with health checks. I don't know why but they showed as unhealthy even when working fine."
Probable root cause: healthcheck commands that hit localhost:<port> inside a VPN-routed network namespace can be finicky — curl inside Sonarr's namespace points at Gluetun's local interface, and small inconsistencies (DNS, IP family, TLS) can flip the result without indicating an actual app fault.
What we could add
A per-container flag, similar to the existing "ask before major update" toggle:
- "Trust running state over healthcheck"
- When set, after an update we check only state == "running" and ignore health.status
- Falls back gracefully to the current behaviour when the flag isn't set
This is opt-in per-container — the default stays "trust the healthcheck". Web UI flag on the container detail page, persisted in /data/.
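To make the proposal concrete, here is a rough sketch of the post-update decision with the flag wired in. The function name update_ok, the trust_running parameter, and the three-way return value are all hypothetical, not the project's actual code; the "state must be running" rule for the default path is also an assumption. The only things taken from Docker itself are the state and health status strings.

```python
from typing import Optional


def update_ok(state: str, health: Optional[str], trust_running: bool = False) -> Optional[bool]:
    """Decide what to do after an update (hypothetical helper).

    Returns True to keep the update, False to roll back,
    or None to keep waiting for the healthcheck to settle.
    """
    # Assumption: a container that is not running is a failed update,
    # with or without the flag.
    if state != "running":
        return False

    # Flag set: trust the running state and ignore health.status entirely.
    if trust_running:
        return True

    # No healthcheck defined on the image: nothing more to check.
    if health is None or health == "none":
        return True

    if health == "healthy":
        return True
    if health == "starting":
        # Matches the v1.17.2 behaviour described above: no rollback yet.
        return None

    # health == "unhealthy" (or anything unexpected): current default behaviour.
    return False
```

With the flag unset, every branch behaves as it does today, so the change stays opt-in. A dependent that reports unhealthy while working fine would only pass when trust_running is set for that container.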
Open questions
- Naming: ignore_healthcheck reads aggressive. trust_running reads softer. Bikeshed.
- Should this be a group-level flag too? Probably not — different containers in the same group can have different healthcheck quirks.
What I'd like to know
- @famewolf — confirm the symptom is still present with v1.17.2. If yes, this is worth shipping. If the smarter wait already papered over it, maybe not.
- Anyone else with network_mode: container:* setups: are you seeing similar phantom-unhealthy on dependents?