Update docker_run health wait default #23628
Co-authored-by: Cursor <cursoragent@cursor.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4777e9b217
  def docker_run(
      compose_file=None,
-     waith_for_health=False,
+     wait_for_health=True,
Opt out TLS FIPS from health waiting
With this default set to True, dd_fips_environment now runs tls/tests/fips/docker-compose.yml through docker compose up --wait; Docker documents --wait as waiting for services to be running or healthy. That FIPS compose file has healthchecks that execute curl, but tls/tests/fips/Dockerfile only installs openssl and bash, so the services become unhealthy and the fixture fails before the previous sleep=20 path can yield. The non-FIPS TLS fixture was opted out in this commit, but the FIPS fixture still needs wait_for_health=False or a working healthcheck.
Thanks, this was a real failure. We did not opt FIPS TLS out because unlike the non-FIPS fixture, this compose stack does not rely on a successful one-shot container; the services are meant to stay running and be health-checked.
Instead, we fixed the broken healthcheck by replacing the curl probe with an openssl s_client probe, using binaries already present in the Alpine-based image. I also validated the FIPS stack locally with docker compose up --wait --build.
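For readers who want to see what "validating the TLS listener directly" amounts to, here is a minimal Python sketch of the same idea. It is not the healthcheck used in this PR, which runs openssl s_client inside the container. The function name tls_listener_ready and its parameters are hypothetical, and certificate verification is disabled on the assumption that test stacks use self-signed certificates.

```python
import socket
import ssl


def tls_listener_ready(host, port, timeout=3.0):
    """Return True if a TLS handshake with host:port completes, else False.

    Hypothetical helper for illustration only. Verification is turned off
    because the probe only asks "is a TLS listener answering?", not
    "is the certificate trusted?".
    """
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                # A negotiated protocol version means the handshake succeeded.
                return tls.version() is not None
    except OSError:
        # Covers refused connections, timeouts, and ssl.SSLError.
        return False
```

Unlike an HTTP probe with curl, a handshake-level check needs no HTTP client in the image and fails only when the listener itself is not serving TLS.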
Review from sarah-witt is dismissed. Related teams and files:
- agent-integrations
- tls/tests/fips/docker-compose.yml
Validation Report: All 20 validations passed.
* Update docker_run health wait default
* Opt out one-shot compose setups from health waiting
* Fix tls formatting
* Fix compose health checks for Envoy and KrakenD
* Fix NiFi and Postgres compose health checks
* Fix KrakenD and Marathon compose health checks
* Fix Postgres base compose health check timing
* Fix TLS FIPS compose health checks

Co-authored-by: Cursor <cursoragent@cursor.com>
03b790e
What does this PR do?
Updates docker_run and ComposeFileUp to use the correctly spelled wait_for_health argument and to wait for Docker Compose service health by default.

The misspelled waith_for_health keyword is removed, and in-repo callers that explicitly opted into health waiting were updated to use wait_for_health. Unit tests cover the new default, explicit opt-out, and rejection of the old typo.

This also fixes or documents the compose environments that CI showed were incompatible with Docker Compose --wait:
- http_check and non-FIPS tls: opt out with wait_for_health=False because their compose stacks include a successful one-shot cert-builder container.
- tls: replace curl-based container healthchecks with an openssl s_client probe so the checks use binaries already present in the Alpine-based test image and validate the TLS listener directly.
- kong: opt out with wait_for_health=False because its migration containers are expected to exit successfully before the long-running service is ready.
- krakend: install curl in the FastAPI test image, make the KrakenD service wait for the API service to become healthy, and use wget for the KrakenD container healthcheck because the upstream KrakenD image does not include curl.
- envoy: fix the API v3 service sidecar to use the v3 router filter config and pin Werkzeug to a Flask 2.1-compatible version so the backend service starts reliably.
- nifi: update the container-side healthcheck to target NiFi through the container hostname instead of localhost, matching how NiFi binds its HTTPS listener.
- postgres: add a startup grace period to the base and replication compose healthchecks so Docker does not mark containers unhealthy during expected init, restart, and replica bootstrap windows before /tmp/container_ready.txt is written.
- marathon: adjust the Mesos slave test container to avoid obsolete systemd/cgroup-v1 Docker containerizer assumptions while preserving the Marathon API test environment.
Motivation
The previous waith_for_health keyword was misspelled and defaulted to False, so most compose-backed test environments skipped Docker Compose --wait unless they opted in manually.

Enabling health waiting by default makes compose-level failures visible before tests run. The follow-up integration changes either fix stale healthchecks/service setup or explicitly preserve the previous behavior where a compose stack intentionally contains one-shot containers.
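The behavior change described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual datadog_checks_dev implementation: compose_up_command is a hypothetical helper, and the flag list is modeled on the docker compose invocation quoted in the Validation section.

```python
def compose_up_command(compose_file, wait_for_health=True, **kwargs):
    """Build a `docker compose up` command line (hypothetical helper).

    wait_for_health defaults to True, so --wait is added unless the
    caller opts out explicitly.
    """
    # Reject the old misspelling loudly instead of silently ignoring it.
    if "waith_for_health" in kwargs:
        raise TypeError(
            "unexpected keyword argument 'waith_for_health'; "
            "use 'wait_for_health'"
        )
    if kwargs:
        raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")

    command = [
        "docker", "compose", "-f", compose_file,
        "up", "-d", "--force-recreate",
    ]
    if wait_for_health:
        # --wait blocks until services are running or healthy.
        command.append("--wait")
    return command
```

A stack with intentional one-shot containers would pass wait_for_health=False to keep the previous behavior, which is how the PR describes the http_check, non-FIPS tls, and kong environments being handled.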
Validation
- ddev --no-interactive test datadog_checks_dev -- -k docker
- docker compose up -d --force-recreate --wait for the affected KrakenD, Marathon, Postgres, and TLS FIPS stacks.
Review checklist (to be filled by reviewers)
- qa/skip-qa label if the PR doesn't need to be tested during QA.
- backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged
Made with Cursor