Update docker_run health wait default #23628

Merged
AAraKKe merged 8 commits into master from aarakke/ddev-docker-run-health-default
May 10, 2026

Contributor

@AAraKKe AAraKKe commented May 7, 2026

What does this PR do?

Updates docker_run and ComposeFileUp to use the correctly spelled wait_for_health argument and to wait for Docker Compose service health by default.

The misspelled waith_for_health keyword is removed, and in-repo callers that explicitly opted into health waiting were updated to use wait_for_health. Unit tests cover the new default, explicit opt-out, and rejection of the old typo.
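
A minimal sketch of the behavior change described above, assuming a hypothetical helper named compose_up_command that only builds the Docker Compose command line. It is not the datadog_checks_dev implementation; it illustrates the new default, the explicit opt-out, and why a caller that still passes the removed waith_for_health keyword fails immediately instead of silently skipping the wait.

```python
def compose_up_command(compose_file, wait_for_health=True, build=False):
    """Build a `docker compose up` command line.

    Hypothetical helper for illustration only; the name and the `build`
    parameter are assumptions, not part of datadog_checks_dev.
    """
    command = ["docker", "compose", "-f", compose_file, "up", "-d", "--force-recreate"]
    if build:
        command.append("--build")
    if wait_for_health:
        # New default: block until services are running or healthy.
        command.append("--wait")
    return command


# Default: health waiting is on.
print(compose_up_command("docker-compose.yml"))

# Explicit opt-out, e.g. for a stack that contains one-shot containers.
print(compose_up_command("docker-compose.yml", wait_for_health=False))

# The misspelled keyword no longer exists, so Python rejects it.
try:
    compose_up_command("docker-compose.yml", waith_for_health=True)
except TypeError as error:
    print(f"rejected: {error}")
```

Because the old keyword is removed rather than aliased, stale callers surface as a TypeError at call time, which is what makes the typo impossible to reintroduce unnoticed.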

This also fixes or documents the compose environments that CI showed were incompatible with Docker Compose --wait:

  • http_check and non-FIPS tls: opt out with wait_for_health=False because their compose stacks include a successful one-shot cert-builder container.
  • FIPS tls: replace curl-based container healthchecks with an openssl s_client probe so the checks use binaries already present in the Alpine-based test image and validate the TLS listener directly.
  • kong: opt out with wait_for_health=False because its migration containers are expected to exit successfully before the long-running service is ready.
  • krakend: install curl in the FastAPI test image, make the KrakenD service wait for the API service to become healthy, and use wget for the KrakenD container healthcheck because the upstream KrakenD image does not include curl.
  • envoy: fix the API v3 service sidecar to use the v3 router filter config and pin Werkzeug to a Flask 2.1-compatible version so the backend service starts reliably.
  • nifi: update the container-side healthcheck to target NiFi through the container hostname instead of localhost, matching how NiFi binds its HTTPS listener.
  • postgres: add a startup grace period to the base and replication compose healthchecks so Docker does not mark containers unhealthy during expected init, restart, and replica bootstrap windows before /tmp/container_ready.txt is written.
  • marathon: adjust the Mesos slave test container to avoid obsolete systemd/cgroup-v1 Docker containerizer assumptions while preserving the Marathon API test environment.
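
The Postgres change relies on a healthcheck start period. The sketch below is a simplified model of that mechanism, assuming probes run once per interval and that failures inside the start period do not count toward the retry limit. The function name simulate_health and all numbers are hypothetical and are not the values used in the Postgres compose files.

```python
def simulate_health(probe_results, interval, retries, start_period=0.0):
    """Return the health state a container would reach in this toy model.

    probe_results: list of booleans; probe i runs (i + 1) * interval seconds
    after the container starts. Illustrative only, not Docker's implementation.
    """
    failures = 0
    for index, succeeded in enumerate(probe_results):
        elapsed = (index + 1) * interval
        if succeeded:
            # First successful probe marks the container healthy.
            return "healthy"
        if elapsed > start_period:
            # Failures only count once the grace period has passed.
            failures += 1
            if failures >= retries:
                return "unhealthy"
    return "starting"


# A container that needs about 30 seconds of init before it reports ready.
probes = [False, False, False, False, False, True]

print(simulate_health(probes, interval=5, retries=3))
print(simulate_health(probes, interval=5, retries=3, start_period=30))
```

In this model the same slow-starting container is marked unhealthy without a grace period but reaches healthy with one, which is the situation the startup grace period is meant to cover while /tmp/container_ready.txt has not yet been written.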

Motivation

The previous waith_for_health keyword was misspelled and defaulted to False, so most compose-backed test environments skipped Docker Compose --wait unless they opted in manually.

Enabling health waiting by default makes compose-level failures visible before tests run. The follow-up integration changes either fix stale healthchecks/service setup or explicitly preserve the previous behavior where a compose stack intentionally contains one-shot containers.

Validation

  • ddev --no-interactive test datadog_checks_dev -- -k docker
  • Targeted compose checks with docker compose up -d --force-recreate --wait for the affected KrakenD, Marathon, Postgres, and TLS FIPS stacks.
  • Targeted integration checks for Marathon and Postgres after their compose changes.
  • Draft PR CI has been used to identify and iterate on affected integrations.

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
  • If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged

Made with Cursor

Co-authored-by: Cursor <cursoragent@cursor.com>
Contributor

datadog-official Bot commented May 7, 2026

Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 97.30%
Overall Coverage: 87.37% (+0.13%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: e952ce9 | Docs | Datadog PR Page | Give us feedback!


codecov Bot commented May 7, 2026

Codecov Report

❌ Patch coverage is 97.29730% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 90.84%. Comparing base (e639016) to head (e952ce9).
⚠️ Report is 9 commits behind head on master.


Co-authored-by: Cursor <cursoragent@cursor.com>
Contributor

github-actions Bot commented May 7, 2026

⚠️ Major version bump
The changelog type "changed" or "removed" was used in this Pull Request, so the next release will bump the major version. Please make sure this is a breaking change, or use the "fixed" or "added" type instead.

AAraKKe and others added 2 commits May 7, 2026 16:21
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@AAraKKe AAraKKe marked this pull request as ready for review May 7, 2026 16:30
@AAraKKe AAraKKe requested review from a team as code owners May 7, 2026 16:30

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4777e9b217

 def docker_run(
     compose_file=None,
-    waith_for_health=False,
+    wait_for_health=True,

P2: Opt out TLS FIPS from health waiting

With this default set to True, dd_fips_environment now runs tls/tests/fips/docker-compose.yml through docker compose up --wait; Docker documents --wait as waiting for services to be running or healthy. That FIPS compose file has healthchecks that execute curl, but tls/tests/fips/Dockerfile only installs openssl and bash, so the services become unhealthy and the fixture fails before the previous sleep=20 path can yield. The non-FIPS TLS fixture was opted out in this commit, but the FIPS fixture still needs wait_for_health=False or a working healthcheck.

Contributor Author

Thanks, this was a real failure. We did not opt FIPS TLS out because unlike the non-FIPS fixture, this compose stack does not rely on a successful one-shot container; the services are meant to stay running and be health-checked.

Instead, we fixed the broken healthcheck by replacing the curl probe with an openssl s_client probe, using binaries already present in the Alpine-based image. I also validated the FIPS stack locally with docker compose up --wait --build.

sarah-witt
sarah-witt previously approved these changes May 7, 2026
Co-authored-by: Cursor <cursoragent@cursor.com>
@temporal-github-worker-1 temporal-github-worker-1 Bot dismissed sarah-witt’s stale review May 8, 2026 07:44

Review from sarah-witt is dismissed. Related teams and files:

  • agent-integrations
    • tls/tests/fips/docker-compose.yml
Contributor

dd-octo-sts Bot commented May 8, 2026

Validation Report

All 20 validations passed.

Validation         Description
agent-reqs         Verify check versions match the Agent requirements file
ci                 Validate CI configuration and Codecov settings
codeowners         Validate every integration has a CODEOWNERS entry
config             Validate default configuration files against spec.yaml
dep                Verify dependency pins are consistent and Agent-compatible
http               Validate integrations use the HTTP wrapper correctly
imports            Validate check imports do not use deprecated modules
integration-style  Validate check code style conventions
jmx-metrics        Validate JMX metrics definition files and config
labeler            Validate PR labeler config matches integration directories
legacy-signature   Validate no integration uses the legacy Agent check signature
license-headers    Validate Python files have proper license headers
licenses           Validate third-party license attribution list
metadata           Validate metadata.csv metric definitions
models             Validate configuration data models match spec.yaml
openmetrics        Validate OpenMetrics integrations disable the metric limit
package            Validate Python package metadata and naming
readmes            Validate README files have required sections
saved-views        Validate saved view JSON file structure and fields
version            Validate version consistency between package and changelog

@AAraKKe AAraKKe added this pull request to the merge queue May 10, 2026
Merged via the queue into master with commit 03b790e May 10, 2026
348 checks passed
@AAraKKe AAraKKe deleted the aarakke/ddev-docker-run-health-default branch May 10, 2026 08:41
@dd-octo-sts dd-octo-sts Bot added this to the 7.79.0 milestone May 10, 2026
github-actions Bot pushed a commit that referenced this pull request May 10, 2026
* Update docker_run health wait default

Co-authored-by: Cursor <cursoragent@cursor.com>

* Opt out one-shot compose setups from health waiting

Co-authored-by: Cursor <cursoragent@cursor.com>

* Fix tls formatting

* Fix compose health checks for Envoy and KrakenD

Co-authored-by: Cursor <cursoragent@cursor.com>

* Fix NiFi and Postgres compose health checks

Co-authored-by: Cursor <cursoragent@cursor.com>

* Fix KrakenD and Marathon compose health checks

Co-authored-by: Cursor <cursoragent@cursor.com>

* Fix Postgres base compose health check timing

Co-authored-by: Cursor <cursoragent@cursor.com>

* Fix TLS FIPS compose health checks

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com> 03b790e
3 participants