Add timeout to debugger captures by bwoebi · Pull Request #4003 · DataDog/dd-trace-php

bwoebi · 2026-06-22T11:17:36Z

Adds the DD_DYNAMIC_INSTRUMENTATION_CAPTURE_TIMEOUT_MS configuration to enforce a limit on debugger capture time.

datadog-datadog-prod-us1-2 · 2026-06-22T11:25:10Z

Tests

⚠️ Warnings

🚦 5 Pipeline jobs failed

Profiling ASAN/UBSAN Tests | PHP 8.5 zts UBSAN (arm-8core-linux)

DataDog/apm-reliability/dd-trace-php | test_extension_ci: [8.2]

DataDog/apm-reliability/dd-trace-php | Zend Abstract Interface Tests: [8.3, nts]

View all 5 failed jobs.

❄️ 2 New flaky tests detected

tmp/build_extension/tests/ext/live-debugger/debugger_log_probe_capture_timeout.phpt (Live debugger log probe capture timeout with large data structure) from PHP.tmp.build_extension.tests.ext.live.debugger

tmp/build_extension/tests/ext/live-debugger/debugger_log_probe_capture_timeout.phpt (Live debugger log probe capture timeout with large data structure) from php.tmp.build_extension.tests.ext.live.debugger

ℹ️ Info

No other issues found

🧪 All tests passed

🔄 Datadog auto-retried 2 jobs - 2 passed on retry

🎯 Code Coverage (details)
• Patch Coverage: 100.00%
• Overall Coverage: 54.08% (-0.04%)

This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 89a1dcd

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 793daeec61

chatgpt-codex-connector · 2026-06-22T11:27:15Z

+    usec = dd_find_lowest_dealine_timer();
+#endif
+    struct itimerval it = {
+        .it_value    = { .tv_sec = usec / 10000000, .tv_usec = usec % 1000000 },


Use microseconds-per-second for setitimer

On the non-Linux setitimer path, usec is already in microseconds, so tv_sec must divide by 1,000,000 rather than 10,000,000. When DD_DYNAMIC_INSTRUMENTATION_CAPTURE_TIMEOUT_MS is configured above 999 ms on macOS/BSD, values like 1000 ms or 2000 ms produce {0, 0} and disarm the timeout, while other multi-second values fire much too early; the same conversion should be fixed in the stop/re-arm paths as well.
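To make the arithmetic concrete, here is a minimal sketch comparing the quoted conversion with the corrected one. The helper names `split_usec_buggy` and `split_usec` are hypothetical and exist only for illustration; they are not part of the PR.

```c
#include <stdint.h>
#include <sys/time.h>

#define USEC_PER_SEC 1000000ULL

/* Hypothetical helper: the split as quoted in the diff (divides by 10,000,000). */
static struct timeval split_usec_buggy(uint64_t usec) {
    struct timeval tv = {
        .tv_sec  = (time_t)(usec / 10000000ULL),
        .tv_usec = (suseconds_t)(usec % USEC_PER_SEC),
    };
    return tv;
}

/* Hypothetical helper: whole seconds and leftover microseconds use the same divisor. */
static struct timeval split_usec(uint64_t usec) {
    struct timeval tv = {
        .tv_sec  = (time_t)(usec / USEC_PER_SEC),
        .tv_usec = (suseconds_t)(usec % USEC_PER_SEC),
    };
    return tv;
}
```

With a 1000 ms timeout (1,000,000 µs), `split_usec_buggy` yields `{0, 0}`, which `setitimer` interprets as "disarm", while `split_usec` yields `{1, 0}`. With 2500 ms, the buggy split yields `{0, 500000}`, so the timer would fire after 0.5 s instead of 2.5 s.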

pr-commenter · 2026-06-22T12:37:57Z

Benchmarks [ tracer ]

Benchmark execution time: 2026-06-29 16:36:22

Comparing candidate commit 89a1dcd in PR branch bob/debugger-limit with baseline commit 303fa81 in branch master.

Found 0 performance improvements and 7 performance regressions! Performance is the same for 187 metrics, 0 unstable metrics.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

🟩 = significantly better candidate vs. baseline
🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
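The significance rule described above can be sketched as a small predicate. This is an illustration only: the function name `ci_is_significant` is hypothetical, the symmetric treatment of improvements and regressions is an assumption, and the 1% threshold used in the usage note is taken from the diagram rather than from the benchmarking platform's actual configuration.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate: returns true when the confidence interval
 * [ci_lo, ci_hi] (in percent) lies entirely outside the band
 * [-threshold, +threshold]. Symmetry around 0% is an assumption.
 */
static bool ci_is_significant(double ci_lo, double ci_hi, double threshold) {
    return ci_lo > threshold || ci_hi < -threshold;
}
```

For the two diagrams above, `ci_is_significant(-0.6, 1.2, 1.0)` is false because the interval straddles 0%, while `ci_is_significant(1.3, 3.1, 1.0)` is true because the whole interval sits above the 1% threshold.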

scenario:EmptyFileBench/benchEmptyFileBaseline

🟥 execution_time [+84.337µs; +286.443µs] or [+2.709%; +9.199%]

scenario:MessagePackSerializationBench/benchMessagePackSerialization

🟥 execution_time [+4.196µs; +6.924µs] or [+4.143%; +6.837%]

scenario:MessagePackSerializationBench/benchMessagePackSerialization-opcache

🟥 execution_time [+2.843µs; +5.537µs] or [+2.751%; +5.356%]

scenario:SamplingRuleMatchingBench/benchRegexMatching1

🟥 execution_time [+56.720ns; +130.080ns] or [+3.816%; +8.751%]

scenario:SamplingRuleMatchingBench/benchRegexMatching2

🟥 execution_time [+78.046ns; +150.354ns] or [+5.377%; +10.358%]

scenario:SamplingRuleMatchingBench/benchRegexMatching3

🟥 execution_time [+70.971ns; +146.629ns] or [+4.788%; +9.893%]

scenario:SamplingRuleMatchingBench/benchRegexMatching4

🟥 execution_time [+43.686ns; +136.314ns] or [+2.910%; +9.080%]

Leiyks · 2026-06-29T11:24:33Z

+    if (next_deadline != ~0ull) { // re-arm the timer, for ZTS concurrency
+        uint64_t usec = (next_deadline - now_ns) / 1000ull;
+        struct itimerval it = {
+            .it_value    = { .tv_sec = usec / 10000000, .tv_usec = usec % 1000000 },


Suggested change

-            .it_value    = { .tv_sec = usec / 10000000, .tv_usec = usec % 1000000 },
+            .it_value    = { .tv_sec = usec / 1000000, .tv_usec = usec % 1000000 },

There is an extra 0 here, no?

Leiyks · 2026-06-29T11:25:47Z

+    usec = dd_find_lowest_dealine_timer();
+#endif
+    struct itimerval it = {
+        .it_value    = { .tv_sec = usec / 10000000, .tv_usec = usec % 1000000 },


yeah same here indeed

Leiyks · 2026-06-29T11:47:21Z

+#ifdef __linux__
+#include <sys/syscall.h>
+#elif defined(ZTS)
+uint64_t dd_find_lowest_dealine_timer(void) {


Suggested change

-uint64_t dd_find_lowest_dealine_timer(void) {
+uint64_t dd_find_lowest_deadline_timer(void) {

needs to be changed in other places as well :D

Leiyks · 2026-06-29T11:52:58Z

+
+void dd_stop_debugger_timeout(void) {
+    if (DDTRACE_G(capture_timer_handle)) {
+        DeleteTimerQueueTimer(NULL, DDTRACE_G(capture_timer_handle), NULL);


Suggested change

-        DeleteTimerQueueTimer(NULL, DDTRACE_G(capture_timer_handle), NULL);
+        DeleteTimerQueueTimer(NULL, DDTRACE_G(capture_timer_handle), INVALID_HANDLE_VALUE);

We can pass INVALID_HANDLE_VALUE here so the call blocks until any running timer callback has completed.

Leiyks · 2026-06-29T11:55:31Z

+#if !defined(__linux__) && defined(ZTS)
+    } ZEND_HASH_FOREACH_END();
+    if (next_deadline != ~0ull) { // re-arm the timer, for ZTS concurrency
+        uint64_t usec = (next_deadline - now_ns) / 1000ull;


Suggested change

-        uint64_t usec = (next_deadline - now_ns) / 1000ull;
+        uint64_t usec = next_deadline > now_ns ? (next_deadline - now_ns) / 1000ull : 0;

Should we check, just in case?

That check happens on line 92: if (now_ns >= deadline) {

oh yeah indeed, missed it

Leiyks · 2026-06-29T11:56:03Z

+        struct timespec now;
+        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &now);
+        uint64_t now_ns = (uint64_t)now.tv_sec * 1000000000ULL + (uint64_t)now.tv_nsec;
+        usec = (next_deadline - now_ns) / 1000ull;


Suggested change

-        usec = (next_deadline - now_ns) / 1000ull;
+        usec = next_deadline > now_ns ? (next_deadline - now_ns) / 1000ull : 0;

same here
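For context on why the guard was raised at all: both operands are `uint64_t`, so subtracting a later `now_ns` from an earlier deadline wraps around to a huge value instead of going negative. The sketch below contrasts the two forms; the helper names are hypothetical and not part of the PR. As noted in the first thread, the surrounding code already checks `now_ns >= deadline` beforehand, so the guard may be redundant at that site.

```c
#include <stdint.h>

/* Hypothetical helper: unguarded form, wraps around when now_ns > deadline_ns. */
static uint64_t remaining_usec_unguarded(uint64_t deadline_ns, uint64_t now_ns) {
    return (deadline_ns - now_ns) / 1000ULL;
}

/* Hypothetical helper: guarded form, clamps an already-expired deadline to 0. */
static uint64_t remaining_usec(uint64_t deadline_ns, uint64_t now_ns) {
    return deadline_ns > now_ns ? (deadline_ns - now_ns) / 1000ULL : 0;
}
```

With a deadline of 2000 ns and a current time of 5000 ns, the guarded helper returns 0, whereas the unguarded one returns a value on the order of 10^16 µs, which would arm the timer far in the future.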

Leiyks · 2026-06-29T11:57:29Z

+#endif
+
+// SIGEV_THREAD_ID delivers SIGVTALRM to exactly this thread, not a random one (critical for ZTS).
+void dd_start_debugger_timeout(void) {


I think we should add a guard that checks whether a timer is already active before starting a new one.

Leiyks

lgtm 👍

bwoebi requested a review from a team as a code owner June 22, 2026 11:17

chatgpt-codex-connector Bot reviewed Jun 22, 2026

bwoebi force-pushed the bob/debugger-limit branch 2 times, most recently from efc9646 to db71d0a · June 25, 2026 16:26

bwoebi added 2 commits June 25, 2026 19:20

Add timeout to debugger captures

dbeb004

Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>

Fix alpine build

92fe6c4

Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>

bwoebi force-pushed the bob/debugger-limit branch from db71d0a to 92fe6c4 · June 25, 2026 17:20

Windows timers have less resolution, increase amount

c6da6b1

Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>

bwoebi force-pushed the bob/debugger-limit branch from a3a2bcb to c6da6b1 · June 26, 2026 19:08

Leiyks reviewed Jun 29, 2026

Address code review

89a1dcd

Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>

Leiyks approved these changes Jun 29, 2026
