fix(live-debugger): retire removed probe configs to prevent use-after-free #4024
Leiyks wants to merge 2 commits into
…-free
The C side (tracer/live_debugger.c) stores a shallow copy of the FFI ddog_Probe in def->probe, whose ddog_CharSlice fields (probe id, capture config, etc.) borrow strings owned by the boxed RemoteConfigParsed kept in live_debugger.active. The Box exists specifically to give that data a stable heap address for these borrows.
On a LiveDebugger Remove (and on an Entry::Occupied re-add), the box was dropped immediately, freeing those strings. But zai_hook_remove can defer the actual hook teardown, so a still-installed probe could fire afterwards and read the freed strings via ddog_send_debugger_diagnostics(&def->probe, ...) -> probe.id, a use-after-free. Only the valgrind master job observes it (plain runs read the freed-but-intact bytes); confirmed by the .mem from test_extension_ci: [7.1].
Retire removed/replaced boxes into live_debugger.retired instead of dropping them, and free them in ddog_rshutdown_remote_config, by which point all probe hooks are torn down and no PHP user code can fire a probe.
Verified: builds on PHP 7.1 (libdatadog 6a6d4a535); the leaking test passes 10/10 under valgrind with the fix and is non-regressing.
Benchmarks [ tracer ]
Benchmark execution time: 2026-06-30 15:27:25
Comparing candidate commit a7cfd7d in PR branch
Found 0 performance improvements and 1 performance regressions! Performance is the same for 193 metrics, 0 unstable metrics.
…rc rshutdown
Code review found the previous commit freed live_debugger.retired in ddog_rshutdown_remote_config, which runs at the very start of RSHUTDOWN (ext/datadog.c:647) -- BEFORE ddtrace_rshutdown -> dd_force_shutdown_tracing / zai_hook_clean (tracer/ddtrace.c:582,588) tear down the probe hooks that borrow into those boxes. The shutdown-time span flush can itself fire a probe, so the use-after-free window was not actually closed.
Move the free to a new FFI ddog_live_debugger_free_retired, invoked at the end of ddtrace_live_debugger_rshutdown() (tracer/ddtrace.c:618) -- after both zai_hook_clean calls and the active_live_debugger_hooks destroy, i.e. the last point at which a probe can fire. Still per-request, so retired stays bounded.
Verified on PHP 7.1 (libdatadog 6a6d4a535): builds/links; debugger_enable_dynamic_config passes 10/10 under valgrind and the whole live-debugger dir is clean.
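The ordering constraint described in this commit message can be modeled in a few lines. The following is only a sketch under stated assumptions, not the extension's actual code: RequestState, its fields, and its methods are hypothetical stand-ins, and the real teardown happens in C across ext/datadog.c and tracer/ddtrace.c. It shows why freeing retired data in the early remote-config step is unsafe, and why the free has to come after hook teardown.

```rust
/// Hypothetical model of one request's shutdown; all names are illustrative.
#[derive(Default)]
pub struct RequestState {
    /// Number of probe hooks that can still fire and read borrowed strings.
    pub installed_hooks: usize,
    /// Retired config data that installed hooks may still borrow from.
    pub retired: Vec<Box<String>>,
    /// Order in which shutdown steps actually ran.
    pub log: Vec<&'static str>,
}

impl RequestState {
    /// Stand-in for the early remote-config shutdown step.
    /// It deliberately does not touch `retired`: hooks are still installed here.
    pub fn rshutdown_remote_config(&mut self) {
        self.log.push("remote_config");
    }

    /// Stand-in for hook teardown (a shutdown-time flush may fire a probe before this).
    pub fn clean_hooks(&mut self) {
        self.installed_hooks = 0;
        self.log.push("hooks_cleaned");
    }

    /// Stand-in for a "free retired" step: refuses to run while any hook remains.
    pub fn free_retired(&mut self) -> Result<usize, &'static str> {
        if self.installed_hooks != 0 {
            return Err("hooks still installed; retired data may be borrowed");
        }
        let freed = self.retired.len();
        self.retired.clear();
        self.log.push("retired_freed");
        Ok(freed)
    }

    /// Full shutdown in the order the second commit argues for.
    pub fn rshutdown(&mut self) -> Result<usize, &'static str> {
        self.rshutdown_remote_config();
        self.clean_hooks();
        self.free_retired()
    }
}
```

In this model, calling free_retired while a hook is still installed returns an error and leaves the data alone, whereas rshutdown frees it only as the final step. The explicit guard is an addition for illustration; the commit itself relies on call ordering rather than a runtime check.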
My problem with this PR is that it misidentifies the root cause:
And end hooks are anyway guarded with
I don't see where ddog_process_remote_configs was called from the shortened trace, so that gives me no hint.
Problem
test_extension_ci: [7.1] (the valgrind master job) fails with a LEAKED TEST SUMMARY on tests/ext/live-debugger/debugger_enable_dynamic_config.phpt. The .mem artifact shows it is not a leak but a use-after-free:
The freed block is size 1: the 1-byte probe id "1" the test generates.
Root cause
The C side (tracer/live_debugger.c:162, def->probe = *probe) keeps a shallow copy of the FFI ddog_Probe, whose ddog_CharSlice fields (id, capture config, shm limiter, …) borrow strings owned by the boxed RemoteConfigParsed stored in live_debugger.active. The Box<> exists specifically to give that data a stable heap address for these borrows (per its own comment).
On a LiveDebugger Remove (and on an Entry::Occupied re-add) the box was dropped immediately, freeing those strings. But zai_hook_remove can defer the actual hook teardown, so a still-installed probe can fire afterwards and read the freed strings via ddog_send_debugger_diagnostics(&def->probe, …) → UAF.
Latent since the feature landed (f87446c4d, 2024-10); only the valgrind job observes it because plain runs read the freed-but-intact bytes and pass. It is timing-gated, which is why it surfaces in the full shuffled valgrind suite rather than in isolation.
Fix
Retire removed/replaced boxes into a new live_debugger.retired vec instead of dropping them, and free them in ddog_rshutdown_remote_config, by which point all probe hooks are torn down and no PHP user code can fire a probe. retired is cleared every request, so it stays bounded.
Verification
6a6d4a535).
test_extension_ci: [7.1] valgrind job on this branch.
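The retire-instead-of-drop idea from the Fix section can be sketched as follows. This is a minimal illustration under assumptions, not the code in this PR: ParsedConfig is a hypothetical stand-in for the boxed RemoteConfigParsed (only a probe id is modeled), LiveDebuggerState and its method names are invented for the example, and only the field names active and retired come from the description.

```rust
use std::collections::hash_map::{Entry, HashMap};

/// Hypothetical stand-in for the boxed RemoteConfigParsed; only the probe id is modeled.
pub struct ParsedConfig {
    pub probe_id: String,
}

/// Illustrative state holder; `active` and `retired` mirror the names in the description.
#[derive(Default)]
pub struct LiveDebuggerState {
    /// Configs whose strings may be borrowed by currently installed probes.
    pub active: HashMap<String, Box<ParsedConfig>>,
    /// Removed or replaced configs, kept alive until it is safe to free them.
    pub retired: Vec<Box<ParsedConfig>>,
}

impl LiveDebuggerState {
    /// Add a config. If the key is occupied, the old box is retired, not dropped.
    pub fn add(&mut self, path: &str, config: ParsedConfig) {
        match self.active.entry(path.to_string()) {
            Entry::Occupied(mut slot) => {
                // `insert` returns the previous value, which we keep alive.
                let old = slot.insert(Box::new(config));
                self.retired.push(old);
            }
            Entry::Vacant(slot) => {
                slot.insert(Box::new(config));
            }
        }
    }

    /// Remove a config. Moving the box into `retired` leaves its heap data in place.
    pub fn remove(&mut self, path: &str) {
        if let Some(old) = self.active.remove(path) {
            self.retired.push(old);
        }
    }

    /// Drop every retired box and return how many were freed.
    pub fn free_retired(&mut self) -> usize {
        let freed = self.retired.len();
        self.retired.clear();
        freed
    }
}
```

Because a Box is moved by pointer, the string bytes a probe borrowed before remove stay at the same address while the box sits in retired. They become invalid only once free_retired runs, so that call must happen at a point where no probe can fire any more.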