PoC: feat(profiling): add heap-live profiling for memory leak detection #3623

Draft
realFlowControl wants to merge 4 commits into master from florian/heap-live-profiling

@realFlowControl
Member

@realFlowControl realFlowControl commented Feb 4, 2026

Warning

do not merge, this is PoC

Description

Track allocations that survive across profile exports using heap-live-samples and heap-live-size sample types. Samples are emitted in batches at export time.

Enable via DD_PROFILING_HEAP_LIVE_ENABLED or datadog.profiling.heap_live_enabled (disabled by default). Heap-live profiling only works when allocation profiling is active.
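
For readers unfamiliar with the idea, here is a minimal sketch of heap-live tracking, not the code in this PR. It assumes a plain `Mutex<HashMap>` keyed by pointer address; `HeapLiveTracker`, `LiveAllocation`, `stack_id`, and all method names are hypothetical. Sampled allocations are recorded, removed again on free, and whatever is still present at export time is aggregated into one (heap-live-samples, heap-live-size) pair per stack.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical record for one sampled allocation that has not been freed yet.
#[derive(Clone, Copy, Debug, PartialEq)]
struct LiveAllocation {
    size: usize,
    /// Hypothetical stand-in for an interned stack trace.
    stack_id: u32,
}

/// Hypothetical tracker mapping pointer address -> live allocation.
#[derive(Default)]
struct HeapLiveTracker {
    live: Mutex<HashMap<usize, LiveAllocation>>,
}

impl HeapLiveTracker {
    /// Called when a sampled allocation happens.
    fn track(&self, ptr: usize, size: usize, stack_id: u32) {
        self.live
            .lock()
            .unwrap()
            .insert(ptr, LiveAllocation { size, stack_id });
    }

    /// Called on free; a no-op if the pointer was never sampled.
    fn untrack(&self, ptr: usize) {
        self.live.lock().unwrap().remove(&ptr);
    }

    /// Called at export time: one (sample count, total bytes) pair per stack.
    /// Entries are kept so they can show up again in the next export.
    fn collect_batched(&self) -> HashMap<u32, (u64, u64)> {
        let live = self.live.lock().unwrap();
        let mut batch: HashMap<u32, (u64, u64)> = HashMap::new();
        for alloc in live.values() {
            let entry = batch.entry(alloc.stack_id).or_insert((0, 0));
            entry.0 += 1;
            entry.1 += alloc.size as u64;
        }
        batch
    }
}

fn main() {
    let tracker = HeapLiveTracker::default();
    tracker.track(0x1000, 64, 1);
    tracker.track(0x2000, 128, 1);
    tracker.track(0x3000, 32, 2);
    tracker.untrack(0x2000); // freed before the export, so not reported

    let batch = tracker.collect_batched();
    assert_eq!(batch.get(&1), Some(&(1, 64)));
    assert_eq!(batch.get(&2), Some(&(1, 32)));
}
```

An allocation that keeps appearing export after export is a leak candidate, which is the signal the two new sample types are meant to surface.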

Reviewer checklist

  • Test coverage seems ok.
  • Appropriate labels assigned.

PROF-13688

@github-actions github-actions bot added the profiling Relates to the Continuous Profiler label Feb 4, 2026
@datadog-official

datadog-official bot commented Feb 4, 2026

⚠️ Tests

⚠️ Warnings

🧪 1025 Tests failed

  • testSearchPhpBinaries from integration.DDTrace\Tests\Integration\PHPInstallerTest

  • testSimplePushAndProcess from laravel-58-test.DDTrace\Tests\Integrations\Laravel\V5_8\QueueTest

  • testSimplePushAndProcess from laravel-8x-test.DDTrace\Tests\Integrations\Laravel\V8_x\QueueTest
DDTrace\Tests\Integrations\Laravel\V8_x\QueueTest::testSimplePushAndProcess
Test code or tested code printed unexpected output: spanLinksTraceId: 6985a03e0000000047109dd5691fb173
tid: 6985a03e00000000
hexProcessTraceId: 47109dd5691fb173
hexProcessSpanId: c8ecadb5a393d023
processTraceId: 5120766316237533555
processSpanId: 14478137897734361123

phpvfscomposer://tests/vendor/phpunit/phpunit/phpunit:106

ℹ️ Info

❄️ No new flaky tests detected

🔗 Commit SHA: 49b5835

Track allocations that survive across profile exports using heap-live-samples
and heap-live-size sample types. Samples are emitted in batches at export time.

Enabled via DD_PROFILING_HEAP_LIVE_ENABLED when allocation profiling is active.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@pr-commenter

pr-commenter bot commented Feb 4, 2026

Benchmarks [ profiler ]

Benchmark execution time: 2026-02-06 08:06:24

Comparing candidate commit 49b5835 in PR branch florian/heap-live-profiling with baseline commit 6843f96 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 28 metrics, 8 unstable metrics.

@realFlowControl realFlowControl force-pushed the florian/heap-live-profiling branch from bc087f2 to 817465a Compare February 4, 2026 21:03
@codecov-commenter

codecov-commenter commented Feb 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 62.14%. Comparing base (6843f96) to head (49b5835).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3623      +/-   ##
==========================================
- Coverage   62.21%   62.14%   -0.08%     
==========================================
  Files         141      141              
  Lines       13387    13387              
  Branches     1753     1753              
==========================================
- Hits         8329     8319      -10     
- Misses       4260     4270      +10     
  Partials      798      798              

see 4 files with indirect coverage changes


realFlowControl and others added 2 commits February 5, 2026 08:09
- Use functional style (map + match) in collect_batched_heap_live_samples
- Only create ProfileIndex when heap-live tracking is enabled
- Replace 32 repetitive I/O profiling lines with a loop
- Use filter_map in sample type filter method
- Add early bail-out in free_allocation when heap-live is disabled

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace default SipHash with a simple bit-mixing hasher optimized for
pointer addresses. Since pointers are already well-distributed, we use
`ptr ^ (ptr >> 4)` instead of expensive cryptographic hashing.

This reduces overhead in untrack_allocation() which is called on every free.
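
As an illustration of the bit-mixing idea, a pointer-keyed hasher can be sketched against the standard `std::hash::Hasher` trait. This is an assumption-laden sketch rather than the hasher from this commit: `PtrHasher` and `PtrMap` are hypothetical names, a std `HashMap` stands in for the map the profiler uses, and the byte-wise fallback in `write` is only there to satisfy the trait.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hypothetical hasher for pointer-sized keys: mixes with `ptr ^ (ptr >> 4)`.
#[derive(Default)]
struct PtrHasher(u64);

impl Hasher for PtrHasher {
    fn finish(&self) -> u64 {
        self.0
    }

    /// Fallback for keys that are not hashed as a single integer.
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 = self.0.rotate_left(8) ^ u64::from(b);
        }
    }

    /// `usize` keys are hashed through this method, so the hot path is
    /// one shift and one xor.
    fn write_usize(&mut self, ptr: usize) {
        let p = ptr as u64;
        // Allocator alignment tends to leave the low bits zero; folding the
        // upper bits down lets them contribute to bucket selection.
        self.0 = p ^ (p >> 4);
    }
}

/// Hypothetical alias: a map keyed by pointer address using the cheap hasher.
type PtrMap<V> = HashMap<usize, V, BuildHasherDefault<PtrHasher>>;

fn main() {
    let mut live: PtrMap<usize> = PtrMap::default();
    live.insert(0x7f00_0000_1000, 64);
    live.insert(0x7f00_0000_2000, 128);

    // Simulate free(): removing a tracked pointer yields its recorded size.
    assert_eq!(live.remove(&0x7f00_0000_1000), Some(64));
    assert_eq!(live.len(), 1);
}
```

The trade-off is that this hasher offers no protection against adversarial keys, which is acceptable only because the keys here are addresses handed out by the allocator rather than user input.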

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@pr-commenter

pr-commenter bot commented Feb 5, 2026

Benchmarks [ tracer ]

Benchmark execution time: 2026-02-06 08:47:53

Comparing candidate commit 49b5835 in PR branch florian/heap-live-profiling with baseline commit 6843f96 in branch master.

Found 1 performance improvement and 4 performance regressions! Performance is the same for 186 metrics, 3 unstable metrics.

scenario:ComposerTelemetryBench/benchTelemetryParsing

  • 🟩 execution_time [-1.897µs; -1.103µs] or [-14.823%; -8.614%]

scenario:SamplingRuleMatchingBench/benchRegexMatching1

  • 🟥 execution_time [+112.809ns; +149.391ns] or [+9.856%; +13.052%]

scenario:SamplingRuleMatchingBench/benchRegexMatching2

  • 🟥 execution_time [+102.670ns; +152.930ns] or [+8.935%; +13.309%]

scenario:SamplingRuleMatchingBench/benchRegexMatching3

  • 🟥 execution_time [+76.869ns; +117.931ns] or [+6.586%; +10.105%]

scenario:SamplingRuleMatchingBench/benchRegexMatching4

  • 🟥 execution_time [+108.622ns; +151.178ns] or [+9.463%; +13.171%]

…p-live

Add an AllocationFilter (lock-free bloom filter) that checks if a pointer
could possibly be tracked before doing the expensive DashMap lookup in
free_allocation(). Uses atomic operations with Relaxed ordering — no
locks needed.

This provides a fast path for 99.9%+ of free() calls that are for
non-tracked allocations, reducing overhead from hash computation and
lock acquisition to just two atomic loads and bit tests.

- AllocationFilter: 4KB fixed-size array of AtomicU64 (32768 bits)
- Two hash functions for ~5% false positive rate at max capacity
- Completely lock-free: fetch_or to set bits, load to test
- Mark filter BEFORE DashMap insert to avoid false negatives
- False positives are acceptable (just an extra DashMap lookup)
- Cleared on profile export and fork
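
The bullet points above can be sketched roughly as follows. This is a minimal illustration, not the PR's `AllocationFilter`: the sizes follow the commit message (512 `AtomicU64` words = 4 KB = 32768 bits), but the two hash functions, the method names `mark`, `might_contain`, and `clear`, and the multiplier constant are assumptions made for the example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

const WORDS: usize = 512; // 512 words * 8 bytes = 4 KB = 32768 bits
const BITS: u64 = (WORDS as u64) * 64;

/// Sketch of a lock-free bloom filter over pointer addresses.
struct AllocationFilter {
    bits: [AtomicU64; WORDS],
}

impl AllocationFilter {
    fn new() -> Self {
        Self {
            bits: std::array::from_fn(|_| AtomicU64::new(0)),
        }
    }

    /// Two bit positions per pointer. Both hash functions are assumptions.
    fn positions(ptr: usize) -> [u64; 2] {
        let p = ptr as u64;
        let h1 = p ^ (p >> 4);
        let h2 = p.wrapping_mul(0x9E37_79B9_7F4A_7C15) >> 32;
        [h1 % BITS, h2 % BITS]
    }

    /// Set both bits. Call this BEFORE inserting into the map, so a
    /// concurrent free never sees a tracked pointer as "definitely absent".
    fn mark(&self, ptr: usize) {
        for bit in Self::positions(ptr) {
            let mask = 1u64 << (bit % 64);
            self.bits[(bit / 64) as usize].fetch_or(mask, Ordering::Relaxed);
        }
    }

    /// `false` means the pointer was never marked; `true` means "maybe".
    fn might_contain(&self, ptr: usize) -> bool {
        Self::positions(ptr).iter().all(|&bit| {
            let mask = 1u64 << (bit % 64);
            (self.bits[(bit / 64) as usize].load(Ordering::Relaxed) & mask) != 0
        })
    }

    /// Reset every word, for example on profile export or after fork.
    fn clear(&self) {
        for word in &self.bits {
            word.store(0, Ordering::Relaxed);
        }
    }
}

fn main() {
    let filter = AllocationFilter::new();
    assert!(!filter.might_contain(0x7f00_0000_1000)); // empty filter: fast path

    filter.mark(0x7f00_0000_1000);
    assert!(filter.might_contain(0x7f00_0000_1000)); // marked: do the map lookup

    filter.clear();
    assert!(!filter.might_contain(0x7f00_0000_1000));
}
```

On the free path, a `false` from `might_contain` lets the caller return immediately, while a `true` only costs the map lookup that would have happened anyway.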

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>