Collaborator

@jbachorik jbachorik commented Feb 10, 2026

What does this PR do?:

Fixes critical alignment issues in stack walking that were causing ASAN violations during nightly runs.

Primary fixes:

  • Adds alignment checks before all pointer dereferences in stack unwinding - prevents misaligned memory access that triggers undefined behavior and ASAN errors
  • Adds frame size validation - prevents walking into invalid memory regions with corrupted frame sizes

Supporting improvements:

  • Adds deoptimization check before walking compiled frames
  • Applies SafeAccess::load consistently in DWARF unwinding paths
  • Enables ASAN/TSAN build configurations in Docker test script
  • Fixes nightly workflow's report-failures job artifact collection

Motivation:

A recent nightly ASAN run identified alignment violations in stack walking - https://github.com/DataDog/java-profiler/actions/runs/21851118212/job/63057859995#step:9:363

The issue occurred when sp (stack pointer) was not properly aligned before being dereferenced as a pointer. Without alignment checks, the profiler could attempt to read from misaligned addresses, causing:

  • Undefined behavior on architectures that require aligned access
  • ASAN violations flagging the memory safety issue
  • Potential crashes or data corruption

This PR systematically adds alignment validation at all pointer dereference sites in the stack walking code paths (walkVM, walkDwarf), ensuring memory is only accessed at properly aligned boundaries.
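As an illustration of the kind of guard described above, here is a minimal sketch of an alignment check placed in front of a pointer-sized read. The helper names `isAligned` and `readWordIfAligned` are hypothetical and are not taken from the profiler sources. The real code paths (walkVM, walkDwarf) may structure the check differently and route the load itself through SafeAccess::load.

```cpp
#include <stdint.h>

// Hypothetical helper: true when addr is a multiple of alignment.
// alignment must be a power of two for the mask trick to be valid.
static inline bool isAligned(uintptr_t addr, uintptr_t alignment = sizeof(void*)) {
    return (addr & (alignment - 1)) == 0;
}

// Hypothetical guarded read: loads one word from sp only when sp is
// word-aligned. Returns false (and leaves *out untouched) otherwise,
// so the caller can stop unwinding instead of dereferencing a bad pointer.
static inline bool readWordIfAligned(uintptr_t sp, uintptr_t* out) {
    if (!isAligned(sp)) {
        return false;
    }
    *out = *reinterpret_cast<const uintptr_t*>(sp);
    return true;
}
```

A caller would treat a `false` result as "this frame cannot be trusted" and end the walk there. Dereferencing a misaligned pointer is undefined behavior in C++ even on hardware that tolerates the access, which is why the check comes before the load rather than after it.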

Additional Notes:

The alignment checks follow a pattern that already exists in the codebase but was not applied consistently across all unwinding paths. Frame size validation prevents walking beyond reasonable frame boundaries (MAX_FRAME_SIZE_WORDS = 32768 words).
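The frame size bound can be sketched as follows. Only the constant name and its value (32768 words) come from this description. The function `advanceSp`, its signature, and the overflow check are assumptions made for illustration. On a 64-bit target, 32768 words corresponds to 256 KiB.

```cpp
#include <stddef.h>
#include <stdint.h>

// Value quoted from the PR description; the surrounding code is a sketch.
constexpr size_t MAX_FRAME_SIZE_WORDS = 32768;

// Hypothetical helper: computes the caller's stack pointer from the current
// sp and a frame size expressed in words. Returns false, without writing
// *nextSp, when the frame size exceeds the bound or the addition would wrap.
static inline bool advanceSp(uintptr_t sp, size_t frameSizeWords, uintptr_t* nextSp) {
    if (frameSizeWords > MAX_FRAME_SIZE_WORDS) {
        return false;  // implausibly large frame, likely corrupted metadata
    }
    const uintptr_t delta = static_cast<uintptr_t>(frameSizeWords) * sizeof(uintptr_t);
    if (sp > UINTPTR_MAX - delta) {
        return false;  // sp + delta would overflow the address space
    }
    *nextSp = sp + delta;
    return true;
}
```

Rejecting the frame before computing the next stack pointer keeps a corrupted size from steering the walker into unrelated memory. The unwinder can then stop and report the frames collected so far.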

How to test the change?:

  1. Run local ASAN tests: ./gradlew testAsan
  2. Run Docker-based ASAN tests: ./utils/run-docker-tests.sh --config=asan --jdk=21
  3. Verify no ASAN violations are reported
  4. Existing tests should pass without degradation

For Datadog employees:

  • If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.
  • JIRA: PROF-13712

- Add frame size validation and alignment checks in stackWalker
- Replace raw pointer dereferences with SafeAccess::load in signal-critical paths
- Add deoptimization check before walking compiled frames
- Enable ASAN/TSAN configs in Docker test script
- Document Docker-based testing workflow in CLAUDE.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@jbachorik jbachorik added the AI label Feb 10, 2026

dd-octo-sts bot commented Feb 10, 2026

CI Test Results

Run: #21863805109 | Commit: c10a217 | Duration: 13m 30s (longest job)

All 40 test jobs passed

Status Overview

Platform 8 8-ibm 8-j9 8-libr 8-orcl 8-zing 11 11-j9 11-lib 11-zin 17 17-gra 17-j9 17-lib 17-zin 21 21-gra 21-lib 21-zin 25 25-gra 25-lib
glibc-aarch64/debug - - - - - - - - -
glibc-amd64/debug - - - - -
musl-aarch64/debug - - - - - - - - - - - - - - - - -
musl-amd64/debug - - - - - - - - - - - - - - - - -

Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled

Summary

Metric Value
Total jobs 40
Passed 40
Failed 0

Updated: 2026-02-10 12:22:23 UTC

@jbachorik jbachorik changed the title from "Add SafeAccess protection and ASAN/TSAN Docker test support" to "[WIP] Add SafeAccess protection and ASAN/TSAN Docker test support" Feb 10, 2026
jbachorik and others added 2 commits February 10, 2026 12:21
- Use pattern matching to download all failures-* artifacts
- Add merge-multiple flag to combine artifacts
- Add graceful handling for missing failure files
- Fix JSON format in Slack webhook payload
- Add webhook existence check before posting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
MAX_FRAME_SIZE is in the StackWalkValidation namespace, so it needs explicit qualification when used before the using declarations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
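The qualification issue named in the commit message above can be reproduced in isolation. In this sketch the namespace and constant names come from the commit message. The value 32768 and the two `kLimit...` constants are placeholders for illustration only.

```cpp
#include <stddef.h>

namespace StackWalkValidation {
// Placeholder value for illustration; the real definition lives in the profiler.
constexpr size_t MAX_FRAME_SIZE = 32768;
}

// Before any using-declaration is in scope, the name must be fully qualified.
// Writing plain MAX_FRAME_SIZE here would fail to compile.
constexpr size_t kLimitQualified = StackWalkValidation::MAX_FRAME_SIZE;

using StackWalkValidation::MAX_FRAME_SIZE;

// From this point on, the unqualified name resolves through the using-declaration.
constexpr size_t kLimitUnqualified = MAX_FRAME_SIZE;

static_assert(kLimitQualified == kLimitUnqualified,
              "both spellings refer to the same constant");
```

A using-declaration only affects name lookup after the point where it appears, so any earlier use in the same file has to spell out the namespace.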
@jbachorik jbachorik changed the title from "[WIP] Add SafeAccess protection and ASAN/TSAN Docker test support" to "Fix stack walking alignment issues causing ASAN violations" Feb 10, 2026

dd-octo-sts bot commented Feb 10, 2026

Scan-Build Report

User: runner@runnervmwffz4
Working Directory: /home/runner/work/java-profiler/java-profiler/ddprof-lib/src/test/make
Command Line: make -j4 all
Clang Version: Ubuntu clang version 18.1.3 (1ubuntu1)
Date: Tue Feb 10 11:54:45 2026

Bug Summary

Bug Type | Quantity
All Bugs | 1
Unused code: Dead assignment | 1

Reports

Bug Group | Bug Type | File | Function/Method | Line | Path Length
Unused code | Dead assignment | libraryPatcher_linux.cpp | patch_library_unlocked | 94 | 1

pr-commenter bot commented Feb 10, 2026

Integration Tests

All 40 integration tests passed

📊 Dashboard · 👷 Pipeline · 📦 0fc80262

pr-commenter bot commented Feb 10, 2026

Benchmarks [aarch64 memleak,alloc]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc on on
cpu off off
iterations 5 5
java "11.0.28" "11.0.28"
memleak on on
modes memleak,alloc memleak,alloc
wall off off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 19 metrics, 0 unstable metrics.

@jbachorik jbachorik marked this pull request as ready for review February 10, 2026 12:49
@jbachorik jbachorik requested a review from a team as a code owner February 10, 2026 12:49

pr-commenter bot commented Feb 10, 2026

Benchmarks [x86_64 alloc]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc on on
cpu off off
iterations 5 5
java "11.0.28" "11.0.28"
memleak off off
modes alloc alloc
wall off off

Summary

Found 1 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 22 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:philosophers | better: [-2.786s; -1.450s] or [-18.703%; -9.738%] | unstable: [-596.767MB; +400.427MB] or [-53.031%; +35.584%]

pr-commenter bot commented Feb 10, 2026

Benchmarks [aarch64 wall]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc off off
cpu off off
iterations 5 5
java "11.0.28" "11.0.28"
memleak off off
modes wall wall
wall on on

Summary

Found 1 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 20 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:scala-doku | better: [-3.984s; -1.332s] or [-13.637%; -4.561%] | unstable: [-252815.989KB; +251181.411KB] or [-22.893%; +22.745%]

pr-commenter bot commented Feb 10, 2026

Benchmarks [x86_64 cpu,wall,alloc,memleak]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc on on
cpu on on
iterations 5 5
java "11.0.28" "11.0.28"
memleak on on
modes cpu,wall,alloc,memleak cpu,wall,alloc,memleak
wall on on

Summary

Found 1 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 23 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:philosophers | better: [-1443.066ms; -248.934ms] or [-10.219%; -1.763%] | unstable: [-533853.323KB; +532830.415KB] or [-51.753%; +51.654%]

pr-commenter bot commented Feb 10, 2026

Benchmarks [aarch64 cpu]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc off off
cpu on on
iterations 5 5
java "11.0.28" "11.0.28"
memleak off off
modes cpu cpu
wall off off

Summary

Found 0 performance improvements and 1 performance regressions! Performance is the same for 16 metrics, 21 unstable metrics.

scenario | Δ mean execution_time | Δ mean rss
scenario:renaissance:naive-bayes | worse: [+454.356ms; +681.644ms] or [+3.126%; +4.689%] | unstable: [-524802.236KB; +526082.373KB] or [-52.149%; +52.277%]

pr-commenter bot commented Feb 10, 2026

Benchmarks [x86_64 memleak,alloc]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc on on
cpu off off
iterations 5 5
java "11.0.28" "11.0.28"
memleak on on
modes memleak,alloc memleak,alloc
wall off off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 23 unstable metrics.

pr-commenter bot commented Feb 10, 2026

Benchmarks [x86_64 cpu]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc off off
cpu on on
iterations 5 5
java "11.0.28" "11.0.28"
memleak off off
modes cpu cpu
wall off off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 23 unstable metrics.

pr-commenter bot commented Feb 10, 2026

Benchmarks [aarch64 cpu,wall]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc off off
cpu on on
iterations 5 5
java "11.0.28" "11.0.28"
memleak off off
modes cpu,wall cpu,wall
wall on on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 21 unstable metrics.

pr-commenter bot commented Feb 10, 2026

Benchmarks [aarch64 memleak]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc off off
cpu off off
iterations 5 5
java "11.0.28" "11.0.28"
memleak on on
modes memleak memleak
wall off off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 23 unstable metrics.

pr-commenter bot commented Feb 10, 2026

Benchmarks [aarch64 cpu,wall,alloc,memleak]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc on on
cpu on on
iterations 5 5
java "11.0.28" "11.0.28"
memleak on on
modes cpu,wall,alloc,memleak cpu,wall,alloc,memleak
wall on on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 23 unstable metrics.

pr-commenter bot commented Feb 10, 2026

Benchmarks [aarch64 alloc]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc on on
cpu off off
iterations 5 5
java "11.0.28" "11.0.28"
memleak off off
modes alloc alloc
wall off off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 16 metrics, 22 unstable metrics.

pr-commenter bot commented Feb 10, 2026

Benchmarks [x86_64 wall]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc off off
cpu off off
iterations 5 5
java "11.0.28" "11.0.28"
memleak off off
modes wall wall
wall on on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 24 unstable metrics.

pr-commenter bot commented Feb 10, 2026

Benchmarks [x86_64 memleak]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc off off
cpu off off
iterations 5 5
java "11.0.28" "11.0.28"
memleak on on
modes memleak memleak
wall off off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 24 unstable metrics.

pr-commenter bot commented Feb 10, 2026

Benchmarks [x86_64 cpu,wall]

Parameters

Baseline Candidate
config baseline candidate
ddprof 1.37.0 1.38.0-jb_asan-SNAPSHOT
See matching parameters
Baseline Candidate
alloc off off
cpu on on
iterations 5 5
java "11.0.28" "11.0.28"
memleak off off
modes cpu,wall cpu,wall
wall on on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 24 unstable metrics.
