Harden the view layer against malformed input (found via real arm64 binaries)#37

Merged: everettjf merged 2 commits into master from claude/review-and-plan-g4p9o on May 23, 2026

@everettjf
Owner

Summary

I sourced real macOS arm64 binaries by downloading arm64 Python wheels from PyPI (their .so extensions are real Mach-O arm64 files), then ran the full view-node dumper under ASan/UBSan over them plus thousands of truncated and byte-flipped variants. This surfaced a batch of real crashes, all fixed in this PR, that the header-only moex-parse and the bundled samples never reached:

  • Invalid downcast in the rebase opcode view (REBASE_OPCODE_DO_REBASE_ADD_ADDR_ULEB was cast to the ADD_ADDR_ULEB wrapper).
  • Out-of-bounds table walks on truncated/crafted input, now all clamped to the mapped file: symbol table (nlist), indirect symbols & relocations, data-in-code, function starts, string table, section data sizes, and the DYLD_INFO rebase/bind opcode streams.
  • nsects not bounded to the segment command size → section structs read past the file during node construction.
  • Stack overflow from unbounded view-tree recursion (default max_depth=0) in the dumper and the GUI tree build → both now cap recursion depth.
  • ParseStringLiteral now only returns NUL-terminated strings (safe for std::string/strlen).
  • Signed LEB128 sign-extension left-shifted a negative value (UB) → now uses an unsigned shift.
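
To illustrate the kind of clamping described in the table-walk bullet, here is a minimal sketch. It is not the code from this PR: `ClampEntryCount` and its parameters are hypothetical names, and the sketch assumes a table of fixed-size entries whose offset and count come straight from an untrusted load command.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not taken from the PR): given a table that claims to
// hold `declared_count` entries of `entry_size` bytes starting at `offset`,
// return how many of those entries actually fit inside a mapping of
// `file_size` bytes.
inline std::uint64_t ClampEntryCount(std::uint64_t file_size,
                                     std::uint64_t offset,
                                     std::uint64_t declared_count,
                                     std::uint64_t entry_size) {
    if (entry_size == 0 || offset >= file_size) {
        return 0;  // nothing can be read safely
    }
    // Divide rather than multiply, so a huge declared_count cannot overflow.
    const std::uint64_t max_entries = (file_size - offset) / entry_size;
    return declared_count < max_entries ? declared_count : max_entries;
}
```

For a 64-bit symbol table, `entry_size` would be the size of an `nlist_64` record (16 bytes). Bounding the count before allocating or iterating is also one plausible way a check like this removes malformed-input OOM, since a crafted count can no longer drive a large allocation.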

Verified

  • ~1300 ASan/UBSan fuzz iterations over the real arm64 binaries: no OOB access, no OOM (the count-bounding also eliminated malformed-input OOM).
  • Code Directory/cdhash (SHA-256 path) validated on a real signed arm64 binary as a side effect.
  • Full tests/regression/run_all.sh passes; builds with Qt6 + Capstone.

Known remaining (tracked, not fixed here)

A pervasive but benign misaligned-read UB from raw-pointer dereferences into the mapping (does not fault on x86_64/arm64); fully eliminating it needs the NodeData migration.
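
For context, one common way to read a field at an arbitrary byte offset without misaligned-access UB is to copy it out with `std::memcpy` instead of dereferencing a cast pointer. The sketch below is an assumption about the general technique, not a description of what the NodeData migration will do; `ReadU32` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies a uint32_t (in host byte order) out of a mapped
// buffer at any byte offset. Returns false if fewer than 4 bytes remain.
inline bool ReadU32(const unsigned char* base, std::size_t size,
                    std::size_t offset, std::uint32_t* out) {
    if (offset > size || size - offset < sizeof(*out)) {
        return false;  // read would run past the end of the mapping
    }
    // memcpy has no alignment requirement, unlike *(const uint32_t*)(base + offset).
    std::memcpy(out, base + offset, sizeof(*out));
    return true;
}
```

Compilers typically lower a fixed-size `memcpy` like this to a single load on x86_64 and arm64, so the bounds-checked copy usually costs little over the raw dereference.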

https://claude.ai/code/session_013kBiVXftgoEsyGVyrvfGok


Generated by Claude Code

claude added 2 commits May 23, 2026 20:22
…inaries)

Ran the full view-node dumper under ASan/UBSan over real macOS arm64 binaries
(Python extension .so from arm64 wheels) plus thousands of truncated and
byte-flipped variants. This surfaced a batch of real crashes, all fixed in this
commit, that the header-only moex-parse and the bundled samples never reached:

- Wrong static_cast in the rebase opcode view
  (REBASE_OPCODE_DO_REBASE_ADD_ADDR_ULEB cast to the ADD_ADDR_ULEB wrapper):
  an undefined downcast.
- Unbounded table walks reading past a truncated/crafted file: symbol table
  (nlist), indirect symbols & relocations, data-in-code, function starts,
  string table, section data sizes, and the DYLD_INFO rebase/bind opcode
  streams. Each is now clamped to the mapped file.
- nsects not bounded to the segment command size, so section structs could be
  read past the file during node construction.
- Stack overflow from unbounded view-tree recursion (default max_depth=0) in
  the dumper walks and the GUI tree build; both now cap recursion depth.
- ParseStringLiteral now only returns NUL-terminated strings so callers can
  safely treat them as C strings.
- Signed LEB128 sign-extension used a negative left shift (UB); use an
  unsigned shift.
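
As a sketch of the LEB128 item, the decoder below sign-extends using an unsigned shift and also stops at the end of the buffer. It is an illustrative assumption, not the project's implementation; `ReadSleb128` and its signature are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical signed LEB128 decoder. Returns false on truncated input or on
// an encoding too long for 64 bits.
inline bool ReadSleb128(const std::uint8_t* p, const std::uint8_t* end,
                        std::int64_t* out, std::size_t* consumed) {
    std::uint64_t result = 0;
    unsigned shift = 0;
    const std::uint8_t* cur = p;
    while (cur < end) {
        if (shift >= 64) {
            return false;  // more continuation bytes than 64 bits can hold
        }
        const std::uint8_t byte = *cur++;
        result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        shift += 7;
        if ((byte & 0x80) == 0) {
            // Sign-extend with an unsigned operand. Left-shifting a negative
            // signed value (e.g. -1 << shift) is undefined before C++20.
            if (shift < 64 && (byte & 0x40) != 0) {
                result |= ~std::uint64_t{0} << shift;
            }
            // Conversion to signed is modular on two's-complement targets
            // (guaranteed from C++20, implementation-defined before).
            *out = static_cast<std::int64_t>(result);
            *consumed = static_cast<std::size_t>(cur - p);
            return true;
        }
    }
    return false;  // ran off the end of the buffer
}
```

Doing all the arithmetic in `std::uint64_t` and converting once at the end keeps every shift well defined regardless of the sign of the decoded value.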

After the fixes, ~1300 ASan/UBSan fuzz iterations over the real binaries report
no out-of-bounds access and no out-of-memory. (A pervasive but benign
misaligned-read UB from raw pointer dereferences into the mapping remains; it
does not fault on x86_64/arm64 and is tracked for the NodeData migration.)

https://claude.ai/code/session_013kBiVXftgoEsyGVyrvfGok
@everettjf everettjf merged commit 665c9ed into master May 23, 2026
1 check passed