Harden view-node dumping against malformed input #30

Merged
everettjf merged 1 commit into master from claude/review-and-plan-g4p9o
May 22, 2026

@everettjf
Owner

Summary

Extends the crash regression to run the full --cli view-node dumper (text and JSON) over the malformed corpus, exercising the rich parsing layer (load commands, symbols, code signature, ObjC/Swift metadata) that moex-parse alone never reaches. This immediately surfaced — and this PR fixes — two real crashes on truncated binaries:

  • SIGSEGV: GetStringByStrX built a std::string straight from the string-table pointer, so a truncated or oversized strx ran strlen past the end of the mapping. Now bounded to both the string-table size and the mapped file via memchr. Affects every node that resolves symbol names (symbol table, disassembly, xref, swift).
  • SIGABRT: JSON export aborted because the bundled nlohmann::json (3.1.2) throws on invalid UTF-8, and cell values can contain raw bytes. All strings written to JSON are now sanitized to valid UTF-8 (malformed bytes replaced).
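
The bounded string-table read described in the first item can be sketched roughly as follows. This is a minimal illustration, not the project's actual `GetStringByStrX`: the function name `BoundedStringAt` and all of its parameters are hypothetical, and it assumes the caller knows the mapped file's base and size plus the string table's offset and declared size.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper (assumption, not the real GetStringByStrX).
// file_base/file_size describe the mapped file; strtab_off/strtab_size are
// the string table's location as declared by the untrusted binary.
std::string BoundedStringAt(const char* file_base, std::size_t file_size,
                            std::size_t strtab_off, std::size_t strtab_size,
                            std::uint32_t strx) {
    // The table must start inside the mapping at all.
    if (file_base == nullptr || strtab_off >= file_size) return {};
    // Clamp the declared table size to the bytes that are actually mapped.
    const std::size_t table_len = std::min(strtab_size, file_size - strtab_off);
    // Reject an index at or beyond the end of the clamped table.
    if (strx >= table_len) return {};
    const char* start = file_base + strtab_off + strx;
    const std::size_t max_len = table_len - strx;
    // Unlike strlen, memchr never reads more than max_len bytes.
    const void* nul = std::memchr(start, '\0', max_len);
    const std::size_t len =
        nul ? static_cast<std::size_t>(static_cast<const char*>(nul) - start)
            : max_len;  // unterminated: keep only the bytes that are in bounds
    return std::string(start, len);
}
```

Returning an empty string for an out-of-range `strx` is a design choice made for this sketch; the real fix may report the error differently.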

Test plan

  • tests/regression/run_all.sh passes (crash regression now includes 28 view-node dumper runs)
  • Stress: both samples × 15 truncation points × 2 formats (60 combos) — zero crashes
  • Stress: 120 byte-flip mutations of the fat sample via the dumper — zero crashes
  • Builds clean on Linux with Qt6 + Capstone

https://claude.ai/code/session_013kBiVXftgoEsyGVyrvfGok


Generated by Claude Code

Extend the crash regression to also run the full --cli view-node dumper (text
and JSON) over the malformed corpus, which exercises the rich parsing layer
(load commands, symbols, code signature, ObjC/Swift metadata) that moex-parse
alone never reaches. This immediately surfaced two crashes on truncated
binaries:

- GetStringByStrX built a std::string straight from the string-table pointer,
  so a truncated or oversized strx ran strlen off the end of the mapping. Bound
  the read to both the string-table size and the mapped file via memchr.
- JSON export aborted because nlohmann::json (3.1.2) throws on invalid UTF-8,
  and cell values can contain raw bytes. Sanitize every string written to JSON
  to valid UTF-8, replacing malformed bytes.
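
The UTF-8 sanitization in the second item could look something like the sketch below. It is an illustration under stated assumptions, not the code merged here: the name `SanitizeUtf8` is hypothetical, and replacing each malformed byte with U+FFFD is an assumed policy, since the description only says that malformed bytes are replaced.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (assumption): returns a copy of `in` in which every
// byte that is not part of a well-formed UTF-8 sequence becomes U+FFFD.
std::string SanitizeUtf8(const std::string& in) {
    static const char kReplacement[] = "\xEF\xBF\xBD";  // U+FFFD in UTF-8
    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned b0 = static_cast<unsigned char>(in[i]);
        std::size_t len = 0;           // 0 means "invalid lead byte"
        unsigned lo = 0x80, hi = 0xBF; // allowed range of the second byte
        if (b0 <= 0x7F) {
            len = 1;
        } else if (b0 >= 0xC2 && b0 <= 0xDF) {
            len = 2;
        } else if (b0 >= 0xE0 && b0 <= 0xEF) {
            len = 3;
            if (b0 == 0xE0) lo = 0xA0; // reject overlong encodings
            if (b0 == 0xED) hi = 0x9F; // reject UTF-16 surrogates
        } else if (b0 >= 0xF0 && b0 <= 0xF4) {
            len = 4;
            if (b0 == 0xF0) lo = 0x90; // reject overlong encodings
            if (b0 == 0xF4) hi = 0x8F; // reject code points above U+10FFFF
        }
        // The whole sequence must fit and every continuation byte must be in range.
        bool ok = len != 0 && i + len <= in.size();
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned b = static_cast<unsigned char>(in[i + k]);
            const unsigned first = (k == 1) ? lo : 0x80;
            const unsigned last = (k == 1) ? hi : 0xBF;
            ok = (b >= first && b <= last);
        }
        if (ok) {
            out.append(in, i, len);
            i += len;
        } else {
            out += kReplacement;       // replace one bad byte, then resync
            i += 1;
        }
    }
    return out;
}
```

Passing every cell value through a function like this before handing it to the JSON library means the serializer only ever sees valid UTF-8, so it has nothing to throw on.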

https://claude.ai/code/session_013kBiVXftgoEsyGVyrvfGok

@everettjf merged commit 4d0909e into master on May 22, 2026
1 check passed
2 participants