Harden view-node dumping against malformed input #30
Merged
Extend the crash regression to also run the full --cli view-node dumper (text and JSON) over the malformed corpus, which exercises the rich parsing layer (load commands, symbols, code signature, ObjC/Swift metadata) that moex-parse alone never reaches.

This immediately surfaced two crashes on truncated binaries:

- GetStringByStrX built a std::string straight from the string-table pointer, so a truncated or oversized strx ran strlen off the end of the mapping. Bound the read to both the string-table size and the mapped file via memchr.
- JSON export aborted because nlohmann::json (3.1.2) throws on invalid UTF-8, and cell values can contain raw bytes. Sanitize every string written to JSON to valid UTF-8, replacing malformed bytes.

https://claude.ai/code/session_013kBiVXftgoEsyGVyrvfGok
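The bounded string-table read described in the first fix can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `BoundedStrX` and all of its parameters are hypothetical names, and it assumes the caller already knows the mapped file size plus the string table's offset and size.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper (not the project's actual GetStringByStrX).
// file_base/file_size describe the mapped file; strtab_off/strtab_size
// are the string table's offset and size as read from the binary.
std::string BoundedStrX(const char* file_base, std::size_t file_size,
                        std::uint64_t strtab_off, std::uint64_t strtab_size,
                        std::uint64_t strx) {
    // A string table that starts outside the mapping cannot be read at all.
    if (file_base == nullptr || strtab_off >= file_size) return {};

    // Clamp the declared table size to the bytes that are actually mapped,
    // so a truncated file cannot push the read past the end of the mapping.
    const std::uint64_t avail = file_size - strtab_off;
    const std::uint64_t limit = strtab_size < avail ? strtab_size : avail;

    // An oversized strx points past the (clamped) table.
    if (strx >= limit) return {};

    const char* start = file_base + strtab_off + strx;
    const std::size_t max_len = static_cast<std::size_t>(limit - strx);

    // memchr never looks beyond max_len bytes, unlike strlen.
    const void* nul = std::memchr(start, '\0', max_len);
    const std::size_t len =
        nul ? static_cast<std::size_t>(static_cast<const char*>(nul) - start)
            : max_len;  // unterminated: take what is left of the table
    return std::string(start, len);
}
```

Returning an empty string for out-of-range input is one possible policy; a real implementation might prefer an explicit error or a placeholder name so that the dumper can flag the bad index.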
Summary
Extends the crash regression to run the full `--cli` view-node dumper (text and JSON) over the malformed corpus, exercising the rich parsing layer (load commands, symbols, code signature, ObjC/Swift metadata) that `moex-parse` alone never reaches. This immediately surfaced two real crashes on truncated binaries, both fixed in this PR:

- `GetStringByStrX` built a `std::string` straight from the string-table pointer, so a truncated or oversized `strx` ran `strlen` past the end of the mapping. Now bounded to both the string-table size and the mapped file via `memchr`. Affects every node that resolves symbol names (symbol table, disassembly, xref, swift).
- `nlohmann::json` (3.1.2) throws on invalid UTF-8, and cell values can contain raw bytes. All strings written to JSON are now sanitized to valid UTF-8 (malformed bytes replaced).

Test plan
- `tests/regression/run_all.sh` passes (crash regression now includes 28 view-node dumper runs)

https://claude.ai/code/session_013kBiVXftgoEsyGVyrvfGok
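For the JSON fix in the summary, a sanitizer along the following lines would make arbitrary cell bytes safe to hand to a JSON serializer. This is a sketch only: `SanitizeUtf8` is a hypothetical name, and substituting U+FFFD for each malformed byte is an assumption, since the PR says only that malformed bytes are replaced.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sanitizer (not necessarily what this PR implements).
// Copies well-formed UTF-8 sequences through unchanged and substitutes
// U+FFFD for every byte that cannot start or continue a valid sequence.
std::string SanitizeUtf8(const std::string& in) {
    static const char kReplacement[] = "\xEF\xBF\xBD";  // U+FFFD (assumed choice)
    std::string out;
    out.reserve(in.size());
    const std::size_t n = in.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {  // ASCII passes through
            out.push_back(in[i]);
            ++i;
            continue;
        }
        // Decode the lead byte: sequence length, initial code-point bits,
        // and the smallest code point allowed at that length.
        std::size_t len = 0;
        unsigned int cp = 0;
        unsigned int min_cp = 0;
        if ((c & 0xE0) == 0xC0)      { len = 2; cp = c & 0x1F; min_cp = 0x80; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; min_cp = 0x800; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; min_cp = 0x10000; }

        bool ok = len != 0 && i + len <= n;  // rejects stray bytes and truncation
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(in[i + k]);
            if ((cc & 0xC0) != 0x80) ok = false;  // not a continuation byte
            else cp = (cp << 6) | (cc & 0x3Fu);
        }
        // Reject overlong encodings, surrogates, and values above U+10FFFF.
        if (ok && (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)))
            ok = false;

        if (ok) {
            out.append(in, i, len);
            i += len;
        } else {
            out += kReplacement;
            ++i;  // resynchronize one byte at a time
        }
    }
    return out;
}
```

Applying such a function at the single point where strings enter the JSON document keeps the rest of the dumper unchanged, at the cost of the JSON output no longer round-tripping the original raw bytes.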
Generated by Claude Code