fix(xml): decode XML character entities in attribute values (fixes #2877, #1199) #2971
Open
MaxwellM34 wants to merge 1 commit into
Fixes software-mansion#2877 and software-mansion#1199.

SVG attribute values containing XML numeric character references (decimal &#NNN; or hex &#xHHH;) or the standard named entities (&amp; &lt; etc.) were passed through unchanged by the JS parser and ended up in the native renderer, which throws an 'UnexpectedData' error in native code. Because the throw happens on the native side, neither React error boundaries nor the <SvgXml onError> prop can catch it, and the whole app crashes.

This commit adds an exported decodeXmlEntities(value) helper and calls it inside getAttributeValue() so every parsed attribute value is fully decoded before reaching the AST (and the native renderer).

Decoded:
- the five standard XML named entities: &amp; &lt; &gt; &quot; &apos;
- decimal numeric character references: &#NNN;
- hex numeric character references: &#xHHH; / &#XHHH; (including 4-byte code points like 😀)

Unknown or malformed references are left intact rather than dropped, so a typo'd entity remains visible in the output rather than silently disappearing.

Adds 10 tests in __tests__/xml.test.tsx covering the decoder in isolation plus three integration tests through parse(), including the exact path-d string from the software-mansion#2877 reproduction case.

Signed-off-by: Maxwellm34 <maxwellmcinnis123@gmail.com>
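The decoding rules listed in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `decodeXmlEntities` matches the description, but the regular expression, the `NAMED_ENTITIES` table, and the out-of-range check are assumptions made for this example.

```typescript
// Hypothetical sketch of the described behavior; not the PR's implementation.

// The five predefined XML entities. Anything else is left untouched.
const NAMED_ENTITIES: Record<string, string> = {
  amp: '&',
  lt: '<',
  gt: '>',
  quot: '"',
  apos: "'",
};

// Matches &#xHHH; / &#XHHH;, &#NNN;, and &name; references.
const ENTITY_PATTERN = /&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g;

function decodeXmlEntities(value: string): string {
  // A single pass, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
  return value.replace(ENTITY_PATTERN, (match: string, body: string) => {
    if (body[0] !== '#') {
      // Named reference: decode only the five XML entities.
      return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
        ? NAMED_ENTITIES[body]
        : match;
    }
    const isHex = body[1] === 'x' || body[1] === 'X';
    const codePoint = isHex
      ? parseInt(body.slice(2), 16)
      : parseInt(body.slice(1), 10);
    // Values beyond the Unicode range are kept as written instead of
    // letting String.fromCodePoint throw a RangeError.
    if (codePoint > 0x10ffff) {
      return match;
    }
    return String.fromCodePoint(codePoint);
  });
}

// Illustrative input only (not the exact string from the issue):
// "&#32;" is a decimal reference to the space character.
const decodedPath = decodeXmlEntities('M0,0&#32;L10,10');
```

Malformed input such as `&#xZZ;` never matches the pattern, and unknown names such as `&bogus;` match but fall through to `match`, so both stay visible in the output, in line with the "left intact rather than dropped" rule.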
What
Fixes #2877 (and the older, never-fixed #1199 — same root cause).
SVG attribute values containing XML numeric character references (decimal `&#NNN;` or hex `&#xHHH;`), or the five standard named entities (`&amp;`, `&lt;`, `&gt;`, `&quot;`, `&apos;`), were passed through unchanged by the JS parser in `src/xml.tsx` and ended up in the native renderer.

The native side cannot handle raw entity references in attribute values (especially `d` on `<path>`) and throws an `UnexpectedData` error in native code. Because the throw happens on the native side, neither React error boundaries nor the `<SvgXml onError>` prop can catch it, and the whole app crashes, exactly as the issue describes.

Reproduction
The exact SVG from the issue body, condensed:
Before this PR: native crash; neither the error boundary nor `onError` fires.

After this PR: `path d` is decoded to `"M0,0 L10,10"` before reaching native; renders cleanly.

What I changed
`src/xml.tsx`:
- Added an exported `decodeXmlEntities(value: string): string` that handles:
  - the five standard XML named entities (`amp`, `lt`, `gt`, `quot`, `apos`);
  - decimal numeric character references (`&#NNN;`);
  - hex numeric character references (`&#xHHH;` / `&#XHHH;`, including 4-byte code points like 😀).
- `getAttributeValue()` now calls `decodeXmlEntities` on the raw value before returning, so every parsed attribute comes out fully decoded.

`__tests__/xml.test.tsx` (new file):
- Tests for `decodeXmlEntities` in isolation (named entities, decimal/hex refs, 4-byte code points, unknown-ref preservation, the exact path-d string from #2877).
- Three integration tests through `parse()`.

All 10 pass. The existing `css.test.tsx` snapshot failures on this branch are pre-existing on `main` (stale snapshots from a prior `class` → `className` change) and unrelated to this PR.

Design notes I want to flag for review
- Why not use `he` (the de-facto JS entity decoder)? Two reasons: this parser is intentionally zero-dependency, and `he` decodes the full HTML5 named-entity set, which is incorrect for SVG (XML): HTML-only named entities are not valid XML entities and should arguably stay raw. The 5-entity XML-strict approach also has a much smaller footprint.
- Why leave unknown or malformed references intact? Dropping them would make a typo'd entity silently disappear; leaving it in the output makes the bad input visible to the developer.
- Why decode in `getAttributeValue` rather than wherever the value is consumed? Centralizing it at the parser boundary means every downstream consumer (web view, native view, AST→React transform) sees a normalized string. Decoding at each consumer would be both repetitive and easy to miss.

This contribution was AI-assisted (Claude). The fix and tests were drafted with LLM help and reviewed against the issue repro before submission. Happy to revise anything, or to close the PR if the approach isn't what you'd want here.