feat(converter): render EMF+ images via embedded bitmaps (SD-2503)#3214
Open
gpardhivvarma wants to merge 3 commits intosuperdoc-dev:mainfrom
Open
feat(converter): render EMF+ images via embedded bitmaps (SD-2503)#3214gpardhivvarma wants to merge 3 commits intosuperdoc-dev:mainfrom
gpardhivvarma wants to merge 3 commits intosuperdoc-dev:mainfrom
Conversation
EMF+ payloads use GDI+ drawing records that the rtf.js renderer doesn't implement, so prior to this change every EMF+ image rendered as an "Unable to render EMF+ image" placeholder. Most real-world EMF+ files generated by Office (cover slides, charts, illustrations) embed a complete PNG/JPEG inside an EmfPlusObject(Image) record with BitmapDataType=Compressed. Walk the EMR_COMMENT records in the EMF stream, parse the inner EMF+ records, reassemble continuation series via the TotalObjectSize prefix on the first chunk (MS-EMFPLUS § 2.3.5.1), and return the embedded image directly. Pure-vector and pixel-format EMF+ images still fall back to the placeholder — a full GDI+ rasterizer is out of scope here. Closes superdoc-dev#3172
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7900e7f5eb
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
…(SD-2503) Per MS-EMFPLUS § 2.3.5.1, when ContinueBit=1 the EmfPlusObject record header is 16 bytes — TotalObjectSize sits between Size and DataSize and is present on every continued record (not only the first). The previous implementation read offset 8 as DataSize for continued records, which is actually TotalObjectSize, and treated the first 4 bytes of ObjectData as TotalObjectSize. The synthetic continuation tests built buffers with the same wrong layout, so they passed without exercising the bug. Real EMF+ files written by Office (the multi-record cover-image case) follow the spec layout, so the prior code would have either bailed on the bounds check or copied from the wrong offset and fallen through to the placeholder. Now: ContinueBit=1: Type(2) Flags(2) Size(4) TotalObjectSize(4) DataSize(4) ObjectData ContinueBit=0: Type(2) Flags(2) Size(4) DataSize(4) ObjectData Tests rebuild the synthetic buffers with the correct layout and add coverage for a 3+ record continuation series.
Contributor
Author
|
@caio-pizzol please review this |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Renders EMF+ images that embed a compressed bitmap (PNG/JPEG/GIF) instead of falling back to the placeholder SVG. Most real-world EMF+ files generated by Office — cover slides, charts, illustrations — wrap a complete PNG or JPEG inside an
EmfPlusObject(Image)record. Walking the EMF+ stream and pulling that bitmap out gives pixel-perfect rendering without implementing a GDI+ rasterizer.Closes #3172.
What changed
packages/super-editor/src/editors/v1/core/super-converter/v3/handlers/wp/helpers/metafile-converter.jsextractBitmapFromEmfPlus(buffer): walksEMR_COMMENTrecords carrying EMF+ payloads, scans inner EMF+ records forEmfPlusObject(Image)entries, reassembles continuation series via theTotalObjectSizeprefix on the first chunk per MS-EMFPLUS § 2.3.5.1, and parses the resultingEmfPlusImage/EmfPlusBitmapto extract the encoded image bytes.parseEmfPlusImageObject(bytes): validatesImage.Type=BitmapandBitmap.Type=Compressed, then returns the embedded PNG/JPEG/GIF as a data URI.detectCompressedImageFormat(bytes)helper using PNG/JPEG/GIF magic bytes.convertEmfToSvgbetween the existing classicEMR_STRETCHDIBITSpath and the EMF+ placeholder, so the placeholder remains the final fallback for pure-vector EMF+.70forEMR_COMMENTinto a named constant shared with the existingisEmfPlusdetector.Spec correctness
Per MS-EMFPLUS § 2.3.5.1:
ContinueBit=1,ObjectData = TotalObjectSize | first slice.ContinueBit=1, raw appended bytes.ContinueBit=0— the parser keys off this to flush.ContinueBit=1on the last record, the parser flushes early onceTotalObjectSizebytes are accumulated.What this does NOT cover
EmfPlusBitmap— also still hits the placeholder; rasterizing raw pixel buffers requires the same infrastructure.Acceptance criteria
originalSrc/originalExtensionwhen a metafile is converted; not changed).Test plan
pnpm exec vitest run src/editors/v1/core/super-converter/v3/handlers/wp/helpers/metafile-converter.test.js— 14/14 pass.pnpm exec vitest run src/editors/v1/core/super-converter/v3/handlers/wp/helpers/— 293/293 pass (no regressions in adjacent helpers).pnpm exec prettier --checkon both modified files — clean.m3 proposal.docx) in the editor and verify the cover image renders.