Summary
Documents generated by WmlComparer with move operations (w:moveFrom/w:moveTo) cause Microsoft Word to display an "unreadable content" warning when opened.
Steps to Reproduce
- Compare two documents where text has been moved using WmlComparer.Compare()
- Open the resulting document in Microsoft Word
- Word shows: "Word found unreadable content in [filename]. Do you want to recover the contents of this document?"
Root Cause Analysis
After extensive debugging (30+ attempts documented), we found that:
- Move operations are fragile in OOXML: Word is extremely strict about how w:moveFrom, w:moveTo, w:moveFromRangeStart, w:moveFromRangeEnd, w:moveToRangeStart, and w:moveToRangeEnd elements must be structured.
- Duplicate revision IDs: Docxodus sometimes generates duplicate w:id attributes across revision elements (e.g., both w:moveFrom and w:del can have id="21"). ECMA-376 requires these to be unique.
- Various fixes attempted but failed:
  - Deduplicating move operations by name
  - Ensuring RangeStart/RangeEnd ID pairing
  - Sequential ID renumbering
  - Converting delText to t inside moveFrom
  - PowerTools/Clippit document rebuilding
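The duplicate-ID condition described above can be checked on the raw document XML. The following is a minimal sketch using only System.Xml.Linq, not Docxodus or the Open XML SDK. The class name RevisionIdCheck and the set of element names it inspects are assumptions made for illustration; the range-end markers are skipped because they repeat the w:id of their matching range-start marker by design.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of Docxodus.
public static class RevisionIdCheck
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Assumption: these are the revision elements whose w:id values are compared.
    // moveFromRangeEnd / moveToRangeEnd are excluded because they intentionally
    // reuse the id of the corresponding RangeStart element.
    private static readonly HashSet<string> RevisionElements = new HashSet<string>
    {
        "ins", "del", "moveFrom", "moveTo",
        "moveFromRangeStart", "moveToRangeStart"
    };

    // Returns each w:id value used more than once, mapped to the local names
    // of the elements that share it (in document order).
    public static Dictionary<string, List<string>> FindDuplicateIds(XElement root)
    {
        return root.Descendants()
            .Where(e => e.Name.Namespace == W
                        && RevisionElements.Contains(e.Name.LocalName))
            .Select(e => new { Id = (string)e.Attribute(W + "id"), e.Name.LocalName })
            .Where(x => x.Id != null)
            .GroupBy(x => x.Id)
            .Where(g => g.Count() > 1)
            .ToDictionary(g => g.Key, g => g.Select(x => x.LocalName).ToList());
    }

    public static void Main()
    {
        // Tiny hand-written fragment mirroring the id="21" clash from the report.
        var xml = @"<w:body xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>
  <w:p>
    <w:moveFrom w:id='21' w:author='a'><w:r><w:t>moved</w:t></w:r></w:moveFrom>
    <w:del w:id='21' w:author='a'><w:r><w:delText>gone</w:delText></w:r></w:del>
    <w:ins w:id='22' w:author='a'><w:r><w:t>new</w:t></w:r></w:ins>
  </w:p>
</w:body>";

        foreach (var pair in FindDuplicateIds(XElement.Parse(xml)))
        {
            Console.WriteLine($"w:id={pair.Key} is shared by: {string.Join(", ", pair.Value)}");
        }
    }
}
```

Running a check like this against word/document.xml from a comparer result should show whether a given output is affected by the ID clash, independently of the move-structure problem.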
Working Workaround
The only reliable fix we found is to convert all move operations to simple del/ins:
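using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

// root is assumed to be the root element of the main document part
// Convert MoveFromRun to DeletedRun
foreach (var moveFrom in root.Descendants<MoveFromRun>().ToList())
{
    var del = new DeletedRun
    {
        Author = moveFrom.Author?.Value,
        Date = moveFrom.Date?.Value,
        Id = moveFrom.Id?.Value
    };
    // Move all children to the new del element
    foreach (var child in moveFrom.ChildElements.ToList())
    {
        child.Remove();
        del.AppendChild(child);
    }
    moveFrom.InsertAfterSelf(del);
    moveFrom.Remove();
}

// Convert MoveToRun to InsertedRun (same pattern)
foreach (var moveTo in root.Descendants<MoveToRun>().ToList())
{
    var ins = new InsertedRun
    {
        Author = moveTo.Author?.Value,
        Date = moveTo.Date?.Value,
        Id = moveTo.Id?.Value
    };
    foreach (var child in moveTo.ChildElements.ToList())
    {
        child.Remove();
        ins.AppendChild(child);
    }
    moveTo.InsertAfterSelf(ins);
    moveTo.Remove();
}

// Remove all move range markers
root.Descendants<MoveFromRangeStart>().ToList().ForEach(e => e.Remove());
root.Descendants<MoveFromRangeEnd>().ToList().ForEach(e => e.Remove());
root.Descendants<MoveToRangeStart>().ToList().ForEach(e => e.Remove());
root.Descendants<MoveToRangeEnd>().ToList().ForEach(e => e.Remove());
Trade-off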
This sacrifices the semantic "move" information (where text came from/went to) in favor of document validity. Users will see:
- Strikethrough text at original location (deletion)
- Underlined text at new location (insertion)
Instead of:
- Green strikethrough (moved from)
- Green double-underline (moved to)
Environment
- Docxodus 5.4.0
- .NET 10
- Microsoft Word for Mac (and likely Windows)
Suggestion
Consider adding a WmlComparerSettings option like ConvertMovesToDelIns = true to give users an easy way to opt into this workaround when Word compatibility is critical.
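As a rough illustration of what such an option might do internally, here is a sketch that applies the same conversion to raw XML with System.Xml.Linq. It is not Docxodus code: the class MoveFlattener and its method are hypothetical names, and the step that renames w:t to w:delText inside the converted deletions is an assumption about what a schema-clean result needs, not something confirmed by the testing described above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of Docxodus.
public static class MoveFlattener
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    private static readonly string[] RangeMarkers =
    {
        "moveFromRangeStart", "moveFromRangeEnd",
        "moveToRangeStart", "moveToRangeEnd"
    };

    // Rewrites move markup in place as plain deletions and insertions.
    public static void ConvertMovesToDelIns(XElement root)
    {
        // Drop the range markers; w:del / w:ins do not use them.
        root.Descendants()
            .Where(e => e.Name.Namespace == W
                        && RangeMarkers.Contains(e.Name.LocalName))
            .ToList()
            .ForEach(e => e.Remove());

        // w:moveFrom becomes w:del; attributes and children are kept as-is.
        foreach (var moveFrom in root.Descendants(W + "moveFrom").ToList())
        {
            // Assumption: text inside a deletion should be w:delText, not w:t.
            foreach (var t in moveFrom.Descendants(W + "t").ToList())
            {
                t.Name = W + "delText";
            }
            moveFrom.Name = W + "del";
        }

        // w:moveTo becomes w:ins.
        foreach (var moveTo in root.Descendants(W + "moveTo").ToList())
        {
            moveTo.Name = W + "ins";
        }
    }

    public static void Main()
    {
        // Hand-written paragraph containing one moved-from and one moved-to run.
        var xml = @"<w:p xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>
  <w:moveFromRangeStart w:id='1' w:name='move1'/>
  <w:moveFrom w:id='2' w:author='a'><w:r><w:t>moved text</w:t></w:r></w:moveFrom>
  <w:moveFromRangeEnd w:id='1'/>
  <w:moveToRangeStart w:id='3' w:name='move1'/>
  <w:moveTo w:id='4' w:author='a'><w:r><w:t>moved text</w:t></w:r></w:moveTo>
  <w:moveToRangeEnd w:id='3'/>
</w:p>";

        var root = XElement.Parse(xml);
        ConvertMovesToDelIns(root);
        Console.WriteLine(root);
    }
}
```

Renaming elements in place keeps the author, date, and id attributes without copying them. It does not address duplicate w:id values, so a real option would probably still need to renumber or deduplicate IDs separately.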