Move operations cause Word 'unreadable content' warning #96

@arthrod

Summary

Documents generated by WmlComparer with move operations (w:moveFrom/w:moveTo) cause Microsoft Word to display an "unreadable content" warning when opened.

Steps to Reproduce

  1. Use WmlComparer.Compare() to compare two documents in which text has been moved
  2. Open the resulting document in Microsoft Word
  3. Word shows: "Word found unreadable content in [filename]. Do you want to recover the contents of this document?"

Root Cause Analysis

After extensive debugging (30+ attempts documented), we found that:

  1. Move operations are fragile in OOXML: Word is extremely strict about how the w:moveFrom, w:moveTo, w:moveFromRangeStart, w:moveFromRangeEnd, w:moveToRangeStart, and w:moveToRangeEnd elements are structured.

  2. Duplicate revision IDs: Docxodus sometimes generates duplicate w:id attributes across revision elements (e.g., both w:moveFrom and w:del can have id="21"). ECMA-376 requires these to be unique.

  3. We attempted several fixes, none of which removed the warning:

    • Deduplicating move operations by name
    • Ensuring RangeStart/RangeEnd ID pairing
    • Sequential ID renumbering
    • Converting delText to t inside moveFrom
    • PowerTools/Clippit document rebuilding
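
To check whether a given output document is affected by the duplicate-ID problem from point 2, a small diagnostic along these lines may help. It is a minimal sketch using LINQ to XML, not part of Docxodus: the class name RevisionIdCheck and the choice of which elements share an ID space (w:ins, w:del, w:moveFrom, w:moveTo) are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of Docxodus or the Open XML SDK.
public static class RevisionIdCheck
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Assumption: these are the revision elements whose w:id values must not collide.
    private static readonly HashSet<string> RevisionNames = new HashSet<string>
    {
        "ins", "del", "moveFrom", "moveTo"
    };

    // Returns every w:id value used by more than one revision element,
    // mapped to the local names of the elements that share it.
    public static Dictionary<string, List<string>> FindDuplicateIds(XElement root)
    {
        return root.DescendantsAndSelf()
            .Where(e => e.Name.Namespace == W && RevisionNames.Contains(e.Name.LocalName))
            .Select(e => new { Id = (string)e.Attribute(W + "id"), Name = e.Name.LocalName })
            .Where(x => x.Id != null)
            .GroupBy(x => x.Id)
            .Where(g => g.Count() > 1)
            .ToDictionary(g => g.Key, g => g.Select(x => x.Name).ToList());
    }
}
```

Running this over the root element of word/document.xml yields an empty dictionary when no IDs collide. The range markers are left out on purpose, because each RangeStart/RangeEnd pair shares an ID by design.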

Working Workaround

The only reliable fix we found is to convert every move operation into a plain deletion/insertion pair (w:del/w:ins):

// Convert MoveFromRun to DeletedRun
foreach (var moveFrom in root.Descendants<MoveFromRun>().ToList())
{
    var del = new DeletedRun
    {
        Author = moveFrom.Author?.Value,
        Date = moveFrom.Date?.Value,
        Id = moveFrom.Id?.Value
    };
    
    // Move all children to the new del element
    foreach (var child in moveFrom.ChildElements.ToList())
    {
        child.Remove();
        del.AppendChild(child);
    }
    
    moveFrom.InsertAfterSelf(del);
    moveFrom.Remove();
}

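// Convert MoveToRun to InsertedRun (same pattern)
foreach (var moveTo in root.Descendants<MoveToRun>().ToList())
{
    var ins = new InsertedRun
    {
        Author = moveTo.Author?.Value,
        Date = moveTo.Date?.Value,
        Id = moveTo.Id?.Value
    };

    // Move all children to the new ins element
    foreach (var child in moveTo.ChildElements.ToList())
    {
        child.Remove();
        ins.AppendChild(child);
    }

    moveTo.InsertAfterSelf(ins);
    moveTo.Remove();
}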

// Remove all move range markers
root.Descendants<MoveFromRangeStart>().ToList().ForEach(e => e.Remove());
root.Descendants<MoveFromRangeEnd>().ToList().ForEach(e => e.Remove());
root.Descendants<MoveToRangeStart>().ToList().ForEach(e => e.Remove());
root.Descendants<MoveToRangeEnd>().ToList().ForEach(e => e.Remove());
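
For code that works on the raw XML rather than the Open XML SDK object model, the same conversion can be sketched with LINQ to XML. This is an illustration under assumptions, not the workaround we actually ran: MoveFlattener is a hypothetical name, and the w:t to w:delText rename is added here because runs inside w:del carry their text in w:delText.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of Docxodus or the Open XML SDK.
public static class MoveFlattener
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    private static readonly string[] RangeMarkers =
    {
        "moveFromRangeStart", "moveFromRangeEnd", "moveToRangeStart", "moveToRangeEnd"
    };

    // Rewrites move markup in place so that only w:del / w:ins remain.
    public static void ConvertMovesToDelIns(XElement root)
    {
        // Drop the range markers that bracket each move.
        root.Descendants()
            .Where(e => e.Name.Namespace == W && RangeMarkers.Contains(e.Name.LocalName))
            .ToList()
            .ForEach(e => e.Remove());

        foreach (var moveFrom in root.Descendants(W + "moveFrom").ToList())
        {
            // Deleted runs hold their text in w:delText rather than w:t.
            foreach (var t in moveFrom.Descendants(W + "t").ToList())
                t.Name = W + "delText";

            // Renaming keeps the w:id, w:author and w:date attributes as they are.
            moveFrom.Name = W + "del";
        }

        foreach (var moveTo in root.Descendants(W + "moveTo").ToList())
            moveTo.Name = W + "ins";
    }
}
```

Renaming elements in place keeps the existing attributes and child runs, so nothing has to be copied across by hand. It also catches w:moveFrom/w:moveTo markers on paragraph marks (inside w:rPr), which the run-level loops above do not visit.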

Trade-off

This sacrifices the semantic "move" information (where text came from/went to) in favor of document validity. Users will see:

  • Strikethrough text at original location (deletion)
  • Underlined text at new location (insertion)

Instead of:

  • Green double strikethrough (moved from)
  • Green double-underline (moved to)

Environment

  • Docxodus 5.4.0
  • .NET 10
  • Microsoft Word for Mac (and likely Windows)

Suggestion

Consider adding a WmlComparerSettings option like ConvertMovesToDelIns = true to give users an easy way to opt into this workaround when Word compatibility is critical.
