Lutaml integration #123

Open
andrew2net wants to merge 15 commits into main from lutaml-integration

@andrew2net
Contributor

No description provided.

andrew2net and others added 15 commits March 12, 2026 23:38
* Preserve XML-fragment markup in Bibcollection title/author

Port of fd20c9d (#125) to the v2/lutaml-integration branch.

Switch Bibcollection.from_xml to read the collection title and author
via inner_html instead of Nokogiri's .text, so the in-memory strings
keep their XML-fragment form (markup + entities intact). Apply the
strip_html Liquid filter on the HTML <title> tag position so the
browser tab title stays plain text. Adds find_html to ElementFinder
alongside find_text. Adds a regression spec with markup and &amp; in
both the collection title and the author name.
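The read-path behaviour described above can be illustrated with a small sketch. It is not the code from this PR: `strip_tags` is a hypothetical, simplified stand-in for Liquid's `strip_html` filter (it only removes tags), and the sample title is made up. It shows the idea that the stored title keeps its XML-fragment form while the HTML `<title>` position receives text with the markup removed and the entity left intact.

```ruby
# Hypothetical, simplified stand-in for Liquid's strip_html filter.
# Assumption: removing tags is enough for this illustration; entities
# such as &amp; are left exactly as they are.
def strip_tags(fragment)
  fragment.to_s.gsub(/<[^>]*>/, "")
end

# A title kept in XML-fragment form, as inner_html would preserve it.
title_fragment = "Standards <em>&amp;</em> Guides"

# Value suitable for the HTML <title> position: markup gone, entity intact.
page_title = strip_tags(title_fragment)
puts page_title
```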

Refs metanorma/isodoc#785.
Port the write-path fix from #128 to v2/lutaml-integration. The read-path
half of #128 (find_html + strip_html) was already ported via #127; this
commit ports the remaining to_xml escaping.

bibcollection.rb to_xml was writing the collection title and author
directly into XML without escaping, producing bare & in the output
when the values came from YAML (e.g. name: "A test & playground ...").
A bare & is invalid XML; libxml2 in recovery mode emits FATAL
"xmlParseEntityRef: no name" and then silently drops all subsequent
&amp; entities in the same document — corrupting every individual
document title's & in the collection index HTML output.

Add a private xml_escape helper that escapes only unencoded & (not
already-encoded &amp;, &#nnn;, &#xhh;) and leaves inline markup tags
(<em>, <strong>, etc.) untouched, so valid HTML fragments round-tripped
via find_html pass through unchanged.
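A minimal sketch of such a helper, assuming a regex with a negative lookahead is an acceptable approach. This is not the helper from bibcollection.rb: the constant name `BARE_AMPERSAND` is hypothetical, and treating any `&name;` sequence as already encoded is an assumption that goes slightly beyond the `&amp;`, `&#nnn;`, `&#xhh;` cases listed above.

```ruby
# Hypothetical sketch; the real xml_escape in bibcollection.rb may differ.
# Matches an ampersand that does NOT start an entity reference.
# Assumption: any "&name;" sequence counts as already encoded, in addition
# to decimal (&#nnn;) and hexadecimal (&#xhh;) character references.
BARE_AMPERSAND = /&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/

def xml_escape(str)
  # Only bare ampersands are rewritten; tags such as <em> are not touched.
  str.to_s.gsub(BARE_AMPERSAND, "&amp;")
end

puts xml_escape("A test & playground")  # bare & gets escaped
puts xml_escape("A &amp; B")            # already encoded, left alone
puts xml_escape("<em>R&D</em>")         # markup kept, bare & escaped
```

Because already-encoded entities are skipped, applying the helper twice gives the same result as applying it once, which matters when a value read via find_html is written back out.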

Fixes metanorma/isodoc#785.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>