@xijo xijo commented Jan 20, 2026

Summary

  • Fixes #99 (Untidy tags produce invalid Markdown): adjacent emphasis tags like <em>wo</em><em>rd</em> produced the invalid Markdown _wo__rd_
  • Adds merge_adjacent_emphasis() method in Cleaner to merge adjacent identical emphasis markers
  • Handles both underscore emphasis (_X__Y_ → _XY_) and strong emphasis (**X****Y** → **XY**)

Test plan

  • Added unit tests for merge_adjacent_emphasis in cleaner_spec.rb
  • Added integration tests for adjacent em tags in em_spec.rb
  • Verified all 223 existing tests still pass
  • Manual verification with examples from the issue

🤖 Generated with Claude Code

When HTML contains adjacent emphasis tags like <em>wo</em><em>rd</em>,
the output was _wo__rd_, which is invalid Markdown. The double underscore
breaks parsing in most Markdown renderers.

Add merge_adjacent_emphasis() method in Cleaner that merges adjacent
identical emphasis markers during post-processing:
- _a__b_ → _ab_
- **a****b** → **ab**
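
The merge step described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the standalone method, its regular expressions, and the assumption that a closing marker directly followed by an identical opening marker always sits between two non-space characters are all hypothetical.

```ruby
# Hypothetical sketch of the post-processing step; the real Cleaner
# method in this PR may be implemented differently.
def merge_adjacent_emphasis(string)
  string
    # "**a****b**" -> "**ab**": drop a closing "**" that is immediately
    # followed by an opening "**", but only between non-space characters.
    .gsub(/(?<=[^\s*])\*\*\*\*(?=[^\s*])/, '')
    # "_a__b_" -> "_ab_": same idea for underscore emphasis.
    .gsub(/(?<=[^\s_])__(?=[^\s_])/, '')
end

puts merge_adjacent_emphasis('_wo__rd_')
puts merge_adjacent_emphasis('**a****b**')
```

One limitation of this sketch: it cannot tell an emphasis boundary from a literal double underscore inside a word (for example foo__bar), so a real implementation would likely need more context than a plain substitution.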

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>