
Add transliteration for O with ogonek #733

Merged: radar merged 1 commit into master from transliterate-o-ogonek on Jun 17, 2026


@radar radar commented Jun 17, 2026

Collaborator

Problem

I18n.transliterate("CAIǪUE") returns "CAI?UE" instead of "CAIOUE".

Ǫ/ǫ (U+01EA/U+01EB) and Ǭ/ǭ (U+01EC/U+01ED, O with ogonek and macron) are O-with-ogonek letters in the Latin Extended-B block. The DEFAULT_APPROXIMATIONS table only covers Latin-1 Supplement and Latin Extended-A (up to ž, U+017E), so these letters fall through to the replacement character "?". This is inconsistent with, for example, Ų/ų (U+0172/U+0173), which sit in Extended-A and transliterate correctly to U/u.
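To make the block boundary concrete, here is a small standalone sketch that reports which Unicode block each of the letters mentioned above belongs to. The ranges are the standard Latin Extended-A and Latin Extended-B block ranges; `block_of` is a hypothetical helper written for this illustration and is not part of the i18n gem.

```ruby
# Unicode block ranges for the two blocks discussed above.
LATIN_EXTENDED_A = (0x0100..0x017F)
LATIN_EXTENDED_B = (0x0180..0x024F)

# Hypothetical helper: names the block a single character falls in.
def block_of(char)
  case char.ord
  when LATIN_EXTENDED_A then "Latin Extended-A"
  when LATIN_EXTENDED_B then "Latin Extended-B"
  else "other"
  end
end

# Print code point and block for the covered and uncovered letters.
%w[Ų ų ž Ǫ ǫ Ǭ ǭ].each do |char|
  puts format("%s U+%04X %s", char, char.ord, block_of(char))
end
```

The first three letters land in Extended-A, while the four O-with-ogonek letters land in Extended-B, which matches the gap described in the problem statement.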

Fix

Add the four O-with-ogonek letters (two uppercase/lowercase pairs) to the table.

I18n.transliterate("CAIǪUE") # => "CAIOUE"
I18n.transliterate("ǫ Ǭ ǭ")  # => "o O o"
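The before/after behaviour can be reproduced without the gem using a minimal table-driven sketch. This is only an illustration of the lookup-with-fallback idea described above, not the i18n gem's implementation: `APPROXIMATIONS`, `OGONEK_O`, `REPLACEMENT`, and `approximate` are hypothetical names chosen for the example.

```ruby
# Hypothetical stand-in for an approximations table; it holds only the
# Extended-A entries that the description says already work.
APPROXIMATIONS = {
  "Ų" => "U", "ų" => "u"
}.freeze

# The four entries this change adds, one per letter.
OGONEK_O = {
  "Ǫ" => "O", "ǫ" => "o", # U+01EA / U+01EB
  "Ǭ" => "O", "ǭ" => "o"  # U+01EC / U+01ED
}.freeze

REPLACEMENT = "?"

# Replace every non-ASCII character with its table entry, or with the
# replacement character when the table has no entry for it.
def approximate(string, table)
  string.gsub(/[^[:ascii:]]/) { |char| table.fetch(char, REPLACEMENT) }
end

puts approximate("CAIǪUE", APPROXIMATIONS)                 # => "CAI?UE"
puts approximate("CAIǪUE", APPROXIMATIONS.merge(OGONEK_O)) # => "CAIOUE"
puts approximate("ǫ Ǭ ǭ", APPROXIMATIONS.merge(OGONEK_O))  # => "o O o"
```

Any character missing from the table falls back to `REPLACEMENT`, which is why the missing entries surfaced as "?" rather than as an error.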

Fixes #732

Ǫ/ǫ (U+01EA/01EB) and Ǭ/ǭ (U+01EC/01ED) live in Latin Extended-B,
which the default approximations table did not cover, so they
transliterated to the replacement character "?" instead of "O"/"o".

Fixes #732

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@radar radar merged commit 0c2d1e1 into master Jun 17, 2026
27 of 28 checks passed
@radar radar deleted the transliterate-o-ogonek branch June 17, 2026 04:07
Development

Successfully merging this pull request may close these issues.

[BUG] transliteration of a word with O with an ogonek