Problem
Wordle Global currently treats every language's writing system as a simple alphabet: one character = one tile = one color. This works perfectly for Latin, Cyrillic, Arabic, and other scripts where each letter is an atomic unit. But for composition-based scripts — where characters are built from smaller phonetic components — this model breaks down.
Korean (the immediate case)
Korean syllable blocks are composed of 2-3 jamo (consonants and vowels):
- 한 = ㅎ (initial) + ㅏ (vowel) + ㄴ (final)
- 글 = ㄱ (initial) + ㅡ (vowel) + ㄹ (final)
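The decomposition above follows directly from how Unicode lays out the Hangul Syllables block (U+AC00 to U+D7A3), so it can be computed arithmetically. The sketch below is illustrative only: `decomposeSyllable` and the three tables are hypothetical names, not code from the repository, and it returns Compatibility Jamo (the characters shown in the bullets above).

```typescript
// Compatibility Jamo in the order used by the Hangul Syllables block.
const INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ".split(""); // 19 initials
const VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ".split(""); // 21 vowels
const FINALS = [""].concat(
  "ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ".split("")
); // index 0 means "no final consonant", then 27 finals

// Hypothetical helper: split one precomposed syllable block into 2-3 jamo.
function decomposeSyllable(block: string): string[] {
  const index = block.charCodeAt(0) - 0xac00;
  if (block.length !== 1 || index < 0 || index > 11171) {
    return [block]; // not a precomposed Hangul syllable: treat as atomic
  }
  const initial = INITIALS[Math.floor(index / 588)]; // 588 = 21 vowels * 28 finals
  const vowel = VOWELS[Math.floor((index % 588) / 28)];
  const final = FINALS[index % 28];
  return final === "" ? [initial, vowel] : [initial, vowel, final];
}

// decomposeSyllable("한") gives ["ㅎ", "ㅏ", "ㄴ"]
// decomposeSyllable("가") gives ["ㄱ", "ㅏ"] (no final consonant)
```

Because the final consonant is optional, the number of jamo per syllable varies between 2 and 3, which is the root of the word-length problem described below.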
Our current approach (PR #155) decomposes words into individual jamo and puts one jamo per tile — 5 jamo per word. This creates several problems:
- Unnatural grid: Korean speakers see ㅎ ㅏ ㄴ ㄱ ㅡ ㄹ (6 jamo) instead of 한 글 (2 syllables). It doesn't look like Korean.
- Unpredictable word length: A 2-syllable word could be 4, 5, or 6 jamo depending on whether syllables have final consonants. "4-letter Korean Wordle" doesn't map to any natural Korean concept.
- Doesn't scale: Variable word lengths, phrase-of-the-day, or difficulty modes (3-syllable → 5-syllable) are impossible because jamo count ≠ syllable count.
- 65-character keyboard: Because compound vowels (ㅘ, ㅙ) and compound jongseong (ㄺ, ㄻ) are separate characters, the keyboard needs 40+ keys across 5 rows. 꼬들 (kordle.kr) avoids this by decomposing everything to 26 basic jamo, but then needs 6 cells per word.
- IME conflict: Physical Korean keyboards compose syllable blocks via the OS IME. Our game must either bypass the IME (current fix: physical_key_map) or decompose composed input. Both are workarounds for a data model that fights the writing system.
The same problem exists in other scripts
| Language | Natural unit | Components within unit | Speakers |
| --- | --- | --- | --- |
| Korean | Syllable block (한) | Initial consonant + vowel + optional final consonant | 80M |
| Tamil | Akshara (கா) | Consonant + vowel mark (matra) | 80M |
| Hindi/Devanagari | Akshara (क्षा) | Consonant(s) + vowel mark, with conjuncts | 600M |
| Bengali | Akshara (ক্ষা) | Same as Hindi | 230M |
| Chinese | Character (春) | Pinyin: initial + final + tone | 1.1B |
| Thai | Syllable (กาน) | Consonant + vowel (multi-position) + tone mark | 60M |
| Khmer | Syllable | Base + subscript consonants + vowel | 16M |
Current approach (PR #155)
PR #155 fixes the immediate Korean keyboard bug (Unicode mismatch between Compatibility Jamo and Hangul Jamo) using the existing diacritic normalization system. It works but adds complexity:
- diacritic_map with 50+ Jamo mappings
- 5-row keyboard with compound vowel and double consonant keys
- Blocklist of 129 words with compound jongseong that can't be typed on the default keyboard
- physical_key_map to bypass IME for physical keyboards
- All of this to work around the fundamental mismatch between "one jamo per tile" and how Korean actually works
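To make the Unicode mismatch concrete: canonical decomposition (NFD) of a syllable block yields conjoining Hangul Jamo from the U+1100 block, whereas the standalone jamo shown on keys are usually Compatibility Jamo from the U+3130 block. The sketch below only illustrates that mismatch. `TO_COMPATIBILITY` is a hypothetical three-entry excerpt, not the actual diacritic_map from PR #155.

```typescript
// Format each UTF-16 unit of a string as "U+XXXX".
function codePoints(text: string): string[] {
  return text.split("").map((ch) => "U+" + ch.charCodeAt(0).toString(16).toUpperCase());
}

// NFD splits the precomposed syllable 한 (U+D55C) into conjoining Hangul Jamo.
const decomposed = "\uD55C".normalize("NFD");
// codePoints(decomposed) is ["U+1112", "U+1161", "U+11AB"]

// Assumption: the key labelled ㅎ carries the Compatibility Jamo U+314E,
// so a naive comparison against the decomposed initial fails.
const keyboardKey = "\u314E";
const naiveMatch = decomposed.charAt(0) === keyboardKey; // false

// Hypothetical excerpt of a normalization table (illustration only).
const TO_COMPATIBILITY: Record<string, string> = {
  "\u1112": "\u314E", // ㅎ as initial
  "\u1161": "\u314F", // ㅏ as vowel
  "\u11AB": "\u3134", // ㄴ as final
};

// Map conjoining jamo to Compatibility Jamo; unknown characters pass through.
function toCompatibilityJamo(text: string): string {
  return text.split("").map((ch) => TO_COMPATIBILITY[ch] || ch).join("");
}
```

A full table needs an entry for every initial, vowel, and final, which is consistent with the 50+ mappings mentioned above.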
Proposed solution: sub-component tile coloring
The abstraction
Instead of decomposing characters into separate tiles, keep the natural linguistic unit as the tile and color its sub-components independently:
Current:  [ㅎ] [ㅏ] [ㄴ] [ㄱ] [ㅡ] [ㄹ]  (6 tiles, 1 color each)
          🟩 🟩 🟩 🟨 ⬜ 🟩
Proposed: [한] [글]  (2 tiles, 3 colors each)
          ㅎ=🟩 ㅏ=🟩 ㄴ=🟩 ㄱ=🟨 ㅡ=⬜ ㄹ=🟩
The data model would be:
interface TileResult {
  display: string;          // "한": what the player sees
  components: string[];     // ["ㅎ", "ㅏ", "ㄴ"]: what gets compared
  colors: ComponentColor[]; // ["correct", "correct", "correct"]
}
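One possible way to fill in `colors` is sketched below. The issue does not specify the scoring rule, so everything here is an assumption: the three `ComponentColor` values, the `Tile` input type, the `scoreGuess` name, and in particular the choice that "present" ignores slot (a jamo counts as present if an unmatched copy exists anywhere in the target, whether as initial, vowel, or final).

```typescript
type ComponentColor = "correct" | "present" | "absent"; // assumed value set

interface TileResult {
  display: string;
  components: string[];
  colors: ComponentColor[];
}

// Hypothetical input shape: a tile before scoring.
interface Tile {
  display: string;
  components: string[];
}

function scoreGuess(guess: Tile[], target: Tile[]): TileResult[] {
  // Count target components that the guess did not match in place.
  const unmatched: Record<string, number> = {};
  target.forEach((tile, i) => {
    tile.components.forEach((component, j) => {
      const guessed = guess[i] ? guess[i].components[j] : undefined;
      if (guessed !== component) {
        unmatched[component] = (unmatched[component] || 0) + 1;
      }
    });
  });

  return guess.map((tile, i) => {
    const targetTile = target[i];
    const colors = tile.components.map((component, j): ComponentColor => {
      // Same tile position and same slot: correct.
      if (targetTile && targetTile.components[j] === component) return "correct";
      // Otherwise consume one unmatched copy from anywhere in the target.
      if ((unmatched[component] || 0) > 0) {
        unmatched[component] -= 1;
        return "present";
      }
      return "absent";
    });
    return { display: tile.display, components: tile.components, colors };
  });
}
```

With target 한글 and guess 한국, the first tile scores correct in all three slots, and the second tile scores correct for the initial ㄱ and absent for both ㅜ and the final ㄱ, because the target's only ㄱ was already matched in place. Whether a misplaced initial should count toward a final is a design decision this sketch does not settle.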
This single abstraction handles every script:
- Latin/Cyrillic/Arabic: 1 component per tile (current behavior, no change)
- Korean: 2-3 components (initial, vowel, final)
- Tamil/Hindi: 2-3 components (consonant, vowel mark, optional conjunct)
- Chinese: 3-4 components (character, pinyin initial, pinyin final, tone)
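A per-language decomposition registry could look roughly like the sketch below. The names `Decomposer`, `DECOMPOSERS`, and `componentsOf` are hypothetical, and the decomposers are deliberately naive: the Korean one returns conjoining jamo from NFD rather than Compatibility Jamo, and the Tamil one splits by code point, which would not be adequate for Devanagari conjuncts or for Chinese.

```typescript
// A decomposer turns one display tile into its comparable components.
type Decomposer = (tile: string) => string[];

// Hypothetical registry keyed by language code. Languages without an entry
// fall back to "one component per tile", i.e. the current behavior.
const DECOMPOSERS: Record<string, Decomposer> = {
  // Korean: NFD yields initial, vowel, and optional final as conjoining jamo.
  ko: (tile) => tile.normalize("NFD").split(""),
  // Tamil (naive): consonant followed by its vowel sign, one per code point.
  ta: (tile) => Array.from(tile),
};

function componentsOf(lang: string, tile: string): string[] {
  const decompose = DECOMPOSERS[lang];
  return decompose ? decompose(tile) : [tile];
}

// componentsOf("en", "a") has 1 component
// componentsOf("ko", "한") has 3 components, componentsOf("ko", "가") has 2
// componentsOf("ta", "கா") has 2 components
```

Keeping the fallback as the identity means alphabetic languages need no configuration at all.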
Rendering approaches (by complexity)
- CSS diagonal gradient (simplest, 2 signals): Split tile diagonally, with top-left = consonant color and bottom-right = vowel color. Used by Solladal (Tamil Wordle). ~5 lines of CSS.
- CSS absolute positioning (medium, 3-5 signals): Main character centered, component indicators positioned around it as colored dots or small text. Used by 汉兜 (Handle) (Chinese Wordle).
- SVG path decomposition (most polished, 3-5 signals): Decompose the font glyph into separate SVG paths per component, color each path independently. Used by 한들 (Handle) (Korean Wordle). Visually seamless: the syllable block looks normal but each jamo stroke is a different color. Requires a font with non-connected jamo paths.
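As a rough illustration of the simplest option, the diagonal split can be produced with a two-stop `linear-gradient` whose stops meet at 50%. The palette values and the `diagonalGradient` helper below are assumptions for the sketch, not the project's actual theme or Solladal's code.

```typescript
type ComponentColor = "correct" | "present" | "absent"; // assumed value set

// Hypothetical palette; the real theme colors may differ.
const PALETTE: Record<ComponentColor, string> = {
  correct: "#6aaa64",
  present: "#c9b458",
  absent: "#787c7e",
};

// Build a CSS background with a hard diagonal edge: the first color fills the
// top-left half, the second fills the bottom-right half.
function diagonalGradient(topLeft: ComponentColor, bottomRight: ComponentColor): string {
  return `linear-gradient(135deg, ${PALETTE[topLeft]} 50%, ${PALETTE[bottomRight]} 50%)`;
}

// In the browser this could be applied as:
//   tileElement.style.background = diagonalGradient("correct", "present");
```

This only carries two signals, so it fits consonant + vowel scripts but not a three-jamo Korean block.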
Benefits
- Natural word lengths: "5-letter word" = 5 syllables for Korean, 5 aksharas for Tamil/Hindi
- Clean keyboard: Korean needs only 26 basic jamo keys (3 rows), IME works natively
- No blocklists: No compound jongseong keyboard gap — they compose naturally within syllable blocks
- Scalable: Variable word lengths, phrase-of-the-day, and difficulty modes all trivial
- More information per guess: up to 3 color signals per Korean tile instead of 1, giving the player richer feedback
- Universal: One tile system handles every current and future script
Prior art
| Game | Script | Signals/cell | Technique |
| --- | --- | --- | --- |
| 한들 (Handle) | Korean | 3-5 | SVG path decomposition |
| 汉兜 (Handle) | Chinese | 5 | CSS positioned spans |
| Solladal | Tamil | 2 | CSS diagonal gradient |
| Shabdle | Hindi | 1 | No sub-coloring (chose not to) |
| 꼬들 (Kordle) | Korean | 1 | Full decomposition (6 cells, avoids the problem) |
Scope
This is a significant frontend architecture change — not a quick fix. It involves:
- Tile data model: Extend from string to { display, components, colors }
- Color algorithm: Compare at component level, not character level
- Rendering: Choose and implement a sub-coloring technique (CSS gradient → SVG path)
- Word list migration (Korean): Re-encode from decomposed jamo to syllable blocks
- Per-language decomposition config: Define how each script splits characters into components
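For the Korean word list migration, re-encoding is the inverse of decomposition. The sketch below is one hedged way to do it, assuming the existing list stores Compatibility Jamo with compound finals such as ㄺ as single characters. `composeJamo` and the tables are hypothetical names, and the rule "a consonant followed by a vowel starts a new syllable" is an assumption about how syllable boundaries should be recovered.

```typescript
const INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ";
const VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ";
const FINALS = "ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ"; // final index = position + 1

// Hypothetical migration helper: jamo string in, syllable blocks out.
function composeJamo(jamo: string): string {
  const chars = jamo.split("");
  let out = "";
  let i = 0;
  while (i < chars.length) {
    const initial = INITIALS.indexOf(chars[i]);
    const vowel = i + 1 < chars.length ? VOWELS.indexOf(chars[i + 1]) : -1;
    if (initial < 0 || vowel < 0) {
      out += chars[i]; // cannot start a syllable here: copy through unchanged
      i += 1;
      continue;
    }
    i += 2;
    let final = 0; // 0 means "no final consonant"
    const nextIsFinal = i < chars.length && FINALS.indexOf(chars[i]) >= 0;
    const afterIsVowel = i + 1 < chars.length && VOWELS.indexOf(chars[i + 1]) >= 0;
    // A consonant followed by a vowel belongs to the next syllable instead.
    if (nextIsFinal && !afterIsVowel) {
      final = FINALS.indexOf(chars[i]) + 1;
      i += 1;
    }
    out += String.fromCharCode(0xac00 + (initial * 21 + vowel) * 28 + final);
  }
  return out;
}

// composeJamo("ㅎㅏㄴㄱㅡㄹ") gives "한글"
// composeJamo("ㅎㅏㄴㅏ") gives "하나" (ㄴ starts the second syllable)
```

Entries that do not round-trip cleanly (for example a compound final immediately followed by a vowel) would need manual review rather than automatic conversion.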
PR #155 ships the immediate Korean fix using the current architecture. This issue tracks the long-term scalable solution.
Affected languages
Currently supported, would benefit:
- Korean (ko) — most impacted, current workarounds are complex
Not yet supported, would be unblocked:
- Tamil, Hindi, Bengali, Thai, Khmer, Chinese, Japanese (kana — simple case, no sub-coloring needed but same tile model)
Not affected (already work fine):
- All Latin, Cyrillic, Arabic, Greek, Hebrew, Georgian, Armenian scripts