Description
Currently, every transliteration table in Scriptshifter generates a database entry for each of the token pairs in the S2R and R2S sections, for the current table and all of its parents. This means that each table that inherits from one or more other tables creates duplicate entries for all its parents.
While only a few Cyrillic languages used table inheritance, over a few hundred entries, this was not a problem. But now we have tens of nearly identical Cyrillic tables and, more concerning, several Indic languages that use the Devanagari base table, which is over 8K lines. The result is an unnecessarily large and slow database.
This ticket is to restructure the DB tables so that each token pair is no longer bound to a single language and script, but is instead linked to them through a many-to-many relationship via a new join table. This would greatly reduce the number of entries and possibly improve performance as the number of supported languages and scripts grows.
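
To make the proposal concrete, here is a minimal sketch of what the many-to-many layout could look like, using Python's built-in `sqlite3` module. All table, column, and function names (`language`, `token_pair`, `language_token_pair`, `add_pair`, `build_demo`) are made up for illustration and are not the current Scriptshifter schema; the sample tokens are not taken from the real tables either.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not Scriptshifter's actual tables.
SCHEMA = """
CREATE TABLE language (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE token_pair (
    id INTEGER PRIMARY KEY,
    direction TEXT NOT NULL CHECK (direction IN ('S2R', 'R2S')),
    src TEXT NOT NULL,
    dest TEXT NOT NULL,
    UNIQUE (direction, src, dest)  -- each distinct pair is stored once
);
CREATE TABLE language_token_pair (  -- the new join table
    lang_id INTEGER NOT NULL REFERENCES language(id),
    pair_id INTEGER NOT NULL REFERENCES token_pair(id),
    PRIMARY KEY (lang_id, pair_id)
);
"""


def add_pair(conn, lang, direction, src, dest):
    """Link a token pair to a language, reusing the pair row if it exists."""
    conn.execute("INSERT OR IGNORE INTO language (name) VALUES (?)", (lang,))
    conn.execute(
        "INSERT OR IGNORE INTO token_pair (direction, src, dest) "
        "VALUES (?, ?, ?)",
        (direction, src, dest),
    )
    conn.execute(
        "INSERT OR IGNORE INTO language_token_pair (lang_id, pair_id) "
        "SELECT l.id, p.id FROM language l, token_pair p "
        "WHERE l.name = ? AND p.direction = ? AND p.src = ? AND p.dest = ?",
        (lang, direction, src, dest),
    )


def pairs_for(conn, lang, direction):
    """Return the (src, dest) pairs linked to one language and direction."""
    rows = conn.execute(
        "SELECT p.src, p.dest FROM token_pair p "
        "JOIN language_token_pair lp ON lp.pair_id = p.id "
        "JOIN language l ON l.id = lp.lang_id "
        "WHERE l.name = ? AND p.direction = ? ORDER BY p.src",
        (lang, direction),
    )
    return rows.fetchall()


def build_demo():
    """Two languages sharing two inherited pairs, plus one extra pair."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    shared = [("а", "a"), ("б", "b")]  # stand-in for a parent table
    for lang in ("russian", "bulgarian"):
        for src, dest in shared:
            add_pair(conn, lang, "S2R", src, dest)
    add_pair(conn, "bulgarian", "S2R", "ъ", "ŭ")
    return conn
```

In this sketch the two shared pairs are stored once in `token_pair` and referenced from both languages, so the demo holds three pair rows and five link rows rather than five full pair rows. How overrides of inherited tokens should be represented is left open here.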