The current culture data, inherited from the original Time Machine project, is of unclear origin. We can now recompile it from JSON, which at least allows editing and fixing potential bugs, but we should rebuild it from scratch from an authoritative source so that the data has a clear attribution.
A suitable origin would be Unicode's CLDR database: https://cldr.unicode.org/
It publishes culture data in XML format that should be straightforward to parse. It seems to cover more edge cases and more calendar types than the current culture data, and it also lists eras correctly. The first step would be to build an XML parser and write the data to JSON so that we can produce a proper diff against the current data. The second step would be to compile the XML straight to our binary format.
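As a rough illustration of the first step (XML in, diffable JSON out), here is a minimal Python sketch. It assumes the LDML layout `ldml/dates/calendars/calendar/eras/eraAbbr/era` and uses a trimmed inline sample rather than a real CLDR file. The function names `extract_eras` and `to_diffable_json` and the output JSON shape are hypothetical and are not the project's existing JSON schema.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed sample modeled on a CLDR LDML locale file (assumption: a real file
# contains far more elements; only abbreviated era names are shown here).
SAMPLE_LDML = """<ldml>
  <dates>
    <calendars>
      <calendar type="gregorian">
        <eras>
          <eraAbbr>
            <era type="0">BC</era>
            <era type="0" alt="variant">BCE</era>
            <era type="1">AD</era>
            <era type="1" alt="variant">CE</era>
          </eraAbbr>
        </eras>
      </calendar>
    </calendars>
  </dates>
</ldml>
"""


def extract_eras(ldml_text):
    """Return {calendar type: {"eraAbbr": {era index: name}}} (hypothetical shape)."""
    root = ET.fromstring(ldml_text)
    result = {}
    for calendar in root.findall("./dates/calendars/calendar"):
        eras = {}
        for era in calendar.findall("./eras/eraAbbr/era"):
            # Skip alternate forms (e.g. BCE/CE) so each index maps to one name.
            if era.get("alt") is not None:
                continue
            eras[era.get("type")] = era.text
        result[calendar.get("type")] = {"eraAbbr": eras}
    return result


def to_diffable_json(data):
    """Serialize with sorted keys and fixed indentation so diffs stay stable."""
    return json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False)


if __name__ == "__main__":
    print(to_diffable_json(extract_eras(SAMPLE_LDML)))
```

Sorting keys and fixing the indentation keeps the output deterministic, so a textual diff against the existing JSON only shows real data differences.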
Update: