Build culture compiler for Unicode CLDR database

The current culture data, inherited from the original Time Machine project, is of unclear origin. We can now recompile it from JSON, which at least allows editing and fixing potential bugs, but we should rebuild it from scratch from an authoritative source and clearly attribute that source.

A suitable origin would be Unicode's CLDR database: https://cldr.unicode.org/

CLDR publishes culture data in XML format, which should be straightforward to parse. It seems to cover more cases and more calendar types than the current culture data, and it also lists eras correctly. The first step would be to build an XML parser that writes the data to JSON, so that we can produce a proper diff against the current data. The next step would be to compile the XML straight to our binary format.
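A minimal sketch of that first step, assuming the usual LDML layout (`dates/calendars/calendar/months/...` and `eras/eraAbbr`). The embedded XML is an abbreviated, hand-written sample rather than real CLDR content, and `extract_calendar` is a hypothetical helper name. It ignores locale inheritance and aliases, which a real compiler would have to resolve.

```python
import json
import xml.etree.ElementTree as ET

# Abbreviated, hand-written sample in the shape of a CLDR LDML locale file.
SAMPLE_LDML = """\
<ldml>
  <dates>
    <calendars>
      <calendar type="gregorian">
        <months>
          <monthContext type="format">
            <monthWidth type="wide">
              <month type="1">January</month>
              <month type="2">February</month>
            </monthWidth>
          </monthContext>
        </months>
        <eras>
          <eraAbbr>
            <era type="0">BC</era>
            <era type="1">AD</era>
          </eraAbbr>
        </eras>
      </calendar>
    </calendars>
  </dates>
</ldml>
"""


def extract_calendar(ldml_text, calendar_type="gregorian"):
    """Collect month names and abbreviated eras for one calendar (hypothetical helper)."""
    root = ET.fromstring(ldml_text)
    cal = root.find(f"./dates/calendars/calendar[@type='{calendar_type}']")
    if cal is None:
        return {}

    months = {}
    for ctx in cal.findall("./months/monthContext"):
        for width in ctx.findall("./monthWidth"):
            # Key such as "format/wide" or "stand-alone/abbreviated".
            key = f"{ctx.get('type')}/{width.get('type')}"
            months[key] = {m.get("type"): m.text for m in width.findall("./month")}

    # Skip alternate forms (elements carrying an "alt" attribute) to keep one value per era.
    eras = {
        e.get("type"): e.text
        for e in cal.findall("./eras/eraAbbr/era")
        if e.get("alt") is None
    }
    return {"months": months, "eras": eras}


data = extract_calendar(SAMPLE_LDML)
# Sorted keys keep the JSON stable, which makes diffing against the existing data easier.
print(json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False))
```

Writing the result with sorted keys is a deliberate choice here: the point of the intermediate JSON is the diff, so the output order should not depend on the order of elements in the XML.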

Update:
- On closer inspection, there is a JSON version of the CLDR database published here: https://github.com/unicode-org/cldr-json. The license is permissive enough for the data to be included in an open source package.
- There are interesting discrepancies between the CLDR notation and Time Machine's notation. The datetime format patterns can't be taken "as is".
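- The month names in CLDR seem to be genitive only? (CLDR distinguishes a `format` and a `stand-alone` context for month names, so the `stand-alone` forms are worth checking.)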
- This library could be helpful to interpret the database: https://www.npmjs.com/package/cldr
- Documentation of the CLDR datetime notation: https://unicode-org.github.io/icu/userguide/format_parse/datetime/
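Since the CLDR patterns can't be used as is, a translation step will probably start by splitting each pattern into fields and literals. The sketch below assumes the quoting rules from the CLDR/ICU pattern syntax: runs of the same ASCII letter form a field, text in single quotes is literal, and `''` is a literal single quote. `tokenize_pattern` is a hypothetical name, and the mapping of fields to Time Machine's notation is intentionally left out because that notation is not described in this issue.

```python
def tokenize_pattern(pattern):
    """Split a CLDR date/time pattern into ("field", ...) and ("literal", ...) tokens."""
    tokens = []
    literal = []      # pending literal characters, merged into one token
    in_quote = False
    i = 0
    n = len(pattern)

    def flush():
        if literal:
            tokens.append(("literal", "".join(literal)))
            literal.clear()

    while i < n:
        ch = pattern[i]
        if ch == "'":
            if i + 1 < n and pattern[i + 1] == "'":
                # Doubled quote: a literal apostrophe, inside or outside quotes.
                literal.append("'")
                i += 2
            else:
                # Single quote: toggles quoted-literal mode and is not emitted.
                in_quote = not in_quote
                i += 1
        elif not in_quote and ch.isascii() and ch.isalpha():
            # Unquoted ASCII letters are pattern characters; a run of the
            # same letter is one field (e.g. "MMMM").
            flush()
            j = i
            while j < n and pattern[j] == ch:
                j += 1
            tokens.append(("field", pattern[i:j]))
            i = j
        else:
            literal.append(ch)
            i += 1

    flush()
    return tokens


for token in tokenize_pattern("EEEE, d MMMM y 'at' HH:mm"):
    print(token)
```

A translator could then walk the token list and replace each field with the equivalent in the target notation, reporting any field it has no mapping for instead of passing it through silently.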

Build culture compiler for Unicode CLDR database #30