Census Batch 2026-06-24 + improve language name decoding #687
Merged
Pull request overview
This PR updates LangNav’s census ingestion and reporting tooling alongside a new batch of census/language-population data. It improves the census input/debug UX (including recommended language code paths), adds explicit handling for “unknown” collector types, and expands territory-based language breakdowns to include dependencies.
Changes:
- Enhanced Census input tool + `CensusLanguageCheck` UI to surface recommended language code paths and improved debugging/copy workflows.
- Added `CensusCollectorType.Unknown` and updated metadata parsing to warn when `collectorType` is missing; added a single-column “language names only” parsing mode.
- Added/updated census and locale data (new official/unofficial census TSVs, locale additions, and organization entries).
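The `collectorType` fallback described in the second bullet could look roughly like the sketch below. This is not the code from `parseCensusMetadata.ts`: the `#key<TAB>value` metadata layout, the `Government` and `Study` enum members, and the `parseMetadataSketch` name are assumptions made for illustration. Only `CensusCollectorType.Unknown` and the warning on a missing `collectorType` come from this PR.

```typescript
// Hypothetical stand-in for the project's enum; only `Unknown` is named in this PR.
enum CensusCollectorType {
  Government = "Government",
  Study = "Study",
  Unknown = "Unknown",
}

interface ParsedMetadataSketch {
  collectorType: CensusCollectorType;
  warnings: string[];
}

// Reads metadata lines of the assumed form "#key<TAB>value" and ignores data rows.
function parseMetadataSketch(lines: string[]): ParsedMetadataSketch {
  const fields: Record<string, string> = {};
  for (const line of lines) {
    if (line.charAt(0) !== "#") continue; // not a metadata line
    const parts = line.slice(1).split("\t");
    const key = (parts[0] ?? "").trim();
    const value = (parts[1] ?? "").trim();
    if (key !== "") fields[key] = value;
  }

  const known: string[] = [
    CensusCollectorType.Government,
    CensusCollectorType.Study,
    CensusCollectorType.Unknown,
  ];
  const warnings: string[] = [];
  const raw: string | undefined = fields["collectorType"];

  // Default to Unknown and warn, rather than silently assuming a collector type.
  let collectorType = CensusCollectorType.Unknown;
  if (raw !== undefined && known.indexOf(raw) >= 0) {
    collectorType = raw as CensusCollectorType;
  } else {
    warnings.push("collectorType is missing or unrecognized; defaulting to Unknown");
  }
  return { collectorType, warnings };
}
```

With this shape, a TSV that omits `collectorType` still parses, and the input tool can show the warning next to the metadata instead of failing.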
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/widgets/reports/ReportCensusInputTool.tsx | Updates TSV editor behavior and metadata warning display logic in the Census input tool. |
| src/widgets/details/sections/LanguageSpeakersByTerritorySection.tsx | Includes dependency territories when deriving speaker-by-territory locales. |
| src/entities/language/population/LanguagePopulationFromLocales.tsx | Includes dependency territories when aggregating population breakdown from locales. |
| src/entities/census/parseCensusMetadata.ts | Adds single-column mode and introduces Unknown default collector type + warning. |
| src/entities/census/CensusTypes.tsx | Adds CensusCollectorType.Unknown and ranks it as lowest priority. |
| src/entities/census/CensusLanguageCheckRow.tsx | New table-row component to render recommended vs original language codes with highlighting. |
| src/entities/census/CensusLanguageCheck.tsx | Refactors/extends language code/name checking: recommendations, macrolanguage paths, single-column support, and clipboard export. |
| public/data/tc/organizations.tsv | Adds/updates organization entries used for census collector/presenter linking. |
| public/data/tc/locales.tsv | Adds new locale rows for Curaçao, Pitcairn, Qatar, Suriname, and Togo. |
| public/data/census/unofficial/censusList.txt | Registers additional unofficial AXL census TSVs. |
| public/data/census/unofficial/axl.tg.tsv | Adds new unofficial AXL Togo language population data. |
| public/data/census/unofficial/axl.sg.tsv | Updates Singapore AXL census display name. |
| public/data/census/unofficial/axl.qa.tsv | Adds new unofficial AXL Qatar language population data. |
| public/data/census/unofficial/axl.pn.tsv | Adds new unofficial AXL Pitcairn Islands language population data. |
| public/data/census/unofficial/axl.nf.tsv | Adds new unofficial AXL Norfolk Island language population data. |
| public/data/census/unofficial/axl.gl.tsv | Updates Greenland AXL census display names. |
| public/data/census/unofficial/axl.cx.tsv | Updates Christmas Island AXL census display names. |
| public/data/census/unofficial/axl.cd.tsv | Updates Congo (Kinshasa) AXL census display names and author capitalization. |
| public/data/census/official/wf.tsv | Adds official Wallis & Futuna census dataset. |
| public/data/census/official/sr2012.tsv | Adds official Suriname 2012 census dataset. |
| public/data/census/official/cw2023.tsv | Adds official Curaçao 2023 census dataset. |
| public/data/census/official/censusList.txt | Registers new official census TSVs. |
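As a rough illustration of the `CensusTypes.tsx` row above (ranking `Unknown` as the lowest priority), the sketch below sorts census records by collector type. The enum members other than `Unknown`, the numeric ranks, and the `sortByCollectorPriority` helper are hypothetical; the actual ordering in the repository may differ.

```typescript
// Hypothetical collector types; only `Unknown` is confirmed by this PR.
enum CollectorKindSketch {
  Government = "Government",
  Study = "Study",
  Unknown = "Unknown",
}

// Lower number means higher priority; Unknown is deliberately last.
const COLLECTOR_RANK: Record<CollectorKindSketch, number> = {
  [CollectorKindSketch.Government]: 0,
  [CollectorKindSketch.Study]: 1,
  [CollectorKindSketch.Unknown]: 2,
};

interface CensusSketch {
  id: string;
  collectorType: CollectorKindSketch;
}

// Returns a new array ordered by collector priority, leaving the input untouched.
function sortByCollectorPriority(censuses: CensusSketch[]): CensusSketch[] {
  return censuses
    .slice()
    .sort((a, b) => COLLECTOR_RANK[a.collectorType] - COLLECTOR_RANK[b.collectorType]);
}
```

Keeping the ranks in one lookup table means a newly added collector type has to be given an explicit position, since the `Record` type fails to compile if a member is missing.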
56234df to 63b3f16
Summary: This adds new language population data for many countries, along with the locales for them.
Changes
- `CensusLanguageCheck` was getting big -- I should move more things out, but for now I just moved `CensusLanguageCheckRow`.
Out of scope/Future work: Always more censuses :) Also, I wonder if we will still do the language name -> language code tool, since this emulates much of the functionality.
Test Plan and Screenshots
How to test the changes in this PR: Tested various censuses in the census input tool, added them, verified the potential locales tool, etc. Below is a screenshot of the updates to the census input tool.