Census Batch 2026-06-24 + improve language name decoding #687

Merged
conradarcturus merged 2 commits into master from census-2026-06-24 on Jun 25, 2026

@conradarcturus

Collaborator

Summary: This adds new language population data for many countries, along with locales for those countries.

Changes

  • User experience
  • Logical changes
    • New Unknown collector type so we can detect when the collector type wasn't filled out.
  • Data
    • Added censuses for Curaçao, Pitcairn, Wallis & Futuna, Qatar, Suriname, and Togo
    • Added locales for notable languages that we are now aware of
    • Added organizations
    • Some copy-edits to existing census imports to standardize names better and to fix un-annotated collector types
  • Refactors
    • CensusLanguageCheck was getting big -- I should move more things out but for now I just moved CensusLanguageCheckRow

Out of scope/Future work: Always more censuses :) Also, I wonder whether we will still do the language name -> language code tool, since this emulates much of its functionality.

Test Plan and Screenshots

How to test the changes in this PR: Tested various censuses in the census input tool, added them, verified the potential locales tool, etc. Below is a screenshot of the updates to the census input tool.

| Page/View with link | Description of Changes | Screenshot Before | Screenshot After |
| --- | --- | --- | --- |
| Census input tool | New column showing corrections; highlights differences | Screenshot 2026-06-24 at 22 01 44 | Screenshot 2026-06-24 at 22 02 07 |
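
To illustrate the new column, here is a minimal sketch of how a row might compare the code originally entered against a recommended code and flag differences for highlighting. It is not the code from this PR: `CodeCheckRow`, `normalizeName`, `checkRow`, and the name index are hypothetical, and the real lookup in CensusLanguageCheck is likely more involved.

```typescript
// Hypothetical sketch: none of these names come from the PR itself.
interface CodeCheckRow {
  name: string;
  originalCode: string;
  recommendedCode: string | undefined;
  differs: boolean; // true when the row should be highlighted
}

// Lowercase, trim, and strip combining accents so name spellings compare loosely.
function normalizeName(name: string): string {
  return name
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '')
    .toLowerCase()
    .trim();
}

// Look up a recommended code by language name and compare it with the original.
function checkRow(
  name: string,
  originalCode: string,
  nameIndex: Map<string, string>,
): CodeCheckRow {
  const recommendedCode = nameIndex.get(normalizeName(name));
  const differs = recommendedCode !== undefined && recommendedCode !== originalCode;
  return { name, originalCode, recommendedCode, differs };
}
```

A row is only flagged when a recommendation exists and disagrees with the entered code, so unrecognized names are left for manual review rather than highlighted as wrong.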

@github-actions

github-actions Bot commented Jun 25, 2026

Contributor

Cloudflare Pages preview

Preview torn down on PR close: 1 deployment(s) deleted.

Copilot AI left a comment

Pull request overview

This PR updates LangNav’s census ingestion and reporting tooling alongside a new batch of census/language-population data. It improves the census input/debug UX (including recommended language code paths), adds explicit handling for “unknown” collector types, and expands territory-based language breakdowns to include dependencies.

Changes:

  • Enhanced Census input tool + CensusLanguageCheck UI to surface recommended language code paths and improved debugging/copy workflows.
  • Added CensusCollectorType.Unknown and updated metadata parsing to warn when collectorType is missing; added a single-column “language names only” parsing mode.
  • Added/updated census and locale data (new official/unofficial census TSVs, locale additions, and organization entries).
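
The collector-type change can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: only `CensusCollectorType.Unknown` and the `collectorType` field are named above, while the other enum members, `parseCollectorType`, `compareCollectorPriority`, and the warning text are hypothetical.

```typescript
// Hypothetical sketch: only CensusCollectorType.Unknown and the collectorType
// field are named in the PR; every other name here is an assumption.
enum CensusCollectorType {
  Government = 'Government', // assumed member
  Study = 'Study', // assumed member
  Unknown = 'Unknown',
}

interface CollectorParseResult {
  collectorType: CensusCollectorType;
  warnings: string[];
}

// Default to Unknown and record a warning when the field is missing or unrecognized.
function parseCollectorType(raw: string | undefined): CollectorParseResult {
  const value = (raw ?? '').trim();
  const known = Object.values(CensusCollectorType) as string[];
  if (known.includes(value)) {
    return { collectorType: value as CensusCollectorType, warnings: [] };
  }
  const warning =
    value === '' ? 'collectorType is missing' : `collectorType "${value}" is not recognized`;
  return { collectorType: CensusCollectorType.Unknown, warnings: [warning] };
}

// Smaller rank sorts first; Unknown gets the largest rank, so it is the lowest priority.
const COLLECTOR_RANK: Record<CensusCollectorType, number> = {
  [CensusCollectorType.Government]: 0,
  [CensusCollectorType.Study]: 1,
  [CensusCollectorType.Unknown]: 2,
};

function compareCollectorPriority(a: CensusCollectorType, b: CensusCollectorType): number {
  return COLLECTOR_RANK[a] - COLLECTOR_RANK[b];
}
```

Using an explicit `Unknown` value rather than silently defaulting to a real collector type keeps un-annotated imports visible, which matches the stated goal of checking whether the field was filled out.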

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| src/widgets/reports/ReportCensusInputTool.tsx | Updates TSV editor behavior and metadata warning display logic in the Census input tool. |
| src/widgets/details/sections/LanguageSpeakersByTerritorySection.tsx | Includes dependency territories when deriving speaker-by-territory locales. |
| src/entities/language/population/LanguagePopulationFromLocales.tsx | Includes dependency territories when aggregating population breakdown from locales. |
| src/entities/census/parseCensusMetadata.ts | Adds single-column mode and introduces Unknown default collector type + warning. |
| src/entities/census/CensusTypes.tsx | Adds CensusCollectorType.Unknown and ranks it as lowest priority. |
| src/entities/census/CensusLanguageCheckRow.tsx | New table-row component to render recommended vs original language codes with highlighting. |
| src/entities/census/CensusLanguageCheck.tsx | Refactors/extends language code/name checking: recommendations, macrolanguage paths, single-column support, and clipboard export. |
| public/data/tc/organizations.tsv | Adds/updates organization entries used for census collector/presenter linking. |
| public/data/tc/locales.tsv | Adds new locale rows for Curaçao, Pitcairn, Qatar, Suriname, and Togo. |
| public/data/census/unofficial/censusList.txt | Registers additional unofficial AXL census TSVs. |
| public/data/census/unofficial/axl.tg.tsv | Adds new unofficial AXL Togo language population data. |
| public/data/census/unofficial/axl.sg.tsv | Updates Singapore AXL census display name. |
| public/data/census/unofficial/axl.qa.tsv | Adds new unofficial AXL Qatar language population data. |
| public/data/census/unofficial/axl.pn.tsv | Adds new unofficial AXL Pitcairn Islands language population data. |
| public/data/census/unofficial/axl.nf.tsv | Adds new unofficial AXL Norfolk Islands language population data. |
| public/data/census/unofficial/axl.gl.tsv | Updates Greenland AXL census display names. |
| public/data/census/unofficial/axl.cx.tsv | Updates Christmas Island AXL census display names. |
| public/data/census/unofficial/axl.cd.tsv | Updates Congo (Kinshasa) AXL census display names and author capitalization. |
| public/data/census/official/wf.tsv | Adds official Wallis & Futuna census dataset. |
| public/data/census/official/sr2012.tsv | Adds official Suriname 2012 census dataset. |
| public/data/census/official/cw2023.tsv | Adds official Curaçao 2023 census dataset. |
| public/data/census/official/censusList.txt | Registers new official census TSVs. |
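
The two dependency-territory changes listed above could look something like the sketch below. This is a guess at the general shape, not the implementation in `LanguageSpeakersByTerritorySection.tsx` or `LanguagePopulationFromLocales.tsx`: the `LocaleRow` shape, the `TerritoryKind` values, and `speakersByTerritory` are all hypothetical.

```typescript
// Hypothetical sketch: the types and function name below are assumptions,
// not identifiers from the PR.
type TerritoryKind = 'country' | 'dependency' | 'region';

interface LocaleRow {
  languageCode: string;
  territoryCode: string;
  territoryKind: TerritoryKind;
  population: number;
}

// Sum speakers of one language per territory, counting dependencies as well as countries.
function speakersByTerritory(locales: LocaleRow[], languageCode: string): Map<string, number> {
  const totals = new Map<string, number>();
  for (const locale of locales) {
    if (locale.languageCode !== languageCode) continue;
    const included = locale.territoryKind === 'country' || locale.territoryKind === 'dependency';
    if (!included) continue; // broader regions are skipped in this sketch
    const previous = totals.get(locale.territoryCode) ?? 0;
    totals.set(locale.territoryCode, previous + locale.population);
  }
  return totals;
}
```

With a filter like this, a locale attached to a dependency such as Curaçao (CW) would appear in the per-territory breakdown alongside locales attached to sovereign countries.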

Comment thread src/entities/census/CensusLanguageCheck.tsx
Comment thread src/entities/census/CensusLanguageCheck.tsx Outdated
Comment thread src/entities/census/CensusLanguageCheckRow.tsx Outdated
Comment thread src/widgets/reports/ReportCensusInputTool.tsx Outdated
@conradarcturus conradarcturus merged commit 3cdfe92 into master Jun 25, 2026
8 checks passed
@conradarcturus conradarcturus deleted the census-2026-06-24 branch June 25, 2026 05:25