Plan for adding a msgcat CLI that closes the workflow gap vs go-i18n: extract (discover message keys from code and/or YAML) and merge (prepare translation files from a source language).
Goals
- Developers can discover which message keys are used in Go and ensure the source catalog has them.
- Translators get per-language files that list only missing/empty messages with source text as placeholder.
- Optional group (int or string) in YAML for organization (e.g. `group: "api"` or `group: 0`); CLI and library support it.
- CLI is additive; library gets a small, backward-compatible extension for group.
Scope
- New package or module: `cmd/msgcat` (or `internal/msgcatcli`) producing a `msgcat` binary.
- Two subcommands: `extract`, `merge`.
- Message format: current msgcat YAML (`default` + `set` with optional `code`) plus optional group (int or string); see §7.
- Small library extension for group support so YAML and CLI stay in sync.
Purpose: Discover message keys referenced in Go and optionally sync them into the source language YAML so every used key exists (with empty or placeholder content for new keys).
- Scan Go code for message key string literals in:
  - `GetMessageWithCtx(ctx, "key", ...)`
  - `WrapErrorWithCtx(ctx, err, "key", ...)`
  - `GetErrorWithCtx(ctx, "key", ...)`
- Keys are the first string literal argument to these functions. Support both direct string literals and concatenation of string literals (optional).
- Option A (keys only): Output the unique set of keys (e.g. one per line to stdout, or to a file). Use case: “what keys does the code use?” or input to tooling.
- Option B (sync to source YAML): Read the source language file (e.g. `en.yaml`), add any key that appears in Go but not in `set`; new entries get empty `short`/`long` (or a comment/placeholder). Write back to the source file or to `-out`. Use case: “after adding new GetMessageWithCtx calls, update en.yaml so translators see new keys.”
We can implement both: e.g. extract -keys prints keys; extract -source en.yaml -out en.yaml syncs keys into that file.
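The sync behavior of Option B can be sketched on in-memory data. This is a minimal illustration, not the planned implementation: the `Entry` type and the `syncKeys` helper are hypothetical names, and YAML reading and writing are left out so the example stays stdlib-only.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry is a hypothetical stand-in for one message under "set".
type Entry struct {
	Short string
	Long  string
}

// syncKeys adds an empty Entry for every key used in code but absent from set.
// Existing entries are never modified. It returns the added keys, sorted.
func syncKeys(set map[string]Entry, codeKeys []string) []string {
	var added []string
	for _, k := range codeKeys {
		if _, ok := set[k]; !ok {
			set[k] = Entry{} // empty short/long, to be filled in by a human
			added = append(added, k)
		}
	}
	sort.Strings(added)
	return added
}

func main() {
	set := map[string]Entry{
		"greeting.hello": {Short: "Hello", Long: "Hello there"},
	}
	added := syncKeys(set, []string{"greeting.hello", "error.not_found"})
	fmt.Println("added:", added)
	fmt.Println("total keys:", len(set))
}
```

Keys that exist in the catalog but are no longer used in code are left alone here; whether sync mode should ever remove them is not decided in this plan.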
- Walk directories for `*.go` (respect `-exclude` for vendor, etc.).
- Skip `_test.go` unless `-include-tests` (default: skip).
- Use `go/ast` to find call expressions whose selector is `GetMessageWithCtx`, `WrapErrorWithCtx`, or `GetErrorWithCtx` and package is `msgcat` (or configurable import path). Extract first string argument (handle basic + concatenation if desired).
- Dedupe keys.
- Keys-only mode: print keys (e.g. sorted) to stdout or `-out`.
- Sync mode: parse source YAML (reuse or mirror msgcat’s `Messages` struct), merge in missing keys with empty short/long, write YAML (preserve order/comment where feasible or at least valid YAML).
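The AST step above could look roughly like the following sketch. It parses a single file from a string and matches calls by selector name only; checking that the receiver really comes from the msgcat package (the `-msgcat-pkg` idea) would need import or type information and is not shown. The helper names `stringLit` and `extractKeys` are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
	"strconv"
)

// targetFuncs lists the selector names whose message key we collect.
var targetFuncs = map[string]bool{
	"GetMessageWithCtx": true,
	"WrapErrorWithCtx":  true,
	"GetErrorWithCtx":   true,
}

// stringLit evaluates a string literal or a "+" concatenation of string literals.
func stringLit(e ast.Expr) (string, bool) {
	switch v := e.(type) {
	case *ast.BasicLit:
		if v.Kind != token.STRING {
			return "", false
		}
		s, err := strconv.Unquote(v.Value)
		return s, err == nil
	case *ast.BinaryExpr:
		if v.Op != token.ADD {
			return "", false
		}
		l, ok1 := stringLit(v.X)
		r, ok2 := stringLit(v.Y)
		return l + r, ok1 && ok2
	case *ast.ParenExpr:
		return stringLit(v.X)
	}
	return "", false
}

// extractKeys parses one Go source file and returns its sorted, deduplicated keys.
func extractKeys(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, err
	}
	seen := map[string]bool{}
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok || !targetFuncs[sel.Sel.Name] {
			return true
		}
		// The key is the first argument that is a string literal.
		for _, arg := range call.Args {
			if key, ok := stringLit(arg); ok {
				seen[key] = true
				break
			}
		}
		return true
	})
	keys := make([]string, 0, len(seen))
	for k := range seen {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	// The identifiers catalog, ctx and err are never resolved; parsing alone does not need them.
	src := `package demo
func f() {
	catalog.GetMessageWithCtx(ctx, "greeting.hello", nil)
	catalog.WrapErrorWithCtx(ctx, err, "error." + "not_found")
	catalog.GetErrorWithCtx(ctx, "greeting.hello")
}`
	keys, err := extractKeys("demo.go", src)
	if err != nil {
		panic(err)
	}
	for _, k := range keys {
		fmt.Println(k)
	}
}
```

Keys built from variables or constants are silently skipped by this sketch; a real tool might want to report them so that dynamic keys do not go unnoticed.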
| Flag | Description |
|---|---|
| paths | Directories or files to scan (default: .) |
| -out | Output file (keys mode: one key per line; sync mode: YAML path) |
| -source | Source language YAML path (enables sync mode; e.g. resources/messages/en.yaml) |
| -format | For keys: keys (one per line) or yaml (minimal YAML stub). For sync: ignored, output is YAML. |
| -include-tests | Include _test.go files (default: false) |
| -msgcat-pkg | Import path for msgcat (default: github.com/loopcontext/msgcat) so we detect the right calls |
Purpose: From a source language file (e.g. en.yaml), produce per-language translate files that contain every key from the source; for keys missing or empty in the target, use source short/long as placeholder so translators can fill them.
- Input:
  - One source message file (e.g. `en.yaml`). All keys from this file define the canonical set.
  - Optional target message files (e.g. `es.yaml`, `fr.yaml`). If a target file exists, we use it to prefill already-translated entries.
- Output:
  - One file per target language: `translate.<lang>.yaml` (e.g. `translate.es.yaml`).
  - Content: same structure as msgcat YAML (`default` + `set`). For each key in source:
    - If target has a non-empty `short` and `long` for that key, use target’s content (considered translated).
    - Otherwise, use source’s `short`/`long` as placeholder (and optional `code` from source).
  - So translators open `translate.es.yaml`, see English where Spanish is missing, and replace it with Spanish. When done, they rename or copy `translate.es.yaml` → `es.yaml` for use at runtime. No separate “active” file in msgcat’s loader; the directory just has `en.yaml`, `es.yaml`, etc.
- Target file missing: Treat as “no translations yet”; output `translate.<lang>.yaml` with all keys from source (source content as placeholder).
- Key in target but not in source: Optionally drop (so translate file is “keys we care about”) or keep (document in plan). Recommend: drop so translate file is exactly “keys from source that need translation.”
- default block: Copy source `default` into each translate file so the file is valid; translators can replace with localized default.
- code field: Copy from source when creating placeholder entries; if target had a value, keep target’s (or always use source for consistency; document the choice).
- Parse source YAML into a structure that matches msgcat’s `Messages` (default + set).
- For each target language (from `-targetLangs` and/or from existing `*.yaml` in a directory):
  - Parse target file if present.
  - Build merged `set`: for each key in source, if target has non-empty short and long, use target; else use source.
  - Write `translate.<lang>.yaml` with merged content (and default from source).
- Language tag comes from filename (e.g. `es.yaml` → `es`) or from `-targetLangs es,fr`.
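The merge rule can be sketched on in-memory maps as follows. The `Entry` type and `mergeSet` function are hypothetical; the sketch also picks one of the open choices from the edge cases (keep the target’s `code` when it has one, otherwise fall back to the source’s) and drops keys that exist only in the target, as recommended.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry is a hypothetical stand-in for one message under "set".
type Entry struct {
	Short string
	Long  string
	Code  string
}

// mergeSet builds the set for translate.<lang>.yaml. Only keys from source are
// emitted; a target entry wins when both its short and long are non-empty.
// A nil target (target file missing) yields a copy of the source set.
func mergeSet(source, target map[string]Entry) map[string]Entry {
	out := make(map[string]Entry, len(source))
	for key, src := range source {
		tgt, ok := target[key]
		if ok && tgt.Short != "" && tgt.Long != "" {
			if tgt.Code == "" {
				tgt.Code = src.Code // assumed policy: fall back to the source code
			}
			out[key] = tgt
			continue
		}
		out[key] = src // placeholder: source text shown to the translator
	}
	return out
}

func main() {
	source := map[string]Entry{
		"greeting.hello":  {Short: "Hello", Long: "Hello there"},
		"error.not_found": {Short: "Not found", Long: "Resource not found", Code: "404"},
	}
	target := map[string]Entry{
		"greeting.hello": {Short: "Hola", Long: "Hola a todos"},
		"old.key":        {Short: "Viejo", Long: "Ya no se usa"},
	}
	merged := mergeSet(source, target)
	keys := make([]string, 0, len(merged))
	for k := range merged {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s: %+v\n", k, merged[k])
	}
}
```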
| Flag | Description |
|---|---|
| -source | Source message file (e.g. resources/messages/en.yaml) |
| -targetLangs | Comma-separated target language tags (e.g. es,fr). If not set, can infer from `*.yaml` in -targetDir (excluding the source file and `translate.*`). |
| -targetDir | Directory containing target YAMLs (e.g. resources/messages). Optional; can pass target files as positional args. |
| -outdir | Where to write translate.<lang>.yaml (default: same dir as source or .) |
| -translatePrefix | Filename prefix for translation files (default: translate.) so output is translate.es.yaml. |
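As an illustration of how these flags might be wired with the standard `flag` package, the sketch below parses them into a struct and resolves the target languages. The `mergeOpts` type and both helpers are hypothetical, and the directory listing is passed in as a slice of names so the example does not touch the filesystem.

```go
package main

import (
	"flag"
	"fmt"
	"path/filepath"
	"sort"
	"strings"
)

// mergeOpts holds the merge flags from the table above.
type mergeOpts struct {
	Source      string
	TargetLangs string
	TargetDir   string
	OutDir      string
	Prefix      string
}

// parseMergeFlags parses the arguments that follow the "merge" subcommand.
func parseMergeFlags(args []string) (mergeOpts, error) {
	var o mergeOpts
	fs := flag.NewFlagSet("merge", flag.ContinueOnError)
	fs.StringVar(&o.Source, "source", "", "source message file")
	fs.StringVar(&o.TargetLangs, "targetLangs", "", "comma-separated target language tags")
	fs.StringVar(&o.TargetDir, "targetDir", "", "directory containing target YAMLs")
	fs.StringVar(&o.OutDir, "outdir", "", "where to write translate files")
	fs.StringVar(&o.Prefix, "translatePrefix", "translate.", "filename prefix for translation files")
	err := fs.Parse(args)
	return o, err
}

// targetLangs returns the explicit -targetLangs list, or infers tags from the
// *.yaml names in dirFiles, skipping the source file and translate files.
func targetLangs(o mergeOpts, dirFiles []string) []string {
	if o.TargetLangs != "" {
		return strings.Split(o.TargetLangs, ",")
	}
	var langs []string
	srcBase := filepath.Base(o.Source)
	for _, name := range dirFiles {
		base := filepath.Base(name)
		if filepath.Ext(base) != ".yaml" || base == srcBase || strings.HasPrefix(base, o.Prefix) {
			continue
		}
		langs = append(langs, strings.TrimSuffix(base, ".yaml"))
	}
	sort.Strings(langs)
	return langs
}

func main() {
	o, err := parseMergeFlags([]string{"-source", "resources/messages/en.yaml", "-outdir", "resources/messages"})
	if err != nil {
		panic(err)
	}
	dirFiles := []string{"en.yaml", "es.yaml", "fr.yaml", "translate.es.yaml", "README.md"}
	for _, lang := range targetLangs(o, dirFiles) {
		fmt.Println(filepath.Join(o.OutDir, o.Prefix+lang+".yaml"))
	}
}
```

The default for `-outdir` is left empty here; resolving it to the source file’s directory, as the table suggests, would be one extra step after parsing.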
Initial setup
- Maintain source language (e.g. `en.yaml`) with all messages.
- Run: `msgcat extract -source resources/messages/en.yaml -out resources/messages/en.yaml` when new keys are added in code (or run `extract -keys` and add keys manually).
Adding a new language (e.g. Spanish)
- Run: `msgcat merge -source resources/messages/en.yaml -targetLangs es -outdir resources/messages`
- Get `resources/messages/translate.es.yaml` with all keys and English placeholders.
- Translators fill in Spanish.
- Rename/copy `translate.es.yaml` → `es.yaml` in the same directory. msgcat loads `es.yaml` at runtime.
Adding new keys to en.yaml later
- Update `en.yaml` (or run `extract -source` to add keys from Go).
- Run merge again: `msgcat merge -source resources/messages/en.yaml -targetDir resources/messages -outdir resources/messages`
- New keys appear in existing `translate.es.yaml` (and other targets) with English placeholders; existing translations are preserved in the merge output.
Optional: validate keys in code vs YAML
- `msgcat extract -out keys.txt` (keys only).
- Compare keys.txt to keys in en.yaml to find “in code but not in catalog” or “in catalog but not in code.”
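The comparison itself is a two-way set difference. A minimal sketch, with the hypothetical helper `diffKeys` taking the two key lists as slices (reading keys.txt and en.yaml is left out):

```go
package main

import (
	"fmt"
	"sort"
)

// diffKeys returns the keys used in code but missing from the catalog, and the
// keys present in the catalog but not used in code. Both results are sorted.
func diffKeys(codeKeys, catalogKeys []string) (missing, unused []string) {
	inCode := make(map[string]bool, len(codeKeys))
	for _, k := range codeKeys {
		inCode[k] = true
	}
	inCatalog := make(map[string]bool, len(catalogKeys))
	for _, k := range catalogKeys {
		inCatalog[k] = true
	}
	for k := range inCode {
		if !inCatalog[k] {
			missing = append(missing, k)
		}
	}
	for k := range inCatalog {
		if !inCode[k] {
			unused = append(unused, k)
		}
	}
	sort.Strings(missing)
	sort.Strings(unused)
	return missing, unused
}

func main() {
	code := []string{"greeting.hello", "error.not_found"}
	catalog := []string{"greeting.hello", "legacy.banner"}
	missing, unused := diffKeys(code, catalog)
	fmt.Println("in code but not in catalog:", missing)
	fmt.Println("in catalog but not in code:", unused)
}
```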
- Optional group (library): Add `OptionalGroup` type (unmarshal/marshal int or string), add optional `Group` to `Messages`. No runtime behavior change. Tests for YAML round-trip. Enables CLI to preserve group in extract/merge.
- Extract (keys only): AST walk, collect keys from the three API calls, output unique list. Tests with a few fixture .go files.
- Extract (sync to source YAML): Parse msgcat YAML (reuse types or duplicate minimal struct in CLI to avoid coupling), add missing keys with empty short/long, preserve `group`, write YAML. Tests with in-memory YAML.
- Merge: Parse source + target YAMLs, copy source `group` into translate files, build translate.*.yaml per target, write to outdir. Tests with fixture YAMLs.
- CLI wiring: `cmd/msgcat/main.go` with subcommands extract/merge, flags, and clear usage. Install with `go install github.com/loopcontext/msgcat/cmd/msgcat@latest`.
- Docs: README section “CLI workflow (extract & merge)”, link from main README; add "Optional group"; optional CONTEXT7 update.
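For the CLI wiring step, subcommand dispatch needs nothing beyond the standard library. The sketch below is one possible shape, with placeholder handlers and a hypothetical `dispatch` function; a real `main` would pass `os.Args[1:]` and exit non-zero on error.

```go
package main

import (
	"fmt"
)

// subcommands maps each subcommand name to its handler. The handlers here are
// placeholders; real ones would parse their own flags and do the work.
var subcommands = map[string]func(args []string) error{
	"extract": func(args []string) error { return nil },
	"merge":   func(args []string) error { return nil },
}

// dispatch runs the handler named by args[0] with the remaining arguments.
func dispatch(args []string) error {
	if len(args) == 0 {
		return fmt.Errorf("usage: msgcat <extract|merge> [flags]")
	}
	run, ok := subcommands[args[0]]
	if !ok {
		return fmt.Errorf("unknown subcommand %q (want extract or merge)", args[0])
	}
	return run(args[1:])
}

func main() {
	// Fixed argument lists stand in for os.Args[1:] so the demo is repeatable.
	for _, args := range [][]string{{"extract", "-keys"}, {"bogus"}, {}} {
		if err := dispatch(args); err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("ok:", args[0])
	}
}
```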
- Option A: `cmd/msgcat/` in the same repo (same module). Binary is `msgcat`. Dependencies: only stdlib + YAML parser (and go/ast). Prefer not to depend on msgcat package for parsing so CLI works even if YAML structs are internal; we can duplicate minimal YAML structs in the CLI or use a generic map + marshal.
- Option B: Separate module `github.com/loopcontext/msgcat/cmd/msgcat` or `github.com/loopcontext/msgcat-cli`. Same repo is simpler; same module keeps one go.mod.
Recommendation: same repo, same module, cmd/msgcat/ with minimal duplication of YAML structures (or import msgcat and use its Messages/RawMessage if they stay public and we don’t pull in heavy deps). If we want zero dependency on msgcat at parse time, the CLI can define its own messagesDoc struct for YAML and produce the same format.
- CLDR plurals: Merge does not need to understand plural forms; msgcat’s current `{{plural:count|singular|plural}}` is preserved as literal strings in YAML.
- Hash / change detection: We could add optional hash of source content per key to detect “translation was for old version” (like goi18n). Defer to a later iteration.
- Other formats (TOML/JSON): Only YAML in/out for now to match msgcat.
Purpose: Allow message files or entries to be tagged with a group that can be either an integer or a string (e.g. group: "api" or group: 0). Use for organization, filtering, or tooling—e.g. all API errors in group "api", or numeric groups for legacy systems.
- Type `OptionalGroup`: Same pattern as `OptionalCode`: a type that unmarshals from int or string in YAML. Internal representation can be string (e.g. `0` → `"0"`, `"api"` → `"api"`); marshal back as string, or preserve kind so that numeric input round-trips as `group: 0` and string as `group: "api"` (implementation choice).
- Where group lives:
  - File-level (recommended): Add optional `Group OptionalGroup` to `Messages`. In YAML, top-level `group: "api"` or `group: 0` applies to the whole file. One group per file is the common case.
  - Per-entry (optional): Add optional `Group OptionalGroup` to `RawMessage`. In YAML, each key in `set` can have `group: "api"` or `group: 0` to override or sub-categorize. Implement if needed after file-level is done.
- Runtime behavior: The catalog does not interpret group; it is only stored and available for tooling or future use (e.g. filtering, export). No change to `Message` or `GetMessageWithCtx` return type unless we later add a way to expose group (e.g. `Message.Group`). For this plan, adding the field to the YAML struct and parsing is enough.
- YAML example (file-level):

      group: api
      default:
        short: Unexpected error
        long: ...
      set:
        error.not_found:
          short: Not found
          long: ...

  Or numeric:

      group: 0
      default:
        short: Unexpected error
      set:
        greeting.hello:
          short: Hello
          long: ...
- Extract: When reading/writing source YAML (sync mode), preserve the existing `group` field. When creating a new source file from keys only, omit group (or add a default) per project preference.
- Merge: When building translate files, copy the source file’s `group` into each output `translate.<lang>.yaml` so the translated file has the same group as the source. If per-entry group is added later, copy from source entry when creating placeholders.
- OptionalGroup can live in the same package as `OptionalCode` (e.g. `code.go` or new `group.go`). UnmarshalYAML: accept `int`, `int64`, `string`; store as string for simplicity. MarshalYAML: if the string is numeric (e.g. `strconv.Atoi` succeeds), emit as int for readability; otherwise emit as string. That gives `group: 0` and `group: "api"` round-trip.
- Validation: no uniqueness or allowed-values check; group is opaque to the library.
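A minimal sketch of such a type follows. It assumes the callback-style `UnmarshalYAML(func(interface{}) error) error` hook (the gopkg.in/yaml.v2 form, which yaml.v3 also accepts); which YAML library msgcat actually uses is not stated in this plan, so the signature is an assumption. The `decode` helper only simulates a YAML decoder so the example runs with the standard library alone.

```go
package main

import (
	"fmt"
	"strconv"
)

// OptionalGroup holds a group given as int or string in YAML, stored as a string.
type OptionalGroup string

// UnmarshalYAML accepts int, int64 or string and stores the value as a string.
// The callback signature is an assumption about the YAML library in use.
func (g *OptionalGroup) UnmarshalYAML(unmarshal func(interface{}) error) error {
	var raw interface{}
	if err := unmarshal(&raw); err != nil {
		return err
	}
	switch v := raw.(type) {
	case nil:
		*g = ""
	case int:
		*g = OptionalGroup(strconv.Itoa(v))
	case int64:
		*g = OptionalGroup(strconv.FormatInt(v, 10))
	case string:
		*g = OptionalGroup(v)
	default:
		return fmt.Errorf("group: expected int or string, got %T", raw)
	}
	return nil
}

// MarshalYAML emits an int when the stored string parses with strconv.Atoi,
// and the plain string otherwise.
func (g OptionalGroup) MarshalYAML() (interface{}, error) {
	if n, err := strconv.Atoi(string(g)); err == nil {
		return n, nil
	}
	return string(g), nil
}

// decode simulates a YAML decoder handing an already-parsed scalar to UnmarshalYAML.
func decode(v interface{}) (OptionalGroup, error) {
	var g OptionalGroup
	err := g.UnmarshalYAML(func(out interface{}) error {
		*(out.(*interface{})) = v
		return nil
	})
	return g, err
}

func main() {
	for _, in := range []interface{}{0, "api", int64(7)} {
		g, err := decode(in)
		if err != nil {
			panic(err)
		}
		out, _ := g.MarshalYAML()
		fmt.Printf("in=%v stored=%q marshaled=%v (%T)\n", in, string(g), out, out)
	}
}
```

One consequence of the Atoi rule: a quoted numeric string such as `"007"` would be written back as the int 7. If that matters, the type could record whether the input was numeric instead of inferring it at marshal time, which is the "preserve kind" option mentioned earlier.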
This plan is the single source of truth for implementing the extract/merge CLI workflow and optional group support in msgcat.