Support structured JSON in translation files #52

ritinae · 2025-08-01T18:46:57Z

This PR adjusts the lang file parsing to allow nested keys. The added structure helps when working with larger lang files and enables editor features such as code folding.

The translation keys are generated as paths, based on the structure of the JSON. For example, I would like to be able to write JSON like this in the lang files:

{
  "foo": {
    "bar": "baz"
  },
  "yay": "wohoo!"
}

The resulting translation keys should then be "foo-bar" and "yay". This can almost already be achieved with the current parser, by writing the paths as parts of the keys:

{
  "foo-bar": "baz",
  "yay": "wohoo!"
}

However, this lacks structure, making it harder to read when the desired structure has multiple levels of nesting and/or when there are more keys around. The structured version has the added benefit of supporting code folding in editors, helping when working with larger sets of translation keys.

The example JSONs for the existing parser and for the updated parser both generate identical translation keys:

domain:foo-bar which maps to "baz"
and domain:yay which maps to "wohoo!"

The changes are opt-in. No logic outside the parser is changed. Translation keys with special characters (here, '-') are already perfectly valid with how the TranslationService currently behaves, so these changes add flexibility while the old "flat" lang files keep working exactly as before.

Implementation

The implementation itself is quite straightforward:

Instead of directly deserializing into Dictionary<string, string>, parse the JSON as a raw JToken tree when calling LoadEntries.
Then, consume the tree:
1. If the current token is a JSON object, then for each property on the object:
  1. append the property name to the key, with - as the delimiter
  2. recursively call LoadEntries with the extended key
2. If the current token is a string, load it as a translation entry, with the path to it (from the root of the JSON) as the key.
3. All other JSON tokens are treated as errors.

LoadEntry, and everything from there on, functions exactly as before. Only the parsing step has changed.
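
The traversal described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: it uses System.Text.Json's JsonElement in place of the JToken tree purely to stay dependency-free, and the names NestedLangSketch, Flatten and Walk are hypothetical. It also leaves out the domain prefix and LoadEntry entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class NestedLangSketch
{
    // Hypothetical helper: flattens nested JSON into "a-b-c" style keys.
    public static Dictionary<string, string> Flatten(string json)
    {
        var entries = new Dictionary<string, string>();
        using var doc = JsonDocument.Parse(json);

        // A bare string at the root would otherwise produce an entry with an empty key.
        if (doc.RootElement.ValueKind != JsonValueKind.Object)
            throw new FormatException("The root of a lang file must be a JSON object.");

        Walk(doc.RootElement, "", entries);
        return entries;
    }

    private static void Walk(JsonElement token, string path, Dictionary<string, string> entries)
    {
        switch (token.ValueKind)
        {
            case JsonValueKind.Object:
                foreach (var property in token.EnumerateObject())
                {
                    // Append the property name to the key, with '-' as the delimiter.
                    var childPath = path.Length == 0 ? property.Name : path + "-" + property.Name;
                    Walk(property.Value, childPath, entries);
                }
                break;

            case JsonValueKind.String:
                // A string is a translation entry, keyed by its path from the root.
                entries[path] = token.GetString() ?? "";
                break;

            default:
                // Numbers, arrays, booleans and null are treated as errors.
                throw new FormatException($"Unsupported JSON token at '{path}': {token.ValueKind}");
        }
    }
}
```

With this sketch, a flat file and its nested equivalent produce the same dictionary, which is the property the PR relies on for backwards compatibility.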

Real-world example

For a more real-world example, I have an en.json file, which currently looks something like this:

{
    "settings-startup-title": "Startup settings",
    "settings-startup-mergestacksongroundenabled-label": "Enable Module: Merge Stacks on Ground",
    "settings-startup-mergestacksongroundenabled-comment": "Reduces lag by merging stacks on ground to larger clumps. Requires restart.",
    "settings-startup-anothermoduleenabled-label": "Enable Module: Another",
    "settings-startup-anothermoduleenabled-comment": "Does something else",
    "settings-mergestacksonground-title": "Merge Stacks on Ground - Settings",
    "settings-mergestacksonground-enablerendertweaks-label": "Custom rendering for merged stacks",
    "settings-mergestacksonground-enablerendertweaks-comment": "Renders the merged stacks as larger groups. Requires restart.",
    "settings-mergestacksonground-maxrenderedstacks-label": "Max rendered stacks",
    "settings-mergestacksonground-maxrenderedstacks-comment": "Maximum number of stacks rendered when stacks are merged.",
    "settings-anothermodule-title": "Another Module - Settings",
    "settings-anothermodule-dostuff-label": "Yes/No",
    "settings-anothermodule-dostuff-comment": "Maybe?"
}

This is still manageable, but one can see that it is quickly becoming quite a word salad. The proposed changes would allow me to write this as:

{
    "flat-keys-without-nesting-are-still": "Perfectly Valid",
    "the-nesting-structure-is": "Fully Opt-in",
    "settings": {
        "startup": {
            "title": "Startup settings",
            "mergestacksongroundenabled": {
                "label": "Enable Module: Merge Stacks on Ground",
                "comment": "Reduces lag by merging stacks on ground to larger clumps. Requires restart."
            },
            "anothermoduleenabled": {
                "label": "Enable Module: Another",
                "comment": "Does something else"
            }
        },
        "mergestacksonground": {
            "title": "Merge Stacks on Ground - Settings",
            "enablerendertweaks": {
                "label": "Custom rendering for merged stacks",
                "comment": "Renders the merged stacks as larger groups. Requires restart."
            },
            "maxrenderedstacks": {
                "label": "Max rendered stacks",
                "comment": "Maximum number of stacks rendered when stacks are merged."
            }
        },
        "anothermodule": {
            "title": "Another Module - Settings",
            "dostuff": {
                "label": "Yes/No",
                "comment": "Maybe?"
            }
        }
    }
}

pizza2004 · 2025-08-01T19:27:56Z

Would it be possible to make it use - instead of / for this? I feel this would better match with what we currently do if we were going to change this in vanilla.

Not that I am certain this change would be accepted. I think it might cause an issue with the Crowdin translation. I like it in concept, though.

ritinae · 2025-08-02T11:20:30Z

Would it be possible to make it use - instead of / for this?

Good point! I was thinking about the translation keys as paths, so I resorted to / as a path-like separator, but looking at vanilla, - is much more consistent.

ritinae · 2025-08-02T12:16:29Z

Did two quick fixes:

The naive string replacement fails if translation keys contain . characters, so we need to construct the key path/prefix manually to avoid this. The implementation is already traversing the tree in correct order, so this wasn't too hard to implement.
Fixed a bug which allowed a JSON file consisting of a single string to be parsed as a valid translation value with an empty key.

Updated the PR description to reflect the separator character change (/ -> -), too!

SaculRennorb

Overall this PR seems reasonable and should allow for this new structured style of declaring translations, while not causing issues when parsing the old style.

Without further adjustments, using a StringBuilder as I've suggested might incur a small performance penalty for the old style, since it introduces an additional copy in that case.
This can likely be mitigated by passing it directly into the LoadEntry function, seeing as that function also does additional string manipulation.
That way it should be a net zero for the old style.

If you don't want to do that last part please leave a brief note to investigate that option in the future.

Opinions:

It is debatable if there is even much value in adding that new style from the loader side of things, since the gain of potentially more compact files is likely impacted by the loader having to reconstruct the actual keys. In general the number one priority for these files should be parsing speed, and if the format required to do so incurs usability issues, then these should be addressed by external tooling, not by a slower, more complicated loader.

I also personally find the nested style more difficult to read than the flat list of kvps, but I admit this is absolutely up to taste.

I think it is still reasonable to at least allow this style, as this should not introduce any noticeable maintenance overhead.

Localization/TranslationService.cs

ritinae · 2025-08-15T13:08:43Z

Without further adjustments, using a StringBuilder as i've suggested might incur a small performance penalty for the old style, since it introduces an additional copy in that case.
This can likely be mitigated by passing it directly into the LoadEntry function, seeing as that function also does additional string manipulation.

After a bit of head scratching on what LoadEntry actually does, I realized all keys processed during a single LoadEntries batch always share the same domain prefix. Therefore, now that everything is using a string builder to share the common path prefix, it was quite trivial to move the domain-prefix to that same shared string builder. That is, just append the domain to the start of the key before stepping into the recursive function.

The only complication was that the path separator prefixing (adding - between path parts) could no longer rely on the path being empty for the first part of a key. To resolve that, a boolean flag passed as a parameter to the recursive function was a good enough solution. I could have checked the last character of the builder, but var isFirstPart = <is the last char of the builder a ':'> looked way messier than a simple boolean parameter, so I preferred the parameter.
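
A rough sketch of the shared-builder idea described above, under stated assumptions: the domain separator is hard-coded as ':' (standing in for AssetLocation.LocationSeparator), System.Text.Json replaces the JToken tree to keep the example dependency-free, and SharedBuilderSketch, Load and Walk are hypothetical names rather than the PR's code. It does not handle keys that already carry their own domain.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

public static class SharedBuilderSketch
{
    public static Dictionary<string, string> Load(string json, string domain)
    {
        var entries = new Dictionary<string, string>();
        using var doc = JsonDocument.Parse(json);
        if (doc.RootElement.ValueKind != JsonValueKind.Object)
            throw new FormatException("The root of a lang file must be a JSON object.");

        // The domain prefix is written once and shared by every key in the batch.
        var key = new StringBuilder(domain).Append(':');
        Walk(doc.RootElement, key, true, entries);
        return entries;
    }

    private static void Walk(JsonElement token, StringBuilder key, bool isFirstPart,
                             Dictionary<string, string> entries)
    {
        switch (token.ValueKind)
        {
            case JsonValueKind.Object:
                foreach (var property in token.EnumerateObject())
                {
                    int mark = key.Length;              // remember where this level starts
                    if (!isFirstPart) key.Append('-');  // no '-' directly after "domain:"
                    key.Append(property.Name);
                    Walk(property.Value, key, false, entries);
                    key.Length = mark;                  // truncate back for the next sibling
                }
                break;

            case JsonValueKind.String:
                entries[key.ToString()] = token.GetString() ?? "";
                break;

            default:
                throw new FormatException($"Unsupported JSON token at '{key}': {token.ValueKind}");
        }
    }
}
```

The flag avoids inspecting the builder's last character: the root call passes true, and every recursive call passes false.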

SaculRennorb · 2025-08-15T13:14:27Z

This seems like a good idea, however the previous implementation using KeyWithDomain

vsapi/Localization/TranslationService.cs

Lines 606 to 613 in 4ed035d

    private static string KeyWithDomain(string key, string domain = GlobalConstants.DefaultDomain)
    {
        if (key.Contains(AssetLocation.LocationSeparator)) return key;
        return new StringBuilder(domain)
            .Append(AssetLocation.LocationSeparator)
            .Append(key)
            .ToString();
    }

checks if the domain is already present before prepending it.

Is there a situation where this could cause issues? Namely, could the domain already be present in old-style translations?

ritinae · 2025-08-15T13:16:22Z

Ah, right, that's an oversight. I think it would make sense to check if the domain exists already before appending it again. I'll add that check back, just in case 🤦‍♀️

EDIT: Right. It's not that simple if the key itself contains a domain. I'll think about a solution for a bit.

SaculRennorb · 2025-08-15T13:40:10Z

One option could be to always prepend the domain as you do right now, but before finalizing the key, count the domain separators. If there is more than one, use the slice starting after the first separator as the actual key, instead of the whole buffer.

something along the lines of

string GetTrueKey(StringBuilder sb, string domain)
{
	var key = sb.ToString();
	if(key.IndexOf(AssetLocation.LocationSeparator, domain.Length + 1) >= 0) {
		// there is a second domain in here, from the key itself.
		return key.Substring(domain.Length + 1);
	}
	return key;
}

You could also do the actual counting, but passing the domain from the outer function might or might not be easier.
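
For comparison, the counting variant might look roughly like this. It is only a sketch: ':' is assumed to stand in for AssetLocation.LocationSeparator, and TrueKeySketch is a hypothetical name. It locates the first separator itself, so the domain does not need to be passed in.

```csharp
using System;

public static class TrueKeySketch
{
    // Assumption: ':' stands in for AssetLocation.LocationSeparator.
    private const char Separator = ':';

    public static string GetTrueKey(string prefixedKey)
    {
        int first = prefixedKey.IndexOf(Separator);
        if (first < 0) return prefixedKey; // no domain at all; nothing to strip

        // A second separator means the key itself already carried a domain,
        // so the always-prepended one is dropped.
        int second = prefixedKey.IndexOf(Separator, first + 1);
        return second >= 0 ? prefixedKey.Substring(first + 1) : prefixedKey;
    }
}
```

For example, "game:othermod:foo-bar" would become "othermod:foo-bar", while "game:foo-bar" is returned unchanged.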

You could also start checking for the separator on each recursion level, since one separator will pollute all further levels, so in theory those deeper parts then don't need checks anymore, but this is getting too far into the weeds. Further optimization beyond the old complexity, and feature parity of course, can be something for the team.

ritinae · 2025-08-15T15:27:09Z

Always prepending the prefix feels quite hacky, but I could not come up with an alternative approach that would be as simple, so I think that's the way to go.

pizza2004 · 2025-08-16T04:56:56Z

Due to the way in which we merge these things, I kind of need you to squash all your changes into one commit, please. Then I can look it over and patch it into the internal repository for Tyron to decide whether he thinks it's a good change. Just FYI, we're close enough to stable that this wouldn't make it into 1.21 either way, so it'll likely be months before you see this in vanilla even if we do merge it.

ritinae · 2025-08-17T00:40:11Z

Squashed the commits. I can get the parser changes in by monkey patching, so I'm not holding my breath waiting for this to land 😄 I do acknowledge that these changes, and any potential gains from them, are heavily opinionated, so I fully understand if the changes eventually get rejected 👍

Adjust the lang file parsing to allow nesting of keys. Generate translation keys as paths based on the structure of the JSON. For example, given this JSON:

{
  "foo": {
    "bar": "baz"
  },
  "yay": "wohoo!"
}

This can *kind of* already be achieved with the current parser as:

{
  "foo-bar": "baz",
  "yay": "wohoo!"
}

However, this lacks structure and is harder to read, especially if the desired structure has multiple levels of nesting. The structured version has the added benefit of supporting code folding in editors, helping when working with larger sets of translation keys.

The example JSONs for the existing parser and for the updated parser both generate identical translation keys:

domain:foo-bar => "baz"
domain:yay => "wohoo!"

Therefore, the changes are mostly opt-in, giving more flexibility, while still allowing using the old "flat" lang files as-is.

pizza2004 requested a review from SaculRennorb August 1, 2025 19:29

ritinae force-pushed the feat/structured-lang-json branch from 6bbb914 to 80da595 Compare August 12, 2025 16:27

SaculRennorb requested changes Aug 13, 2025

ritinae force-pushed the feat/structured-lang-json branch from 9a88d8c to 45c4590 Compare August 15, 2025 13:07

SaculRennorb self-requested a review August 15, 2025 16:04

SaculRennorb approved these changes Aug 15, 2025

ritinae force-pushed the feat/structured-lang-json branch from 326970b to ae7e6ae Compare August 17, 2025 00:20

ritinae force-pushed the feat/structured-lang-json branch from ae7e6ae to 5cf5fdf Compare August 17, 2025 00:41
