Skip to content

Fix and clarify messages related to selection of vowel rules#1736

Open
rpspringuel wants to merge 14 commits into
gregorio-project:developfrom
rpspringuel:vowelrules
Open

Fix and clarify messages related to selection of vowel rules#1736
rpspringuel wants to merge 14 commits into
gregorio-project:developfrom
rpspringuel:vowelrules

Conversation

@rpspringuel
Copy link
Copy Markdown
Contributor

As I was looking at the output messages of various tests, I noticed that the messages were regularly saying that Latin vowel rules weren't found in the vowel rules files and that gregorio was falling back to Latin instead. I found this kind of confusing, so I went looking at the code related to this and discovered that the rule appeared to be backwards: triggering the message required "latin" (in various forms) to be requested and not found. Really, this message should be used when any non-"latin" language is requested and not found.

This fixes that, and also introduces a distinct message to be used when the default Latin rules are used. I originally was going to skip the look-up loop if the language was Latin (as the rules for Latin are built-in to the executable), but I realized that users might want to customize the Latin rules themselves, in which case we should still be checking the vowel rule files for those customizations.

I have not, at this point, looked at how this affects the tests nor updated the documentation, so for the moment I'm simply looking for comments on this change.

@davidweichiang
Copy link
Copy Markdown
Contributor

Do "default" and "internal" mean the same thing? If so, I think the same word should be used. I can't really think of a clear way to distinguish the error messages and wonder if they really need to be distinguished.

@davidweichiang
Copy link
Copy Markdown
Contributor

Maybe “Selecting Latin instead of specified language” and “Using default rules for Latin instead of specified rules”?

@rpspringuel
Copy link
Copy Markdown
Contributor Author

The executable has a set of rules for Latin built into it and these are the rules that we fall back on when we can't find rules for the selected language in gregorio-vowels.dat. Additionally, if the selected language is Latin, we don't just jump straight to these rules, but still check gregorio-vowels.dat for Latin rules in case the user wants to customize them. If there are no custom Latin rules, then we use the builtin rules. So, while "default" and "internal" are referring to the same set of rules, when I say "default" I'm referring to the first process, while when I say "internal" I'm referring to the second. The difference in vocabulary is meant to distinguish not so much the rules being used, the reason for using those rules. And as I think about it, the first one should probably be a warning (because we're ignoring user input) while in the second case, we're just informing the user of what we're doing (which is most likely, exactly what they want us to do).

Ignoring user input really should be a Warning, so we split the messages about using internal latin rules based on why we're using them.
@rpspringuel rpspringuel added this to the 6.3.0 milestone Apr 19, 2026
When breaking this compound condition over multiple lines to make it easier to read, I accidentally introduced a stray open parenthesis at the beginning of the condition.
Comment thread src/characters.c Outdated
@rpspringuel rpspringuel marked this pull request as ready for review May 10, 2026 17:32
Comment thread src/characters.c Outdated
Comment thread src/characters.c Outdated
Comment thread src/characters.c Outdated
@rpspringuel
Copy link
Copy Markdown
Contributor Author

I've added some tests to look at the new messages. Looking at the messages produced, I think there's some further refinement needed here as the messages get really repetitive and potentially deceptive (especially with aliases). Please take a look at gregorio-project/gregorio-test#424 and tell me what you think.

@davidweichiang
Copy link
Copy Markdown
Contributor

The "Aliased to language in file" messages are wrong, and the fix depends on what the search order is supposed to be (both untested):

  • Change line
    for (p = filenames; status != RFPS_FOUND && (filename = *p); ++p) {
    so that the loop condition is status == RFPS_NOT_FOUND.
  • At the beginning of function
    void gregorio_vowel_tables_load(const char *const filename,
    , insert *status = RFPS_NOT_FOUND;.

The expected output of the first fix would be:

Looking for first in ./gregorio-vowels.dat
Aliasing first to second
Aliasing second to first
Aliased to first in ./gregorio-vowels.dat
Looking for first in ./gregorio-vowels.dat
Aliasing first to second
Aliasing second to first
Aliased to first in ./gregorio-vowels.dat
warning:Unable to resolve alias for first. Selecting Latin instead

The expected output of the second fix would be:

Looking for first in ./gregorio-vowels.dat
Aliasing first to second
Aliasing second to first
Aliased to first in ./gregorio-vowels.dat
Looking for first in /usr/local/texlive/texmf-local/tex/luatex/gregoriotex/gregorio-vowels.dat
Could not find first in /usr/local/texlive/texmf-local/tex/luatex/gregoriotex/gregorio-vowels.dat
etc.
Looking for first in ./gregorio-vowels.dat
Aliasing first to second
Aliasing second to first
Aliased to first in ./gregorio-vowels.dat
Looking for first in /usr/local/texlive/texmf-local/tex/luatex/gregoriotex/gregorio-vowels.dat
Could not find first in /usr/local/texlive/texmf-local/tex/luatex/gregoriotex/gregorio-vowels.dat
etc.
warning:Unable to resolve alias for first. Selecting Latin instead

The second fix preserves the existing logic, but I think the first way makes more sense. For example, suppose:

  • the local gregorio-vowels.dat has "alias b to c" to "alias a to b", in that order
  • the global gregorio-vowels.dat has rules for both b and c
    Then under the current code, if the user selects language a, they'll get the rules for b, with confusing messages.
    Under the first fix, they'll get the rules for c. Under the second fix, they'll get the rules for b.

Also note that if a chain of aliases is too long, then this algorithm would falsely detect a cycle.

When dealing with aliases, there was status bleed through from one file to the next, resulting in messages which were confusing.  filestatus (i.e. what we found in the file we're currently looking at) is distinguished from stats (i.e. what the overall results of the search are).  This prevents the bleed through.

Further, loops were only implicitly detected by taking more than 2 tries to look through the files.  This strategy keeps track of the language alias path and uses that to look explicitly for loops.  We've also increased the potential alias depth that is allowed.  Note: alias depth only matters if aliases are not defined in their proper order in the same file.
@rpspringuel
Copy link
Copy Markdown
Contributor Author

I've tried a new strategy which I think should make things clearer. Please look at the tests: gregorio-project/gregorio-test@0d210b9

I still need to make the expectations independent of my computer, so unless your installation pattern is identical to mine the tests will likely fail on your machine, but let me know if the expectations make sense.

Comment thread src/characters.c Outdated
strcasecmp(language, "la") == 0 ||
strcasecmp(language, "lat") == 0);

alias_found = true;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this variable might not be needed, since you can just check status == RFPS_ALIASED below?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to remove that variable and found some other things that hopefully clear up the loop logic.

@davidweichiang
Copy link
Copy Markdown
Contributor

That looks good to me (and makes me wish this were in Lua).

This gets rid of variables that aren't needed and uses do...while structure to make the logic clearer.
@rpspringuel
Copy link
Copy Markdown
Contributor Author

Okay, the tests should now pass on any machine and not just one which is setup exactly like mine. I'll confirm that later this week when I have time to boot up one of my VMs. Right now I'm going for lunch and then a bike ride.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants