Refactor/format type improvements #14

Open

apiss2 wants to merge 11 commits into glandfried:master from apiss2:refactor/format-type-improvements

Conversation

apiss2 commented Nov 21, 2025

This PR focuses on improving code quality, consistency, and type safety in TrueskillThroughTime.py.
All existing tests pass, and no behavioral changes to the external API or core logic are introduced.

Summary of Changes

  • Applied code formatting with ruff
    The entire file has been formatted using ruff to ensure consistent coding style.

  • Added type annotations (verified with mypy)
    Type hints have been added across major classes and methods, and the code now passes mypy checks.

  • Refactored GameType usage and location
    Moved GameType to a more appropriate location and replaced previous str-based comparisons with GameType enum values for improved type safety.

  • Removed redundant orden attribute from the Game class
    The attribute duplicated the behavior of the existing method, so it has been removed for clarity.

  • Eliminated aliasing such as g = self
    Removed unnecessary aliases to improve readability and enhance IntelliSense support.

  • Renamed methods to avoid name conflicts with member variables

    • Game.margin → Game.get_margin
    • Game.likelihoods → Game.get_likelihoods
  • Fixed a potential bug in History.add_history
    Corrected an issue where self.batches was referenced instead of self.btimes.

  • Renamed classes to follow PEP 8 naming conventions

    • team_variable → TeamVariable
    • diff_messages → DiffMessages
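
The GameType change described above can be illustrated with a minimal sketch. The enum members and function below are hypothetical (the real definitions live in TrueskillThroughTime.py); the point is only the shift from string comparison to enum comparison:

```python
from enum import Enum


class GameType(Enum):
    # Hypothetical members; actual names in TrueskillThroughTime.py may differ.
    WIN_LOSE = "win_lose"
    DISCRETE = "discrete"


def evidence_path(game_type: GameType) -> str:
    # Enum identity comparison instead of a fragile string comparison:
    # a typo like "discrette" now fails at the call site (no such member)
    # instead of silently falling through an if/elif chain.
    if game_type is GameType.DISCRETE:
        return "discrete evidence path"
    return "win/lose evidence path"


print(evidence_path(GameType.DISCRETE))
```

A caller passing a raw string would now be flagged by mypy, which is the type-safety gain the PR describes.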

Notes

  • All pre-existing tests pass successfully.
  • These changes focus on maintainability, readability, and type correctness.
  • No functional changes to the TrueSkill Through Time algorithm itself are included.

glandfried (Owner) commented Nov 25, 2025

Have you tested this version?

apiss2 (Author) commented Nov 26, 2025

Thank you for your comment!

I realized that the tests were actually failing early on, and I mistakenly thought they had all passed. I’ve now gone back and rerun the full test suite. (Regarding import trueskillthroughtime2, it seems the 2 is not needed, so I removed it and ran the tests again.)

As mentioned in the “Refactored GameType usage” section, the Game class now expects GameType instead of a str. Because of that, I updated the test code accordingly so that it matches the new, type-safe interface.

After making those adjustments, all tests passed except for test_game_discrete_1vs1. This single test still fails, but the rest of the suite runs successfully. If I’m not mistaken, test_game_discrete_1vs1 was already failing even with the original (pre-refactor) code — could you confirm if that’s correct?

If there’s anything else you’d like me to adjust, please let me know.

glandfried (Owner) commented
I found other problems of my own in v2. See the learning curves that we estimate in examples/generic_plot.py. They should be centered around 0, but we get a drift over time. I will return to v1 until I can solve this drift.

apiss2 (Author) commented Nov 27, 2025

Hi, thanks again for taking the time to look into this and for pointing out the drift issue in v2.

I took a closer look at the failing test, and it seems the error originates inside Game.partial_evidence. Specifically, extremely large values generated via norm.rvs were passed through np.exp, and the resulting parameters were then fed into poisson.rvs, which caused numerical problems in some cases.

To address this, I removed the call to poisson.rvs entirely so that no random sampling occurs inside partial_evidence. This eliminates the risk of failures caused by extreme parameter values. Instead, I now compute the probability of observing r directly via poisson.pmf, and then average those probabilities. Conceptually, rather than counting how often a sampled value matches r, we evaluate “the likelihood of r under the generated λ.” This reduces variance substantially and makes it possible to handle very small positive values in a stable manner.
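
A minimal sketch of the change described above (variable names and distribution parameters are illustrative, not taken from Game.partial_evidence):

```python
import numpy as np
from scipy.stats import norm, poisson

rng = np.random.default_rng(0)

r = 3  # observed discrete score

# Sample log-rates and map them to Poisson rates, mimicking the
# norm.rvs -> np.exp step described in the comment above.
log_lam = norm.rvs(loc=1.0, scale=0.5, size=10_000, random_state=rng)
lam = np.exp(log_lam)

# Old approach (removed): draw poisson.rvs(lam) and count how often the
# sample equals r -- unstable for extreme lam and high-variance.
# New approach: evaluate the exact likelihood of r under each lambda
# and average. No sampling, so no failures from extreme parameters.
prob = float(np.mean(poisson.pmf(r, lam)))

print(prob)
```

Averaging pmf values is a standard variance-reduction move (Rao-Blackwellization of the indicator estimator), which matches the reduced variance reported above.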

With this change applied, test_game_discrete_1vs1 now passes successfully on my side.

I also checked examples/generic_plot.py, and I was able to reproduce the negative drift you mentioned. I believe this arises from the use of sampling with replacement, which allows the same player to be drawn multiple times within a single generated match. Since random.choices performs sampling with replacement, switching to random.sample, which samples without replacement, should prevent this issue. Would this modification align with your expectations?
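
The distinction can be shown in a few lines (player IDs and match size are made up for illustration):

```python
import random

players = list(range(100))

# random.choices samples WITH replacement: the same player can be
# drawn more than once into a single generated match.
with_replacement = random.choices(players, k=4)

# random.sample samples WITHOUT replacement: all participants in a
# generated match are guaranteed to be distinct players.
without_replacement = random.sample(players, k=4)

print(len(set(without_replacement)))  # always 4
```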

Thanks again!
