Improve robustness of cohort diversity statistics and make cache version easier to manage #1246

Open
Vibhanshu230 wants to merge 1 commit into malariagen:master from Vibhanshu230:anopheles_bug_handling

@Vibhanshu230

This PR makes a few small improvements to cohort_diversity_stats() to make the function more robust and easier to maintain.

  • Replaced the hardcoded cache version string with a version constant so cache invalidation is easier if the computation logic changes later.
  • Added a check to ensure n_jack is smaller than the number of sites to avoid invalid jackknife behaviour.
  • Handled a possible divide-by-zero case when computing Tajima’s D if there are no segregating sites.
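For reviewers, the three changes above can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: the names `COHORT_DIVERSITY_STATS_CACHE_VERSION`, `make_cache_params`, `check_n_jack` and `tajima_d` are hypothetical, and returning NaN when there are no segregating sites is an assumed way of handling the divide-by-zero case. The sketch uses the standard Tajima's D formula and assumes `pi` is the mean number of pairwise differences summed over sites.

```python
import math

# Hypothetical constant: bump it whenever the computation logic changes,
# so results cached under the old version are no longer reused.
COHORT_DIVERSITY_STATS_CACHE_VERSION = "v1"


def make_cache_params(cohort: str, n_jack: int) -> dict:
    """Hypothetical helper: parameters that identify one cached result."""
    return {
        "version": COHORT_DIVERSITY_STATS_CACHE_VERSION,
        "cohort": cohort,
        "n_jack": n_jack,
    }


def check_n_jack(n_jack: int, n_sites: int) -> None:
    """Reject an n_jack that is not smaller than the number of sites."""
    if n_jack >= n_sites:
        raise ValueError(
            f"n_jack ({n_jack}) must be smaller than the number of sites ({n_sites})."
        )


def tajima_d(pi: float, n_seg_sites: int, n_chroms: int) -> float:
    """Tajima's D; assumes n_chroms >= 4 so the variance term is well defined.

    Returns NaN (an assumed convention) when there are no segregating sites,
    because the denominator of D is zero in that case.
    """
    if n_seg_sites == 0:
        return math.nan
    a1 = sum(1 / i for i in range(1, n_chroms))
    a2 = sum(1 / i**2 for i in range(1, n_chroms))
    b1 = (n_chroms + 1) / (3 * (n_chroms - 1))
    b2 = 2 * (n_chroms**2 + n_chroms + 3) / (9 * n_chroms * (n_chroms - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n_chroms + 2) / (a1 * n_chroms) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * n_seg_sites + e2 * n_seg_sites * (n_seg_sites - 1)
    if variance <= 0:
        # Guard against a zero or negative variance from rounding.
        return math.nan
    # Numerator: pi minus Watterson's estimator (S / a1).
    return (pi - n_seg_sites / a1) / math.sqrt(variance)
```

Including the version constant in the cache parameters means a single edit to the constant is enough to invalidate old entries, and failing fast on `n_jack` gives the caller a clear message before any computation starts.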

These changes do not affect the expected outputs in normal cases, but help prevent edge-case errors and make the cache behaviour clearer. I am happy to adjust the implementation if there is a preferred style for validation or cache versioning in this module.

… cohort version constant so cache entries can be safely invalidated