
Making perplexity calculations consistent across sample length #27

Open
mcleish7 wants to merge 1 commit into EleutherAI:master from mcleish7:patch-1

Conversation

@mcleish7

When calculating perplexity, the code on lines 110 and 115 currently divides by len(token_probs), even though it is scoring token_probs[mid_index:], which is only half that length since mid_index = len(token_probs) // 2. On line 120, for the whole sequence, the code also divides by len(token_probs). This means there is a slight inconsistency in the perplexity calculation: the half-sequence perplexities are normalised by the full-sequence length.

I have corrected this by using a single function for all cases, which also removes the repeated code.
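To make the intent concrete, here is a minimal sketch of the single-function approach, assuming perplexity is computed as exp of the mean negative log probability; the function name and the example values are illustrative, not the actual code from the repository.

```python
import math

def perplexity(token_probs):
    """Perplexity of a slice of per-token probabilities.

    Hypothetical helper illustrating the fix: the divisor is always the
    length of the slice actually being scored, never the length of the
    full sequence.
    """
    if not token_probs:
        return float("nan")
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)

# The same helper is then applied consistently to each slice.
token_probs = [0.9, 0.8, 0.7, 0.6]   # example values only
mid_index = len(token_probs) // 2
ppl_first_half = perplexity(token_probs[:mid_index])
ppl_second_half = perplexity(token_probs[mid_index:])
ppl_full = perplexity(token_probs)
```

With a single normalisation rule, the first-half, second-half, and full-sequence perplexities are directly comparable.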

@CLAassistant

CLAassistant commented Jul 13, 2024

CLA assistant check
All committers have signed the CLA.
