All data analysis has biases, because there are many ways to slice the data depending on the hypothesis being tested or the business rules applied in the analysis. These biases are not wrong, but it is important to document them so the reader can decide whether they agree with them.
For example, in average-ranking-per-200-blocks-window, comparing Atom_Guide to layover_run, Atom_Guide had:
- 4.4x more missed blocks
- 4.4x more avgMissed per 200 block window
but Atom_Guide ranked higher, owing to the compression effect of large groups of validators missing blocks at the same time as Atom_Guide (under the method described on Riot).
As @gamarin2 states, "Your rank on a window really depends on how other perform during same window". While this is a perfectly valid way of slicing the data, it does mean that validators who 'overachieved' by not missing blocks while many others around them did are not rewarded more. This is a conscious business rule which creates bias, and it should be documented along with the others in the analysis.
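To make the compression effect concrete, here is a minimal sketch of a per-window ranking scheme like the one described above. The function name, tie-handling, and the toy numbers are all illustrative assumptions, not taken from the actual dataset or code; the misses are chosen so that Atom_Guide has roughly 4.4x more missed blocks than layover_run, mirroring the ratio quoted above, yet still earns a better average rank because its misses coincide with mass outages.

```python
# Hypothetical sketch of average-ranking-per-200-blocks-window.
# All names and numbers below are illustrative, not real data.

def window_ranks(missed_per_window):
    """missed_per_window: {validator: [misses in window 0, window 1, ...]}.
    Returns {validator: average rank across windows}, rank 1 = fewest misses.
    Ties are broken by insertion order here; the real method may differ."""
    validators = list(missed_per_window)
    n_windows = len(next(iter(missed_per_window.values())))
    totals = {v: 0.0 for v in validators}
    for w in range(n_windows):
        # Rank validators within this window only: your rank depends
        # on how the others performed during the same window.
        ordered = sorted(validators, key=lambda v: missed_per_window[v][w])
        for rank, v in enumerate(ordered, start=1):
            totals[v] += rank
    return {v: totals[v] / n_windows for v in validators}

# Atom_Guide misses 80 blocks, but only during windows where most
# validators also miss many (a mass outage), so its per-window rank
# stays good. layover_run misses only 18 blocks, but misses them alone.
data = {
    "Atom_Guide":  [40, 40, 0],
    "layover_run": [0,  0,  18],
    "other_a":     [45, 45, 0],
    "other_b":     [50, 50, 0],
}
print(window_ranks(data))
# Atom_Guide averages a better (lower) rank than layover_run despite
# roughly 4.4x more total missed blocks.
```

Under this slicing, layover_run's 'overachievement' in the outage windows earns it no extra credit, while its one solo outage costs it a last-place rank: exactly the bias the business rule introduces.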