Extend MinutesPlayedAggregator with minutes played per possession state #479
|
I haven't fully digested this, but shouldn't the following also be in
I say the above with the notion that some data providers are really bad at tagging "OUT" events. Additionally (and I think the answer is no), but is there a way to account for the fact that the Also, it |
|
Thanks for your feedback @UnravelSports!
The results/event types you mentioned indeed also cause a dead ball, but strictly speaking, we don't need to include them for the results to be correct because of the following:
So if the provider correctly tags "OUT" events, then it is not necessary to include these results in the
But of course, if the
By excluding
All operations can currently be done on an EventDataset (the information about whether a player was on the pitch or not is extracted from his positions which include a start and end time). |
|
From my experience Opta is not great at tagging out events. I think it makes sense if we simply include everything that should create, or happen within a Dead Ball state simply to cover all bases. (I don't really see a downside to this). Using receival_timestamp should be fine if it exists indeed! |
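To make the "cover all bases" idea concrete, here is a minimal, self-contained sketch. It is not the PR's implementation: the event labels, the `Event` class, and `ball_state_durations` are all hypothetical names chosen for illustration. It assumes any dead-ball trigger switches the state to "dead", any set piece switches it back to "alive", and a set piece that arrives while the ball is still marked "alive" (a missing "OUT"-style tag) causes the preceding interval to be counted as dead.

```python
from dataclasses import dataclass

# Hypothetical event labels, for illustration only; real providers differ.
DEAD_BALL_TRIGGERS = {"OUT", "FOUL", "GOAL"}
SET_PIECES = {"KICK_OFF", "THROW_IN", "CORNER", "FREE_KICK", "GOAL_KICK"}


@dataclass
class Event:
    timestamp: float  # seconds since the start of the period
    kind: str


def ball_state_durations(events, period_end):
    """Return seconds spent in the 'alive' and 'dead' ball states."""
    durations = {"alive": 0.0, "dead": 0.0}
    state = "dead"  # assumption: the period starts dead, until the kick-off
    last = 0.0
    for event in sorted(events, key=lambda e: e.timestamp):
        elapsed = event.timestamp - last
        if event.kind in SET_PIECES and state == "alive":
            # The provider missed the dead-ball trigger. Assumption: treat
            # the whole interval since the previous event as dead time.
            durations["dead"] += elapsed
        else:
            durations[state] += elapsed
        last = event.timestamp
        if event.kind in DEAD_BALL_TRIGGERS:
            state = "dead"
        elif event.kind in SET_PIECES:
            state = "alive"  # the restart puts the ball back in play
    durations[state] += period_end - last
    return durations


events = [
    Event(0, "KICK_OFF"),
    Event(10, "PASS"),
    Event(20, "OUT"),
    Event(30, "THROW_IN"),
    Event(40, "PASS"),
    Event(50, "CORNER"),  # no preceding "OUT" tag
]
result = ball_state_durations(events, period_end=60)
```

Whatever rule is chosen for the missing-tag case, the alive and dead durations should still sum to the period length, which makes for a cheap sanity check.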
|
I have updated the logic to include everything that should create a "dead ball". I also handled set pieces if the ball_state was not already "dead". Sometimes a set piece would occur without being preceded by a I ran into some other issues while working on this:
|
|
@UnravelSports @koenvo Any thoughts on this? |
|
A couple of things:
teams = list(team_minutes_played_map.keys())
assert team_minutes_played_map[teams[0]] == team_minutes_played_map[teams[1]]
minutes_played_per_possession_state = dataset.aggregate(
"minutes_played", breakdown_key="possession_state"
)
I haven't looked at the actual inner workings. |
This PR represents an effort to implement the functionality described in Issue 476.
Some notes:
The include_position flag is no longer supported and player inside MinutesPlayed is replaced by a more generic key. For example, this breaks the example in the documentation, since you can no longer access item.player but should use item.key.player. How should we handle this?
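One possible way to soften this break, sketched under assumptions rather than taken from the actual code: keep the generic key, but expose a read-only player property that forwards to it, so existing item.player call sites keep working. The classes below are simplified stand-ins, not kloppy's real definitions; BreakdownKey, its fields, and the duration field are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class Player:
    """Simplified stand-in for a player object (not kloppy's class)."""
    name: str


@dataclass(frozen=True)
class BreakdownKey:
    """Hypothetical generic key: any combination of breakdown dimensions."""
    player: Optional[Player] = None
    possession_state: Optional[str] = None


@dataclass
class MinutesPlayed:
    """Stand-in aggregate row; `duration` is an assumed field name."""
    key: BreakdownKey
    duration: timedelta

    @property
    def player(self) -> Optional[Player]:
        # Backward-compatible accessor: forwards to the generic key so that
        # older code using `item.player` does not break.
        return self.key.player


item = MinutesPlayed(
    key=BreakdownKey(player=Player("Example Player"), possession_state="alive"),
    duration=timedelta(minutes=37),
)
```

With this shape, both item.player and item.key.player return the same object, and the documentation example could be migrated to the new form at a later point.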