This project was completed as part of the UBC DSCI 100: Introduction to Data Science course.
It explores how server resources can be optimized based on patterns in player connection behavior using clustering, visualization, and K-Nearest Neighbors (KNN) regression.
- `final.ipynb`: Main notebook containing all code, visualizations, and modeling.
- `players.csv`: Metadata of 196 players, including experience, subscription status, and gameplay hours.
- `sessions.csv`: 1,535 session records, including session times and durations.
The goal is to predict and optimize server resource allocation by analyzing when players are most active.
This requires understanding how the number of concurrent connections varies across days of the week and hours of the day.
- Converted time strings to `datetime` objects
- Calculated session durations
- Extracted `hour` and `day of week` features
- Removed sessions with 0 duration
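
The preprocessing steps above can be sketched with pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`start_time`, `end_time`), the time format, and the sample rows are all assumptions.

```python
import pandas as pd

# Hypothetical sample rows; column names and the time format are assumptions.
sessions = pd.DataFrame({
    "start_time": ["30/06/2024 18:12", "17/06/2024 23:33", "25/07/2024 17:34"],
    "end_time": ["30/06/2024 18:24", "17/06/2024 23:46", "25/07/2024 17:34"],
})

def preprocess(df, fmt="%d/%m/%Y %H:%M"):
    """Parse timestamps, derive duration/hour/day features, drop empty sessions."""
    out = df.copy()
    # Convert time strings to datetime objects
    out["start_time"] = pd.to_datetime(out["start_time"], format=fmt)
    out["end_time"] = pd.to_datetime(out["end_time"], format=fmt)
    # Session duration in minutes
    out["duration_min"] = (out["end_time"] - out["start_time"]).dt.total_seconds() / 60
    # Hour of day (0-23) and day-of-week name taken from the session start
    out["hour"] = out["start_time"].dt.hour
    out["day_of_week"] = out["start_time"].dt.day_name()
    # Remove sessions with 0 duration
    return out[out["duration_min"] > 0].reset_index(drop=True)

clean = preprocess(sessions)
```

Here the third sample row starts and ends in the same minute, so it is dropped by the zero-duration filter.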
- Plotted connections per hour and per day using Altair
- Created bubble charts to visualize activity heatmaps
- Identified late-night and weekend peak usage
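
The per-hour and per-day counts behind charts like these can be computed with a few group-bys. The sketch below uses made-up rows and assumed column names (`day_of_week`, `hour`); it only shows the aggregation step, not the project's plotting code.

```python
import pandas as pd

# Hypothetical preprocessed sessions; column names are assumptions.
clean = pd.DataFrame({
    "day_of_week": ["Saturday", "Saturday", "Saturday", "Monday", "Sunday"],
    "hour": [2, 2, 23, 9, 2],
})

# Connections per hour of day and per day of week
per_hour = clean.groupby("hour").size().rename("connections").reset_index()
per_day = clean.groupby("day_of_week").size().rename("connections").reset_index()

# Connections per (day, hour) slot: one row per bubble in a bubble chart
heat = (
    clean.groupby(["day_of_week", "hour"])
    .size()
    .rename("connections")
    .reset_index()
)

# Busiest slot in the sample data
busiest = heat.loc[heat["connections"].idxmax()]
```

A table shaped like `heat` could then be passed to Altair, for example with `alt.Chart(heat).mark_circle().encode(x="hour:O", y="day_of_week:O", size="connections:Q")`.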
- Used elbow method to determine optimal `k = 3`
- Grouped time slots into `Low`, `Medium`, and `High` connection density
- Labeled data with cluster names for better interpretability
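
As a rough sketch of the clustering steps above, the following runs the elbow method with scikit-learn's `KMeans` and then names the clusters by the order of their centers. The data is synthetic (connection counts drawn around three arbitrary levels), and clustering on a single count feature is an assumption made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic connection counts per time slot, drawn around three levels (assumed data)
counts = np.concatenate([
    rng.normal(5, 1, 40),
    rng.normal(20, 2, 40),
    rng.normal(45, 3, 40),
]).reshape(-1, 1)

# Elbow method: record within-cluster sum of squares (inertia) for each k
inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(counts)
    inertias[k] = model.inertia_

# Fit the chosen k = 3 and name clusters by ascending center value
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(counts)
order = np.argsort(km.cluster_centers_.ravel())
names = {int(cluster): name for cluster, name in zip(order, ["Low", "Medium", "High"])}
density = np.array([names[int(c)] for c in km.labels_])
```

Plotting `inertias` against `k` gives the elbow curve; the bend is where adding more clusters stops reducing inertia by much.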
Evaluated three strategies using Root Mean Squared Percentage Error (RMSPE):
| Model | Average RMSPE | Notes |
|---|---|---|
| Full dataset (no clusters) | 0.42 | Least accurate |
| Full dataset + density labels | 0.23 | Most accurate |
| Cluster-specific models | 0.32 | Intermediate accuracy, more modular |
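
To make the comparison concrete, here is a hedged sketch of how two of the strategies might be scored. It is not the notebook's code: the data is synthetic, the feature layout is assumed, and `rmspe` implements the percentage-error reading of the metric name given above, which is itself an assumption about the exact formula used.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor

def rmspe(y_true, y_pred):
    """Root mean squared percentage error (assumed formula; y_true must be nonzero)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(((y_true - y_pred) / y_true) ** 2)))

def fit_and_score(X, y):
    """Tune n_neighbors by 5-fold CV on a training split, then score on the held-out split."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
    search = GridSearchCV(
        KNeighborsRegressor(),
        {"n_neighbors": [3, 5, 7, 9]},
        cv=5,
        scoring="neg_root_mean_squared_error",
    )
    search.fit(X_tr, y_tr)
    return rmspe(y_te, search.predict(X_te))

# Synthetic data (assumed): hour, day index, density label, and a connection count
rng = np.random.default_rng(1)
hour = rng.integers(0, 24, 300)
day = rng.integers(0, 7, 300)
density = rng.integers(0, 3, 300)  # 0 = Low, 1 = Medium, 2 = High
y = 5 + 20 * density + rng.normal(0, 1, 300)

score_plain = fit_and_score(np.column_stack([hour, day]), y)
score_labeled = fit_and_score(np.column_stack([hour, day, density]), y)
```

The cluster-specific strategy would call `fit_and_score` once per density group and average the results.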
- Created interactive 3D Plotly surfaces to show connection density across hours and days
- Compared predicted vs actual patterns using KNN-regressed surfaces
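
A surface plot needs the counts arranged as a day-by-hour grid. The sketch below shows one way to build that grid with pandas and hand it to Plotly's `go.Surface`; the column names and sample values are assumptions, and `to_grid` and `surface_figure` are hypothetical helpers rather than functions from the notebook.

```python
import pandas as pd

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical connection counts per (day, hour) slot
heat = pd.DataFrame({
    "day_of_week": ["Saturday", "Saturday", "Monday"],
    "hour": [2, 3, 9],
    "connections": [14, 11, 2],
})

def to_grid(df):
    """Pivot long-form counts into a 7 x 24 (day x hour) grid, filling empty slots with 0."""
    grid = df.pivot_table(
        index="day_of_week", columns="hour", values="connections",
        aggfunc="sum", fill_value=0,
    )
    return grid.reindex(index=DAYS, columns=range(24), fill_value=0)

def surface_figure(grid):
    """Build an interactive 3D surface; Plotly is imported lazily, only when plotting."""
    import plotly.graph_objects as go
    fig = go.Figure(go.Surface(
        z=grid.to_numpy(), x=list(grid.columns), y=list(range(len(grid.index))),
    ))
    # Show day names on the y axis instead of row indices
    fig.update_layout(scene=dict(yaxis=dict(
        tickvals=list(range(len(grid.index))), ticktext=list(grid.index),
    )))
    return fig

grid = to_grid(heat)
```

Building one grid from observed counts and another from model predictions, then rendering both with `surface_figure`, gives a side-by-side view of actual versus predicted patterns.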
- Peak activity occurred between 11:30 PM and 4:30 AM, with the highest point around 2:00 AM on Saturday
- Lowest activity during weekday mornings
- Friday showed surprisingly low usage
- Density-aware KNN models outperformed the unclustered model (average RMSPE of 0.23 and 0.32 versus 0.42)
- Python (Jupyter Notebook)
- `pandas`, `numpy`
- `altair`, `plotly`
- `scikit-learn` (KMeans, KNN, GridSearchCV)
JunHyun Kim
David Liu
Layni Janzen
Sydney Lee
This project was completed as part of UBC's DSCI 100 (Winter 2024) — Final Group Project (Group 10).