This project analyses delay patterns within the Toronto Transit Commission (TTC) subway system.
The analysis focuses on identifying relationships between delays and factors such as:
- station location
- subway lines
- time of day
- weather conditions
The results show that delays are concentrated primarily on Line 1 and Line 2 and are more closely related to operational factors than to external ones such as precipitation.
This project was completed as part of a group Business Intelligence assignment.
My contribution focused on:
- data cleaning and preparation
- delay pattern analysis
- Power BI visualization
- station mapping
- line-based analysis
- hourly delay analysis
- weather correlation analysis
- solution recommendations
Tools used:
- Python
- Excel
- Power BI
Analysis steps:
- cleaned and standardized raw TTC delay data
- handled inconsistent line naming conventions
- created derived variables for analysis
- prepared datasets for visualization
- performed aggregation and pivot-based analysis
- analysed hourly delay distributions
- conducted precipitation and delay correlation analysis
- created geographic and temporal visualizations
- mapped high-delay stations
- analysed delay concentration by line and time
Project files:
- `TTC_Solution_Report.pdf`: final analysis and recommendations
- `viz.pbix`: Power BI dashboard
- `ttc_project_data.xlsx`: processed data and analysis
- `BI_Project.ipynb`: data cleaning and preparation
Key findings:
- Most delays are concentrated on Line 1 and Line 2
- Weather showed little correlation with delay volume
- Unexpected delay peaks were observed around 10 PM and 11 PM
- High-delay stations form clear geographic corridors along the TTC subway network
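
The cleaning and aggregation steps described above (standardizing line names, counting delays by line and hour, and checking the precipitation relationship) can be sketched roughly as follows. This is a minimal standard-library illustration, not the code in `BI_Project.ipynb`: the sample rows, the `LINE_ALIASES` table, and all function names are assumptions made for the example, and the data is invented rather than taken from the project.

```python
import re
from collections import Counter
from datetime import datetime
from math import sqrt

# Hypothetical sample rows (timestamp, raw line label); not real TTC data.
SAMPLE_DELAYS = [
    ("2024-01-05 08:15", "YU"),
    ("2024-01-05 22:40", "line 1"),
    ("2024-01-05 22:55", "BD"),
    ("2024-01-06 23:10", "Line 2 "),
    ("2024-01-06 17:30", "SHP"),
]

# Assumed alias table mapping inconsistent labels to one canonical name.
LINE_ALIASES = {
    "YU": "Line 1", "LINE 1": "Line 1",
    "BD": "Line 2", "LINE 2": "Line 2",
    "SHP": "Line 4", "LINE 4": "Line 4",
}


def normalize_line(raw):
    """Trim, upper-case, and collapse whitespace, then look up the alias."""
    key = re.sub(r"\s+", " ", raw.strip().upper())
    return LINE_ALIASES.get(key, "Unknown")


def delays_by_line(rows):
    """Count delay records per canonical line name."""
    return Counter(normalize_line(line) for _, line in rows)


def delays_by_hour(rows):
    """Count delay records per hour of day (0-23)."""
    return Counter(
        datetime.strptime(ts, "%Y-%m-%d %H:%M").hour for ts, _ in rows
    )


def pearson(xs, ys):
    """Pearson correlation, e.g. daily precipitation vs. daily delay count."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    print(delays_by_line(SAMPLE_DELAYS))
    print(delays_by_hour(SAMPLE_DELAYS).most_common(3))
```

A coefficient from `pearson` close to zero for daily precipitation against daily delay counts would be in line with the weather finding listed above, although correlation on its own does not rule out non-linear effects.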
This approach can be applied to other systems where operations depend on location and scheduling, such as:
- airline scheduling
- logistics and delivery services
- public transportation systems