This project builds predictive models that forecast traffic flow on specific roads from historical traffic data. Using data science and machine learning techniques, the models aim to anticipate traffic congestion and support traffic management.
Traffic prediction is critical for urban planning, smart city solutions, and navigation systems. This repository contains all code, data, and notebooks related to the development, training, and evaluation of traffic prediction models. The data used in this project is collected from multiple roads, with traffic flow recorded every five minutes.
- Dataset/: Contains CSV files with recorded traffic data for different roads (e.g., train13519.csv for Road X, train13619.csv for Road Y, etc.).
- Data_Preparation_New_Datasets/: Scripts and notebooks for cleaning and preparing new datasets for modeling.
- Traffic_Junc/: Additional scripts/data related to specific traffic junctions.
- Junc1_prediction.ipynb: Jupyter notebook demonstrating data preparation, model training, and prediction for one of the junctions.
- Second data set data preparation.ipynb: Notebook for cleaning and preparing a second dataset.
- junct_4.ipynb: Notebook focused on predictions and analysis for a fourth junction.
- traffic.ipynb: Main notebook for exploring data, feature engineering, modeling, and evaluation.
- Loading and Inspecting Data: Raw CSV files are loaded and inspected for missing values and consistency.
- Cleaning: Data cleaning is performed to handle missing or anomalous values and ensure uniform timestamps.
- Feature Engineering: Additional features such as time of day, day of week, and historical lags are engineered to improve model performance.
- Exploratory Data Analysis (EDA): Visualizations and statistics are generated to understand traffic patterns and correlations.
- Model Selection: Multiple models, including classical time series (ARIMA, SARIMA) and machine learning regressors (Random Forest, XGBoost), are evaluated.
- Training & Evaluation: Models are trained on the prepared datasets and evaluated with metrics such as mean absolute error (MAE) and root mean squared error (RMSE).
- Prediction: The best models are used to forecast future traffic flow for each road/junction.
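
The cleaning and feature engineering steps above can be sketched roughly as follows. This is a minimal illustration rather than the notebooks' actual code: the column names `timestamp` and `flow`, the helper `prepare`, the lag choices, and the synthetic data are all assumptions, since the CSV schema is not described here.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one road's readings. The real CSV column names are
# not documented here, so "timestamp" and "flow" are assumed names.
rng = np.random.default_rng(0)
timestamps = pd.date_range("2024-01-01", periods=600, freq="5min")
raw = pd.DataFrame({
    "timestamp": timestamps,
    "flow": rng.poisson(40, size=600).astype(float),
})
raw = raw.drop(index=[10, 11, 250])  # simulate readings that were never recorded
raw.loc[100, "flow"] = np.nan        # simulate a recorded but empty value


def prepare(df, lags=(1, 2, 12)):
    """Hypothetical helper: uniform 5-minute grid, gap filling, time and lag features."""
    out = df.copy()
    out["timestamp"] = pd.to_datetime(out["timestamp"])
    out = out.drop_duplicates("timestamp").set_index("timestamp").sort_index()

    # Put every reading on a uniform 5-minute grid; missing slots become NaN.
    out = out.resample("5min").mean()

    # Fill short gaps (at most 3 consecutive slots) by time-based interpolation.
    out["flow"] = out["flow"].interpolate(method="time", limit=3)

    # Calendar features: time of day and day of week.
    out["hour"] = out.index.hour
    out["minute_of_day"] = out.index.hour * 60 + out.index.minute
    out["day_of_week"] = out.index.dayofweek

    # Historical lags: flow 5, 10, and 60 minutes earlier with the default lags.
    for lag in lags:
        out[f"flow_lag_{lag}"] = out["flow"].shift(lag)

    # Drop rows that still lack a value (the first rows have no lag history).
    return out.dropna()


features = prepare(raw)
print(features.head())
```

Interpolating only short gaps is one possible design choice; longer outages are left as missing and dropped so the model is not trained on invented values.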
- Python 3.x
- Jupyter Notebook (Anaconda recommended)
- Required libraries: pandas, numpy, scikit-learn, matplotlib, seaborn, statsmodels, xgboost (install via pip or conda)
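
As a quick sanity check before opening the notebooks, a short script along these lines can report which of the listed libraries are not yet installed. The helper name `missing_packages` is hypothetical; note that scikit-learn is imported under the module name `sklearn`.

```python
from importlib.util import find_spec

# Map each package name (as installed via pip or conda) to its import name.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "statsmodels": "statsmodels",
    "xgboost": "xgboost",
}


def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```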
- Clone the repository
git clone https://github.com/Edwin574/Traffic-Prediction-Model.git
- Inspect and Explore the Dataset
- Navigate to the Dataset folder and review the CSV files.
- Run the Notebooks
- Open notebooks (e.g., traffic.ipynb, Junc1_prediction.ipynb) in Jupyter for step-by-step code and explanations.
- Modify and Experiment
- Adjust code for different roads or models as needed.
- Load a dataset (e.g., train13519.csv) in a notebook.
- Clean and preprocess the data.
- Engineer relevant features (e.g., lag features).
- Split the data into train and test sets.
- Train various models and compare performance.
- Visualize predictions versus actual traffic data.
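
The split, training, and comparison steps in this workflow might look like the sketch below. It is an illustration under stated assumptions, not the notebooks' code: the data is synthetic, the column names and lag choices are made up, and only scikit-learn models plus a naive persistence baseline are used so the example stays self-contained. The split is chronological (no shuffling), which keeps future observations out of the training set.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Synthetic 5-minute traffic flow with a daily cycle (288 intervals per day).
rng = np.random.default_rng(42)
n = 2000
t = np.arange(n)
flow = 50 + 20 * np.sin(2 * np.pi * t / 288) + rng.normal(0, 3, n)

df = pd.DataFrame({"flow": flow})
for lag in (1, 2, 3):
    df[f"lag_{lag}"] = df["flow"].shift(lag)  # assumed lag features
df = df.dropna()

X = df[["lag_1", "lag_2", "lag_3"]]
y = df["flow"]

# Chronological split: first 80% for training, last 20% for testing.
split = int(len(df) * 0.8)
X_train, X_test = X.iloc[:split], X.iloc[split:]
y_train, y_test = y.iloc[:split], y.iloc[split:]

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "MAE": mean_absolute_error(y_test, pred),
        "RMSE": np.sqrt(mean_squared_error(y_test, pred)),
    }

# Naive baseline: predict that flow equals the previous 5-minute reading.
baseline = X_test["lag_1"]
scores["persistence"] = {
    "MAE": mean_absolute_error(y_test, baseline),
    "RMSE": np.sqrt(mean_squared_error(y_test, baseline)),
}

print(pd.DataFrame(scores).T.sort_values("RMSE"))
```

Including a persistence baseline gives a reference point: a model that cannot beat "same as the last reading" is probably not adding value.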
- @Edwin574
- @Rontim
- @Mokowz
- @Allanvince
This project is for educational and research purposes.
For questions, suggestions, or contributions, please open an issue or pull request.