Store Sales - Time Series Forecasting

Overview

This end-to-end project aims to forecast retail store sales based on historical data, helping businesses optimize inventory management, reduce waste, and improve revenue planning. It includes data preprocessing, feature engineering, model training, and deployment-ready code.

Key Features

Data Preparation: Addressed missing values, normalized time series, and created aggregated features for store and category levels.
Feature Engineering: Generated lag features, rolling averages, seasonal indicators, and holiday-based features to capture temporal patterns.
Model Optimization: Applied Optuna for hyperparameter tuning of XGBoost, reducing RMSLE by 15%.
Validation and Testing: Used TimeSeriesSplit for proper evaluation of sequential data and implemented Pytest for functional testing of core modules.
Deployment-Ready: Integrated CI/CD pipelines and containerized the project using Docker.
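
The lag, rolling-average, and seasonal features listed above, together with time-ordered validation, could be sketched as follows. This is a minimal illustration, not the code in `src/`: the column names (`store_nbr`, `family`, `date`, `sales`), the helper `add_temporal_features`, and the window sizes are assumptions chosen for the example.

```python
import pandas as pd
from sklearn.model_selection import TimeSeriesSplit


def add_temporal_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add lag, rolling-mean, and calendar features per store/category series.

    Hypothetical helper: column names and window sizes are assumptions.
    """
    out = df.sort_values(["store_nbr", "family", "date"]).copy()
    grouped = out.groupby(["store_nbr", "family"])["sales"]

    # Lags only look at earlier rows of the same store/category series.
    out["sales_lag_1"] = grouped.shift(1)
    out["sales_lag_7"] = grouped.shift(7)

    # Shift before rolling so the window never includes the current day's sales.
    out["sales_roll_mean_7"] = grouped.transform(
        lambda s: s.shift(1).rolling(window=7, min_periods=1).mean()
    )

    # Simple seasonal indicators derived from the date.
    out["day_of_week"] = out["date"].dt.dayofweek
    out["month"] = out["date"].dt.month
    return out


# Tiny synthetic series: one store, one category, ten consecutive days.
demo = pd.DataFrame(
    {
        "store_nbr": 1,
        "family": "GROCERY",
        "date": pd.date_range("2017-01-01", periods=10, freq="D"),
        "sales": [float(v) for v in range(1, 11)],
    }
)
features = add_temporal_features(demo)

# TimeSeriesSplit keeps every validation fold strictly after its training fold.
dates = features["date"].drop_duplicates().sort_values().to_numpy()
folds = []
for train_idx, valid_idx in TimeSeriesSplit(n_splits=3).split(dates):
    folds.append((dates[train_idx].max(), dates[valid_idx].min()))
```

Shifting by one step before computing the rolling mean is the design choice that matters here: without it, the current day's target would leak into its own feature.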

Results

Achieved an RMSLE of 0.75094, a 15% improvement over baseline methods such as moving averages and linear regression.
The resulting forecasts can feed directly into inventory and demand planning.
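
For reference, RMSLE (root mean squared logarithmic error) can be computed as in the sketch below. The function name `rmsle` and the clipping of negative predictions to zero are assumptions made for this example, not necessarily how the project evaluates its models.

```python
import numpy as np


def rmsle(y_true, y_pred) -> float:
    """Root mean squared logarithmic error between two non-negative arrays."""
    y_true = np.asarray(y_true, dtype=float)
    # Assumption: negative predictions are clipped to zero so log1p stays defined.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), 0.0, None)
    squared_log_error = (np.log1p(y_pred) - np.log1p(y_true)) ** 2
    return float(np.sqrt(np.mean(squared_log_error)))


# Example call on made-up numbers.
example_score = rmsle([10.0, 0.0, 25.0], [12.0, 1.0, 20.0])
```

Because the error is measured on `log1p` values, it penalizes relative rather than absolute differences, which suits sales series whose volumes differ widely across stores and categories.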

Business Value

Inventory Optimization: Accurate sales forecasts reduce overstock and stockouts, minimizing storage costs and lost revenue.
Revenue Planning: Helps align inventory and workforce with expected sales patterns.
Strategic Insights: Enables better decision-making for promotions, pricing, and holiday planning.

Tools & Technologies

Programming Language: Python
Libraries: pandas, numpy, XGBoost, Optuna, Scikit-learn, Matplotlib, Seaborn
Tools: Jupyter Notebook, Pytest, Docker, CI/CD

How to Run

Option 1: Running Locally

Clone the repository:

```
git clone https://github.com/NasdormML/Store_Sales_Forecasting.git
cd Store_Sales_Forecasting
```

Install the required dependencies:
```
pip install -r requirements.txt  
```
Run the main script:
```
python main.py  
```

Option 2: Using Docker

Build the Docker image:
```
docker build -t store-sales .  
```
Run the Docker container:
```
docker run -p 8080:8080 store-sales  
```

Data

Source: The dataset is publicly available on Kaggle. It includes historical sales data, product categories, holidays, and other relevant features.
Preprocessing:
- Cleaned missing values using forward filling and interpolation methods.
- Created aggregated features at category and store levels.
- Removed outliers and addressed data leakage risks.
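
A minimal sketch of the forward-fill and interpolation step is shown below. The helper `fill_missing` and the column names `cluster` and `oil_price` are hypothetical and only illustrate the technique; which columns the project actually fills with which method is not stated here.

```python
import numpy as np
import pandas as pd


def fill_missing(df, ffill_cols, interp_cols, date_col="date"):
    """Fill gaps in time order: forward fill some columns, interpolate others.

    Hypothetical helper; the column lists are supplied by the caller.
    """
    out = df.sort_values(date_col).copy()
    # Forward fill carries the last observed value into later missing rows.
    out[ffill_cols] = out[ffill_cols].ffill()
    # Linear interpolation draws a straight line between neighboring known values.
    out[interp_cols] = out[interp_cols].interpolate(method="linear")
    return out


# Made-up data with gaps in both columns.
raw = pd.DataFrame(
    {
        "date": pd.date_range("2017-01-01", periods=5, freq="D"),
        "cluster": [1.0, np.nan, np.nan, 2.0, np.nan],
        "oil_price": [50.0, np.nan, 54.0, np.nan, 58.0],
    }
)
clean = fill_missing(raw, ffill_cols=["cluster"], interp_cols=["oil_price"])
```

Note that interpolation uses the next known value as well as the previous one, so on columns used as model inputs it can carry information backward in time. Forward fill avoids this and is the safer choice where leakage matters.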

Project Structure

time_series_project/  
├── .github/  
│   └── workflows/  
│       └── ci.yml               # CI/CD configuration file  
├── data/  
│   ├── processed/               # Preprocessed data ready for modeling  
│   └── raw/                     # Raw input data  
├── models/                      # Saved trained models  
├── notebooks/                   # Jupyter notebooks for exploratory analysis  
│   ├── EDA.ipynb                # Exploratory Data Analysis  
│   └── store_sales_kaggle.ipynb # Additional exploratory analysis  
├── src/                         # Source code of the project  
│   ├── data_preparation.py      # Code for data preprocessing  
│   ├── model_prediction.py      # Code for generating predictions  
│   └── model_training.py        # Code for training the model  
├── tests/                       # Unit and integration tests  
├── dockerfile                   # Dockerfile for containerizing the project  
├── main.py                      # Entry point for running the project  
├── README.md                    # Project description and documentation  
└── requirements.txt             # Python dependencies

Contact

If you have any questions or suggestions, feel free to reach out:

Email: nasdorm.ml@inbox.ru

License

This project is licensed under the MIT License. See the LICENSE file for details.
