Skip to content

Mohd-Shabir/data-analytics-sql-project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

49 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

SQL Data Analytics Project

A data analytics project built on top of a real data warehouse, using pure SQL to explore data, find patterns, and answer business questions about customers, products, sales, and visualize insights through an interactive Power BI dashboard.


πŸ“‘ Table of Contents


πŸ”— End-to-End Project Series

This project is part of a three-part data pipeline I built from scratch:

# Project What It Does Repo
1 Data Warehouse Built the data infrastructure β€” ETL pipelines, Bronze/Silver/Gold layers, Star Schema https://github.com/Mohd-Shabir/data-warehouse-sql-project-
2 Data Analytics Analysed the Gold layer data using EDA and Advanced SQL (this repo) β€”
3 Power BI Dashboard Built an interactive sales dashboard on top of the analytics (this repo) β€”

πŸ“– Executive Summary

This project is the second and third part of a three-part end-to-end data pipeline. Starting from the Gold layer produced by the Data Warehouse project, I first conducted Exploratory Data Analysis (EDA) across 6 scripts to understand the data β€” exploring dimensions, date ranges, KPIs, and rankings. I then applied Advanced Data Analytics (ADA) across 7 scripts to answer deeper business questions β€” analysing trends over time, cumulative growth, year-over-year performance, and customer/product segmentation. The final output was two reusable SQL views (report_customers and report_products) which I then connected directly to Power BI to build an interactive 3-page executive dashboard β€” segmenting the customer base, tracking high-value VIPs, and identifying top-performing product categories to help leadership optimize marketing spend and inventory.


🎯 Business Problem

The executive team lacked clear visibility into customer purchasing behaviors and product profitability. They needed to know:

# Business Question
1 Who are our most valuable customers (VIPs), and how do their purchasing habits differ from one-time buyers?
2 Do older demographics spend more than younger ones?
3 Which specific product categories are driving the actual revenue vs. just taking up warehouse space?

πŸ—ΊοΈ Analytics Roadmap

Analytics Roadmap


πŸ› οΈ Tech Stack

Tool Purpose
MySQL Database engine
DBeaver SQL client
Git / GitHub Version control
Power BI Dashboard

sql-data-analytics/
β”‚
β”œβ”€β”€ data/
β”‚   β”œβ”€β”€ gold_layer/                       Used for Analytics
β”‚   β”‚   β”œβ”€β”€ gold_dim_customers.csv
β”‚   β”‚   β”œβ”€β”€ gold_dim_products.csv
β”‚   β”‚   └── gold_fact_sales.csv
β”‚   └── report_layer/                     Used for Power BI Dashboard
β”‚       β”œβ”€β”€ gold.report_customers.csv     
β”‚       └── gold.report_products.csv      
β”‚
β”œβ”€β”€ docs/
β”‚   └── analytics_roadmap.png
β”‚
β”œβ”€β”€ powerbi_dashboard/
β”‚   β”œβ”€β”€ powerbi_dashboard.pbix
β”‚   β”œβ”€β”€ powerbi_dashboard.pdf
β”‚   β”œβ”€β”€ powerbi_dashboard1.png
β”‚   β”œβ”€β”€ powerbi_dashboard2.png
β”‚   └── powerbi_dashboard3.png
β”‚
β”œβ”€β”€ scripts/
β”‚   β”œβ”€β”€ 01_eda_database_exploration.sql
β”‚   β”œβ”€β”€ 02_eda_dimensions_exploration.sql
β”‚   β”œβ”€β”€ 03_eda_date_range_exploration.sql
β”‚   β”œβ”€β”€ 04_eda_measures_exploration.sql
β”‚   β”œβ”€β”€ 05_eda_magnitude_analysis.sql
β”‚   β”œβ”€β”€ 06_eda_ranking_analysis.sql
β”‚   β”œβ”€β”€ 07_ada_change_over_time_analysis.sql
β”‚   β”œβ”€β”€ 08_ada_cumulative_analysis.sql
β”‚   β”œβ”€β”€ 09_ada_performance_analysis.sql
β”‚   β”œβ”€β”€ 10_ada_part_to_whole_analysis.sql
β”‚   β”œβ”€β”€ 11_ada_data_segmentation.sql
β”‚   β”œβ”€β”€ 12_report_customers.sql
β”‚   └── 13_report_products.sql
β”‚
β”œβ”€β”€ LICENSE
β”‚
└── README.md

πŸ” Exploratory Data Analysis (EDA)

# Script What It Explores
1 Database Exploration Tables, columns, data types using INFORMATION_SCHEMA
2 Dimensions Exploration Unique countries, genders, and full product hierarchy
3 Date Range Exploration Order date boundaries, customer age range, shipping speed
4 Measures Exploration Core KPIs β€” total sales, orders, customers, average price
5 Magnitude Analysis Revenue and quantity grouped by category, country, gender
6 Ranking Analysis Top/bottom products and customers using RANK() and PARTITION BY

πŸ“Š Advanced Data Analytics (ADA)

# Script What It Answers
7 Change-Over-Time Monthly and quarterly sales trends, seasonality
8 Cumulative Analysis Running total sales and 3-month moving average
9 Performance Analysis Year-over-Year and Month-over-Month growth using LAG()
10 Part-to-Whole Revenue contribution % by category and country
11 Data Segmentation Customer loyalty, churn risk, and product cost segments
12 Report: Customers Final customer view β€” age group, VIP/Regular/New, AOV
13 Report: Products Final product view β€” performance segment, AOR, lifespan

πŸ“Š Advanced Data Analytics (ADA)

# Script What It Answers
7 Change-Over-Time Monthly and quarterly sales trends, seasonality
8 Cumulative Analysis Running total sales and 3-month moving average
9 Performance Analysis Year-over-Year and Month-over-Month growth using LAG()
10 Part-to-Whole Revenue contribution % by category and country
11 Data Segmentation Customer loyalty, churn risk, and product cost segments
12 Report: Customers Final customer view β€” age group, VIP/Regular/New, AOV
13 Report: Products Final product view β€” performance segment, AOR, lifespan

πŸ“‹ Business Reports

Two reusable VIEW objects created in the Gold layer, ready to connect to any BI tool:

  • gold.report_customers β€” Customer segments, churn risk, recency, average order value
  • gold.report_products β€” Product segments, cost range, average monthly revenue, lifespan

πŸ“Š Power BI Dashboard

Built an interactive 3-page sales dashboard on top of the Gold layer reporting views.

πŸ”— Pages

Page What It Shows
Executive Summary KPI cards, revenue trend, category breakdown, AOV by age group
Customer Intelligence Customer segments, age distribution, top 10 customers, avg monthly spend
Product Performance Product segments, revenue by subcategory, top 10 products, avg monthly revenue

🎨 Design

Element Value
Background #F7F6F2 warm white
Primary accent #C9A227 gold
Secondary #2A9D8F teal
Header #1A1A2E dark navy

πŸ“ Files

File Description
powerbi_dashboard.pbix Power BI source file
powerbi_dashboard.pdf Exported PDF of all 3 pages
powerbi_dashboard1.png Executive Summary screenshot
powerbi_dashboard2.png Customer Intelligence screenshot
powerbi_dashboard3.png Product Performance screenshot

πŸ› οΈ Tools

Tool Purpose
Power BI Dashboard building
DAX KPI measures
CSV Data source from Gold layer views

πŸ“ DAX Measures

Total Sales     = SUM(report_customers[total_sales])
Total Customers = DISTINCTCOUNT(report_customers[customer_key])
Total Orders    = SUM(report_customers[total_orders])
Total Products  = DISTINCTCOUNT(report_products[product_key])

πŸ“Έ Screenshots

Executive Summary Customer Intelligence Product Performance
Executive Summary Customer Intelligence Product Performance

πŸ“ˆ Results & Recommendations

# Business Question Finding Recommendation
1 Who are our most valuable customers (VIPs), and how do their purchasing habits differ from one-time buyers? 80% of customers are "New" (single purchases), while the tiny VIP segment (8.75%) drives massive value with an Average Order Value (AOV) of $2,600+ per order. Retain existing VIP customers through exclusive perks and early access offers, while implementing a loyalty program and personalized follow-up campaigns targeting New customers post-purchase β€” with tiered incentives designed to gradually move them toward Regular and VIP status.
2 Do older demographics spend more than younger ones? The 50+ age group dominates the customer base (12,400 customers) and maintains the highest AOV of $918 (vs. $794 for the 30-39 group). Tailor primary ad copy and imagery to resonate with the dominant 50+ demographic, while creating a secondary campaign targeting the 30-39 age group to diversify the customer base and reduce age-concentration risk.
3 Which product categories drive revenue vs. take up warehouse space? Road Bikes alone generate $14.5M while apparel/clothing items (like Jerseys and Shorts) generate less than $0.3M combined despite taking up catalog space. Reallocate marketing budget toward Road Bike accessories and cross-sell campaigns. For low-performing clothing SKUs, reduce inventory levels and run promotional pricing to clear stock before evaluating discontinuation on a case-by-case basis.

πŸ’‘ Key Insights

  • πŸ’° Total Revenue: $29.4M across all time
  • 🚴 Bikes drive ~97% of revenue β€” Road Bikes alone account for $14.52M
  • πŸ‘₯ 80% of customers are New segment β€” significant upsell opportunity for Regular/VIP conversion
  • πŸ† Top customer: Willie Xu β€” $13,489 in revenue across 6 orders
  • πŸ‘΄ 50+ age group dominates β€” 12,400 customers, the largest demographic
  • πŸ“¦ 130 active SKUs β€” with 50.77% classified as High-Performers

About

SQL analytics & Power BI dashboard built on the Gold layer from Part 1 (data-warehouse-sql-project). Covers EDA, advanced SQL (YoY, cumulative, segmentation), customer/product reports, and a 3-page interactive dashboard.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors