📊 E-Commerce Statistical Analysis Dashboard

Portfolio Project — Applying core statistical distributions and theorems to solve real-world e-commerce business problems.

🎯 Project Overview

This project demonstrates how statistical theory translates into business decisions in an e-commerce context. Using 1,200 synthetic customer transactions, four key statistical concepts are modeled, visualized, and interpreted through a business lens.

Tech Stack: Python · NumPy · Pandas · SciPy · Matplotlib

📁 Project Structure

ecommerce-statistical-analysis/
│
├── ecommerce_statistical_analysis.py   # Main analysis script
├── ecommerce_stats_dashboard.png       # Output dashboard (auto-generated)
└── README.md

🔢 Dataset

Synthetic dataset of 1,200 customer transactions with the following schema:

| Column | Type | Description |
|---|---|---|
| `Customer_ID` | string | Unique customer identifier (e.g. CUST_00001) |
| `Purchase_Amount` | float | Transaction value in USD |
| `Conversion_Success` | int (0/1) | Whether the session led to a purchase |
| `Arrival_Time` | int | Number of orders arriving in a given hour |
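
As a rough illustration of the schema, the sketch below builds a table with the same four columns. The generating distributions are assumptions, not the project script's confirmed method: an exponential for `Purchase_Amount` (chosen because the reported mean and standard deviation are nearly equal), a Bernoulli draw with p = 0.05 for `Conversion_Success`, and a Poisson draw with λ = 12 for `Arrival_Time`. The seed is arbitrary.

```python
import numpy as np
import pandas as pd

# Assumed generating distributions; the real script may differ.
rng = np.random.default_rng(42)  # arbitrary seed for reproducibility
n_rows = 1200

df = pd.DataFrame({
    # Zero-padded identifiers: CUST_00001, CUST_00002, ...
    "Customer_ID": [f"CUST_{i:05d}" for i in range(1, n_rows + 1)],
    # Right-skewed spend in USD (assumption: exponential with mean 105).
    "Purchase_Amount": rng.exponential(scale=105.0, size=n_rows).round(2),
    # 1 if the session converted, 0 otherwise (assumption: p = 0.05).
    "Conversion_Success": rng.binomial(n=1, p=0.05, size=n_rows),
    # Orders arriving in the hour tied to this record (assumption: lambda = 12).
    "Arrival_Time": rng.poisson(lam=12, size=n_rows),
})

print(df.head())
print(df.dtypes)
```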

📐 Statistical Analyses

① Normal Distribution & Z-Score

Goal: Detect anomalous purchase amounts (outliers).

Purchase amounts modeled as N(μ=105.26, σ=101.40)
Customers with |Z| > 3 flagged as statistical outliers
22 outliers detected out of 1,200 transactions
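
The Z-score rule above can be sketched as follows. This is a minimal illustration on hypothetical data (1,195 ordinary purchases plus 5 extreme ones), not the project's dataset; the helper name `flag_outliers` is made up for this example.

```python
import numpy as np
import pandas as pd


def flag_outliers(amounts: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Return a boolean mask that is True where |Z| exceeds the threshold."""
    z = (amounts - amounts.mean()) / amounts.std(ddof=0)
    return z.abs() > threshold


# Hypothetical data: typical purchases around $100 plus five $1,000 orders.
rng = np.random.default_rng(0)
amounts = pd.Series(np.concatenate([
    rng.normal(loc=100.0, scale=20.0, size=1195),
    np.full(5, 1000.0),
]))

mask = flag_outliers(amounts)
print(f"{mask.sum()} outliers out of {len(amounts)} transactions")
```

Under an exactly Normal model only about 0.27% of values satisfy |Z| > 3, roughly 3 in 1,200, so a count of 22 suggests that purchase amounts have a heavier tail than the fitted Normal.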

💼 Business Insight: Flagged customers are candidates for VIP upselling programs or fraud review queues, which supports both revenue growth and loss prevention.

② Binomial Distribution

Goal: Model conversion probability across a batch of sessions.

Parameters: n = 200 sessions, p = 0.05 conversion rate
Calculates P(X = k) for all k using scipy.stats.binom
Example: P(X = 8) = 0.1137
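
A minimal sketch of the calculation with `scipy.stats.binom`, using the parameters listed above. The variable names are illustrative, and the "at least 8" probability is an extra shown only to contrast a point probability with a tail probability.

```python
import numpy as np
from scipy.stats import binom

n_sessions = 200   # sessions in the batch
p_convert = 0.05   # conversion probability per session

# P(X = k) for every possible conversion count k = 0..200.
k = np.arange(n_sessions + 1)
pmf = binom.pmf(k, n_sessions, p_convert)

p_exactly_8 = binom.pmf(8, n_sessions, p_convert)   # P(X = 8)
p_at_least_8 = binom.sf(7, n_sessions, p_convert)   # P(X >= 8) = P(X > 7)
expected = binom.mean(n_sessions, p_convert)        # n * p

print(f"P(X = 8)  = {p_exactly_8:.4f}")
print(f"P(X >= 8) = {p_at_least_8:.4f}")
print(f"Expected conversions = {expected:.1f}")
```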

💼 Business Insight: Knowing the exact probability of hitting a conversion count helps marketing teams set data-backed KPIs and allocate ad spend without over- or under-estimating campaign outcomes.

③ Poisson Distribution

Goal: Model hourly order arrival rates and predict peak surges.

Average rate: λ = 12 orders/hour
Probability of a surge (>20 orders/hour): P(X > 20) = 0.0116
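
The surge probability can be reproduced with the Poisson survival function. The sketch below also converts it into expected surge hours per year, assuming orders arrive around the clock (8,760 hours), and looks up the smallest hourly capacity that covers 99% of hours. Both extras are illustrative assumptions rather than outputs of the project script.

```python
from scipy.stats import poisson

lam = 12  # average orders per hour

# sf(k) = P(X > k), so this is the probability of more than 20 orders in an hour.
p_surge = poisson.sf(20, lam)

# Assumption: the store takes orders 24 hours a day, 365 days a year.
hours_per_year = 24 * 365
expected_surge_hours = p_surge * hours_per_year

# Smallest capacity c such that P(X <= c) >= 0.99.
capacity_99 = int(poisson.ppf(0.99, lam))

print(f"P(X > 20) = {p_surge:.4f}")
print(f"Expected surge hours per year = {expected_surge_hours:.0f}")
print(f"Capacity covering 99% of hours = {capacity_99} orders/hour")
```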

💼 Business Insight: A surge probability of about 1.2% per hour adds up at scale: for a store taking orders around the clock (8,760 hours per year), it amounts to roughly 100 surge hours annually. Poisson modeling lets logistics teams schedule warehouse capacity before the crunch hits.

④ Central Limit Theorem (CLT)

Goal: Demonstrate that sample means converge to a Normal distribution even when the source distribution is far from Normal (provided its variance is finite).

Source: Exponential distribution (heavily right-skewed)
Simulated 2,000 samples at sizes: n = 5, 30, 100, 500
As n increases, the distribution of sample means approaches a Normal centered at μ ≈ 50, with a standard error σ/√n that shrinks toward 0
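
The simulation described above can be sketched as follows. It assumes the source is an exponential with mean 50 (so its standard deviation is also 50) and uses an arbitrary seed; the project script may be set up differently.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(7)  # arbitrary seed
population_mean = 50.0          # assumed exponential mean (sigma is also 50)
n_samples = 2000
sample_sizes = [5, 30, 100, 500]

results = {}
for n in sample_sizes:
    # Each row is one sample of size n; the row mean is one sample mean.
    draws = rng.exponential(scale=population_mean, size=(n_samples, n))
    means = draws.mean(axis=1)
    results[n] = {
        "mean": means.mean(),
        "std": means.std(ddof=1),
        "theory_std": population_mean / np.sqrt(n),  # sigma / sqrt(n)
        "skew": skew(means),
    }
    r = results[n]
    print(f"n={n:>3}: mean={r['mean']:.2f}  std={r['std']:.2f}  "
          f"theory={r['theory_std']:.2f}  skew={r['skew']:.2f}")
```

The spread of the sample means should track σ/√n, and the skewness should fall toward 0 as n grows, which is the visible sign of convergence to a Normal shape.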

💼 Business Insight: CLT justifies estimating population-wide spending patterns from a random sample of customers rather than surveying every customer. The more skewed the spending data, the larger the sample needs to be before the Normal approximation is reliable.

🚀 How to Run

1. Clone the repository:

git clone https://github.com/thed700/ecommerce-statistical-analysis.git
cd ecommerce-statistical-analysis

2. Install dependencies:

pip install numpy pandas scipy matplotlib

3. Run the analysis:

python ecommerce_statistical_analysis.py

The script will:

Generate the synthetic dataset
Print statistical results to console
Save ecommerce_stats_dashboard.png to the current directory
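
For orientation, a four-panel dashboard of this kind could be produced roughly as below. This is a sketch under assumptions, not the project's actual plotting code: the panel layout, colors, and the simulated inputs are all illustrative, and only the output filename comes from this README.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import binom, poisson

rng = np.random.default_rng(1)  # arbitrary seed
output_path = Path("ecommerce_stats_dashboard.png")

fig, axes = plt.subplots(2, 2, figsize=(12, 8))

# Panel 1: purchase amounts (hypothetical right-skewed data).
amounts = rng.exponential(scale=105.0, size=1200)
axes[0, 0].hist(amounts, bins=40, color="steelblue")
axes[0, 0].set_title("Purchase amounts (USD)")

# Panel 2: binomial PMF for conversions in 200 sessions.
k = np.arange(0, 26)
axes[0, 1].bar(k, binom.pmf(k, 200, 0.05), color="seagreen")
axes[0, 1].set_title("Binomial PMF (n=200, p=0.05)")

# Panel 3: Poisson PMF for hourly orders, with the surge threshold marked.
orders = np.arange(0, 31)
axes[1, 0].bar(orders, poisson.pmf(orders, 12), color="darkorange")
axes[1, 0].axvline(20.5, color="red", linestyle="--")
axes[1, 0].set_title("Poisson PMF (lambda=12)")

# Panel 4: CLT, distribution of 2,000 sample means at n = 100.
means = rng.exponential(scale=50.0, size=(2000, 100)).mean(axis=1)
axes[1, 1].hist(means, bins=40, color="slateblue")
axes[1, 1].set_title("Sample means (n=100)")

fig.tight_layout()
fig.savefig(output_path, dpi=150)
plt.close(fig)
print(f"Saved {output_path}")
```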

📊 Key Results Summary

| Analysis | Key Metric | Business Value |
|---|---|---|
| Normal + Z-Score | 22 outliers detected (\|Z\| > 3) | VIP upselling / fraud review |
| Binomial | P(X=8 \| n=200, p=0.05) = 0.1137 | Realistic KPI setting |
| Poisson | P(surge>20/hr) = 0.0116 | Proactive staffing |
| CLT | Exponential → Normal as n→∞ | Survey-based inference |

🛠 Skills Demonstrated

Statistical distribution modeling (Normal, Binomial, Poisson)
Threshold-based outlier detection with Z-scores
Monte Carlo simulation for CLT demonstration
Publication-quality data visualization with Matplotlib
Translating statistical outputs into business recommendations

👤 Author

Akmal Raxmatov

GitHub: @thed700
Focus: Data Analytics · Economic Analysis · Statistical Modeling

This project is part of a self-directed data analytics portfolio targeting Junior Data Analyst roles.
