Skip to content

Latest commit

 

History

History
216 lines (134 loc) · 5.11 KB

File metadata and controls

216 lines (134 loc) · 5.11 KB

🛒 Zepto E-commerce SQL Data Analysis Project

A complete PostgreSQL-based SQL project using a real-world e-commerce inventory dataset from Zepto, one of India’s fastest-growing quick-commerce startups. This project demonstrates end-to-end data analytics skills — from data exploration to business-focused insights — using real analyst techniques like CASE, JOINS, CTEs, and Subqueries.


📌 Project Highlights

  • ✅ Realistic SQL project with retail use-case
  • 🧹 Data cleaning & transformation (e.g. paise → rupees)
  • 📊 Business insights: revenue, stock status, product value
  • 🧠 Uses advanced SQL techniques: CASE, JOIN, CTEs, Subqueries
  • 💼 Perfect for interview prep, portfolio building, or SQL learning

📁 Dataset Summary

  • Source: Kaggle (scraped from Zepto’s mobile app)
  • Real-world e-commerce catalog structure
  • Duplicate product names with different packaging/weights
  • Each row = unique SKU (Stock Keeping Unit)

🔧 Columns

Column Description
sku_id Unique product ID
name Product name
category Product category (e.g. Fruits, Snacks)
mrp Max Retail Price (₹)
discountPercent % discount on MRP
discountedSellingPrice Final discounted price
availableQuantity Inventory units available
weightInGms Product weight (grams)
outOfStock Boolean - Is the product out of stock?
quantity No. of units per SKU

🚀 SQL Features Demonstrated

🧹 Data Cleaning

  • Removed products with MRP or price = 0
  • Converted mrp & discountedSellingPrice from paise to rupees
  • Handled null values and inconsistent entries

🔍 Data Exploration

  • Counted total rows
  • Identified distinct categories
  • Compared stock vs out-of-stock
  • Detected repeated product names (same product, different SKU)

📊 Business Queries

  • Top discounted products
  • High MRP but out-of-stock products
  • Revenue by category
  • Expensive products with low discount
  • Avg. discount by category
  • Price-per-gram value ranking

✅ 1. Top 10 products with highest discount %

SELECT DISTINCT name, mrp, discountPercent
FROM zepto
ORDER BY discountPercent DESC
LIMIT 10;

📸 Output Screenshots

Screenshot 2025-08-04 012115

✅ 2. High-MRP products that are out of stock

SELECT DISTINCT name, mrp 
FROM zepto
WHERE outOfStock = TRUE AND mrp > 300
ORDER BY mrp DESC;

📸 Output Screenshots

Screenshot 2025-08-04 012138

✅ 3. Estimated revenue per category

SELECT category,
SUM(discountedSellingPrice * availableQuantity) AS total_revenue
FROM zepto
GROUP BY category
ORDER BY total_revenue DESC;

📸 Output Screenshots

Screenshot 2025-08-04 012202

✅ 4. Expensive products (MRP > ₹500) with low discount (< 10%)

SELECT DISTINCT name, mrp, discountPercent
FROM zepto
WHERE mrp > 500 AND discountPercent < 10
ORDER BY mrp DESC, discountPercent DESC;

📸 Output Screenshots

Screenshot 2025-08-04 012222

✅ 5. Top 5 categories with highest average discount %

SELECT category,
ROUND(AVG(discountPercent), 2) AS avg_discount
FROM zepto
GROUP BY category 
ORDER BY avg_discount DESC
LIMIT 5;

📸 Output Screenshots

Screenshot 2025-08-04 012239

✅ 6. Price per gram (value for money)

SELECT DISTINCT name, weightInGms, discountedSellingPrice,
ROUND(discountedSellingPrice / weightInGms, 2) AS price_per_gram
FROM zepto
WHERE weightInGms >= 100
ORDER BY price_per_gram;

📸 Output Screenshots

Screenshot 2025-08-04 012255

✅ 7. Classify product by weight (CASE statement)

SELECT DISTINCT name, weightInGms,
  CASE 
    WHEN weightInGms < 1000 THEN 'Low'
    WHEN weightInGms > 5000 THEN 'Medium'
    ELSE 'Bulk'
  END AS weight_category
FROM zepto;

📸 Output Screenshots

Screenshot 2025-08-04 012313

✅ 8. Total inventory weight per category

SELECT category,
SUM(weightInGms * availableQuantity) AS total_weight
FROM zepto
GROUP BY category
ORDER BY total_weight DESC;

📸 Output Screenshots

Screenshot 2025-08-04 012327