🎓 Udemy Courses Data Analysis

Analyzing the dynamics of one of the world’s largest online learning platforms using Python and data visualization tools.


📌 Project Overview

This project involves a comprehensive analysis of a publicly available Udemy courses dataset to uncover key trends in:

  • 📂 Course categories
  • 💰 Pricing models
  • 👨‍🏫 Enrollment patterns
  • ⭐ Course ratings

The goal is to deliver actionable insights that support decision-making for instructors, learners, and platform strategists.

Tools used: Python, Pandas, NumPy, Matplotlib, Seaborn — all within Jupyter Notebook.


🎯 Key Objectives

  • 📚 Understand the distribution of courses across categories and subcategories
  • 💸 Examine pricing trends & contrast free vs paid course patterns
  • 📈 Analyze enrollment data to identify popular content
  • ⭐ Explore the correlation between ratings and enrollments
  • 📊 Create insightful visualizations to highlight trends and platform behavior

🧪 Methodology

🧼 Data Cleaning

  • Handled missing values, standardized formats, corrected data types
  • Removed duplicate records to ensure accuracy
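The cleaning steps above could look roughly like the following sketch. The column names (`title`, `category`, `price`, `rating`, `enrollment`), the `"Free"` price label, and the helper `clean_courses` are assumptions made for illustration, so adjust them to the actual CSV schema.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the raw CSV; all column names and values are hypothetical.
raw = pd.DataFrame({
    "title": ["Intro to Python", "Intro to Python", "Excel Basics", "SQL 101"],
    "category": ["Development", "Development", " Business ", "IT & Software"],
    "price": ["Free", "Free", "49.99", None],
    "rating": [4.5, 4.5, np.nan, 4.1],
    "enrollment": ["1200", "1200", "300", "850"],
})


def clean_courses(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: no duplicate rows, numeric types, tidy labels."""
    out = df.drop_duplicates().copy()
    # Standardize category labels by trimming stray whitespace.
    out["category"] = out["category"].str.strip()
    # Treat the label "Free" as a price of 0, then coerce to a numeric dtype.
    out["price"] = pd.to_numeric(out["price"].replace({"Free": "0"}), errors="coerce")
    out["enrollment"] = pd.to_numeric(out["enrollment"], errors="coerce").astype("Int64")
    # Fill missing ratings with the median rating.
    out["rating"] = out["rating"].fillna(out["rating"].median())
    # Drop rows that still have no usable price.
    return out.dropna(subset=["price"]).reset_index(drop=True)


cleaned = clean_courses(raw)
print(cleaned)
```

Filling ratings with the median and dropping rows without a price are only one possible policy; the notebook may handle missing values differently.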

📊 Exploratory Data Analysis (EDA)

  • Used descriptive stats, groupby(), and sorting to summarize key trends
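As a minimal sketch of this step, the snippet below combines `describe()`, `groupby()`, and `sort_values()` on a few made-up rows. The column names and values are assumptions, not the real dataset.

```python
import pandas as pd

# Illustrative rows only; column names are assumptions, not the real schema.
courses = pd.DataFrame({
    "category": ["Development", "Development", "Business", "Business", "Design"],
    "price": [0.0, 99.0, 49.0, 0.0, 20.0],
    "enrollment": [5000, 3000, 1200, 800, 400],
})

# Descriptive statistics for the numeric columns.
print(courses[["price", "enrollment"]].describe())

# Per-category summary, sorted so the most-enrolled categories come first.
summary = (
    courses.groupby("category")
    .agg(
        n_courses=("price", "size"),
        mean_price=("price", "mean"),
        total_enrollment=("enrollment", "sum"),
    )
    .sort_values("total_enrollment", ascending=False)
)
print(summary)
```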

📈 Visualizations

  • Employed bar charts, scatter plots, histograms, and heatmaps for trend discovery
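A hedged sketch of three of these chart types with Matplotlib follows. The data and column names are invented for illustration, and a heatmap (for example of a correlation matrix, via Seaborn) is left out to keep the example short.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample; the real notebook would use the cleaned dataset.
courses = pd.DataFrame({
    "category": ["Development", "Development", "Business", "Business", "Design"],
    "price": [0.0, 99.0, 49.0, 0.0, 20.0],
    "enrollment": [5000, 3000, 1200, 800, 400],
})

fig, (ax_bar, ax_hist, ax_scatter) = plt.subplots(1, 3, figsize=(15, 4))

# Bar chart: total enrollment per category.
totals = courses.groupby("category")["enrollment"].sum().sort_values(ascending=False)
ax_bar.bar(totals.index.tolist(), totals.to_numpy())
ax_bar.set_title("Enrollment by category")

# Histogram: distribution of course prices.
ax_hist.hist(courses["price"], bins=5)
ax_hist.set_title("Price distribution")
ax_hist.set_xlabel("Price")

# Scatter plot: price against enrollment.
ax_scatter.scatter(courses["price"], courses["enrollment"])
ax_scatter.set_title("Price vs. enrollment")
ax_scatter.set_xlabel("Price")
ax_scatter.set_ylabel("Enrollment")

fig.tight_layout()
```

In a Jupyter notebook the figure renders inline without the `Agg` line; calling `fig.savefig(...)` with a path of your choice writes it to disk.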

🔍 Insight Extraction

  • Identified seasonal trends, high-performing categories, and price-enrollment patterns
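One way to approach the seasonal and price-enrollment questions is sketched below. It assumes a hypothetical `published` date column, so the monthly view groups courses by publication month, which is only a proxy for when enrollments actually happened. The price bands are arbitrary choices for the example.

```python
import pandas as pd

# Made-up rows; `published`, `price`, and `enrollment` are assumed column names.
courses = pd.DataFrame({
    "published": pd.to_datetime(
        ["2016-01-15", "2016-06-20", "2016-06-25", "2016-12-01", "2017-06-05"]
    ),
    "price": [0.0, 20.0, 95.0, 50.0, 150.0],
    "enrollment": [4000, 1500, 900, 2500, 300],
})

# Seasonal view: total enrollment grouped by calendar month of publication.
by_month = courses.groupby(courses["published"].dt.month)["enrollment"].sum()
print(by_month)

# Price-enrollment view: median enrollment within each price band.
bands = pd.cut(
    courses["price"],
    bins=[-0.01, 0, 50, 100, 200],
    labels=["free", "1-50", "51-100", "101-200"],
)
by_band = courses.groupby(bands, observed=True)["enrollment"].median()
print(by_band)
```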

🧰 Technologies & Tools

| Tool | Purpose |
|------|---------|
| 🐍 Python | Core language for analysis |
| 📊 Pandas | Data manipulation and transformation |
| ➕ NumPy | Numerical computations |
| 📈 Matplotlib | Static visualizations |
| 🧠 Seaborn | Statistical and aesthetic plotting |
| 📓 Jupyter | Interactive development environment |

📂 Project Structure

Udemy-Courses-Data-Analysis/
├── data/
│   └── udemy_courses.csv          # Raw dataset
├── notebooks/
│   └── udemy_data_analysis.ipynb  # Main Jupyter notebook
├── visuals/
│   └── *.png                      # Generated plots
└── README.md                      # Project documentation

📥 Dataset Source

🔗 Udemy Courses Dataset on Kaggle

Provided by: Nikhil Mittal

More than 3,500 courses with attributes such as title, category, price, rating, and enrollment

📌 Key Insights (Summary)

  • ✔️ The most popular course categories include Development, Business, and IT & Software
  • ✔️ Free courses are abundant, but paid courses account for the majority of revenue
  • ✔️ Ratings and enrollments are positively correlated
  • ✔️ Technical courses such as data science or coding often command higher prices
  • ✔️ Seasonal trends show spikes in new enrollments during mid-year and year-end sales
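The correlation and revenue findings could be checked along these lines. This is a sketch on invented numbers with assumed column names. The revenue figure multiplies list price by enrollment, which ignores discounts and refunds and should be read as a rough proxy only.

```python
import pandas as pd

# Invented values; `price`, `rating`, and `enrollment` are assumed column names.
courses = pd.DataFrame({
    "price": [0.0, 0.0, 20.0, 50.0, 100.0],
    "rating": [3.9, 4.2, 4.4, 4.6, 4.8],
    "enrollment": [900, 1500, 2000, 2600, 3100],
})

# Pearson correlation between rating and enrollment (ranges from -1 to 1).
r = courses["rating"].corr(courses["enrollment"])
print(f"rating vs. enrollment correlation: {r:.2f}")

# Rough revenue proxy: list price times enrollment (ignores discounts).
courses["est_revenue"] = courses["price"] * courses["enrollment"]
courses["is_paid"] = courses["price"] > 0

# Compare free and paid courses by count and estimated revenue.
by_type = courses.groupby("is_paid").agg(
    n_courses=("price", "size"),
    est_revenue=("est_revenue", "sum"),
)
print(by_type)
```

A positive `r` indicates that higher-rated courses tend to have more enrollments in the sample; it does not show that ratings cause enrollments.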

📝 Conclusion

This analysis showcases how Python + Data Science can be used to generate meaningful insights from educational platform data.

  • ✅ Supports course creators in optimizing content
  • ✅ Aids platform managers in strategic planning
  • ✅ Helps learners find high-value and well-rated courses

"The best way to predict the future is to analyze the data from the past." — Inspired by Peter Drucker

📜 License

This project is licensed under the MIT License. Feel free to reuse and modify it for personal or commercial projects with credit.

🤝 Acknowledgements

📊 Dataset by Nikhil Mittal on Kaggle

💡 Inspiration from the growing online education industry

👨‍💻 Author

Abinesh M

📧 mabinesh555@email.com

🌐 LinkedIn

💻 GitHub

🌟 Show Your Support

If you liked this project:

⭐ Star this repo

🍴 Fork it

🛠️ Suggest improvements

📢 Share with the community