arpitkanani/EDA

📘 Exploratory Data Analysis (EDA) Projects

A collection of EDA projects performed on multiple datasets.
Each project includes data cleaning, preprocessing, visualization, feature engineering, and insights.


📂 Datasets Used (Short Descriptions)

1. Red Wine Quality Dataset

Contains chemical properties of red wine such as acidity, sulphates, alcohol, and a quality score (0–10).
Used to identify which features influence wine quality.

2. Flight Price Dataset

Includes airline, route, duration, date of journey, total stops, and ticket price.
Used to understand factors affecting airfare.

3. Student Performance Dataset

Contains demographic information, parental education, exam preparation, and exam scores (math, reading, writing).
Used to analyze patterns in student achievement.

4. Google Play Store Dataset

Includes app details such as category, rating, reviews, installs, size, price, and content rating.
Used to study app trends and user engagement.


📊 Project Summaries

1. Red Wine Quality – EDA

  • Cleaned the dataset and checked for missing values
  • Analyzed chemical feature distributions
  • Visualized correlations using heatmap and scatter plots
  • Insight: Higher alcohol content is associated with higher quality scores; high volatile acidity is associated with lower quality scores
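
The missing-value check and correlation step above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the rows are made-up toy values, and the column names are assumed to match the public red wine dataset.

```python
import pandas as pd

# Hypothetical toy rows; values are invented for illustration only.
# Column names are assumed to follow the public red wine dataset.
wine = pd.DataFrame({
    "alcohol": [9.4, 9.8, 10.5, 11.2, 12.0, 12.8],
    "volatile acidity": [0.70, 0.88, 0.60, 0.45, 0.36, 0.30],
    "sulphates": [0.56, 0.68, 0.65, 0.58, 0.75, 0.80],
    "quality": [5, 5, 5, 6, 7, 7],
})

# Count missing values in each column
missing = wine.isnull().sum()

# Pearson correlation of each feature with the quality score
corr_with_quality = wine.corr()["quality"].drop("quality").sort_values()
print(corr_with_quality)
```

The full matrix from `wine.corr()` is what a Seaborn heatmap would typically be drawn from, for example with `sns.heatmap(wine.corr(), annot=True)`.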

2. Flight Price – EDA + Encoding

  • Extracted journey and time-based features
  • Applied Label and One-Hot Encoding
  • Visualized price variation by airline, route, duration
  • Insight: Airline and number of stops have the strongest effect on ticket price
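
A rough sketch of the feature extraction and encoding steps listed above, using pandas only. The column names, date format, and sample rows are assumptions for illustration. The explicit stop-count mapping stands in for a label encoder here; a library encoder such as scikit-learn's `LabelEncoder` would assign integer codes in sorted order instead.

```python
import pandas as pd

# Hypothetical rows; column names and the DD/MM/YYYY format are assumptions.
flights = pd.DataFrame({
    "Airline": ["IndiGo", "Air India", "IndiGo"],
    "Date_of_Journey": ["24/03/2019", "01/05/2019", "09/06/2019"],
    "Total_Stops": ["non-stop", "2 stops", "1 stop"],
    "Price": [3897, 7662, 5200],
})

# Extract day and month from the journey date
journey = pd.to_datetime(flights["Date_of_Journey"], format="%d/%m/%Y")
flights["Journey_Day"] = journey.dt.day
flights["Journey_Month"] = journey.dt.month

# Integer (label-style) encoding for an ordered feature via an explicit mapping
stops_order = {"non-stop": 0, "1 stop": 1, "2 stops": 2}
flights["Total_Stops_Encoded"] = flights["Total_Stops"].map(stops_order)

# One-hot encoding for a nominal feature with no natural order
encoded = pd.get_dummies(flights, columns=["Airline"], prefix="Airline")
print(encoded.columns.tolist())
```

Mapping stops by hand keeps the integer codes in a meaningful order, while one-hot encoding avoids implying any order among airlines.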

3. Student Performance – EDA

  • Checked score distributions across subjects
  • Compared performance based on gender, race, parental education
  • Created histograms, boxplots, and pairplots
  • Insight: Students who completed test preparation, and students whose parents have a higher level of education, tend to score higher
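
The group comparison described above could look something like this. It is a sketch on invented rows; the column names are assumed to follow the public student performance dataset and may differ from the notebook.

```python
import pandas as pd

# Hypothetical rows; scores are invented for illustration only.
students = pd.DataFrame({
    "test preparation course": ["none", "completed", "none", "completed"],
    "math score": [60, 75, 50, 88],
    "reading score": [62, 80, 55, 90],
    "writing score": [58, 79, 51, 92],
})

score_cols = ["math score", "reading score", "writing score"]

# Average the three subject scores for each student
students["average score"] = students[score_cols].mean(axis=1)

# Compare mean average score between preparation groups
by_prep = students.groupby("test preparation course")["average score"].mean()
print(by_prep)
```

The same `groupby` pattern applies to gender, race, or parental education by swapping the grouping column.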

4. Google Play Store – EDA + Feature Engineering

  • Cleaned install count, reviews, size, and price fields
  • Engineered new features: size groups, install buckets, price groups
  • Visualized category distribution, rating trends, install patterns
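
As an illustration of the cleaning and bucketing steps above, here is a minimal sketch. The raw string formats (`"1,000+"`, `"19M"`, `"$4.99"`, `"Varies with device"`), the bucket boundaries, and the helper `size_to_mb` are assumptions chosen for the example, not the notebook's actual choices.

```python
import pandas as pd

# Hypothetical rows; the raw string formats are assumptions.
apps = pd.DataFrame({
    "App": ["app_a", "app_b", "app_c"],
    "Installs": ["1,000+", "500,000+", "10,000,000+"],
    "Size": ["19M", "512k", "Varies with device"],
    "Price": ["0", "$4.99", "0"],
})

# Strip "+" and "," from install counts, then convert to integers
apps["Installs"] = apps["Installs"].str.replace(r"[+,]", "", regex=True).astype(int)

# Strip the currency symbol and convert prices to floats
apps["Price"] = apps["Price"].str.replace("$", "", regex=False).astype(float)

def size_to_mb(text):
    """Hypothetical helper: convert '19M' or '512k' to megabytes, else NaN."""
    if text.endswith("M"):
        return float(text[:-1])
    if text.endswith("k"):
        return float(text[:-1]) / 1024
    return float("nan")

apps["Size_MB"] = apps["Size"].apply(size_to_mb)

# Bucket install counts into groups; the boundaries are arbitrary examples
apps["Install_Bucket"] = pd.cut(
    apps["Installs"],
    bins=[0, 10_000, 1_000_000, float("inf")],
    labels=["low", "medium", "high"],
)
print(apps)
```

Size and price groups can be built the same way by applying `pd.cut` to `Size_MB` and `Price`.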

✔ Key Questions Answered

1. Which category has the largest number of installations?
The Games and Communication categories have the highest total install counts.

2. What are the top 5 most installed apps in each major category?
Extracted the top apps per category (e.g., WhatsApp, Messenger, Subway Surfers).

3. How many apps have a perfect 5 rating?
Only a small number of apps achieve a perfect 5.0 rating.
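
The three questions above map onto short pandas aggregations. The sketch below assumes the install counts are already numeric and uses invented placeholder apps, so the printed numbers do not reflect the real dataset.

```python
import pandas as pd

# Hypothetical cleaned rows; app names and values are placeholders.
apps = pd.DataFrame({
    "App": ["app_a", "app_b", "app_c", "app_d", "app_e"],
    "Category": ["GAME", "GAME", "COMMUNICATION", "TOOLS", "COMMUNICATION"],
    "Installs": [1000, 5000, 3000, 100, 2000],
    "Rating": [4.5, 5.0, 4.0, 5.0, 3.9],
})

# 1. Total installs per category, largest first
installs_by_category = (
    apps.groupby("Category")["Installs"].sum().sort_values(ascending=False)
)

# 2. Top 5 most installed apps within each category
top_per_category = (
    apps.sort_values("Installs", ascending=False).groupby("Category").head(5)
)

# 3. Number of apps with a perfect 5.0 rating
perfect_count = int((apps["Rating"] == 5.0).sum())

print(installs_by_category)
print(top_per_category[["Category", "App", "Installs"]])
print(perfect_count)
```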


5. Algerian Forest Fire Dataset – EDA + Feature Engineering

  • Further details are in the project file.
  • For the code of the end-to-end project that predicts FWI (Fire Weather Index) from this dataset, see the forest-fire repo.

🧠 Skills Demonstrated

  • Data Cleaning & Preprocessing
  • Missing Value Handling
  • Feature Engineering
  • Label & One-Hot Encoding
  • Data Visualization (Matplotlib, Seaborn)
  • Grouping, Aggregation & Insight Extraction
  • Exploratory Data Narratives
