ShreenidhiBD/Automobile-Data-Analysis

🚗 Automobile Analytics System

📌 Project Overview

This project performs end-to-end analysis of automobile data in Python. It cleans, processes, analyzes, and visualizes automobile industry data to produce business insights.

The project includes:

  • Data Cleaning & Preprocessing
  • Exploratory Data Analysis (EDA)
  • Feature Engineering
  • Outlier Detection
  • Data Visualization
  • Business Insights Generation

🛠 Technologies Used

  • Python
  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • Jupyter Notebook

📂 Project Structure

Automobile-Analytics-System/
│
├── data/
│   ├── Automobile_data.csv
│   └── cleaned_automobile_data.csv
│
├── notebooks/
│   └── automobile_analysis.ipynb
│
├── visuals/
│   ├── correlation_heatmap.png
│   ├── price_distribution.png
│   ├── company_boxplot.png
│   ├── horsepower_vs_price.png
│   └── fuel_type_count.png
│
├── reports/
│   └── business_insights.txt
│
├── requirements.txt
└── README.md

📊 Features

✅ Data Cleaning

  • Handled missing values
  • Replaced invalid values
  • Removed duplicate records
  • Converted object data types into numerical format
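The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the `?` marker for invalid values, the column names, the sample rows, and the choice of median imputation are all assumptions.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for data/Automobile_data.csv.
# Column names and values are illustrative assumptions.
raw = pd.DataFrame({
    "company": ["bmw", "toyota", "toyota", "honda", "audi"],
    "horsepower": ["182", "62", "62", "?", "110"],
    "price": ["36880", "5348", "5348", "7295", "?"],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the four cleaning steps to a copy of df."""
    out = df.replace("?", np.nan)   # replace the assumed invalid marker
    out = out.drop_duplicates()     # remove duplicate records
    for col in ["horsepower", "price"]:
        # Convert object/string columns to numeric; bad values become NaN.
        out[col] = pd.to_numeric(out[col], errors="coerce")
        # Handle missing values (median imputation is an assumption).
        out[col] = out[col].fillna(out[col].median())
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned.dtypes)
print(cleaned)
```

Median imputation is only one option; dropping incomplete rows may suit a small dataset better.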

✅ Exploratory Data Analysis (EDA)

  • Average vehicle price analysis
  • Company-wise price comparison
  • Fuel type analysis
  • Horsepower analysis
  • Correlation analysis
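A hedged sketch of how these analyses might look in Pandas. The DataFrame below is a made-up sample, and the column names `company` and `fuel-type` are assumptions about the dataset's schema.

```python
import pandas as pd

# Illustrative rows only; not taken from the project's dataset.
df = pd.DataFrame({
    "company": ["bmw", "bmw", "toyota", "toyota", "honda"],
    "fuel-type": ["gas", "gas", "gas", "diesel", "gas"],
    "horsepower": [182, 121, 62, 56, 76],
    "price": [36880.0, 20970.0, 5348.0, 7898.0, 7295.0],
})

# Average vehicle price across the whole sample.
avg_price = df["price"].mean()

# Company-wise price comparison, most expensive first.
price_by_company = (
    df.groupby("company")["price"].mean().sort_values(ascending=False)
)

# Fuel type analysis: number of vehicles per fuel type.
fuel_counts = df["fuel-type"].value_counts()

# Correlation analysis (Pearson by default) between numeric columns.
corr = df[["horsepower", "price"]].corr()

print(f"Average price: {avg_price:.2f}")
print(price_by_company)
print(fuel_counts)
print(corr)
```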

✅ Feature Engineering

  • Price category creation
  • Price per horsepower calculation
  • Mileage analysis
  • Performance categorization
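As an illustration of the derived features listed above, the sketch below bins price and horsepower with `pd.cut` and computes a price-per-horsepower ratio. The bin edges, category labels, and new column names are assumptions chosen for the example; mileage analysis is not covered here.

```python
import pandas as pd

# Illustrative sample; values are not from the project's dataset.
df = pd.DataFrame({
    "horsepower": [62, 76, 121, 182],
    "price": [5348.0, 7295.0, 20970.0, 36880.0],
})

# Price category creation. The bin edges are hypothetical thresholds.
price_bins = [0, 10_000, 25_000, float("inf")]
df["price_category"] = pd.cut(
    df["price"], bins=price_bins, labels=["Budget", "Mid-range", "Luxury"]
)

# Price per horsepower calculation.
df["price_per_hp"] = df["price"] / df["horsepower"]

# Performance categorization from horsepower (thresholds are assumptions).
hp_bins = [0, 100, 150, float("inf")]
df["performance"] = pd.cut(
    df["horsepower"], bins=hp_bins, labels=["Low", "Medium", "High"]
)

print(df)
```

`pd.cut` uses right-inclusive intervals by default, so a price of exactly 10,000 would fall in the first bin under these edges.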

✅ Outlier Detection

  • Boxplot analysis
  • IQR method for detecting extreme price values
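The IQR method flags values that fall more than a multiple of the interquartile range below the first quartile or above the third. A minimal sketch, assuming the conventional multiplier of 1.5 and a made-up price series (the function name `iqr_outliers` is hypothetical):

```python
import pandas as pd

# Illustrative prices; the last value is deliberately extreme.
prices = pd.Series(
    [5348, 6295, 7295, 7898, 9095, 10295, 36880],
    name="price",
    dtype="float64",
)


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return values[(values < lower) | (values > upper)]


outliers = iqr_outliers(prices)
print(outliers)
```

For this sample, Q1 is 6795 and Q3 is 9695, so the upper fence is 14045 and only the 36880 entry is flagged.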

✅ Data Visualization

  • Histogram
  • Correlation Heatmap
  • Scatter Plot
  • Box Plot
  • Count Plot
  • Pair Plot
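Two of the plot types above, a histogram and a scatter plot, can be sketched with Matplotlib alone. This is an illustrative example rather than the notebook's code: the data is made up, the output filename `price_plots.png` is hypothetical, and Seaborn is left out to keep the sketch dependency-light.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # headless backend for scripts; omit inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative sample; values are not from the project's dataset.
df = pd.DataFrame({
    "horsepower": [56, 62, 76, 110, 121, 182],
    "price": [7898.0, 5348.0, 7295.0, 13950.0, 20970.0, 36880.0],
})

fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: distribution of vehicle prices.
ax_hist.hist(df["price"], bins=5, edgecolor="black")
ax_hist.set_title("Price Distribution")
ax_hist.set_xlabel("Price")
ax_hist.set_ylabel("Count")

# Scatter plot: horsepower against price.
ax_scatter.scatter(df["horsepower"], df["price"])
ax_scatter.set_title("Horsepower vs Price")
ax_scatter.set_xlabel("Horsepower")
ax_scatter.set_ylabel("Price")

fig.tight_layout()
out_path = Path("price_plots.png")  # hypothetical output file
fig.savefig(out_path, dpi=100)
```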

📈 Key Business Insights

  1. BMW and Mercedes-Benz vehicles have the highest average prices in the dataset.

  2. Toyota and Honda cars are more budget-friendly than luxury brands.

  3. Horsepower has a strong positive relationship with vehicle price.

  4. Diesel vehicles deliver better mileage than gas vehicles.

  5. Luxury vehicles contain more pricing outliers.

  6. Sedan body style is the most common vehicle category in the dataset.

  7. Larger engine size is generally associated with higher vehicle price and performance.

  8. Mid-range vehicles make up the largest price segment in the dataset.

  9. Gas fuel type vehicles are more common than diesel vehicles.

  10. Correlation analysis shows that horsepower, engine-size, and curb-weight are strongly correlated with car price.


📷 Project Visualizations

🔥 Correlation Heatmap

Correlation Heatmap (visuals/correlation_heatmap.png)


📊 Price Distribution

Price Distribution (visuals/price_distribution.png)


🚗 Horsepower vs Price

Horsepower vs Price (visuals/horsepower_vs_price.png)


📦 Company-wise Price Distribution

Company Boxplot (visuals/company_boxplot.png)


⛽ Fuel Type Count

Fuel Type Count (visuals/fuel_type_count.png)


▶️ How to Run the Project

Install Required Libraries

pip install -r requirements.txt

Launch Jupyter Notebook, then open notebooks/automobile_analysis.ipynb

jupyter notebook

📌 Skills Demonstrated

  • Python Programming
  • Data Cleaning
  • Exploratory Data Analysis
  • Data Visualization
  • Statistical Analysis
  • Feature Engineering
  • Business Analytics
  • Problem Solving

👩‍💻 Author

Shreenidhi B D

About

End-to-end automobile analytics project using Python, Pandas, NumPy, Matplotlib, and Seaborn for data cleaning, visualization, feature engineering, and business insights generation.
