GaneshN523/Data_Analysis_Toolkit

Advanced Data Analysis & Cleaning Toolkit

This project is an end-to-end toolkit for data cleaning, transformation, exploratory data analysis (EDA), and interactive visualization, built on a deliberately small set of dependencies.

Features

  • Data Cleaning Enhancements:

    • Handling missing values (drop rows/columns and impute numeric values).
    • Duplicate removal.
    • Outlier detection (flagging using IQR or Z-score).
    • Automatic data type correction with basic label encoding.
    • (Optional) Feature scaling with scikit-learn's StandardScaler (standardization) or MinMaxScaler (min-max normalization).
  • Data Transformation & Preprocessing:

    • Feature engineering with date/time handling (extract month, day, weekday).
    • Creation of polynomial (squared) features for numeric columns.
  • Exploratory Data Analysis (EDA):

    • Summary statistics and correlation matrices.
    • Advanced analysis: KMeans clustering and PCA.
  • Interactive Visualizations:

    • Interactive histogram and heatmap using Plotly.
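
The two outlier-flagging rules named under Data Cleaning (IQR and Z-score) can be sketched as follows. This is a minimal, dependency-free illustration and not the toolkit's actual code: the function names `flag_outliers_iqr` and `flag_outliers_zscore`, the `k=1.5` fence multiplier, and the default threshold of 3.0 are assumptions chosen because they are common conventions. It works on a plain list of numbers, whereas the toolkit presumably applies the same idea to the numeric columns of a table.

```python
import statistics


def flag_outliers_iqr(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (hypothetical helper)."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


def flag_outliers_zscore(values, threshold=3.0):
    """Flag values whose absolute Z-score exceeds threshold (hypothetical helper)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return [False] * len(values)
    return [abs((v - mean) / std) > threshold for v in values]


# Illustrative data only: seven similar readings and one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 95]
iqr_flags = flag_outliers_iqr(data)
z_flags = flag_outliers_zscore(data, threshold=2.0)
print(iqr_flags)
print(z_flags)
```

Both functions only flag values and leave the decision to drop or keep them to the caller, which matches the "flagging" wording above. The Z-score call uses a lower threshold of 2.0 here because, in a sample this small, a single extreme value inflates the standard deviation enough that it does not reach the conventional cutoff of 3.0. The IQR rule is less sensitive to that effect.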

Directory Structure
