DenisKiberaWanjohi/Data-Cleaning-Project

🧹 Data Cleaning Project: Analyzing and Cleaning a Customer Dataset

📘 Overview

This project focuses on analyzing and cleaning a customer dataset to ensure data quality, accuracy, and consistency before further analysis. The dataset contains information such as customer names, ages, email addresses, and purchase history.
Using Python and Pandas, this project demonstrates practical data cleaning techniques that prepare raw customer data for analytics, visualization, and business insights.


🧠 Project Objectives

  • Identify and handle missing values appropriately.
  • Remove duplicate records to maintain data uniqueness.
  • Standardize data types for accurate computations.
  • Extract meaningful features from existing data fields.
  • Prepare a structured, analysis-ready dataset for downstream use.
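
The first three objectives can be sketched in Pandas as follows. This is a minimal illustration, not the project's actual code: the column names (`name`, `age`, `email`), the sample rows, and the policies chosen (drop rows without an email, fill missing ages with the median, keep the first duplicate) are all assumptions made for the example.

```python
import pandas as pd


def clean_customers(df: pd.DataFrame) -> pd.DataFrame:
    """Apply basic cleaning steps to a customer table (hypothetical columns)."""
    out = df.copy()

    # Normalise text so duplicates differing only by case or whitespace match.
    out["name"] = out["name"].str.strip()
    out["email"] = out["email"].str.strip().str.lower()

    # Standardise the age column; unparseable entries become NaN.
    out["age"] = pd.to_numeric(out["age"], errors="coerce")

    # Missing values: drop rows without an email (assumed critical field),
    # then fill missing ages with the median (one possible policy).
    out = out.dropna(subset=["email"]).copy()
    out["age"] = out["age"].fillna(out["age"].median())

    # Duplicates: keep the first record per email address.
    out = out.drop_duplicates(subset=["email"], keep="first")

    # With no gaps left, age can be stored as an integer.
    out["age"] = out["age"].astype("int64")
    return out.reset_index(drop=True)


# Made-up sample data for illustration only.
raw = pd.DataFrame({
    "name": [" Alice ", "Bob", "Bob", "Carol", "Dan"],
    "age": ["34", "41", "41", "n/a", "29"],
    "email": ["Alice@Example.com", "bob@example.org", "bob@example.org ",
              "carol@example.com", None],
})
cleaned = clean_customers(raw)
print(cleaned)
```

Whether to drop or impute a missing value depends on the field: an imputed age is usually tolerable for aggregate analysis, while a record with no contact key is often unusable.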

βš™οΈ Tools & Libraries

| Category | Tools / Libraries |
| --- | --- |
| Programming Language | Python |
| Data Manipulation | Pandas, NumPy |
| Data Validation | Regex, datetime |
| File Handling | CSV |
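
As a rough sketch of how the validation tools listed above might be used, the snippet below checks email format with `re` and parses dates with `datetime`. The pattern, the helper names `is_valid_email` and `parse_date`, and the `%Y-%m-%d` date format are assumptions for illustration; the pattern is deliberately simple and is not a full RFC 5322 validator.

```python
import re
from datetime import datetime
from typing import Optional

# Simple shape check: something@something.something, no spaces or extra "@".
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(value: str) -> bool:
    """Return True if the value looks like an email address."""
    return bool(EMAIL_PATTERN.match(value.strip()))


def parse_date(value: str, fmt: str = "%Y-%m-%d") -> Optional[datetime]:
    """Parse a date string, returning None when it does not match the format."""
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        return None


print(is_valid_email("alice@example.com"))
print(is_valid_email("not-an-email"))
print(parse_date("2024-02-30"))
```

Returning `None` for an invalid date, rather than raising, lets a cleaning script flag bad rows and continue.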

📈 Key Results

  • Removed all duplicates and standardized column data types.
  • Added derived features such as Domain and Total Purchases to enrich the dataset.
  • Ensured 100% completeness across critical fields for analytics readiness.
  • Delivered a clean, structured dataset ideal for visualization or predictive modeling.
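
The derived features and the completeness check can be illustrated with a short sketch. It assumes `Domain` is the part of the email address after the `@` and that purchase history is stored as a semicolon-separated string of order IDs; the `purchase_history` column, the `count_purchases` helper, and the sample rows are hypothetical, and the real dataset may store this differently.

```python
import pandas as pd

# Made-up sample data for illustration only.
customers = pd.DataFrame({
    "email": ["alice@example.com", "bob@example.org", "carol@example.com"],
    "purchase_history": ["A100;A101;A102", "B200", ""],
})

# Domain: everything after the "@" in the email address.
customers["Domain"] = customers["email"].str.split("@").str[-1]


def count_purchases(history: str) -> int:
    """Count non-empty entries in a semicolon-separated history string."""
    return len([item for item in history.split(";") if item.strip()])


# Total Purchases: number of orders recorded for each customer.
customers["Total Purchases"] = customers["purchase_history"].apply(count_purchases)

# Completeness: percentage of non-missing values per critical column.
critical = ["email", "Domain", "Total Purchases"]
completeness = customers[critical].notna().mean() * 100
print(customers)
print(completeness)
```

Reporting completeness per column makes a claim like "100% complete" checkable, and it shows exactly which field falls short when the figure drops.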

🚀 Learning Outcomes

  • Strengthened data wrangling skills using Pandas.
  • Practiced real-world cleaning techniques for customer datasets.
  • Demonstrated the importance of feature engineering in data preparation.

About

This project cleans and structures a customer dataset using Python and Pandas. It involves identifying and handling missing values, removing duplicates, standardizing data types, and engineering new features such as email domains and total purchase counts. The result is a clean, consistent, analysis-ready dataset.
