Data Analytics Course Repository

Welcome to my Data Analytics Course Repository! This repository collects the resources, exercises, and projects from a comprehensive data analytics course.

Table of Contents

Course Overview

This course covers a wide range of data analytics topics, offering in-depth knowledge and practical exercises in each module. Here's a brief overview of what you can expect from each part of the course:

Data Massaging and Data Visualization

Exploratory Data Analysis (EDA)

Numpy and Monte Carlo Simulation

Machine Learning

  • Location: Section: Machine Learning
  • Explore the fundamental concepts of modeling data.
  • Practice data cleaning and preparation, with a strong focus on data quality and readiness for analysis.
  • Gain insight into Machine Learning concepts, including data modeling and imputation.
  • Understand the basics of regression, classification, and clustering.
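
As a small illustration of the imputation idea mentioned above, here is a minimal sketch in plain Python. The function name `impute_mean` and the sample values are hypothetical and are not taken from the course notebooks, which may well use pandas or scikit-learn instead. Mean imputation is only one of several possible strategies.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical sample column with two missing entries.
ages = [20, None, 40, 30, None]
print(impute_mean(ages))
```

Filling with the mean keeps the column average unchanged but shrinks its variance, so it is best treated as a quick baseline rather than a default choice.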

Time Series

  • Location: Section: Time Series
  • Explore time series concepts such as trend, cycle, and seasonality.
  • Predict and visualize time series data.
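
To make the terms trend and seasonality more concrete, the sketch below separates a short series into a moving-average trend and a per-position seasonal effect. It uses only plain Python. The helper names `moving_average` and `seasonal_means`, the period of 3, and the sample numbers are all assumptions made for illustration and do not come from the course material.

```python
def moving_average(series, window):
    """Centered moving average; window must be odd. Edge positions are None."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    out = [None] * len(series)
    for i in range(half, len(series) - half):
        out[i] = sum(series[i - half:i + half + 1]) / window
    return out


def seasonal_means(series, trend, period):
    """Average detrended value at each position within the seasonal period."""
    buckets = [[] for _ in range(period)]
    for i, (y, t) in enumerate(zip(series, trend)):
        if t is not None:  # skip edges where the trend is undefined
            buckets[i % period].append(y - t)
    return [sum(b) / len(b) if b else None for b in buckets]


# Hypothetical series: a rising trend plus a repeating pattern of length 3.
series = [10, 20, 30, 13, 23, 33, 16, 26, 36]
trend = moving_average(series, window=3)
season = seasonal_means(series, trend, period=3)
print(trend)
print(season)
```

Choosing a window equal to the seasonal period lets the moving average smooth out the repeating pattern, so what remains after subtracting the trend is mostly the seasonal effect.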

SQL

  • Location: Section: SQL
  • Dive into SQL, the database language, to explore and manipulate data.
  • Learn about relational database concepts and design.
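
The kind of query this section builds toward can be sketched with Python's built-in `sqlite3` module. The `sales` table, its columns, and the rows below are hypothetical, and SQLite is an assumption chosen only because it needs no setup; the course itself may use a different database engine.

```python
import sqlite3

# In-memory database so the example leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (branch TEXT, product_line TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("A", "Food", 100.0),
        ("A", "Health", 50.0),
        ("B", "Food", 200.0),
        ("B", "Food", 25.0),
        ("C", "Health", 75.0),
    ],
)

# Aggregate per branch: number of sales and total revenue, highest first.
rows = conn.execute(
    """
    SELECT branch, COUNT(*) AS n_sales, SUM(total) AS revenue
    FROM sales
    GROUP BY branch
    ORDER BY revenue DESC
    """
).fetchall()

for branch, n_sales, revenue in rows:
    print(branch, n_sales, revenue)

conn.close()
```

The same `GROUP BY` pattern carries over to most relational databases, although function names and type handling can differ between engines.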

Non-Relational Databases and Project Part 2

Data Visualization with Google Data Studio

Cloud Computing and Big Data with Spark and Pyarrow

Cloud Analytics in AWS

  • Location: Section: Cloud Analytics in AWS
  • Learn about cloud architecture in AWS and how to create a data lake.
  • Perform EDA using Python and SQL in AWS.
  • Present your findings and insights.

Course Projects

Throughout this course, I had the opportunity to work on three exciting projects, each demonstrating different aspects of data analytics. Here's a brief overview of each project:

Project 1: Exploratory Data Analysis (EDA)

  • Location: Project 1: Exploratory Data Analysis II

  • Description: In this project, I conducted an exploratory data analysis (EDA) using Jupyter Notebook. The dataset consisted of sales data for video games. I analyzed and visualized the data to uncover insights and trends. The final deliverable included a Google Slides presentation with a maximum of 7 slides. Each slide featured up to 2 graphics, accompanied by concise explanations (3-4 lines). The presentation concluded with key findings and insights.

Project 2: Supermarket Sales Analysis

  • Location: Project 2: Exploratory Data Analysis with SQL

  • Description: In this project, I delved into supermarket sales data using DBeaver. The dataset contained sales information for a supermarket. I performed data analysis to extract valuable insights. As part of the project, I created a presentation comprising up to 6 slides in Google Slides. The presentation highlighted and explained the analyses conducted.

Project 3: Cloud Analytics in AWS

  • Location: Project 3: Cloud Analytics in AWS

  • Description: In the Cloud Analytics project, I set up a simple Data Lake in AWS from a CSV file stored in an S3 bucket. I performed ETL (Extract, Transform, Load) operations and connected to the data. My analysis combined SQL commands for exploratory data analysis with Python scripting. To conclude, I created a dynamic dashboard with storytelling elements in Google Data Studio to communicate the project's insights.

These projects allowed me to apply the skills and knowledge gained throughout the course and gave me practical experience in data analysis and visualization. I would also like to highlight the activities from the Machine Learning modules, which focus on data cleaning and preparation to ensure data quality and readiness for analysis before building machine learning models.

Contributing

If you find errors, have suggestions, or want to contribute improvements to any part of this repository, please feel free to open issues or submit pull requests.