This project explores global COVID-19 deaths and vaccination data using SQL.
The analysis focuses on identifying infection rates, death counts, death percentages and vaccination progress across countries and continents. It also demonstrates practical SQL concepts commonly used in data analysis projects.
The main questions explored in this project include:
- How did total cases compare with total deaths?
- What percentage of the population was infected?
- Which countries had the highest infection rates?
- Which countries and continents had the highest death counts?
- How did vaccination totals grow over time?
This project uses public COVID-19 deaths and vaccination data.
Tables used:
- 'CovidDeaths'
- 'CovidVaccinations'
Typical fields used include:
- location
- date
- population
- total_cases
- new_cases
- total_deaths
- new_deaths
- new_vaccinations
- continent
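
As a rough sketch of how these fields combine, the query below derives a death percentage and an infection percentage for each location and date. It assumes the columns are stored as numeric types and that rows with a NULL `continent` are aggregate rows (such as world or continent totals) that should be excluded; neither assumption is confirmed by this README, so adjust to match the actual data.

```sql
-- Sketch only: assumes numeric columns in CovidDeaths and that
-- aggregate rows are marked by a NULL continent (an assumption).
SELECT
    location,
    date,
    population,
    total_cases,
    total_deaths,
    -- Multiply by 100.0 to force decimal arithmetic; NULLIF avoids division by zero.
    total_deaths * 100.0 / NULLIF(total_cases, 0) AS death_percentage,
    total_cases * 100.0 / NULLIF(population, 0)   AS pct_population_infected
FROM CovidDeaths
WHERE continent IS NOT NULL
ORDER BY location, date;
```

If the columns were imported as text, each one would need a `CAST` to a numeric type before the arithmetic.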
Tools and SQL concepts used:
- SQL
- Joins
- Common Table Expressions (CTEs)
- Temporary Tables
- Window Functions
- Aggregate Functions
- Views
This project includes:
- Filtering and sorting data
- Aggregate analysis
- Percentage calculations
- Grouping by country and continent
- Joining multiple tables
- Running totals with window functions
- CTEs
- Temporary tables
- View creation
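
Several of the techniques listed above (filtering, aggregation, percentage calculations, grouping, and view creation) can be illustrated together. The following is a minimal sketch, not the project's actual queries: the view name `ContinentDeathCounts` and the column aliases are hypothetical, and the `continent IS NOT NULL` filter assumes aggregate rows are marked that way.

```sql
-- Countries ranked by the peak share of their population infected.
SELECT
    location,
    population,
    MAX(total_cases) AS highest_case_count,
    MAX(total_cases) * 100.0 / NULLIF(population, 0) AS pct_population_infected
FROM CovidDeaths
WHERE continent IS NOT NULL          -- assumption: NULL marks aggregate rows
GROUP BY location, population
ORDER BY pct_population_infected DESC;

-- Hypothetical view storing death counts per continent for later reuse.
-- In SQL Server, CREATE VIEW must be the first statement in its batch.
CREATE VIEW ContinentDeathCounts AS
SELECT
    continent,
    SUM(new_deaths) AS total_death_count
FROM CovidDeaths
WHERE continent IS NOT NULL
GROUP BY continent;
```

Summing `new_deaths` per continent avoids the undercount that taking `MAX(total_deaths)` would give, since the maximum would reflect only the single worst-affected country in each continent.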
The SQL queries cover:
- Initial exploration of COVID-19 deaths and vaccination data
- Total cases vs total deaths
- Total cases vs population
- Countries with the highest infection rate
- Countries with the highest death count
- Continents with the highest death count
- Global summary numbers
- Rolling vaccination totals by location
- Percentage of population vaccinated
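
The last two items, rolling vaccination totals and percentage of population vaccinated, are where joins, window functions, and CTEs come together. One way this might look is sketched below; it assumes `CovidVaccinations` shares `location` and `date` columns with `CovidDeaths` (the join keys are not stated in this README), and the CTE name `PopVsVac` and the aliases are hypothetical.

```sql
-- Sketch: running vaccination total per location, then as a share of population.
WITH PopVsVac AS (
    SELECT
        d.continent,
        d.location,
        d.date,
        d.population,
        v.new_vaccinations,
        -- Running total that restarts for each location.
        SUM(v.new_vaccinations) OVER (
            PARTITION BY d.location
            ORDER BY d.date
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS rolling_vaccinations
    FROM CovidDeaths AS d
    JOIN CovidVaccinations AS v
        ON d.location = v.location   -- assumed join keys
       AND d.date = v.date
    WHERE d.continent IS NOT NULL    -- assumption: NULL marks aggregate rows
)
SELECT
    location,
    date,
    population,
    new_vaccinations,
    rolling_vaccinations,
    rolling_vaccinations * 100.0 / NULLIF(population, 0) AS pct_population_vaccinated
FROM PopVsVac
ORDER BY location, date;
```

The CTE is needed because a window-function alias cannot be reused in the same `SELECT` that defines it. Note that a running sum of `new_vaccinations` counts doses rather than people, so the resulting percentage can exceed 100 where multiple doses are given.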
Files:
- 'covid_data_exploration.sql' - SQL queries for the full analysis
Key findings:
- Infection rates varied significantly across countries.
- Total deaths were concentrated in a small number of countries and regions.
- Vaccination progress can be tracked effectively using joins and window functions.
- SQL can be used to turn public health data into clear analytical insights.
Author: Akhtar R. Khan