CO2 vs. Temperature Exercise (Databricks)

This repository contains Databricks exercises that ingest Global Temperature and Global Temperature By Country data from Kaggle and CO2 Emissions data from OWID, then transform them. The goal is to teach some basics of data wrangling and Spark by working through real-world questions:

Which countries are worst hit (highest temperature anomalies)?
Which countries are the biggest emitters?
What are some sensible ways to rank the “biggest polluters”?
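
As a rough illustration of why the ranking question is not trivial, the sketch below ranks the same countries by total emissions and by emissions per person, and computes a simple temperature anomaly as the difference between a recent mean and a baseline mean. It is plain Python rather than Spark so that it runs anywhere. The records, field names (`co2_mt`, `population_m`), and helper functions are hypothetical and are not taken from the exercise notebooks or the datasets.

```python
from statistics import mean

# Made-up illustrative records, NOT real data.
# co2_mt: emissions in million tonnes; population_m: population in millions.
emissions = [
    {"country": "A", "co2_mt": 1000.0, "population_m": 500.0},
    {"country": "B", "co2_mt": 400.0, "population_m": 50.0},
    {"country": "C", "co2_mt": 90.0, "population_m": 10.0},
]


def rank_by_total(rows):
    """Return country names ordered by total CO2 emitted, largest first."""
    ordered = sorted(rows, key=lambda r: r["co2_mt"], reverse=True)
    return [r["country"] for r in ordered]


def rank_by_per_capita(rows):
    """Return country names ordered by CO2 per person, largest first."""
    ordered = sorted(
        rows,
        key=lambda r: r["co2_mt"] / r["population_m"],  # tonnes per person
        reverse=True,
    )
    return [r["country"] for r in ordered]


def temperature_anomaly(baseline_temps, recent_temps):
    """Difference between the recent mean and the baseline mean temperature."""
    return mean(recent_temps) - mean(baseline_temps)


print(rank_by_total(emissions))
print(rank_by_per_capita(emissions))
print(temperature_anomaly([10.0, 10.2, 9.8], [11.0, 11.2]))
```

With these made-up numbers the two rankings are exact opposites: country A emits the most in total but the least per person. Which metric counts as "sensible" is one of the questions the exercise leaves open.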

Data Sources

To answer some of the exercise questions, we picked open data from Our World in Data (OWID) and Kaggle.

The specific datasets:

Data Sources (Modified!)

Because the point of this exercise is to learn how to work with messy data, and the datasets from OWID and Kaggle are too clean and curated for that, a set of dirtied data is provided.

They can be found at:

Prerequisites

Basic knowledge of Python
Basic knowledge of Spark
Databricks Community Edition (free) account

Create a Databricks Community Account

Navigate to the Databricks Community Login Page
Click Signup

Fill in your details, click "Get Started For Free"
SCROLL TO THE BOTTOM to create a Community Account

Data Ingestion

Clone this repo if you haven't already done so
Open Data Ingestion CO2 vs Temperature.py in Databricks Community Edition
Follow the instructions and move on to the next exercise once all tests pass.
Solutions can be found here.

Data Transformation

Clone this repo if you haven't already done so
Open Data Transformation CO2 vs Temperature.py in Databricks Community Edition
Follow the instructions and move on to the next exercise once all tests pass.
Solutions can be found here.

