Build a pipeline that extracts data from multiple sources (e.g., an API, CSV files, or a database), transforms it with dbt (e.g., cleaning, enrichment), and loads it into a data warehouse (e.g., PostgreSQL, Microsoft SQL Server, Azure Data Lake, MongoDB, etc.).
Pipeline Structure
- Extract: Use Airflow to pull raw data from a source (e.g., API or database) and load it into a raw/staging schema in your data warehouse.
- Transform: Use dbt to transform the staged data into analytics-ready models.
- Load: the dbt models are materialized directly in the warehouse, so the transform step effectively is the load into the final tables (see the DAG sketch after this list).
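A minimal sketch of how such a DAG could be wired together, assuming a recent Airflow 2.x install; the file name, API URL, warehouse connection string, table name, and dbt project path below are placeholders, not values from this repository:

```python
# dags/etl_pipeline.py -- illustrative only; adjust names, paths, and credentials to your project.
from datetime import datetime

import pandas as pd
import requests
from sqlalchemy import create_engine

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator

# Placeholder values -- replace with your API endpoint, warehouse URI, and dbt project location.
API_URL = "https://example.com/api/orders"
WAREHOUSE_URI = "postgresql+psycopg2://user:password@localhost:5432/warehouse"
DBT_PROJECT_DIR = "/opt/airflow/dbt"


def extract_to_staging() -> None:
    """Pull raw data from the source API and load it into the raw/staging schema."""
    records = requests.get(API_URL, timeout=30).json()
    df = pd.DataFrame(records)
    engine = create_engine(WAREHOUSE_URI)
    df.to_sql("orders_raw", engine, schema="staging", if_exists="replace", index=False)


with DAG(
    dag_id="etl_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; use schedule_interval on older versions
    catchup=False,
) as dag:
    extract = PythonOperator(
        task_id="extract_to_staging",
        python_callable=extract_to_staging,
    )

    # dbt materializes the analytics-ready models from the staging schema.
    transform = BashOperator(
        task_id="dbt_run",
        bash_command=f"dbt run --project-dir {DBT_PROJECT_DIR}",
    )

    extract >> transform  # extract first, then transform/load with dbt
```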
Steps for Airflow Configuration:
- Move into the "etl" folder.
- Make sure you already have a database URL where Airflow metadata (such as users, roles, and DAGs) will be stored.
- Create a .env file that holds AIRFLOW__DATABASE__SQL_ALCHEMY_CONN and the user details (a sketch follows this list).
- Make the file "init_airflow.sh" (which initializes the database and creates a new user with access to the Airflow web UI) executable by running: chmod +x init_airflow.sh
- To initialize the Airflow database, run: ./init_airflow.sh 0
- To create a new user for the Airflow web UI, run: ./init_airflow.sh 1
- To list all users, run: airflow users list
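The exact contents of the .env file depend on what init_airflow.sh reads; a sketch with placeholder values (AIRFLOW__DATABASE__SQL_ALCHEMY_CONN is the standard Airflow setting, while the user variable names below are assumptions, not names defined by this repository):

```
# .env -- placeholder values; replace with your own metadata database URL and admin user details.
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@localhost:5432/airflow_meta
AIRFLOW_ADMIN_USERNAME=admin
AIRFLOW_ADMIN_PASSWORD=change-me
AIRFLOW_ADMIN_EMAIL=admin@example.com
```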
Run Airflow:
- Run the Airflow web UI: airflow webserver (or airflow webserver --port 8080 to set the port explicitly)
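Note that the webserver only serves the UI; for DAGs to actually execute, the Airflow scheduler must also be running, typically in a separate terminal:

```bash
airflow webserver --port 8080
airflow scheduler
```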
Metabase is an open-source business intelligence application developed to make data visualization and analysis simple for users without a high level of technical expertise. In the "bi" folder, Metabase is used to generate interactive dashboards and reports that provide actionable insight and support data-driven decision-making throughout the company. Its intuitive interface makes it easy to share findings, query databases, and monitor key metrics in real time.
The "bi" folder contains steps and instructions for installing and running Metabase locally.