Boby024/data-engineering

data-engineering

Project 1: ETL (with Airflow and dbt)

Build a pipeline that extracts data from multiple sources (e.g., an API, CSV files, or a database), transforms it with dbt (e.g., cleaning and enrichment), and loads it into a data warehouse or other data store (e.g., PostgreSQL, Microsoft SQL Server, Azure Data Lake, or MongoDB).
Pipeline Structure
  • Extract: Use Airflow to pull raw data from a source (e.g., API or database) and load it into a raw/staging schema in your data warehouse.
  • Transform: Use dbt to transform the staged data into analytics-ready models.
  • Load: dbt typically performs the load itself, since running the transformations writes the results into the final tables.
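As a rough illustration of the Transform step, the commands an Airflow task might run can be sketched as a small shell function. The function name run_dbt_transform, the DBT_PROJECT_DIR variable, and the ./dbt default path are assumptions made for this example and are not taken from the repository; dbt run, dbt test, and --project-dir are standard dbt CLI usage.

```shell
# Hypothetical sketch of the Transform step. DBT_PROJECT_DIR is an assumed
# variable name pointing at the dbt project; it is not defined in this repo.
run_dbt_transform() {
  local project_dir="${DBT_PROJECT_DIR:-./dbt}"

  # Build the models on top of the staged tables; stop if any model fails.
  dbt run --project-dir "$project_dir" || return 1

  # Run the tests declared for those models.
  dbt test --project-dir "$project_dir"
}
```

Running the tests only after a successful build keeps a broken model from being reported twice, once as a build failure and once as a test failure.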
Steps for Airflow Configuration:
  • Move into the "etl" folder.
  • Make sure you already have the URL of the database where Airflow stores its details (such as users, roles, and DAGs).
  • Create a .env file that holds AIRFLOW__DATABASE__SQL_ALCHEMY_CONN and the user details.
  • Make the file "init_airflow.sh" (which initializes the database and creates a new user with access to the Airflow web UI) executable by running: chmod +x init_airflow.sh
  • To initialize the Airflow database, run: ./init_airflow.sh 0
  • To create a new user for the Airflow web UI, run: ./init_airflow.sh 1 (to list all users, run: airflow users list)
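The repository's init_airflow.sh is not reproduced here, so the following is only a guess at how such a script could dispatch on its 0/1 argument. It assumes Airflow 2.x and a .env file in shell KEY=value form; the variable names AIRFLOW_USER, AIRFLOW_PASSWORD, and AIRFLOW_EMAIL, the placeholder first and last names, and the function name init_airflow are all hypothetical.

```shell
# Hypothetical sketch of an init script; the real init_airflow.sh may differ.
# Assumes an env file with lines such as:
#   AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://user:pass@host:5432/airflow
#   AIRFLOW_USER=admin               (assumed variable name)
#   AIRFLOW_PASSWORD=change-me       (assumed variable name)
#   AIRFLOW_EMAIL=admin@example.com  (assumed variable name)
init_airflow() {
  local mode="${1:-}"
  local env_file="${2:-./.env}"

  if [ ! -f "$env_file" ]; then
    echo "missing env file: $env_file" >&2
    return 1
  fi

  # Export every variable defined in the env file so the airflow CLI sees it.
  set -a
  . "$env_file"
  set +a

  case "$mode" in
    0)
      # Create the metadata tables (newer 2.x releases prefer `airflow db migrate`).
      airflow db init
      ;;
    1)
      # Create a user that can log in to the web UI.
      airflow users create \
        --username "$AIRFLOW_USER" \
        --password "$AIRFLOW_PASSWORD" \
        --firstname Admin \
        --lastname User \
        --role Admin \
        --email "$AIRFLOW_EMAIL"
      ;;
    *)
      echo "usage: init_airflow {0|1} [env-file]" >&2
      return 2
      ;;
  esac
}
```

With this sketch, init_airflow 0 corresponds to ./init_airflow.sh 0 and init_airflow 1 to ./init_airflow.sh 1. Keeping the connection string in .env rather than in the script keeps credentials out of version control, provided .env is git-ignored.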
Run Airflow:
  • Start the Airflow web UI: airflow webserver (port 8080 by default), or set the port explicitly with airflow webserver --port 8080
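In Airflow 2.x the web server only serves the UI; DAGs are scheduled by a separate scheduler process. A minimal sketch of starting both from one shell follows. The helper name start_airflow is hypothetical, and this is one possible arrangement rather than the repository's own setup.

```shell
# Hypothetical helper, assuming Airflow 2.x: DAG runs are only scheduled
# while the scheduler process is up; `airflow webserver` serves the UI alone.
start_airflow() {
  local port="${1:-8080}"

  # Scheduler in the background; remember its PID so it can be stopped later.
  airflow scheduler &
  local scheduler_pid=$!

  # Web UI in the foreground; this call blocks until the server exits.
  airflow webserver --port "$port"

  # Stop the scheduler once the web server has exited.
  kill "$scheduler_pid" 2>/dev/null || true
}
```

Calling start_airflow 8080 would then bring up the UI on port 8080. Running the two processes in separate terminals works just as well and makes their logs easier to tell apart.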

Project 2: Business Intelligence Tool (Metabase)

Metabase is an open-source business intelligence application designed to make data visualization and analysis simple for users without a high level of technical expertise. In the "bi" folder, Metabase is used to build interactive dashboards and reports that support data-driven decision-making throughout the company. Its intuitive interface makes it easy to query databases, share findings, and monitor key metrics in real time.

The "bi" folder contains steps and instructions for installing and running Metabase locally.
