data-engineering

Project 1: ETL (with Airflow and dbt)

Build a pipeline that extracts data from multiple sources (e.g., an API, CSV files, or a database), transforms it with dbt (e.g., cleaning, enrichment), and loads it into a data warehouse (e.g., PostgreSQL, Microsoft SQL Server, Azure Data Lake, MongoDB).
Pipeline Structure
  • Extract: Use Airflow to pull raw data from a source (e.g., API or database) and load it into a raw/staging schema in your data warehouse.
  • Transform: Use dbt to transform the staged data into analytics-ready models.
  • Load: There is usually no separate load step; the dbt run itself writes the transformed models to the final tables.
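The flow above can be sketched with the Python standard library alone. This is only an illustration of the stage-then-transform pattern, not the project's actual code: an in-memory SQLite database stands in for the warehouse, plain functions stand in for Airflow tasks, and a SQL statement stands in for a dbt model. The sample data and the names `raw_orders` and `customer_totals` are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw data; in the real pipeline an Airflow task would
# pull this from an API, a CSV file, or a database.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,20.00
3,alice,
"""


def extract_and_stage(conn, raw_csv):
    """Extract step: load the raw rows, unchanged, into a staging table."""
    conn.execute(
        "CREATE TABLE raw_orders (order_id TEXT, customer TEXT, amount TEXT)"
    )
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:order_id, :customer, :amount)", rows
    )
    return len(rows)


def transform(conn):
    """Transform/load step: build a cleaned table, as a dbt model would."""
    conn.execute(
        """
        CREATE TABLE customer_totals AS
        SELECT lower(trim(customer)) AS customer,
               SUM(CAST(amount AS REAL)) AS total_amount
        FROM raw_orders
        WHERE trim(amount) <> ''          -- drop rows with no amount
        GROUP BY lower(trim(customer))
        """
    )
    return conn.execute(
        "SELECT customer, total_amount FROM customer_totals ORDER BY customer"
    ).fetchall()


def run_pipeline(raw_csv=RAW_CSV):
    """Run both steps in order against a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        staged = extract_and_stage(conn, raw_csv)
        totals = transform(conn)
        return staged, totals
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_pipeline())
```

Staging the rows untouched and cleaning them only in the SQL step mirrors the split between Airflow (moving data) and dbt (shaping data) described above.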
Steps for Airflow Configuration:
  • Change into the "etl" folder.
  • Make sure you already have a database URL where Airflow can store its metadata (such as users, roles, and DAGs).
  • Create a .env file that stores AIRFLOW__DATABASE__SQL_ALCHEMY_CONN and the user details.
  • Make the script "init_airflow.sh" (which initializes the database and creates a new user with access to the Airflow web UI) executable by running: chmod +x init_airflow.sh
  • To initialize the Airflow database, run: ./init_airflow.sh 0
  • To create a new user for the Airflow web UI, run: ./init_airflow.sh 1. To list all users, run: airflow users list
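Before running the init script, it can help to confirm that the .env file actually defines the connection string. The following is a minimal sketch of such a check, assuming the file uses plain KEY=VALUE lines; the helper names `parse_env` and `check_env` are hypothetical and not part of this repository. It does not check the user details, since their variable names are not specified here.

```python
REQUIRED_KEY = "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN"


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_env(text):
    """Return a list of problems; an empty list means the file looks usable."""
    env = parse_env(text)
    problems = []
    url = env.get(REQUIRED_KEY, "")
    if not url:
        problems.append(f"{REQUIRED_KEY} is missing or empty")
    elif "://" not in url:
        # SQLAlchemy URLs have the form dialect+driver://user:pass@host/db
        problems.append(f"{REQUIRED_KEY} does not look like a database URL")
    return problems


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        for problem in check_env(handle.read()):
            print(problem)
```

A file containing, for example, `AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:secret@localhost:5432/airflow` (placeholder credentials) passes the check with no problems reported.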
Run Airflow:
  • Start the Airflow web UI: airflow webserver (or, to set the port explicitly, airflow webserver --port 8080)

Project 2: Business Intelligence Tool (Metabase)

Metabase is an open-source business intelligence application designed to make data visualization and analysis simple for users without a high level of technical expertise. In the "bi" folder, Metabase is used to build interactive dashboards and reports that support data-driven decision-making across the company. Its intuitive interface makes it easy to query databases, share findings, and monitor key metrics in real time.

The "bi" folder contains steps and instructions for installing and running Metabase locally.