Aviation analytics project with DuckDB, dbt and dlt using the data from AirLabs API.
- Data Architecture
- Running instructions
- Data exploration with the DuckDB UI
- Explore dbt project docs
- Dependency management
- CI/CD
Datasets are ingested as snapshots from the source API.
When loading to the curated tables, they are processed with a merge strategy (using a unique key per dataset).
This preserves the latest version of each record to keep it simple on the querying side.
Run make help to see available commands together with their description.
Build and spin up Docker containers needed for the app:
make docker-up
Get into the dlt container:
make docker-it-dlt
For this step we first need to generate an AirLabs API key (see how to on their website), and set the environment variable AIRLABS_API_KEY. Then run:
make dlt-ingest-source-data
Exit the dlt container and get into the dbt one by running make docker-it-dbt. Then:
make dbt-depsmake dbt-run
Once the models have been run and the data is ready, you can start exploring the data.
Run make duckdb-ui to lauch the DuckDB UI and access it via http://localhost:4213.
Example queries:
SELECT
country_code,
COUNT(*)
FROM
curated.airports
GROUP BY
country_code
ORDER BY
count(*) DESC
LIMIT 10;SELECT
flight_status,
COUNT(*)
FROM
curated.flight_positions
WHERE
processed_timestamp = (SELECT MAX(processed_timestamp) FROM curated.flight_positions)
GROUP BY
ALL
ORDER BY
COUNT(*) DESC;dbt provides auto-generated documentation for the project which can be accessed on https://guidok91.github.io/aviation-analytics/.
Dependabot is configured to periodically upgrade repo dependencies. See dependabot.yml.
A Github Actions CI/CD pipeline is defined here and can be seen here.
Note that the AIRLABS_API_KEY is provided as a Github repository secret to be used in the CI/CD pipeline.

