Examples of Apache Flink® v2.1 applications showcasing the DataStream API, Table API in Java and Python, and Flink SQL, featuring AWS, GitHub, Terraform, Streamlit, and Apache Iceberg.
Automation framework to catalog AWS data sources using Glue
Smart City Realtime Data Engineering Project
A CLI tool to back up and restore AWS Glue catalog resources such as Databases, Tables, and Connections as JSON files. Useful when you don't have AWS Backup or versioning enabled in your account.
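A minimal sketch of what such a backup looks like with boto3 (hypothetical illustration, not this repo's actual CLI; the output directory name is made up). Glue returns audit-only fields that `create_table`/`update_table` reject, so they must be stripped before the JSON can be replayed on restore:

```python
"""Sketch: dump a Glue database's table definitions to restorable JSON files.

Hypothetical example, assuming boto3 and AWS credentials are configured.
"""
import json
import pathlib


def strip_readonly_fields(table: dict) -> dict:
    """Drop fields Glue returns but rejects on create_table/update_table."""
    readonly = {"DatabaseName", "CreateTime", "UpdateTime", "CreatedBy",
                "IsRegisteredWithLakeFormation", "CatalogId", "VersionId"}
    return {k: v for k, v in table.items() if k not in readonly}


def backup_database(glue, database: str, out_dir: str = "glue-backup") -> int:
    """Write each table as out_dir/<db>.<table>.json; return count written."""
    out = pathlib.Path(out_dir)
    out.mkdir(exist_ok=True)
    count = 0
    paginator = glue.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database):
        for table in page["TableList"]:
            path = out / f"{database}.{table['Name']}.json"
            path.write_text(json.dumps(strip_readonly_fields(table),
                                       indent=2, default=str))
            count += 1
    return count


# Usage (with credentials configured):
#   import boto3
#   backup_database(boto3.client("glue"), "my_database")
```

Passing the client in as a parameter keeps the dump logic testable without AWS access.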
An AWS-based pipeline 📺 to ingest, process, and analyze YouTube video data, covering structured statistics as well as trending-video metrics.
Tool to migrate Delta Lake tables to Apache Iceberg using AWS Glue and S3
An ETL (Extract, Transform, Load) pipeline built on AWS using the Spotify API.
Interactive visualizations built with Streamlit, powered by Apache Flink in batch mode to surface insights from data.
Prototype of AWS data lake reference implementation written in Python and Spark: https://aws.amazon.com/solutions/implementations/data-lake-solution/
Creating an audit table for a DynamoDB table using CloudTrail, Kinesis Data Streams, Lambda, S3, Glue, Athena, and CloudFormation
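The Lambda step in a pipeline like this can be sketched as follows: decode the base64 payloads delivered by a Kinesis Data Stream and append them to S3 as JSON lines for Glue/Athena to query. This is a hedged illustration, not the project's actual code; the bucket and key names are made up:

```python
"""Sketch: Lambda handler turning Kinesis records into JSON-lines audit data.

Hypothetical example; bucket/key names are illustrative.
"""
import base64
import json


def records_to_jsonl(event: dict) -> str:
    """Decode each Kinesis record's base64 payload into one JSON line."""
    lines = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        lines.append(json.dumps(json.loads(payload)))
    return "\n".join(lines)


def handler(event, context, s3=None):
    # In a real deployment: s3 = boto3.client("s3")
    body = records_to_jsonl(event)
    if body and s3 is not None:
        s3.put_object(Bucket="audit-bucket",    # hypothetical bucket
                      Key="audit/batch.jsonl",  # hypothetical key
                      Body=body.encode())
    return {"lines": body.count("\n") + 1 if body else 0}
```

Partitioning the S3 key by date (e.g. `audit/dt=2024-01-01/…`) would keep Athena scans cheap, at the cost of slightly more key-building logic.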
End-to-end AWS data analytics pipeline for product risk detection and customer dissatisfaction analysis.
Demonstrates using Terraform to enable Tableflow in Kafka so that Iceberg table files are generated and stored in an AWS S3 bucket, then configuring Snowflake to read those Iceberg tables through the AWS Glue Data Catalog.
Enterprise track: Step Functions/EventBridge + Glue + data quality on top of the v1 serverless ELT
Working with Glue Data Catalog and Running the Glue Crawler On Demand
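Running a crawler on demand boils down to `start_crawler` plus polling `get_crawler` until the state returns to READY. A minimal sketch, assuming boto3 and credentials (the crawler client is injected so the loop is testable offline):

```python
"""Sketch: start a Glue crawler on demand and wait for it to finish."""
import time


def run_crawler(glue, name: str, poll_seconds: int = 15) -> str:
    """Start the crawler, poll until it is READY, return the last crawl status."""
    glue.start_crawler(Name=name)
    while True:
        state = glue.get_crawler(Name=name)["Crawler"]["State"]
        if state == "READY":  # crawler finished and returned to idle
            break
        time.sleep(poll_seconds)
    return glue.get_crawler(Name=name)["Crawler"]["LastCrawl"]["Status"]


# Usage (with credentials configured):
#   import boto3
#   run_crawler(boto3.client("glue"), "my-crawler")
```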
Example using the Iceberg register_table command with AWS Glue and Glue Data Catalog
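In pyiceberg, `register_table` points a catalog at an existing `metadata.json` without rewriting any data. A thin wrapper sketch (hypothetical names throughout; assumes `pyiceberg[glue]` is installed for the real usage shown in the comment):

```python
"""Sketch: register an existing Iceberg table with the Glue Data Catalog."""


def register_iceberg_table(catalog, database: str, table: str,
                           metadata_location: str):
    """Attach an existing metadata file to the catalog under db.table."""
    return catalog.register_table(f"{database}.{table}", metadata_location)


# Usage (hypothetical values; requires pyiceberg[glue] and AWS credentials):
#   from pyiceberg.catalog import load_catalog
#   catalog = load_catalog("glue", **{"type": "glue"})
#   register_iceberg_table(
#       catalog, "analytics", "events",
#       "s3://my-bucket/warehouse/analytics.db/events/metadata/00000-abc.metadata.json")
```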
Unveiling job market trends with Scrapy and AWS
🌟 Build a production-lite serverless ELT pipeline on AWS, enabling efficient data ingestion and transformation from S3 to Parquet with minimal overhead.
Developed an ETL pipeline for real-time ingestion of stock market data from the stock-market-data-manage.onrender.com API. Engineered the system to store data in Parquet format for optimized query processing and incorporated data quality checks to ensure accuracy prior to visualization.
End-to-end AWS Data Engineering project for cloud cost monitoring and automated reporting.