KAMILELH/Universal-ETL-Agent

Universal AI Data Preprocessing Agent (ETL Pipeline)

An intelligent, automated Extract, Transform, Load (ETL) pipeline that uses a Large Language Model (Google Gemini) to dynamically sanitize, standardize, and format messy CSV data into pure, algorithm-ready JSON.

Architecture

This project uses n8n as the orchestration engine for batch processing and API routing, paired with a custom vanilla JavaScript frontend.

  • Frontend: A lightweight HTML/Tailwind CSS interface that sends raw CSV files via a FormData POST request.
  • Backend Pipeline: An n8n webhook listener that receives and processes the uploaded files.
  • Transformation Engine: A loop that feeds the data row by row to the Google Gemini LLM.
  • Strict Output Formatting: Uses dynamic JSON Schemas to force the AI to return strict data types (e.g., converting mixed date strings to ISO 8601, inferring booleans, and handling null values).
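
To make the strict-output idea above more concrete, here is a minimal sketch of how a per-row JSON Schema might be generated from column hints. The column names, the hint vocabulary, and the `buildRowSchema` helper are all hypothetical and do not come from the workflow file. The output is standard JSON Schema, and the model provider may expect a different dialect (for example, a `nullable` flag instead of a type array).

```javascript
// Hypothetical column hints; the real workflow may derive these differently.
const columnHints = {
  student: "string",
  enrolled: "boolean",
  exam_date: "date",
  grade: "number",
};

// Build a JSON Schema describing one cleaned row.
// Every column is required but may be null, so missing values stay explicit.
function buildRowSchema(hints) {
  const properties = {};
  for (const [name, kind] of Object.entries(hints)) {
    if (kind === "date") {
      // ISO 8601 calendar dates are strings with the "date" format.
      properties[name] = { type: ["string", "null"], format: "date" };
    } else {
      properties[name] = { type: [kind, "null"] };
    }
  }
  return {
    type: "object",
    properties,
    required: Object.keys(hints),
    additionalProperties: false,
  };
}

const rowSchema = buildRowSchema(columnHints);
console.log(JSON.stringify(rowSchema, null, 2));
```

Requiring every key while still allowing `null` keeps each output row the same shape, which is usually easier for downstream code than rows with missing keys.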

Tech Stack

  • Orchestration: n8n
  • AI/NLP: Google Gemini 2.5 Flash
  • Frontend: HTML5, Vanilla JavaScript, Tailwind CSS
  • Data Formats: CSV (Input), JSON (Output)

How to Run Locally

  1. Import the etl_pipeline_workflow.json file into your local n8n instance.
  2. Add your Google Gemini API credentials to the AI node.
  3. Activate the workflow.
  4. Open index.html in any browser, upload grades_test.csv, and check the browser console for the sanitized JSON array.
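
For reference, the upload in step 4 could look roughly like the sketch below. It is not the code from index.html: the `uploadCsv` helper, the `file` form field name, and the webhook path are assumptions, so substitute the URL shown on your n8n Webhook node. The `fetchImpl` parameter is there only so the function can be tried without a running n8n instance.

```javascript
// Placeholder URL (assumption): copy the real one from your n8n Webhook node.
const WEBHOOK_URL = "http://localhost:5678/webhook/your-path";

// Send one CSV file as multipart/form-data and return the parsed JSON reply.
async function uploadCsv(file, url = WEBHOOK_URL, fetchImpl = fetch) {
  const form = new FormData();
  form.append("file", file); // the "file" field name is an assumption

  const response = await fetchImpl(url, { method: "POST", body: form });
  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
  return response.json();
}

// Browser usage (assumes a file input element on the page):
//   const input = document.querySelector('input[type="file"]');
//   uploadCsv(input.files[0]).then((rows) => console.log(rows));
```

The request sets no `Content-Type` header because the runtime adds the multipart boundary itself when the body is a `FormData` object.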
