From a57d8c63d35159eab602d469234e3cabc4bae4a1 Mon Sep 17 00:00:00 2001
From: Raghvendra Bankar <162977734+raghavbankar@users.noreply.github.com>
Date: Tue, 17 Feb 2026 19:37:26 +0530
Subject: [PATCH 1/3] Revise README for improved project documentation

Updated README.md to enhance clarity and detail about ClickML's features, architecture, and installation process.
---
 README.md | 158 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 148 insertions(+), 10 deletions(-)

diff --git a/README.md b/README.md
index df4f7ae..f8ce9c6 100644
--- a/README.md
+++ b/README.md
@@ -1,18 +1,156 @@
-# ClickML - build MLOps workflow (just click, save and use)
+# 🚀 ClickML – End-to-End ML Lifecycle Platform
 
-#### ClickML is a low-code/no-code platform that helps MLOps engineers and data teams to create end-to-end ML pipelines — from ETL to model training and deployment — all through a simple, click-based interface.
+#### ClickML is a modular, full-stack MLOps platform that converts UI-based workflow actions into executable machine learning jobs.
+#### It manages the complete ML lifecycle — from data ingestion and preprocessing to pretraining, fine-tuning, quantization, registry tracking, and deployment.
+#### Designed for scalability, reproducibility, and hardware compatibility.
 
-## Architecture
+image
+---
+
+## 📚 Table of Contents
+## 📑 Table of Contents
+
+- [🎯 Vision](#-vision)
+- [Objectives](#objectives)
+- [🧠 Core Capabilities](#-core-capabilities)
+- [🏗️ System Architecture](#%EF%B8%8F-system-architecture)
+- [🛠️ Tech Stack](#️-tech-stack)
+- [⚙️ Installation & Setup](#️-installation--setup)
+- [🔄 Example Workflow](#-example-workflow-in-clickml)
+- [📊 Why ClickML Stands Out](#-why-clickml-stands-out)
+- [πŸ” Future Roadmap](#-future-roadmap)
+
+
+## 🎯 Vision
+
+ClickML simplifies complex ML engineering workflows into structured, traceable pipelines without sacrificing flexibility or control.
+
+It is built for:
+- ML Engineers
+- AI Researchers
+- Data Engineers
+- Students building production-grade ML systems
+
+---
+# Objectives
+- To allow users to create configurable ETL pipelines.
+- To automate pipeline scheduling using Apache Airflow.
+- To provide a no-code machine learning model creation interface.
+- To store processed and raw data in the user’s database.
+- To support model training for regression and classification problems.
+- To generate pickle files and comprehensive model reports.
+- To deploy ML models via API endpoints.
+- To create a robust frontend for seamless user interaction.
+
+# 🧠 Core Capabilities
+
+## 1️⃣ Data Governance & ETL Engine
+- Structured dataset ingestion
+- Data version tracking
+- Pipeline-based transformations
+- Validation & schema enforcement
+- Reproducible preprocessing jobs
+
+- image
+
+
+## 2️⃣ Model Training Engine
+- Pretraining workflows
+- Supports Multiple Models:
+  - Linear Regression
+  - Random Forest Regression
+  - Decision Tree Regression
+  - Random Forest Classification
+  - Decision Tree Classification
+- Hyperparameter configuration via UI
+- Distributed training support (Docker-ready)
+- Training logs & metrics tracking
+
+image
+
+## 3️⃣ Deployment Layer
+- FastAPI-based inference endpoints
+- Containerized model serving
+- Production-ready deployment structure
+
+## 4️⃣ Workflow Orchestration
+- Airflow-integrated job scheduling
+- Modular DAG execution
+- Background task management
+- Retry & failure handling
+image
+
+---
+
+# 🏗️ System Architecture
+
+ClickML follows a modular microservice-style structure:
 
 ClickMLPlatform
 
-## Features
-- Drag-and-drop pipeline builder
-- ETL pipeline execution (transform, clean, normalize)
-- Train ML models via dedicated ML backend
-- Model serialization and deployment (FastAPI endpoints)
-- View logs, metrics, and monitor deployed models
-- Modular architecture for easy scaling and team collaboration
 
 
 ---
+
+# 🛠️ Tech Stack
+
+| Layer | Technology Used |
+|-------------------|----------------|
+| Frontend | TypeScript + React |
+| Backend API | Python + FastAPI |
+| Workflow Engine | Apache Airflow |
+| ML Framework | PyTorch / Scikit-learn |
+| Containerization | Docker |
+| Orchestration | Docker Compose |
+
+---
+
+# ⚙️ Installation & Setup
+
+## 🔹 Prerequisites
+
+- Python 3.9+
+- Node.js 18+
+- Docker & Docker Compose
+- Git
+
+---
+## 🔹 Backend Setup
+cd Backend
+
+pip install -r requirements.txt
+
+## 🔹 Frontend Setup
+cd Frontend/clickml
+
+npm install
+
+npm run dev
+
+## 🔹 Start Full System (Recommended)
+docker compose up --build
+
+## 🔄 Example Workflow in ClickML
+
+image
+
+# 📊 Why ClickML Stands Out
+
+- Full ML lifecycle coverage
+- Built-in reproducibility
+- UI → executable pipeline conversion
+- Model lineage tracking
+- Registry-driven deployment
+- Modular & scalable architecture
+
+# πŸ” Future Roadmap
+
+- RAG pipeline integration
+
+- LLM fine-tuning modules
+
+- Experiment tracking dashboard
+
+- Kubernetes deployment support
+
+- Multi-user workspace system

From 02d4a43fad4aff4463298ad04d7f1ab475b31048 Mon Sep 17 00:00:00 2001
From: Raghvendra Bankar <162977734+raghavbankar@users.noreply.github.com>
Date: Wed, 18 Feb 2026 18:40:09 +0530
Subject: [PATCH 2/3] Refactor README for improved clarity and structure

Updated the README to improve formatting and structure, including changes to headings and the table of contents.
---
 README.md | 85 +++++++++++++------------------------------------------
 1 file changed, 20 insertions(+), 65 deletions(-)

diff --git a/README.md b/README.md
index f8ce9c6..32cf9a1 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# 🚀 ClickML – End-to-End ML Lifecycle Platform
+# ClickML – End-to-End ML Lifecycle Platform
 
 #### ClickML is a modular, full-stack MLOps platform that converts UI-based workflow actions into executable machine learning jobs.
 #### It manages the complete ML lifecycle — from data ingestion and preprocessing to pretraining, fine-tuning, quantization, registry tracking, and deployment.
@@ -7,21 +7,24 @@
 image
 ---
 
-## 📚 Table of Contents
-## 📑 Table of Contents
+## Table of Contents
 
-- [🎯 Vision](#-vision)
+- [System Architecture](#system-architecture)
+- [Vision](#vision)
 - [Objectives](#objectives)
-- [🧠 Core Capabilities](#-core-capabilities)
-- [🏗️ System Architecture](#%EF%B8%8F-system-architecture)
-- [🛠️ Tech Stack](#️-tech-stack)
-- [⚙️ Installation & Setup](#️-installation--setup)
-- [🔄 Example Workflow](#-example-workflow-in-clickml)
-- [📊 Why ClickML Stands Out](#-why-clickml-stands-out)
-- [πŸ” Future Roadmap](#-future-roadmap)
+- [Core Capabilities](#core-capabilities)
+- [Example Workflow](#example-workflow-in-clickml)
+- [Why ClickML Stands Out](#why-clickml-stands-out)
+- [Future Roadmap](#future-roadmap)
 
-
-## 🎯 Vision
+# System Architecture
+
+ClickML follows a modular microservice-style structure:
+
+ClickMLPlatform
+
+---
+## Vision
 
 ClickML simplifies complex ML engineering workflows into structured, traceable pipelines without sacrificing flexibility or control.
 
@@ -42,7 +45,7 @@ It is built for:
 - To deploy ML models via API endpoints.
 - To create a robust frontend for seamless user interaction.
 
-# 🧠 Core Capabilities
+# Core Capabilities
 
 ## 1️⃣ Data Governance & ETL Engine
 - Structured dataset ingestion
@@ -82,59 +85,11 @@ It is built for:
 
 ---
 
-# 🏗️ System Architecture
-
-ClickML follows a modular microservice-style structure:
-
-ClickMLPlatform
-
-
-
----
-
-# 🛠️ Tech Stack
-
-| Layer | Technology Used |
-|-------------------|----------------|
-| Frontend | TypeScript + React |
-| Backend API | Python + FastAPI |
-| Workflow Engine | Apache Airflow |
-| ML Framework | PyTorch / Scikit-learn |
-| Containerization | Docker |
-| Orchestration | Docker Compose |
-
----
-
-# ⚙️ Installation & Setup
-
-## 🔹 Prerequisites
-
-- Python 3.9+
-- Node.js 18+
-- Docker & Docker Compose
-- Git
-
----
-## 🔹 Backend Setup
-cd Backend
-
-pip install -r requirements.txt
-
-## 🔹 Frontend Setup
-cd Frontend/clickml
-
-npm install
-
-npm run dev
-
-## 🔹 Start Full System (Recommended)
-docker compose up --build
-
-## 🔄 Example Workflow in ClickML
+## Example Workflow in ClickML
 
 image
 
-# 📊 Why ClickML Stands Out
+# Why ClickML Stands Out
 
 - Full ML lifecycle coverage
 - Built-in reproducibility
@@ -143,7 +98,7 @@ docker compose up --build
 - Registry-driven deployment
 - Modular & scalable architecture
 
-# πŸ” Future Roadmap
+# Future Roadmap
 
 - RAG pipeline integration
 

From fcb888c02181685db101e69c93044df350f9b1ae Mon Sep 17 00:00:00 2001
From: Chandra Kumar Rajwal <122195878+imckr@users.noreply.github.com>
Date: Thu, 19 Feb 2026 18:17:47 +0530
Subject: [PATCH 3/3] Revise README for clarity and add visual elements

Updated README.md to enhance structure and add badges.
---
 README.md | 169 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 155 insertions(+), 14 deletions(-)

diff --git a/README.md b/README.md
index 32cf9a1..626cf51 100644
--- a/README.md
+++ b/README.md
@@ -1,10 +1,151 @@
-# ClickML – End-to-End ML Lifecycle Platform
+
+
+# ClickML
+### End-to-End ML Lifecycle Platform
+![Static Badge](https://img.shields.io/badge/license-Apache--2.0-blue)
+![Static Badge](https://img.shields.io/badge/Build_ClickML_Docker_Images-passing-green?logo=github)
+![Static Badge](https://img.shields.io/badge/PostgreSQL-blue?logo=postgresql&logoColor=white)
+![Static Badge](https://img.shields.io/badge/FastAPI-white?logo=fastapi&logoColor=%23009688)
+---
+![Apache Airflow](https://img.shields.io/badge/Apache%20Airflow-017CEE?style=for-the-badge&logo=apacheairflow&logoColor=white)
+![PostgreSQL](https://img.shields.io/badge/PostgreSQL-316192?style=for-the-badge&logo=postgresql&logoColor=white)
+![Amazon S3](https://img.shields.io/badge/Amazon%20S3-FF9900?style=for-the-badge&logo=amazons3&logoColor=white)
+![GitHub Actions](https://img.shields.io/badge/GitHub%20Actions-2088FF?style=for-the-badge&logo=githubactions&logoColor=white)
+![Docker](https://img.shields.io/badge/Docker-2496ED?style=for-the-badge&logo=docker&logoColor=white)
+
+
+
+ClickML is a modular, full-stack MLOps platform that converts UI-based workflow actions into executable machine learning jobs. It manages the complete ML lifecycle — from data ingestion and preprocessing to pretraining, fine-tuning, quantization, registry tracking, and deployment. Designed for scalability, reproducibility, and hardware compatibility.
+
+ClickMLPlatform
+
 
-#### ClickML is a modular, full-stack MLOps platform that converts UI-based workflow actions into executable machine learning jobs.
-#### It manages the complete ML lifecycle — from data ingestion and preprocessing to pretraining, fine-tuning, quantization, registry tracking, and deployment.
-#### Designed for scalability, reproducibility, and hardware compatibility.
+## Workflow in ClickML
+
+```mermaid
+flowchart TB
+    %% ── Storage Layer (top-left) ──────────────────────────────
+    subgraph STORAGE["Storage Layer"]
+        LogFiles[("Log Files\nPostgreSQL")]
+        ModelDB[("Model File\nDatabase (S3)")]
+        PlatformDB[("Platform-dependent\nDatabase (PostgreSQL)")]
+    end
+
+    subgraph USER_STORE["User Storage"]
+        UserData[("User Data")]
+        UserDatabase[("User Database")]
+    end
+
+    %% ── Auth ──────────────────────────────────────────────────
+    UserA(["User"])
+    UserB(["User"])
+    SignUp["Sign Up"]
+    Login["Login"]
+
+    UserA --> SignUp --> UserDatabase
+    UserB --> Login --> UserDatabase
+    UserDatabase --> |"Authenticate"| InteractionLayer
+
+    %% ── ETL Pipeline (top-center) ─────────────────────────────
+    subgraph ETL["Airflow – ETL Pipeline"]
+        RunDAG["Run DAG"]
+        Extract["Extract Data"]
+        Transform["Transform"]
+        Load["Load Data"]
+        Trigger1{{"Trigger"}}
+
+        RunDAG --> Extract --> Transform --> Load --> Trigger1
+    end
+
+    APIConfig["API's State\nEndpoint / Secret Key"] --> ETL
+    ETL --> |"Logs"| LogFiles
+    Trigger1 --> |"Database?"| DBCheck{{"DB?"}}
+    DBCheck --> |"Yes"| DataLake
+    DBCheck --> |"No – Fetch data"| DataLake
+
+    DataLake[("Data Lake /\nWarehouse")]
+
+    %% ── Interaction Layer ─────────────────────────────────────
+    subgraph InteractionLayer["Interaction Layer"]
+        direction TB
+        PipelineCreate["Data Pipeline Creation"]
+        MLPipeline["ML / DL Pipeline"]
+        ModelDeploy["Model Deployment"]
+    end
+
+    PipelineCreate --> |"Format (optional)\nTransform data format"| ETL
+    PipelineCreate --> |"Database: hostname,\npassword, dbname"| ETL
+    PipelineCreate --> |"Trigger Time"| ETL
+
+    MLPipeline --> ModelSelection["Model Selection"]
+    ModelDeploy --> Redeploy["Redeploy"]
+    ModelDeploy --> ModelFileSelection["Model File Selection"]
+
+    %% ── ML Training Pipeline (right) ─────────────────────────
+    subgraph TRAINING["ML Training Pipeline"]
+        direction TB
+        Trigger2{{"Trigger"}}
+        DataPreprocess["Data Preprocess"]
+        ModelTrain["Model Train"]
+        Evaluation["Evaluation"]
+        TestVal["Test Validation"]
+
+        Trigger2 --> DataPreprocess --> ModelTrain --> TestVal --> Evaluation
+    end
+
+    ModelSelection --> |"Model Type"| TRAINING
+    ModelSelection --> |"Parameters"| TRAINING
+    ModelSelection --> |"Input/Output Features"| TRAINING
+    DataLake --> |"Fetch Data"| TRAINING
+
+    Evaluation --> ModelReport["Model Report"]
+    Evaluation --> ModelPKL[("Model\n(.pkl) File")]
+    TRAINING --> |"Logs"| TrainingLogs[("Model Training\nLogs")]
+    ModelPKL --> |"Storing output files"| DataLake
+
+    %% ── Deployment (bottom-center) ────────────────────────────
+    ModelFileSelection --> |"Model File (.pkl)"| DeployFlow
+
+    subgraph DeployFlow["Deployment Flow"]
+        FastAPI["Create Fast API Server"]
+        EC2Deploy["Deploy on AWS EC2"]
+        OutputJob["Output – Server Job"]
+
+        FastAPI --> EC2Deploy --> OutputJob
+    end
+
+    DeployFlow --> |"Logs"| LogFiles
+
+    %% ── Infrastructure (bottom-left) ──────────────────────────
+    subgraph INFRA["Infrastructure (AWS)"]
+        Terminal[">_ Terminal\nssh -i print-key clickml@ec2-ip"]
+        EC2["EC2 Instance"]
+        RDS[("RDS")]
+        ClickMLDB[("clickml-database")]
+
+        Terminal --> |"connect@ssh username"| EC2
+        EC2 --> |"Insert username\n+ password"| RDS
+        RDS --> ClickMLDB
+    end
+
+    InteractionLayer --> |"Send Models"| ModelSelection
+    ModelPKL --> ModelFileSelection
+    STORAGE --> InteractionLayer
+
+    %% ── Styles ────────────────────────────────────────────────
+    classDef storage fill:#4a90d9,stroke:#2c5f8a,color:#fff
+    classDef process fill:#f9f3d9,stroke:#c8a84b,color:#333
+    classDef decision fill:#ffe0b2,stroke:#e65100,color:#333
+    classDef infra fill:#e8f5e9,stroke:#388e3c,color:#333
+    classDef io fill:#fce4ec,stroke:#c62828,color:#333
+
+    class LogFiles,ModelDB,PlatformDB,UserData,UserDatabase,DataLake,TrainingLogs,ModelPKL,ClickMLDB,RDS storage
+    class Extract,Transform,Load,RunDAG,DataPreprocess,ModelTrain,Evaluation,FastAPI,EC2Deploy,ModelSelection process
+    class Trigger1,Trigger2,DBCheck decision
+    class EC2,Terminal infra
+    class ModelReport,OutputJob io
+```
 
-image
 ---
 
 ## Table of Contents
@@ -21,7 +162,6 @@
 
 ClickML follows a modular microservice-style structure:
 
-ClickMLPlatform
 
 ---
 ## Vision
@@ -47,17 +187,20 @@ It is built for:
 
 # Core Capabilities
 
-## 1️⃣ Data Governance & ETL Engine
+## Data Governance & ETL Engine
 - Structured dataset ingestion
 - Data version tracking
 - Pipeline-based transformations
 - Validation & schema enforcement
 - Reproducible preprocessing jobs
+- image
+---
 
 - image
 
 
-## 2️⃣ Model Training Engine
+## Model Training Engine
+
 - Pretraining workflows
 - Supports Multiple Models:
   - Linear Regression
@@ -71,24 +214,22 @@ It is built for:
 
 image
 
-## 3️⃣ Deployment Layer
+## Deployment Layer
 - FastAPI-based inference endpoints
 - Containerized model serving
 - Production-ready deployment structure
 
-## 4️⃣ Workflow Orchestration
+## Workflow Orchestration
 - Airflow-integrated job scheduling
 - Modular DAG execution
 - Background task management
 - Retry & failure handling
+
+
 image
 
 ---
 
-## Example Workflow in ClickML
-
-image
-
 # Why ClickML Stands Out
 
 - Full ML lifecycle coverage