
🚀 Scorix AI — AI Evaluation & Ranking Platform

Production-grade AI evaluation system that scores, ranks, and improves AI-generated responses using machine learning and real-time APIs.


🌟 Overview

Scorix AI is a full-stack AI benchmarking platform that simulates enterprise-level LLM evaluation systems.

It allows you to:

  • ⚡ Evaluate AI responses using ML models
  • 🧠 Rank multiple responses intelligently
  • 📊 Track evaluation logs & feedback
  • 🔁 Retrain models with new datasets

🎯 Key Features

🧪 AI Evaluation Engine

  • Scores responses using:

    • TF-IDF vectorization
    • Cosine similarity
    • Feature engineering
  • Output: 0–10 quality score
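The feature set above can be sketched end to end. This is a minimal illustration assuming scikit-learn; `extract_features` is a hypothetical helper, not the project's actual code:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def extract_features(prompt: str, response: str, vectorizer: TfidfVectorizer) -> list[float]:
    """Engineer the feature families listed above for one (prompt, response) pair."""
    tfidf = vectorizer.transform([prompt, response])
    # Cosine similarity between the prompt and response TF-IDF vectors.
    similarity = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
    prompt_words = set(prompt.lower().split())
    response_words = set(response.lower().split())
    # Keyword overlap: fraction of prompt words echoed in the response.
    overlap = len(prompt_words & response_words) / max(len(prompt_words), 1)
    return [similarity, float(len(response)), float(len(response_words)), overlap]

vectorizer = TfidfVectorizer(max_features=5000)
vectorizer.fit([
    "What is AI?",
    "AI is the simulation of human intelligence in machines.",
])
feats = extract_features(
    "What is AI?",
    "AI is the simulation of human intelligence in machines.",
    vectorizer,
)
print(feats)  # [similarity, response length, word count, keyword overlap]
```

These raw features would then be fed to the regressor that produces the 0–10 score.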


🏆 Ranking System

  • Compare multiple responses for a prompt
  • Returns ranked list based on score
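Ranking is essentially "score each candidate, sort descending". A minimal sketch, with a toy word-overlap scorer standing in for the ML model (`rank_responses` is a hypothetical name):

```python
def rank_responses(prompt: str, responses: list[str], score_fn) -> list[dict]:
    """Score every candidate response and return them sorted best-first."""
    scored = [(score_fn(prompt, r), r) for r in responses]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [{"rank": i + 1, "response": r, "score": round(s, 2)}
            for i, (s, r) in enumerate(scored)]

def toy_score(prompt: str, response: str) -> float:
    # Stand-in scorer: fraction of prompt words reused in the response.
    p = set(prompt.lower().split())
    return len(p & set(response.lower().split())) / max(len(p), 1)

ranked = rank_responses(
    "explain machine learning",
    ["no idea", "machine learning is learning from data"],
    toy_score,
)
print(ranked[0]["response"])  # prints the candidate with the most overlap
```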

📊 Feedback Learning (RLHF-style)

  • Store human feedback
  • Use for retraining and improvement

🔁 Dataset Upload + Retraining

  • Upload CSV dataset
  • Retrain model via API
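A retraining step of this shape would parse the uploaded CSV and refit the regressor. This is only a sketch: the column names (`prompt`, `response`, `score`) and the toy features are assumptions, not the project's actual schema.

```python
import io

import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Stand-in for an uploaded CSV file (columns assumed: prompt, response, score).
csv_bytes = b"""prompt,response,score
what is ai,ai is machine intelligence,8.5
what is ai,bananas,1.0
explain ml,ml learns patterns from data,7.0
explain ml,no comment,0.5
"""
df = pd.read_csv(io.BytesIO(csv_bytes))

def featurize(row: pd.Series) -> list[float]:
    # Toy features: response length + prompt-word overlap
    # (the real engine uses TF-IDF and cosine similarity).
    p, r = set(row["prompt"].split()), set(row["response"].split())
    return [float(len(row["response"])), len(p & r) / len(p)]

X = df.apply(featurize, axis=1, result_type="expand")
model = GradientBoostingRegressor(random_state=0).fit(X, df["score"])
print(f"refit on {len(df)} rows")
```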

🧠 ML Model

⚙️ Architecture

```
TF-IDF (5000 features)
+ Cosine Similarity
+ Length Features
+ Word Count
+ Keyword Overlap
→ Gradient Boosting Regressor
```
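Wiring that pipeline together in scikit-learn might look like the following sketch; the feature matrix here is synthetic, standing in for the engineered features:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
# Synthetic stand-in for the engineered features:
# [cosine similarity, response length, word count, keyword overlap]
X = rng.random((200, 4))
# Synthetic 0-10 quality scores, loosely driven by the similarity column.
y = np.clip(X[:, 0] * 10 + rng.normal(0.0, 0.5, 200), 0.0, 10.0)

model = GradientBoostingRegressor(n_estimators=100, random_state=0)
model.fit(X, y)
pred = model.predict(X[:1])[0]
print(f"predicted quality score: {pred:.2f}")
```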

📈 Model Performance

  • 🔥 R² Score: 0.9658
  • 🔥 MAE: 0.2558
  • 📊 Dataset Size: 50,000 samples
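For context, R² and MAE are the standard regression metrics and can be reproduced with scikit-learn (illustrated with toy numbers, not the project's data):

```python
from sklearn.metrics import mean_absolute_error, r2_score

# Toy ground-truth and predicted quality scores on the 0-10 scale.
y_true = [8.0, 3.0, 9.5, 5.0]
y_pred = [7.8, 3.4, 9.2, 5.1]

print(f"R2 = {r2_score(y_true, y_pred):.4f}")               # closer to 1.0 is better
print(f"MAE = {mean_absolute_error(y_true, y_pred):.4f}")   # mean |error| in score units
```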

🧠 System Architecture

```
Frontend (Vanilla JS)
        ↓
FastAPI Backend
        ↓
ML Model (TF-IDF + Gradient Boosting)
        ↓
Evaluation + Ranking Engine
        ↓
Database (SQLite)
```

🛠️ Tech Stack

| Layer | Technology |
|---|---|
| Backend | FastAPI |
| ML Model | scikit-learn |
| NLP | TF-IDF + Cosine Similarity |
| Database | SQLite |
| ORM | SQLAlchemy |
| Frontend | HTML, CSS, Vanilla JS |
| Server | Uvicorn |

⚡ Quick Start

1️⃣ Install dependencies

```bash
pip install -r requirements.txt
```

2️⃣ Run the server

```bash
python -m uvicorn backend.main:app --reload
```

3️⃣ Open

Visit http://127.0.0.1:8000 in your browser (Uvicorn's default address).

📡 API Endpoints

🔹 Evaluate Response

POST /evaluate

```json
{
  "prompt": "What is AI?",
  "response": "AI is the simulation of human intelligence in machines."
}
```

🔹 Rank Responses

POST /rank

```json
{
  "prompt": "What is AI?",
  "responses": [
    "AI is computer intelligence.",
    "AI is the simulation of human intelligence in machines.",
    "AI is random."
  ]
}
```

🔹 Submit Feedback

POST /feedback

🔹 Upload Dataset + Retrain

POST /upload-dataset

🔹 Health Check

GET /health


🧠 What Makes This Special

✔ End-to-end ML pipeline (dataset → training → deployment)
✔ Real-time scoring & ranking APIs
✔ Feature-engineered ML model (not just a black box)
✔ RLHF-style feedback system


🚀 Future Improvements

  • 🔁 Reinforcement learning (RLHF loop)
  • 📊 Model comparison leaderboard
  • ⚡ Async batch evaluation
  • 🧠 Deep learning-based scoring

👨‍💻 Author

Shivansh Thakur · LinkedIn


⭐ Support

If you like this project, give it a ⭐ on GitHub 🚀
