IntelliSearch

Semantic search with hybrid ranking, RAG answers, and fast file ingestion.

Live Demo → https://intellisearch-eight.vercel.app

Highlights

  • ⚡ Processes 4,000+ documents
  • 🔍 3 search modes (Semantic, Hybrid, RAG)
  • 🤖 AI-powered answers with GPT-4o-mini
  • ⏱️ Sub-500ms retrieval speed
  • 💾 Redis caching for cost optimization

Architecture

User → React Frontend (Vercel)
         ↓
    FastAPI Backend (Railway)
         ↓
    ┌────┴────┬─────────┐
    ↓         ↓         ↓
OpenAI    Pinecone   Redis
(Embed)   (Vectors)  (Cache)

Tech Stack

Frontend

  • React
  • TypeScript
  • Tailwind CSS
  • Vite

Backend

  • Python
  • FastAPI
  • Uvicorn

AI/ML

  • OpenAI

Database + Cache

  • Pinecone
  • Redis

Deploy

  • Railway
  • Vercel

Features

  • Semantic search with OpenAI embeddings (text-embedding-3-small)
  • Hybrid search (vector similarity + keyword overlap)
  • RAG search with gpt-4o-mini for grounded answers
  • File uploads for PDF, DOCX, TXT, and Markdown
  • Chunking for large documents with parent/child metadata
  • Pinecone vector store with auto-index creation
  • Redis caching for embeddings and search responses (optional)
  • Batch ingestion and bulk upload script

Quick Start

  1. Backend
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
  2. Frontend
cd frontend
npm install
npm run dev
  3. Environment

Create backend/.env:

OPENAI_API_KEY=your_openai_key
PINECONE_API_KEY=your_pinecone_key
REDIS_URL=redis://localhost:6379

Optional frontend override:

VITE_API_BASE_URL=http://localhost:8000

API Documentation

Health

  • GET /health

Ingestion

  • POST /documents/ingest
    • Body: { "content": "text", "metadata": {} }
    • Query: chunk=true
  • POST /documents/batch-ingest
    • Body: [{"content": "...", "metadata": {...}}, ...]
    • Query: chunk=true

File Uploads

  • POST /documents/upload
    • Form file: file
    • Query: chunk=true
    • Query: metadata (JSON string)
  • POST /documents/upload-batch
    • Form files: files[]
    • Query: chunk=true
    • Query: metadata (JSON string)

Search

  • POST /search
    • Body: { "query": "text", "top_k": 5 }
  • POST /hybrid-search
    • Body: { "query": "text", "top_k": 5, "vector_weight": 0.7 }
  • POST /rag-search
    • Body: { "query": "text", "top_k": 3 }

Document Management

  • GET /documents
    • Query: limit (default 20), offset (default 0)
  • GET /documents/{id}
  • DELETE /documents/{id} (also deletes chunk children)

Bulk Upload Script

The script backend/bulk_upload.py generates synthetic content and uploads in batches.

cd backend
python bulk_upload.py

It targets the Railway production URL by default. Update API_URL in the script for local ingestion.

Deployment (Railway)

Docker-based deployment via Dockerfile + railway.json. The container runs:

uvicorn app.main:app --host 0.0.0.0 --port ${PORT:-8000}
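A Dockerfile along these lines would produce that command; the base image, Python version, and file layout shown here are assumptions, not the repository's actual Dockerfile.

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Railway injects PORT at runtime; fall back to 8000 for local runs.
CMD ["sh", "-c", "uvicorn app.main:app --host 0.0.0.0 --port ${PORT:-8000}"]

The shell-form CMD is what allows the ${PORT:-8000} expansion; the exec form without a shell would pass the literal string through to uvicorn.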

Required Railway environment variables:

  • OPENAI_API_KEY
  • PINECONE_API_KEY
  • REDIS_URL (optional)

Author

  • Name: Gurraj Singh
