YelveTejas/rag-project

Insight Stream

Insight Stream is a private AI research workspace built with Next.js. It gives signed-in users a chat interface for asking questions, attaching images, saving research threads, and grounding answers with local document context or web search fallback.

Features

  • OAuth sign-in with Google and GitHub through Auth.js / NextAuth.
  • Account-scoped chat history stored in MongoDB.
  • Streaming AI responses from Groq for fast, live answer generation.
  • Image-aware prompts with client-side image preview and a 2 MB upload guard.
  • Retrieval-augmented generation using MongoDB vector search over ingested documents.
  • Web search fallback through Tavily when local context is missing or too thin.
  • URL ingestion endpoint that scrapes page text, chunks it, embeds it with Hugging Face, and stores it in MongoDB.
  • Responsive chat UI with a collapsible sidebar, markdown-rendered responses, new-chat flow, chat deletion, and sign-out confirmation.
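The ingestion feature chunks scraped page text before embedding it. This README does not state the chunk size or whether chunks overlap, so the following is only a minimal sketch assuming fixed-size character windows with a small overlap. The name chunkText and both default values are hypothetical, not taken from the project.

```javascript
// Hypothetical helper: split text into fixed-size character windows.
// Consecutive windows share `overlap` characters, so text near a boundary
// appears in both neighbouring chunks.
// chunkSize and overlap are illustrative defaults, not the project's values.
function chunkText(text, chunkSize = 500, overlap = 50) {
  if (overlap >= chunkSize) {
    throw new RangeError("overlap must be smaller than chunkSize");
  }
  const chunks = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

For example, chunkText("abcdefghij", 4, 1) returns ["abcd", "defg", "ghij"]. Each chunk would then be embedded and stored separately.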

Tech Stack

  • Next.js 16 App Router
  • React 19
  • Tailwind CSS 4
  • Auth.js / NextAuth
  • MongoDB
  • Groq SDK
  • Hugging Face Inference API
  • Tavily Search API
  • Framer Motion
  • Lucide React

Project Structure

app/
  api/
    auth/[...nextauth]/route.js   Auth route handlers
    chat-history/route.js         Fetch saved chats
    chat-history/[id]/route.js    Delete a saved chat
    ingest/route.js               Ingest URL content into the vector store
    research/route.js             Main streamed chat/research endpoint
  layout.jsx
  page.jsx

components/
  ChatBox.jsx                     Main authenticated chat experience
  MessageBubble.jsx               Markdown chat message renderer
  InputBox.jsx
  ThoughtProcess.jsx

lib/
  agent.js                        Chooses local RAG context or web search fallback
  embeddings.js                   Hugging Face embedding helper
  mongo.js                        MongoDB connection helper
  retriever.js                    MongoDB vector search
  searchWeb.js                    Tavily search helper
  ingestion/                      URL scrape and document storage helpers

Requirements

  • Node.js 20.9 or newer
  • npm
  • MongoDB database, preferably MongoDB Atlas if you want vector search
  • OAuth apps for Google and GitHub
  • API keys for Groq, Tavily, and Hugging Face

Environment Variables

Create .env.local in the repository root. You can start from .env.example, then add the remaining keys used by the app:

AUTH_SECRET=
AUTH_GOOGLE_ID=
AUTH_GOOGLE_SECRET=
AUTH_GITHUB_ID=
AUTH_GITHUB_SECRET=

MONGODB_URI=
GROQ_API_KEY=
TAVILY_API_KEY=
HUGGINGFACE_API_KEY=

# Optional: used by lib/generate.js if that helper is wired in
GEMINI_API_KEY=

Generate an Auth.js secret with:

npx auth secret

For local OAuth development, configure these callback URLs in the provider dashboards:

http://localhost:3000/api/auth/callback/google
http://localhost:3000/api/auth/callback/github
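
A missing key usually surfaces late, as a failed sign-in or a failed API call. One way to catch it earlier is a small startup check like the sketch below. The variable names come from the list above. The helper findMissingEnv is hypothetical and not part of the repository.

```javascript
// Required names taken from the variable list in this README.
// GEMINI_API_KEY is optional, so it is left out.
const REQUIRED_ENV_VARS = [
  "AUTH_SECRET",
  "AUTH_GOOGLE_ID",
  "AUTH_GOOGLE_SECRET",
  "AUTH_GITHUB_ID",
  "AUTH_GITHUB_SECRET",
  "MONGODB_URI",
  "GROQ_API_KEY",
  "TAVILY_API_KEY",
  "HUGGINGFACE_API_KEY",
];

// Hypothetical helper: return the names of required variables that are
// unset or contain only whitespace.
function findMissingEnv(env = process.env) {
  return REQUIRED_ENV_VARS.filter(
    (name) => String(env[name] ?? "").trim() === ""
  );
}
```

A script could call findMissingEnv() before starting the server and print the returned names if the array is not empty.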

MongoDB Setup

The app uses the rag_db database with two collections:

  • documents: stores ingested text chunks and their embeddings.
  • chats: stores user-specific chat histories.

For retrieval to work, create a vector search index named default on the documents collection, covering the embedding field. The current embedding model is sentence-transformers/all-MiniLM-L6-v2, which returns 384-dimensional vectors, so the index must be defined with 384 dimensions.
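
As a rough illustration, the index and a matching query could look like the sketch below. The index name, field path, and dimension count come from this README. The cosine similarity setting, the text field name, the numCandidates multiplier, and the helper buildVectorSearchPipeline are assumptions, so check them against lib/retriever.js before relying on them.

```javascript
// Index definition in the Atlas Vector Search format.
// "cosine" is an assumption; the README does not name the similarity metric.
const vectorIndexDefinition = {
  fields: [
    {
      type: "vector",
      path: "embedding",
      numDimensions: 384,
      similarity: "cosine",
    },
  ],
};

// Hypothetical helper: build an aggregation pipeline for the documents
// collection. The projected "text" field name is an assumption.
function buildVectorSearchPipeline(queryVector, limit = 5) {
  if (queryVector.length !== 384) {
    throw new RangeError("query vector must have 384 dimensions");
  }
  return [
    {
      $vectorSearch: {
        index: "default",
        path: "embedding",
        queryVector,
        numCandidates: limit * 20, // candidates considered before ranking
        limit,
      },
    },
    { $project: { _id: 0, text: 1, score: { $meta: "vectorSearchScore" } } },
  ];
}
```

The resulting array can be passed to the collection's aggregate() method, with queryVector coming from the same embedding model that was used at ingestion time.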

Getting Started

Install dependencies:

npm install

Run the development server:

npm run dev

Open the app:

http://localhost:3000

Useful Scripts

npm run dev      # Start the local Next.js development server
npm run build    # Build for production
npm run start    # Start the production server
npm run lint     # Run ESLint

API Overview

  • POST /api/research: accepts chat messages and an optional chatId, retrieves context, streams a Groq response, and saves the updated conversation.
  • GET /api/chat-history: returns the 10 most recent chats for the signed-in user.
  • DELETE /api/chat-history/:id: deletes one chat owned by the signed-in user.
  • POST /api/ingest: accepts a source object such as { "type": "url", "value": "https://example.com" }, scrapes the page, embeds chunks, and stores them.
  • /api/auth/*: Auth.js routes for Google and GitHub sign-in.
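
The sketch below shows how a client might call two of these endpoints. It assumes the source object is sent as the JSON body of POST /api/ingest exactly as shown above, and that /api/research streams plain text rather than server-sent events. Both helper names are hypothetical.

```javascript
// Hypothetical helper: build fetch options for POST /api/ingest.
// Assumes the source object itself is the JSON request body.
function buildIngestRequest(url) {
  const parsed = new URL(url); // throws TypeError on malformed input
  if (!/^https?:$/.test(parsed.protocol)) {
    throw new TypeError("only http and https URLs can be ingested");
  }
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ type: "url", value: url }),
  };
}

// Hypothetical helper: read a streamed text response piece by piece.
// onChunk receives each decoded piece; the full text is returned at the end.
async function readTextStream(response, onChunk) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let full = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunk borders.
    const piece = decoder.decode(value, { stream: true });
    full += piece;
    onChunk(piece);
  }
  const tail = decoder.decode(); // flush any buffered bytes
  if (tail) {
    full += tail;
    onChunk(tail);
  }
  return full;
}
```

A caller could run fetch("/api/ingest", buildIngestRequest("https://example.com")) to ingest a page, and pass the response of a POST to /api/research into readTextStream to append text to the UI as it arrives.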

How Answers Are Generated

  1. The user sends a text prompt or an image prompt from the chat UI.
  2. /api/research verifies the session and extracts the latest user message.
  3. lib/agent.js tries MongoDB vector search through lib/retriever.js.
  4. If local context is missing or too thin, the app falls back to Tavily web search.
  5. Groq streams the final answer back to the client.
  6. The completed exchange is saved to the signed-in user's chat history.
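
The choice between local context and web search in steps 3 and 4 can be sketched as follows. This is not the code in lib/agent.js. The function name gatherContext, the injected retrieve and searchWeb callbacks, and the minChars threshold are all assumptions made for illustration.

```javascript
// Hypothetical sketch: prefer local RAG context, fall back to web search.
// retrieve(query) is assumed to resolve to an array of text chunks;
// searchWeb(query) is assumed to resolve to a string of search results.
// minChars is an invented threshold for "too thin".
async function gatherContext(query, { retrieve, searchWeb, minChars = 200 }) {
  let localChunks = [];
  try {
    localChunks = await retrieve(query);
  } catch {
    localChunks = []; // treat a failed vector search like an empty result
  }
  const localText = localChunks.join("\n\n").trim();
  if (localText.length >= minChars) {
    return { source: "local", context: localText };
  }
  const webText = await searchWeb(query);
  return { source: "web", context: webText };
}
```

Passing the two lookups in as arguments keeps the decision logic testable without a MongoDB connection or a Tavily key.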

Notes

  • .env.local contains secrets and should not be committed.
  • Verify important answers against their sources, especially when the answer relies on web search results.
  • The app currently stores image data as part of the chat message payload, so keep image sizes small.
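
To make the last note concrete, the sketch below estimates how many bytes an attached image occupies. It assumes images travel as base64 data URLs, which this README does not confirm, and it reads the 2 MB guard as 2 * 1024 * 1024 bytes. Both helper names are hypothetical.

```javascript
// Assumption: the 2 MB upload guard means 2 * 1024 * 1024 bytes.
const MAX_IMAGE_BYTES = 2 * 1024 * 1024;

// Hypothetical helper: decoded size in bytes of a base64 data URL.
function dataUrlByteLength(dataUrl) {
  const comma = dataUrl.indexOf(",");
  if (comma === -1 || !dataUrl.slice(0, comma).endsWith(";base64")) {
    throw new TypeError("expected a base64 data URL");
  }
  const b64 = dataUrl.slice(comma + 1);
  // Every 4 base64 characters encode 3 bytes; "=" padding encodes none.
  const padding = b64.endsWith("==") ? 2 : b64.endsWith("=") ? 1 : 0;
  return Math.floor((b64.length * 3) / 4) - padding;
}

// Hypothetical helper: true when the decoded image fits under the limit.
function isImageSmallEnough(dataUrl, maxBytes = MAX_IMAGE_BYTES) {
  return dataUrlByteLength(dataUrl) <= maxBytes;
}
```

Base64 text is about one third larger than the binary it encodes, so an image right at the 2 MB limit adds roughly 2.7 MB of text to the stored message.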
