Skip to content

AustinFlippo/POC-for-RAG-based-Course-Assistant

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Proof of Concept for a RAG-based Course Assistant

As a freshman at UC San Diego, I thought that the official course catalog wasn't the greatest resource when planning courses I wanted to take. This was a shared opinion among my peers.

To explore a solution, I developed a proof-of-concept chatbot using a Retrieval-Augmented Generation (RAG) system. I used a document I knew well, the course catalog from my local community college (El Camino College) as the initial dataset.

This POC evolved into the Course Assistant in the TritonPlanner app (tritonplanner.com). My team and I scaled the concept by webscraping data from UCSD's course catalogs, department webpages, and other websites, transforming all course information into vector embeddings to use in a RAG-based chatbot, helping students quickly find answers to any question about UCSD courses, prerequisites, and professors.

Requirements

Python Packages

PyPDF2==3.0.1
torch>=2.0.0
ollama>=0.1.7
openai>=1.0.0

Install Dependencies

Setup

  • ollama pull mxbai-embed-large
  • ollama pull dolphin-llama3
  • ollama serve

Usage

  1. Use provided pdf path (course_catalog.pdf), or type in your own pdf path.
  2. Run cells in order

Files Structure

RAG_FOR_PDF/
├── requirements.txt
├── pdf_upload.py          # PDF processing script
├── rag_chat.py            # Chat interface script  
├── chunked.txt            # Processed document 
└── README.md

How It Works

  1. PDF Processing: Extracts text and splits into 1000-character chunks
  2. Embeddings: Uses mxbai-embed-large to create vector representations
  3. Retrieval: Finds most relevant chunks using cosine similarity
  4. Generation: Uses dolphin-llama3 with context to answer questions (conversation history is maintained)

About

This is a proof-of-concept notebook using course catalog text, transformed into vectors, to power a RAG-based chatbot.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors