An AI-powered educational assistant that processes course materials and provides intelligent tutoring through a retrieval-augmented generation (RAG) system. The system automatically processes PDF documents, creates searchable indexes, and provides contextual answers with document citations.
This system consists of three main components:
1. **Document Processing Pipeline** (Azure Functions)
   - PDF chunking and text extraction
   - Vector embedding generation
   - Azure AI Search indexing
2. **AI Chat Backend** (FastAPI)
   - RAG implementation with Azure OpenAI
   - Semantic search across documents
   - Educational prompt engineering
3. **Frontend Interface** (Next.js)
   - Chat interface for student interactions
   - Document reference display
   - Course material navigation
## Prerequisites

- Node.js 18.x or later
- Azure subscription with the following services:
  - Azure Functions
  - Azure Blob Storage
  - Azure AI Search
  - Azure OpenAI
- Python 3.9+ (for the FastAPI backend)
## Azure Setup

```bash
# Create resource group
az group create --name ai-course-support --location eastus

# Create storage account
az storage account create --name <storage-name> --resource-group ai-course-support --location eastus --sku Standard_LRS

# Create Azure AI Search service
az search service create --name <search-name> --resource-group ai-course-support --location eastus --sku Basic

# Create Azure OpenAI service
az cognitiveservices account create --name <openai-name> --resource-group ai-course-support --location eastus --kind OpenAI --sku S0
```

Then create two blob containers in the storage account:

- `students-tools` (for original PDF uploads)
- `students-tools-chunked` (for processed document chunks)
## Local Development

### Backend API (FastAPI)

```bash
# Navigate to API directory
cd src/api

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Create .env file with your Azure credentials
cp .env.example .env
# Edit .env with your Azure service credentials

# Start the FastAPI server
python main.py
```

### Azure Functions

```bash
# Install Azure Functions Core Tools
npm install -g azure-functions-core-tools@4 --unsafe-perm true

# Install dependencies
npm install

# Copy and configure local settings
cp local.settings.example.json local.settings.json
# Edit local.settings.json with your Azure credentials

# Start functions locally
npm start
```

### Frontend

```bash
# Navigate to frontend directory
cd src/frontend

# Install dependencies
npm install

# Start development server
npm run dev
```

## Configuration

### Azure Functions (`local.settings.json`)

```json
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "<storage-connection-string>",
    "FUNCTIONS_WORKER_RUNTIME": "node",
    "AZURE_SEARCH_ENDPOINT": "https://<search-name>.search.windows.net",
    "AZURE_SEARCH_API_KEY": "<search-admin-key>",
    "AZURE_SEARCH_INDEX_NAME": "document-chunks",
    "AZURE_OPENAI_ENDPOINT": "https://<openai-name>.openai.azure.com",
    "AZURE_OPENAI_API_KEY": "<openai-api-key>",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT": "text-embedding-ada-002",
    "PAGES_PER_CHUNK": "10"
  }
}
```

### Backend API (`.env`)

```env
AZURE_SEARCH_ENDPOINT=https://<search-name>.search.windows.net
AZURE_SEARCH_API_KEY=<search-admin-key>
AZURE_SEARCH_INDEX_NAME=document-chunks
AZURE_OPENAI_ENDPOINT=https://<openai-name>.openai.azure.com
AZURE_OPENAI_API_KEY=<openai-api-key>
AZURE_OPENAI_COMPLETION_DEPLOYMENT=gpt-4o-mini
ALLOWED_ORIGINS=http://localhost:3000
```

### Frontend environment

```env
NEXT_PUBLIC_API_URL=http://localhost:8000
```

## Architecture

### PDF Chunker Function

- Trigger: Blob storage events when PDFs are uploaded
- Process: Splits PDFs into overlapping chunks, extracts text
- Output: Chunked PDFs and metadata in separate container
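Conceptually, overlapping chunks come from sliding a fixed-size page window across the document. The actual logic lives in `PDFChunker.ts`; the sketch below is an illustrative Python version, and the two-page overlap is an assumption (the overlap size is not specified here):

```python
def chunk_page_ranges(total_pages: int, pages_per_chunk: int = 10, overlap: int = 2):
    """Compute overlapping (start, end) page ranges, 1-indexed inclusive.

    The overlap value is an illustrative assumption, not the real setting.
    """
    if total_pages <= 0:
        return []
    step = max(pages_per_chunk - overlap, 1)  # advance less than a full window
    ranges = []
    start = 1
    while True:
        end = min(start + pages_per_chunk - 1, total_pages)
        ranges.append((start, end))
        if end == total_pages:  # last window reached the end of the document
            break
        start += step
    return ranges


# A 25-page PDF with PAGES_PER_CHUNK=10 yields three overlapping chunks:
# (1, 10), (9, 18), (17, 25)
print(chunk_page_ranges(25))
```

The overlap ensures that a sentence or concept split at a chunk boundary still appears whole in at least one chunk, which improves retrieval quality.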
### PDF Indexer Function

- Trigger: Blob storage events for chunked documents
- Process: Generates embeddings, creates search index
- Dependencies: Embeddings Generator, Search Index Manager
### Chat API (FastAPI)

- Endpoints: `/api/ChatCompletion`, `/health`
- Features: RAG with Azure AI Search, educational prompt engineering
- Response: AI answers with document citations
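At its core, the RAG step assembles the retrieved chunks into a grounded prompt that instructs the model to cite its sources. The following is a minimal sketch of such prompt assembly, not the actual code in `main.py`; the chunk field names (`source`, `pages`, `content`) are hypothetical:

```python
def build_rag_prompt(question: str, chunks: list) -> str:
    """Assemble a grounded tutoring prompt: numbered sources, then the question.

    Each chunk dict is assumed to look like
    {"source": "lecture1.pdf", "pages": "1-10", "content": "..."} --
    hypothetical field names for illustration.
    """
    sources = "\n\n".join(
        f"[{i}] {c['source']} (pages {c['pages']}):\n{c['content']}"
        for i, c in enumerate(chunks, start=1)
    )
    return (
        "You are a helpful course tutor. Answer using ONLY the sources below, "
        "and cite them as [1], [2], ... after each claim.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )
```

Because each source is numbered, the model's `[n]` citations can be mapped back to document names and page ranges when rendering references in the frontend.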
### Frontend (Next.js)

- Components: Chat interface, document panel, navigation
- Features: Real-time chat, document references, course selection
## Project Structure

```
├── src/
│   ├── functions/              # Azure Functions
│   │   ├── pdfchunker/         # PDF processing
│   │   │   └── PDFChunker.ts   # Main chunking function
│   │   ├── indexing/           # Document indexing
│   │   │   └── PDFIndexer.ts   # Indexing function
│   │   ├── services/           # Shared services
│   │   │   └── indexing/       # Search and embedding services
│   │   └── utils/              # Utility functions
│   ├── api/                    # FastAPI backend
│   │   ├── main.py             # Main API application
│   │   ├── requirements.txt    # Python dependencies
│   │   └── .env                # Environment variables
│   ├── frontend/               # Next.js frontend
│   │   ├── app/                # App router pages
│   │   ├── components/         # React components
│   │   ├── lib/                # Utilities and services
│   │   └── package.json        # Frontend dependencies
│   └── models/                 # Shared TypeScript models
├── host.json                   # Function app configuration
├── local.settings.json         # Local development settings
├── package.json                # Function dependencies
└── README.md                   # This file
```
## Data Flow

1. **Document Upload**: PDFs uploaded to the `students-tools` container
2. **Chunking**: Azure Function processes PDFs into chunks
3. **Indexing**: Another Function creates embeddings and the search index
4. **Query**: User asks a question through the web interface
5. **Search**: System performs a hybrid search on the indexed content
6. **Generate**: Azure OpenAI generates a response using the retrieved documents
7. **Display**: Response is shown with document citations
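The hybrid search step merges a keyword ranking and a vector-similarity ranking into one result list. Azure AI Search performs this fusion internally; the sketch below illustrates the idea with reciprocal rank fusion (RRF), a common technique for combining ranked lists:

```python
def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Merge several ranked lists of document ids into one, RRF-style.

    Each document scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 is the commonly used smoothing constant.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first
    return sorted(scores, key=scores.get, reverse=True)


keyword_hits = ["d1", "d2", "d3"]   # BM25-style keyword ranking
vector_hits = ["d2", "d4", "d1"]    # embedding-similarity ranking
# d2 ranks first overall: it places highly in both lists.
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

A document that appears near the top of both lists outranks one that dominates only a single list, which is why hybrid search tends to beat either keyword or vector search alone.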
## Deployment

### Azure Functions

```bash
# Build and deploy functions
npm run build
func azure functionapp publish <your-function-app-name>
```

### Backend API

```bash
# Build Docker image
docker build -t ai-course-api src/api/

# Deploy to Azure Container Instances
az container create --resource-group ai-course-support --name ai-course-api --image ai-course-api --cpu 1 --memory 2 --port 8000
```

### Frontend

```bash
# Build frontend
cd src/frontend
npm run build

# Deploy to Azure Static Web Apps
az staticwebapp create --name ai-course-frontend --resource-group ai-course-support --source . --branch main --app-location "src/frontend" --output-location ".next"
```

## Usage

1. **Upload Documents**: Place PDF course materials in the `students-tools` blob container
2. **Wait for Processing**: Functions automatically chunk and index the documents
3. **Start Chatting**: Use the web interface to ask questions about course content
4. **Review References**: Check the document citations provided with each response
## Customization

### Adding New Document Types

- Modify `PDFChunker.ts` to handle additional file formats
- Update the text extraction logic in the chunking function

### Adjusting Chunk Size

- Change the `PAGES_PER_CHUNK` environment variable
- Rebuild and redeploy the functions

### Tuning AI Behavior

- Edit the system prompt in `src/api/main.py`
- Adjust the OpenAI parameters (temperature, max_tokens)
## Monitoring

- Function Logs: View in the Azure Portal or Application Insights
- API Health: Check the `/health` endpoint
- Search Performance: Monitor Azure AI Search metrics
- OpenAI Usage: Track token consumption in Azure OpenAI
## Troubleshooting

- Functions not triggering: Check the blob storage connection strings
- Search returning no results: Verify that the index exists and contains data
- OpenAI errors: Check the API key and deployment names
- Frontend not connecting: Verify the API URL in the environment variables
### Debug Commands

```bash
# Check function logs
func start --verbose

# Test API health
curl http://localhost:8000/health

# Check the search index
curl -H "api-key: <search-key>" "https://<search-name>.search.windows.net/indexes/document-chunks?api-version=2023-11-01"
```

## Support

For issues and questions:

- Check the troubleshooting section above
- Review the Azure service logs
- Verify that all environment variables are set correctly
- Ensure all required Azure services are running
## Version History

- v1.0.0: Initial release with PDF processing, RAG chat, and web interface
- Current: Enhanced error handling, improved document citations, markdown support