A University Chatbot for the University of Massachusetts Lowell that can answer a variety of questions about the university.
- Retrieval-Augmented Generation (RAG)
- GitHub Actions CI
- Kubernetes
- Longhorn
- Ollama
- Milvus
- Open WebUI
- PostgreSQL on K8s
- Redis on K8s
- AWS SSM Parameter Store
The deployment is split between Kubernetes and a VM.
The inference model, embedding model, and Docling chunker run directly on the VM, which exposes an Ollama API endpoint and a docling-serve API endpoint for the rest of the system to call.
Milvus is deployed on K8s as a cluster.
Open WebUI is currently deployed as part of a custom Helm chart.
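As a rough illustration of how a service might call the embedding model on the VM, here is a minimal sketch against Ollama's `/api/embed` endpoint. The base URL (Ollama's default port on localhost), the model name `nomic-embed-text`, and the helper names `build_embed_request` and `embed` are assumptions for illustration only; this is not the project's actual client code.

```python
import json
import urllib.request

# Assumptions: the VM address and embedding model below are placeholders.
OLLAMA_URL = "http://localhost:11434"
EMBED_MODEL = "nomic-embed-text"


def build_embed_request(texts, base_url=OLLAMA_URL, model=EMBED_MODEL):
    """Build (but do not send) a POST request for Ollama's /api/embed endpoint."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, **kwargs):
    """Send the request and return one embedding vector per input text."""
    request = build_embed_request(texts, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["embeddings"]
```

Calling `embed(["What are the library hours?"])` requires a reachable Ollama instance with the chosen model pulled; `build_embed_request` can be inspected without any network access.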
The design is intended to be fast and scalable. It leverages asynchronous API calls, multi-threading, and a producer-consumer approach with thread-safe queues.
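The producer-consumer pattern mentioned above can be sketched with Python's standard library. This is a generic illustration, not the project's implementation: `run_pipeline`, `producer`, `consumer`, and the `process` callback are hypothetical names, and the worker count and queue size are arbitrary.

```python
import queue
import threading

# Unique marker telling a consumer thread that no more work is coming.
_SENTINEL = object()


def producer(documents, work_queue, n_consumers):
    """Put every document on the queue, then one sentinel per consumer."""
    for doc in documents:
        work_queue.put(doc)  # blocks when the bounded queue is full
    for _ in range(n_consumers):
        work_queue.put(_SENTINEL)


def consumer(work_queue, results, process):
    """Take items off the queue and process them until a sentinel arrives."""
    while True:
        item = work_queue.get()
        if item is _SENTINEL:
            break
        results.put(process(item))


def run_pipeline(documents, process, n_consumers=4, max_queue=16):
    """Process documents with a pool of consumer threads; result order is not guaranteed."""
    work_queue = queue.Queue(maxsize=max_queue)  # thread-safe, bounded
    results = queue.Queue()  # thread-safe, unbounded
    workers = [
        threading.Thread(target=consumer, args=(work_queue, results, process))
        for _ in range(n_consumers)
    ]
    for worker in workers:
        worker.start()
    producer(documents, work_queue, n_consumers)
    for worker in workers:
        worker.join()
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected
```

For example, `sorted(run_pipeline(["a", "bb", "ccc"], len))` returns `[1, 2, 3]`. The bounded work queue applies back-pressure so a fast producer cannot outrun slow consumers, which is one common reason to choose this pattern for chunking and embedding workloads.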
- Scale Open WebUI horizontally
- Install PostgreSQL on K8s as ArgoCD app
- Install Redis on K8s as ArgoCD app
- Create PVC of persistent shared storage for Open WebUI replicas
- Configure and install Gateway API Controller and HTTPRoute on K8s as ArgoCD app
- Update the docs on the website
Tangential:
- ~~Create a separate VM or K8s cluster and install HashiCorp Vault for on-prem key management~~
- Gurpreet Singh
- Nick Bottari

