# Vulcan Lab – Entrance Exam Project
🎥 Video Demo (Don't worry, I deleted the API key :3): Click here
A modular, memory-augmented chatbot system built with LLMs, structured outputs, and a vector database for long-term conversational context.
## Overview

This project implements a stateful LLM chatbot with:
- Structured outputs (JSON Schema–enforced)
- Short-term & long-term memory
- Context augmentation via vector search
- Scalable multi-user / multi-chat architecture
The system periodically summarizes conversations and stores them in a vector database (Milvus), enabling retrieval and augmentation for future queries.
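The summarize-and-store loop described above can be sketched as follows. This is a minimal, self-contained stand-in: the `SessionMemoryStore` class and the character-histogram `embed` function are placeholders for illustration only; the real system uses Milvus (via the pymilvus client) and a 384-dimensional sentence-embedding model.

```python
import math

# Hypothetical in-memory stand-in for the Milvus session-memory collection;
# in the real system, inserts and searches go through the pymilvus client.
class SessionMemoryStore:
    def __init__(self):
        self._rows = []  # each row: (embedding, summary_text)

    def insert(self, embedding, summary):
        self._rows.append((embedding, summary))

    def search(self, query_embedding, topk=5):
        # Cosine similarity, matching the COSINE metric in the config below.
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0

        ranked = sorted(self._rows,
                        key=lambda r: cos(query_embedding, r[0]),
                        reverse=True)
        return [summary for _, summary in ranked[:topk]]


def embed(text):
    # Placeholder embedding (letter histogram); the real system would use a
    # 384-dimensional sentence embedding instead.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


# Store two session summaries, then retrieve the most relevant one
# to augment a future query.
store = SessionMemoryStore()
store.insert(embed("user prefers concise answers"), "User prefers concise answers.")
store.insert(embed("project deadline is friday"), "Project deadline is Friday.")

hits = store.search(embed("user prefers concise answers"), topk=1)
print(hits[0])  # the matching summary ranks first
```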
## Architecture

- LLM Provider: Groq API
- Vector Database: Milvus
- Memory Types:
  - Short-term: current context window
  - Long-term: summarized session memory (vectorized)
- Core Components:
  - Query understanding (ambiguity detection, query rewriting)
  - Context augmentation
  - Structured session summarization
  - Grounded answer generation
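How the core components fit together can be sketched as a three-stage pipeline. The function names and the canned `stub_llm` below are illustrative assumptions, not the project's actual API; the real system sends these prompts to the Groq API with a JSON Schema attached.

```python
import json

# Hypothetical pipeline sketch: understanding -> augmentation -> generation.
def understand_query(query, llm):
    """Ambiguity detection + query rewriting, returned as structured JSON."""
    return json.loads(llm(f"Analyze: {query}"))

def augment_context(understood, memory):
    """Pull only the memory fields the understanding step asked for."""
    return {field: memory.get(field, [])
            for field in understood["needed_context_from_memory"]}

def answer(query, context, llm):
    """Grounded generation: the prompt carries the retrieved context."""
    return llm(f"Context: {json.dumps(context)}\nQuestion: {query}")

# Canned stub so the sketch runs without an API key.
def stub_llm(prompt):
    if prompt.startswith("Analyze:"):
        return json.dumps({
            "is_ambiguous": False,
            "rewritten_query": "what are my preferences?",
            "needed_context_from_memory": ["user_profile.prefs"],
        })
    return "You prefer concise answers."

memory = {"user_profile.prefs": ["concise answers"]}
understood = understand_query("what do I like?", stub_llm)
context = augment_context(understood, memory)
reply = answer("what do I like?", context, stub_llm)
print(reply)  # -> You prefer concise answers.
```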
## Requirements

- Docker & Docker Compose
- Python 3.10+
- Groq API key
## Setup

Start Milvus with Docker Compose:

```bash
docker compose up -d
```

Create a `.env` file in the project root:

```
GROQ_API_KEY=your_groq_api_key_here
```

Create a free Groq API key at:
https://console.groq.com/keys
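Once the `.env` file is loaded (e.g. via python-dotenv, an assumption here), the key ends up as a plain environment variable, so a fail-fast lookup like the following sketch is enough:

```python
import os

def get_groq_api_key():
    """Read the key from the environment; fail early with a clear message."""
    key = os.environ.get("GROQ_API_KEY")
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; create .env as shown above")
    return key

# Stand-in for loading .env, so the sketch runs on its own.
os.environ.setdefault("GROQ_API_KEY", "demo-key")
print(get_groq_api_key())
```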
## Running

Make the launch script executable and run it:

```bash
chmod u+x ./run.sh
./run.sh
```

Or run directly:

```bash
python -m src.main --config ./configs/app.yaml
```

## Structured Outputs

Example output of the query-understanding step:

```json
{
  "original_query": "...",
  "is_ambiguous": true,
  "rewritten_query": "...",
  "needed_context_from_memory": [
    "user_profile.prefs",
    "open_questions"
  ],
  "clarifying_questions": [],
  "final_augmented_context": {}
}
```

Example structured session summary:

```json
{
  "session_summary": {
    "user_profile": {
      "prefs": [],
      "constraints": []
    },
    "key_facts": [],
    "decisions": [],
    "open_questions": [],
    "todos": []
  },
  "message_range_summarized": {
    "from": 0,
    "to": 42
  }
}
```

## Notes

- The chatbot enforces structured LLM outputs via JSON Schema
- The architecture supports horizontal scalability through `user_id` and `chat_id`
- Long conversations are supported via vectorized session memory
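A minimal sketch of what "JSON Schema–enforced" means in practice: the model's raw text is parsed and rejected if required fields are missing. The hand-rolled required-keys check below is a simplification; the real system may rely on the provider's structured-output mode or a full validator such as `jsonschema`.

```python
import json

# Required top-level keys of the query-understanding output shown above.
REQUIRED_KEYS = {
    "original_query", "is_ambiguous", "rewritten_query",
    "needed_context_from_memory", "clarifying_questions",
    "final_augmented_context",
}

def parse_structured_output(raw_text):
    """Parse LLM output and reject responses missing required fields."""
    data = json.loads(raw_text)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"schema violation, missing keys: {sorted(missing)}")
    return data

# A conforming response passes through...
good = json.dumps({k: None for k in REQUIRED_KEYS})
parsed = parse_structured_output(good)

# ...while an incomplete one is rejected before it can corrupt state.
try:
    parse_structured_output('{"original_query": "hi"}')
except ValueError as e:
    print("rejected:", e)
```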
## Limitations

- Ambiguous query classification relies solely on the LLM (no dedicated classifier yet)
- Context input size can grow large and needs further optimization
- Context augmentation is currently prompt-based
## Configuration

Example `configs/app.yaml`:

```yaml
# User info
chat_id: "001"
user_id: "user_123"

# App config
model_name: "openai/gpt-oss-120b"
chat_history_path: "chatbot_logs/"
reload: true
chatbot_temperature: 0.2
max_completion_tokens: 500
max_context_length: 1000

# Database config
uri: "http://localhost:19530"
token: ""
db_name: "chatbot_db"
session_collection_name: "session_memory"
chat_logs_collection_name: "chat_logs"
context_window_collection_name: "context_window"
embedding_dimension: 384
index_type: "IVF_FLAT"
metric_type: "COSINE"
nlist: 128
nprobe: 10
topk: 5
```

## References

- Groq API Keys: https://console.groq.com/keys
- Milvus Quickstart: https://milvus.io/docs/quickstart.md
- RAG Context Refinement Agent: https://devpost.com/software/rag-context-refinement-agent
- LangChain Groq Integration: https://github.com/langchain-ai/langchain/tree/master/libs/partners/groq
