RoleSense is a Retrieval-Augmented Generation (RAG) chatbot with role-based access control. It combines document ingestion, vector embeddings, and LLM integration to give context-aware answers, and it restricts the content each user can retrieve according to their role.
- Role-Based Access Control: Multi-tier permission system (CEO, Manager, Employee, Viewer)
- Document Ingestion: Support for multiple document formats (Markdown, CSV, Text)
- Vector Search: Semantic document retrieval using Chroma vector database
- LLM Integration: Powered by LLaMA 2 or OpenAI for intelligent responses
- Audit Logging: Comprehensive security audit trail for all operations
- Streamlit UI: User-friendly web interface for chatbot interaction
- Document Versioning: Track and manage multiple document versions
- Security Filtering: Content filtering based on user roles
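To make the Vector Search feature concrete, the sketch below ranks documents by cosine similarity and keeps the top k. It is a brute-force, pure-Python stand-in for what a vector database such as Chroma provides, not RoleSense's actual code. The names `cosine_similarity` and `top_k`, the document IDs, and the toy three-dimensional vectors are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query: List[float], index: Dict[str, List[float]], k: int = 5) -> List[Tuple[str, float]]:
    """Score every indexed vector against the query and return the k best matches."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical values; real embeddings have hundreds of dimensions).
toy_index = {
    "finance/q4_report": [0.9, 0.1, 0.0],
    "hr/handbook": [0.0, 1.0, 0.0],
    "general/announcements": [0.1, 0.2, 0.9],
}
results = top_k([1.0, 0.0, 0.0], toy_index, k=2)
print(results[0][0])
```

A real deployment would replace the linear scan with the vector store's own index, since scoring every document does not scale well beyond small collections.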
RoleSense/
├── app/
│   ├── main.py                   # Main application entry point
│   ├── schemas/                  # Pydantic data models
│   └── services/
│       ├── audit_logger.py       # Audit logging service
│       ├── document_loader.py    # Document ingestion
│       ├── embedding_service.py  # Vector embeddings
│       ├── ingest.py             # Ingestion pipeline
│       ├── llm_service.py        # LLM integration
│       └── vector_store.py       # Vector database operations
├── resources/
│   └── data/                     # Document resources
│       ├── engineering/          # Engineering documents
│       ├── finance/              # Financial documents
│       ├── general/              # General documents
│       ├── hr/                   # HR documents
│       └── marketing/            # Marketing documents
├── streamlit_app.py              # Streamlit web interface
├── requirements.txt              # Python dependencies
├── pyproject.toml                # Project configuration
└── README.md                     # This file
- Python 3.8+
- pip or conda
- Git
- Clone the repository:
  git clone https://github.com/AkshuRaj/RoleSense.git
  cd RoleSense
- Create a virtual environment:
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Configure environment variables:
  cp .env.example .env
  # Edit .env with your LLM API keys and settings
To start the Streamlit web interface:
streamlit run streamlit_app.py

To run the main application directly:
python app/main.py

Documents are loaded from the resources/data/ directory, organized by category:
- Place documents in appropriate category folders
- Documents are automatically indexed and embedded
- Supported formats: .md, .csv, .txt
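The folder layout above can be walked with a few lines of standard-library code. The sketch below only shows how documents might be discovered per category; it is not the project's `document_loader.py`, and the function name `discover_documents` is a hypothetical helper introduced for illustration.

```python
from pathlib import Path
from typing import Dict, List

# Extensions taken from the supported formats listed above.
SUPPORTED_EXTENSIONS = {".md", ".csv", ".txt"}


def discover_documents(data_dir: Path) -> Dict[str, List[Path]]:
    """Map each category folder name to the supported files found inside it."""
    found: Dict[str, List[Path]] = {}
    if not data_dir.is_dir():
        return found  # nothing to index if the data directory is missing
    for category_dir in sorted(p for p in data_dir.iterdir() if p.is_dir()):
        files = sorted(
            f for f in category_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in SUPPORTED_EXTENSIONS
        )
        found[category_dir.name] = files
    return found


if __name__ == "__main__":
    for category, files in discover_documents(Path("resources/data")).items():
        print(f"{category}: {len(files)} document(s)")
```

Keeping the category name alongside each file is useful later, because the category is what role-based filtering would check.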
| Role | Access Level | Permissions |
|---|---|---|
| CEO | Full | All documents, all operations |
| Manager | High | Department documents, reports |
| Employee | Medium | General and HR documents |
| Viewer | Low | General documents only |
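One way to read the table above is as a mapping from roles to the document categories they may see. The following is a minimal sketch under stated assumptions, not RoleSense's implementation: `allowed_categories` and `can_access` are hypothetical names, and "department documents" is interpreted as the manager's own department folder plus general documents.

```python
from typing import Optional, Set

# Category names mirror the folders under resources/data/.
ALL_CATEGORIES: Set[str] = {"engineering", "finance", "general", "hr", "marketing"}


def allowed_categories(role: str, department: Optional[str] = None) -> Set[str]:
    """Return the document categories a role may read, following the role table."""
    role = role.lower()
    if role == "ceo":
        return set(ALL_CATEGORIES)
    if role == "manager":
        # Assumption: a manager sees general documents plus their own department.
        allowed = {"general"}
        if department in ALL_CATEGORIES:
            allowed.add(department)
        return allowed
    if role == "employee":
        return {"general", "hr"}
    if role == "viewer":
        return {"general"}
    return set()  # unknown roles get no access (deny by default)


def can_access(role: str, category: str, department: Optional[str] = None) -> bool:
    """Check whether a role may read documents from the given category."""
    return category in allowed_categories(role, department)


print(can_access("employee", "finance"))
print(can_access("manager", "finance", department="finance"))
```

Denying unknown roles by default is a deliberate choice in this sketch: a misspelled or missing role then fails closed rather than open.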
from app.services.llm_service import LLMService
from app.services.vector_store import VectorStore

# Initialize services
vector_store = VectorStore()
llm = LLMService()

# Query documents
query = "financial performance Q4"
results = vector_store.search(query, k=5)

# Generate a response from the retrieved context for the given role
response = llm.generate_response(query, context=results, role="manager")
print(response)

- Audit Logging: All queries and operations are logged with timestamps and user info
- Role-Based Filtering: Content automatically filtered based on user role
- Secure Storage: Documents encrypted in vector database
- Session Management: Secure user session handling
- Rate Limiting: Built-in rate limiting for API endpoints
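As an illustration of the audit logging feature, the sketch below records each operation with a UTC timestamp and user information in SQLite and reads back recent entries. It is a simplified stand-in, not the project's `AuditLogger`; the class name `SimpleAuditLog`, the table name, and the column layout are assumptions.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple


class SimpleAuditLog:
    """Minimal audit trail backed by SQLite (hypothetical schema)."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS audit_log ("
            "ts TEXT NOT NULL, username TEXT NOT NULL, "
            "role TEXT NOT NULL, action TEXT NOT NULL, detail TEXT)"
        )

    def record(self, username: str, role: str, action: str, detail: str = "") -> None:
        """Append one entry stamped with the current UTC time."""
        ts = datetime.now(timezone.utc).isoformat(timespec="microseconds")
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO audit_log VALUES (?, ?, ?, ?, ?)",
                (ts, username, role, action, detail),
            )

    def recent(self, username: Optional[str] = None, days: int = 7) -> List[Tuple]:
        """Return entries from the last `days` days, optionally for one user."""
        cutoff = (datetime.now(timezone.utc) - timedelta(days=days)).isoformat(
            timespec="microseconds"
        )
        sql = "SELECT ts, username, role, action, detail FROM audit_log WHERE ts >= ?"
        params: List[str] = [cutoff]
        if username is not None:
            sql += " AND username = ?"
            params.append(username)
        return self.conn.execute(sql + " ORDER BY ts", params).fetchall()


log = SimpleAuditLog()
log.record("alice", "manager", "query", "financial performance Q4")
print(len(log.recent(username="alice")))
```

Timestamps are stored as fixed-format ISO 8601 strings in UTC, so comparing them as text gives the same ordering as comparing the times themselves.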
Audit logs are stored in the database and can be accessed through:
from app.services.audit_logger import AuditLogger
logger = AuditLogger()
logs = logger.get_logs(filter_by_user="username", days=7)

# LLM Configuration
LLM_MODEL=llama2 # or openai
OPENAI_API_KEY=your_api_key_here
LLAMA_MODEL_PATH=./models/llama-2
# Vector Store
CHROMA_DB_PATH=./chroma_db
# Logging
LOG_LEVEL=INFO
AUDIT_LOG_ENABLED=true

- Vector Store: Chroma (local SQLite backend)
- Audit Database: SQLite
- Both databases are created automatically on first run
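The environment variables listed above could be read into a typed settings object at startup. This is a sketch rather than the project's own configuration code: the `Settings` dataclass and `load_settings` function are hypothetical, the defaults simply reuse the sample values shown above, and the code assumes the `.env` file has already been loaded into the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    llm_model: str
    openai_api_key: str
    llama_model_path: str
    chroma_db_path: str
    log_level: str
    audit_log_enabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, validating the LLM choice."""
    settings = Settings(
        llm_model=env.get("LLM_MODEL", "llama2").strip().lower(),
        openai_api_key=env.get("OPENAI_API_KEY", ""),
        llama_model_path=env.get("LLAMA_MODEL_PATH", "./models/llama-2"),
        chroma_db_path=env.get("CHROMA_DB_PATH", "./chroma_db"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        audit_log_enabled=env.get("AUDIT_LOG_ENABLED", "true").strip().lower()
        in {"1", "true", "yes"},
    )
    if settings.llm_model not in {"llama2", "openai"}:
        raise ValueError(f"Unsupported LLM_MODEL: {settings.llm_model!r}")
    if settings.llm_model == "openai" and not settings.openai_api_key:
        raise ValueError("OPENAI_API_KEY must be set when LLM_MODEL=openai")
    return settings


print(load_settings({"LLM_MODEL": "llama2"}).chroma_db_path)
```

Failing early on a missing API key surfaces configuration mistakes at startup instead of at the first query.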
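- Engineering (resources/data/engineering/)
  - Technical specifications
  - System architecture
  - Development guidelines
- Finance (resources/data/finance/)
  - Quarterly reports
  - Financial summaries
  - Budget allocations
- HR (resources/data/hr/)
  - Employee handbook
  - Policies and procedures
  - Training materials
- Marketing (resources/data/marketing/)
  - Market reports
  - Campaign strategies
  - Quarterly reviews
- General (resources/data/general/)
  - Company information
  - Announcements
  - Public documentation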
Run the test suite:
# Run all tests
pytest
# Run specific test
pytest tests/test_audit_logging.py -v

- Document indexing: ~100 docs/minute
- Query response: <2 seconds average
- Vector search: <500ms for 10K documents
- Memory usage: ~2GB for full database
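Figures like those above depend on hardware and data, so it can help to measure them locally. The helper below is a small timing sketch using only the standard library; `measure_latency` is a hypothetical name and is not part of RoleSense.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(fn: Callable[[], object], runs: int = 20) -> Dict[str, float]:
    """Call fn repeatedly and report mean and worst-case wall-clock time in seconds."""
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "mean_s": statistics.mean(samples),
        "max_s": max(samples),
    }


# Stand-in workload; replace the lambda with the call you want to time.
stats = measure_latency(lambda: sum(range(10_000)), runs=5)
print(f"mean {stats['mean_s']:.6f}s, max {stats['max_s']:.6f}s over {stats['runs']} runs")
```

To time a search, the lambda could wrap a call such as `vector_store.search("financial performance Q4", k=5)` from the usage example.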
- Solution: Run the ingestion pipeline first: python app/services/ingest.py
- Solution: Download the model and update LLAMA_MODEL_PATH in .env, or set OPENAI_API_KEY
- Solution: Check the user's role assignment in the audit logs
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Commit changes: git commit -m 'Add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open a Pull Request
This project is licensed under the MIT License. See the LICENSE file for details.
Akshu Raj
- GitHub: @AkshuRaj
- Email: akshuRaj2k6@gmail.com
- Chroma for vector database
- LLaMA 2 and OpenAI for LLM capabilities
- Streamlit for web framework
- Pydantic for data validation
For issues, questions, or suggestions:
- Check existing GitHub Issues
- Create a new issue with detailed description
- Contact: akshuRaj2k6@gmail.com
Last Updated: April 2026
Version: 1.0.0