RoleSense - Role-Based RAG Chatbot

RoleSense is a Retrieval-Augmented Generation (RAG) chatbot with role-based access control. It combines document ingestion, vector embeddings, and an LLM to answer questions from company documents, while restricting what each user can retrieve according to their role.

🎯 Features

Role-Based Access Control: Multi-tier permission system (CEO, Manager, Employee, Viewer)
Document Ingestion: Support for multiple document formats (Markdown, CSV, Text)
Vector Search: Semantic document retrieval using Chroma vector database
LLM Integration: Powered by LLaMA 2 or OpenAI for intelligent responses
Audit Logging: Comprehensive security audit trail for all operations
Streamlit UI: User-friendly web interface for chatbot interaction
Document Versioning: Track and manage multiple document versions
Security Filtering: Content filtering based on user roles

📋 Project Structure

RoleSense/
├── app/
│   ├── main.py                 # Main application entry point
│   ├── schemas/                # Pydantic data models
│   └── services/
│       ├── audit_logger.py     # Audit logging service
│       ├── document_loader.py  # Document ingestion
│       ├── embedding_service.py # Vector embeddings
│       ├── ingest.py           # Ingestion pipeline
│       ├── llm_service.py      # LLM integration
│       └── vector_store.py     # Vector database operations
├── resources/
│   └── data/                   # Document resources
│       ├── engineering/        # Engineering documents
│       ├── finance/            # Financial documents
│       ├── general/            # General documents
│       ├── hr/                 # HR documents
│       └── marketing/          # Marketing documents
├── streamlit_app.py            # Streamlit web interface
├── requirements.txt            # Python dependencies
├── pyproject.toml              # Project configuration
└── README.md                   # This file

🚀 Quick Start

Prerequisites

Python 3.8+
pip or conda
Git

Installation

Clone the repository

git clone https://github.com/AkshuRaj/RoleSense.git
cd RoleSense

Create virtual environment

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies
```
pip install -r requirements.txt
```

Configure environment variables

cp .env.example .env
# Edit .env with your LLM API keys and settings

Running the Application

Streamlit Web Interface

streamlit run streamlit_app.py

Python Script

python app/main.py

📚 Usage Guide

Document Ingestion

Documents are loaded from the resources/data/ directory, which is organized by category:

Place documents in appropriate category folders
Documents are automatically indexed and embedded
Supported formats: .md, .csv, .txt
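
The folder layout described above could be walked with a few lines of standard-library Python. This is a minimal sketch, not the project's `document_loader.py`: the function name `discover_documents` and the shape of the returned dictionary are assumptions made for illustration.

```python
from pathlib import Path

# Extensions listed as supported in this README.
SUPPORTED_EXTENSIONS = {".md", ".csv", ".txt"}


def discover_documents(data_dir):
    """Map each category folder name to a sorted list of supported files.

    Hypothetical helper: the real loader may behave differently.
    """
    root = Path(data_dir)
    found = {}
    if not root.is_dir():
        # Nothing to index if the data directory is missing.
        return found
    for category_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        files = sorted(
            f
            for f in category_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in SUPPORTED_EXTENSIONS
        )
        found[category_dir.name] = files
    return found


if __name__ == "__main__":
    for category, files in discover_documents("resources/data").items():
        print(f"{category}: {len(files)} document(s)")
```

Files with other extensions are skipped silently in this sketch; a real pipeline might prefer to log them.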

User Roles and Permissions

| Role | Access Level | Permissions |
|------|--------------|-------------|
| CEO | Full | All documents, all operations |
| Manager | High | Department documents, reports |
| Employee | Medium | General and HR documents |
| Viewer | Low | General documents only |
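
One way to express the table above in code is a small mapping from role to permitted document categories. This is a sketch under stated assumptions, not the project's actual access-control logic: the function names are hypothetical, the category names are taken from the resources/data/ folders, and the rule that a manager sees general documents plus their own department's folder is a guess at what "Department documents" means.

```python
# Category names mirror the folders under resources/data/.
ALL_CATEGORIES = {"engineering", "finance", "general", "hr", "marketing"}


def allowed_categories(role, department=None):
    """Return the set of categories a role may read (illustrative only)."""
    role = role.lower()
    if role == "ceo":
        return set(ALL_CATEGORIES)
    if role == "manager":
        # Assumption: general documents plus the manager's own department.
        allowed = {"general"}
        if department in ALL_CATEGORIES:
            allowed.add(department)
        return allowed
    if role == "employee":
        return {"general", "hr"}
    if role == "viewer":
        return {"general"}
    raise ValueError(f"Unknown role: {role!r}")


def can_access(role, category, department=None):
    """True if the given role may read documents in the given category."""
    return category in allowed_categories(role, department)


if __name__ == "__main__":
    print(can_access("employee", "hr"))       # employee may read HR documents
    print(can_access("viewer", "finance"))    # viewer is limited to general
```

Raising on an unknown role, rather than returning an empty set, makes misconfigured role names visible early.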

API Example

from app.services.llm_service import LLMService
from app.services.vector_store import VectorStore

# Initialize services
vector_store = VectorStore()
llm = LLMService()

# Query documents
query = "financial performance Q4"
results = vector_store.search(query, k=5)

# Generate response with context
response = llm.generate_response(query, context=results, role="manager")
print(response)
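
To make the idea behind a `k=5` search concrete, the following toy sketch ranks documents by cosine similarity and optionally restricts them to the categories a role may see. It is a brute-force illustration over hand-written vectors, not how `VectorStore` or Chroma is implemented; a real vector store uses an index and learned embeddings. The names `cosine_similarity` and `top_k` and the `(doc_id, category, vector)` tuple layout are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, docs, k=5, allowed=None):
    """Return the ids of the k most similar documents.

    docs is a list of (doc_id, category, vector) tuples. If `allowed` is
    given, documents outside those categories are dropped before ranking.
    """
    candidates = [d for d in docs if allowed is None or d[1] in allowed]
    scored = [
        (cosine_similarity(query_vec, vec), doc_id)
        for doc_id, _category, vec in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _score, doc_id in scored[:k]]


if __name__ == "__main__":
    docs = [
        ("q4_report", "finance", [0.9, 0.1, 0.0]),
        ("handbook", "hr", [0.0, 1.0, 0.0]),
        ("announcement", "general", [0.7, 0.7, 0.0]),
    ]
    print(top_k([1.0, 0.0, 0.0], docs, k=2))
    print(top_k([1.0, 0.0, 0.0], docs, k=2, allowed={"general", "hr"}))
```

Filtering before ranking means a restricted document can never appear in the context passed to the LLM, whatever its similarity score.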

🔒 Security Features

Audit Logging: All queries and operations are logged with timestamps and user info
Role-Based Filtering: Content automatically filtered based on user role
Secure Storage: Documents encrypted in vector database
Session Management: Secure user session handling
Rate Limiting: Built-in rate limiting for API endpoints
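
As an illustration of the rate-limiting item above, here is a minimal per-user sliding-window limiter. It is a sketch of one common technique, not the project's implementation: the class name, its parameters, and the choice of a sliding window are all assumptions.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most `max_requests` per user within `window_seconds`."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)  # user -> timestamps of recent requests

    def allow(self, user):
        """Record a request and return True, or return False if over the limit."""
        now = self._clock()
        hits = self._hits[user]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowRateLimiter(max_requests=2, window_seconds=60)
    for attempt in range(3):
        print(attempt, limiter.allow("alice"))
```

State is kept in memory, so this sketch only limits requests within a single process.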

Viewing Audit Logs

Audit logs are stored in the SQLite audit database and can be read through the AuditLogger service:

from app.services.audit_logger import AuditLogger

logger = AuditLogger()
logs = logger.get_logs(filter_by_user="username", days=7)
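
For readers curious how a SQLite-backed audit trail with a `get_logs(filter_by_user=..., days=...)` style query might look, here is a minimal sketch using only the standard library. It is not the project's `AuditLogger`: the class name `SimpleAuditLog`, the table schema, and the `record` method are assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


class SimpleAuditLog:
    """Hypothetical audit log stored in a single SQLite table."""

    def __init__(self, db_path=":memory:"):
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS audit_log ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "ts TEXT NOT NULL, "
            "username TEXT NOT NULL, "
            "role TEXT NOT NULL, "
            "action TEXT NOT NULL)"
        )

    @staticmethod
    def _stamp(when):
        # Store UTC ISO-8601 strings with fixed precision so they sort as text.
        return when.astimezone(timezone.utc).isoformat(timespec="microseconds")

    def record(self, username, role, action, when=None):
        """Append one audit entry; `when` defaults to the current UTC time."""
        when = when or datetime.now(timezone.utc)
        with self._conn:  # commits on success
            self._conn.execute(
                "INSERT INTO audit_log (ts, username, role, action) "
                "VALUES (?, ?, ?, ?)",
                (self._stamp(when), username, role, action),
            )

    def get_logs(self, filter_by_user, days=7):
        """Return (ts, username, role, action) rows from the last `days` days."""
        cutoff = self._stamp(datetime.now(timezone.utc) - timedelta(days=days))
        cursor = self._conn.execute(
            "SELECT ts, username, role, action FROM audit_log "
            "WHERE username = ? AND ts >= ? ORDER BY ts",
            (filter_by_user, cutoff),
        )
        return cursor.fetchall()


if __name__ == "__main__":
    log = SimpleAuditLog()
    log.record("alice", "manager", "query: financial performance Q4")
    for row in log.get_logs("alice", days=7):
        print(row)
```

Parameterized queries are used throughout so that usernames and query text cannot alter the SQL.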

🛠️ Configuration

Environment Variables

# LLM Configuration
LLM_MODEL=llama2  # or openai
OPENAI_API_KEY=your_api_key_here
LLAMA_MODEL_PATH=./models/llama-2

# Vector Store
CHROMA_DB_PATH=./chroma_db

# Logging
LOG_LEVEL=INFO
AUDIT_LOG_ENABLED=true
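
The variables above could be read into a typed settings object as sketched below. This is an assumption about how configuration might be consumed, not the project's code: the `Settings` class and `load_settings` function are hypothetical, and the defaults simply repeat the sample values shown above. The sketch reads from the process environment, so a `.env` file would need to be loaded into the environment first (for example with a package such as python-dotenv).

```python
import os
from dataclasses import dataclass

TRUTHY = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    llm_model: str
    openai_api_key: str
    llama_model_path: str
    chroma_db_path: str
    log_level: str
    audit_log_enabled: bool


def load_settings(env=None):
    """Build Settings from a mapping of environment variables.

    Defaults mirror the sample values in this README (an assumption).
    """
    env = os.environ if env is None else env
    settings = Settings(
        llm_model=env.get("LLM_MODEL", "llama2").strip().lower(),
        openai_api_key=env.get("OPENAI_API_KEY", ""),
        llama_model_path=env.get("LLAMA_MODEL_PATH", "./models/llama-2"),
        chroma_db_path=env.get("CHROMA_DB_PATH", "./chroma_db"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        audit_log_enabled=env.get("AUDIT_LOG_ENABLED", "true").strip().lower()
        in TRUTHY,
    )
    # Fail early if the OpenAI backend is selected without a key.
    if settings.llm_model == "openai" and not settings.openai_api_key:
        raise ValueError("OPENAI_API_KEY must be set when LLM_MODEL=openai")
    return settings


if __name__ == "__main__":
    print(load_settings())
```

Accepting the environment as a parameter keeps the function easy to test without touching the real process environment.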

Database

Vector Store: Chroma (local SQLite backend)
Audit Database: SQLite
Both databases are created automatically on first run

📖 Document Categories

Engineering

Technical specifications
System architecture
Development guidelines

Finance

Quarterly reports
Financial summaries
Budget allocations

HR

Employee handbook
Policies and procedures
Training materials

Marketing

Market reports
Campaign strategies
Quarterly reviews

General

Company information
Announcements
Public documentation

🧪 Testing

Run the test suite:

# Run all tests
pytest

# Run specific test
pytest tests/test_audit_logging.py -v

📊 Performance

Document indexing: ~100 docs/minute
Query response: <2 seconds average
Vector search: <500ms for 10K documents
Memory usage: ~2GB for full database

🐛 Troubleshooting

Issue: "Vector database not found"

Solution: Run ingestion pipeline first: python app/services/ingest.py

Issue: "LLM model not loaded"

Solution: Download the model and set LLAMA_MODEL_PATH in .env, or set OPENAI_API_KEY if you are using OpenAI

Issue: "Permission denied" for role

Solution: Check user role assignment in audit logs

🤝 Contributing

Fork the repository
Create a feature branch: git checkout -b feature/amazing-feature
Commit changes: git commit -m 'Add amazing feature'
Push to branch: git push origin feature/amazing-feature
Open a Pull Request

📝 License

This project is licensed under the MIT License. See the LICENSE file for details.

👨‍💼 Author

Akshu Raj

GitHub: @AkshuRaj
Email: akshuRaj2k6@gmail.com

🙏 Acknowledgments

Chroma for vector database
LLaMA 2 and OpenAI for LLM capabilities
Streamlit for web framework
Pydantic for data validation

📞 Support

For issues, questions, or suggestions:

Check existing GitHub Issues
Create a new issue with detailed description
Contact: akshuRaj2k6@gmail.com

🔗 Links

Last Updated: April 2026
Version: 1.0.0
