RoleSense - Role-Based RAG Chatbot

A sophisticated Retrieval-Augmented Generation (RAG) chatbot system with role-based access control and enterprise-grade security. RoleSense leverages document ingestion, vector embeddings, and LLM integration to provide intelligent, context-aware responses while maintaining strict security protocols based on user roles.

🎯 Features

  • Role-Based Access Control: Multi-tier permission system (CEO, Manager, Employee, Viewer)
  • Document Ingestion: Support for multiple document formats (Markdown, CSV, Text)
  • Vector Search: Semantic document retrieval using Chroma vector database
  • LLM Integration: Powered by LLaMA 2 or OpenAI for intelligent responses
  • Audit Logging: Comprehensive security audit trail for all operations
  • Streamlit UI: User-friendly web interface for chatbot interaction
  • Document Versioning: Track and manage multiple document versions
  • Security Filtering: Content filtering based on user roles

📋 Project Structure

RoleSense/
├── app/
│   ├── main.py                 # Main application entry point
│   ├── schemas/                # Pydantic data models
│   └── services/
│       ├── audit_logger.py     # Audit logging service
│       ├── document_loader.py  # Document ingestion
│       ├── embedding_service.py # Vector embeddings
│       ├── ingest.py           # Ingestion pipeline
│       ├── llm_service.py      # LLM integration
│       └── vector_store.py     # Vector database operations
├── resources/
│   └── data/                   # Document resources
│       ├── engineering/        # Engineering documents
│       ├── finance/            # Financial documents
│       ├── general/            # General documents
│       ├── hr/                 # HR documents
│       └── marketing/          # Marketing documents
├── streamlit_app.py            # Streamlit web interface
├── requirements.txt            # Python dependencies
├── pyproject.toml              # Project configuration
└── README.md                   # This file

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • pip or conda
  • Git

Installation

  1. Clone the repository

    git clone https://github.com/AkshuRaj/RoleSense.git
    cd RoleSense

  2. Create virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate

  3. Install dependencies

    pip install -r requirements.txt

  4. Configure environment variables

    cp .env.example .env
    # Edit .env with your LLM API keys and settings

Running the Application

Streamlit Web Interface

streamlit run streamlit_app.py

Python Script

python app/main.py

📚 Usage Guide

Document Ingestion

Documents are loaded from the resources/data/ directory organized by category:

  • Place documents in appropriate category folders
  • Documents are automatically indexed and embedded
  • Supported formats: .md, .csv, .txt
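The folder layout above can be walked with a few lines of standard-library code. The sketch below is only an illustration of how files might be grouped by category before indexing; `discover_documents` is a hypothetical helper, not a function from `document_loader.py` or `ingest.py`.

```python
from pathlib import Path
from typing import Dict, List

# Extensions listed as supported in this README.
SUPPORTED_EXTENSIONS = {".md", ".csv", ".txt"}


def discover_documents(data_dir: str = "resources/data") -> Dict[str, List[Path]]:
    """Group supported files under data_dir by their category folder name.

    Hypothetical helper for illustration; the project's own loader may differ.
    """
    root = Path(data_dir)
    found: Dict[str, List[Path]] = {}
    if not root.is_dir():
        return found  # nothing to ingest if the data directory is missing
    for category_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        files = sorted(
            f
            for f in category_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in SUPPORTED_EXTENSIONS
        )
        if files:  # skip categories with no supported files
            found[category_dir.name] = files
    return found
```

Calling `discover_documents()` from the repository root would return a dictionary keyed by category folder (for example `finance` or `hr`), which could then be passed to whatever embedding step the pipeline uses.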

User Roles and Permissions

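| Role     | Access Level | Permissions                   |
|----------|--------------|-------------------------------|
| CEO      | Full         | All documents, all operations |
| Manager  | High         | Department documents, reports |
| Employee | Medium       | General and HR documents      |
| Viewer   | Low          | General documents only        |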
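One way to read the role table is as a mapping from role to permitted document categories. The sketch below is an assumption-laden illustration, not the project's actual filter: the function names are hypothetical, the `category` key on each document is assumed, and the Manager rule (Employee access plus the manager's own department) is a guess at what "Department documents" means.

```python
from typing import FrozenSet, List

# Category folder names from the project structure.
ALL_CATEGORIES: FrozenSet[str] = frozenset(
    {"engineering", "finance", "general", "hr", "marketing"}
)


def allowed_categories(role: str, department: str = "") -> FrozenSet[str]:
    """Return the categories a role may read (hypothetical mapping)."""
    role_key = role.strip().lower()
    if role_key == "ceo":
        return ALL_CATEGORIES
    if role_key == "manager":
        # Assumption: managers get Employee access plus their own department.
        dept = department.strip().lower()
        extra = {dept} if dept in ALL_CATEGORIES else set()
        return frozenset({"general", "hr"} | extra)
    if role_key == "employee":
        return frozenset({"general", "hr"})
    if role_key == "viewer":
        return frozenset({"general"})
    return frozenset()  # unknown roles see nothing


def filter_by_role(docs: List[dict], role: str, department: str = "") -> List[dict]:
    """Keep only documents whose 'category' field is permitted for the role."""
    allowed = allowed_categories(role, department)
    return [d for d in docs if d.get("category") in allowed]
```

Denying by default for unrecognized roles keeps a typo in a role name from silently granting access.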

API Example

from app.services.llm_service import LLMService
from app.services.vector_store import VectorStore

# Initialize services
vector_store = VectorStore()
llm = LLMService()

# Query documents
query = "financial performance Q4"
results = vector_store.search(query, k=5)

# Generate response with context
response = llm.generate_response(query, context=results, role="manager")
print(response)

🔒 Security Features

  • Audit Logging: All queries and operations are logged with timestamps and user info
  • Role-Based Filtering: Content automatically filtered based on user role
  • Secure Storage: Documents encrypted in vector database
  • Session Management: Secure user session handling
  • Rate Limiting: Built-in rate limiting for API endpoints
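The README does not say how rate limiting is implemented. As a rough illustration of the idea only, the sketch below shows a per-user sliding-window limiter; the class name, parameters, and the choice of algorithm are all assumptions and may not match what RoleSense does internally.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per user within window_seconds.

    Hypothetical sketch; not the project's built-in limiter.
    """

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, user: str) -> bool:
        """Record a request and return True if it is within the limit."""
        now = self._clock()
        hits = self._hits[user]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A caller would check `limiter.allow(username)` before running a query and reject the request when it returns `False`.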

Viewing Audit Logs

Audit logs are stored in the database and can be accessed through:

from app.services.audit_logger import AuditLogger

logger = AuditLogger()
logs = logger.get_logs(filter_by_user="username", days=7)
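To make the shape of such a query concrete, here is a minimal sketch of a SQLite-backed audit log with a `get_logs` method that accepts the same two arguments. The class `SketchAuditLog`, its table schema, and its `log` method are hypothetical stand-ins; the real `AuditLogger` may store and return data differently.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple


class SketchAuditLog:
    """Hypothetical stand-in for illustration, not the project's AuditLogger."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(db_path)
        # Assumed schema: one row per logged operation.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS audit_log ("
            "ts TEXT NOT NULL, username TEXT NOT NULL, "
            "role TEXT NOT NULL, action TEXT NOT NULL)"
        )

    def log(self, username: str, role: str, action: str,
            when: Optional[datetime] = None) -> None:
        """Insert one audit row, timestamped in UTC."""
        when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
        self.conn.execute(
            "INSERT INTO audit_log VALUES (?, ?, ?, ?)",
            (when.isoformat(timespec="microseconds"), username, role, action),
        )
        self.conn.commit()

    def get_logs(self, filter_by_user: Optional[str] = None,
                 days: int = 7) -> List[Tuple[str, str, str, str]]:
        """Return rows from the last `days` days, optionally for one user."""
        cutoff = datetime.now(timezone.utc) - timedelta(days=days)
        sql = "SELECT ts, username, role, action FROM audit_log WHERE ts >= ?"
        params: List[str] = [cutoff.isoformat(timespec="microseconds")]
        if filter_by_user is not None:
            sql += " AND username = ?"
            params.append(filter_by_user)
        return self.conn.execute(sql + " ORDER BY ts", params).fetchall()
```

Storing timestamps as fixed-width UTC ISO 8601 strings lets the cutoff comparison work as a plain string comparison in SQL.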

🛠️ Configuration

Environment Variables

# LLM Configuration
LLM_MODEL=llama2  # or openai
OPENAI_API_KEY=your_api_key_here
LLAMA_MODEL_PATH=./models/llama-2

# Vector Store
CHROMA_DB_PATH=./chroma_db

# Logging
LOG_LEVEL=INFO
AUDIT_LOG_ENABLED=true
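The variables above could be read and validated at startup along the following lines. This is a sketch under assumptions: `Settings` and `load_settings` are hypothetical names, the defaults simply mirror the example values shown, and the code reads the process environment, so a `.env` file would need to be loaded first (for instance with python-dotenv's `load_dotenv()`, if the project uses it).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables documented above."""
    llm_model: str
    openai_api_key: Optional[str]
    llama_model_path: str
    chroma_db_path: str
    log_level: str
    audit_log_enabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, with assumed defaults."""
    llm_model = env.get("LLM_MODEL", "llama2").strip().lower()
    api_key = env.get("OPENAI_API_KEY") or None
    if llm_model == "openai" and not api_key:
        # Fail early rather than at the first LLM call.
        raise ValueError("LLM_MODEL=openai requires OPENAI_API_KEY")
    enabled = env.get("AUDIT_LOG_ENABLED", "true").strip().lower()
    return Settings(
        llm_model=llm_model,
        openai_api_key=api_key,
        llama_model_path=env.get("LLAMA_MODEL_PATH", "./models/llama-2"),
        chroma_db_path=env.get("CHROMA_DB_PATH", "./chroma_db"),
        log_level=env.get("LOG_LEVEL", "INFO").strip().upper(),
        audit_log_enabled=enabled in {"1", "true", "yes", "on"},
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests.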

Database

  • Vector Store: Chroma (local SQLite backend)
  • Audit Database: SQLite
  • Both databases are created automatically on first run

📖 Document Categories

Engineering

  • Technical specifications
  • System architecture
  • Development guidelines

Finance

  • Quarterly reports
  • Financial summaries
  • Budget allocations

HR

  • Employee handbook
  • Policies and procedures
  • Training materials

Marketing

  • Market reports
  • Campaign strategies
  • Quarterly reviews

General

  • Company information
  • Announcements
  • Public documentation

🧪 Testing

Run the test suite:

# Run all tests
pytest

# Run specific test
pytest tests/test_audit_logging.py -v

📊 Performance

  • Document indexing: ~100 docs/minute
  • Query response: <2 seconds average
  • Vector search: <500ms for 10K documents
  • Memory usage: ~2GB for full database

🐛 Troubleshooting

Issue: "Vector database not found"

  • Solution: Run ingestion pipeline first: python app/services/ingest.py

Issue: "LLM model not loaded"

  • Solution: Download model and update .env path or set API key

Issue: "Permission denied" for role

  • Solution: Check user role assignment in audit logs

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit changes: git commit -m 'Add amazing feature'
  4. Push to branch: git push origin feature/amazing-feature
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License; see the LICENSE file for details.

👨‍💼 Author

Akshu Raj

🙏 Acknowledgments

  • Chroma for vector database
  • LLaMA 2 and OpenAI for LLM capabilities
  • Streamlit for web framework
  • Pydantic for data validation

📞 Support

For issues, questions, or suggestions:

  1. Check existing GitHub Issues
  2. Create a new issue with detailed description
  3. Contact: akshuRaj2k6@gmail.com

Last Updated: April 2026 | Version: 1.0.0
