NeuraX is a production-ready offline multimodal Retrieval-Augmented Generation (RAG) system designed for NTRO's SIH 2025 problem statement. It provides secure, air-gapped document intelligence with advanced multimodal capabilities and enterprise-grade security features.
- Complete Offline Operation: Zero internet dependencies, air-gapped deployment
- Knowledge Graph Security: Real-time anomaly detection and tamper protection
- Audit Logging: Comprehensive activity tracking and compliance monitoring
- Data Sovereignty: All processing occurs locally with no external API calls
- Multimodal Understanding: Process documents, images, and audio seamlessly
- Cross-Modal Search: Find relevant content across different data types
- LM Studio Integration: Local LLM hosting with Gemma 3n (multimodal) and Qwen3 4B (reasoning)
- CLIP Embeddings: State-of-the-art visual-text similarity matching
- Intelligent Citations: Numbered references with confidence scores and expandable sources
- Documents: PDF, DOCX, DOC, TXT with OCR fallback
- Images: JPG, JPEG, PNG, BMP, TIFF, WEBP with visual similarity search
- Audio: WAV, MP3, M4A, FLAC, OGG with speech-to-text processing
- Batch Processing: Handle multiple files simultaneously with progress tracking
- Auto-Deployment: One-click executable generation with PyInstaller
- USB Portability: Export complete system to USB for air-gapped deployment
- Performance Optimization: Memory-efficient processing with GPU acceleration
- Error Resilience: Graceful degradation and comprehensive error handling
- Real-time Feedback: User feedback collection and performance metrics
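The cross-modal search above boils down to ranking stored embeddings by similarity to a query embedding. The sketch below shows the core idea with plain cosine similarity over toy vectors; it is illustrative only (the real system uses CLIP embeddings and ChromaDB, and names like `rank_items` are hypothetical):

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank_items(query_vec, items):
    """Rank (item_id, embedding) pairs by similarity to the query vector."""
    scored = [(item_id, cosine_similarity(query_vec, emb)) for item_id, emb in items]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional embeddings standing in for real CLIP vectors
query = [1.0, 0.0, 0.0]
corpus = [("doc_1", [0.9, 0.1, 0.0]), ("img_1", [0.0, 1.0, 0.0])]
print(rank_items(query, corpus)[0][0])  # doc_1 ranks first
```

Because text and image embeddings live in the same CLIP vector space, the same ranking works across modalities.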
- LM Studio Integration: Local LLM server for multimodal and reasoning tasks
- ChromaDB: Persistent vector database for semantic search
- CLIP Embeddings: Visual-text cross-modal understanding
- Whisper STT: Speech-to-text for audio processing
- NetworkX: Knowledge graph with security monitoring
- Gradio UI: Modern web interface for end users
- Streamlit Dashboard: Analytics and system monitoring
- Python: 3.8+ (3.9+ recommended for optimal performance)
- Memory: 8GB RAM (16GB+ recommended for large datasets)
- Storage: 5GB free space (models are managed via LM Studio)
- OS: Windows 10+, Linux (Ubuntu 18.04+), macOS 10.15+
- Memory: 16GB+ RAM for smooth operation
- GPU: 6GB+ VRAM for accelerated processing (CPU fallback available)
- Storage: 10GB+ for cache and data processing
- Network: None required during operation (offline-first design)
- LM Studio: For local LLM hosting (Gemma 3n + Qwen3 4B)
- Tesseract OCR: For document text extraction (auto-bundled)
- FFmpeg: For audio processing (platform-specific installation)
```bash
# Clone the repository
git clone https://github.com/thrishank007/NeuraX.git
cd NeuraX

# Run automated setup
python install_dependencies.py

# Setup LM Studio integration
python migrate_to_lmstudio.py

# Launch the system
python main_launcher.py
```

```bash
# Clone repository
git clone https://github.com/thrishank007/NeuraX.git
cd NeuraX

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install system dependencies (platform-specific)
# Ubuntu/Debian: sudo apt-get install tesseract-ocr ffmpeg
# macOS: brew install tesseract ffmpeg
# Windows: automated via install_dependencies.py

# Launch system
python main_launcher.py
```

```bash
# Build portable executable
python build_executables.py

# Deploy to USB or air-gapped system
# Executable will be in the packages/ directory
```

NeuraX uses LM Studio for local LLM hosting, providing better performance and easier model management:
- Download from https://lmstudio.ai/
- Install and launch the application

In LM Studio, search for and download:
- Gemma 3n: for multimodal queries (text + images)
- Qwen3 4B Thinking 2507: for complex reasoning tasks

Then start the local server:
- Go to the "Local Server" tab in LM Studio
- Load your preferred model (Gemma for multimodal, Qwen for reasoning)
- Start the server on `localhost:1234`
- Verify the server is running (green status indicator)
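Once the server is up, it speaks the OpenAI-compatible chat-completions protocol at `http://localhost:1234/v1`. The sketch below shows a minimal stdlib-only client; the `ask` helper is illustrative, not part of the NeuraX codebase:

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"  # LM Studio's OpenAI-compatible endpoint

def build_chat_payload(model, prompt, temperature=0.2):
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def ask(model, prompt):
    """POST the payload to the local server and return the reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Requires a model loaded in LM Studio, e.g.:
# print(ask("google/gemma-3n", "Summarize the uploaded report."))
```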
```bash
python test_lmstudio_integration.py
```

```python
# Upload documents via the Gradio interface
# Supported: PDF, DOCX, DOC, TXT files
# Automatic text extraction and indexing

# Query your documents
query = "What are the main findings in the research?"
# System returns relevant passages with citations
```

```python
# Upload images along with documents
# Supported: JPG, PNG, BMP, TIFF, WEBP

# Cross-modal queries
query = "Find documents related to this chart"
# System matches visual content with textual descriptions
```

```python
# Upload audio files
# Supported: WAV, MP3, M4A, FLAC, OGG

# Audio-to-text search
query = "What was discussed about budget planning?"
# System transcribes audio and searches content
```

```
NeuraX/
├── 📁 ingestion/                    # Multimodal data processors
│   ├── document_processor.py        # PDF, DOCX, DOC, TXT processing
│   ├── image_processor.py           # Image analysis and OCR
│   ├── audio_processor.py           # Speech-to-text conversion
│   ├── notes_processor.py           # Structured note processing
│   └── ingestion_manager.py         # Orchestrates all processors
│
├── 📁 indexing/                     # Vector embeddings and storage
│   ├── embedding_manager.py         # CLIP + text embeddings
│   ├── vector_store.py              # ChromaDB interface
│   ├── cache_manager.py             # Embedding cache optimization
│   ├── memory_manager.py            # Memory usage optimization
│   └── performance_benchmarker.py   # Performance monitoring
│
├── 📁 retrieval/                    # Query processing
│   ├── query_processor.py           # Multimodal query handling
│   └── speech_to_text_processor.py  # Audio query processing
│
├── 📁 generation/                   # LLM integration
│   ├── lmstudio_generator.py        # LM Studio API client
│   ├── llm_factory.py               # Model selection logic
│   ├── llm_generator.py             # Legacy HF integration
│   └── citation_generator.py        # Citation formatting
│
├── 📁 kg_security/                  # Knowledge graph security
│   ├── knowledge_graph_manager.py   # Graph construction
│   ├── anomaly_detector.py          # Security monitoring
│   ├── security_event_logger.py     # Audit logging
│   └── feedback_integration.py      # User feedback processing
│
├── 📁 feedback/                     # Feedback system
│   ├── feedback_system.py           # User feedback collection
│   ├── metrics_collector.py         # Performance metrics
│   └── 📁 exports/                  # Feedback data exports
│
├── 📁 ui/                           # User interfaces
│   ├── gradio_app.py                # Main web interface
│   ├── streamlit_dashboard.py       # Analytics dashboard
│   └── demo_gradio_app.py           # Demo interface
│
├── 📁 tests/                        # Comprehensive test suite
│   ├── test_*.py                    # Unit and integration tests
│   └── conftest.py                  # Test configuration
│
├── 📁 models/                       # Local model cache (LM Studio managed)
├── 📁 data/                         # Input data and samples
├── 📁 vector_db/                    # ChromaDB persistent storage
├── 📁 cache/                        # Embedding and processing cache
├── 📁 logs/                         # System logs and error reports
│
├── config.py                        # Central configuration
├── main_launcher.py                 # Application orchestrator
├── requirements.txt                 # Python dependencies
├── install_dependencies.py          # Automated setup script
├── build_executables.py             # Portable build script
├── migrate_to_lmstudio.py           # LM Studio migration tool
└── test_*.py                        # Verification and test scripts
```
```python
# LM Studio Configuration
LM_STUDIO_CONFIG = {
    "base_url": "http://localhost:1234/v1",
    "gemma_model": "google/gemma-3n",             # Multimodal model
    "qwen_model": "qwen/qwen3-4b-thinking-2507",  # Reasoning model
    "auto_model_switching": True,                 # Auto-switch based on query type
}

# Security Configuration
SECURITY_CONFIG = {
    "allowed_file_extensions": [
        ".pdf", ".docx", ".doc", ".txt",                    # Documents
        ".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".webp",  # Images
        ".wav", ".mp3", ".m4a", ".flac", ".ogg",            # Audio
    ],
    "max_file_size_mb": 100,
    "enable_audit_logging": True,
}
```

- Performance tuning: Memory thresholds, batch sizes, GPU settings
- Security policies: File validation, audit logging, anomaly detection
- UI customization: Interface themes, component visibility
- Model preferences: LLM selection, embedding models, fallback strategies
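The `auto_model_switching` option routes each query to the appropriate model. A minimal sketch of such routing is shown below; `select_model` and its logic are illustrative assumptions, not the actual `llm_factory.py` code:

```python
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".webp"}

def select_model(query_text, attachments, config):
    """Pick the multimodal model when images are attached, else the reasoning model."""
    if not config.get("auto_model_switching", False):
        return config["gemma_model"]  # fixed default when switching is off
    has_image = any(name.lower().endswith(tuple(IMAGE_EXTS)) for name in attachments)
    return config["gemma_model"] if has_image else config["qwen_model"]

config = {
    "gemma_model": "google/gemma-3n",
    "qwen_model": "qwen/qwen3-4b-thinking-2507",
    "auto_model_switching": True,
}
print(select_model("Explain this chart", ["q3_chart.png"], config))  # google/gemma-3n
print(select_model("Summarize the policy", [], config))              # qwen/qwen3-4b-thinking-2507
```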
```bash
# Run complete test suite
python -m pytest tests/

# Test specific components
python test_image_query_no_ocr.py     # Image processing
python test_multimodal_simple.py      # Multimodal queries
python test_lmstudio_integration.py   # LM Studio integration
python test_final_verification.py     # End-to-end validation

# Test file upload interface
python test_file_upload_interface_fix.py

# Validate system performance
python test_vector_store.py

# Check citation generation
python test_citation_fix.py
```

- Install Python dependencies via pip
- Set up LM Studio separately
- Run via `python main_launcher.py`
```bash
# Build self-contained executable
python build_executables.py

# Generates:
# - NeuraX-Windows-x64.zip
# - USB_Deployment/ folder for air-gapped systems
```

```bash
# Create USB-ready package
python build_executables.py --usb-deployment

# Copy USB_Deployment/ contents to the USB drive
# Includes autorun.inf for Windows systems
```

- Build the executable on an internet-connected system
- Copy the package to the air-gapped environment
- Install LM Studio and download models offline
- Run the executable with zero internet dependencies
- Document Indexing: 50-100 documents/minute
- Image Processing: 25-50 images/minute
- Audio Transcription: Real-time (1x speed with Whisper-tiny)
- Query Response: 200-500ms average
- Vector Search: 4.7+ items/second similarity search
- Memory: 4-8GB typical usage (scales with data size)
- Storage: 100MB base + data size + cache
- GPU: Optional but recommended for large datasets
- CPU: Efficient with multi-core utilization
- Local Processing: All data remains on local system
- Encrypted Storage: Vector database encryption at rest
- Audit Trails: Comprehensive activity logging
- Access Control: File type and size validation
- Knowledge Graph Monitoring: Real-time graph analysis
- Behavioral Analysis: Unusual query pattern detection
- Tamper Detection: Content integrity verification
- Alert System: Automated security event notifications
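The access-control layer above (file type and size validation) can be sketched as a simple gate in front of ingestion. This is an illustrative stand-in mirroring the values in `SECURITY_CONFIG`, not the production validation code:

```python
import os

ALLOWED_EXTENSIONS = {
    ".pdf", ".docx", ".doc", ".txt",
    ".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".webp",
    ".wav", ".mp3", ".m4a", ".flac", ".ogg",
}
MAX_FILE_SIZE_MB = 100

def validate_upload(filename, size_bytes):
    """Return (ok, reason) for a candidate upload, mirroring SECURITY_CONFIG limits."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"extension {ext or '(none)'} not allowed"
    if size_bytes > MAX_FILE_SIZE_MB * 1024 * 1024:
        return False, f"file exceeds {MAX_FILE_SIZE_MB} MB limit"
    return True, "ok"

print(validate_upload("report.pdf", 2_000_000))  # (True, 'ok')
print(validate_upload("payload.exe", 1_000))     # (False, 'extension .exe not allowed')
```

Rejections would additionally be written to the audit log when `enable_audit_logging` is on.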
```bash
# Clone for development
git clone https://github.com/thrishank007/NeuraX.git
cd NeuraX

# Install development dependencies
pip install -r requirements.txt
pip install pytest black flake8

# Run tests before committing
python -m pytest tests/
```

- Tesseract OCR: Auto-bundled in executables; manual install needed for development
- GPU Memory: Adjust batch sizes in config for lower VRAM systems
- LM Studio Connection: Ensure server is running on localhost:1234
- Large Files: Use batch processing for datasets >1GB
- API Reference: `/docs/api/` (generated from code)
- Architecture Guide: `/docs/architecture.md`
- Deployment Guide: `/docs/deployment.md`
- Troubleshooting: `/docs/troubleshooting.md`
- ✅ Complete offline multimodal RAG system
- ✅ LM Studio integration with Gemma 3n + Qwen3 4B
- ✅ Cross-modal search capabilities
- ✅ Portable executable generation
- ✅ Enterprise security features
- 🔄 Additional LLM integrations (Ollama, LocalAI)
- 🔄 Enhanced video processing capabilities
- 🔄 Multi-language support expansion
- 🔄 Advanced analytics dashboard
- 🔄 Distributed deployment options
This project is licensed under the MIT License - see the LICENSE file for details.
- NTRO SIH 2025: Problem statement and requirements definition
- Hugging Face: CLIP and Transformer models
- LM Studio: Local LLM hosting platform
- ChromaDB: Vector database infrastructure
- Gradio: Modern web interface framework
Built with ❤️ for secure, offline AI document intelligence
For detailed documentation, visit: Documentation
For support and issues: GitHub Issues
