diff --git a/README.md b/README.md
index 9d9d458..020051e 100644
--- a/README.md
+++ b/README.md
@@ -1,197 +1,250 @@
+# 🏛️ PDBOT – Planning & Development Intelligent Assistant
-🏛️ PDBOT – Planning & Development Intelligent Assistant
+
+
+
+
+
+
-Government of Pakistan
+
+
+
+
+
+
+
-Ministry of Planning, Development & Special Initiatives
+
+
+
+
+
-⸻
+---
+**🇵🇰 Built for the Government of Pakistan**
-An AI-powered document intelligence system for the Manual for Development Projects 2024, serving the planning and development community of Pakistan with accurate, contextual, and traceable responses.
+*Ministry of Planning, Development & Special Initiatives*
-π Quick StartοΏΌ β’ π DocumentationοΏΌ β’ ποΈ System-architectureοΏΌ β’ π PerformanceοΏΌ
+---
-⸻
+An AI-powered document intelligence system for the **Manual for Development Projects 2024**, serving the planning and development community of Pakistan with accurate, contextual, and traceable responses.
-π At a Glance
+[π Quick Start](#-quick-start) β’ [π Documentation](#-documentation) β’ [ποΈ Architecture](#οΈ-system-architecture) β’ [π Performance](#-performance-metrics)
+
+---
+
+## π At a Glance
PDBOT is a production-ready, Retrieval-Augmented Generation (RAG) assistant for the Manual for Development Projects 2024, built for real-world workloads inside government environments:
- • βοΈ 12-class query classifier (numeric, procedural, compliance, timelines, off-scope, red-line, etc.)
- • π Sentence-level retrieval with page citations and passage transparency
- • 🧠 Session memory for contextual follow-ups and pronoun resolution
- • 🛡️ Security-first design – input sanitization, CORS, and rate-limiting ready
- • 🖥️ Embeddable React widget + Streamlit admin dashboard
+
+- βοΈ **12-class query classifier** (numeric, procedural, compliance, timelines, off-scope, red-line, etc.)
+- π **Sentence-level retrieval** with page citations and passage transparency
+- 🧠 **Session memory** for contextual follow-ups and pronoun resolution
+- 🛡️ **Security-first design** – input sanitization, CORS, and rate-limiting ready
+- 🖥️ **Embeddable React widget** + Streamlit admin dashboard
+---
+
+## π Table of Contents
+
+1. [π Executive Summary](#-executive-summary)
+2. [π What's New in Version 2.2.0](#-whats-new-in-version-220)
+3. [🎯 Core Capabilities](#-core-capabilities)
+4. [ποΈ System Architecture](#οΈ-system-architecture)
+5. [π Quick Start](#-quick-start)
+6. [π Website Integration](#-website-integration)
+7. [π Performance Metrics](#-performance-metrics)
+8. [π Security Considerations](#-security-considerations)
+9. [π Project Structure](#-project-structure)
+10. [π Documentation](#-documentation)
+11. [🤝 Contributing](#-contributing)
+12. [π Support & Contact](#-support--contact)
+13. [π License](#-license)
+---
-⸻
+## π Executive Summary
-π Table of Contents
- 1. π Executive SummaryοΏΌ
- 2. π Whatβs New in Version 2.2.0οΏΌ
- 3. π― Core CapabilitiesοΏΌ
- 4. ποΈ System ArchitectureοΏΌ
- 5. π Quick StartοΏΌ
- 6. π Website IntegrationοΏΌ
- 7. π Performance MetricsοΏΌ
- 8. π Security ConsiderationsοΏΌ
- 9. π Project StructureοΏΌ
- 10. π DocumentationοΏΌ
- 11. π€ ContributingοΏΌ
- 12. π Support & ContactοΏΌ
- 13. π LicenseοΏΌ
-
-⸻
-
-π Executive Summary
-
-PDBOT is an enterprise-grade Retrieval-Augmented Generation (RAG) system developed to provide instant, accurate responses regarding the Manual for Development Projects 2024. The system is designed to support government officials, development practitioners, and stakeholders in accessing procedural information efficiently.
-
-Key Achievements
-
-Metric Achievement Target
-In-Scope Accuracy 87.5% ≥ 85%
-Numeric Accuracy 92.3% ≥ 90%
-Off-Scope Detection 100% 100%
-Response Time < 3 seconds < 5 s
-Zero Hallucination ✅ Verified Required
-
-Design Goal: Provide short, precise, source-backed answers while minimizing hallucination and maintaining strict procedural correctness for the Manual for Development Projects 2024.
-
-⸻
-
-π What’s New in Version 2.2.0
-
-🖥️ Standalone React Widget
- • Independent deployment – No Streamlit dependency required
- • Embeddable component – Easy integration into government portals
- • Modern UI/UX – Floating, draggable, minimizable interface
- • Government branding – Official color scheme and styling
-
-🧠 Contextual Memory
- • Session-based memory – Maintains conversation context
- • Follow-up understanding – Handles pronouns and references
- • Automatic cleanup – Memory management per session
-
-π Source Transparency
- • View Passages – See exact text used for response generation
- • View Sources – Page-level citations with relevance scores
- • Audit trail – Full traceability for governance requirements
-
-🛡️ Enhanced Security
- • Input sanitization – Protection against injection attacks
- • Rate limiting ready – Infrastructure hooks for production deployment
- • CORS configuration – Secure cross-origin requests for government domains
-
-⸻
-
-🎯 Core Capabilities
-
-1. Intelligent Query Processing
-
-βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
-β PDBOT Query Pipeline β
-βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
-β User Query β Classifier β RAG Retrieval β LLM Generation β β
-β β β β β
-β 12-Class Semantic + Strict 45β70 β
-β Detection Reranking Word Answers β
-βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
-
- • Classifier-first design – Queries are assigned to one of 12 semantic classes.
- • RAG-centric – Answers are generated strictly from retrieved passages.
- • Length control – Responses are constrained to ~45–70 words by default for readability.
-
-2. Multi-Class Query Classification
-
-Class Description Example
-numeric_query Financial/approval limits “What is the CDWP approval limit?”
-definition_query Terminology explanation “What is PC-I?”
-procedure_query Process workflows “How does project revision work?”
-compliance_query Regulatory requirements “What are M&E requirements?”
-timeline_query Duration/deadlines “How long for ECNEC approval?”
-off_scope Non-manual topics Non-MDP topics are handled gracefully
-red_line Inappropriate content Blocked with warning / safe response
-
-Additional internal classes handle reference queries, meta-questions, and navigation-style prompts.
-
-3. Retrieval-Augmented Generation
- • Sentence-level chunking – 1–3 sentence segments for precise grounding
- • Dual-phase retrieval – Vector search + cross-encoder reranking
- • Numeric boosting – +25% score boost for numeric/financial passages
- • Page-level citations – Every response includes source page information
-
-⸻
-
-ποΈ System Architecture
-
-βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
-β PDBOT v2.2.0 β
-βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
-β β
-β βββββββββββββββ βββββββββββββββ βββββββββββββββ β
-β β React ββββββΆβ Flask API ββββββΆβ RAG β β
-β β Widget β β (REST) β β Pipeline β β
-β βββββββββββββββ βββββββββββββββ βββββββββββββββ β
-β β β β β
-β β β βΌ β
-β β β βββββββββββββββ β
-β β β β Qdrant β β
-β β β β (Vectors) β β
-β β β βββββββββββββββ β
-β β β β β
-β β βΌ βΌ β
-β β βββββββββββββββ βββββββββββββββ β
-β β β Ollama β βββ β Classifier β β
-β β β (Mistral) β β (12-Class) β β
-β β βββββββββββββββ βββββββββββββββ β
-β β β β
-β β βΌ β
-β β βββββββββββββββ β
-β ββββββββββββΆβ Groq β (Fallback β LLaMA 3) β
-β β (LLaMA 3) β β
-β βββββββββββββββ β
-β β
-βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
-
-Technology Stack
-
-Layer Technology Purpose
-Frontend React 18, Vite 5 Modern widget interface
-API Flask, Flask-CORS REST API bridge
-RAG LangChain, Qdrant Vector retrieval pipeline
-Embeddings all-MiniLM-L6-v2 Semantic encoding
-Reranking ms-marco-MiniLM-L-6-v2 Relevance scoring
-LLM Ollama (Mistral) Local response generation
-Fallback Groq (LLaMA 3.1) Cloud failover LLM
-
-
-⸻
-
-π Quick Start
-
-Prerequisites
- • Python 3.10+
- • Node.js 18+ (for widget)
- • Docker Desktop (for Qdrant)
- • 8GB RAM minimum recommended
-
-Option 1: Unified Launcher (Windows)
+PDBOT is an enterprise-grade Retrieval-Augmented Generation (RAG) system developed to provide instant, accurate responses regarding the **Manual for Development Projects 2024**. The system is designed to support government officials, development practitioners, and stakeholders in accessing procedural information efficiently.
+### Key Achievements
+
+| Metric | Achievement | Target |
+|--------|-------------|--------|
+| In-Scope Accuracy | 87.5% | ≥ 85% |
+| Numeric Accuracy | 92.3% | ≥ 90% |
+| Off-Scope Detection | 100% | 100% |
+| Response Time | < 3 seconds | < 5 s |
+| Zero Hallucination | ✅ Verified | Required |
+
+> **Design Goal:** Provide short, precise, source-backed answers while minimizing hallucination and maintaining strict procedural correctness for the Manual for Development Projects 2024.
+
+[⬆️ Back to Top](#-pdbot--planning--development-intelligent-assistant)
+
+---
+
+## π What's New in Version 2.2.0
+
+### 🖥️ Standalone React Widget
+- **Independent deployment** – No Streamlit dependency required
+- **Embeddable component** – Easy integration into government portals
+- **Modern UI/UX** – Floating, draggable, minimizable interface
+- **Government branding** – Official color scheme and styling
+
+### 🧠 Contextual Memory
+- **Session-based memory** – Maintains conversation context
+- **Follow-up understanding** – Handles pronouns and references
+- **Automatic cleanup** – Memory management per session
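
As a rough illustration of the session memory bullets above, the sketch below keeps a bounded per-session history and clears it on reset. The `SessionMemory` class, its method names, and the five-turn limit are assumptions made for illustration; they are not taken from the PDBOT source.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

MAX_TURNS = 5  # assumed history size; the README does not state the real value


class SessionMemory:
    """Hypothetical in-memory store of recent (question, answer) turns per session."""

    def __init__(self, max_turns: int = MAX_TURNS) -> None:
        self._max_turns = max_turns
        self._turns: Dict[str, Deque[Tuple[str, str]]] = {}

    def add_turn(self, session_id: str, question: str, answer: str) -> None:
        # A bounded deque drops the oldest turn once the limit is reached.
        history = self._turns.setdefault(session_id, deque(maxlen=self._max_turns))
        history.append((question, answer))

    def history(self, session_id: str) -> List[Tuple[str, str]]:
        # Unknown sessions simply have no history.
        return list(self._turns.get(session_id, ()))

    def reset(self, session_id: str) -> None:
        # Mirrors "cleared on chat reset": forget everything for this session.
        self._turns.pop(session_id, None)
```

Recent turns returned by `history()` could be prepended to the next prompt so that pronouns in a follow-up question can be resolved against earlier answers.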
+
+### π Source Transparency
+- **View Passages** – See exact text used for response generation
+- **View Sources** – Page-level citations with relevance scores
+- **Audit trail** – Full traceability for governance requirements
+
+### 🛡️ Enhanced Security
+- **Input sanitization** – Protection against injection attacks
+- **Rate limiting ready** – Infrastructure hooks for production deployment
+- **CORS configuration** – Secure cross-origin requests for government domains
+
+[⬆️ Back to Top](#-pdbot--planning--development-intelligent-assistant)
+
+---
+
+## 🎯 Core Capabilities
+
+### 1. Intelligent Query Processing
+
+```mermaid
+flowchart LR
+ A[User Query] --> B[Classifier]
+ B --> C[RAG Retrieval]
+ C --> D[LLM Generation]
+ D --> E[Response]
+
+ B --> |12-Class Detection| B1[Query Type]
+ C --> |Semantic + Reranking| C1[Top Passages]
+ D --> |45-70 Words| D1[Precise Answer]
+```
+
+- **Classifier-first design** – Queries are assigned to one of 12 semantic classes
+- **RAG-centric** – Answers are generated strictly from retrieved passages
+- **Length control** – Responses are constrained to ~45–70 words by default for readability
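
The classifier-first flow and the length target can be sketched roughly as follows. This is a minimal illustration under stated assumptions: `answer_query`, `within_length_window`, and the callables passed in are hypothetical names, and the real pipeline may route and measure length differently.

```python
from typing import Callable, List

# Classes that skip retrieval entirely (names taken from the classification table).
BLOCKED_CLASSES = {"off_scope", "red_line"}


def answer_query(
    query: str,
    classify: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> dict:
    """Classify first, then retrieve and generate only for in-scope queries."""
    label = classify(query)
    if label in BLOCKED_CLASSES:
        # No retrieval or LLM call for off-scope or red-line queries.
        return {"label": label, "answer": None, "passages": []}
    passages = retrieve(query)
    answer = generate(query, passages)
    return {"label": label, "answer": answer, "passages": passages}


def within_length_window(text: str, low: int = 45, high: int = 70) -> bool:
    """Check the default 45-70 word target using a whitespace word count."""
    return low <= len(text.split()) <= high
```

Running the classifier before retrieval means blocked queries never reach the vector store or the LLM, which keeps their handling cheap and predictable.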
+
+### 2. Multi-Class Query Classification
+
+| Class | Description | Example |
+|-------|-------------|---------|
+| `numeric_query` | Financial/approval limits | "What is the CDWP approval limit?" |
+| `definition_query` | Terminology explanation | "What is PC-I?" |
+| `procedure_query` | Process workflows | "How does project revision work?" |
+| `compliance_query` | Regulatory requirements | "What are M&E requirements?" |
+| `timeline_query` | Duration/deadlines | "How long for ECNEC approval?" |
+| `off_scope` | Non-manual topics | Non-MDP topics are handled gracefully |
+| `red_line` | Inappropriate content | Blocked with warning / safe response |
+
+> Additional internal classes handle reference queries, meta-questions, and navigation-style prompts.
+
+### 3. Retrieval-Augmented Generation
+
+- **Sentence-level chunking** – 1–3 sentence segments for precise grounding
+- **Dual-phase retrieval** – Vector search + cross-encoder reranking
+- **Numeric boosting** – +25% score boost for numeric/financial passages
+- **Page-level citations** – Every response includes source page information
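
One way to picture the numeric boost is as a multiplier applied to reranker scores before the final sort. The sketch below assumes that scores are non-negative, that the boost is multiplicative, and that a passage counts as numeric when it contains any digit; none of these details are confirmed by the README, and `boost_and_rank` is a hypothetical name.

```python
import re
from typing import List, Tuple

NUMERIC_PATTERN = re.compile(r"\d")  # assumption: any digit marks a numeric passage
NUMERIC_BOOST = 1.25  # the +25% boost described above


def boost_and_rank(scored: List[Tuple[str, float]], top_k: int = 3) -> List[Tuple[str, float]]:
    """Apply the numeric boost to reranker scores and return the top_k passages."""
    boosted = [
        (text, score * NUMERIC_BOOST if NUMERIC_PATTERN.search(text) else score)
        for text, score in scored
    ]
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)[:top_k]


# Made-up passages and scores, used only to show the reordering.
candidates = [
    ("Passage A has no figures.", 0.60),
    ("Passage B cites a limit of 10 units.", 0.50),
]
for text, score in boost_and_rank(candidates):
    print(f"{score:.3f}  {text}")
```

With these made-up scores, the numeric passage moves ahead of the higher-scored non-numeric one after boosting, which is the effect the bullet describes.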
+
+[⬆️ Back to Top](#-pdbot--planning--development-intelligent-assistant)
+
+---
+
+## ποΈ System Architecture
+
+```mermaid
+flowchart TB
+ subgraph Frontend
+ A[React Widget]
+ B[Streamlit Dashboard]
+ end
+
+ subgraph Backend
+ C[Flask API]
+ D[RAG Pipeline]
+ E[12-Class Classifier]
+ end
+
+ subgraph Storage
+ F[(Qdrant Vector DB)]
+ end
+
+ subgraph LLM Layer
+ G[Ollama - Mistral]
+ H[Groq - LLaMA 3]
+ end
+
+ A --> C
+ B --> C
+ C --> D
+ C --> E
+ D --> F
+ D --> G
+ D --> H
+ E --> D
+```
+
+### Technology Stack
+
+| Layer | Technology | Purpose |
+|-------|------------|---------|
+| Frontend | React 18, Vite 5 | Modern widget interface |
+| API | Flask, Flask-CORS | REST API bridge |
+| RAG | LangChain, Qdrant | Vector retrieval pipeline |
+| Embeddings | all-MiniLM-L6-v2 | Semantic encoding |
+| Reranking | ms-marco-MiniLM-L-6-v2 | Relevance scoring |
+| LLM | Ollama (Mistral) | Local response generation |
+| Fallback | Groq (LLaMA 3.1) | Cloud failover LLM |
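
The local-first, cloud-failover arrangement in the table can be illustrated with a small ordering loop. This is a sketch only: `generate_with_fallback`, `local_llm`, and `cloud_llm` are hypothetical stand-ins and do not call the Ollama or Groq clients.

```python
from typing import Callable, Optional, Sequence


def generate_with_fallback(prompt: str, backends: Sequence[Callable[[str], str]]) -> str:
    """Try each backend in order and return the first successful completion."""
    last_error: Optional[Exception] = None
    for backend in backends:
        try:
            return backend(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            last_error = exc
    raise RuntimeError("all LLM backends failed") from last_error


def local_llm(prompt: str) -> str:
    # Stand-in for the local model; simulates an outage.
    raise ConnectionError("local model unavailable")


def cloud_llm(prompt: str) -> str:
    # Stand-in for the cloud failover model.
    return "cloud: " + prompt


print(generate_with_fallback("What is PC-I?", [local_llm, cloud_llm]))
```

Keeping the backends as an ordered list makes the preference explicit: the local model is always tried first, and the cloud model is used only when the local call raises.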
+
+### Language Distribution
+
+| Language | Lines of Code |
+|----------|---------------|
+| π Python | ~346K LoC |
+| π JavaScript | ~72K LoC |
+| 🎨 CSS | Various |
+| π HTML | Various |
+| 🐳 Dockerfile | Build configs |
+| π¦ Batch/PowerShell | Windows scripts |
+
+[⬆️ Back to Top](#-pdbot--planning--development-intelligent-assistant)
+
+---
+
+## π Quick Start
+
+### Prerequisites
+
+- **Python 3.10+**
+- **Node.js 18+** (for widget)
+- **Docker Desktop** (for Qdrant)
+- **8GB RAM** minimum recommended
+
+### Option 1: Unified Launcher (Windows)
+
+```batch
:: Double-click or run:
start_pdbot.bat
:: Then select:
:: [1] React Widget (Modern UI)
:: [2] Streamlit App (Admin Dashboard)
+```
-Option 2: Manual Setup
+### Option 2: Manual Setup
+```bash
# 1. Clone repository
git clone https://github.com/athem135-source/PDBOT.git
cd PDBOT
@@ -215,16 +268,19 @@ python widget_api.py
# 5b. For Streamlit App (Admin / Testing)
streamlit run src/app.py
+```
+[⬆️ Back to Top](#-pdbot--planning--development-intelligent-assistant)
-⸻
+---
-π Website Integration
+## π Website Integration
-Embedding the Widget
+### Embedding the Widget
Add the PDBOT widget to any government portal with a single script tag:
+```html
+```
-Production Build
+### Production Build
+```bash
cd frontend-widget
npm run build
# Output in dist/ folder
# Deploy dist/ to your web server (Nginx/Apache/etc.)
+```
-Docker Deployment
+### Docker Deployment
+```dockerfile
# Dockerfile.widget
FROM node:18-alpine AS builder
WORKDIR /app
@@ -255,51 +315,144 @@ RUN npm install && npm run build
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
EXPOSE 80
+```
+
+[⬆️ Back to Top](#-pdbot--planning--development-intelligent-assistant)
+
+---
+
+## π Performance Metrics
+
+### Accuracy Validation (Based on 50+ Test Cases)
+
+| Category | Tests | Passed | Accuracy |
+|----------|-------|--------|----------|
+| Definitions | 12 | 11 | 91.7% |
+| Numeric/Financial | 15 | 14 | 93.3% |
+| Procedures | 10 | 8 | 80.0% |
+| Approvals/Limits | 8 | 7 | 87.5% |
+| Off-Scope Detection | 10 | 10 | 100% |
+| **Overall** | **55** | **50** | **90.9%** |
+
+### Response Quality
+
+- **Average response length:** 52 words (target: 45–70)
+- **Source citation rate:** 100%
+- **Numeric extraction rate:** 93%
+- **False refusal rate:** < 5%
+
+### System Performance
+
+| Metric | Value |
+|--------|-------|
+| Average response time | 2.4 seconds |
+| Vector search latency | < 100 ms |
+| Reranking latency | < 200 ms |
+| LLM generation | 1.5–2.0 seconds |
+| Memory per session | < 1 MB |
+
+[⬆️ Back to Top](#-pdbot--planning--development-intelligent-assistant)
+
+---
+
+## π Security Considerations
+
+### Data Protection
+- All queries processed in-memory (no persistent logging of user data)
+- Session-based memory cleared on chat reset
+- No PII collection or storage
+### Input Validation
+- Query length limits enforced
+- Special character sanitization
+- Injection attack prevention (prompt & input level)
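
The input validation bullets above could look roughly like the following. This is a minimal sketch: the 500-character limit, the `sanitize_query` name, and the choice to strip ASCII control characters are assumptions, and it does not cover prompt-level injection defenses.

```python
import re

MAX_QUERY_CHARS = 500  # assumed limit; the README does not state the real value
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")  # ASCII control characters


def sanitize_query(raw: str) -> str:
    """Strip control characters, collapse whitespace, and enforce a length limit."""
    cleaned = CONTROL_CHARS.sub(" ", raw)
    cleaned = " ".join(cleaned.split())
    if not cleaned:
        raise ValueError("query is empty")
    if len(cleaned) > MAX_QUERY_CHARS:
        raise ValueError("query exceeds the length limit")
    return cleaned
```

Rejecting over-long input with an error, rather than silently truncating it, avoids answering a question the user did not actually ask.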
-⸻
+### Network Security
+- CORS restrictions configurable
+- Rate limiting infrastructure ready
+- HTTPS required for production
-π Performance Metrics
+[⬆️ Back to Top](#-pdbot--planning--development-intelligent-assistant)
-Accuracy Validation (Based on 50+ Test Cases)
+---
-Category Tests Passed Accuracy
-Definitions 12 11 91.7%
-Numeric/Financial 15 14 93.3%
-Procedures 10 8 80.0%
-Approvals/Limits 8 7 87.5%
-Off-Scope Detection 10 10 100%
-Overall 55 50 90.9%
+## π Project Structure
-Response Quality
- • Average response length: 52 words (target: 45–70)
- • Source citation rate: 100%
- • Numeric extraction rate: 93%
- • False refusal rate: < 5%
+```
+PDBOT/
+├── src/                    # Core Python source code
+│   ├── app.py              # Streamlit application
+│   ├── rag_pipeline.py     # RAG implementation
+│   └── classifier.py       # Query classification
+├── frontend-widget/        # React widget
+│   ├── src/                # React components
+│   └── dist/               # Production build
+├── config/                 # Configuration files
+├── data/                   # Document data
+├── docker/                 # Docker configurations
+├── docs/                   # Documentation
+├── tests/                  # Test suites
+├── scripts/                # Utility scripts
+├── widget_api.py           # Flask API for widget
+├── requirements.txt        # Python dependencies
+└── start_pdbot.bat         # Windows launcher
+```
-System Performance
+---
-Metric Value
-Average response time 2.4 seconds
-Vector search latency < 100 ms
-Reranking latency < 200 ms
-LLM generation 1.5–2.0 seconds
-Memory per session < 1 MB
+## π Documentation
+| Document | Description |
+|----------|-------------|
+| [README.md](README.md) | This file - Project overview |
+| [SECURITY.md](SECURITY.md) | Security policies and reporting |
+| [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) | Community guidelines |
+| [LICENSE](LICENSE) | Proprietary license terms |
-⸻
+---
-π Security Considerations
+## π€ Contributing
-Data Protection
- • All queries processed in-memory (no persistent logging of user data)
- • Session-based memory cleared on chat reset
- • No PII collection or storage
+This is a proprietary project. Contributions are welcome only with prior authorization from the copyright holder.
-Input Validation
- • Query length limits enforced
- • Special character sanitization
- • Injection attack prevention (prompt & input level)
+For contribution inquiries, please contact [athem135-source](https://github.com/athem135-source).
-Network Security
- • CORS restrictions configurable
\ No newline at end of file
+---
+
+## π Support & Contact
+
+For support, licensing, or inquiries:
+
+- **Developer:** [athem135-source](https://github.com/athem135-source)
+- **Repository:** [github.com/athem135-source/PDBOT](https://github.com/athem135-source/PDBOT)
+
+---
+
+## π License
+
+This project is **proprietary software** developed by [athem135-source](https://github.com/athem135-source).
+
+- **All rights reserved**
+- Unauthorized copying, modification, or distribution is prohibited
+- Built under contract for the Government of Pakistan, Ministry of Planning, Development & Special Initiatives
+- For licensing inquiries, contact the developer
+
+See the [LICENSE](LICENSE) file for complete terms.
+
+© 2024 athem135-source. All Rights Reserved.
+
+---
+
+
+
+**Developed by [athem135-source](https://github.com/athem135-source)**
+
+*Built for the Government of Pakistan - Ministry of Planning, Development & Special Initiatives*
+
+---
+
+**PDBOT v2.2.0** • Built with 🤖 AI for 🏛️ Government
+
+© 2024 athem135-source. All Rights Reserved.
+
+