A modern, AI-powered PDF management and interaction platform built with Next.js and Express.js. Upload, analyze, and chat with your PDF documents using advanced AI technology.
- AI-Powered PDF Chat: Interact with your PDF documents using advanced AI
- Qwen2-VL-2B-Instruct Model: Local vision-language model for advanced PDF understanding
- Multi-Model AI Integration: Google Generative AI + Qwen2-VL for optimal performance
- Modern UI/UX: Beautiful, responsive design with glass morphism effects
- Smart Analysis: Extract insights and analyze PDF content with AI
- Note Management: Create and manage notes from your PDFs
- Image Extraction: Extract and analyze images from PDF documents
- Conversational AI: Natural language processing for interactive discussions
- Context-Aware Responses: AI maintains conversation history for meaningful interactions
- Automated Summarization: Generate intelligent summaries of PDF content
- User Authentication: Secure user management with JWT
- Dashboard: Centralized management of all your PDFs and notes
- RESTful API: Well-documented API with Swagger/OpenAPI
Our PDF Helper AI leverages cutting-edge generative AI technologies to provide intelligent document interaction:
- Qwen2-VL-2B-Instruct: Advanced vision-language model deployed locally for superior PDF understanding and multimodal analysis
- Google Generative AI (Gemini): Integrated for advanced reasoning, content generation, and multi-modal understanding
- Hybrid AI Architecture: Combines the power of cloud-based GenAI with local custom models for optimal performance and privacy
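One way to picture the hybrid architecture is a small routing function that decides, per request, whether the local model or the cloud model should answer. The sketch below is purely illustrative: the `AnalysisRequest` fields, the `chooseModel` function, the target names, and the token cutoff are all assumptions and are not taken from the project's code.

```typescript
// Hypothetical routing sketch: none of these names come from the project.
type ModelTarget = "local-qwen2-vl" | "cloud-gemini";

interface AnalysisRequest {
  confidential: boolean; // the user asked to keep the document on-device
  promptTokens: number;  // rough size of the prompt sent to the model
}

// Assumed cutoff above which the small local model is skipped; tune as needed.
const LOCAL_TOKEN_LIMIT = 4000;

function chooseModel(req: AnalysisRequest): ModelTarget {
  // Confidential documents are always handled by the local model.
  if (req.confidential) return "local-qwen2-vl";
  // Otherwise, long prompts go to the cloud model and short ones stay local.
  return req.promptTokens > LOCAL_TOKEN_LIMIT ? "cloud-gemini" : "local-qwen2-vl";
}
```

A real router would likely also consider model availability and latency; this sketch only shows the privacy-versus-capacity trade-off the list describes.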
- Vision-Language Understanding: Specialized model capable of processing both text and visual content from PDFs
- Document Analysis: Optimized for document understanding with 2B parameters for efficient local inference
- Image Comprehension: Advanced visual reasoning capabilities for charts, diagrams, and images within PDFs
- Instruction Following: Fine-tuned for following complex instructions and providing detailed responses
- Lightweight Architecture: 2B parameter model optimized for local deployment with minimal resource requirements
- Privacy-First: The local model runs entirely offline, keeping the documents it processes confidential
- Vision-Language Processing: Qwen2-VL-2B-Instruct model trained for comprehensive document understanding
- Local Deployment: Runs entirely on-premise for maximum privacy and data security
- Optimized Inference: GPU-accelerated processing with model quantization for fast responses
- Privacy-First: All document processing happens locally, ensuring confidentiality
- Multimodal Understanding: Enhanced capability for processing text, images, charts, and diagrams in PDFs
- Efficient Architecture: 2B parameter model provides excellent performance with minimal resource usage
- Intelligent Conversations: Natural language interface for document queries and analysis
- Smart Summarization: Automatic generation of key insights and executive summaries
- Semantic Search: Advanced content discovery using vector embeddings and similarity matching
- Multimodal Analysis: Process both text and images within PDFs using OCR and vision models
- Content Generation: Create structured notes, outlines, and reports from PDF content
- Predictive Analysis: AI suggests relevant questions and topics based on document context
- Performance Optimization: Continuous model improvement through feedback loops and usage analytics
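The semantic search item mentions vector embeddings and similarity matching. A minimal sketch of that idea, assuming cosine similarity as the metric and an in-memory list of chunks, is shown here. The `Chunk` shape and the `topK` helper are hypothetical, and the project may rank results differently (for example inside the Redis vector store).

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as matching nothing instead of dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical shape for an embedded piece of PDF text.
interface Chunk {
  id: string;
  embedding: number[];
}

// Returns the k chunks whose embeddings are most similar to the query.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

A linear scan like this is fine for a handful of documents; a dedicated vector index is the usual choice once the number of chunks grows.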
- Qwen2-VL-2B-Instruct: Local vision-language model for document understanding and analysis
- LM Studio SDK: Local model management and inference optimization
- Redis Vector Store: Efficient storage and retrieval of document embeddings
- Custom Training Pipeline: Automated model fine-tuning and deployment workflow
- API Gateway: Seamless integration between multiple AI models and services
- Next.js 15 - React framework with App Router
- TypeScript - Type-safe JavaScript
- Tailwind CSS - Utility-first CSS framework
- Lucide React - Beautiful icons
- Zustand - State management
- React Hot Toast - Notifications
- Express.js - Web framework for Node.js
- MongoDB - NoSQL database with Mongoose ODM
- Google Generative AI - AI integration
- Redis - Caching and session management
- JWT - Authentication
- Multer - File upload handling
- PDF-Parse - PDF text extraction
- Tesseract.js - OCR for image text extraction
- Node.js 18+
- MongoDB database
- Redis server
- Google Generative AI API key
- Clone the repository

      git clone https://github.com/govindmehta/pdfHelper.git
      cd pdfHelper

- Backend Setup

      cd backend
      npm install

  Create a `.env` file in the backend directory:

      PORT=5000
      MONGODB_URI=mongodb://localhost:27017/pdfhelper
      JWT_SECRET=your-jwt-secret-key
      GOOGLE_API_KEY=your-google-generative-ai-key
      REDIS_URL=redis://localhost:6379

- Frontend Setup

      cd ../frontend
      npm install

  Create a `.env.local` file in the frontend directory:

      NEXT_PUBLIC_API_URL=http://localhost:5000

- Start the backend server

      cd backend
      npm run dev

- Start the frontend development server

      cd frontend
      npm run dev

- Access the application
  - Frontend: http://localhost:3000
  - Backend API: http://localhost:5000
  - API Documentation: http://localhost:5000/api-docs
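A missing backend variable usually shows up as a confusing runtime error, so it can help to check the environment at startup. The sketch below is one possible approach, not the project's own code: the variable names come from the backend `.env` example in the setup steps, while the `missingEnvVars` helper and the `Env` type are hypothetical.

```typescript
// Variable names taken from the backend .env example in the setup steps.
const REQUIRED_BACKEND_VARS = [
  "PORT",
  "MONGODB_URI",
  "JWT_SECRET",
  "GOOGLE_API_KEY",
  "REDIS_URL",
] as const;

// Same shape as process.env in Node.js, passed in explicitly so it is easy to test.
type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Env): string[] {
  return REQUIRED_BACKEND_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a server entry point this could be called as `missingEnvVars(process.env)`, exiting early with a clear message when the returned list is not empty.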
    pdfhelper/
    ├── backend/
    │   ├── config/          # Configuration files
    │   ├── controllers/     # Route controllers
    │   ├── middlewares/     # Custom middleware
    │   ├── models/          # Database models
    │   ├── routes/          # API routes
    │   ├── services/        # Business logic
    │   ├── utils/           # Utility functions
    │   ├── uploads/         # File uploads
    │   └── server.js        # Main server file
    ├── frontend/
    │   ├── src/
    │   │   ├── app/         # Next.js app router
    │   │   ├── components/  # React components
    │   │   └── lib/         # Utility libraries
    │   ├── public/          # Static assets
    │   └── package.json
    └── README.md
- `POST /api/users/register` - Register new user
- `POST /api/users/login` - User login
- `GET /api/users/profile` - Get user profile

- `POST /api/pdfs/upload` - Upload PDF file
- `GET /api/pdfs` - Get user's PDFs
- `GET /api/pdfs/:id` - Get specific PDF
- `DELETE /api/pdfs/:id` - Delete PDF

- `POST /api/ai/chat` - Chat with PDF content
- `GET /api/ai/conversation/:pdfId` - Get conversation history

- `POST /api/notes` - Create note
- `GET /api/notes` - Get user's notes
- `PUT /api/notes/:id` - Update note
- `DELETE /api/notes/:id` - Delete note
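To call these endpoints from a client, a small helper that assembles the URL, method, and headers keeps the calls consistent. The following is a sketch under assumptions: the `buildApiRequest` helper and `ApiRequest` type are hypothetical, and sending the JWT as a `Bearer` token with a JSON body is a common convention that this document does not confirm. Check the Swagger docs at `/api-docs` for the actual request shapes.

```typescript
// Arguments that can be handed to fetch(url, init).
interface ApiRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body?: string };
}

// Builds a request description; the Bearer scheme and JSON encoding are assumptions.
function buildApiRequest(
  baseUrl: string,
  method: "GET" | "POST" | "PUT" | "DELETE",
  path: string,
  token?: string,
  body?: unknown,
): ApiRequest {
  const headers: Record<string, string> = {};
  if (token) headers["Authorization"] = `Bearer ${token}`;
  const init: ApiRequest["init"] = { method, headers };
  if (body !== undefined) {
    headers["Content-Type"] = "application/json";
    init.body = JSON.stringify(body);
  }
  // Strip trailing slashes so "http://host/" + "/api/pdfs" has no double slash.
  const url = baseUrl.replace(/\/+$/, "") + path;
  return { url, init };
}
```

For example, `buildApiRequest("http://localhost:5000", "GET", "/api/pdfs", token)` yields a `url` and `init` pair that can be passed straight to `fetch`, which is built into Node.js 18+ and browsers.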
- Landing Page: Modern hero section with gradient animations
- Dashboard: Glass morphism design with PDF management
- Chat Interface: Real-time AI conversation with structured responses
- Authentication: Clean login/register forms
- AI Response: Structured content parsing with icons and formatting
- JWT-based authentication
- Input validation and sanitization
- File upload restrictions
- CORS configuration
- Rate limiting (recommended for production)
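File upload restrictions typically combine a size cap with a check that the content really is a PDF. The sketch below illustrates the idea without claiming to match the project's middleware: the `validatePdfUpload` helper and the 10 MB cap are assumptions, while the `%PDF-` prefix is how PDF files begin.

```typescript
// Assumed size cap for illustration; the project's real limit is not stated here.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024;

// PDF files start with the ASCII bytes "%PDF-".
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d];

// Returns null when the upload looks acceptable, otherwise a short reason.
function validatePdfUpload(
  bytes: Uint8Array,
  maxBytes: number = MAX_UPLOAD_BYTES,
): string | null {
  if (bytes.length === 0) return "file is empty";
  if (bytes.length > maxBytes) return "file exceeds the size limit";
  // Compare the leading bytes with the magic prefix; short files fail here too.
  const hasMagic = PDF_MAGIC.every((byte, i) => bytes[i] === byte);
  return hasMagic ? null : "file does not look like a PDF";
}
```

Checking the leading bytes is more reliable than trusting the file extension or the client-supplied MIME type, although it does not prove the rest of the file is a valid PDF.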
- Mobile-first approach
- Breakpoint-specific layouts
- Touch-friendly interactions
- Optimized performance
- Set up MongoDB Atlas or your preferred database
- Configure Redis instance
- Set environment variables
- Deploy to your preferred platform (Heroku, AWS, etc.)
- Build the application: `npm run build`
- Deploy to Vercel, Netlify, or your preferred platform
- Configure environment variables
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the ISC License.
- Google Generative AI for powerful AI capabilities
- The open-source community for amazing tools and libraries
- Contributors who help improve this project
For support, please create an issue in the GitHub repository or contact the maintainers.
Made with ❤️ by Govind Mehta