Get up and running with the Multi-Modal Academic Research System in 5 minutes.
- Prerequisites Checklist
- 5-Minute Setup
- Your First Query
- Collecting Your First Papers
- Understanding the Interface
- Next Steps
Before you begin, make sure you have the following:
- Python 3.9+ installed (python --version)
- Docker installed and running (docker --version)
- Google Gemini API key (free from https://makersuite.google.com/app/apikey)
- Project downloaded/cloned to your local machine
Not ready? See the full Installation Guide for detailed setup instructions.
Open a terminal and run:
docker run -d \
--name opensearch-research \
-p 9200:9200 \
-e "discovery.type=single-node" \
-e "OPENSEARCH_INITIAL_ADMIN_PASSWORD=MyStrongPassword@2024!" \
opensearchproject/opensearch:latest
Wait 30 seconds for OpenSearch to initialize.
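Rather than waiting a fixed 30 seconds, you could poll the port until it accepts connections. The following is a minimal sketch and not part of the project: `wait_for_port` is a hypothetical helper name, and an open port only shows that the container is listening, not that OpenSearch has finished initializing.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; it only checks that the port is open.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For example, `wait_for_port("localhost", 9200)` should return `True` once the port published by the `docker run` command above becomes reachable.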
Navigate to the project directory and run:
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Create your .env file:
cp .env.example .env
Edit .env and add your Gemini API key:
GEMINI_API_KEY=your_actual_api_key_here
OPENSEARCH_HOST=localhost
OPENSEARCH_PORT=9200
Then start the application:
python main.py
You should see:
Initializing Multi-Modal Research Assistant...
Connected to OpenSearch at localhost:9200
Research Assistant ready!
Opening web interface...
Running on local URL: http://0.0.0.0:7860
Running on public URL: https://xxxxx.gradio.live
Success! Open http://localhost:7860 in your browser.
The system comes with no data initially. Let's test it with a simple query to understand the interface, then collect some papers.
- Navigate to the Research tab (should be open by default)
- In the query box, type: "What is machine learning?"
- Click Ask Question
Expected Result: You'll see a message indicating no documents are available yet. This is normal - you need to collect data first!
Let's populate the system with academic papers about a topic.
- Click the Data Collection tab at the top of the interface
- Find the Collect Papers section
- Enter a topic you're interested in, for example:
  - "machine learning"
  - "natural language processing"
  - "computer vision"
  - "quantum computing"
- Set Number of papers to: 5 (for a quick start)
- Click Collect Papers
What happens: The system will:
- Search ArXiv for the latest papers on your topic
- Download the PDFs
- Extract text and diagrams
- Analyze diagrams using Gemini Vision
- Generate embeddings
- Index everything in OpenSearch
Time: Expect 2-5 minutes for 5 papers.
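To make the first step of that list concrete, here is a hedged sketch of how a search for the latest papers on a topic can be expressed against arXiv's public query API. It only builds the request URL. The function name `build_arxiv_query` is hypothetical, and the project's own collector may query arXiv differently.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint; the rest of this sketch is illustrative only.
ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query(topic: str, max_results: int = 5) -> str:
    """Build an arXiv API URL for the newest papers matching a topic phrase."""
    params = {
        "search_query": f'all:"{topic}"',  # search all fields for the exact phrase
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",         # newest submissions first
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"
```

Fetching the resulting URL returns an Atom feed whose entries include links to each paper's PDF. The later steps (download, extraction, embedding, indexing) are handled by the system itself.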
You'll see status updates like:
Collected 5 papers on 'machine learning'
Processing paper 1/5: "Deep Learning for Computer Vision"...
Indexed paper: Deep Learning for Computer Vision
Processing paper 2/5: "Attention Is All You Need"...
...
- Go back to the Research tab
- Enter a query related to your collected papers:
  - "What are the key concepts in machine learning?"
  - "Explain neural networks"
  - "How do transformers work?"
- Click Ask Question
Expected Result: You'll receive:
- A comprehensive answer synthesized from the papers
- Citations in brackets [1], [2], etc.
- Source information showing which papers were used
Example response:
Machine learning is a subset of artificial intelligence that enables
systems to learn and improve from experience [1]. Key concepts include:
1. Neural Networks: Computational models inspired by biological neurons [1][2]
2. Training: The process of adjusting model parameters using data [2]
3. Deep Learning: Multi-layer neural networks for complex patterns [3]
Sources:
[1] "Deep Learning Fundamentals" (Smith et al., 2023)
[2] "Introduction to Neural Networks" (Johnson, 2023)
[3] "Modern Machine Learning" (Lee et al., 2024)
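If you copy a response like this into your own notes, a quick consistency check is to confirm that every bracketed citation has a matching source entry. The sketch below assumes citations always appear as `[n]` with sources numbered from 1, as in the example above. The function names are hypothetical and not part of the system.

```python
import re

# Matches bracketed numeric citations such as [1] or [12]
CITATION_PATTERN = re.compile(r"\[(\d+)\]")


def cited_numbers(answer: str) -> list[int]:
    """Return the distinct citation numbers used in the answer, ascending."""
    return sorted({int(n) for n in CITATION_PATTERN.findall(answer)})


def unmatched_citations(answer: str, source_count: int) -> tuple[list[int], list[int]]:
    """Compare citations in the answer against sources numbered 1..source_count.

    Returns (sources never cited, citations with no matching source).
    """
    cited = set(cited_numbers(answer))
    listed = set(range(1, source_count + 1))
    return sorted(listed - cited), sorted(cited - listed)
```

Two empty lists mean that every source is cited at least once and every citation points at a listed source.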
The Research Assistant has four main tabs:
Research tab
Purpose: Query your knowledge base and get AI-powered answers with citations
Key Features:
- Query input box
- AI-generated responses with citations
- Conversation history
- Source attribution
Usage Tips:
- Ask specific questions for better results
- Use follow-up questions to dive deeper
- Check citations to verify information
Data Collection tab
Purpose: Gather academic content from multiple sources
Sources Available:
- Academic Papers: ArXiv, PubMed Central, Semantic Scholar
- YouTube Videos: Educational channels and lectures
- Podcasts: Academic and educational podcast episodes
Parameters:
- Topic/search query
- Number of items to collect
- Source preference
Usage Tips:
- Start with 5-10 papers to avoid long wait times
- Choose topics that match your research interests
- Mix different sources (papers, videos, podcasts) for diverse perspectives
Citation Manager tab
Purpose: View and export citations from your research sessions
Features:
- List of all cited sources
- Export to BibTeX format
- Citation details (authors, title, date, URL)
Usage Tips:
- Export citations after each research session
- Use BibTeX exports in your LaTeX documents
- Keep track of sources for academic writing
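For reference, a BibTeX entry built from the citation details listed above (authors, title, date, URL) might look like the output of the sketch below. This is an illustration only: the exact entry type and fields the Citation Manager exports are not documented here, and `to_bibtex`, the `@misc` type, and the sample values are assumptions.

```python
def to_bibtex(key: str, authors: list[str], title: str, year: int, url: str) -> str:
    """Format one citation as a BibTeX @misc entry (illustrative format only)."""
    fields = {
        "author": " and ".join(authors),  # BibTeX separates authors with "and"
        "title": title,
        "year": str(year),
        "url": url,
    }
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@misc{{{key},\n{body}\n}}"


# Placeholder values, not a real paper
print(to_bibtex("smith2023", ["A. Smith", "B. Jones"],
                "Deep Learning Fundamentals", 2023, "https://example.org/paper"))
```

An entry in this shape can be pasted into a `.bib` file and cited from LaTeX with `\cite{smith2023}`.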
Settings tab
Purpose: Configure system settings and connections
Settings:
- OpenSearch connection (host, port)
- API keys (Gemini)
- Index management
- System health status
Usage Tips:
- Check connection status if searches fail
- Verify OpenSearch is running
- Update API keys if needed
Start Small: Collect 5-10 papers initially to test the system
- Faster processing
- Easier to verify quality
- Quick feedback on topics
Scale Up: Once comfortable, collect 20-50 papers per topic
- Better coverage
- More comprehensive answers
- Diverse perspectives
Be Specific:
- Good: "What is the attention mechanism in transformers?"
- Less effective: "Tell me about AI"
Ask Follow-ups:
- "Can you explain that in simpler terms?"
- "What are the practical applications?"
- "How does this compare to other approaches?"
Request Evidence:
- "What evidence supports this claim?"
- "Which papers discuss this topic?"
Mix different content types for richer research:
- Papers: Detailed technical information, formulas, experiments
- Videos: Visual explanations, demonstrations, lectures
- Podcasts: Discussions, interviews, high-level overviews
Update Your Knowledge Base:
- Collect new papers weekly on your topics
- Keep content current with latest research
Monitor Storage:
- PDFs and processed data accumulate in the data/ folder
- Clean up old content periodically
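One way to keep an eye on storage is to total the file sizes under the data folder. This is a minimal sketch using only the standard library. `directory_size_bytes` is a hypothetical helper, and it assumes the data lives in a `data/` directory relative to where you run it.

```python
from pathlib import Path


def directory_size_bytes(root: str) -> int:
    """Return the total size in bytes of all regular files under root."""
    root_path = Path(root)
    if not root_path.is_dir():
        return 0  # a missing directory counts as empty
    return sum(p.stat().st_size for p in root_path.rglob("*") if p.is_file())


if __name__ == "__main__":
    size_mb = directory_size_bytes("data") / (1024 * 1024)
    print(f"data/ currently uses {size_mb:.1f} MB")
```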
Check Logs:
- Review the logs/ directory for any errors
- Helps troubleshoot issues early
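Scanning the log files by hand gets tedious, so a small script can pull out the error lines for you. The sketch below assumes log files match `research_system_*.log` inside `logs/` and that error lines contain the word `ERROR`. The actual log format is not specified in this guide, so adjust the marker as needed. `find_error_lines` is a hypothetical helper.

```python
from pathlib import Path


def find_error_lines(log_dir: str, pattern: str = "research_system_*.log",
                     marker: str = "ERROR") -> list[tuple[str, int, str]]:
    """Return (file name, line number, text) for each log line containing marker."""
    hits = []
    for log_file in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" avoids crashing on unexpected bytes in a log file
        with log_file.open(encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if marker in line:
                    hits.append((log_file.name, line_number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    for name, number, text in find_error_lines("logs"):
        print(f"{name}:{number}: {text}")
```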
Quick Fix:
# Check if OpenSearch is running
docker ps | grep opensearch
# If not running, start it
docker start opensearch-research
# If container doesn't exist, create it (see Step 1)
Quick Fix:
- Verify the .env file exists: ls -la .env
- Check content: cat .env
- Ensure key has no quotes: GEMINI_API_KEY=AIza... not GEMINI_API_KEY="AIza..."
- Restart application: python main.py
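The checks above can also be automated. The following sketch reads the `.env` file directly and reports the problems named in the list: a missing file, a missing or empty key, a quoted value, or the unedited placeholder from `.env.example`. It is a hypothetical helper (`check_env_file`), not part of the project, and it does not verify that the key is actually valid with Gemini.

```python
from pathlib import Path


def check_env_file(path: str = ".env", key: str = "GEMINI_API_KEY") -> list[str]:
    """Return a list of problems found for key in a simple KEY=VALUE .env file."""
    env_path = Path(path)
    if not env_path.is_file():
        return [f"{path} not found"]
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        name, value = line.split("=", 1)
        if name.strip() != key:
            continue
        value = value.strip()
        problems = []
        if not value:
            problems.append(f"{key} is empty")
        elif value[0] in "'\"" or value[-1] in "'\"":
            problems.append(f"{key} is wrapped in quotes")
        if value == "your_actual_api_key_here":
            problems.append(f"{key} still has the placeholder value")
        return problems
    return [f"{key} is not set in {path}"]


if __name__ == "__main__":
    issues = check_env_file()
    print("No problems found" if not issues else "\n".join(issues))
```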
Quick Fix:
- Check logs in the logs/ directory for errors
- Verify Gemini API key is valid
- Try with fewer papers (1-2) to isolate issues
- Check internet connection
Quick Fix:
- Start with fewer papers (5 instead of 20)
- Close other applications to free memory
- Wait for first-time model downloads to complete
- Check Docker has enough memory allocated (4GB+ recommended)
- Collect: Data Collection → Enter "quantum computing" → Collect 10 papers
- Explore: Research → "What is quantum computing?"
- Deep Dive: Research → "How do quantum gates work?"
- Compare: Research → "What are the differences between quantum and classical computing?"
- Export: Citation Manager → Export BibTeX
- Broad Collection: Collect 30 papers on your research area
- Overview: "What are the main research directions in [topic]?"
- Specific Topics: "What methods are used for [specific problem]?"
- Gaps: "What are open challenges in [topic]?"
- Timeline: "How has [topic] evolved over time?"
- Mixed Media: Collect papers + YouTube videos on the topic
- Introduction: "Explain [concept] in simple terms"
- Technical: "What is the mathematical foundation of [concept]?"
- Visual: Videos provide diagrams and animations
- Practice: "What are example applications of [concept]?"
Now that you're up and running:
- Configuration Guide: Customize settings, logging, and advanced options
- Architecture: Understand how the system works under the hood
  - See Technology Stack
- Full Documentation: Explore all features and capabilities
  - See Main README
- Collect papers from multiple sources (ArXiv, PubMed, Semantic Scholar)
- Add YouTube lectures from educational channels
- Include podcast episodes for diverse perspectives
- Create topic-specific collections
- Use citation exports for your papers
- Experiment with different query styles
- Build a comprehensive research database
Still having issues?
- Check Logs: logs/research_system_*.log contains detailed error information
- Verify Setup: Run through the Installation Guide checklist
- Review Configuration: See Configuration Guide
- Common Issues: Full list in Installation Guide - Common Issues
- Documentation: Start with CLAUDE.md in the project root
- Logs: Check the logs/ directory for error details
- Issues: Report bugs or request features on GitHub
Congratulations! You're now ready to conduct AI-powered academic research with multi-modal sources.
Happy researching!