Research Paper Reading Assistant Backend API Server
- Utilizes Upstage Document Parse and Solar LLM to parse, summarize, translate, answer questions, and generate highlights for research papers.
- Defines HTTP endpoints in the `routers` layer and handles Upstage API integration and domain logic in the `services` layer.
| Category | Technology |
|---|---|
| Framework | FastAPI |
| Server | Uvicorn (ASGI) |
| Validation | Pydantic v2 |
| AI/ML | Upstage Document Parse, Solar LLM |
| Deployment | Docker |
- PDF Upload and Parsing: High-precision PDF parsing using Upstage Document Parse API
- Structured Data Extraction: Document structure analysis in HTML and Text formats, automatic document section recognition (Introduction, Methods, Results, etc.)
- Large Document Support: Files up to 50MB, with automatic selection between synchronous parsing (up to 100 pages) and asynchronous parsing (up to 1000 pages)
- Automatic Section Summaries: Recognition and summarization of Introduction, Methods, Results, Discussion, Conclusion
- Interactive Q&A: Solar LLM-based paper content Q&A
- Key Point Extraction: Automatic extraction of main content from each section
- Automatic Highlighting: AI automatically detects and highlights key sentences and important content in papers
- Section Importance Analysis: Automatic identification of the most important parts in each section
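The automatic choice between synchronous and asynchronous parsing listed under Large Document Support could be sketched roughly as follows. This is only an illustration of the page limits stated above: the function name `choose_parse_mode` and the exact boundary handling are assumptions, not the project's actual code.

```python
SYNC_PAGE_LIMIT = 100    # synchronous parsing limit stated in the feature list
ASYNC_PAGE_LIMIT = 1000  # asynchronous parsing limit stated in the feature list


def choose_parse_mode(page_count: int) -> str:
    """Return "sync" or "async" for a document with the given page count.

    Hypothetical helper: the real service may decide differently.
    """
    if page_count < 1 or page_count > ASYNC_PAGE_LIMIT:
        raise ValueError("page count must be between 1 and 1000")
    # Short documents fit the synchronous limit; longer ones go async.
    return "sync" if page_count <= SYNC_PAGE_LIMIT else "async"
```

For example, `choose_parse_mode(12)` selects the synchronous path, while a 250-page document would be routed to asynchronous parsing.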
- Python 3.10+
- Upstage API Key (issued from the Upstage console)
# Clone repository
git clone https://github.com/ScholarLensAI/scholarlensAI-BE.git
cd scholarlensAI-BE

Create the `.env` file:

cp .env.example .env

Example `.env` file:
Replace UPSTAGE_API_KEY and other values with actual values
UPSTAGE_API_KEY=up_your_api_key_here
UPSTAGE_BASE_URL=https://api.upstage.ai/v1
LOG_LEVEL=INFO
DEBUG=False

| Environment Variable | Required | Default | Description |
|---|---|---|---|
| `UPSTAGE_API_KEY` | ✅ | - | Upstage API Key |
| `UPSTAGE_BASE_URL` | ❌ | `https://api.upstage.ai/v1` | API Base URL |
| `LOG_LEVEL` | ❌ | `INFO` | Log Level (DEBUG, INFO, WARNING, ERROR) |
| `DEBUG` | ❌ | `False` | FastAPI Debug Mode |
⚠️ Security Notice: Never commit API keys to the code repository. Manage them using `.env` files or environment variables.
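As a rough illustration of how the variables in the table might be read and validated, here is a standard-library sketch. The `Settings` class and `load_settings` function are hypothetical names; the project's real `config.py` may be organized differently (for example around Pydantic). The sketch assumes the values are already present in the process environment, as with `docker run -e`, and does not parse the `.env` file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

ALLOWED_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}


@dataclass(frozen=True)
class Settings:
    """Hypothetical container mirroring the environment variable table."""
    upstage_api_key: str
    upstage_base_url: str = "https://api.upstage.ai/v1"
    log_level: str = "INFO"
    debug: bool = False


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env

    api_key = env.get("UPSTAGE_API_KEY", "").strip()
    if not api_key:
        # The only required variable according to the table.
        raise RuntimeError("UPSTAGE_API_KEY is required")

    log_level = env.get("LOG_LEVEL", "INFO").strip().upper()
    if log_level not in ALLOWED_LOG_LEVELS:
        raise ValueError(f"unsupported LOG_LEVEL: {log_level}")

    # Assumption: "1", "true", "yes", "on" enable debug; anything else disables it.
    debug = env.get("DEBUG", "False").strip().lower() in {"1", "true", "yes", "on"}

    return Settings(
        upstage_api_key=api_key,
        upstage_base_url=env.get("UPSTAGE_BASE_URL", "https://api.upstage.ai/v1"),
        log_level=log_level,
        debug=debug,
    )
```

Failing fast on a missing key at startup tends to be easier to diagnose than an authentication error on the first upload.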
📌 After running, you can access the API at:
- Backend → http://localhost:8000
- Swagger UI → http://localhost:8000/docs
With only Docker installed, you can run immediately without additional setup.
# Build image
docker build -t scholarlens-backend .
# Run container
docker run \
--name scholarlens \
-p 8000:8000 \
-e UPSTAGE_API_KEY="your_api_key_here" \
-e UPSTAGE_BASE_URL="https://api.upstage.ai/v1" \
-e LOG_LEVEL="INFO" \
-e DEBUG="0" \
scholarlens-backend

Use this for FastAPI development and debugging.
# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Create .env file and set API key
cp .env.example .env
# Enter UPSTAGE_API_KEY in .env file
# Run server (choose one)
python3 main.py # Basic execution
uvicorn main:app --reload --host 0.0.0.0 --port 8000 # Hot reload

scholarlensAI-BE/
├── models/
│ └── schemas.py # Pydantic schema definitions
│
├── routers/ # API endpoints
│ ├── chat.py # Chat API
│ ├── highlights.py # Highlights API
│ ├── summary.py # Summary API
│ └── translation.py # Translation API
│
├── services/ # Business logic (Service layer)
│ ├── document_service.py # Document processing service
│ ├── heading_config.py
│ ├── highlight_service.py # Highlight service
│ ├── summary_service.py # Summary service
│ └── upstage_service.py # Upstage API client
│
├── utils/ # Utility functions
│ └── helpers.py # Common helper functions
│
├── main.py # FastAPI application
├── config.py # Configuration and environment variable management
├── requirements.txt # Python dependencies
├── test.py # Endpoint testing script
├── Dockerfile # Docker image configuration
├── .env.example # Environment variable template
└── .gitignore
This project follows RESTful style and is organized into summary/translation/chat/highlight domains by functionality.
| Method | Path | Description |
|---|---|---|
| GET | `/` | Server status/documentation link |
| GET | `/health` | Health check and API key presence indicator |
| POST | `/api/summary/upload` | Upload PDF and start parsing (returns document ID) |
| GET | `/api/summary/sections/{document_id}` | Query parsed section/heading index |
| GET | `/api/summary/generate/{document_id}` | Generate paper section summaries |
| POST | `/api/summary/section` | Summarize specific section (`document_id`, `section_name` form data) |
| POST | `/api/translation/translate` | Translate text or section (JSON: `text` or `document_id` + `section`) |
| POST | `/api/translation/translate-section` | Section translation based on query parameters |
| GET | `/api/translation/languages` | List supported languages |
| POST | `/api/chat/message` | Document context-based Q&A |
| GET | `/api/chat/history/{document_id}` | (Placeholder) Query chat history |
| DELETE | `/api/chat/history/{document_id}` | (Placeholder) Delete chat history |
| GET | `/api/highlights/{document_id}` | Query highlight areas in document |
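The endpoints in the table can also be called from Python. The following is a minimal standard-library sketch: `build_json_request` and `send` are hypothetical helper names, the translation payload fields follow this README's own request example, and the assumption that every response body is JSON is not confirmed here.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "http://localhost:8000"  # default local address used in this README


def build_json_request(
    method: str,
    path: str,
    payload: Optional[Dict[str, Any]] = None,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Prepare (but do not send) a request for one of the listed endpoints."""
    data = None
    headers = {}
    if payload is not None:
        # JSON-encode the body and declare its content type.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request: urllib.request.Request, timeout: float = 30.0) -> Any:
    """Send the request and decode the body, assuming the server returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


health_request = build_json_request("GET", "/health")
translate_request = build_json_request(
    "POST",
    "/api/translation/translate",
    {"text": "Hello", "source_language": "en", "target_language": "ko"},
)
```

With the server running, `send(health_request)` would perform the health check. Keeping request construction separate from sending makes the payloads easy to inspect without a live server.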
Document Upload

curl -F "file=@paper.pdf" http://localhost:8000/api/summary/upload

Section Summary

curl -X POST \
-F "document_id=..." \
-F "section_name=Introduction" \
http://localhost:8000/api/summary/section

Translation

curl -X POST \
-H "Content-Type: application/json" \
-d '{"text": "Hello", "source_language": "en", "target_language": "ko"}' \
http://localhost:8000/api/translation/translate

`test.py` is a script that sequentially tests all major endpoints.
- Calls major endpoints in order and outputs JSON results to console
- Server must be running (`uvicorn main:app --reload`)
- Terminates immediately on request/response errors
| Option | Description | Example |
|---|---|---|
| `--base-url` | API server address (default: `http://localhost:8000`) | `--base-url http://192.168.1.100:8000` |
| `--pdf` | PDF file path to test | `--pdf ./paper.pdf` |
| `--document-id` | Reuse existing document ID | `--document-id 550e8400-...` |
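For readers writing a similar script, the three options could be declared with `argparse` along these lines. This is a sketch under the assumption that `test.py` uses standard command-line parsing; its actual implementation is not shown in this README, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the three options from the table (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Call the main API endpoints in order."
    )
    parser.add_argument(
        "--base-url",
        default="http://localhost:8000",
        help="API server address",
    )
    parser.add_argument(
        "--pdf",
        default=None,
        help="PDF file path to upload and test",
    )
    parser.add_argument(
        "--document-id",
        default=None,
        help="Reuse an existing document ID instead of uploading",
    )
    return parser


# Parsing an explicit argument list instead of sys.argv, for illustration.
args = build_parser().parse_args(["--pdf", "./paper.pdf"])
```

`argparse` converts `--base-url` and `--document-id` into the attributes `args.base_url` and `args.document_id`.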
Execution Examples
# 1. Health check only (verify server connection)
python3 test.py
# 2. Upload PDF on local server and test all endpoints
python3 test.py --pdf path/to/paper.pdf
# 3. Upload PDF on remote server and test all endpoints
python3 test.py --base-url http://192.168.1.100:8000 --pdf path/to/paper.pdf
# 4. Test with existing document_id (skip upload)
python3 test.py --document-id <doc_id> --base-url http://localhost:8000

When using upload mode, verify that the file path is correct first. The script terminates immediately if the path is invalid.
1. `GET /health` - Verify server status
2. `POST /api/summary/upload` - Upload PDF (with `--pdf` option)
3. `GET /api/summary/sections/{id}` - Query sections
4. `GET /api/summary/generate/{id}` - Full summary
5. `POST /api/summary/section` - Section summary
6. `POST /api/translation/translate` - Text translation
7. `POST /api/translation/translate-section` - Section translation
8. `POST /api/chat/message` - Q&A
9. `GET /api/highlights/{id}` - Highlights
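The way the options shape the test sequence can be summarized in a small sketch. It follows the behavior described in the execution examples (health check only with no options, upload only when a PDF is given, upload skipped when a document ID is reused). The function `plan_steps` is hypothetical, and letting `--document-id` take precedence when both options are given is an assumption.

```python
from typing import List, Optional

# Steps that need a document ID, in the order listed above.
DOCUMENT_STEPS = [
    "GET /api/summary/sections/{id}",
    "GET /api/summary/generate/{id}",
    "POST /api/summary/section",
    "POST /api/translation/translate",
    "POST /api/translation/translate-section",
    "POST /api/chat/message",
    "GET /api/highlights/{id}",
]


def plan_steps(pdf: Optional[str] = None, document_id: Optional[str] = None) -> List[str]:
    """Return the ordered list of endpoint calls for the given options."""
    steps = ["GET /health"]
    if pdf is None and document_id is None:
        # No document available: only the server connection is verified.
        return steps
    if document_id is None:
        # A PDF was given and no ID is being reused, so upload first.
        steps.append("POST /api/summary/upload")
    return steps + DOCUMENT_STEPS
```

Calling `plan_steps(pdf="paper.pdf")` yields all nine steps, while `plan_steps(document_id="...")` omits the upload.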
- Check Swagger UI at http://localhost:8000/docs
- After upload, verify that the `/api/summary/sections/{document_id}` response includes sections/headings
- Verify no errors when calling translation/chat/highlights
- API Key Error: Verify that the correct `UPSTAGE_API_KEY` is set in `.env`
- Upload Failure: Check that the file has a PDF extension and is under 50MB
- Parsing Delay: Large documents may take time because they are parsed asynchronously
- CORS Issues: Check the `CORS_ORIGINS` setting in `config.py` and restart the server
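For the upload failure case, the two checks can be run locally before sending a file. The following is a minimal sketch: `upload_problems` is a hypothetical helper, and treating a file of exactly 50MB as acceptable is an assumption about the boundary.

```python
import os
from typing import List

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # 50MB limit mentioned in this README


def upload_problems(path: str) -> List[str]:
    """Return a list of reasons the file would likely be rejected (empty if none)."""
    if not os.path.isfile(path):
        return [f"file not found: {path}"]

    problems = []
    if not path.lower().endswith(".pdf"):
        problems.append("file extension is not .pdf")
    if os.path.getsize(path) > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 50MB")
    return problems
```

An empty list means both checks passed. Any remaining upload error is then more likely related to the server or the API key.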