scholarlensAI-BE

Research Paper Reading Assistant Backend API Server

Overview

This server uses Upstage Document Parse and Solar LLM to parse, summarize, and translate research papers, answer questions about them, and generate highlights.
HTTP endpoints are defined in the routers layer; Upstage API integration and domain logic are handled in the services layer.

Tech Stack

Category	Technology
Framework	FastAPI
Server	Uvicorn (ASGI)
Validation	Pydantic v2
AI/ML	Upstage Document Parse, Solar LLM
Deployment	Docker

Key Features

1. Document Processing (DocumentParserService)

PDF Upload and Parsing: High-precision PDF parsing using the Upstage Document Parse API
Structured Data Extraction: Document structure analysis in HTML and text formats, with automatic recognition of document sections (Introduction, Methods, Results, etc.)
Large Document Support: Files up to 50MB; the parsing method is selected automatically between synchronous (up to 100 pages) and asynchronous (up to 1000 pages)
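The automatic choice between synchronous and asynchronous parsing can be sketched as a simple page-count check. This is a minimal illustration, not the code in DocumentParserService: the function name, the constant names, and the inclusive boundaries are all assumptions; only the 100-page and 1000-page limits come from the feature list above.

```python
# Page limits taken from the feature list; names and boundary handling are hypothetical.
SYNC_MAX_PAGES = 100     # assumed upper bound for synchronous parsing
ASYNC_MAX_PAGES = 1000   # assumed upper bound for asynchronous parsing


def choose_parse_mode(page_count: int) -> str:
    """Return "sync" or "async" for a document with the given page count."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count <= SYNC_MAX_PAGES:
        return "sync"
    if page_count <= ASYNC_MAX_PAGES:
        return "async"
    raise ValueError("document exceeds the 1000-page limit")
```

A caller would typically count the pages of the uploaded PDF first and then dispatch to the matching Upstage request path based on the returned string.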

2. AI-Based Analysis

Automatic Section Summaries: Recognition and summarization of Introduction, Methods, Results, Discussion, Conclusion
Interactive Q&A: Solar LLM-based paper content Q&A
Key Point Extraction: Automatic extraction of main content from each section
Automatic Highlighting: AI automatically detects and highlights key sentences and important content in papers
Section Importance Analysis: Automatic identification of the most important parts in each section

Quick Start

Requirements

Python 3.10+
Upstage API Key (Get one here)

1. Installation

# Clone repository
git clone https://github.com/ScholarLensAI/scholarlensAI-BE.git
cd scholarlensAI-BE

2. Environment Variable Setup

Create .env file:

cp .env.example .env

.env file example (replace UPSTAGE_API_KEY and the other values with your own):

UPSTAGE_API_KEY=up_your_api_key_here
UPSTAGE_BASE_URL=https://api.upstage.ai/v1
LOG_LEVEL=INFO
DEBUG=False

Environment Variable	Required	Default	Description
`UPSTAGE_API_KEY`	✅	-	Upstage API Key
`UPSTAGE_BASE_URL`	❌	`https://api.upstage.ai/v1`	API Base URL
`LOG_LEVEL`	❌	`INFO`	Log Level (DEBUG, INFO, WARNING, ERROR)
`DEBUG`	❌	`False`	FastAPI Debug Mode

⚠️ Security Notice: Never commit API keys to the code repository. Manage them with .env files or environment variables.
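As an illustration of how the variables in the table might be read, here is a minimal standard-library sketch. It is not the project's config.py; the Settings class, the load_settings function, and the set of strings accepted as "true" are assumptions. Only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the four documented variables."""
    upstage_api_key: str
    upstage_base_url: str
    log_level: str
    debug: bool


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("UPSTAGE_API_KEY", "").strip()
    if not api_key:
        # The only required variable according to the table.
        raise RuntimeError("UPSTAGE_API_KEY is required")
    return Settings(
        upstage_api_key=api_key,
        upstage_base_url=env.get("UPSTAGE_BASE_URL", "https://api.upstage.ai/v1"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        # Assumed truthy spellings; anything else (e.g. "False", "0") is False.
        debug=env.get("DEBUG", "False").strip().lower() in {"1", "true", "yes", "on"},
    )
```

Failing fast on a missing key keeps the error close to startup instead of surfacing later as a failed Upstage request.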

3. Run Server

📌 After running, you can access the API at:

Backend → http://localhost:8000
Swagger UI → http://localhost:8000/docs

Option 1: Docker-based Execution (Recommended)

If Docker is installed, you can run the server immediately with no additional setup.

# Build image
docker build -t scholarlens-backend .

# Run container
docker run \
  --name scholarlens \
  -p 8000:8000 \
  -e UPSTAGE_API_KEY="your_api_key_here" \
  -e UPSTAGE_BASE_URL="https://api.upstage.ai/v1" \
  -e LOG_LEVEL="INFO" \
  -e DEBUG="0" \
  scholarlens-backend

Option 2: Local Development Environment

Use this for FastAPI development and debugging.

# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Create .env file and set API key
cp .env.example .env
# Enter UPSTAGE_API_KEY in .env file

# Run server (choose one)
python3 main.py                                    # Basic execution
uvicorn main:app --reload --host 0.0.0.0 --port 8000  # Hot reload

Project Structure

scholarlensAI-BE/
├── models/
│   └── schemas.py                      # Pydantic schema definitions
│
├── routers/                            # API endpoints
│   ├── chat.py                         # Chat API
│   ├── highlights.py                   # Highlights API
│   ├── summary.py                      # Summary API
│   └── translation.py                  # Translation API
│
├── services/                           # Business logic (Service layer)
│   ├── document_service.py             # Document processing service
│   ├── heading_config.py
│   ├── highlight_service.py            # Highlight service
│   ├── summary_service.py              # Summary service
│   └── upstage_service.py              # Upstage API client
│
├── utils/                              # Utility functions
│   └── helpers.py                      # Common helper functions
│
├── main.py                             # FastAPI application
├── config.py                           # Configuration and environment variable management
├── requirements.txt                    # Python dependencies
├── test.py                             # Endpoint testing script
├── Dockerfile                          # Docker image configuration
├── .env.example                        # Environment variable template
└── .gitignore

API Endpoint Summary

The API follows a RESTful style and is grouped by function into summary, translation, chat, and highlights domains.

Method	Path	Description
GET	`/`	Server status/documentation link
GET	`/health`	Health check and API key presence indicator
POST	`/api/summary/upload`	Upload PDF and start parsing (returns document ID)
GET	`/api/summary/sections/{document_id}`	Query parsed section/heading index
GET	`/api/summary/generate/{document_id}`	Generate paper section summaries
POST	`/api/summary/section`	Summarize specific section (`document_id`, `section_name` form data)
POST	`/api/translation/translate`	Translate text or section (JSON: `text` or `document_id`+`section`)
POST	`/api/translation/translate-section`	Section translation based on query parameters
GET	`/api/translation/languages`	List supported languages
POST	`/api/chat/message`	Document context-based Q&A
GET	`/api/chat/history/{document_id}`	(Placeholder) Query chat history
DELETE	`/api/chat/history/{document_id}`	(Placeholder) Delete chat history
GET	`/api/highlights/{document_id}`	Query highlight areas in document

Request/Response Examples

Document Upload

curl -F "file=@paper.pdf" http://localhost:8000/api/summary/upload

Section Summary

curl -X POST \
  -F "document_id=..." \
  -F "section_name=Introduction" \
  http://localhost:8000/api/summary/section

Translation

curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello", "source_language": "en", "target_language": "ko"}' \
  http://localhost:8000/api/translation/translate
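The translation call above can also be issued from Python. The sketch below builds the same JSON request with the standard library only; the helper names are hypothetical, and the response shape is not documented here, so the result is returned as parsed JSON without further assumptions.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default local address from the Run Server step


def build_translate_request(text, source_language="en", target_language="ko",
                            base_url=BASE_URL):
    """Build (but do not send) the POST request used in the curl example."""
    payload = {
        "text": text,
        "source_language": source_language,
        "target_language": target_language,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/translation/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the decoded JSON body.

    Requires a running server; raises urllib.error.URLError otherwise.
    """
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send(build_translate_request("Hello"))` performs the same call as the curl command.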

Testing

Automated Testing (test.py)

test.py is a script that tests all major endpoints in sequence.

Calls the major endpoints in order and prints the JSON results to the console
The server must be running (uvicorn main:app --reload)
Terminates immediately on any request or response error

Usage

Option	Description	Example
`--base-url`	API server address (default: `http://localhost:8000`)	`--base-url http://192.168.1.100:8000`
`--pdf`	PDF file path to test	`--pdf ./paper.pdf`
`--document-id`	Reuse existing document ID	`--document-id 550e8400-...`
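For readers who want to see how such options are usually declared, here is a minimal argparse sketch of the three flags in the table. It is an illustration only and may differ from the actual test.py; the function name and help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the three documented options (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(description="Sequential endpoint test script")
    parser.add_argument("--base-url", default="http://localhost:8000",
                        help="API server address")
    parser.add_argument("--pdf", default=None,
                        help="path of a PDF file to upload and test with")
    parser.add_argument("--document-id", default=None,
                        help="reuse an existing document ID instead of uploading")
    return parser
```

argparse maps `--base-url` and `--document-id` to the attributes `base_url` and `document_id` on the parsed namespace.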

Execution Examples

# 1. Health check only (verify server connection)
python3 test.py

# 2. Upload PDF on local server and test all endpoints
python3 test.py --pdf path/to/paper.pdf

# 3. Upload PDF on remote server and test all endpoints
python3 test.py --base-url http://192.168.1.100:8000 --pdf path/to/paper.pdf

# 4. Test with existing document_id (skip upload)
python3 test.py --document-id <doc_id> --base-url http://localhost:8000

When using upload mode, verify that the file path is correct first. The script terminates immediately if the path is invalid.

Test Flow

/health - Verify server status
POST /api/summary/upload - Upload PDF (with --pdf option)
GET /api/summary/sections/{id} - Query sections
GET /api/summary/generate/{id} - Full summary
POST /api/summary/section - Section summary
POST /api/translation/translate - Text translation
POST /api/translation/translate-section - Section translation
POST /api/chat/message - Q&A
GET /api/highlights/{id} - Highlights
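The flow above, combined with the three execution modes, can be summarized as a small planning function. This is a sketch under stated assumptions, not the logic of test.py: the function name is hypothetical, and it assumes that an explicit document ID always skips the upload step and that "{id}" stands in for the ID returned by the upload.

```python
def plan_steps(pdf=None, document_id=None):
    """Return the ordered (method, path) calls for the given options."""
    steps = [("GET", "/health")]
    if pdf is None and document_id is None:
        return steps  # health check only
    if document_id is None:
        # Upload mode: the real ID is only known after the upload responds.
        steps.append(("POST", "/api/summary/upload"))
        document_id = "{id}"
    steps += [
        ("GET", f"/api/summary/sections/{document_id}"),
        ("GET", f"/api/summary/generate/{document_id}"),
        ("POST", "/api/summary/section"),
        ("POST", "/api/translation/translate"),
        ("POST", "/api/translation/translate-section"),
        ("POST", "/api/chat/message"),
        ("GET", f"/api/highlights/{document_id}"),
    ]
    return steps
```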

Manual Verification Points

Check Swagger UI at http://localhost:8000/docs
After upload, verify that the /api/summary/sections/{document_id} response includes sections/headings
Verify that translation, chat, and highlights calls return no errors

Troubleshooting

API Key Error: Verify that the correct UPSTAGE_API_KEY is set in .env
Upload Failure: Check that the file has a .pdf extension and is under 50MB
Parsing Delay: Large documents are parsed asynchronously and may take some time
CORS Issues: Check the CORS_ORIGINS setting in config.py and restart the server
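The upload checks can be run locally before sending a file. The following is a minimal sketch; the function name, the messages, and the interpretation of 50MB as 50 * 1024 * 1024 bytes (inclusive) are assumptions, and the server may apply different or additional rules.

```python
from pathlib import PurePath

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumed meaning of the 50MB limit


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if PurePath(filename).suffix.lower() != ".pdf":
        problems.append("file extension is not .pdf")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 50 MB")
    return problems
```

For a file on disk, `check_upload(path, os.path.getsize(path))` gives a quick answer before the upload request is made.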
