PDF Tools Custom Node - Installation Guide

Overview

This ComfyUI custom node package provides tools for:

PDF Extraction & Processing - Extract images and text from PDFs
Media Downloading - Download images/videos from Instagram, Reddit, Twitter, YouTube, etc.
AI-Powered Image Analysis - Florence2 vision models for rectangle detection
Layout Analysis - Detect document layouts and structures
Image Enhancement - Modern image enhancement for better quality

What's Included

1. Gallery-dl Downloader Node

Download media from 100+ websites including:

Instagram (posts, stories, reels)
Reddit (posts, galleries)
Twitter/X (images, videos)
Imgur, DeviantArt, Artstation, and more

Features:

Browser cookie authentication
File organization by type (images/videos/audio)
Download archive to avoid duplicates
Metadata extraction
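The node's internals are not shown in this guide, but the features listed here map onto standard gallery-dl command-line options. The following is a minimal sketch, assuming the node shells out to gallery-dl (an assumption, not confirmed by this guide). `build_gallery_dl_command`, the example URL, and the file names are hypothetical.

```python
import sys


def build_gallery_dl_command(url, output_dir, archive_file=None,
                             cookie_file=None, browser=None):
    """Build a gallery-dl command line (hypothetical helper, not the node's code)."""
    # Run gallery-dl through the current interpreter so the embedded Python is used.
    cmd = [sys.executable, "-m", "gallery_dl", "--destination", output_dir]
    if archive_file:
        # Record finished downloads so repeated runs skip them.
        cmd += ["--download-archive", archive_file]
    if cookie_file:
        # Netscape-format cookie file exported from a browser.
        cmd += ["--cookies", cookie_file]
    elif browser:
        # Read cookies directly from a browser profile, e.g. "firefox".
        cmd += ["--cookies-from-browser", browser]
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    # Placeholder URL and paths, for illustration only.
    print(build_gallery_dl_command(
        "https://example.com/gallery", "./test-output",
        archive_file="./test-output/archive.sqlite3", browser="firefox"))
```

The returned list can be passed to `subprocess.run` as-is; building it as a list avoids shell quoting problems with URLs and Windows paths.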

2. Yt-dlp Downloader Node

Download videos and audio from:

YouTube (videos, playlists, channels)
TikTok, Twitch, Instagram videos
1000+ video platforms

Features:

Format selection (quality presets)
Audio extraction (MP3, FLAC, etc.)
Subtitle download and embedding
Playlist support
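These features correspond to options of yt-dlp's Python API. The sketch below builds an options dictionary for `yt_dlp.YoutubeDL`; it is an illustration under the assumption that the node uses yt-dlp in a similar way, and `build_ytdlp_options` is a hypothetical helper rather than part of this package.

```python
def build_ytdlp_options(output_dir, audio_only=False, audio_format="mp3",
                        subtitles=False, archive_file=None):
    """Return an options dict for yt_dlp.YoutubeDL (hypothetical helper)."""
    opts = {
        # Audio-only runs pick the best audio stream; otherwise best video+audio,
        # falling back to the best single file.
        "format": "bestaudio/best" if audio_only else "bestvideo+bestaudio/best",
        # Output template: <output_dir>/<title>.<ext>
        "outtmpl": f"{output_dir}/%(title)s.%(ext)s",
        "postprocessors": [],
    }
    if audio_only:
        # Converting to MP3/FLAC requires FFmpeg to be available.
        opts["postprocessors"].append(
            {"key": "FFmpegExtractAudio", "preferredcodec": audio_format})
    if subtitles:
        opts["writesubtitles"] = True
        if not audio_only:
            # Embedding subtitles only makes sense for video files.
            opts["postprocessors"].append({"key": "FFmpegEmbedSubtitle"})
    if archive_file:
        # IDs of finished downloads are stored here to avoid duplicates.
        opts["download_archive"] = archive_file
    return opts


if __name__ == "__main__":
    print(build_ytdlp_options("./yt-output", audio_only=True, audio_format="flac"))
```

With yt-dlp installed, passing the result to `yt_dlp.YoutubeDL(opts).download([url])` would start the download.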

3. PDF Extractor Nodes

Multiple versions for extracting content from PDFs:

Extract images with quality assessment
OCR text recognition
Spread detection (book scanning)
Metadata preservation
Multiple output formats

4. Florence2 Rectangle Detector

AI-powered image analysis using Microsoft's Florence2 model:

Detect rectangular regions in images
Caption generation
Object detection
Visual question answering

5. Layout Parser Nodes

Document layout analysis:

Detect text blocks, figures, tables
Enhanced OCR with multiple engines
Computer vision-based layout detection

Installation Steps

Step 1: Navigate to Your Python Environment

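Open a terminal in your ComfyUI portable folder. The path below is an example; substitute your own install location:

cd A:\Comfy25\ComfyUI_windows_portable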

Step 2: Install Core Requirements

Use the embedded Python to install packages:

.\python_embeded\python.exe -m pip install -r custom_nodes\PDF_tools\requirements.txt

Step 3: Install External Tools (Optional but Recommended)

3a. Install gallery-dl (for web downloads)

.\python_embeded\python.exe -m pip install gallery-dl

3b. Install yt-dlp (for video downloads)

.\python_embeded\python.exe -m pip install yt-dlp

3c. Install FFmpeg (Required for yt-dlp audio extraction)

Download FFmpeg from: https://www.gyan.dev/ffmpeg/builds/
Extract to C:\ffmpeg\ (or any location)
Add C:\ffmpeg\bin to your system PATH
Verify: ffmpeg -version
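The PATH check can also be done from Python, which is useful because it shows what the embedded interpreter sees rather than what your interactive shell sees. This is a small sketch using only the standard library; `find_tool` and `tool_version_line` are hypothetical helper names.

```python
import shutil
import subprocess


def find_tool(name):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


def tool_version_line(name):
    """Return the first line of `<name> -version`, or None if unavailable."""
    path = find_tool(name)
    if path is None:
        return None
    result = subprocess.run([path, "-version"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    path = find_tool("ffmpeg")
    if path is None:
        print("ffmpeg not found on PATH")
    else:
        print(path)
        print(tool_version_line("ffmpeg") or "(no version output)")
```

Run it with `.\python_embeded\python.exe` so the result reflects the environment ComfyUI will use.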

Step 4: Verify Installation

Test that packages are installed:

# Test gallery-dl
.\python_embeded\python.exe -m gallery_dl --version

# Test yt-dlp
.\python_embeded\python.exe -m yt_dlp --version

# Test PyMuPDF
.\python_embeded\python.exe -c "import fitz; print(f'PyMuPDF {fitz.__version__}')"

# Test transformers
.\python_embeded\python.exe -c "import transformers; print(f'Transformers {transformers.__version__}')"
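The four checks can be combined into one script. The sketch below looks up each module with `importlib.util.find_spec`, which reports whether a module is installed without importing it (importing transformers can take several seconds). The module list mirrors the commands in this step; `check_modules` is a hypothetical helper.

```python
import importlib.util

# Import names (not pip names) of the packages checked in this step.
MODULES = ["gallery_dl", "yt_dlp", "fitz", "transformers"]


def check_modules(names):
    """Map each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(MODULES).items():
        print(f"{name:15} {'OK' if found else 'MISSING'}")
```

Save it to a file and run it with `.\python_embeded\python.exe` so the embedded environment is the one being inspected.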

Step 5: Configure Authentication (Optional)

For Instagram/Reddit downloads:

Export cookies from your browser:
- Use browser extension: "Get cookies.txt LOCALLY" (Chrome/Firefox)
- Save the export as configs/instagram_cookies.json (Netscape cookies.txt content is accepted despite the .json name)
Or use browser cookies directly:
- Set use_browser_cookies: True in the node
- Chrome requires admin privileges on Windows
- Firefox works without admin
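Before pointing the node at an exported cookie file, it can help to confirm that the file parses and actually contains cookies for the target site. The sketch below uses the standard library's `MozillaCookieJar`, which reads the Netscape format; `count_cookies` is a hypothetical helper, and the check is independent of how the node itself loads cookies.

```python
from http.cookiejar import MozillaCookieJar


def count_cookies(cookie_path, domain_suffix):
    """Load a Netscape-format cookie file and count cookies for one site."""
    jar = MozillaCookieJar()
    # Keep session and expired cookies so the count reflects the file's contents.
    jar.load(cookie_path, ignore_discard=True, ignore_expires=True)
    return sum(1 for cookie in jar if cookie.domain.endswith(domain_suffix))


if __name__ == "__main__":
    path = "configs/instagram_cookies.json"  # path used in this guide
    try:
        print(count_cookies(path, "instagram.com"), "Instagram cookies found")
    except OSError as exc:
        # Covers a missing file and http.cookiejar.LoadError (a subclass of OSError).
        print(f"Could not read {path}: {exc}")
```

A count of zero usually means the export was made while logged out or from a different site.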

For Reddit API (if using config file):

See Docs/reddit_app_creation_guide.py for setup instructions
Note: Reddit API requires OAuth and may have rate limits

Step 6: Start ComfyUI

# Start ComfyUI normally
.\run_nvidia_gpu.bat

# Or with admin privileges (for Chrome cookie access)
# Right-click run_nvidia_gpu.bat → "Run as administrator"

Quick Test

Test Gallery-dl Node:

Add "Gallery-dl Downloader" node to workflow
Set URL: https://www.instagram.com/janaioannaa/ (or any public profile)
Set output_dir: ./test-output
Run workflow
Check test-output/instagram/janaioannaa/images/ for downloaded files
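To check the result without browsing the folder by hand, a short script can count what was downloaded. This is a sketch with a hypothetical helper name, `count_by_extension`; the directory is the one from the step above.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(root):
    """Count files under `root` (recursively), grouped by lowercase extension."""
    root = Path(root)
    if not root.is_dir():
        return Counter()
    return Counter(p.suffix.lower() for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    counts = count_by_extension("test-output/instagram/janaioannaa/images")
    print(dict(counts) if counts else "no files found")
```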

Test Yt-dlp Node:

Add "Yt-dlp Downloader" node to workflow
Set URL: https://www.youtube.com/watch?v=dQw4w9WgXcQ (example)
Set output_dir: ./yt-output
Run workflow
Check output directory for downloaded video

Test PDF Extractor:

Add "PDF Extractor v08" node to workflow
Load a PDF file
Set output directory
Run to extract pages as images
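For comparison, rendering PDF pages to images directly with PyMuPDF looks roughly like the sketch below. It is not the node's implementation: `page_image_path` and `extract_pages` are hypothetical names, the file naming scheme is an assumption, and the `dpi` argument of `get_pixmap` requires a reasonably recent PyMuPDF release.

```python
from pathlib import Path


def page_image_path(out_dir, pdf_path, page_number):
    """Build an output name like <out_dir>/<pdf stem>_p0001.png (1-based page)."""
    return Path(out_dir) / f"{Path(pdf_path).stem}_p{page_number:04d}.png"


def extract_pages(pdf_path, out_dir, dpi=150):
    """Render every page of a PDF to a PNG file and return the written paths."""
    import fitz  # PyMuPDF; imported lazily so the helper above works without it

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    written = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            # page.number is 0-based; file names use 1-based page numbers.
            target = page_image_path(out_dir, pdf_path, page.number + 1)
            page.get_pixmap(dpi=dpi).save(str(target))
            written.append(target)
    return written
```

Calling `extract_pages("book.pdf", "./pdf-output")` with PyMuPDF installed would write one PNG per page; both file names here are placeholders.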

Common Issues & Solutions

Issue: "gallery-dl: command not found"

Solution: Install with pip: .\python_embeded\python.exe -m pip install gallery-dl

Issue: "yt-dlp: command not found"

Solution: Install with pip: .\python_embeded\python.exe -m pip install yt-dlp

Issue: "FFmpeg not found" (yt-dlp)

Solution:

Download FFmpeg: https://www.gyan.dev/ffmpeg/builds/
Extract and add to PATH
Or place ffmpeg.exe in ComfyUI root directory

Issue: "PyMuPDF not found"

Solution: .\python_embeded\python.exe -m pip install PyMuPDF

Issue: Chrome cookies not accessible

Solutions:

Run ComfyUI as administrator (Windows security restriction)
Or use Firefox instead (doesn't require admin)
Or export cookies to file and use cookie_file parameter

Issue: Instagram/Reddit downloads fail

Solutions:

Export cookies from logged-in browser session
Place in configs/instagram_cookies.json
Set cookie_file parameter in node
Make sure you're logged into the site in your browser

Issue: "CUDA out of memory" (Florence2/AI models)

Solutions:

Close other GPU applications
Reduce batch size in node settings
Use smaller model variants
Enable model offloading in ComfyUI settings

Issue: Transformers version conflicts

Solution:

.\python_embeded\python.exe -m pip install "transformers>=4.35.0" --upgrade
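To see whether an upgrade is needed at all, the installed version can be compared against the 4.35.0 minimum. The sketch below uses only the standard library and a deliberately simple comparison: it looks at the leading numeric parts and ignores pre-release tags, so it is a rough check rather than a full version parser. `version_tuple` and `meets_minimum` are hypothetical helpers.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Turn '4.35.0' or '4.36.0.dev0' into a tuple of its leading numeric parts."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break  # stop at the first non-numeric piece, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two versions numerically, padding the shorter one with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
    else:
        print(installed, "OK" if meets_minimum(installed, "4.35.0") else "too old")
```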

Package Size Warnings

Some packages are large and optional:

Surya OCR: ~1GB models (advanced OCR)
SAM2: ~1-2GB models (segmentation)
Florence2 models: ~500MB-2GB (vision models, auto-downloaded)
PaddleOCR: ~500MB models (Chinese/English OCR)
EasyOCR: ~1GB models (multi-language OCR)

These are commented out in requirements.txt - only install if needed.

Minimal Installation

If you only want specific features:

Just Gallery-dl (web downloads):

.\python_embeded\python.exe -m pip install gallery-dl browser-cookie3 requests

Just Yt-dlp (video downloads):

.\python_embeded\python.exe -m pip install yt-dlp

Just PDF extraction:

.\python_embeded\python.exe -m pip install PyMuPDF Pillow numpy

Just Florence2 (AI vision):

.\python_embeded\python.exe -m pip install transformers safetensors accelerate timm
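When several of these feature sets are wanted at once, the package lists can be merged into a single pip call. The sketch below copies the package names from the commands in this section; the feature keys and the `pip_command` helper are hypothetical labels chosen for illustration.

```python
import sys

# Package sets copied from this section; the dictionary keys are hypothetical labels.
FEATURE_PACKAGES = {
    "gallery-dl": ["gallery-dl", "browser-cookie3", "requests"],
    "yt-dlp": ["yt-dlp"],
    "pdf": ["PyMuPDF", "Pillow", "numpy"],
    "florence2": ["transformers", "safetensors", "accelerate", "timm"],
}


def pip_command(features, python=sys.executable):
    """Build one pip install command covering all requested features."""
    packages = []
    for feature in features:
        for package in FEATURE_PACKAGES[feature]:
            if package not in packages:  # keep order, drop duplicates
                packages.append(package)
    return [python, "-m", "pip", "install", *packages]


if __name__ == "__main__":
    print(" ".join(pip_command(["yt-dlp", "pdf"])))
```

Running the script with the embedded interpreter prints a command that targets that same interpreter, since `sys.executable` is used as the default.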

Next Steps

Review the documentation:
- Docs/gallery_dl_node_complete_guide.md - Gallery-dl setup
- Docs/yt_dlp_node_complete_guide.md - Yt-dlp setup
- Docs/SETUP_COMPLETE.md - Authentication setup
Test with example workflows:
- Start with simple single-URL downloads
- Test authentication with your accounts
- Try batch downloads from files
Configure for your needs:
- Set up cookie files for authenticated sites
- Create custom config files for specific sites
- Organize download directories

Getting Help

Check the Docs/ folder for detailed guides
Review test scripts in Docs/test_*.py for examples
Check ComfyUI console for debug output (nodes provide detailed status)

System Requirements

OS: Windows 10/11 (primary target); Linux is expected to work
GPU: NVIDIA GPU with CUDA (for AI models, optional for downloaders)
RAM: 8GB minimum, 16GB+ recommended for AI models
Storage: 5-10GB for packages + models
Python: 3.10+ (comes with ComfyUI portable)
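Two of these requirements, the Python version and free disk space, are easy to check from a script. The following is a standard-library sketch with hypothetical helper names; it does not check GPU or RAM.

```python
import shutil
import sys


def python_ok(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def free_gb(path="."):
    """Return free disk space, in GiB, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


if __name__ == "__main__":
    print("Python 3.10+:", "yes" if python_ok() else "no")
    print(f"Free disk space: {free_gb():.1f} GB (this guide suggests 5-10 GB)")
```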

What's Working

✅ Gallery-dl downloads (Instagram, Reddit, Twitter, etc.)
✅ Yt-dlp downloads (YouTube, TikTok, etc.)
✅ PDF extraction with PyMuPDF
✅ Florence2 rectangle detection
✅ Browser cookie authentication
✅ File organization by type
✅ Download archives (no duplicates)
✅ Metadata extraction
✅ Debug output and error handling

Known Limitations

⚠️ Reddit API may hang with old credentials (use browser cookies instead)
⚠️ Chrome cookies require admin privileges on Windows
⚠️ Some sites require valid login cookies
⚠️ Large models (Florence2, SAM2) require GPU memory
⚠️ Transformers versions may need updates

Support & Updates

This is a custom node package. For issues:

Check the documentation in Docs/
Review test scripts for working examples
Check ComfyUI console for detailed error messages
Ensure all requirements are installed correctly

Happy downloading and processing! 🚀
