PDF Summarizer with DOI Detection and CSV Export

This code extracts text from research PDFs, detects DOI links, summarizes the content using a transformer-based NLP model (distilBART), and exports the results into a structured CSV file.
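The extraction and summarization steps described above could look roughly like the sketch below. It is an illustration, not the contents of summarization.py: the function names, the 400-word chunk size, the length limits, and the `sshleifer/distilbart-cnn-12-6` checkpoint are all assumptions (the README only says "distilBART"). It also assumes an installed transformers version that still provides the `summarization` pipeline. The heavy libraries are imported inside the functions so the chunking helper can be used without them.

```python
def chunk_words(text, max_words=400):
    """Split text into chunks of at most max_words whitespace-separated words.

    DistilBART accepts a limited input length, so long papers are
    summarized piece by piece (the 400-word size is an assumption).
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def extract_text(pdf_path):
    """Return the text of every page in a PDF, using PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so the helper above works without it

    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)


def summarize(text, model_name="sshleifer/distilbart-cnn-12-6"):
    """Summarize text chunk by chunk and join the partial summaries.

    model_name is an assumed DistilBART checkpoint, not one named by the README.
    """
    from transformers import pipeline  # lazy import: loading the model is slow

    summarizer = pipeline("summarization", model=model_name)
    parts = []
    for chunk in chunk_words(text):
        result = summarizer(
            chunk, max_length=130, min_length=30, do_sample=False, truncation=True
        )
        parts.append(result[0]["summary_text"])
    return " ".join(parts)
```

Summarizing chunk by chunk keeps each input within the model's limit, at the cost of a summary that is a concatenation of per-chunk summaries rather than one pass over the whole paper.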

Features

Automatically finds all PDF files in a specified directory
Extracts full text from PDFs using PyMuPDF
Detects DOIs (Digital Object Identifiers) using a regular expression
Cleans out references/bibliography sections
Summarizes content using a pre-trained Hugging Face transformer model
Exports summary and DOI links into a CSV file
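As a rough illustration of the text-handling features listed above (file discovery, DOI detection, reference removal, CSV export), here is a minimal standard-library sketch. The regular expressions, the function names, and the CSV column names (`file`, `doi`, `summary`) are assumptions made for this example and may differ from what summarization.py actually does.

```python
import csv
import re
from pathlib import Path

# Assumed DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")

# Assumed heading pattern: a line containing only "References" or "Bibliography".
REFERENCES_HEADING = re.compile(
    r"^[ \t]*(references|bibliography)[ \t]*$", re.IGNORECASE | re.MULTILINE
)


def find_pdfs(directory):
    """Return the PDF files directly inside directory, sorted by name."""
    return sorted(Path(directory).glob("*.pdf"))


def find_doi(text):
    """Return the first DOI in text as a doi.org link, or "" if none is found."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return ""
    doi = match.group(0).rstrip(".,;)")  # drop trailing sentence punctuation
    return f"https://doi.org/{doi}"


def strip_references(text):
    """Cut the text at the last References/Bibliography heading, if any."""
    matches = list(REFERENCES_HEADING.finditer(text))
    if not matches:
        return text
    return text[: matches[-1].start()].rstrip()


def write_csv(rows, csv_path):
    """Write dicts with keys file, doi, summary to a CSV file."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["file", "doi", "summary"])
        writer.writeheader()
        writer.writerows(rows)
```

Cutting at the last matching heading, rather than the first, is a deliberate choice in this sketch: it avoids truncating a paper whose body happens to contain an earlier line reading "References".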

Files

FSCI2025_SUMMARIZATION.pptx
README.md
requirements.txt
summarization.py
visualize_csv.py

Shabnam2212/summarization
