PDF Scraper

text.py image.py table.py

Description

Extracts text, images, and tables from a PDF file.

Parameters in each script may be modified to filter the extracted data.
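As a rough illustration of filtering extracted text, the sketch below keeps only the lines that match a regular expression. It is not taken from text.py: the function names `filter_lines` and `extract_matching_text` and their parameters are hypothetical. The only library call used is `extract_text` from `pdfminer.high_level`.

```python
import re


def filter_lines(text, pattern, flags=re.IGNORECASE):
    """Return the non-empty lines of `text` that match `pattern`.

    Hypothetical helper for illustration; not part of the repository.
    """
    regex = re.compile(pattern, flags)
    return [
        line.strip()
        for line in text.splitlines()
        if line.strip() and regex.search(line)
    ]


def extract_matching_text(pdf_path, pattern, page_numbers=None):
    """Extract text from a PDF and keep only the lines matching `pattern`.

    `page_numbers` is a list of zero-based page indices, or None for all pages.
    """
    # Imported lazily so filter_lines can be used without pdfminer.six installed.
    from pdfminer.high_level import extract_text

    text = extract_text(pdf_path, page_numbers=page_numbers)
    return filter_lines(text, pattern)
```

For example, `extract_matching_text("report.pdf", r"invoice \d+", page_numbers=[0])` would return the matching lines from the first page, where `report.pdf` is a placeholder file name.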

Libraries used

re (regular expressions, Python standard library)

pdfminer.six

Pillow

PyMuPDF

tabula-py
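The sketch below shows one way the libraries listed above could be used for images and tables. It is an assumption about usage, not the contents of image.py or table.py: the function names, the `min_width`/`min_height` filter, and the output file naming are all hypothetical. Pillow is left out to keep the sketch short, and tabula-py additionally needs a Java runtime.

```python
import os


def image_name(page_index, image_index, ext):
    """Build an output file name such as 'page1_img3.png' (hypothetical scheme)."""
    return f"page{page_index + 1}_img{image_index + 1}.{ext}"


def extract_images(pdf_path, out_dir, min_width=0, min_height=0):
    """Save embedded images from a PDF, skipping those below a size threshold."""
    import fitz  # PyMuPDF; imported lazily so this module loads without it

    os.makedirs(out_dir, exist_ok=True)
    saved = []
    with fitz.open(pdf_path) as doc:
        for page_index, page in enumerate(doc):
            for image_index, img in enumerate(page.get_images(full=True)):
                xref = img[0]  # first item of each entry is the image xref
                info = doc.extract_image(xref)
                # Assumed filter: drop small images such as icons or rules.
                if info["width"] < min_width or info["height"] < min_height:
                    continue
                path = os.path.join(
                    out_dir, image_name(page_index, image_index, info["ext"])
                )
                with open(path, "wb") as fh:
                    fh.write(info["image"])
                saved.append(path)
    return saved


def extract_tables(pdf_path, pages="all"):
    """Return the tables found on the given pages as a list of DataFrames."""
    import tabula  # tabula-py; requires Java to be installed

    return tabula.read_pdf(pdf_path, pages=pages, multiple_tables=True)
```

Calling `extract_images("report.pdf", "images", min_width=100, min_height=100)` would write only the larger images to the `images` directory, and `extract_tables("report.pdf", pages="1-3")` would restrict table detection to the first three pages. Both file names are placeholders.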