pip install pyzipper textract python-magic PyPDF2 python-docx chardet pytesseract
git clone https://github.com/Michael-Sebero/Document-Tools
python3 /home/$USER/Document-Tools/document-tools.py
This compares two documents and writes their similarities and differences to an output file.
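As a rough illustration of this kind of comparison (not the code Document-Tools itself uses), the sketch below splits two documents into shared and differing lines with Python's standard `difflib`. The function name `compare_lines` and the line-by-line granularity are assumptions made for the example.

```python
import difflib


def compare_lines(a_lines, b_lines):
    """Return (common, only_in_a, only_in_b) for two lists of lines.

    Hypothetical helper: compares line by line, which may differ from
    how the actual tool defines "similarities" and "differences".
    """
    common, only_a, only_b = [], [], []
    for entry in difflib.ndiff(a_lines, b_lines):
        tag, text = entry[:2], entry[2:]
        if tag == "  ":
            common.append(text)      # line present in both documents
        elif tag == "- ":
            only_a.append(text)      # line only in the first document
        elif tag == "+ ":
            only_b.append(text)      # line only in the second document
        # "? " entries are ndiff's intraline hints and are ignored here
    return common, only_a, only_b
```

The three lists could then be written to an output file under separate headings.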
This detects duplicate lines in a file, removes them and then saves the changes to an output file.
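One common way to do this is sketched below. It assumes that a "duplicate" means an exact repeat of an earlier line and that the first occurrence is the one kept; the real tool may apply different rules. `remove_duplicate_lines` is a hypothetical name.

```python
def remove_duplicate_lines(lines):
    """Return the lines with later exact repeats dropped, order preserved.

    Assumption: comparison is exact (case- and whitespace-sensitive).
    """
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            unique.append(line)
    return unique
```

Reading the input with `splitlines()` before calling the function avoids treating a final line without a trailing newline as different from its earlier copies.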
This extracts text from an image or a directory full of images.
This recursively searches a given directory for keywords in documents and reports the files in which they appear.
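A minimal sketch of a recursive keyword search follows. It only handles files that can be read as plain text, whereas the dependencies listed above suggest the real tool also reads formats such as PDF and DOCX. The name `find_keyword` and the case-sensitive substring match are assumptions.

```python
import os


def find_keyword(root, keyword):
    """Yield (path, line_number, line) for each text line containing keyword.

    Hypothetical helper: treats every file as UTF-8 text and ignores
    bytes that cannot be decoded.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if keyword in line:
                            yield path, number, line.rstrip("\n")
            except OSError:
                continue  # unreadable file: skip it and keep searching
```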
This searches the .zip and .tar archives in a given directory for keywords in documents and reports where they appear inside each archive.
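The idea of searching inside an archive without unpacking it to disk can be sketched with the standard `zipfile` and `tarfile` modules. This is an illustration under assumptions, not the tool's implementation: `search_archive` is a hypothetical name, members are decoded as plain UTF-8 text, and password-protected archives are not handled.

```python
import tarfile
import zipfile


def search_archive(archive_path, keyword):
    """Return names of archive members whose decoded text contains keyword."""
    hits = []
    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as archive:
            for name in archive.namelist():
                if name.endswith("/"):
                    continue  # directory entry, nothing to read
                text = archive.read(name).decode("utf-8", errors="ignore")
                if keyword in text:
                    hits.append(name)
    elif tarfile.is_tarfile(archive_path):
        with tarfile.open(archive_path) as archive:
            for member in archive.getmembers():
                if not member.isfile():
                    continue  # skip directories, links, and devices
                handle = archive.extractfile(member)
                text = handle.read().decode("utf-8", errors="ignore")
                if keyword in text:
                    hits.append(member.name)
    return hits
```

Calling it on every `.zip` or `.tar` path found in a directory gives a per-archive list of matching members.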
This looks for keywords in a file and writes every line that contains one to an output file.
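A short sketch of this kind of line extraction is given below. The function name `extract_lines`, the case-sensitive substring match, and the UTF-8 encoding are assumptions for the example rather than documented behavior of the tool.

```python
def extract_lines(source_path, output_path, keywords):
    """Copy lines containing any keyword to output_path; return the count."""
    matched = 0
    with open(source_path, encoding="utf-8", errors="ignore") as source, \
            open(output_path, "w", encoding="utf-8") as output:
        for line in source:
            if any(keyword in line for keyword in keywords):
                output.write(line)
                matched += 1
    return matched
```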
This replaces keywords in a file.
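For illustration, keyword replacement can be sketched as a series of plain string substitutions. `replace_keywords` is a hypothetical helper; whether the real tool matches whole words, ignores case, or edits the file in place is not stated here, so the sketch assumes exact, case-sensitive substring replacement on text already read into memory.

```python
def replace_keywords(text, replacements):
    """Apply each old -> new substitution to text, in dictionary order.

    Substitutions run one after another, so a later rule can match
    text produced by an earlier one.
    """
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text
```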
