Python Handwritten OCR Document Generator

Extract structured data from handwritten images using Google Gemini and export it automatically to PDF, Excel and Word documents.

This project is a production-style example of how to digitise handwritten records for:

maintenance and inspection checklists
field data collection
warehouse and inventory logs
healthcare and clinical notes
any paper-based workflow that needs to go digital

How it works

Most OCR tutorials work on clean printed text. This example works on a real photo taken with a mobile phone — handwritten, imperfect, and rotated.

The pipeline handles orientation automatically, extracts the table structure, and generates three ready-to-use output formats.

Step 1 — Input: a real handwritten photo

Step 2 — OCR: Google Gemini reads the table

The image is sent to the Gemini API, which returns the table as a structured list — headers and rows, ready to process.
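A minimal sketch of how such a reply could be turned into a Python table. It assumes the model returns the JSON as text and may wrap it in a Markdown code fence. The helper name `parse_table_reply` is hypothetical and is not taken from program.py.

```python
import json
import re

# Built at runtime so this example never contains a literal code fence.
FENCE = "`" * 3


def parse_table_reply(reply_text):
    """Turn a reply containing a JSON array of arrays into a list of rows.

    Hypothetical helper: the real program.py may parse the reply differently.
    """
    text = reply_text.strip()
    # Assumption: the reply may be wrapped in a Markdown fence with an
    # optional language tag, which has to be removed before parsing.
    fenced = re.match(r"^`{3}[a-zA-Z]*\s*(.*?)\s*`{3}$", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)

    table = json.loads(text)
    if not isinstance(table, list) or not all(isinstance(r, list) for r in table):
        raise ValueError("expected a JSON array of arrays")
    # Keep every cell as a string so the export steps treat values uniformly.
    return [[str(cell) for cell in row] for row in table]


reply = FENCE + 'json\n[["Item", "Qty"], ["Bolt M8", "12"], ["Washer", 30]]\n' + FENCE
print(parse_table_reply(reply))
```

Validating the shape before exporting means a malformed reply fails early with a clear error instead of producing a broken document.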

Step 3 — Output: PDF, Excel and Word generated automatically

PDF: printable report with title and footnote

Excel: formatted spreadsheet with frozen header

Word: editable document ready to share

What gets extracted

Field	Description
Headers	First row of the handwritten table
Data rows	Each subsequent row, preserving original values

The structure is detected dynamically: there are no hardcoded column names and no templates, so the same pipeline applies to any handwritten table that Gemini can read.
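As an illustration of treating the first row as headers without knowing the columns in advance, the sketch below splits a table into headers and data rows. The function name `split_table` and the padding of ragged rows are assumptions made for this example, not documented behaviour of program.py.

```python
def split_table(table):
    """Split a list of rows into (headers, data_rows).

    Assumption: rows that are shorter than the widest row are padded with
    empty strings so every row has the same number of columns.
    """
    if not table:
        return [], []
    width = max(len(row) for row in table)
    padded = [list(row) + [""] * (width - len(row)) for row in table]
    return padded[0], padded[1:]


# Hypothetical sample data; the last row is missing its final cell.
table = [
    ["Date", "Machine", "Status"],
    ["12/03", "Press 2", "OK"],
    ["13/03", "Lathe"],
]
headers, rows = split_table(table)
print(headers)
print(rows)
```

Because the column count is taken from the data itself, the same code handles a three-column checklist and a ten-column inventory log.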

Quick Start

git clone https://github.com/hasff/python-handwritten-ocr-document-generator.git
cd python-handwritten-ocr-document-generator
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -r requirements.txt

Get a free Gemini API key at Google AI Studio — no billing required.

Create a .env file in the project root:

GEMINI_API_KEY=your_key_here
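For readers curious how a key in .env can reach the script, here is a standard-library sketch. Many projects use the python-dotenv package for this; whether program.py does is not stated here, and `load_env_file` is a hypothetical helper.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Copy KEY=value lines from a .env file into os.environ.

    Hypothetical helper: values already set in the environment are kept.
    """
    env_path = Path(path)
    if not env_path.exists():
        return
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip().strip('"').strip("'")
        os.environ.setdefault(key.strip(), value)


load_env_file()
if not os.environ.get("GEMINI_API_KEY"):
    print("GEMINI_API_KEY is not set; check the .env file in the project root")
```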

Place your image in input/handwrite.jpeg, then run:

python program.py

The script will generate:

output/report.pdf
output/report.xlsx
output/report.docx

Note: By default DEBUG = True in program.py, which uses mock data without consuming API quota. Set DEBUG = False to run the full pipeline with a real image.
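A debug switch like this can be sketched as follows. The mock table, the function name `get_table`, and the `extract` parameter are all hypothetical; program.py may organise this differently.

```python
DEBUG = True  # mirrors the flag described in the note above

# Hypothetical mock table; the real mock data in program.py may differ.
MOCK_TABLE = [
    ["Item", "Quantity", "Checked"],
    ["Filter", "2", "Yes"],
    ["Gasket", "5", "No"],
]


def get_table(image_path, extract=None, debug=DEBUG):
    """Return mock rows in debug mode, otherwise call the supplied extractor."""
    if debug:
        # No API call is made here, so no quota is consumed.
        return MOCK_TABLE
    if extract is None:
        raise ValueError("an extract function is required when debug is False")
    return extract(image_path)


print(get_table("input/handwrite.jpeg"))
```

Keeping the mock table in the same shape as a real result lets the three exporters be tested without an API key.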

Technical approach

Rather than using traditional OCR libraries, this project uses Google Gemini's vision capabilities to understand handwritten content:

EXIF orientation is detected and corrected automatically before processing
The image is sent to Gemini with a structured prompt requesting a JSON array of arrays
The response is parsed and passed to three independent export functions
Each format (PDF, Excel, Word) is generated with consistent styling — matching colors, alternating rows, and a shared visual theme
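One way to keep styling consistent across three independent exporters is to have them all read from a single theme. The sketch below illustrates that idea only: the colour values and the names `THEME`, `row_fill`, and `styled_rows` are assumptions, since the actual colours used by program.py are not listed here.

```python
# Hypothetical theme values shared by every exporter (hex RGB without '#').
THEME = {
    "header_fill": "1F4E79",
    "header_text": "FFFFFF",
    "row_fill_even": "FFFFFF",
    "row_fill_odd": "DDEBF7",
}


def row_fill(row_index, theme=THEME):
    """Return the background colour for a 0-based data row, alternating by parity."""
    return theme["row_fill_odd"] if row_index % 2 else theme["row_fill_even"]


def styled_rows(rows, theme=THEME):
    """Pair each data row with its fill colour so all exporters style it alike."""
    return [(row, row_fill(i, theme)) for i, row in enumerate(rows)]


for row, fill in styled_rows([["Filter", "2"], ["Gasket", "5"], ["Seal", "1"]]):
    print(fill, row)
```

Each exporter would then translate the same hex value into its own library's colour type, so a change to the theme shows up in all three documents.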

This approach handles messy handwriting, rotated images, and irregular layouts with no preprocessing beyond the automatic orientation fix and no template configuration.
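The orientation step can be reasoned about with a small lookup table. The sketch below maps the rotation-only values of the EXIF Orientation tag to the clockwise rotation that makes the image upright. It is an illustration, not the code in program.py; in practice Pillow's `ImageOps.exif_transpose` applies the full correction, including mirrored cases, though whether the project uses it is not stated here.

```python
# EXIF Orientation tag (0x0112) values that describe a pure rotation, mapped
# to the clockwise rotation in degrees that makes the image upright.
CLOCKWISE_FIX = {1: 0, 3: 180, 6: 90, 8: 270}


def clockwise_fix(orientation):
    """Return degrees to rotate clockwise; 0 for missing or unhandled values.

    Assumption: mirrored orientations (2, 4, 5, 7) are left untouched in
    this sketch and simply fall through to 0.
    """
    return CLOCKWISE_FIX.get(orientation, 0)


for value in (1, 3, 6, 8, None):
    print(value, clockwise_fix(value))
```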

Why not EasyOCR or Tesseract?

Traditional OCR libraries struggle with cursive handwriting. During development, EasyOCR was tested on the same image and returned fragments like "2", "J", "s" with confidence scores below 10%.

Gemini reads the same image correctly on the first attempt.

Need custom document automation?

I help companies automate document processing pipelines:

digitisation of handwritten forms and checklists
batch processing of images and scanned documents
export to Excel, PDF, Word, CSV or JSON
integration with databases, APIs and ERP systems
OCR for printed and handwritten content

📩 Contact: hugoferro.business(at)gmail.com

🌐 Courses and professional tools: https://hasff.github.io/site/
