cnjack/handwrite

Handwriting Removal Project

This project trains a U-Net model to remove handwritten text from images (e.g., exam papers).

Project Structure

  • data/train: Input images (with handwriting)
  • data/train_label: Target images (clean, without handwriting)
  • data/test: Test images to process
  • model.py: U-Net architecture definition
  • dataset.py: Data loading and augmentation
  • train.py: Training script
  • predict.py: Inference script
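
`dataset.py` presumably pairs each image in `data/train` with the clean label of the same filename in `data/train_label`. A minimal sketch of that pairing, assuming a same-filename convention (the helper name `paired_samples` is illustrative, not taken from the repo):

```python
from pathlib import Path

def paired_samples(img_dir: str, label_dir: str, pattern: str = "*.png"):
    """Pair each handwriting image with the clean label sharing its filename.

    Images without a matching label are skipped, so a partially labelled
    directory does not break iteration.
    """
    pairs = []
    for img in sorted(Path(img_dir).glob(pattern)):
        label = Path(label_dir) / img.name
        if label.exists():
            pairs.append((img, label))
    return pairs
```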

Prerequisites

This project uses uv for dependency management.

  1. Install uv (if not installed):

    curl -LsSf https://astral.sh/uv/install.sh | sh
  2. Create Virtual Environment and Install Dependencies:

    uv sync

    Or manually:

    uv venv
    source .venv/bin/activate
    uv pip install -r pyproject.toml

Training

To train the model, run:

uv run train.py --epochs 50 --batch_size 4

Arguments:

  • --train_img_dir: Path to training images (default: data/train)
  • --train_label_dir: Path to training labels (default: data/train_label)
  • --epochs: Number of training epochs (default: 200)
  • --batch_size: Batch size (default: 4)
  • --lr: Learning rate (default: 1e-4)
  • --img_size: Image size, must be divisible by 16 (default: 1024)
  • --checkpoint_dir: Directory to save models (default: checkpoints)
  • --log_dir: Directory for TensorBoard logs (default: runs)
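
The divisible-by-16 constraint on `--img_size` follows from the U-Net encoder: assuming four 2x downsampling stages (a common U-Net depth, not confirmed from `model.py`), the input must be divisible by 2^4 = 16 for the decoder to upsample back to the original resolution without rounding. A quick check one might run before training:

```python
def valid_img_size(size: int, down_steps: int = 4) -> bool:
    """True if `size` survives `down_steps` exact halvings,
    i.e. size is divisible by 2**down_steps (16 for a 4-stage U-Net)."""
    return size % (2 ** down_steps) == 0
```

For example, `valid_img_size(1024)` passes while `valid_img_size(1000)` does not (1000 leaves a remainder of 8 when divided by 16).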

Loss Function Options:

  • --loss: Loss function type (default: combined)
    • charbonnier: Single Charbonnier loss (robust to outliers)
    • mse: Mean Squared Error loss
    • l1: L1 / MAE loss
    • combined: Advanced multi-component loss (recommended)
  • --loss-preset: Preset weights for combined loss (default: balanced)
    • conservative: Stable training with lower weights (perc=0.05, ssim=0.3, edge=0.3)
    • balanced: Balanced configuration (perc=0.1, ssim=0.5, edge=0.5)
    • aggressive: Sharp results with higher weights (perc=0.2, ssim=0.8, edge=0.8)
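
The presets above can be read as weight tables for the combined loss. A sketch of how the final value might be assembled, with plain floats standing in for the real per-batch tensor losses (the exact wiring inside `train.py` is an assumption; the Charbonnier term is given weight 1, matching the preset listings that omit it):

```python
# Preset weights copied from the documented presets above.
PRESETS = {
    "conservative": {"perc": 0.05, "ssim": 0.3, "edge": 0.3},
    "balanced":     {"perc": 0.1,  "ssim": 0.5, "edge": 0.5},
    "aggressive":   {"perc": 0.2,  "ssim": 0.8, "edge": 0.8},
}

def combined_loss(charb: float, perc: float, ssim: float, edge: float,
                  preset: str = "balanced") -> float:
    """Weighted sum: Charbonnier (weight 1) plus weighted perceptual,
    SSIM, and edge terms, per the chosen preset."""
    w = PRESETS[preset]
    return charb + w["perc"] * perc + w["ssim"] * ssim + w["edge"] * edge
```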

Example with custom loss:

uv run train.py --epochs 100 --batch_size 8 --loss combined --loss-preset aggressive


Monitoring: Track training progress with TensorBoard:

uv run tensorboard --logdir runs

Inference (Prediction)

To remove handwriting from new images:

uv run predict.py --input_dir data/test --output_dir results --model_path checkpoints/best_model.pth

Arguments:

  • --input_dir: Directory containing images to process
  • --output_dir: Directory to save processed images
  • --model_path: Path to the trained model checkpoint

Model Details

  • Architecture: U-Net with Bilinear Upsampling, ResBlock, and CBAM Attention
  • Input: RGB Image (resized to 1024x1024 by default)
  • Output: RGB Image (Cleaned)
  • Loss Functions:
    • Combined Loss (default): Weighted sum of the four components below
      • Charbonnier Loss: Base pixel-wise reconstruction
      • Perceptual Loss (VGG16): Preserves high-level features and textures
      • SSIM Loss: Maintains structural similarity
      • Edge Loss (Sobel): Preserves sharp edges and details
    • Single Losses: Charbonnier, MSE, or L1 for simpler training
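
The Charbonnier term is a smooth variant of L1: sqrt(diff² + ε²), which behaves like |diff| for large errors but stays differentiable at zero. A minimal per-pixel sketch over flat value lists (ε = 1e-3 is a common default, not taken from the repo, and the real implementation operates on tensors rather than Python lists):

```python
import math

def charbonnier(pred, target, eps: float = 1e-3) -> float:
    """Mean Charbonnier loss over flat lists of pixel values:
    sqrt((p - t)^2 + eps^2), a smooth approximation of |p - t|."""
    return sum(math.sqrt((p - t) ** 2 + eps ** 2)
               for p, t in zip(pred, target)) / len(pred)
```

When the prediction is exact, the loss bottoms out at ε rather than 0, and for errors much larger than ε it tracks the absolute error closely.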
