YOLO Image Segmenter

A Python command-line tool for object detection, cropping, background removal, and classification using YOLO and the Segment Anything Model (SAM).

Features

  • Object Detection: Detect objects in images using YOLOv11
  • Smart Cropping: Automatically crop detected objects with intelligent padding based on confidence
  • Background Removal: Remove backgrounds using SAM segmentation
  • Mask Refinement: Clean segmentation masks and assign unallocated pixels
  • Classification: Optionally classify cropped objects
  • Batch Processing: Process entire directories of images

Requirements

  • Python 3.12 or higher
  • Dependencies listed in pyproject.toml

Installation

This package uses uv for dependency management. If you don't have uv installed, you can install it following the instructions at https://github.com/astral-sh/uv.

# Clone the repository
git clone https://github.com/trenchproject/trench_image_processor.git
cd trench_image_processor

# Install dependencies using uv and activate the virtual environment
uv sync
source .venv/bin/activate

# Run unit and integration tests
pytest
pytest -m "slow"

Usage

The program provides a command-line interface with various options:

uv run src/main.py -i INPUT_DIR -o OUTPUT_DIR -m YOLO_MODEL [options]

Basic Arguments

  • -i, --input_dir: Directory containing images to process
  • -o, --output_dir: Directory for saving cropped images
  • -m, --model: Path to YOLOv11 model file (.pt)
  • -c, --confidence: Confidence threshold (default: 0.8)
  • --iou: Intersection over union threshold (default: 0.2)
  • -r, --resolution: Output image width in pixels
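To make the --iou option concrete, here is a minimal sketch of intersection over union and of a greedy non-maximum suppression pass that uses it as a threshold. It assumes boxes are (x1, y1, x2, y2) tuples and that the threshold is applied during non-maximum suppression, which is how YOLO detectors commonly use it; the function names are hypothetical and are not taken from this repository.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.2):
    """Greedy non-maximum suppression over (box, confidence) pairs.

    Hypothetical helper: keeps the most confident box and drops any later box
    whose IoU with an already kept box exceeds iou_threshold.
    """
    kept = []
    for box, conf in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, conf))
    return kept
```

Under this reading, a lower --iou value suppresses overlapping detections more aggressively, while a higher value lets more overlapping boxes through.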

Padding Options

  • -bp, --base-padding: Base padding factor at highest confidence (default: 0.05)
  • -mp, --max-padding: Maximum padding factor at threshold confidence (default: 0.25)

Background Removal Options

  • --sam-model: Path to SAM segmentation model file (.pt)
  • --bg-color: Background replacement color in hex format (default: #00a000 - green)
  • --clean-masks: Clean segmentation mask by removing small holes and islands
  • --complete-masks: Assign all foreground pixels in source image to an output segment
  • --visualize-leftovers: Visualize leftover pixels before and after mask completion
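As an illustration of the --bg-color format, the sketch below converts a hex string such as #00a000 into an RGB tuple and uses it to fill every pixel outside a mask. The helper names and the nested-list image representation are assumptions chosen to keep the example dependency-free; the tool itself may represent images differently.

```python
def hex_to_rgb(color):
    """Convert a '#rrggbb' string to an (r, g, b) tuple of ints in 0..255."""
    value = color.strip().lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected a 6-digit hex color, got {color!r}")
    # int(..., 16) raises ValueError on non-hex characters.
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def replace_background(pixels, mask, color):
    """Return a copy of pixels where every position with a falsy mask value is set to color.

    pixels: rows of (r, g, b) tuples; mask: rows of 0/1 with the same shape.
    """
    return [
        [px if keep else color for px, keep in zip(pixel_row, mask_row)]
        for pixel_row, mask_row in zip(pixels, mask)
    ]
```

Libraries that store images in BGR channel order (OpenCV, for example) would need the tuple reversed before use.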

Classification Options

  • --cls-model: Path to YOLO classification model file (.pt)
  • --cls-confidence: Confidence threshold for classification (default: 0.75)
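For readers who want to script against these options, the following argparse sketch mirrors the flags and defaults documented above. It is a reconstruction from this README, not the project's actual parser: the help strings, the required/optional split, and the treatment of --clean-masks, --complete-masks, and --visualize-leftovers as boolean switches are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options documented in this README."""
    p = argparse.ArgumentParser(description="YOLO Image Segmenter (sketch)")
    p.add_argument("-i", "--input_dir", required=True, help="directory of input images")
    p.add_argument("-o", "--output_dir", required=True, help="directory for cropped images")
    p.add_argument("-m", "--model", required=True, help="path to YOLO detection model (.pt)")
    p.add_argument("-c", "--confidence", type=float, default=0.8)
    p.add_argument("--iou", type=float, default=0.2)
    p.add_argument("-r", "--resolution", type=int, default=None)
    # Padding options
    p.add_argument("-bp", "--base-padding", type=float, default=0.05)
    p.add_argument("-mp", "--max-padding", type=float, default=0.25)
    # Background removal options
    p.add_argument("--sam-model", default=None)
    p.add_argument("--bg-color", default="#00a000")
    p.add_argument("--clean-masks", action="store_true")
    p.add_argument("--complete-masks", action="store_true")
    p.add_argument("--visualize-leftovers", action="store_true")
    # Classification options
    p.add_argument("--cls-model", default=None)
    p.add_argument("--cls-confidence", type=float, default=0.75)
    return p


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

argparse converts hyphens in long option names to underscores, so --base-padding is read back as args.base_padding.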

Examples

Basic Object Detection and Cropping

uv run src/main.py -i ./images -o ./output -m yolov11n.pt

With Background Removal

uv run src/main.py -i ./images -o ./output -m yolov11n.pt --sam-model sam_l.pt --bg-color "#00a000"

With Classification

uv run src/main.py -i ./images -o ./output -m yolov11n.pt --cls-model yolov11-cls.pt

Complete Pipeline with Custom Settings

uv run src/main.py -i ./images -o ./output -m yolov11n.pt -c 0.7 --iou 0.3 \
  -bp 0.1 -mp 0.3 -r 800 \
  --sam-model sam_l.pt --bg-color "#0000FF" --clean-masks --complete-masks \
  --cls-model yolov11-cls.pt --cls-confidence 0.8

Output

The program saves cropped images to the output directory with filenames that include:

  • Original image name
  • Class number and name
  • Instance number (if multiple objects of the same class)
  • Classification result (if classification is enabled)

Example: image1--0_person-(1)-[5_adult].jpg
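The naming scheme can be sketched as a pair of helpers that build and parse such filenames. The separators are inferred from the single example above, and the function names, the regular expression, and the assumption that class names contain no hyphens, brackets, or parentheses are all hypothetical.

```python
import re


def build_output_name(stem, class_id, class_name, instance=None,
                      cls_id=None, cls_name=None, ext=".jpg"):
    """Assemble a filename like 'image1--0_person-(1)-[5_adult].jpg'."""
    name = f"{stem}--{class_id}_{class_name}"
    if instance is not None:  # only present when a class occurs more than once
        name += f"-({instance})"
    if cls_id is not None and cls_name is not None:  # only when classification is enabled
        name += f"-[{cls_id}_{cls_name}]"
    return name + ext


# Pattern inferred from the example; optional groups cover the instance and classification parts.
_NAME_RE = re.compile(
    r"^(?P<stem>.+?)--(?P<class_id>\d+)_(?P<class_name>[^-\[\]()]+?)"
    r"(?:-\((?P<instance>\d+)\))?"
    r"(?:-\[(?P<cls_id>\d+)_(?P<cls_name>[^\]]+)\])?"
    r"\.(?P<ext>\w+)$"
)


def parse_output_name(filename):
    """Return the filename components as a dict of strings, or None if it does not match."""
    match = _NAME_RE.match(filename)
    return match.groupdict() if match else None
```

Parsing the names back into fields makes it straightforward to group crops by detected class or classification label in downstream scripts.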

Advanced Features

Smart Padding

The tool applies variable padding to bounding boxes based on detection confidence:

  • Higher-confidence detections receive minimal padding (--base-padding)
  • Lower-confidence detections receive more padding (up to --max-padding at the confidence threshold)
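One way to realize this behavior is a linear interpolation between the two padding factors. This is a minimal sketch under that assumption: the README does not state the exact formula, and the function names and the choice to scale padding by box width and height are hypothetical.

```python
def padding_factor(confidence, threshold=0.8, base_padding=0.05, max_padding=0.25):
    """Interpolate linearly from max_padding (at threshold) to base_padding (at 1.0)."""
    if threshold >= 1.0:
        return base_padding
    # Clamp so that out-of-range confidences cannot extrapolate the padding.
    conf = min(max(confidence, threshold), 1.0)
    t = (conf - threshold) / (1.0 - threshold)  # 0 at threshold, 1 at full confidence
    return base_padding + (1.0 - t) * (max_padding - base_padding)


def pad_box(box, factor, image_width, image_height):
    """Grow an (x1, y1, x2, y2) box by factor on each side, clipped to the image bounds."""
    x1, y1, x2, y2 = box
    pad_x = (x2 - x1) * factor
    pad_y = (y2 - y1) * factor
    return (
        max(0, round(x1 - pad_x)),
        max(0, round(y1 - pad_y)),
        min(image_width, round(x2 + pad_x)),
        min(image_height, round(y2 + pad_y)),
    )
```

With the default settings, a detection at confidence 0.9 would fall halfway between the two factors, at roughly 0.15 of the box size on each side.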

Mask Refinement

When using background removal with --clean-masks and --complete-masks:

  • Small isolated regions and holes are removed from masks
  • Unallocated foreground pixels are assigned to the closest appropriate segment
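The two refinement steps can be illustrated on small binary grids. The sketch below is a dependency-free approximation, not the project's implementation: it treats masks as lists of 0/1 rows, uses 4-connectivity, and interprets "closest" as the fewest steps through foreground pixels. All function names and the min_size parameter are hypothetical.

```python
from collections import deque


def _neighbors(y, x, height, width):
    """Yield the in-bounds 4-connected neighbors of (y, x)."""
    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
        if 0 <= ny < height and 0 <= nx < width:
            yield ny, nx


def remove_small_islands(mask, min_size):
    """Clear foreground components with fewer than min_size pixels."""
    height, width = len(mask), len(mask[0])
    cleaned = [row[:] for row in mask]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first search collects one connected component.
            component, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in _neighbors(cy, cx, height, width):
                    if mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) < min_size:
                for cy, cx in component:
                    cleaned[cy][cx] = 0
    return cleaned


def fill_small_holes(mask, min_size):
    """Fill background components smaller than min_size by cleaning the inverted mask."""
    inverted = [[1 - v for v in row] for row in mask]
    kept_background = remove_small_islands(inverted, min_size)
    return [[1 - v for v in row] for row in kept_background]


def assign_leftovers(labels, foreground):
    """Spread segment labels (> 0) into unlabeled (0) foreground pixels, nearest label first."""
    height, width = len(labels), len(labels[0])
    result = [row[:] for row in labels]
    # Multi-source BFS: every labeled pixel starts expanding at the same time.
    queue = deque((y, x) for y in range(height) for x in range(width) if labels[y][x])
    while queue:
        cy, cx = queue.popleft()
        for ny, nx in _neighbors(cy, cx, height, width):
            if foreground[ny][nx] and result[ny][nx] == 0:
                result[ny][nx] = result[cy][cx]
                queue.append((ny, nx))
    return result
```

On full-size images, array-based routines from libraries such as OpenCV or scikit-image would be far faster than this pure-Python version.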

Logging

The program provides detailed logging information:

  • Detection confidence scores
  • Applied padding values
  • Classification results
  • Processing status for each image

License

Copyright (c) 2025 University of Washington
Licensed under the MIT License. See LICENSE file for details.

Acknowledgments
