A powerful Python tool for object detection, cropping, background removal, and classification using YOLO and Segment Anything Model (SAM).
- Object Detection: Detect objects in images using YOLOv11
- Smart Cropping: Automatically crop detected objects with intelligent padding based on confidence
- Background Removal: Remove backgrounds using SAM segmentation
- Mask Refinement: Clean segmentation masks and assign unallocated pixels
- Classification: Optionally classify cropped objects
- Batch Processing: Process entire directories of images
- Python 3.12 or higher
- Dependencies listed in `pyproject.toml`
This package uses uv for dependency management. If you don't have uv installed, you can install it following the instructions at https://github.com/astral-sh/uv.
```bash
# Clone the repository
git clone https://github.com/trenchproject/trench_image_processor.git
cd trench_image_processor

# Install dependencies using uv and activate the virtual environment
uv sync
source .venv/bin/activate
```
```bash
# Run unit and integration tests
pytest

# Run tests marked as slow
pytest -m "slow"
```

The program provides a command-line interface with various options:
```bash
uv run src/main.py -i INPUT_DIR -o OUTPUT_DIR -m YOLO_MODEL [options]
```

- `-i, --input_dir`: Directory containing images to process
- `-o, --output_dir`: Directory for saving cropped images
- `-m, --model`: Path to YOLOv11 detection model file (.pt)
- `-c, --confidence`: Confidence threshold (default: 0.8)
- `--iou`: Intersection-over-union threshold (default: 0.2)
- `-r, --resolution`: Output resolution width in pixels
- `-bp, --base-padding`: Base padding factor at highest confidence (default: 0.05)
- `-mp, --max-padding`: Maximum padding factor at threshold confidence (default: 0.25)
- `--sam-model`: Path to SAM segmentation model file (.pt)
- `--bg-color`: Background replacement color in hex format (default: #00a000, green)
- `--clean-masks`: Clean segmentation masks by removing small holes and islands
- `--complete-masks`: Assign all foreground pixels in the source image to an output segment
- `--visualize-leftovers`: Visualize leftover pixels before and after mask completion
- `--cls-model`: Path to YOLO classification model file (.pt)
- `--cls-confidence`: Confidence threshold for classification (default: 0.75)
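As an aside, a `#RRGGBB` value like the `--bg-color` default can be converted to an RGB tuple in a few lines. This is only an illustrative sketch (the function name `parse_bg_color` is not part of the tool's API):

```python
def parse_bg_color(hex_str: str) -> tuple[int, int, int]:
    """Parse a '#RRGGBB' string into an (R, G, B) tuple."""
    s = hex_str.lstrip("#")
    # Take each two-character hex pair and convert it to an integer
    return tuple(int(s[i:i + 2], 16) for i in (0, 2, 4))
```

For example, the default `#00a000` corresponds to `(0, 160, 0)`.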
```bash
# Basic detection and cropping
uv run src/main.py -i ./images -o ./output -m yolov11n.pt

# With SAM background removal
uv run src/main.py -i ./images -o ./output -m yolov11n.pt --sam-model sam_l.pt --bg-color "#00a000"

# With classification of cropped objects
uv run src/main.py -i ./images -o ./output -m yolov11n.pt --cls-model yolov11-cls.pt

# Full pipeline with custom thresholds, padding, and resolution
uv run src/main.py -i ./images -o ./output -m yolov11n.pt -c 0.7 --iou 0.3 \
    -bp 0.1 -mp 0.3 -r 800 \
    --sam-model sam_l.pt --bg-color "#0000FF" --clean-masks --complete-masks \
    --cls-model yolov11-cls.pt --cls-confidence 0.8
```

The program saves cropped images to the output directory with filenames that include:
- Original image name
- Class number and name
- Instance number (if multiple objects of the same class)
- Classification result (if classification is enabled)
Example: image1--0_person-(1)-[5_adult].jpg
The tool applies variable padding to bounding boxes based on detection confidence:
- Higher confidence detections receive minimal padding (base-padding)
- Lower confidence detections receive more padding (up to max-padding)
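The exact interpolation is not documented here; a minimal sketch, assuming linear interpolation between `max-padding` at the confidence threshold and `base-padding` at confidence 1.0 (defaults taken from the options above):

```python
def padding_factor(confidence: float,
                   threshold: float = 0.8,
                   base_padding: float = 0.05,
                   max_padding: float = 0.25) -> float:
    """Linearly interpolate the padding factor from detection confidence."""
    # Clamp confidence into [threshold, 1.0]
    c = min(max(confidence, threshold), 1.0)
    # 0.0 at the threshold, 1.0 at full confidence
    t = (c - threshold) / (1.0 - threshold)
    return max_padding + t * (base_padding - max_padding)
```

Under these assumptions, a detection at confidence 0.9 would receive a padding factor of 0.15, halfway between the two extremes.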
When using background removal with --clean-masks and --complete-masks:
- Small isolated regions and holes are removed from masks
- Unallocated foreground pixels are assigned to the closest appropriate segment
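The README does not specify how "closest" is computed; a minimal nearest-labelled-pixel sketch in NumPy (the function name and brute-force search are illustrative, not the tool's implementation):

```python
import numpy as np

def complete_masks(labels: np.ndarray, foreground: np.ndarray) -> np.ndarray:
    """Assign each unallocated foreground pixel (labels == 0 but foreground
    is True) to the label of the nearest already-labelled pixel."""
    out = labels.copy()
    labelled = np.argwhere(labels > 0)                    # coords of labelled pixels
    leftovers = np.argwhere((labels == 0) & foreground)   # coords to fill in
    for y, x in leftovers:
        # Brute-force squared distance to every labelled pixel
        d2 = ((labelled - (y, x)) ** 2).sum(axis=1)
        ny, nx = labelled[d2.argmin()]
        out[y, x] = labels[ny, nx]
    return out
```

A production implementation would more likely use a distance transform (e.g. `scipy.ndimage.distance_transform_edt` with `return_indices=True`) rather than this per-pixel loop.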
The program provides detailed logging information:
- Detection confidence scores
- Applied padding values
- Classification results
- Processing status for each image
Copyright (c) 2025 University of Washington
Licensed under the MIT License. See LICENSE file for details.
- Ultralytics for YOLO implementation
- Segment Anything for SAM implementation