
ML morph implementation #10

Open

leyangloh wants to merge 19 commits into main from leyang/ml-morph

Conversation

@leyangloh (Collaborator) commented Feb 16, 2026

Complete pipeline for lizard toepad landmark prediction using dlib shape predictors
(ml-morph) with YOLO-detected bounding boxes. Compares two detection approaches:

  • Baseline: Standard YOLO detect (axis-aligned bounding boxes, 6-class)
  • OBB: YOLO-OBB (oriented bounding boxes) with flip inference strategy

Results: Baseline (H5) vs OBB (Axis-Aligned)

  • Toe: Baseline 83.27 px → OBB 30.72 px (2.7x better)
  • Finger: Baseline 40.64 px → OBB 38.25 px (1.06x better)

Test error = average pixel deviation between predicted and ground-truth landmarks (144
hyperparameter configs searched per condition).
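
A metric like this could be computed as the mean Euclidean distance per landmark; a minimal sketch (the function name and array shapes are assumptions, not the PR's code):

```python
import numpy as np

def mean_pixel_error(pred, gt):
    """Average Euclidean distance (in pixels) between predicted and
    ground-truth landmark arrays of shape (N, 2)."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.linalg.norm(pred - gt, axis=1).mean())

# Two landmarks, each off by a 3-4-5 pixel triangle:
print(mean_pixel_error([[0, 0], [10, 10]], [[3, 4], [13, 14]]))  # → 5.0
```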

OBB wins both categories. Oriented bounding boxes converted to axis-aligned rects with
30% padding produce tighter, more relevant crops for the shape predictor.
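
That conversion could look roughly like the following sketch; the exact padding convention (e.g. whether the 30% is applied per side or to the total extent) is an assumption:

```python
import math

def obb_to_padded_rect(cx, cy, w, h, angle_rad, pad_frac=0.30,
                       img_w=None, img_h=None):
    """Axis-aligned envelope of a rotated box (cx, cy, w, h, angle),
    grown by pad_frac per side and optionally clipped to the image."""
    c, s = abs(math.cos(angle_rad)), abs(math.sin(angle_rad))
    half_w = (w * c + h * s) / 2 * (1 + pad_frac)
    half_h = (w * s + h * c) / 2 * (1 + pad_frac)
    x1, y1 = cx - half_w, cy - half_h
    x2, y2 = cx + half_w, cy + half_h
    if img_w is not None:
        x1, x2 = max(0.0, x1), min(float(img_w), x2)
    if img_h is not None:
        y1, y2 = max(0.0, y1), min(float(img_h), y2)
    return x1, y1, x2, y2

# Unrotated 20x10 box at (50, 50) with no padding:
print(obb_to_padded_rect(50, 50, 20, 10, 0.0, pad_frac=0.0))  # → (40.0, 45.0, 60.0, 55.0)
```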

Failed Experiment: Crop+Rotate OBB

Attempted to crop and rotate images so OBB becomes upright with 10% padding. Results
were significantly worse (toe: 92.02, finger: 76.11) due to rotation interpolation
artifacts and landmark transformation issues. See docs/EXPERIMENT_CROP_ROTATE_OBB.md.

What's Included

  • ml-morph framework: dlib shape predictor training with TPS→XML→YOLO bbox pipeline
  • Preprocessing scripts: tps_to_xml.py, generate_yolo_bbox_xml.py,
    create_noflip_obb_dataset.py
  • Inference: predict_landmarks_flip.py — end-to-end OBB detection + landmark prediction
    with flip strategy
  • Hyperparameter search: 144-config grid search over tree_depth, cascade_depth, nu,
    num_trees
  • SLURM sbatch files: For preprocessing, training, and hyperparameter search (baseline,
    OBB, OBB-aligned)
  • Documentation: COMPARISON_BASELINE_VS_OBB.md, EXPERIMENT_CROP_ROTATE_OBB.md
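
For scale, here is one hypothetical grid shape over those four hyperparameters that yields exactly 144 combinations (the actual value ranges searched in the PR are not shown here):

```python
import itertools

# Hypothetical ranges: 3 * 4 * 3 * 4 = 144 configurations.
GRID = {
    "tree_depth": [2, 3, 4],
    "cascade_depth": [10, 15, 20, 25],
    "nu": [0.05, 0.1, 0.25],
    "num_trees": [500, 1000, 1500, 2000],
}

def iter_configs(grid):
    """Yield one dict per point in the Cartesian product of the grid."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

print(len(list(iter_configs(GRID))))  # → 144
```

Each dict would then be mapped onto a `dlib.shape_predictor_training_options` instance (fields `tree_depth`, `cascade_depth`, `nu`, `num_trees_per_cascade_level`) before training.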

leyangloh and others added 19 commits December 10, 2025 14:31
Adds --obb flag to preprocessing to generate rotated bbox labels using
PCA angle from landmarks, enabling YOLO to predict rotation angle at
inference time. This solves the chicken-and-egg problem where the dlib
shape predictor needs rotation angle but can't get it without landmarks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
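
The PCA-angle step that commit describes could be sketched as follows (an illustrative version, not the PR's implementation; the angle is folded into [0, π) since a box orientation is sign-ambiguous):

```python
import numpy as np

def pca_angle(points):
    """Orientation in [0, pi) of the principal axis of 2-D landmark
    points, taken from the leading eigenvector of their covariance."""
    pts = np.asarray(points, dtype=float)
    cov = np.cov((pts - pts.mean(axis=0)).T)
    eigvals, eigvecs = np.linalg.eigh(cov)
    major = eigvecs[:, np.argmax(eigvals)]  # eigenvector of largest eigenvalue
    return float(np.arctan2(major[1], major[0]) % np.pi)

# Landmarks along a 45-degree line:
print(round(pca_angle([[0, 0], [1, 1], [2, 2]]), 4))  # → 0.7854
```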
New 6-class scheme: up_finger, up_toe, bot_finger, bot_toe, ruler, id.
Merge script converts upper-view standard bboxes to OBB format and
remaps bottom-view OBB classes, producing a unified dataset. Also fixes
yolo model name typo (yolov11n -> yolo11n).
Remove generated outputs, symlinks, upstream assets, and accidental
copies that shouldn't be in version control.
Add crop+rotate support to generate_yolo_bbox_xml.py and
predict_landmarks_flip.py for OBB-based landmark prediction.
Update .gitignore files, sbatch configs, and hyperparameter search.
Add configs, sbatch files, and preprocessing for baseline, OBB
crop+rotate, and OBB axis-aligned experiments. Document results:
OBB axis-aligned wins (toe 30.72 vs 83.27, finger 38.25 vs 40.64).
Debug scripts moved to .gitignore. Remove obsolete 2-class OBB
config, sbatch, and dataset creation script.
@leyangloh changed the title from "Leyang/ml morph" to "ML morph implementation" on Feb 16, 2026

Copilot AI left a comment


Pull request overview

Adds an end-to-end landmark prediction pipeline that combines YOLO (detect + OBB) detections with classic dlib shape predictors (ml-morph), including dataset-prep utilities, flip-strategy inference, SLURM job scripts, and experiment documentation.

Changes:

  • Introduce ml-morph (dlib) training + hyperparameter search utilities and configs, integrated with the repo’s uv/pyproject.toml workflow.
  • Add OBB dataset generation scripts (merged 6-class + no-flip variants) and flip-strategy inference scripts for OBB models.
  • Add reproducibility artifacts: sbatch scripts, comparison docs, and experiment notes.

Reviewed changes

Copilot reviewed 71 out of 82 changed files in this pull request and generated 10 comments.

Per-file summary:
scripts/visualization/visualize_progression.py Visualizes bbox→OBB→landmarks progression.
scripts/training/train_yolo.py Add --task and pass YOLO task into training.
scripts/preprocessing/create_noflip_obb_dataset.py Build 6-class OBB dataset without flip labels.
scripts/preprocessing/create_merged_obb_dataset.py Merge bottom OBB + upper bbox into 6-class OBB labels.
scripts/preprocessing/consolidate_tps_by_category.py Consolidate TPS files into category-specific files.
scripts/inference/predict_landmarks_flip.py OBB flip inference + dlib landmark prediction.
scripts/inference/predict.py Extend inference printing to support OBB outputs.
scripts/inference/inference_with_flip.py Standalone flip-strategy OBB visualization inference.
scripts/extract_id_from_yolo.py Adjust config key for labels directory lookup.
sbatch/train_yolo_obb_noflip.sbatch SLURM job to create no-flip dataset + train OBB model.
sbatch/train_yolo_6class.sbatch SLURM job to train 6-class OBB model.
sbatch/train_mlmorph_toe.sbatch SLURM job for toe ml-morph training w/ YOLO-derived bboxes.
sbatch/train_mlmorph_finger.sbatch SLURM job for finger ml-morph training w/ YOLO-derived bboxes.
sbatch/preprocess_obb_aligned.sbatch SLURM preprocessing for axis-aligned OBB conversion.
sbatch/preprocess_obb.sbatch SLURM preprocessing for crop+rotate OBB experiment.
sbatch/preprocess_baseline.sbatch SLURM preprocessing for baseline detect bboxes.
sbatch/hyperparam_search_toe_obb_aligned.sbatch SLURM hyperparam search (toe, OBB aligned).
sbatch/hyperparam_search_toe_obb.sbatch SLURM hyperparam search (toe, crop+rotate OBB).
sbatch/hyperparam_search_toe_baseline.sbatch SLURM hyperparam search (toe, baseline detect).
sbatch/hyperparam_search_toe.sbatch SLURM quick hyperparam search (toe).
sbatch/hyperparam_search_finger_obb_aligned.sbatch SLURM hyperparam search (finger, OBB aligned).
sbatch/hyperparam_search_finger_obb.sbatch SLURM hyperparam search (finger, crop+rotate OBB).
sbatch/hyperparam_search_finger_baseline.sbatch SLURM hyperparam search (finger, baseline detect).
sbatch/hyperparam_search_finger.sbatch SLURM quick hyperparam search (finger).
pyproject.toml Add pandas/lightning deps (+ tensorboard dev dep).
ml-morph/utils/utils.py Classic ml-morph utility functions (xml/tps helpers).
ml-morph/utils/init.py Package init + re-exports.
ml-morph/shape_trainer.py Classic dlib shape predictor training script.
ml-morph/shape_tester.py Classic dlib shape predictor testing script.
ml-morph/scripts/training/hyperparameter_search.py Grid search for dlib shape predictor hyperparams.
ml-morph/scripts/train_workflow.py Config-driven classic dlib workflow runner.
ml-morph/scripts/preprocessing/tps_to_xml.py TPS→XML converter (no dlib dependency).
ml-morph/scripts/preprocessing/split_train_val_test.py Train/val/test split + XML regeneration utility.
ml-morph/scripts/preprocessing/remove_landmarks_from_tps.py Remove selected landmarks from TPS files.
ml-morph/scripts/preprocessing/merge_tps_files.py Merge TPS files into consolidated TPS.
ml-morph/scripts/preprocessing/generate_yolo_bbox_xml.py Replace XML bboxes with YOLO-derived (incl OBB/crop-rotate).
ml-morph/scripts/preprocessing/generate_obb_from_tps.py Generate OBB labels from TPS landmarks.
ml-morph/scripts/preprocessing/extract_scale_tps.py Extract scale landmarks into separate TPS files.
ml-morph/scripts/preprocessing/consolidate_all_tps.py Consolidate TPS by category with optional landmark removal.
ml-morph/scripts/plot_hyperparam_results.py Plot hyperparam search results.
ml-morph/scripts/evaluate.py Standalone evaluation helper for PyTorch workflow.
ml-morph/requirements.txt ml-morph requirements list.
ml-morph/preprocessing.py Original ml-morph preprocessing entrypoint.
ml-morph/prediction.py Original ml-morph prediction entrypoint.
ml-morph/makefile Convenience install make target.
ml-morph/detector_trainer.py Original ml-morph detector trainer.
ml-morph/detector_tester.py Original ml-morph detector tester.
ml-morph/configs/toe_training_yolo_obb.yaml Toe dlib training config using OBB-derived bboxes.
ml-morph/configs/toe_training_yolo_bbox.yaml Toe dlib training config using detect-derived bboxes.
ml-morph/configs/toe_training_yolo_baseline.yaml Toe baseline config for detect comparison.
ml-morph/configs/toe_training.yaml Toe classic dlib workflow config.
ml-morph/configs/finger_training_yolo_obb.yaml Finger dlib training config using OBB-derived bboxes.
ml-morph/configs/finger_training_yolo_bbox.yaml Finger dlib training config using detect-derived bboxes.
ml-morph/configs/finger_training_yolo_baseline.yaml Finger baseline config for detect comparison.
ml-morph/configs/default.yaml Default config template for classic workflow.
ml-morph/README_ml-morph.md Upstream-style ml-morph README content.
ml-morph/README.md Repo-integrated ml-morph README and workflow guide.
ml-morph/.gitignore Ignore generated training artifacts/crops/results.
docs/assets/bilateral_detection_plan/pre-augmentated/1201_flipud.jpg Added documentation asset image.
docs/assets/bilateral_detection_plan/pre-augmentated/1186_flipud.jpg Added documentation asset image.
docs/assets/bilateral_detection_plan/maunal-labeling/1237.jpg Added documentation asset image.
docs/assets/bilateral_detection_plan/maunal-labeling/1126.jpg Added documentation asset image.
docs/assets/bilateral_detection_plan/maunal-labeling/1111.jpg Added documentation asset image.
docs/assets/bilateral_detection_plan/maunal-labeling/1020.jpg Added documentation asset image.
docs/assets/bilateral_detection_plan/conf_Iou_treshhold_adjust/1007.jpg Added documentation asset image.
docs/assets/bilateral_detection_plan/conf_Iou_treshhold_adjust/1004.jpg Added documentation asset image.
docs/INFERENCE_WITH_FLIP.md Documentation for flip inference strategy.
docs/EXPERIMENT_CROP_ROTATE_OBB.md Write-up of crop+rotate OBB failed experiment.
docs/COMPARISON_BASELINE_VS_OBB.md Baseline vs OBB comparison results and reproduction steps.
configs/H5_obb_noflip.yaml YOLO dataset config for no-flip 6-class OBB dataset.
configs/H1_obb_6class.yaml YOLO dataset config for merged 6-class OBB dataset.
.gitignore Ignore logs/scratch and generated assets.
Comments suppressed due to low confidence (1)

scripts/inference/predict.py:32

  • A '--task' argument was added but isn't used anywhere, and model discovery is hardcoded to runs/detect/. This is confusing for users trying OBB inference. Either remove the flag, or use it to select runs/obb/ for default model paths and/or pass the intended task through to prediction logic.


Comment on lines +2 to +8
import argparse
import os
from pathlib import Path
import cv2
import numpy as np
from ultralytics import YOLO
import sys

Copilot AI Feb 16, 2026


Imports 'os' and 'sys' are unused. With ruff enabled, this will be reported as unused imports. Please remove them or use them.

Comment on lines 344 to 346
  image_dir = get_opt('image-dir', None)
- label_dir = get_opt('label-dir', None)
+ label_dir = get_opt('labels-dir', None)
  output_dir = get_opt('output-dir', 'data/processed')

Copilot AI Feb 16, 2026


The config key was changed from 'label-dir' to 'labels-dir', but existing configs (e.g. configs/H1.yaml under extracting) still use 'label-dir'. This will cause label_dir to be None and the script to error. Consider supporting both keys for backward compatibility or updating all configs to use 'labels-dir'.
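
A minimal sketch of the backward-compatible lookup suggested here, assuming `get_opt(key, default)` behaves like a dict-style lookup:

```python
def get_label_dir(get_opt):
    """Prefer the new 'labels-dir' key, but fall back to the legacy
    'label-dir' so existing configs like configs/H1.yaml keep working."""
    label_dir = get_opt('labels-dir', None)
    if label_dir is None:
        label_dir = get_opt('label-dir', None)
    return label_dir

# Example with a legacy config:
opts = {'label-dir': 'data/labels'}
print(get_label_dir(lambda key, default=None: opts.get(key, default)))  # → data/labels
```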

Comment on lines +386 to +387
print(f"Warning: Texture file for model '{stem}' is missing.")
print(f" Please provide it at: /Users/leyangloh/Downloads/3D_Fish/mussels_31 (or verify filename matches model name).")

Copilot AI Feb 16, 2026


This warning message references an unrelated local path ('/Users/leyangloh/Downloads/3D_Fish/mussels_31'), which is confusing for this repo/workflow. Please replace with a repo-relevant instruction (e.g., print the searched paths and suggest passing --source explicitly).

Comment on lines +103 to +107
# Paths
img_path = Path("/storage/ice-shared/cs8903onl/miami_fall_24_jpgs/1001.jpg")
bbox_model = Path("/home/hice1/yloh30/scratch/Lizard_Toepads/yolo_bounding_box.pt")
obb_model = Path("/home/hice1/yloh30/scratch/Lizard_Toepads/runs/obb/H1_obb_2class2/weights/best.pt")
output_dir = Path("inference_results/progression")

Copilot AI Feb 16, 2026


The script hardcodes absolute, environment-specific paths for the image and model weights. This makes it hard to reuse outside the original machine/HPC setup. Consider adding CLI arguments (or reading from configs/) for these paths with the current values as examples in the README/docs.
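
The suggested CLI could look roughly like this (flag names and defaults are illustrative, not the PR's actual interface):

```python
import argparse
from pathlib import Path

parser = argparse.ArgumentParser(
    description="Visualize bbox -> OBB -> landmarks progression")
parser.add_argument("--image", type=Path, required=True,
                    help="input image to run the progression on")
parser.add_argument("--bbox-model", type=Path, required=True,
                    help="YOLO detect weights (.pt)")
parser.add_argument("--obb-model", type=Path, required=True,
                    help="YOLO OBB weights (.pt)")
parser.add_argument("--output-dir", type=Path,
                    default=Path("inference_results/progression"))

# Example invocation:
args = parser.parse_args([
    "--image", "1001.jpg",
    "--bbox-model", "yolo_bounding_box.pt",
    "--obb-model", "best.pt",
])
print(args.output_dir)  # → inference_results/progression
```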

@JunlingZhuang (Collaborator) commented Feb 20, 2026


Hi Leyang, I think what Copilot said might be right; some of these are hardcoded to your user path. We may want to make these scripts config-driven. Also, there is a centralized YOLO model download script, scripts/download_models.py; you might just want to add the YOLO OBB model download code there.

Comment on lines +27 to +30
# 1. Standard Inference
# We keep bot_finger(2), bot_toe(3), ruler(4), id(5)
# We IGNORE up_finger(0) and up_toe(1) from this pass as they are inaccurate axis-aligned boxes
results_orig = model.predict(img, imgsz=1280, conf=conf, iou=iou, verbose=False)[0]

Copilot AI Feb 16, 2026


The comment about which class IDs are kept in the standard pass doesn't match the 2-class model described above (it references bot_finger(2)/bot_toe(3)/ruler/id). This is misleading for readers and future maintenance; please update it to match the actual class IDs used in the code (0/1 only).

target_class: toe

inference:
predictor: toe_predictor_yolo_baseline.datyes

Copilot AI Feb 16, 2026


The predictor filename looks malformed ('toe_predictor_yolo_baseline.datyes'), which will break downstream inference steps that read this path. It should point to a real .dat file name (likely 'toe_predictor_yolo_baseline.dat').

Suggested change:
- predictor: toe_predictor_yolo_baseline.datyes
+ predictor: toe_predictor_yolo_baseline.dat

Comment on lines +247 to +249
# Run YOLO
results = model(img_path, conf=conf_threshold, device=0, verbose=False)
best_box, best_xywhr = find_best_toe_detection(

Copilot AI Feb 16, 2026


YOLO inference is hardcoded to run on GPU device 0. This will fail on CPU-only machines or on nodes where the GPU isn't device 0. Please make device configurable (CLI arg / config) and pass it through, or omit 'device=' to let Ultralytics decide.
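
A minimal sketch of making the device configurable (the `--device` flag name is an assumption; passing `device=None` lets Ultralytics auto-select):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--device", default=None,
                    help="e.g. 'cpu', '0', or '0,1'; omit to auto-select")
args = parser.parse_args(["--device", "cpu"])  # example invocation

# The YOLO call would then become:
# results = model(img_path, conf=conf_threshold, device=args.device, verbose=False)
print(args.device)  # → cpu
```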

Comment on lines +293 to +295
# Update XML: change image path to crop, update box, update landmarks
image_elem.set("file", crop_path)
box_elem.set("left", str(new_left))

Copilot AI Feb 16, 2026


This updates the path inside the per-box loop. If an image contains multiple entries, the image path will be overwritten for all boxes, and earlier boxes/parts will no longer match the referenced image. To support multiple boxes, create separate elements per crop (or only enable crop+rotate when there is exactly one box per image and validate that assumption).
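
A sketch of the per-crop approach using dlib's training-XML layout (the helper name and attribute values are illustrative):

```python
import xml.etree.ElementTree as ET

def add_crop_image(images_elem, crop_path, left, top, width, height):
    """Emit one <image> element per crop instead of mutating a shared
    parent, so multiple boxes per source image stay consistent."""
    image = ET.SubElement(images_elem, "image", file=crop_path)
    ET.SubElement(image, "box", left=str(left), top=str(top),
                  width=str(width), height=str(height))
    return image

images = ET.Element("images")
add_crop_image(images, "crops/1001_box0.jpg", 10, 20, 100, 80)
add_crop_image(images, "crops/1001_box1.jpg", 5, 5, 90, 70)
print(len(images))  # → 2
```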

Comment on lines +22 to +26
x1 = max(0, x - px)
y1 = max(0, y - py)
x2 = min(img_w, x + w + px)
y2 = min(img_h, y + h + py)
return dlib.rectangle(int(x1), int(y1), int(x2), int(y2))

Copilot AI Feb 16, 2026


dlib.rectangle expects right/bottom coordinates to be inclusive. Using x2=min(img_w, ...) and y2=min(img_h, ...) can produce x2==img_w or y2==img_h (one past the last valid pixel). Clamp to img_w-1/img_h-1 (and similarly in the rotated-crop path) to avoid out-of-bounds rectangles.
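
The suggested clamp as a sketch, returning coordinates that would then feed `dlib.rectangle` (the helper name is illustrative):

```python
def padded_rect_coords(x, y, w, h, px, py, img_w, img_h):
    """Pad a box and clamp so right/bottom remain valid *inclusive*
    pixel indices (img_w - 1 / img_h - 1), as dlib.rectangle expects."""
    x1 = max(0, x - px)
    y1 = max(0, y - py)
    x2 = min(img_w - 1, x + w + px)
    y2 = min(img_h - 1, y + h + py)
    return int(x1), int(y1), int(x2), int(y2)

# A box touching the image edge no longer spills one past the last pixel:
print(padded_rect_coords(90, 90, 20, 20, 5, 5, 100, 100))  # → (85, 85, 99, 99)
```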

Comment on lines +8 to +10
from ultralytics import YOLO
import dlib
import sys

Copilot AI Feb 16, 2026


Unused imports ('dlib', 'sys') will be flagged by ruff and can fail CI/lint. Please remove them or use them.

@JunlingZhuang (Collaborator) commented Feb 20, 2026

I'm actually not quite sure how to reproduce the preprocessing and training steps, as there are lots of new scripts without a step-by-step doc. That makes the PR a little difficult for others to use. Also, may I know how these new scripts couple with the existing scripts? PR #8 might be a good reference; it includes updates to README.md pointing out how to use the new scripts.
