Conversation
Adds --obb flag to preprocessing to generate rotated bbox labels using PCA angle from landmarks, enabling YOLO to predict rotation angle at inference time. This solves the chicken-and-egg problem where the dlib shape predictor needs rotation angle but can't get it without landmarks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New 6-class scheme: up_finger, up_toe, bot_finger, bot_toe, ruler, id. Merge script converts upper-view standard bboxes to OBB format and remaps bottom-view OBB classes, producing a unified dataset. Also fixes yolo model name typo (yolov11n -> yolo11n).
Remove generated outputs, symlinks, upstream assets, and accidental copies that shouldn't be in version control.
Add crop+rotate support to generate_yolo_bbox_xml.py and predict_landmarks_flip.py for OBB-based landmark prediction. Update .gitignore files, sbatch configs, and hyperparameter search.
Add configs, sbatch files, and preprocessing for baseline, OBB crop+rotate, and OBB axis-aligned experiments. Document results: OBB axis-aligned wins (toe 30.72 vs 83.27, finger 38.25 vs 40.64).
Debug scripts moved to .gitignore. Remove obsolete 2-class OBB config, sbatch, and dataset creation script.
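The PCA-angle idea from the first commit can be sketched as a small standalone function (an illustration only, not the actual preprocessing code; the function name is made up):

```python
import numpy as np

def pca_angle(landmarks: np.ndarray) -> float:
    """Estimate the orientation of a 2D landmark cloud via PCA.

    Returns the angle (radians) of the first principal axis, i.e. the
    direction of greatest landmark spread, suitable as an OBB rotation.
    """
    centered = landmarks - landmarks.mean(axis=0)
    # Principal axes are the eigenvectors of the 2x2 covariance matrix.
    cov = np.cov(centered.T)
    eigvals, eigvecs = np.linalg.eigh(cov)
    major = eigvecs[:, np.argmax(eigvals)]  # dominant axis
    return float(np.arctan2(major[1], major[0]))
```

Note the 180-degree sign ambiguity inherent to PCA: the major axis direction is only defined up to a flip, which is exactly why a separate flip strategy is needed at inference time.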
Pull request overview
Adds an end-to-end landmark prediction pipeline that combines YOLO (detect + OBB) detections with classic dlib shape predictors (ml-morph), including dataset-prep utilities, flip-strategy inference, SLURM job scripts, and experiment documentation.
Changes:
- Introduce ml-morph (dlib) training + hyperparameter search utilities and configs, integrated with the repo's `uv`/`pyproject.toml` workflow.
- Add OBB dataset generation scripts (merged 6-class + no-flip variants) and flip-strategy inference scripts for OBB models.
- Add reproducibility artifacts: sbatch scripts, comparison docs, and experiment notes.
Reviewed changes
Copilot reviewed 71 out of 82 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| scripts/visualization/visualize_progression.py | Visualizes bbox→OBB→landmarks progression. |
| scripts/training/train_yolo.py | Add --task and pass YOLO task into training. |
| scripts/preprocessing/create_noflip_obb_dataset.py | Build 6-class OBB dataset without flip labels. |
| scripts/preprocessing/create_merged_obb_dataset.py | Merge bottom OBB + upper bbox into 6-class OBB labels. |
| scripts/preprocessing/consolidate_tps_by_category.py | Consolidate TPS files into category-specific files. |
| scripts/inference/predict_landmarks_flip.py | OBB flip inference + dlib landmark prediction. |
| scripts/inference/predict.py | Extend inference printing to support OBB outputs. |
| scripts/inference/inference_with_flip.py | Standalone flip-strategy OBB visualization inference. |
| scripts/extract_id_from_yolo.py | Adjust config key for labels directory lookup. |
| sbatch/train_yolo_obb_noflip.sbatch | SLURM job to create no-flip dataset + train OBB model. |
| sbatch/train_yolo_6class.sbatch | SLURM job to train 6-class OBB model. |
| sbatch/train_mlmorph_toe.sbatch | SLURM job for toe ml-morph training w/ YOLO-derived bboxes. |
| sbatch/train_mlmorph_finger.sbatch | SLURM job for finger ml-morph training w/ YOLO-derived bboxes. |
| sbatch/preprocess_obb_aligned.sbatch | SLURM preprocessing for axis-aligned OBB conversion. |
| sbatch/preprocess_obb.sbatch | SLURM preprocessing for crop+rotate OBB experiment. |
| sbatch/preprocess_baseline.sbatch | SLURM preprocessing for baseline detect bboxes. |
| sbatch/hyperparam_search_toe_obb_aligned.sbatch | SLURM hyperparam search (toe, OBB aligned). |
| sbatch/hyperparam_search_toe_obb.sbatch | SLURM hyperparam search (toe, crop+rotate OBB). |
| sbatch/hyperparam_search_toe_baseline.sbatch | SLURM hyperparam search (toe, baseline detect). |
| sbatch/hyperparam_search_toe.sbatch | SLURM quick hyperparam search (toe). |
| sbatch/hyperparam_search_finger_obb_aligned.sbatch | SLURM hyperparam search (finger, OBB aligned). |
| sbatch/hyperparam_search_finger_obb.sbatch | SLURM hyperparam search (finger, crop+rotate OBB). |
| sbatch/hyperparam_search_finger_baseline.sbatch | SLURM hyperparam search (finger, baseline detect). |
| sbatch/hyperparam_search_finger.sbatch | SLURM quick hyperparam search (finger). |
| pyproject.toml | Add pandas/lightning deps (+ tensorboard dev dep). |
| ml-morph/utils/utils.py | Classic ml-morph utility functions (xml/tps helpers). |
| ml-morph/utils/__init__.py | Package init + re-exports. |
| ml-morph/shape_trainer.py | Classic dlib shape predictor training script. |
| ml-morph/shape_tester.py | Classic dlib shape predictor testing script. |
| ml-morph/scripts/training/hyperparameter_search.py | Grid search for dlib shape predictor hyperparams. |
| ml-morph/scripts/train_workflow.py | Config-driven classic dlib workflow runner. |
| ml-morph/scripts/preprocessing/tps_to_xml.py | TPS→XML converter (no dlib dependency). |
| ml-morph/scripts/preprocessing/split_train_val_test.py | Train/val/test split + XML regeneration utility. |
| ml-morph/scripts/preprocessing/remove_landmarks_from_tps.py | Remove selected landmarks from TPS files. |
| ml-morph/scripts/preprocessing/merge_tps_files.py | Merge TPS files into consolidated TPS. |
| ml-morph/scripts/preprocessing/generate_yolo_bbox_xml.py | Replace XML bboxes with YOLO-derived (incl OBB/crop-rotate). |
| ml-morph/scripts/preprocessing/generate_obb_from_tps.py | Generate OBB labels from TPS landmarks. |
| ml-morph/scripts/preprocessing/extract_scale_tps.py | Extract scale landmarks into separate TPS files. |
| ml-morph/scripts/preprocessing/consolidate_all_tps.py | Consolidate TPS by category with optional landmark removal. |
| ml-morph/scripts/plot_hyperparam_results.py | Plot hyperparam search results. |
| ml-morph/scripts/evaluate.py | Standalone evaluation helper for PyTorch workflow. |
| ml-morph/requirements.txt | ml-morph requirements list. |
| ml-morph/preprocessing.py | Original ml-morph preprocessing entrypoint. |
| ml-morph/prediction.py | Original ml-morph prediction entrypoint. |
| ml-morph/makefile | Convenience install make target. |
| ml-morph/detector_trainer.py | Original ml-morph detector trainer. |
| ml-morph/detector_tester.py | Original ml-morph detector tester. |
| ml-morph/configs/toe_training_yolo_obb.yaml | Toe dlib training config using OBB-derived bboxes. |
| ml-morph/configs/toe_training_yolo_bbox.yaml | Toe dlib training config using detect-derived bboxes. |
| ml-morph/configs/toe_training_yolo_baseline.yaml | Toe baseline config for detect comparison. |
| ml-morph/configs/toe_training.yaml | Toe classic dlib workflow config. |
| ml-morph/configs/finger_training_yolo_obb.yaml | Finger dlib training config using OBB-derived bboxes. |
| ml-morph/configs/finger_training_yolo_bbox.yaml | Finger dlib training config using detect-derived bboxes. |
| ml-morph/configs/finger_training_yolo_baseline.yaml | Finger baseline config for detect comparison. |
| ml-morph/configs/default.yaml | Default config template for classic workflow. |
| ml-morph/README_ml-morph.md | Upstream-style ml-morph README content. |
| ml-morph/README.md | Repo-integrated ml-morph README and workflow guide. |
| ml-morph/.gitignore | Ignore generated training artifacts/crops/results. |
| docs/assets/bilateral_detection_plan/pre-augmentated/1201_flipud.jpg | Added documentation asset image. |
| docs/assets/bilateral_detection_plan/pre-augmentated/1186_flipud.jpg | Added documentation asset image. |
| docs/assets/bilateral_detection_plan/maunal-labeling/1237.jpg | Added documentation asset image. |
| docs/assets/bilateral_detection_plan/maunal-labeling/1126.jpg | Added documentation asset image. |
| docs/assets/bilateral_detection_plan/maunal-labeling/1111.jpg | Added documentation asset image. |
| docs/assets/bilateral_detection_plan/maunal-labeling/1020.jpg | Added documentation asset image. |
| docs/assets/bilateral_detection_plan/conf_Iou_treshhold_adjust/1007.jpg | Added documentation asset image. |
| docs/assets/bilateral_detection_plan/conf_Iou_treshhold_adjust/1004.jpg | Added documentation asset image. |
| docs/INFERENCE_WITH_FLIP.md | Documentation for flip inference strategy. |
| docs/EXPERIMENT_CROP_ROTATE_OBB.md | Write-up of crop+rotate OBB failed experiment. |
| docs/COMPARISON_BASELINE_VS_OBB.md | Baseline vs OBB comparison results and reproduction steps. |
| configs/H5_obb_noflip.yaml | YOLO dataset config for no-flip 6-class OBB dataset. |
| configs/H1_obb_6class.yaml | YOLO dataset config for merged 6-class OBB dataset. |
| .gitignore | Ignore logs/scratch and generated assets. |
Comments suppressed due to low confidence (1)
scripts/inference/predict.py:32
- A '--task' argument was added but isn't used anywhere, and model discovery is hardcoded to runs/detect/. This is confusing for users trying OBB inference. Either remove the flag, or use it to select runs/obb/ for default model paths and/or pass the intended task through to prediction logic.
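One way to act on this suggestion (a hedged sketch; `default_weights` and `run_name` are hypothetical names, not the script's actual API):

```python
from pathlib import Path

def default_weights(run_name: str, task: str = "detect") -> Path:
    """Derive a default weights path from the YOLO task.

    Lets --task also steer model discovery: 'detect' -> runs/detect/,
    'obb' -> runs/obb/, matching Ultralytics' output layout.
    """
    return Path("runs") / task / run_name / "weights" / "best.pt"
```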
```python
import argparse
import os
from pathlib import Path
import cv2
import numpy as np
from ultralytics import YOLO
import sys
```
Imports 'os' and 'sys' are unused. With ruff enabled, this will be reported as unused imports. Please remove them or use them.
```diff
 image_dir = get_opt('image-dir', None)
-label_dir = get_opt('label-dir', None)
+label_dir = get_opt('labels-dir', None)
 output_dir = get_opt('output-dir', 'data/processed')
```
The config key was changed from 'label-dir' to 'labels-dir', but existing configs (e.g. configs/H1.yaml under extracting) still use 'label-dir'. This will cause label_dir to be None and the script to error. Consider supporting both keys for backward compatibility or updating all configs to use 'labels-dir'.
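A backward-compatible lookup could look like this (a sketch; `get_opt_compat` is a hypothetical helper, and the real `get_opt` signature may differ):

```python
def get_opt_compat(config: dict, *keys, default=None):
    """Return the value of the first config key that is present.

    Supports renamed keys (e.g. the new 'labels-dir' with a fallback
    to the older 'label-dir') so existing configs keep working.
    """
    for key in keys:
        if key in config:
            return config[key]
    return default

# Prefer the new key, fall back to the legacy one:
# label_dir = get_opt_compat(cfg, 'labels-dir', 'label-dir')
```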
```python
print(f"Warning: Texture file for model '{stem}' is missing.")
print(f"  Please provide it at: /Users/leyangloh/Downloads/3D_Fish/mussels_31 (or verify filename matches model name).")
```
This warning message references an unrelated local path ('/Users/leyangloh/Downloads/3D_Fish/mussels_31'), which is confusing for this repo/workflow. Please replace with a repo-relevant instruction (e.g., print the searched paths and suggest passing --source explicitly).
```python
# Paths
img_path = Path("/storage/ice-shared/cs8903onl/miami_fall_24_jpgs/1001.jpg")
bbox_model = Path("/home/hice1/yloh30/scratch/Lizard_Toepads/yolo_bounding_box.pt")
obb_model = Path("/home/hice1/yloh30/scratch/Lizard_Toepads/runs/obb/H1_obb_2class2/weights/best.pt")
output_dir = Path("inference_results/progression")
```
The script hardcodes absolute, environment-specific paths for the image and model weights. This makes it hard to reuse outside the original machine/HPC setup. Consider adding CLI arguments (or reading from configs/) for these paths with the current values as examples in the README/docs.
Hi Leyang, I think what Copilot said might be right: some of these paths are hardcoded under your user directory. We may want to make these scripts config-driven. Also, there is a centralized YOLO model download script, scripts/download_models.py; you might just want to add the YOLO OBB model download code there.
```python
# 1. Standard Inference
# We keep bot_finger(2), bot_toe(3), ruler(4), id(5)
# We IGNORE up_finger(0) and up_toe(1) from this pass as they are inaccurate axis-aligned boxes
results_orig = model.predict(img, imgsz=1280, conf=conf, iou=iou, verbose=False)[0]
```
The comment about which class IDs are kept in the standard pass doesn't match the 2-class model described above (it references bot_finger(2)/bot_toe(3)/ruler/id). This is misleading for readers and future maintenance; please update it to match the actual class IDs used in the code (0/1 only).
```yaml
target_class: toe

inference:
  predictor: toe_predictor_yolo_baseline.datyes
```
The predictor filename looks malformed ('toe_predictor_yolo_baseline.datyes'), which will break downstream inference steps that read this path. It should point to a real .dat file name (likely 'toe_predictor_yolo_baseline.dat').
Suggested change:

```diff
-predictor: toe_predictor_yolo_baseline.datyes
+predictor: toe_predictor_yolo_baseline.dat
```
```python
# Run YOLO
results = model(img_path, conf=conf_threshold, device=0, verbose=False)
best_box, best_xywhr = find_best_toe_detection(
```
YOLO inference is hardcoded to run on GPU device 0. This will fail on CPU-only machines or on nodes where the GPU isn't device 0. Please make device configurable (CLI arg / config) and pass it through, or omit 'device=' to let Ultralytics decide.
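A minimal sketch of a configurable `--device` flag (the `parse_device` helper is a hypothetical name; Ultralytics accepts `'cpu'`, a GPU index, or a list of indices, and chooses a device itself when none is given):

```python
import argparse

def parse_device(value):
    """Normalize a --device CLI value for Ultralytics.

    Accepts 'cpu', a single GPU index ('0'), or a comma list ('0,1').
    Returning None lets Ultralytics pick a device automatically.
    """
    if value is None or value == "auto":
        return None
    if value == "cpu":
        return "cpu"
    return [int(i) for i in value.split(",")] if "," in value else int(value)

parser = argparse.ArgumentParser()
parser.add_argument("--device", default="auto",
                    help="'cpu', GPU index like '0', '0,1', or 'auto'")
# args = parser.parse_args()
# results = model(img_path, conf=conf_threshold,
#                 device=parse_device(args.device), verbose=False)
```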
```python
# Update XML: change image path to crop, update box, update landmarks
image_elem.set("file", crop_path)
box_elem.set("left", str(new_left))
```
This updates the path inside the per-box loop. If an image contains multiple entries, the image path will be overwritten for all boxes, and earlier boxes/parts will no longer match the referenced image. To support multiple boxes, create separate `<image>` elements per crop (or only enable crop+rotate when there is exactly one box per image and validate that assumption).
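A minimal sketch of the per-crop split this comment suggests, using `xml.etree` (the element and attribute names mirror dlib's training-XML format; the crop naming scheme is hypothetical):

```python
import copy
import xml.etree.ElementTree as ET

def split_boxes_into_crops(image_elem: ET.Element):
    """Split a multi-box <image> element into one <image> per box.

    Each clone keeps exactly one <box>, so a per-crop file path can be
    set without invalidating the other boxes' coordinates.
    """
    clones = []
    for i, box in enumerate(image_elem.findall("box")):
        clone = ET.Element("image", dict(image_elem.attrib))
        # Hypothetical crop naming; the real script would derive this
        # from its own crop output paths.
        clone.set("file", f"{image_elem.get('file')}_crop{i}.jpg")
        clone.append(copy.deepcopy(box))
        clones.append(clone)
    return clones
```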
```python
x1 = max(0, x - px)
y1 = max(0, y - py)
x2 = min(img_w, x + w + px)
y2 = min(img_h, y + h + py)
return dlib.rectangle(int(x1), int(y1), int(x2), int(y2))
```
dlib.rectangle expects right/bottom coordinates to be inclusive. Using x2=min(img_w, ...) and y2=min(img_h, ...) can produce x2==img_w or y2==img_h (one past the last valid pixel). Clamp to img_w-1/img_h-1 (and similarly in the rotated-crop path) to avoid out-of-bounds rectangles.
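The suggested clamp, written out (a sketch; it returns plain ints so the result can feed `dlib.rectangle(*...)`):

```python
def clamp_padded_box(x, y, w, h, pad_frac, img_w, img_h):
    """Pad a box and clamp it to valid *inclusive* pixel coordinates.

    dlib.rectangle treats right/bottom as the last valid pixel, so the
    maximum allowed values are img_w - 1 and img_h - 1, not img_w/img_h.
    """
    px, py = int(w * pad_frac), int(h * pad_frac)
    x1 = max(0, x - px)
    y1 = max(0, y - py)
    x2 = min(img_w - 1, x + w + px)  # inclusive right edge
    y2 = min(img_h - 1, y + h + py)  # inclusive bottom edge
    return int(x1), int(y1), int(x2), int(y2)

# rect = dlib.rectangle(*clamp_padded_box(x, y, w, h, 0.1, img_w, img_h))
```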
```python
from ultralytics import YOLO
import dlib
import sys
```
Unused imports ('dlib', 'sys') will be flagged by ruff and can fail CI/lint. Please remove them or use them.
|
I'm actually not quite sure how to reproduce the preprocessing and training steps, as there are lots of new scripts without step-by-step docs. That makes the PR a little difficult for others to use. Also, may I know how these new scripts couple with / work alongside the existing scripts? It might be good to reference #8, where there are some updates in README.md pointing out how to use the new scripts.
Complete pipeline for lizard toepad landmark prediction using dlib shape predictors
(ml-morph) with YOLO-detected bounding boxes. Compares two detection approaches:
baseline detect bounding boxes (H5) and OBB detections converted to axis-aligned boxes.
Results: Baseline (H5) vs OBB (Axis-Aligned)
Test error = average pixel deviation between predicted and ground-truth landmarks (144
hyperparameter configs searched per condition).
OBB wins both categories. Oriented bounding boxes converted to axis-aligned rects with
30% padding produce tighter, more relevant crops for the shape predictor.
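The winning conversion can be illustrated as follows (a sketch under the assumption that the OBB is given as center, width, height, and angle in radians; the real scripts may consume YOLO's `xywhr` tensors directly):

```python
import math

def obb_to_padded_aabb(cx, cy, w, h, angle, pad_frac=0.30):
    """Convert an oriented box to a padded axis-aligned rectangle.

    The axis-aligned extent of a rotated w x h box is
    |w*cos| + |h*sin| wide and |w*sin| + |h*cos| tall; both halves are
    then inflated by pad_frac (30% in the winning configuration).
    """
    c, s = math.cos(angle), math.sin(angle)
    half_w = (abs(w * c) + abs(h * s)) / 2 * (1 + pad_frac)
    half_h = (abs(w * s) + abs(h * c)) / 2 * (1 + pad_frac)
    return cx - half_w, cy - half_h, cx + half_w, cy + half_h
```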
Failed Experiment: Crop+Rotate OBB
Attempted to crop and rotate images so OBB becomes upright with 10% padding. Results
were significantly worse (toe: 92.02, finger: 76.11) due to rotation interpolation
artifacts and landmark transformation issues. See docs/EXPERIMENT_CROP_ROTATE_OBB.md.
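For reference, the landmark-side transform that crop+rotate requires looks roughly like this (a sketch mirroring OpenCV's `getRotationMatrix2D` convention; not the PR's actual code):

```python
import numpy as np

def rotate_landmarks(points, angle_deg, center):
    """Rotate landmark coordinates to follow a crop+rotate transform.

    Uses the cv2.getRotationMatrix2D convention: a positive angle
    rotates the *image* counter-clockwise (y-axis pointing down), so
    points are mapped with the matching 2x3 affine matrix.
    """
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    cx, cy = center
    # 2x3 affine matrix rotating about `center`.
    m = np.array([[c, s, (1 - c) * cx - s * cy],
                  [-s, c, s * cx + (1 - c) * cy]])
    pts = np.asarray(points, dtype=float)
    return pts @ m[:, :2].T + m[:, 2]
```

Any mismatch between this point transform and the interpolation actually applied to the pixels shifts every landmark label, which is consistent with the degraded results reported above.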
What's Included
- create_noflip_obb_dataset.py
- Inference with flip strategy
- Hyperparameter search (num_trees and related dlib options)
- Preprocessing for the three conditions (baseline, OBB, OBB-aligned)