Spatial Transformer Network You Only Look Once (STN-YOLO) for Improved Object Detection
Yash Zambre, Ekdev Rajkitkul, Akshatha Mohan and Joshua Peeples
Zenodo. https://zenodo.org/records/10905984
Note: If you use this code, please cite it: Yash Zambre, Joshua Peeples, Akshatha Mohan and Ekdev Rajkitkul.
In this repository, we provide the code for the paper "Spatial Transformer Network You Only Look Once (STN-YOLO) for Improved Object Detection".
This code uses Python, PyTorch, and the YOLO model.
Please use PyTorch's website to download the necessary packages.
YOLO is the object detection model, and Ultralytics is the framework used to run it. Please follow the instructions on each website to install the modules.
Run demo.py in a Python IDE (e.g., Spyder) or from the command line.
STN-YOLO runs using the following functions.
- Initialize model
model, input_size = intialize_model(data, epochs, batch, device, pretrained, etc..)
- Prepare dataset(s) for model
The dataset should be in YOLOv8 format.
A sample dataset that we used for this project is given here. It is an in-house dataset from the Texas A&M AgriLife facility in College Station, TX.
Dataset
- Train model
model.train(data, epochs, batch, device, pretrained, etc..)
- Test model
model.test(data, epochs, batch, device, pretrained, etc..)
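For the dataset preparation step above, it may help to see what a YOLOv8-format label looks like. In the Ultralytics detection format, each image has a text file with one object per line: a zero-indexed class ID followed by the box center x, center y, width, and height, all normalized to [0, 1]. The sketch below converts such lines to pixel corner boxes. It is only an illustration and is not part of this repository; the function name parse_yolo_label and the sample values are hypothetical.

```python
from typing import List, Tuple

# (class_id, x_min, y_min, x_max, y_max), with coordinates in pixels
PixelBox = Tuple[int, float, float, float, float]


def parse_yolo_label(text: str, img_w: int, img_h: int) -> List[PixelBox]:
    """Convert YOLO label text (normalized center/size) to pixel corner boxes.

    Hypothetical helper for illustration; not part of the STN-YOLO code.
    """
    boxes: List[PixelBox] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 5:
            raise ValueError(f"line {line_no}: expected 5 fields, got {len(fields)}")
        class_id = int(fields[0])
        xc, yc, w, h = (float(v) for v in fields[1:])
        if not all(0.0 <= v <= 1.0 for v in (xc, yc, w, h)):
            raise ValueError(f"line {line_no}: values must be normalized to [0, 1]")
        boxes.append((
            class_id,
            (xc - w / 2) * img_w,  # left edge
            (yc - h / 2) * img_h,  # top edge
            (xc + w / 2) * img_w,  # right edge
            (yc + h / 2) * img_h,  # bottom edge
        ))
    return boxes


if __name__ == "__main__":
    # Two made-up objects in a 640x480 image
    sample = "0 0.5 0.5 0.25 0.5\n1 0.25 0.75 0.5 0.25\n"
    for box in parse_yolo_label(sample, img_w=640, img_h=480):
        print(box)
```

A check like this can be useful for spotting malformed label files before training, since a wrong field count or unnormalized coordinates will raise an error early.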
https://github.com/Advanced-Vision-and-Learning-Lab/STN-YOLO
└── root dir
    ├── demo.py // Run this. Main demo file.
    ├── Ultralytics
    │   ├── cfg/datasets.yaml
    │   ├── cfg/models/models.yaml (change for the addition of STN here)
    │   ├── data (does the data loading)
    │   ├── models (All the models in the Ultralytics framework are present here)
    │   └── modules/block.py (The STN is defined here with its localization network.)
    └── Utils // utility functions
        └── Network_functions.py // Contains functions to initialize, train, and test model.
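The inventory notes that the STN and its localization network live in modules/block.py. That implementation is not reproduced here. As general background, a spatial transformer usually has a localization network that predicts a 2x3 affine matrix, a grid generator that maps each output pixel to a sampling location in the input, and a sampler that interpolates the input at those locations. The pure-Python sketch below illustrates only the grid-generator step, assuming normalized coordinates in [-1, 1] with the corners of the image at -1 and 1. The function name affine_sampling_grid is hypothetical; in PyTorch, torch.nn.functional.affine_grid and grid_sample provide the corresponding operations.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in normalized [-1, 1] coordinates


def affine_sampling_grid(
    theta: Sequence[Sequence[float]], height: int, width: int
) -> List[List[Point]]:
    """Map every output pixel to the input location it should sample from.

    theta is a 2x3 affine matrix [[a, b, tx], [c, d, ty]]. This is an
    illustrative sketch, not the code in modules/block.py.
    """
    if height < 2 or width < 2:
        raise ValueError("height and width must be at least 2")
    (a, b, tx), (c, d, ty) = theta
    grid: List[List[Point]] = []
    for i in range(height):
        y = -1.0 + 2.0 * i / (height - 1)  # row index -> normalized y
        row: List[Point] = []
        for j in range(width):
            x = -1.0 + 2.0 * j / (width - 1)  # column index -> normalized x
            # Apply the affine transform to the output coordinate
            row.append((a * x + b * y + tx, c * x + d * y + ty))
        grid.append(row)
    return grid


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    zoom_in = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]]  # samples the central half
    print(affine_sampling_grid(identity, 3, 3)[0][0])
    print(affine_sampling_grid(zoom_in, 3, 3)[0][0])
```

With the identity matrix the grid reproduces the input coordinates unchanged, while scaling the matrix by 0.5 makes the output sample only the central region of the input, which is how an STN can zoom in on an object before detection.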
This source code is licensed under the license found in the LICENSE
file in the root directory of this source tree.
This product is Copyright (c) 2023 Yash Zambre and Ekdev Rajkitkul and Akshatha Mohan and Joshua Peeples. All rights reserved.
If you use the code, please cite the paper using one of the following entries.
Plain Text:
Yash Zambre and Ekdev Rajkitkul and Akshatha Mohan and Joshua Peeples, "Spatial Transformer Network You Only Look Once (STN-YOLO) for Improved Object Detection," 2024 23rd IEEE International Conference on Machine Learning and Applications (ICMLA), Miami, Florida, 2024, pp. 1-7. doi:https://doi.org/10.48550/arXiv.2407.21652 keywords:{Spatial transformer network, object detection, YOLO, plant phenotyping}
BibTeX:
@misc{zambre2024spatialtransformernetworkyolo,
title={Spatial Transformer Network YOLO Model for Agricultural Object Detection},
author={Yash Zambre and Ekdev Rajkitkul and Akshatha Mohan and Joshua Peeples},
year={2024},
eprint={2407.21652},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2407.21652},
}
