This guide provides a step-by-step workflow for training YOLOv3 to detect a custom object, in this case, a specific cow. The workflow combines both local and cloud-based computational resources for optimal performance.
The full dataset, including images, YOLO labels, and Darknet configuration files, is available on Kaggle.
- Introduction
- Hardware and Software Environment
- Preprocessing Input Data
- Labeling Input Images
- Configuring YOLOv3
- Training YOLOv3 on Google Colab
- References
YOLO (You Only Look Once) is a state-of-the-art object detection model capable of identifying and localizing objects in images or video frames. Pre-trained versions are available for common objects (COCO dataset, 80 classes). In this project, we train YOLOv3 to detect a specific cow, but the workflow can be adapted for any object.
We use:
- Darknet: Open-source neural network framework for YOLO.
- Google Colab: Linux-based GPU environment for training.
- Local machine: Windows-based environment for preprocessing, manual labeling, and tracking using OpenCV, LabelImg, and dlib.
- Image and Video Preprocessing
  - Done locally using OpenCV and FFmpeg.
  - Includes resizing, frame extraction, and format conversion.
- Manual or Semi-Automated Labeling
  - Use LabelImg for manual labeling.
  - Semi-automated labeling: track objects in video with dlib and save the bounding boxes as YOLO labels.
- Training and Testing YOLOv3
  - Conducted on Google Colab with GPU support.
  - Custom anchor sizes are calculated and YOLOv3 is configured for your dataset.
- Interoperability
  - Data is exchanged between the local machine and Google Drive to manage preprocessing, training, and post-processing.
YOLO requires both images and corresponding `.txt` label files. Each line in a label file contains:

```
<class_id> <center_x> <center_y> <width> <height>
```

- `<class_id>`: object class (0 for a single class)
- `<center_x>`, `<center_y>`: center of the bounding box, relative to the image dimensions
- `<width>`, `<height>`: width and height of the box, relative to the image dimensions
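To make the normalization concrete, the conversion from a pixel-space box to a YOLO label line can be sketched in Python; the function name and the corner-coordinate box convention are illustrative choices, not part of the original workflow:

```python
def to_yolo_line(x_min, y_min, x_max, y_max, img_w, img_h, class_id=0):
    """Convert a pixel-space bounding box (corner coordinates) to a
    YOLO label line with coordinates normalized to [0, 1]."""
    center_x = (x_min + x_max) / 2.0 / img_w
    center_y = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {center_x:.6f} {center_y:.6f} {width:.6f} {height:.6f}"

# A 200x100 px box with corners (100, 100) and (300, 200) in a 400x400 image:
print(to_yolo_line(100, 100, 300, 200, 400, 400))
# → 0 0.500000 0.375000 0.500000 0.250000
```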
Steps:
- Take snapshots of the target cow from video using FFmpeg:
  ```
  ffmpeg -i input_video.mp4 -vf "fps=0.5,scale=1920:-1" -q:v 1 snapshot%d.png
  ```
- Label the snapshots in LabelImg:
  ```
  # Navigate to the LabelImg folder
  cd C:\your_path
  # Run LabelImg
  python labelImg.py
  ```
- Draw bounding boxes and assign the class.
- Save the annotations as `.txt` files for YOLO.
- If LabelImg assigns a different class ID (e.g., 15), adjust it to 0 manually or with the provided script.
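The provided script is not reproduced here, but a minimal stand-in that rewrites the class ID in every label file of a folder could look like this (the directory layout is an assumption):

```python
from pathlib import Path

def fix_class_ids(label_dir, new_id=0):
    """Rewrite the leading class ID on every line of every YOLO .txt
    label file in label_dir (e.g., LabelImg's 15 -> 0)."""
    for txt in Path(label_dir).glob("*.txt"):
        fixed = []
        for line in txt.read_text().splitlines():
            parts = line.split()
            if parts:                      # skip blank lines
                parts[0] = str(new_id)
                fixed.append(" ".join(parts))
        txt.write_text("\n".join(fixed) + "\n")
```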
- Select a clear frame of the target object and manually annotate the bounding box.
- Use dlib correlation tracker to track the object in video.
- Save frames and bounding boxes at intervals, converting coordinates from dlib format to YOLO format.
- Save snapshots and `.txt` labels with consistent filenames.
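The tracking loop described above can be sketched as follows. The file names, initial box, and sampling interval are placeholders; the geometry helper mirrors the YOLO label format, and OpenCV/dlib are imported inside the loop function so the helper stays dependency-free:

```python
def rect_to_yolo_line(left, top, right, bottom, img_w, img_h, class_id=0):
    """Convert a tracker rectangle (pixel corners) to a YOLO label line."""
    cx = (left + right) / 2.0 / img_w
    cy = (top + bottom) / 2.0 / img_h
    return (f"{class_id} {cx:.6f} {cy:.6f} "
            f"{(right - left) / img_w:.6f} {(bottom - top) / img_h:.6f}")

def track_and_label(video_path, init_box, out_stem, every_n=10):
    """Track one object through a video with dlib's correlation tracker,
    saving a frame and a matching YOLO label every `every_n` frames."""
    import cv2   # imported here so the helper above has no hard dependency
    import dlib
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    left, top, right, bottom = init_box      # manually annotated first box
    tracker = dlib.correlation_tracker()
    tracker.start_track(frame, dlib.rectangle(left, top, right, bottom))
    count = 0
    while ok:
        tracker.update(frame)
        if count % every_n == 0:
            pos = tracker.get_position()     # dlib drectangle (float coords)
            h, w = frame.shape[:2]
            name = f"{out_stem}{count}"      # consistent image/label names
            cv2.imwrite(name + ".png", frame)
            with open(name + ".txt", "w") as f:
                f.write(rect_to_yolo_line(pos.left(), pos.top(),
                                          pos.right(), pos.bottom(), w, h))
        ok, frame = cap.read()
        count += 1
    cap.release()
```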
Download Darknet YOLOv3 from AlexeyAB/darknet.
- Edit the `.cfg` file:

  ```
  batch = 64
  subdivisions = 64
  max_batches = 2000    # for 1 class
  classes = 1
  filters = 18          # (classes + 5) * 3
  ```

- Update the `[yolo]` layers with the calculated anchors.
- Custom anchors (from k-means clustering):

  ```
  20 27, 16 23, 16 31, 18 27, 15 21, 19 26, 15 27, 15 25, 13 23
  ```
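For reference, the clustering step can be approximated with plain Euclidean k-means over the labels' (width, height) pairs. Note that AlexeyAB's Darknet has its own anchor-calculation routine using an IoU-based distance, so treat this as an illustrative simplification, not the exact procedure behind the anchors above:

```python
import random

def kmeans_anchors(boxes, k=9, iters=50, seed=0):
    """Cluster (width, height) pairs with plain Euclidean k-means.
    Returns k anchor pairs sorted ascending."""
    rnd = random.Random(seed)
    centers = rnd.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            j = min(range(k),
                    key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2)
            clusters[j].append((w, h))
        for i, c in enumerate(clusters):
            if c:   # keep the old center if a cluster ends up empty
                centers[i] = (sum(w for w, _ in c) / len(c),
                              sum(h for _, h in c) / len(c))
    return sorted(centers)
```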
`obj.data`:

```
classes = 1
train = /content/darknet/train.txt
valid = /content/darknet/valid.txt
names = /content/darknet/obj.names
backup = /content/darknet/backup/
```
`obj.names`:

```
cow13
```
`train.txt` / `valid.txt`:

```
/content/darknet/data/train/image1.jpg
/content/darknet/data/train/image2.jpg
...
```
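The two list files can be produced with a short script; the split fraction, seed, and path strings below are placeholders:

```python
import random
from pathlib import Path

def write_splits(image_paths, train_file, valid_file, valid_frac=0.1, seed=42):
    """Shuffle image paths deterministically and write Darknet-style
    train/valid list files (one absolute path per line)."""
    paths = sorted(image_paths)
    random.Random(seed).shuffle(paths)
    n_valid = max(1, int(len(paths) * valid_frac))
    valid, train = paths[:n_valid], paths[n_valid:]
    Path(train_file).write_text("\n".join(train) + "\n")
    Path(valid_file).write_text("\n".join(valid) + "\n")
    return train, valid
```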
Upload the following files to Google Drive:
- `backup` folder
- `obj.data`
- `obj.names`
- `train.txt`
- `valid.txt`
Run in Colab:
```
# Add execute permission
!chmod +x /content/drive/MyDrive/darknet/darknet
# Change directory
%cd /content/drive/MyDrive/darknet
# Start training
!./darknet detector train /content/drive/MyDrive/darknet/data/obj.data /content/drive/MyDrive/darknet/cfg/yolov3.cfg -dont_show
```

- Trained weights are saved to the `backup` folder.
- Test detection on a video:
  ```
  !./darknet detector demo /content/drive/MyDrive/darknet/data/obj.data \
      /content/drive/MyDrive/darknet/cfg/yolov3.cfg \
      /content/drive/MyDrive/darknet/backup/yolov3_best.weights \
      -dont_show -out /content/drive/MyDrive/outputVideo.mp4 \
      /content/drive/MyDrive/cutVideo.mp4
  ```
- Detect on a single image:
  ```
  !./darknet detector test /content/drive/MyDrive/darknet/data/obj.data \
      /content/drive/MyDrive/darknet/cfg/yolov3.cfg \
      /content/drive/MyDrive/darknet/backup/yolov3_best.weights \
      /content/drive/MyDrive/cow3.png -dont_show -out /content/drive/MyDrive/outputImage.jpg
  ```

- Redmon, J., Divvala, S., Girshick, R., Farhadi, A., You Only Look Once: Unified, Real-Time Object Detection, 2016.
- Lin, T.Y., et al., Microsoft COCO: Common Objects in Context, 2014.
- FFmpeg
- LabelImg
- AlexeyAB Darknet
- Kaggle dataset for YOLOv3 Cow Detection: https://www.kaggle.com/datasets/gayebartos/yolov3-cow-detection