YOLOv4: Train on Custom Dataset
Clone and build Darknet

Clone the darknet repo:

```shell
git clone https://github.com/AlexeyAB/darknet
```

Change the Makefile to have GPU and OpenCV enabled:

```shell
cd darknet
sed -i 's/OPENCV=0/OPENCV=1/' Makefile
sed -i 's/GPU=0/GPU=1/' Makefile
sed -i 's/CUDNN=0/CUDNN=1/' Makefile
sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/' Makefile
```
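The same flag flips can be done from Python when `sed` is not available; `enable_flags` below is my own illustrative helper, not part of darknet:

```python
import re

def enable_flags(makefile_text, flags=("GPU", "CUDNN", "CUDNN_HALF", "OPENCV")):
    """Flip the given build flags from 0 to 1 in a darknet Makefile."""
    for flag in flags:
        makefile_text = re.sub(rf"^{flag}=0", f"{flag}=1", makefile_text,
                               flags=re.MULTILINE)
    return makefile_text

# Usage sketch:
# with open("Makefile") as f:
#     text = enable_flags(f.read())
# with open("Makefile", "w") as f:
#     f.write(text)
```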
Verify CUDA:

```shell
/usr/local/cuda/bin/nvcc --version
```

Compile on Linux using make

The relevant Makefile options:

- `GPU=1`: build with CUDA to accelerate by using GPU
- `CUDNN=1`: build with cuDNN v5-v7 to accelerate training by using GPU
- `CUDNN_HALF=1`: build for Tensor Cores (on Titan V / Tesla V100 / DGX-2 and later), speedup Detection 3x, Training 2x
- `OPENCV=1`: build with OpenCV 4.x/3.x/2.4.x - allows detecting on video files and video streams from network cameras or web-cams
- `DEBUG=1`: build a debug version of Yolo
- `OPENMP=1`: build with OpenMP support to accelerate Yolo by using multi-core CPU

Then build darknet by running the `make` command:

```shell
make
```

Prepare custom dataset
The custom dataset should be in YOLOv4 or darknet format:

- For each `.jpg` image file, there should be a corresponding `.txt` file in the same directory, with the same name but with the `.txt` extension. For example, if there's a `.jpg` image named `BloodImage_00001.jpg`, there should also be a corresponding `.txt` file named `BloodImage_00001.txt`.
- This `.txt` file contains the object number and object coordinates on the image, one object per line, in the format:

```
<object-class> <x_center> <y_center> <width> <height>
```

where:

- `<object-class>`: integer object number from `0` to `(classes-1)`
- `<x_center> <y_center> <width> <height>`: float values relative to the width and height of the image; they can range in `(0.0, 1.0]`
- `<x_center> <y_center>` are the center of the rectangle (not the top-left corner)
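To make the format concrete, here is a small converter (my own helper, not part of darknet) that turns an absolute pixel box `(x_min, y_min, x_max, y_max)` into a YOLO annotation line:

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert an absolute pixel box to a normalized YOLO annotation line."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# A 100x100 pixel box centered at (200, 150) in a 400x300 image:
print(to_yolo_line(0, 150, 100, 250, 200, 400, 300))
# -> 0 0.500000 0.500000 0.250000 0.333333
```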
Configure files for training

For training `cfg/yolov4-custom.cfg`, download the pre-trained weights file yolov4.conv.137:

```shell
cd darknet
wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137
```
In the folder `./cfg`, create a custom config file (let's call it `custom-yolov4-detector.cfg`) with the same content as `yolov4-custom.cfg`, and:

- change line batch to `batch=64`
- change line subdivisions to `subdivisions=16`
- change line max_batches to `classes*2000`, but NOT less than the number of training images and NOT less than 6000 (e.g. `max_batches=6000` if you train for 3 classes)
- change line steps to 80% and 90% of max_batches (e.g. `steps=4800,5400`)
- set network size `width=416 height=416` or any value that is a multiple of 32
- change line `classes=80` to your number of objects in each of the 3 `[yolo]` layers
- change `filters=255` to filters = (classes + 5) × 3 in the 3 `[convolutional]` layers before each `[yolo]` layer; keep in mind that it only has to be the last `[convolutional]` before each of the `[yolo]` layers

Note: do not write `filters=(classes + 5) x 3` literally in the cfg-file! It has to be the specific number. E.g. if `classes=1` then `filters=18`; if `classes=2` then `filters=21`.
So for example, for 2 objects, your custom config file should differ from `yolov4-custom.cfg` in such lines in each of the 3 `[yolo]` layers:

```
[convolutional]
filters=21

[region]
classes=2
```

When using `[Gaussian_yolo]` layers, change `filters=57` to filters = (classes + 9) × 3 in the 3 `[convolutional]` layers before each `[Gaussian_yolo]` layer.
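Since a wrong `filters` value is a common source of config errors, a tiny helper (my own sketch) computes it for both layer types from the formulas above:

```python
def yolo_filters(classes, coords_extra=5, anchors_per_layer=3):
    """filters = (classes + 5) * 3 for [yolo]; use coords_extra=9 for [Gaussian_yolo]."""
    return (classes + coords_extra) * anchors_per_layer

print(yolo_filters(2))                  # [yolo], 2 classes -> 21
print(yolo_filters(2, coords_extra=9))  # [Gaussian_yolo], 2 classes -> 33
print(yolo_filters(80))                 # COCO default -> 255
```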
Create file `obj.names` in the directory `data/`, with the object names, each on a new line.

Create file `obj.data` in the directory `data/`, containing (where classes = number of objects), for example if we have two objects:

```
classes = 2
train = data/train.txt
valid = data/test.txt
names = data/obj.names
backup = backup/
```
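Both files are plain text and easy to generate programmatically; a minimal sketch using the same paths as the example above (`write_obj_files` is my own helper, not part of darknet):

```python
def write_obj_files(class_names, data_dir="data"):
    """Write obj.names (one class per line) and obj.data for darknet."""
    with open(f"{data_dir}/obj.names", "w") as f:
        f.write("\n".join(class_names) + "\n")
    with open(f"{data_dir}/obj.data", "w") as f:
        f.write(
            f"classes = {len(class_names)}\n"
            "train = data/train.txt\n"
            "valid = data/test.txt\n"
            "names = data/obj.names\n"
            "backup = backup/\n"
        )
```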
Put image files (`.jpg`) of your objects in the directory `data/obj/`.

Create `train.txt` in the directory `data/` with the filenames of your images, each filename on a new line, with paths relative to `darknet`. For example:

```
data/obj/img1.jpg
data/obj/img2.jpg
data/obj/img3.jpg
```
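Rather than typing the list by hand, `train.txt` can be generated by globbing the image directory; a sketch assuming the layout above (the function name and defaults are my own):

```python
import glob

def write_train_list(image_dir="data/obj", out_path="data/train.txt", ext="jpg"):
    """List all images in image_dir into out_path, one path per line."""
    paths = sorted(glob.glob(f"{image_dir}/*.{ext}"))
    with open(out_path, "w") as f:
        f.write("\n".join(paths) + "\n")
    return paths
```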
Download pre-trained weights for the convolutional layers and put them into the `darknet` directory (root directory of the project):

- for `yolov4.cfg`, `yolov4-custom.cfg` (162 MB): yolov4.conv.137
- for `yolov4-tiny.cfg`, `yolov4-tiny-3l.cfg`, `yolov4-tiny-custom.cfg` (19 MB): yolov4-tiny.conv.29
- for `csresnext50-panet-spp.cfg` (133 MB): csresnext50-panet-spp.conv.112
- for `yolov3.cfg`, `yolov3-spp.cfg` (154 MB): darknet53.conv.74
- for `yolov3-tiny-prn.cfg`, `yolov3-tiny.cfg` (6 MB): yolov3-tiny.conv.11
- for `enet-coco.cfg` (EfficientNetB0-Yolov3) (14 MB): enetb0-coco.conv.132
Start training

```shell
./darknet detector train data/obj.data custom-yolov4-detector.cfg yolov4.conv.137 -dont_show
```

The file `yolo-obj_last.weights` will be saved to `backup/` every 100 iterations.

`-dont_show`: disables the Loss-Window; use it if you train on a computer without a monitor (e.g. a remote server).

To see the mAP & loss chart during training on a remote server:

- use the command:

```shell
./darknet detector train data/obj.data yolo-obj.cfg yolov4.conv.137 -dont_show -mjpeg_port 8090 -map
```

- then open the URL `http://ip-address:8090` in a Chrome/Firefox browser
After training is complete, you can get the weights from `backup/`.

If you want the training to output only the main information (e.g. loss, mAP, remaining training time) instead of full logging, you can use this command:

```shell
./darknet detector train data/obj.data custom-yolov4-detector.cfg yolov4.conv.137 -dont_show -map 2>&1 | tee log/train.log | grep -E "hours left|mean_average"
```

Then the output will look like the following:

```
1189: 1.874030, 2.934438 avg loss, 0.002610 rate, 2.930427 seconds, 76096 images, 3.905244 hours left
```
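Each of those filtered lines has a fixed shape, so the numbers can be extracted with a small parser if you want to plot the loss curve; the regex below is my own reconstruction of the line format shown above:

```python
import re

LOG_PATTERN = re.compile(
    r"(?P<iteration>\d+): (?P<loss>[\d.]+), (?P<avg_loss>[\d.]+) avg loss, "
    r"(?P<rate>[\d.]+) rate, (?P<seconds>[\d.]+) seconds, (?P<images>\d+) images, "
    r"(?P<hours_left>[\d.]+) hours left"
)

def parse_log_line(line):
    """Extract the numeric fields from a darknet training log line, or None."""
    m = LOG_PATTERN.search(line)
    return {k: float(v) for k, v in m.groupdict().items()} if m else None

line = ("1189: 1.874030, 2.934438 avg loss, 0.002610 rate, "
        "2.930427 seconds, 76096 images, 3.905244 hours left")
print(parse_log_line(line)["avg_loss"])  # -> 2.934438
```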
Notes

- If during training you see `nan` values in the `avg` (loss) field, then training is going wrong! 🤦‍♂️ But if `nan` appears only in some other lines, then training is going well.
- If an `Out of memory` error occurs, then in the `.cfg` file you should increase `subdivisions=16` to 32 or 64.
Train tiny-YOLO

Do all the same steps as for the full yolo model described above, with the following exceptions:

Download the file with the first 29 convolutional layers of yolov4-tiny:

```shell
wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.conv.29
```

(Or get this file from the yolov4-tiny.weights file by using the command `./darknet partial cfg/yolov4-tiny-custom.cfg yolov4-tiny.weights yolov4-tiny.conv.29 29`.)

Make your custom model `yolov4-tiny-obj.cfg` based on `cfg/yolov4-tiny-custom.cfg` instead of `yolov4.cfg`:
```python
import re

# num_classes: number of object classes
# num_train_images: number of training images
max_batches = max(num_classes * 2000, num_train_images, 6000)
steps1 = int(0.8 * max_batches)
steps2 = int(0.9 * max_batches)
num_filters = (num_classes + 5) * 3

# Assuming that we have already defined the following hyperparameters:
# - TINY_CONFIG_FILE: config file we're gonna use for training
# - SUBDIVISION: number of subdivisions per batch
# - WIDTH, HEIGHT: width and height of image
with open("cfg/yolov4-tiny-custom.cfg", "r") as reader, open(TINY_CONFIG_FILE, "w") as writer:
    content = reader.read()
    content = re.sub(r"subdivisions=\d*", f"subdivisions={SUBDIVISION}", content)
    content = re.sub(r"width=\d*", f"width={WIDTH}", content)
    content = re.sub(r"height=\d*", f"height={HEIGHT}", content)
    content = re.sub(r"max_batches = \d*", f"max_batches = {max_batches}", content)
    content = re.sub(r"steps=\d*,\d*", f"steps={steps1},{steps2}", content)
    content = re.sub(r"classes=\d*", f"classes={num_classes}", content)
    content = re.sub(r"pad=1\nfilters=\d*", f"pad=1\nfilters={num_filters}", content)
    writer.write(content)
```
Start training:

```shell
./darknet detector train data/obj.data yolov4-tiny-obj.cfg yolov4-tiny.conv.29
```
Google Colab Notebook

Small hacks to keep the colab notebook training:

- Open up the inspector view on Chrome
- Switch to the console window
- Paste the following code

```javascript
function ClickConnect(){
  console.log("Working");
  document
    .querySelector('#top-toolbar > colab-connect-button')
    .shadowRoot.querySelector('#connect')
    .click()
}
setInterval(ClickConnect, 60000)
```

and hit Enter. It will click the connect button every 60 seconds so that you don't get kicked off for being idle!
Convert YOLOv4 to TensorRT through ONNX

To convert YOLOv4 to a TensorRT engine through ONNX, I used the code from TensorRT_demos, following its step-by-step instructions. For more details about the code, check out this blog post.

Note that the code in this repo was designed to run on Jetson platforms. In my case, the conversion from YOLOv4 to a TensorRT engine was conducted on a Jetson Nano.

Convert YOLOv4 for custom trained models

To apply the conversion to custom trained models, see TensorRT YOLOv3 For Custom Trained Models. You need to stick to the naming convention `{yolo_version}-{custom_name}-{image_size}`. Otherwise you'll get errors during conversion.
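To catch naming mistakes before a long conversion run, the model name can be checked against that pattern first; the exact regex below is my own assumption about the convention (e.g. `yolov4-custom-416`), not taken from the TensorRT_demos repo:

```python
import re

# Assumed shape: {yolo_version}-{custom_name}-{image_size}
NAME_PATTERN = re.compile(r"^(yolov[34](?:-tiny)?)-(.+)-(\d+)$")

def check_model_name(name):
    """Return (yolo_version, custom_name, image_size) if name matches, else None."""
    m = NAME_PATTERN.match(name)
    return (m.group(1), m.group(2), int(m.group(3))) if m else None

print(check_model_name("yolov4-custom-416"))  # -> ('yolov4', 'custom', 416)
print(check_model_name("my-model"))           # -> None
```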
Reference
Guide from AlexeyAB/darknet repo: How to train (to detect your custom objects)
Tutorials

- 👨‍🏫 How to Train YOLOv4 on a Custom Dataset in Darknet
- Train YOLOv4-tiny on custom dataset: Train YOLOv4-tiny on Custom Data - Lightning Fast Object Detection
- YOLOv4 in the CLOUD: Build and Train Custom Object Detector (FREE GPU)