YOLOv4: Train on Custom Dataset
Clone and build Darknet

Clone the darknet repo:

```bash
git clone https://github.com/AlexeyAB/darknet
```

Change the Makefile to have GPU and OpenCV enabled:

```bash
cd darknet
sed -i 's/OPENCV=0/OPENCV=1/' Makefile
sed -i 's/GPU=0/GPU=1/' Makefile
sed -i 's/CUDNN=0/CUDNN=1/' Makefile
sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/' Makefile
```

Verify CUDA:

```bash
/usr/local/cuda/bin/nvcc --version
```

Compile on Linux using make:

```bash
make
```
The relevant Makefile options:

- `GPU=1`: build with CUDA to accelerate detection by using the GPU
- `CUDNN=1`: build with cuDNN v5-v7 to accelerate training by using the GPU
- `CUDNN_HALF=1`: build for Tensor Cores (on Titan V / Tesla V100 / DGX-2 and later); speeds up detection 3x and training 2x
- `OPENCV=1`: build with OpenCV 4.x/3.x/2.4.x; allows detection on video files and on video streams from network cameras or webcams
- `DEBUG=1`: build a debug version of Yolo
- `OPENMP=1`: build with OpenMP support to accelerate Yolo by using a multi-core CPU

These options must be set in the Makefile before running the make command.

Prepare custom dataset
The custom dataset should be in YOLOv4 or Darknet format:

- For each `.jpg` image file, there should be a corresponding `.txt` file in the same directory, with the same name but with the `.txt` extension. For example, if there's a `.jpg` image named `BloodImage_00001.jpg`, there should also be a corresponding `.txt` file named `BloodImage_00001.txt`.
- This `.txt` file contains the object number and object coordinates on the image, one object per line, in the format: `<object-class> <x_center> <y_center> <width> <height>`
  - `<object-class>`: integer object number from `0` to `(classes-1)`
  - `<x_center> <y_center> <width> <height>`: float values relative to the width and height of the image, in the range `(0.0, 1.0]`
  - `<x_center> <y_center>` are the center of the rectangle (not the top-left corner)
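As a sketch, a pixel-coordinate bounding box can be converted to this format like so (the function name and the example numbers are illustrative, not part of Darknet):

```python
def to_yolo(box, img_w, img_h):
    """Convert a pixel bbox (x_min, y_min, x_max, y_max) to YOLO format."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / img_w  # center, relative to image width
    y_center = (y_min + y_max) / 2 / img_h  # center, relative to image height
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return x_center, y_center, width, height

# a 100x200 box centered at (320, 240) in a 640x480 image:
# x_center=0.5, y_center=0.5, width=0.15625
print(to_yolo((270, 140, 370, 340), 640, 480))
```

Each line written to the `.txt` file would then be `f"{cls} {x_center} {y_center} {width} {height}"`.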
Configure files for training

For training with `cfg/yolov4-custom.cfg`, download the pre-trained weights file yolov4.conv.137:

```bash
cd darknet
wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137
```

In the folder `./cfg`, create a custom config file (let's call it `custom-yolov4-detector.cfg`) with the same content as in `yolov4-custom.cfg`, and:

- change line batch to `batch=64`
- change line subdivisions to `subdivisions=16`
- change line max_batches to `classes*2000`, but:
  - NOT less than the number of training images
  - NOT less than 6000
  e.g. `max_batches=6000` if you train for 3 classes
- change line steps to 80% and 90% of max_batches (e.g. `steps=4800,5400`)
- set network size `width=416 height=416` or any value that is a multiple of 32
- change line `classes=80` to the number of objects in each of the 3 `[yolo]` layers
- change `filters=255` to $\text{filters}=(\text{classes} + 5) \times 3$ in the 3 `[convolutional]` layers before each `[yolo]` layer; keep in mind that it only has to be the last `[convolutional]` before each of the `[yolo]` layers. Note: do not write `filters=(classes + 5) x 3` in the cfg-file! It has to be the specific number! E.g. for `classes=1` it should be `filters=18`; for `classes=2` it should be `filters=21`.

So for example, for 2 objects, your custom config file should differ from `yolov4-custom.cfg` in such lines in each of the 3 `[yolo]` layers:

```
[convolutional]
filters=21

[yolo]
classes=2
```

- when using `[Gaussian_yolo]` layers, change `filters=57` to $\text{filters}=(\text{classes} + 9) \times 3$ in the 3 `[convolutional]` layers before each `[Gaussian_yolo]` layer
- Create file `obj.names` in the directory `data/`, with the object names, each on a new line.
- Create file `obj.data` in the directory `data/`, containing (where classes = number of objects), for example if we have two objects:

```
classes = 2
train = data/train.txt
valid = data/test.txt
names = data/obj.names
backup = backup/
```

- Put the image files (`.jpg`) of your objects in the directory `data/obj/`.
- Create `train.txt` in the directory `data/` with the filenames of your images, each filename on a new line, with paths relative to `darknet`. For example:

```
data/obj/img1.jpg
data/obj/img2.jpg
data/obj/img3.jpg
```

- Download pre-trained weights for the convolutional layers and put them in the `darknet` directory (the root directory of the project):
  - for `yolov4.cfg`, `yolov4-custom.cfg` (162 MB): yolov4.conv.137
  - for `yolov4-tiny.cfg`, `yolov4-tiny-3l.cfg`, `yolov4-tiny-custom.cfg` (19 MB): yolov4-tiny.conv.29
  - for `csresnext50-panet-spp.cfg` (133 MB): csresnext50-panet-spp.conv.112
  - for `yolov3.cfg`, `yolov3-spp.cfg` (154 MB): darknet53.conv.74
  - for `yolov3-tiny-prn.cfg`, `yolov3-tiny.cfg` (6 MB): yolov3-tiny.conv.11
  - for `enet-coco.cfg` (EfficientNetB0-Yolov3) (14 MB): enetb0-coco.conv.132
Start training

```bash
./darknet detector train data/obj.data custom-yolov4-detector.cfg yolov4.conv.137 -dont_show
```

- the file `yolo-obj_last.weights` will be saved to `backup/` every 100 iterations
- `-dont_show`: disables the Loss window, for when you train on a computer without a monitor (e.g. a remote server)
To see the mAP & loss chart during training on a remote server:

- use the command `./darknet detector train data/obj.data yolo-obj.cfg yolov4.conv.137 -dont_show -mjpeg_port 8090 -map`
- then open the URL `http://ip-address:8090` in a Chrome/Firefox browser
After training is complete, you can get weights from backup/
If you want training to output only the main information (e.g. loss, mAP, remaining training time) instead of the full log, you can use this command:

```bash
./darknet detector train data/obj.data custom-yolov4-detector.cfg yolov4.conv.137 -dont_show -map 2>&1 | tee log/train.log | grep -E "hours left|mean_average"
```

Then the output will look like the following:

```
1189: 1.874030, 2.934438 avg loss, 0.002610 rate, 2.930427 seconds, 76096 images, 3.905244 hours left
```
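For post-hoc inspection of `log/train.log`, lines of this shape can be parsed with a small regex. A sketch; the pattern matches the example line above, and the group names are my own:

```python
import re

# one capture group per field in a Darknet progress line
LOG_RE = re.compile(
    r"(?P<iter>\d+): (?P<loss>[\d.]+), (?P<avg_loss>[\d.]+) avg loss, "
    r"(?P<rate>[\d.]+) rate, (?P<secs>[\d.]+) seconds, "
    r"(?P<images>\d+) images, (?P<hours_left>[\d.]+) hours left"
)

line = ("1189: 1.874030, 2.934438 avg loss, 0.002610 rate, "
        "2.930427 seconds, 76096 images, 3.905244 hours left")
m = LOG_RE.search(line)
print(m.group("iter"), m.group("avg_loss"), m.group("hours_left"))
# prints: 1189 2.934438 3.905244
```

Applied over the whole log file, this gives the avg-loss curve without re-running training.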
Notes
- If during training you see `nan` values in the `avg` (loss) field, then training is going wrong! 🤦♂️ But if `nan` appears in some other lines, then training is going well.
- If an `Out of memory` error occurs, then in the `.cfg` file you should increase `subdivisions` to 16, 32 or 64.
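That subdivisions bump can be scripted rather than edited by hand. A minimal sketch (the helper name and the cap at 64 are my own, following the 16/32/64 values suggested above):

```python
import re

def double_subdivisions(cfg_text):
    """Double the subdivisions value in cfg text, capping at 64."""
    def bump(m):
        return f"subdivisions={min(int(m.group(1)) * 2, 64)}"
    return re.sub(r"subdivisions=(\d+)", bump, cfg_text)

print(double_subdivisions("subdivisions=16"))  # subdivisions=32
```

Read the `.cfg` file, pass its text through this helper, write it back, and retry training.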
Train tiny-YOLO
Do all the same steps as for the full YOLO model described above, with the following exceptions:

- Download the file with the first 29 convolutional layers of yolov4-tiny:

```bash
wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.conv.29
```

(Or get this file from the yolov4-tiny.weights file by using the command `./darknet partial cfg/yolov4-tiny-custom.cfg yolov4-tiny.weights yolov4-tiny.conv.29 29`.)

- Make your custom model `yolov4-tiny-obj.cfg` based on `cfg/yolov4-tiny-custom.cfg` instead of `yolov4.cfg`:

```python
import re

# Assuming that we have already defined the following hyperparameters:
# - num_classes: number of object classes
# - num_train_images: number of training images
# - SUBDIVISION: subdivisions value (e.g. 16)
# - TINY_CONFIG_FILE: config file we're going to use for training
# - WIDTH, HEIGHT: width and height of the network input (multiples of 32)
max_batches = max(num_classes * 2000, num_train_images, 6000)
steps1 = int(0.8 * max_batches)
steps2 = int(0.9 * max_batches)
num_filters = (num_classes + 5) * 3

with open("cfg/yolov4-tiny-custom.cfg", "r") as reader, open(TINY_CONFIG_FILE, "w") as writer:
    content = reader.read()
    content = re.sub(r"subdivisions=\d*", f"subdivisions={SUBDIVISION}", content)
    content = re.sub(r"width=\d*", f"width={WIDTH}", content)
    content = re.sub(r"height=\d*", f"height={HEIGHT}", content)
    content = re.sub(r"max_batches = \d*", f"max_batches = {max_batches}", content)
    content = re.sub(r"steps=\d*,\d*", f"steps={steps1},{steps2}", content)
    content = re.sub(r"classes=\d*", f"classes={num_classes}", content)
    content = re.sub(r"pad=1\nfilters=\d*", f"pad=1\nfilters={num_filters}", content)
    writer.write(content)
```

- Start training:

```bash
./darknet detector train data/obj.data yolov4-tiny-obj.cfg yolov4-tiny.conv.29
```
Google Colab Notebook
Small hacks to keep the Colab notebook training:

- Open up the inspector view in Chrome
- Switch to the console window
- Paste the following code and hit Enter:

```javascript
function ClickConnect(){
  console.log("Working");
  document
    .querySelector('#top-toolbar > colab-connect-button')
    .shadowRoot.querySelector('#connect')
    .click()
}
setInterval(ClickConnect, 60000)
```

It will click the connect button every minute (60000 ms) so that you don't get kicked off for being idle!
Convert YOLOv4 to TensorRT through ONNX
To convert YOLOv4 to TensorRT engine through ONNX, I used the code from TensorRT_demos following its step-by-step instructions. For more details about the code, check out this blog post.
Note that the code in this repo was designed to run on Jetson platforms. In my case, the conversion from YOLOv4 to a TensorRT engine was conducted on a Jetson Nano.
Convert YOLOv4 for custom trained models
To apply the conversion for custom trained models, see TensorRT YOLOv3 For Custom Trained Models. You need to stick to the naming convention {yolo_version}-{custom_name}-{image_size}. Otherwise you’ll get errors during conversion.
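A quick sanity check for that naming convention could look like the following. The regex is only my reading of `{yolo_version}-{custom_name}-{image_size}` (e.g. a hypothetical `yolov4-blood-416`); verify the exact rules against the TensorRT_demos instructions:

```python
import re

# assumed pattern: yolov3/yolov4, optional -tiny, a custom name, and an image size
NAME_RE = re.compile(r"^yolov[34](?:-tiny)?-\w+-\d+$")

for name in ["yolov4-blood-416", "yolov4_blood_416", "yolov4-416"]:
    print(name, bool(NAME_RE.match(name)))
```

Running this check before conversion is cheaper than debugging a failed conversion afterwards.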
Reference
Guide from AlexeyAB/darknet repo: How to train (to detect your custom objects)
Tutorials
👨🏫 How to Train YOLOv4 on a Custom Dataset in Darknet
Train YOLOv4-tiny on custom dataset: Train YOLOv4-tiny on Custom Data - Lightning Fast Object Detection
YOLOv4 in the CLOUD: Build and Train Custom Object Detector (FREE GPU)