YOLOv4: Train on Custom Dataset

Clone and build Darknet

Clone darknet repo

git clone https://github.com/AlexeyAB/darknet

Edit the Makefile to enable OpenCV, GPU (CUDA), cuDNN, and cuDNN half precision

cd darknet
sed -i 's/OPENCV=0/OPENCV=1/' Makefile
sed -i 's/GPU=0/GPU=1/' Makefile
sed -i 's/CUDNN=0/CUDNN=1/' Makefile
sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/' Makefile

Verify CUDA

/usr/local/cuda/bin/nvcc --version

Compile on Linux using make

Make darknet

make
  • GPU=1: build with CUDA to accelerate computation on the GPU
  • CUDNN=1: build with cuDNN v5-v7 to accelerate training on the GPU
  • CUDNN_HALF=1: build for Tensor Cores (Titan V / Tesla V100 / DGX-2 and later); speeds up detection 3x and training 2x
  • OPENCV=1: build with OpenCV 4.x/3.x/2.4.x, which allows detection on video files and on video streams from network cameras or webcams
  • DEBUG=1: build the debug version of YOLO
  • OPENMP=1: build with OpenMP support to accelerate YOLO on a multi-core CPU
Warnings printed while running the make command can be ignored.

Prepare custom dataset

The custom dataset should be in YOLOv4 or darknet format:

  • For each .jpg image file, there should be a corresponding .txt file

    • In the same directory, with the same name, but with .txt-extension

      For example, if there’s a .jpg image named BloodImage_00001.jpg, there should also be a corresponding .txt file named BloodImage_00001.txt

  • Each .txt file lists the object class number and the object coordinates in that image, with one object per line.

    Format:

    <object-class> <x_center> <y_center> <width> <height>
    
    • <object-class> : integer object number from 0 to (classes-1)
    • <x_center> <y_center> <width> <height> : float values relative to the width and height of the image, in the range (0.0, 1.0]
      • <x_center> <y_center> are the center of the rectangle (not the top-left corner)
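
As an illustration of the format above, here is a minimal sketch that turns a pixel-space bounding box into one label line. The function name `to_yolo_line` and the corner-based input convention (top-left and bottom-right corners in pixels) are assumptions made for this example; they are not part of darknet.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box into one darknet label line.

    Hypothetical helper: assumes (x_min, y_min) is the top-left corner and
    (x_max, y_max) the bottom-right corner, both in pixels.
    """
    # Center and size, normalized by the image width and height
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h

    values = (x_center, y_center, width, height)
    if not all(0.0 < v <= 1.0 for v in values):
        raise ValueError(f"box is not inside the image: {values}")

    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 100x200 px box with top-left corner (50, 100) in a 640x480 image, class 0
print(to_yolo_line(0, 50, 100, 150, 300, 640, 480))
```

Each returned string would be one line of the .txt file that sits next to the image.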

Configure files for training

  1. To train with cfg/yolov4-custom.cfg, download the pre-trained weights file yolov4.conv.137

    cd darknet
    wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137
    
  2. In the ./cfg folder, create a custom config file (let’s call it custom-yolov4-detector.cfg) with the same content as yolov4-custom.cfg, then:

    • change line batch to batch=64

    • change line subdivisions to subdivisions=16

    • change line max_batches to classes*2000, but

      • NOT less than number of training images
      • NOT less than 6000

      e.g. max_batches=6000 if you train for 3 classes

    • change line steps to 80% and 90% of max_batches (e.g. steps=4800,5400)

    • set network size width=416 height=416 or any value multiple of 32

    • change line classes=80 to your number of object classes in each of the 3 [yolo] layers

    • change filters=255 to $\text{filters}=(\text{classes} + 5) \times 3$ in the [convolutional] layer before each of the 3 [yolo] layers. Only the last [convolutional] layer before each [yolo] layer needs this change.

      Note: Do not write in the cfg-file: filters=(classes + 5) x 3!!!

      It has to be the specific number!

      E.g. if classes=1, then filters=18; if classes=2, then filters=21

      So, for example, for 2 objects, your custom config file should differ from yolov4-custom.cfg in the following lines for each of the 3 [yolo] layers:

      [convolutional]
      filters=21
      
      [yolo]
      classes=2
      
    • when using [Gaussian_yolo] layers, change filters=57 to $\text{filters}=(\text{classes} + 9) \times 3$ in the [convolutional] layer before each of the 3 [Gaussian_yolo] layers

  3. Create the file obj.names in the directory data/, with one object name per line

  4. Create the file obj.data in the directory data/, containing the following (where classes = number of objects):

    For example, if we have two objects

    classes = 2
    train  = data/train.txt
    valid  = data/test.txt
    names = data/obj.names
    backup = backup/
    
  5. Put image files (.jpg) of your objects in the directory data/obj/

  6. Create train.txt in the directory data/ with the filenames of your images, one filename per line, with paths relative to the darknet directory.

    For example containing:

    data/obj/img1.jpg
    data/obj/img2.jpg
    data/obj/img3.jpg
    
  7. Make sure the pre-trained weights for the convolutional layers (yolov4.conv.137, downloaded in step 1) are in the darknet directory (the root directory of the project)
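
Steps 3, 4 and 6 can also be scripted. The following is a minimal sketch, assuming all training images already sit in data/obj/ as .jpg files; the function name `write_darknet_data_files` and its parameters are hypothetical. It does not create data/test.txt, which obj.data refers to as the validation list.

```python
from pathlib import Path


def write_darknet_data_files(root, class_names, image_dir="data/obj"):
    """Write data/obj.names, data/obj.data and data/train.txt under `root`.

    Hypothetical helper: `root` is the darknet directory, and every .jpg in
    `image_dir` is listed in train.txt (no train/test split is made here).
    """
    root = Path(root)
    data_dir = root / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    # obj.names: one class name per line (step 3)
    (data_dir / "obj.names").write_text("\n".join(class_names) + "\n")

    # train.txt: image paths relative to the darknet directory (step 6)
    images = sorted((root / image_dir).glob("*.jpg"))
    lines = [p.relative_to(root).as_posix() for p in images]
    (data_dir / "train.txt").write_text("\n".join(lines) + "\n")

    # obj.data: same keys as in the example of step 4
    (data_dir / "obj.data").write_text(
        f"classes = {len(class_names)}\n"
        "train  = data/train.txt\n"
        "valid  = data/test.txt\n"
        "names = data/obj.names\n"
        "backup = backup/\n"
    )
    return lines
```

Called from the darknet directory, `write_darknet_data_files(".", ["cat", "dog"])` would write the three files for a two-class dataset; the class names here are placeholders.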

Start training

./darknet detector train data/obj.data cfg/custom-yolov4-detector.cfg yolov4.conv.137 -dont_show
  • the file custom-yolov4-detector_last.weights (named after the cfg file) will be saved to backup/ every 100 iterations

  • -dont_show: disables the loss window; use it if you train on a computer without a monitor (e.g. a remote server)

To see the mAP & loss chart during training on a remote server:

  • use the command ./darknet detector train data/obj.data cfg/custom-yolov4-detector.cfg yolov4.conv.137 -dont_show -mjpeg_port 8090 -map
  • then open the URL http://ip-address:8090 in a Chrome/Firefox browser

After training is complete, you can get weights from backup/

If you want the training to output only the main information (e.g. loss, mAP, remaining training time) instead of the full log, you can use this command (the log/ directory must already exist, because tee does not create it):

./darknet detector train data/obj.data cfg/custom-yolov4-detector.cfg yolov4.conv.137 -dont_show -map 2>&1 | tee log/train.log | grep -E "hours left|mean_average"

The output will then look like the following:

 1189: 1.874030, 2.934438 avg loss, 0.002610 rate, 2.930427 seconds, 76096 images, 3.905244 hours left
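
To follow the run from a script, the fields of such a line can be pulled out with a regular expression. This is a minimal sketch whose pattern is derived only from the sample line above, so other darknet versions may need a different pattern; `LOG_LINE` and `parse_log_line` are hypothetical names.

```python
import math
import re

# Pattern for the per-iteration summary line
# (field layout taken from the sample line above).
LOG_LINE = re.compile(
    r"^\s*(?P<iteration>\d+):\s*(?P<loss>\S+),\s*(?P<avg_loss>\S+) avg loss,"
    r"\s*(?P<rate>\S+) rate,.*?(?P<hours_left>\S+) hours left"
)


def parse_log_line(line):
    """Return the main fields of a summary line, or None if it does not match."""
    m = LOG_LINE.search(line)
    if m is None:
        return None
    return {
        "iteration": int(m["iteration"]),
        "loss": float(m["loss"]),
        "avg_loss": float(m["avg_loss"]),
        "rate": float(m["rate"]),
        "hours_left": float(m["hours_left"]),
    }


sample = (" 1189: 1.874030, 2.934438 avg loss, 0.002610 rate, "
          "2.930427 seconds, 76096 images, 3.905244 hours left")
fields = parse_log_line(sample)
if fields is not None and math.isnan(fields["avg_loss"]):
    print("avg loss is nan: training has gone wrong")
```

Applied to every line of log/train.log, this gives the (iteration, avg loss) pairs needed to plot a loss curve offline.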

Notes

  • If you see nan values in the avg (loss) field during training, then training has gone wrong! 🤦‍♂️

    But if nan appears only in other lines, then training is going well.

  • If an Out of memory error occurs, increase subdivisions in the .cfg file to 16, 32 or 64
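
Increasing subdivisions helps because darknet splits each batch into batch / subdivisions images that are processed on the GPU at one time. The short sketch below shows that arithmetic for batch=64; the function name is hypothetical, and the divisibility check is an assumption of this sketch rather than a darknet requirement.

```python
def images_per_forward_pass(batch, subdivisions):
    """Number of images processed on the GPU at once (batch / subdivisions)."""
    if batch % subdivisions != 0:
        raise ValueError("this sketch assumes batch is divisible by subdivisions")
    return batch // subdivisions


# With batch=64, a larger subdivisions value means fewer images in memory at once
for subdivisions in (16, 32, 64):
    print(subdivisions, images_per_forward_pass(64, subdivisions))
```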

Train tiny-YOLO

Follow the same steps as for the full YOLO model described above, with these exceptions:

  • Download the file with the first 29 convolutional layers of yolov4-tiny:

    wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.conv.29
    

    (Or get this file from yolov4-tiny.weights file by using command: ./darknet partial cfg/yolov4-tiny-custom.cfg yolov4-tiny.weights yolov4-tiny.conv.29 29)

  • Make your custom model yolov4-tiny-obj.cfg based on cfg/yolov4-tiny-custom.cfg instead of cfg/yolov4-custom.cfg, for example with a Python script like this:

    import re
    
    # num_classes: number of object classes
    # num_train_images: number of training images
    max_batches = max(num_classes * 2000, num_train_images, 6000)
    # 80% and 90% of max_batches, kept as integers like in the original cfg
    steps1 = max_batches * 8 // 10
    steps2 = max_batches * 9 // 10
    num_filters = (num_classes + 5) * 3
    
    # Assuming that we have already defined the following hyperparameters:
    # - TINY_CONFIG_FILE: path of the config file to write and use for training
    # - SUBDIVISION: value for the subdivisions line
    # - WIDTH, HEIGHT: network input width and height (multiples of 32)
    with open("cfg/yolov4-tiny-custom.cfg", "r") as reader, open(TINY_CONFIG_FILE, "w") as writer:
        content = reader.read()
    
        # Raw strings keep \d as a regex escape instead of a string escape
        content = re.sub(r"subdivisions=\d*", f"subdivisions={SUBDIVISION}", content)
        content = re.sub(r"width=\d*", f"width={WIDTH}", content)
        content = re.sub(r"height=\d*", f"height={HEIGHT}", content)
        content = re.sub(r"max_batches = \d*", f"max_batches = {max_batches}", content)
        content = re.sub(r"steps=\d*,\d*", f"steps={steps1},{steps2}", content)
        content = re.sub(r"classes=\d*", f"classes={num_classes}", content)
        # Relies on "pad=1" being directly followed by "filters=" only in the
        # [convolutional] layers right before each [yolo] layer
        content = re.sub(r"pad=1\nfilters=\d*", f"pad=1\nfilters={num_filters}", content)
    
        writer.write(content)
    
    
  • Start training:

    ./darknet detector train data/obj.data yolov4-tiny-obj.cfg yolov4-tiny.conv.29
    

Google Colab Notebook

Colab Notebook

A small hack to keep a Colab notebook training

  1. Open up the inspector view on Chrome

  2. Switch to the console window

  3. Paste the following code

    function ClickConnect(){
    console.log("Working"); 
    document
      .querySelector('#top-toolbar > colab-connect-button')
      .shadowRoot.querySelector('#connect')
      .click() 
    }
    setInterval(ClickConnect,60000)
    

    and hit Enter.

It will click the connect button every 60 seconds (the 60000 ms passed to setInterval) so that you don’t get kicked off for being idle!

Convert YOLOv4 to TensorRT through ONNX

To convert YOLOv4 to TensorRT engine through ONNX, I used the code from TensorRT_demos following its step-by-step instructions. For more details about the code, check out this blog post.

Note that the code in this repo was designed to run on Jetson platforms. In my case, the conversion from YOLOv4 to a TensorRT engine was done on a Jetson Nano.

Convert YOLOv4 for custom trained models

To apply the conversion to custom trained models, see TensorRT YOLOv3 For Custom Trained Models. You need to stick to the naming convention {yolo_version}-{custom_name}-{image_size}; otherwise you’ll get errors during conversion.

Reference