Computer Vision

Computer Vision (CV) Tasks

Object Localization: Coordinate prediction

Sliding Window

Object Localization

Classification & Localization

Detection

Sliding Window + Classification:

Region Proposals

  • Sliding-window problem: many positions and scales must be tested, each with a computationally demanding classifier

  • Solution: only look at a tiny subset of the possible positions

    • Region proposals => propose image regions that are likely to contain objects
    • Classify the individual regions and refine their bounding boxes
    • R-CNN -> Fast R-CNN -> Faster R-CNN
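
To make the sliding-window cost concrete, here is a minimal sketch that enumerates every window position at a few scales. The image size, window sizes, and stride are hypothetical values chosen only for illustration.

```python
def sliding_windows(img_w, img_h, win_sizes, stride):
    """Yield (x, y, w, h) for every window position at every scale."""
    for w, h in win_sizes:
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                yield (x, y, w, h)


# Hypothetical setup: 640x480 image, three square window sizes, stride 8.
windows = list(
    sliding_windows(640, 480, [(64, 64), (128, 128), (256, 256)], stride=8)
)

# Each window would need one forward pass of the classifier.
print(len(windows), "classifier calls for a single image")
```

Even this coarse grid produces several thousand windows per image, and finer strides or more scales multiply the count further. That is the motivation for classifying only a small set of proposed regions.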

R-CNN

  • Propose roughly 2,000 candidate regions (bounding boxes) per image
  • Classify each region with a CNN
    • Discard unlikely boxes
  • Refine the remaining bounding boxes with regression
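
The last two steps can be sketched as follows. This is a minimal illustration, not the original implementation: the score threshold and the function names are assumptions, and the box refinement uses the common center/size parameterization (shift the center proportionally to the box size, scale width and height exponentially).

```python
import math


def keep_likely(boxes, scores, threshold=0.5):
    """Drop boxes whose classifier score is below an (assumed) threshold."""
    return [box for box, score in zip(boxes, scores) if score >= threshold]


def refine_box(box, deltas):
    """Refine a box (cx, cy, w, h) with regressed offsets (dx, dy, dw, dh).

    dx, dy shift the center relative to the box size;
    dw, dh rescale width and height in log space.
    """
    cx, cy, w, h = box
    dx, dy, dw, dh = deltas
    return (cx + w * dx, cy + h * dy, w * math.exp(dw), h * math.exp(dh))


# Hypothetical proposals with made-up classifier scores.
proposals = [(100.0, 80.0, 40.0, 20.0), (300.0, 200.0, 60.0, 60.0)]
scores = [0.9, 0.1]

kept = keep_likely(proposals, scores)
refined = [refine_box(box, (0.5, -0.25, 0.0, 0.0)) for box in kept]
print(refined)
```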

Object Detection for Dummies Part 3: R-CNN Family

Fast R-CNN

  • 9x faster training and 213x faster at test time than R-CNN
  • R-CNN is not trained end to end (first train a softmax classifier, then use it to train the bounding-box regressor)
  • Similar to R-CNN, but:
    • Run the CNN once on the whole input image and apply the region proposals to the resulting feature map
    • Pool each proposed region of the feature map to a fixed size (RoI pooling)
    • Feed the fixed-size features into fully connected (FC) layers
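
The "fixed size" step can be sketched as a max pooling over a grid laid across each region. The version below is a simplified, single-channel illustration in plain Python; the function name and the integer bin boundaries are assumptions, and real implementations work on multi-channel tensors.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool a region of a 2D feature map into an out_h x out_w grid.

    roi = (x0, y0, x1, y1) in feature-map cells, x1/y1 exclusive.
    Assumes the region lies inside the map and is at least 1x1.
    """
    x0, y0, x1, y1 = roi
    h, w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Row range of bin i; every bin covers at least one cell.
        ys = y0 + (i * h) // out_h
        ye = max(ys + 1, y0 + ((i + 1) * h) // out_h)
        row = []
        for j in range(out_w):
            xs = x0 + (j * w) // out_w
            xe = max(xs + 1, x0 + ((j + 1) * w) // out_w)
            row.append(
                max(feature_map[y][x] for y in range(ys, ye) for x in range(xs, xe))
            )
        pooled.append(row)
    return pooled


fm = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
    [13, 14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23, 24],
]

# Two regions of different size both come out as 2x2.
small = roi_max_pool(fm, (1, 0, 6, 3), 2, 2)
large = roi_max_pool(fm, (0, 0, 6, 4), 2, 2)
print(small, large)
```

Because every region ends up with the same shape, the same FC layers can process all proposals, while the expensive convolutional pass runs only once per image.
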

Faster R-CNN

  • Both R-CNN and Fast R-CNN rely on Selective Search for region proposals -> most time-consuming part 🤪
  • Use a separate network (the Region Proposal Network, RPN) to predict the regions of interest 💪
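
As background on how such a proposal network can be set up: it scores a fixed set of reference boxes ("anchors") at every feature-map position instead of running an external search. The sketch below only generates the anchors for one position. The three scales and three aspect ratios follow the commonly cited defaults, but the function name and the exact values should be read as assumptions, not as part of these notes.

```python
import math


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x0, y0, x1, y1) anchors centered at (cx, cy).

    Each anchor has area scale**2 and aspect ratio height/width = ratio.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = make_anchors(300, 200)
print(len(anchors), "anchors at one position")
```
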

YOLO

  • You Only Look Once: Unified Real-Time Object Detection

  • "Simple network": goes directly from pixels to bounding boxes and class predictions in a single pass
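
A rough sketch of what "directly from pixels to boxes" means for the output layout, assuming the original YOLO setup: an S x S grid where every cell predicts B boxes (x, y, w, h, confidence) plus C class probabilities. Box centers are relative to the cell, widths and heights relative to the whole image. The function names are hypothetical.

```python
def yolo_output_size(S=7, B=2, C=20):
    """Number of values the network predicts per image."""
    return S * S * (B * 5 + C)


def decode_cell_box(row, col, x, y, w, h, S, img_w, img_h):
    """Convert one cell-relative prediction to pixel (cx, cy, width, height).

    x, y: center offset inside grid cell (row, col), each in [0, 1].
    w, h: box size as a fraction of the full image.
    """
    cx = (col + x) / S * img_w
    cy = (row + y) / S * img_h
    return (cx, cy, w * img_w, h * img_h)


print(yolo_output_size())  # 7 * 7 * 30 values
box = decode_cell_box(3, 3, 0.5, 0.5, 0.25, 0.5, S=7, img_w=448, img_h=448)
print(box)
```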

Image Segmentation

  • Group pixels into regions that share the same properties
  • E.g.: segment an image into meaningful objects

Semantic Segmentation

Sliding Window

  • Label each pixel in image with a category label

  • Don't differentiate between instances, only care about the per-pixel class

  • => extract a small patch around each pixel and classify its center pixel with a normal CNN classifier

  • Problem: very inefficient, because overlapping patches repeat the same computation
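
The patch-based approach can be sketched as below. `toy_classifier` is a hypothetical stand-in for the CNN (it only thresholds the mean brightness), and the zero padding at the border is an assumption.

```python
def segment_by_patches(image, patch_size, classify_patch):
    """Label every pixel by classifying the patch centered on it."""
    half = patch_size // 2
    h, w = len(image), len(image[0])
    labels = []
    for y in range(h):
        row = []
        for x in range(w):
            # Cut out the patch around (x, y); pad with zeros outside the image.
            patch = [
                [
                    image[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                    for xx in range(x - half, x + half + 1)
                ]
                for yy in range(y - half, y + half + 1)
            ]
            row.append(classify_patch(patch))  # one classifier call per pixel
        labels.append(row)
    return labels


def toy_classifier(patch):
    """Hypothetical stand-in for a CNN: 1 if the patch is mostly bright."""
    values = [v for r in patch for v in r]
    return 1 if sum(values) / len(values) > 0.5 else 0


image = [[0, 0, 1, 1] for _ in range(4)]
labels = segment_by_patches(image, patch_size=3, classify_patch=toy_classifier)
print(labels)
```

One classifier call per pixel means a 640x480 image needs 307,200 forward passes, and neighbouring patches overlap almost completely.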

Fully convolutional

  • Design the network as an end-to-end, fully convolutional neural network

  • Predictions are made for all pixels at once

  • Problem: convolutions at the original image resolution are very expensive
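
A back-of-the-envelope calculation shows why resolution matters. The sketch counts multiply-accumulate operations for a single stride-1 convolution layer with "same" padding. The layer sizes are hypothetical, and the quarter-resolution case is included only for comparison.

```python
def conv_macs(height, width, c_in, c_out, kernel=3):
    """Multiply-accumulates for one stride-1, same-padded conv layer."""
    return height * width * c_in * c_out * kernel * kernel


# Hypothetical layer: 64 -> 64 channels, 3x3 kernel.
full = conv_macs(480, 640, 64, 64)     # original resolution
quarter = conv_macs(120, 160, 64, 64)  # 4x smaller in each dimension

print(full, quarter, full // quarter)
```

The cost scales with the number of pixels, so the same layer is 16x cheaper at a quarter of the resolution in each dimension.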

Reference