Contents

Computer Vision

Computer Vision (CV) Tasks

截屏2020-08-20 13.23.32

Classification
Classification + Localization
Object Detection
Instance Segmentation

Object Localization: Coordinate prediction

截屏2020-08-20 13.36.50

Sliding Window

Object Localization

截屏2020-08-20 16.52.34

Classification & Localization

截屏2020-08-20 16.53.21

Detection

截屏2020-08-20 16.54.26

截屏2020-08-20 16.54.44

Sliding Window + Classification:

截屏2020-08-20 16.55.40

Regioning

截屏2020-08-20 16.56.37

Sliding Window Problem: Need to test many positions and scales, and use a computationally demanding classifier
Solution: Only look at a tiny subset of possible positions
- Regioning => propose image regions that are likely to contain objects
- Classify individual regions and correct regions
- R-CNN -> Fast R-CNN -> Faster R-CNN

R-CNN

Propose approx. 2k different regions (bounding boxes) for image classification
For each box, do image classification with CNN
- Discard unlikely boxes
Refine bounding boxes with regression

Object Detection for Dummies Part 3: R-CNN Family

Fast R-CNN

9x faster training, 213x faster test time
R-CNN is not end to end (first train softmax classifier, use that for training bounding box regressor)
Similar to R-CNN
- Apply Region Proposals on feature map result of applied CNN to input image
- Reshape region proposals on feature map into fixed size
- Feed into FC layer

Object Detection for Dummies Part 3: R-CNN Family

Faster R-CNN

Both R-CNN and R-CNN rely on Selective Search for region proposals -> most time consuming part 🤪
Use a seperate Network for predicting the regions of interest 💪

Object Detection for Dummies Part 3: R-CNN Family

YOLO

You Only Look Once: Unified Real-Time Object Detection
„Simple network“, directly from pixels to bounding box / object detection / class prediction

Image Segmentation

Grouping Pixels into regions that belong to same properties
Eg: Segmenting an Image into meaningful objects

截屏2020-08-20 17.31.10

Semantic Segmentation

Sliding Window

Label each pixel in image with a category label
Don‘t differentiate instances, only care about pixels
=> just extract small patches from an image and classify center pixel with a normal CNN classifier
Problem: very inefficient

Fully convolutional

Keep the network as an end to end convolutional Neural Network
Predictions are made for all pixels at once
Convolutions at original image resolution are very expensive

Reference

Object Detection for Dummies Part 3: R-CNN Family

Deep Learning CNN

Last updated on Apr 3, 2022

comments powered by Disqus