Face Detection: Neural-Network-Based

Motivation

  • Idea: Use a search window to scan over an image
  • Train a classifier to decide whether the search window contains a face or not


Detection

Simple neuron model

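As a minimal sketch of a simple neuron model: the neuron forms a weighted sum of its inputs, adds a bias, and passes the result through an activation function. The choice of `tanh` and the example values are assumptions for illustration; `tanh` is used here because its output lies between -1 and +1.

```python
import numpy as np

def neuron(x, w, b):
    """Single neuron: weighted sum of the inputs plus bias, passed through tanh."""
    return np.tanh(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])   # example inputs (illustrative values)
w = np.array([0.4, 0.3, -0.2])   # connection weights (these are what training adjusts)
b = 0.1                          # bias term
y = neuron(x, w, b)              # always lies strictly between -1 and +1
```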

Topologies


Parameters

Parameters of the network (only the connection weights are adjusted during training):

  • Connection weights (to be learned)

  • Activation function (fixed)

  • Number of layers (fixed)

  • Number of neurons per layer (fixed)

Training

Backpropagation with gradient descent
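The following is a small sketch of backpropagation with gradient descent on a network with one hidden layer. The toy data set, the layer sizes, the learning rate, and the mean-squared-error loss are all assumptions chosen for illustration, not settings taken from these notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumption): 2-D points labelled +1 / -1 by the sign of x0 * x1.
X = rng.uniform(-1.0, 1.0, size=(200, 2))
t = np.sign(X[:, 0] * X[:, 1]).reshape(-1, 1)

# One hidden layer of 8 tanh units and a single tanh output in (-1, 1).
W1 = rng.normal(0.0, 0.5, size=(2, 8))
b1 = np.zeros((1, 8))
W2 = rng.normal(0.0, 0.5, size=(8, 1))
b2 = np.zeros((1, 1))

lr = 0.1       # learning rate (illustrative)
losses = []
for step in range(2000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    y = np.tanh(h @ W2 + b2)
    losses.append(float(np.mean((y - t) ** 2)))

    # Backward pass: apply the chain rule from the output back to the input layer.
    dy = 2.0 * (y - t) / len(X)        # d(loss)/dy
    dz2 = dy * (1.0 - y ** 2)          # through the output tanh
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0, keepdims=True)
    dh = dz2 @ W2.T
    dz1 = dh * (1.0 - h ** 2)          # through the hidden tanh
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0, keepdims=True)

    # Gradient descent: move each weight against its gradient.
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2
```

Only the weights and biases change during the loop; the activation function, the number of layers, and the number of neurons per layer stay fixed.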

Neural Network Based Face Detection [1]

  • Idea: Use an artificial neural network to detect upright frontal faces
    • Network receives as input a 20x20 pixel region of an image

    • output ranges from -1 (no face present) to +1 (face present)

    • the neural network "face filter" is applied at every location in the image

    • to detect faces with different sizes, the input image is repeatedly scaled down
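The scanning scheme described above can be sketched as follows. This is an illustration under stated assumptions: `classify` is a hypothetical stand-in for the trained network, the nearest-neighbour `downscale` helper is a crude substitute for proper resampling, and the default scale factor of 1.2 is an assumption rather than a value given in these notes.

```python
import numpy as np

def downscale(image, factor):
    """Nearest-neighbour downscaling of a 2-D image by `factor` (> 1)."""
    h, w = image.shape
    new_h, new_w = int(h / factor), int(w / factor)
    rows = (np.arange(new_h) * factor).astype(int)
    cols = (np.arange(new_w) * factor).astype(int)
    return image[np.ix_(rows, cols)]

def scan(image, classify, window=20, step=1, factor=1.2):
    """Yield (x, y, size, score) for every window position at every scale.

    x, y and size are given in the coordinates of the original image.
    """
    scale = 1.0
    current = image
    while min(current.shape) >= window:
        h, w = current.shape
        for y in range(0, h - window + 1, step):
            for x in range(0, w - window + 1, step):
                score = classify(current[y:y + window, x:x + window])
                yield int(x * scale), int(y * scale), int(round(window * scale)), score
        # Shrink the original image a little more, so larger faces fit the window.
        scale *= factor
        current = downscale(image, scale)
```

Because the window size stays fixed at 20x20 pixels, a face that is larger in the original image only fits the window after the image has been scaled down far enough.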

Network Topology


  • 20x20 pixel input retina
  • 4 types of receptive fields for the hidden units
  • One real-valued output

System Overview


Network Training

Training Set

  • 1050 normalized face images

  • 15 face images generated from each original face image by rotating and scaling

  • 1000 randomly chosen non-face images

Preprocessing

  • correct for different lighting conditions (overall brightness, shadows)
  • rescale images to fixed size

Histogram equalization

  • Defines a mapping of gray levels $p$ into gray levels $q$ such that the distribution of $q$ is close to being uniform

  • Stretches contrast (expands the range of gray levels)

  • Transforms different input images so that they have similar intensity distributions (thus reducing the effect of different illumination)

  • Algorithm

    • The probability of an occurrence of a pixel of level $i$ in the image: $$ p\left(x_{i}\right)=\frac{n_{i}}{n}, \qquad i \in \{0, \ldots, L-1\} $$

      • $L$: number of gray levels
      • $n_i$: number of occurrences of gray level $i$
      • $n$: total number of pixels in the image
    • Define $c$ as the cumulative distribution function: $$ c(i)=\sum_{j=0}^{i} p\left(x_{j}\right) $$

    • Create a transformation of the form $$ y_i = T(x_i) = c(i), \qquad y_i \in [0, 1] $$ This produces a level $y$ for each level $x$ in the original image such that the cumulative distribution function of $y$ is approximately linear across the value range.

    • Rescale $y_i$ from $[0, 1]$ to the desired gray-level range $[\min, \max]$: $$ y_{i}^{\prime}=y_{i} \cdot(\max -\min )+\min $$
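The algorithm above can be sketched in a few lines of NumPy. The sketch assumes an 8-bit grayscale image stored as an integer array; the function name and the parameters `out_min` and `out_max` (the target range, here 0 to 255) are illustrative choices.

```python
import numpy as np

def equalize_histogram(image, levels=256, out_min=0, out_max=255):
    """Histogram-equalize an integer image with gray levels in [0, levels - 1]."""
    hist = np.bincount(image.ravel(), minlength=levels)   # n_i for each level i
    p = hist / image.size                                 # p(x_i) = n_i / n
    c = np.cumsum(p)                                      # c(i) = sum of p(x_j) for j <= i
    mapping = c * (out_max - out_min) + out_min           # y'_i = y_i * (max - min) + min
    lookup = np.round(mapping).astype(np.uint8)           # one output level per input level
    return lookup[image]                                  # apply the mapping to every pixel
```

Since $c$ is non-decreasing, the mapping preserves the ordering of gray levels: a pixel that was darker than another never becomes brighter than it.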

Training Procedure

  1. Randomly choose 1000 non-face images

  2. Train network to produce 1 for faces, -1 for non-faces

  3. Run network on images containing no faces. Collect subimages in which the network incorrectly identifies a face (output > 0)

  4. Select up to 250 of these "false positives" at random and add them to the training set as negative examples, then retrain from step 2
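The selection of false positives in the last two steps might look like the sketch below. It assumes the network outputs for all subimages of the face-free images have already been computed; the names `windows`, `scores`, and `select_false_positives` are hypothetical.

```python
import numpy as np

def select_false_positives(windows, scores, rng, max_new=250, threshold=0.0):
    """Pick up to `max_new` windows from face-free images that were scored as faces.

    windows: array of shape (N, 20, 20), subimages taken from images without faces
    scores:  array of shape (N,), network output for each window
    """
    false_pos = np.flatnonzero(scores > threshold)        # output > 0 means "face"
    if len(false_pos) > max_new:
        false_pos = rng.choice(false_pos, size=max_new, replace=False)
    return windows[false_pos]
```

The returned windows would be appended to the negative examples before the network is trained again, so each round concentrates on the non-faces the current network still gets wrong.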

Neural Network Based Face Filter

  • Output of ANN defines a filter for faces

  • Search

    • Scan input image with search window, apply ANN to search window

    • Input image needs to be rescaled in order to detect faces of different sizes

  • Output needs to be post-processed

    • Noise removal

    • Merging overlapping detections

  • Speed-up can be achieved by

    • Increase step size

    • Make ANN more flexible to translation

    • Hierarchical, pyramidal search
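One simple way to combine noise removal and merging of overlapping detections is to group detections whose centres lie close together, drop groups that are too small, and replace each remaining group by its average. This is a simplified illustration, not the heuristic of the cited paper; `radius` and `min_count` are hypothetical parameters.

```python
import numpy as np

def merge_detections(dets, radius=10.0, min_count=2):
    """Greedily group (x, y, size) detections by the distance between their centres.

    Groups with fewer than `min_count` members are discarded as noise;
    every other group is replaced by the mean of its members.
    """
    dets = np.asarray(dets, dtype=float).reshape(-1, 3)
    centres = dets[:, :2] + dets[:, 2:3] / 2.0
    unassigned = np.ones(len(dets), dtype=bool)
    merged = []
    for i in range(len(dets)):
        if not unassigned[i]:
            continue
        dist = np.linalg.norm(centres - centres[i], axis=1)
        group = unassigned & (dist <= radius)
        unassigned &= ~group
        if group.sum() >= min_count:
            merged.append(tuple(dets[group].mean(axis=0)))
    return merged
```

The idea behind the `min_count` filter is that a real face usually triggers the filter at several neighbouring positions and scales, whereas an isolated detection is more likely to be a false positive.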

Localization and Ground-Truth

  • For localization, the test data is usually annotated with ground-truth bounding boxes

  • Comparing hypotheses to the ground truth

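    • Overlap $$ O = \frac{|\text{GT} \cap \text{DET}|}{|\text{GT} \cup \text{DET}|} $$

      where $|\cdot|$ is the area of a region, GT is the ground-truth box, and DET is the detected box.

      Also called Intersection over Union (IoU)

    • A detection is often counted as correct if $O > 0.5$ (overlap above 50%)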
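For axis-aligned bounding boxes the overlap measure can be computed directly from the corner coordinates. The sketch below assumes boxes are given as `(x1, y1, x2, y2)` with `x1 < x2` and `y1 < y2`; the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the intersection rectangle (zero if the boxes do not overlap).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def is_correct(gt, det, threshold=0.5):
    """Count a detection as correct if its overlap with the ground truth exceeds the threshold."""
    return iou(gt, det) > threshold
```

For example, a 10x10 ground-truth box and a detection of the same size shifted by 2 pixels horizontally intersect in 80 square pixels and cover 120 in total, so the overlap is 2/3 and the detection passes the 50% threshold.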


  1. Neural Network Based Face Detection, by Henry A. Rowley, Shumeet Baluja, and Takeo Kanade. IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 20, number 1, pages 23-38, January 1998.
