Face Detection: Neural-Network-Based

Motivation

Idea: Use a search window to scan over an image
Train a classifier to decide whether the search window contains a face or not

Detection

Simple neuron model

Topologies

Parameters

Adjustable Parameters are

Connection weights (to be learned)
Activation function (fixed)
Number of layers (fixed)
Number of neurons per layer (fixed)
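
To make the split between learned and fixed parameters concrete, here is a minimal sketch of a forward pass through a small network. The layer sizes, the choice of `tanh`, and the function names are assumptions for illustration, not taken from the lecture.

```python
import numpy as np

def neuron(x, w, b):
    """Single neuron: weighted sum of the inputs followed by a fixed activation."""
    return np.tanh(np.dot(w, x) + b)

def mlp_forward(x, W1, b1, W2, b2):
    """Network with one hidden layer; layer count and sizes are fixed by the array shapes."""
    h = np.tanh(W1 @ x + b1)       # hidden layer activations
    return np.tanh(W2 @ h + b2)    # output layer, values between -1 and 1

rng = np.random.default_rng(0)

# Fixed choices (assumed here): 3 inputs, 5 hidden neurons, 1 output, tanh activation.
# Learned parameters: the connection weights W1, W2 and the biases b1, b2.
W1, b1 = rng.normal(scale=0.5, size=(5, 3)), np.zeros(5)
W2, b2 = rng.normal(scale=0.5, size=(1, 5)), np.zeros(1)

x = np.array([0.2, -0.4, 0.9])
y = mlp_forward(x, W1, b1, W2, b2)
single = neuron(x, W1[0], b1[0])   # the first hidden neuron on its own
```

Training only changes the values stored in `W1`, `b1`, `W2`, `b2`; everything else in the sketch stays as chosen up front.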

Training

Backpropagation with gradient descent
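
As an illustration of backpropagation with gradient descent, the following sketch trains a tiny network on a toy problem. The data (XOR with targets in {-1, +1}), the layer sizes, the learning rate, and the squared-error loss are all assumptions chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed for illustration): XOR with targets in {-1, +1}.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
T = np.array([[-1.0], [1.0], [1.0], [-1.0]])

# Fixed topology: 2 inputs -> 4 tanh hidden units -> 1 tanh output.
W1, b1 = rng.normal(scale=0.5, size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(scale=0.5, size=(4, 1)), np.zeros(1)

lr = 0.1        # learning rate (assumed)
losses = []
for step in range(2000):
    # Forward pass
    H = np.tanh(X @ W1 + b1)
    Y = np.tanh(H @ W2 + b2)
    losses.append(0.5 * np.mean((Y - T) ** 2))

    # Backward pass: propagate the error from the output back to the hidden layer.
    # The derivative of tanh(z) is 1 - tanh(z)^2.
    dZ2 = (Y - T) / len(X) * (1.0 - Y ** 2)
    dZ1 = (dZ2 @ W2.T) * (1.0 - H ** 2)

    # Gradient descent step on all weights and biases
    W2 -= lr * (H.T @ dZ2)
    b2 -= lr * dZ2.sum(axis=0)
    W1 -= lr * (X.T @ dZ1)
    b1 -= lr * dZ1.sum(axis=0)
```

The list `losses` records the training error per step, which is a simple way to check that the updates move in the right direction.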

Neural Network Based Face Detection¹

Idea: Use an artificial neural network to detect upright frontal faces
- The network receives a 20x20 pixel region of an image as input
- The output ranges from -1 (no face present) to +1 (face present)
- The neural network "face filter" is applied at every location in the image
- To detect faces of different sizes, the input image is repeatedly scaled down
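
The scanning scheme described above can be sketched as follows. This is a minimal sketch under stated assumptions: `classify` is a hypothetical placeholder for the trained network, the scale factor of 1.2 and the step size are assumed values, and nearest-neighbour downscaling stands in for proper resampling.

```python
import numpy as np

def downscale(img, factor=1.2):
    """Nearest-neighbour downscaling by `factor` (simple stand-in for proper resampling)."""
    h, w = img.shape
    nh, nw = int(h / factor), int(w / factor)
    rows = np.minimum((np.arange(nh) * factor).astype(int), h - 1)
    cols = np.minimum((np.arange(nw) * factor).astype(int), w - 1)
    return img[np.ix_(rows, cols)]

def scan(img, classify, win=20, step=1, factor=1.2):
    """Yield (x, y, size, score) in original-image coordinates for every window."""
    scale = 1.0
    while min(img.shape) >= win:
        h, w = img.shape
        for y in range(0, h - win + 1, step):
            for x in range(0, w - win + 1, step):
                score = classify(img[y:y + win, x:x + win])
                yield x * scale, y * scale, win * scale, score
        img = downscale(img, factor)   # smaller image: the window covers larger faces
        scale *= factor

def placeholder_classifier(window):
    """Hypothetical stand-in for the network: maps mean brightness in [0, 1] to [-1, 1]."""
    return 2.0 * float(window.mean()) - 1.0

image = np.random.default_rng(0).random((40, 40))
results = list(scan(image, placeholder_classifier, step=4))
detections = [r for r in results if r[3] > 0]
```

Because the window size stays fixed at 20x20 while the image shrinks, each further pyramid level corresponds to a larger face in the original image.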

Network Topology

20x20 pixel input retina
4 types of receptive fields for the hidden units
One real-valued output

System Overview

Network Training

Training Set

1050 normalized face images
15 face images generated from each original face image by rotating and scaling
1000 randomly chosen non-face images

Preprocessing

correct for different lighting conditions (overall brightness, shadows)
rescale images to fixed size

Histogram equalization

Defines a mapping of gray levels $p$ into gray levels $q$ such that the distribution of $q$ is close to being uniform
Stretches contrast (expands the range of gray levels)
Transforms different input images so that they have similar intensity distributions (thus reducing the effect of different illumination)
Example
Algorithm
- The probability of an occurrence of a pixel of gray level $i$ in the image: $$ p\left(x_{i}\right)=\frac{n_{i}}{n}, \qquad i \in \{0, \ldots, L-1\} $$
  - $L$: number of gray levels
  - $n_i$: number of occurrences of gray level $i$
  - $n$: total number of pixels in the image
- Define $c$ as the cumulative distribution function: $$ c(i)=\sum_{j=0}^{i} p\left(x_{j}\right) $$
- Create a transformation of the form $$ y_i = T(x_i) = c(i), \qquad y_i \in [0, 1] $$ It produces a level $y$ for each level $x$ in the original image such that the cumulative distribution function of $y$ is approximately linear across the value range.
- Map the result back to the desired output range $[\min, \max]$: $$ y_{i}^{\prime}=y_{i} \cdot(\max -\min )+\min $$
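
The algorithm above translates almost line by line into NumPy. The sketch below assumes an integer-valued gray-level image and an output range of 0 to 255; the function name and the tiny test image are made up for illustration.

```python
import numpy as np

def equalize_histogram(img, levels=256, out_min=0, out_max=255):
    """Histogram equalization for an integer gray-level image with values in [0, levels)."""
    n_i = np.bincount(img.ravel(), minlength=levels)   # occurrences of each gray level
    p = n_i / img.size                                 # p(x_i) = n_i / n
    c = np.cumsum(p)                                   # cumulative distribution c(i)
    y = c * (out_max - out_min) + out_min              # rescale [0, 1] to [min, max]
    lookup = np.round(y).astype(np.uint8)              # one output level per input level
    return lookup[img]

# Low-contrast example: only the gray levels 100..103 occur, each equally often.
img = np.array([[100, 100, 101, 101],
                [102, 102, 103, 103]], dtype=np.uint8)
out = equalize_histogram(img)
```

In this example the four neighbouring input levels are spread over most of the output range, which is the contrast stretching described above.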

Training Procedure

Randomly choose 1000 non-face images
Train network to produce 1 for faces, -1 for non-faces
Run the network on images containing no faces. Collect subimages in which the network incorrectly identifies a face (output > 0)
Select up to 250 of these "false positives" at random and add them to the training set as negative examples
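
The selection of false positives in the last two steps could look like the sketch below. It only covers the sampling step and assumes the network outputs for all scanned non-face windows are already available in `scores`; the function name is hypothetical.

```python
import numpy as np

def select_false_positives(scores, max_new=250, rng=None):
    """Return indices of up to `max_new` randomly chosen windows with output > 0."""
    rng = np.random.default_rng() if rng is None else rng
    candidates = np.flatnonzero(np.asarray(scores) > 0)   # windows wrongly scored as faces
    if len(candidates) > max_new:
        candidates = rng.choice(candidates, size=max_new, replace=False)
    return candidates

# Assumed example: outputs of the network on 1001 windows taken from face-free images.
scores = np.linspace(-1.0, 1.0, 1001)
chosen = select_false_positives(scores, rng=np.random.default_rng(0))
few = select_false_positives([-0.5, 0.2, 0.9])
```

The windows at the returned indices would then be added to the training set with target -1.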

Neural Network Based Face Filter

Output of ANN defines a filter for faces
Search
- Scan input image with search window, apply ANN to search window
- The input image needs to be rescaled in order to detect faces of different sizes
Output needs to be post-processed
- Noise removal
- Merging overlapping detections
Speed-ups can be achieved by
- Increasing the step size
- Making the ANN more tolerant of translation
- Hierarchical, pyramidal search

Localization and Ground-Truth

For localization, the test data is usually annotated with ground-truth bounding boxes
Comparing hypotheses to ground truth
- Overlap $$ O = \frac{|\text{GT} \cap \text{DET}|}{|\text{GT} \cup \text{DET}|} $$
  where GT is the ground-truth box, DET is the detected box, and $|\cdot|$ denotes area. Also called Intersection over Union (IoU)
- A common criterion: a detection counts as correct if the overlap exceeds 50%
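
For axis-aligned bounding boxes the overlap can be computed directly from the corner coordinates. The box format `(x1, y1, x2, y2)` and the example boxes are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2) with x1 < x2, y1 < y2."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)   # zero if the boxes do not intersect
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

gt = (0, 0, 20, 20)       # ground-truth box
det = (5, 5, 25, 25)      # detection shifted by 5 pixels in x and y
overlap = iou(gt, det)    # 225 / 575, roughly 0.39
is_correct = overlap > 0.5
```

A shift of only a quarter of the box width in both directions already drops the overlap below 50%, which shows how strict this criterion is for small faces.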

¹ Neural Network Based Face Detection, by Henry A. Rowley, Shumeet Baluja, and Takeo Kanade. IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 20, number 1, pages 23-38, January 1998.

Last updated on Apr 3, 2022
