Face Detection & Tracking
Definition
Face detection is concerned with finding
whether or not there are any faces in a given image (usually in gray scale)
and, if present, returning the image location and content of each face. This is
the first step of any fully automatic system that analyzes the information
contained in faces (e.g., identity, gender, expression, age, race and pose).
While earlier work dealt mainly with upright frontal faces, several systems
have been developed that are able to detect faces fairly accurately with
in-plane or out-of-plane rotations in real time. Although a face detection
module is typically designed to deal with single images, its performance can be
further improved if a video stream is available.
Main Body Text
Introduction
Advances in computing technology have in recent years facilitated the
development of real-time vision modules that interact with humans. Examples
abound, particularly in biometrics and human-computer interaction, as the
information contained in faces needs to be analyzed for systems to react
accordingly. For biometric systems that use faces as
non-intrusive input modules, it is imperative to locate faces in a scene before
any recognition algorithm can be applied. An intelligent vision-based user
interface should be able to tell the attention focus of the user (i.e., where
the user is looking) in order to
respond accordingly. To detect facial
features accurately for applications such as digital cosmetics, faces need to
be located and registered first to facilitate further processing. It is evident
that face detection plays a critical role in the success of any face
processing system.
The face detection problem is
challenging as it needs to account for all possible appearance variations
caused by changes in illumination, facial features, occlusions, etc. In
addition, it has to detect faces that appear at different scales and poses,
possibly with in-plane rotations. In spite of all these difficulties,
tremendous progress has been
made in the last decade and many systems have shown impressive real-time
performance. Recent advances in these algorithms have also contributed
significantly to detecting other objects such as humans/pedestrians and cars.
Operation of a Face Detection System
Most
detection systems carry out the task by extracting certain properties (e.g.,
local features or holistic intensity patterns) of a set of training images
acquired at a fixed pose (e.g., upright frontal pose) in an off-line setting.
To reduce the effects of illumination change, these images are processed with
histogram equalization [3, 1] or standardization (i.e., zero mean unit
variance) [2]. Based on the extracted properties, these systems typically scan
through the entire image at every possible location and scale in order to
locate faces. The extracted properties can be either manually coded (with human
knowledge) or learned from a set of data as adopted in the recent systems that
have demonstrated impressive results [3, 1, 4, 5, 2]. In order to detect faces
at different scales, the detection process is usually repeated on a pyramid of
images whose resolutions are reduced by a certain factor (e.g., 1.2) from the
original one [3, 1]. Such procedures may be expedited when other visual cues
can be accurately incorporated (e.g., color and motion) as pre-processing steps
to reduce the search space [5]. As faces are often detected across scales, the
raw detections are usually further processed to combine overlapping results
and remove false positives with heuristics (e.g., faces typically do not
overlap in images) [1] or further processing (e.g., edge detection and
intensity variance).
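The standardization step mentioned above (zero mean, unit variance) can be sketched as follows. This is a minimal illustration assuming NumPy; the function name is hypothetical and real systems apply it to each candidate window before classification.

```python
import numpy as np

def standardize(window):
    """Normalize an image window to zero mean and unit variance,
    reducing the effect of global illumination changes."""
    window = window.astype(np.float64)
    std = window.std()
    if std == 0:
        # Flat patch: variance is zero, so only subtract the mean
        return window - window.mean()
    return (window - window.mean()) / std
```

Histogram equalization serves the same purpose by flattening the intensity distribution instead of rescaling its moments; either way, the classifier sees windows in a canonical intensity range.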
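The scan over every location and scale can be sketched as a sliding window applied to each level of an image pyramid. This is an illustrative sketch, not any particular published system: the window size, step, and subsampling-based downscaling are assumptions (real detectors interpolate when resizing), and `classify` stands in for whatever learned classifier the system uses.

```python
import numpy as np

def image_pyramid(image, scale=1.2, min_size=24):
    """Yield successively downscaled copies of a grayscale image,
    each smaller than the last by the given factor (e.g., 1.2)."""
    current = image.astype(np.float64)
    while min(current.shape) >= min_size:
        yield current
        h = int(current.shape[0] / scale)
        w = int(current.shape[1] / scale)
        # Simple subsampling for illustration; real systems interpolate
        rows = (np.arange(h) * scale).astype(int)
        cols = (np.arange(w) * scale).astype(int)
        current = current[np.ix_(rows, cols)]

def scan(image, classify, window=24, step=4, scale=1.2):
    """Slide a fixed-size window over every pyramid level and map
    accepted windows back to original-image coordinates."""
    detections = []
    factor = 1.0  # how much this level was shrunk relative to the original
    for level in image_pyramid(image, scale, window):
        for y in range(0, level.shape[0] - window + 1, step):
            for x in range(0, level.shape[1] - window + 1, step):
                if classify(level[y:y + window, x:x + window]):
                    size = int(window * factor)
                    detections.append((int(x * factor), int(y * factor), size, size))
        factor *= scale
    return detections
```

Shrinking the image while keeping the window fixed is equivalent to growing the window on the original image, which is why a single fixed-size classifier suffices for faces at many scales.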
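One common way to combine overlapping raw detections is greedy non-maximum suppression; the source does not specify the exact merging scheme, so the overlap measure, threshold, and function names below are illustrative assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def merge_detections(boxes, scores, overlap_thresh=0.3):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    then discard any remaining box overlapping a kept one by more
    than the threshold (faces typically do not overlap in images)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= overlap_thresh for j in kept):
            kept.append(i)
    return [boxes[i] for i in kept]
```

A cluster of near-duplicate detections around a true face collapses to its single best-scoring box, while isolated low-overlap detections survive for any further false-positive filtering (e.g., edge or intensity-variance checks).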