MODELLING OF HUMAN AGEING, COMPOUND EMOTIONS, AND INTENSITY FOR AUTOMATIC FACIAL EXPRESSION RECOGNITION

UoM administered thesis: PhD

Author: Nora Al-Garaawi

Abstract

After decades of research, automatic facial expression recognition (AFER) has been shown to work well when restricted to subjects with a limited range of ages, expressions, and intensities of expression. Recognizing expressions across a wide range of ages (including older people), expressions (including compound emotions), and intensities (from a neutral expression to the apex of the target expression) is harder and, to date, has not been studied in depth. This thesis studies the influence of these problems on the accuracy of AFER. Its main concern is to investigate how facial expression recognition can be modelled so that it is robust to the problems under study, yielding a more general and effective solution. Since a face image can be described by texture and shape parameters, the study starts by using texture measurement methods to understand the influence of these problems on face texture features and hence on texture-based AFER. Our first contribution shows that binary robust independent elementary features (BRIEF) (Calonder et al., 2012) can be used to develop a new face descriptor that describes face images and generalizes to new datasets. The BRIEF descriptor can generate discriminative features globally from an image with an explicit shape. However, when it is applied globally to an image with no explicit shape, such as a face image, it is unable to generate discriminative features. We therefore propose to apply BRIEF locally, so that each pixel is evaluated against its neighbourhood to capture the local shape surrounding it. A comprehensive empirical evaluation on three facial expression datasets demonstrates that this model gives satisfactory performance compared with other local face descriptor techniques evaluated on the same datasets.
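The idea of applying BRIEF locally can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the random sampling of test pairs, and the pooling of per-pixel codes into a histogram (borrowed from LBP-style descriptors) are assumptions made for illustration, and the smoothing step of the original BRIEF is omitted.

```python
import random


def make_test_pairs(n_bits, radius, seed=0):
    """Draw n_bits random pairs of pixel offsets inside a
    (2*radius+1) x (2*radius+1) window (hypothetical sampling scheme)."""
    rng = random.Random(seed)

    def offset():
        return (rng.randint(-radius, radius), rng.randint(-radius, radius))

    return [(offset(), offset()) for _ in range(n_bits)]


def local_brief_codes(image, pairs, radius):
    """Compute one binary code per interior pixel of a 2-D list-of-lists image.

    Each bit is a BRIEF-style intensity test between two pixels in the
    neighbourhood of the current pixel: 1 if the first is darker, else 0.
    """
    height, width = len(image), len(image[0])
    codes = []
    for y in range(radius, height - radius):
        row = []
        for x in range(radius, width - radius):
            code = 0
            for (dy1, dx1), (dy2, dx2) in pairs:
                bit = 1 if image[y + dy1][x + dx1] < image[y + dy2][x + dx2] else 0
                code = (code << 1) | bit
            row.append(code)
        codes.append(row)
    return codes


def code_histogram(codes, n_bits):
    """Pool the per-pixel codes into a normalized histogram (assumed pooling)."""
    hist = [0] * (1 << n_bits)
    for row in codes:
        for code in row:
            hist[code] += 1
    total = sum(hist)
    return [count / total for count in hist]


# Usage: on a horizontal intensity ramp, a single "left darker than right"
# test fires at every interior pixel, so every code equals 1.
ramp = [[x for x in range(8)] for _ in range(8)]
ramp_codes = local_brief_codes(ramp, [((0, -1), (0, 1))], radius=1)
```

In a face descriptor of this kind, the histogram would typically be computed per image region and the regional histograms concatenated; that step is left out here.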
The study also shows that the problems under consideration have a significant effect on face texture features and on the accuracy of texture-based AFER. The study is then extended by using shape measurement methods to investigate the influence of these problems on face shape features and hence on shape-based AFER. Our second contribution shows that, by using random forest regression voting in a constrained local model (RFRV-CLM) framework (Cootes et al., 2012; Lindner et al., 2015), we can develop a fully automated facial expression localization (FEL) system that detects facial key points in a multi-stage (coarse-to-fine) scheme and generalizes accurately to new datasets with a wide range of variation in facial appearance. A comprehensive empirical evaluation on five facial expression datasets demonstrates that this model gives excellent agreement with ground truth data and outperforms alternative methods evaluated on the same datasets. The study also shows that the problems under study have a significant effect on the performance of FEL, and that FEL based on RFRV-CLM performs well despite that effect. It also demonstrates that appearance-based AFER (combining shape with texture) gives better results than texture-based AFER. Our final contribution builds on the second: an age-based AFER system that explicitly estimates age group and expression in a single framework. In this system, age information is used as prior knowledge for expression recognition; specifically, apparent age is used, since some people look younger or older than their real age. We show that by combining an age-group classifier with a set of age-specific expression classifiers through a weighted combination rule, we can substantially reduce the influence of age features on expression classification accuracy.
Tested on three age-expression datasets, our novel system gives encouraging results in comparison with state-of-the-art systems that ignore age and with alternative models recently applied to the problem. In summary, the results of the BRIEF-based face descriptor, the RFRV-CLM-based FEL system, and the age-based AFER system are encouraging, and these components could serve as basic building blocks for many face applications in computer vision, such as face detection and face recognition.
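The weighted combination of an age-group classifier with age-specific expression classifiers can be sketched as follows. This is one plausible reading rather than the rule used in the thesis: it assumes the weights are the age-group probabilities, so that the score of each expression is the sum over age groups of P(age group | face) × P(expression | face, age group). The function names, age groups, and probability values are hypothetical.

```python
def combine_age_weighted(age_probs, expr_probs_by_age):
    """Mix age-specific expression probabilities, weighted by age-group probability.

    age_probs:          {age_group: P(age_group | face)}
    expr_probs_by_age:  {age_group: {expression: P(expression | face, age_group)}}
    """
    combined = {}
    for age_group, weight in age_probs.items():
        for expression, prob in expr_probs_by_age[age_group].items():
            combined[expression] = combined.get(expression, 0.0) + weight * prob
    return combined


def predict_expression(age_probs, expr_probs_by_age):
    """Return the expression with the highest combined score."""
    combined = combine_age_weighted(age_probs, expr_probs_by_age)
    return max(combined, key=combined.get)


# Illustrative (made-up) classifier outputs for one face image.
age_probs = {"young": 0.2, "middle": 0.3, "old": 0.5}
expr_probs_by_age = {
    "young": {"happy": 0.7, "sad": 0.3},
    "middle": {"happy": 0.5, "sad": 0.5},
    "old": {"happy": 0.2, "sad": 0.8},
}

scores = combine_age_weighted(age_probs, expr_probs_by_age)
label = predict_expression(age_probs, expr_probs_by_age)
```

Using soft age-group probabilities instead of a single hard age decision means that an uncertain age estimate does not commit the system to one age-specific expression classifier.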

Details

Original language: English
Awarding Institution
Supervisors/Advisors
Award date: 1 Aug 2019