Deep Binary Representation of Facial Expressions
Automatic pain assessment is crucial in clinical diagnosis. Experiencing pain causes deformations in the facial structure that manifest as spontaneous facial expressions. In this paper, we aim to represent these facial expressions as a compact binary code for classifying different pain intensity levels. We divide a given face video into non-overlapping, equal-length segments. Using a Convolutional Neural Network (CNN), we extract features from frames randomly sampled from each segment. The obtained features are aggregated using statistical measures to incorporate both low-level visual patterns and high-level structural information. Finally, this aggregated representation is encoded by a deep network into a single binary code, such that videos with the same pain intensity level have a smaller Hamming distance than videos of different levels. Extensive experiments on the publicly available UNBC-McMaster database demonstrate that our proposed method achieves superior performance compared to the state-of-the-art.
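To make the described pipeline concrete, the following is a minimal sketch of the three stages (segment-wise frame sampling with CNN features, statistical aggregation, and deep binary encoding). The backbone choice, segment length, code length, and the use of mean/standard-deviation statistics with a tanh relaxation of the sign function are illustrative assumptions, not the exact configuration of the paper.

```python
# Illustrative sketch only; architecture and hyperparameters are assumptions.
import torch
import torch.nn as nn


def extract_segment_features(frames, backbone, frames_per_segment=8):
    """Split a video tensor (T, C, H, W) into non-overlapping equal-length
    segments, randomly sample one frame per segment, and extract CNN features."""
    num_segments = frames.shape[0] // frames_per_segment
    sampled = []
    for s in range(num_segments):
        start = s * frames_per_segment
        idx = start + torch.randint(frames_per_segment, (1,)).item()
        sampled.append(frames[idx])
    with torch.no_grad():
        feats = backbone(torch.stack(sampled))  # (num_segments, feature_dim)
    return feats


def aggregate_statistics(feats):
    """Aggregate per-segment features with simple first- and second-order
    statistics (mean and standard deviation here; the paper's exact
    statistics may differ)."""
    return torch.cat([feats.mean(dim=0), feats.std(dim=0)])


class DeepBinaryEncoder(nn.Module):
    """Map the aggregated descriptor to a compact binary code.
    tanh acts as a smooth surrogate of the sign function during training;
    a pairwise loss on the relaxed codes would encourage videos with the
    same pain level to have a small Hamming distance."""

    def __init__(self, in_dim, code_bits=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, code_bits), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

    def binarize(self, x):
        # Final binary code: sign of the relaxed output.
        return (self.forward(x) > 0).float()
```

At test time, the binary code of a query video would be compared against codes of labeled videos by Hamming distance to predict its pain intensity level; this usage pattern is inferred from the abstract rather than stated explicitly.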