HRUNet: Assessing Uncertainty in Heart Rates Measured From Facial Videos

Video-based photoplethysmography (VPPG) can measure heart rate (HR) from facial videos. However, the reliability of the extracted HR values remains unclear, especially when the videos are affected by disturbances. To address this, we introduce a framework for VPPG-based HR measurement that captures distinct sources of uncertainty in the predicted HR values. A neural network named HRUNet extracts HR from input facial videos. Rather than learning specific weight (and bias) values as in conventional training, we use Bayesian posterior estimation to derive weight distributions within HRUNet. Sampling from these distributions encodes the uncertainty stemming from HRUNet’s limited performance. On this basis, we redefine HRUNet’s output as a distribution over potential HR values, rather than the single most probable HR value, in order to capture the uncertainty arising from inherent noise in the input video. HRUNet is evaluated on 1,098 videos from seven datasets, spanning three scenarios: undisturbed, motion-disturbed, and light-disturbed. The test results show that uncertainty in the HR measurements increases significantly in the disturbed scenarios compared with the undisturbed scenario. Moreover, HRUNet outperforms state-of-the-art methods in HR accuracy when HR values with uncertainty above 0.4 are excluded. This indicates that uncertainty is an informative indicator of potentially erroneous HR measurements. With this improved reliability, the VPPG technique holds promise for applications in safety-critical domains.
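
The two uncertainty sources described above (model-related uncertainty from sampled weights, and noise-related uncertainty from a distributional output) can be illustrated with a minimal sketch. This is not the HRUNet implementation: it assumes that each weight sample yields a categorical distribution over candidate HR bins, and that uncertainty is scored as entropy normalized to [0, 1]. The abstract does not specify the metric behind the 0.4 cutoff, so the function names, the entropy-based score, and the use of `threshold=0.4` on that score are all hypothetical choices made for illustration.

```python
import numpy as np


def decompose_uncertainty(probs):
    """Split predictive uncertainty into noise- and model-related parts.

    probs: array of shape (S, K), one row per weight sample, one column per
    candidate HR bin; each row is a probability distribution.
    Returns (total, aleatoric, epistemic), each divided by log(K) so that
    total lies in [0, 1]. The entropy-based score is an assumption.
    """
    probs = np.asarray(probs, dtype=float)
    eps = 1e-12  # guards against log(0)
    mean_probs = probs.mean(axis=0)
    # Total: entropy of the prediction averaged over weight samples.
    total = -np.sum(mean_probs * np.log(mean_probs + eps))
    # Aleatoric (input noise): average entropy of the per-sample predictions.
    aleatoric = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    # Epistemic (model): disagreement between samples (mutual information).
    epistemic = total - aleatoric
    norm = np.log(probs.shape[1])
    return total / norm, aleatoric / norm, epistemic / norm


def predict_hr(probs, hr_bins, threshold=0.4):
    """Return (hr, total_uncertainty, accepted) for one video segment.

    hr_bins: array of K candidate HR values in beats per minute.
    threshold: hypothetical cutoff applied to the normalized total score.
    """
    probs = np.asarray(probs, dtype=float)
    total, _, _ = decompose_uncertainty(probs)
    hr = float(np.asarray(hr_bins)[np.argmax(probs.mean(axis=0))])
    return hr, total, bool(total <= threshold)


if __name__ == "__main__":
    hr_bins = np.array([60.0, 70.0, 80.0, 90.0])
    # Three weight samples that agree on a sharp prediction.
    confident = np.array([[0.97, 0.01, 0.01, 0.01]] * 3)
    # Two weight samples that are each sharp but disagree with each other.
    disagree = np.array([[0.97, 0.01, 0.01, 0.01],
                         [0.01, 0.01, 0.01, 0.97]])
    print(predict_hr(confident, hr_bins))
    print(predict_hr(disagree, hr_bins))
```

In this sketch, samples that agree on a sharp prediction give a low total score and the HR value is kept, whereas sharp but conflicting samples give a high model-related score and the value is rejected. A flat distribution from every sample would instead raise the noise-related term.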