arXiv Preprint
Remote photoplethysmography (rPPG) enables non-contact heart rate (HR)
estimation from facial videos, offering significant convenience compared with
traditional contact-based measurements. In real-world long-term health
monitoring scenarios, the distance between camera and participants and their
head movements usually vary over time, resulting in inaccurate rPPG
measurement due to varying face resolution and complex motion artifacts.
Unlike previous rPPG models designed for a constant distance between camera
and participants, in this paper we propose two plug-and-play blocks, a
physiological signal feature extraction block (PFE) and a temporal face
alignment block (TFA), to alleviate the degradation caused by changing
distance and head motion. On the one hand, guided by representative-area
information, PFE adaptively encodes arbitrary-resolution facial frames into
fixed-resolution facial structure features. On the other hand, leveraging the
estimated optical flow, TFA counteracts the rPPG signal confusion caused by
head movement and thus benefits motion-robust rPPG signal recovery. In
addition, we train the model with a cross-resolution constraint using a
two-stream dual-resolution framework, which further helps PFE learn
resolution-robust facial rPPG features. Extensive experiments on three
benchmark datasets (UBFC-rPPG, COHFACE, and PURE) demonstrate the superior
performance of the proposed method. One highlight is that, with PFE and TFA,
off-the-shelf spatio-temporal rPPG models can predict more robust rPPG
signals under both varying face resolution and severe head movement
scenarios. The code is available at
https://github.com/LJW-GIT/Arbitrary_Resolution_rPPG.
Jianwei Li, Zitong Yu, Jingang Shi
2022-11-30
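
For readers unfamiliar with the task, HR is typically read out from a
recovered rPPG signal by locating the dominant frequency within the plausible
heart-rate band. This post-processing step is standard practice in rPPG
evaluation; the abstract does not specify the paper's exact protocol, so the
sketch below shows only the generic approach, with hypothetical names.

```python
# Generic HR readout from a 1-D rPPG signal via the spectral peak.
# This is standard rPPG post-processing, not the paper's specific pipeline.
import numpy as np

def estimate_hr_bpm(signal, fs, lo_hz=0.7, hi_hz=4.0):
    """Estimate heart rate (bpm) from a 1-D rPPG signal.

    fs: sampling rate in Hz (the video frame rate).
    lo_hz/hi_hz: plausible HR band, here 42-240 bpm.
    """
    signal = signal - np.mean(signal)
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    peak_hz = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak_hz

fs = 30.0                          # typical camera frame rate
t = np.arange(0, 10, 1.0 / fs)
sig = np.sin(2 * np.pi * 1.2 * t)  # 1.2 Hz pulse -> 72 bpm
print(estimate_hr_bpm(sig, fs))    # ~72.0
```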
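
Below is a minimal PyTorch sketch of the PFE idea described in the abstract:
encoding facial frames of arbitrary resolution into fixed-resolution
features. The layer configuration, the learned spatial-attention stand-in for
representative-area guidance, and the use of adaptive average pooling are all
assumptions made for this sketch; the paper's actual PFE design may differ.

```python
# Sketch of arbitrary-resolution -> fixed-resolution feature encoding.
# Module and parameter names are hypothetical, not the paper's.
import torch
import torch.nn as nn

class PFESketch(nn.Module):
    def __init__(self, in_channels=3, feat_channels=64, out_size=(64, 64)):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, feat_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(feat_channels),
            nn.ReLU(inplace=True),
        )
        # "Representative-area" guidance approximated as a learned spatial
        # attention map (an assumption, not the paper's exact mechanism).
        self.attention = nn.Sequential(
            nn.Conv2d(feat_channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )
        # Adaptive pooling maps any input resolution to a fixed spatial size.
        self.pool = nn.AdaptiveAvgPool2d(out_size)

    def forward(self, x):
        # x: (B, C, H, W) with arbitrary H and W
        feat = self.encoder(x)
        feat = feat * self.attention(feat)  # emphasize representative areas
        return self.pool(feat)              # (B, feat_channels, 64, 64)

pfe = PFESketch()
frames_hi = torch.randn(2, 3, 128, 128)
frames_lo = torch.randn(2, 3, 48, 48)
print(pfe(frames_hi).shape, pfe(frames_lo).shape)  # both (2, 64, 64, 64)
```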
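
Similarly, a minimal sketch of the flow-guided alignment underlying TFA:
backward-warping a frame by an optical flow field via bilinear sampling,
which is a standard technique. How the paper integrates the warped frames
with the rPPG backbone is not specified in the abstract, so only the generic
warping step is shown; the function name is hypothetical.

```python
# Standard backward warping with an optical flow field, as used by many
# alignment modules; the paper's TFA may differ in its details.
import torch
import torch.nn.functional as F

def warp_with_flow(frame, flow):
    """Backward-warp `frame` (B, C, H, W) by `flow` (B, 2, H, W) in pixels."""
    b, _, h, w = frame.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=frame.dtype),
        torch.arange(w, dtype=frame.dtype),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0).to(frame.device)
    coords = grid + flow  # per-pixel source coordinates, (B, 2, H, W)
    # Normalize pixel coordinates to [-1, 1] as required by grid_sample.
    coords_x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    sample_grid = torch.stack((coords_x, coords_y), dim=-1)  # (B, H, W, 2)
    return F.grid_sample(frame, sample_grid, align_corners=True)

frame = torch.randn(1, 3, 64, 64)
flow = torch.zeros(1, 2, 64, 64)        # zero flow -> identity warp
aligned = warp_with_flow(frame, flow)
print(torch.allclose(aligned, frame, atol=1e-5))  # True
```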
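
Finally, a sketch of the two-stream dual-resolution training idea: the same
clip is passed through a shared model at full and reduced resolution, and the
two predictions are encouraged to agree. The toy backbone and the
mean-squared-error consistency loss are assumptions for illustration; the
paper's actual cross-resolution constraint may take a different form.

```python
# Cross-resolution consistency between two streams of one clip.
# TinyRPPGNet is a toy stand-in for a spatio-temporal rPPG backbone.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRPPGNet(nn.Module):
    """Toy backbone that accepts any spatial size (via adaptive pooling)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv3d(3, 8, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool3d((None, 1, 1))  # keep time, pool space
        self.head = nn.Conv3d(8, 1, kernel_size=1)

    def forward(self, clip):            # clip: (B, 3, T, H, W)
        x = torch.relu(self.conv(clip))
        x = self.pool(x)                # (B, 8, T, 1, 1)
        return self.head(x).flatten(1)  # (B, T) predicted rPPG signal

def cross_resolution_loss(model, clip):
    """Consistency between full- and half-resolution streams of one clip."""
    low = F.interpolate(clip, scale_factor=(1, 0.5, 0.5),
                        mode="trilinear", align_corners=False)
    return F.mse_loss(model(clip), model(low))

model = TinyRPPGNet()
clip = torch.randn(2, 3, 16, 64, 64)
print(cross_resolution_loss(model, clip).item())
```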