In Journal of Imaging
OBJECTIVE : Continuous patient activity monitoring with video cameras and computer models is complicated by variable image quality caused by poor lighting conditions and low image resolutions. Few studies have assessed how image resolution, color depth, noise level, and low light affect the inference of eye opening and closing and of body landmarks from digital images.
METHOD : This study systematically assessed the effects of image resolution (from 100 × 100 pixels to 20 × 20 pixels in steps of 10 pixels), lighting conditions (from 42 to 2 lux in steps of 2 lux), color depth (from 16.7 M colors to 8 M, 1 M, 512 K, 216 K, 64 K, 8 K, 1 K, 729, 512, 343, 216, 125, 64, 27, and 8 colors), and noise level on the accuracy and model performance of eye dimension estimation and body keypoint localization. The Dlib library and OpenPose were applied to images from the Closed Eyes in the Wild and COCO datasets, as well as to photographs of the face captured at different light intensities.
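The abstract does not state how the images were degraded. The listed color depths are consistent with keeping k uniform levels per RGB channel, which gives k³ colors (for example, 7³ = 343 and 6³ = 216), so the following is a minimal sketch under that assumption. The function names `quantize_colors` and `downsample` are hypothetical, and nearest-neighbor resizing is an illustrative choice rather than the authors' documented method.

```python
import numpy as np

def quantize_colors(img, levels):
    """Reduce an 8-bit image to `levels` values per channel.

    For an RGB image this leaves at most levels**3 distinct colors.
    Assumption: uniform per-channel quantization (not stated in the abstract).
    """
    if not 2 <= levels <= 256:
        raise ValueError("levels must be between 2 and 256")
    # Map each 0..255 value to a bin index in 0..levels-1.
    idx = (img.astype(np.uint32) * levels) // 256
    # Spread the bin indices evenly back over 0..255.
    return ((idx * 255) // (levels - 1)).astype(np.uint8)

def downsample(img, size):
    """Nearest-neighbor resize of an (H, W, C) image to (size, size, C)."""
    h, w = img.shape[:2]
    rows = (np.arange(size) * h) // size
    cols = (np.arange(size) * w) // size
    return img[rows[:, None], cols]

# Example: a random 100 x 100 image reduced to 60 x 60 pixels and 343 colors.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(100, 100, 3), dtype=np.uint8)
degraded = quantize_colors(downsample(image, 60), 7)
```

Sweeping `size` over 100, 90, ..., 20 and `levels` over the per-channel values implied by the color list would reproduce the shape of the experimental grid, though not necessarily the exact pixel values used in the study.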
RESULTS : The model accuracy and rate of model failure remained acceptable at an image resolution of 60 × 60 pixels, a color depth of 343 colors, a light intensity of 14 lux, and a Gaussian noise level of 4% (i.e., 4% of pixels replaced by Gaussian noise).
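The noise level above is defined as the percentage of pixels replaced by Gaussian noise. A minimal sketch of that definition follows. The function name `replace_with_gaussian_noise` and the noise mean and standard deviation are assumptions for illustration, since the abstract does not give the distribution parameters.

```python
import numpy as np

def replace_with_gaussian_noise(img, fraction, mean=127.5, std=50.0, seed=None):
    """Replace a random `fraction` of pixels with Gaussian-distributed values.

    `mean` and `std` are assumed values; the abstract does not specify them.
    Works for (H, W) grayscale and (H, W, C) color images with 8-bit data.
    """
    rng = np.random.default_rng(seed)
    h, w = img.shape[:2]
    n = int(round(fraction * h * w))
    # Pick n distinct pixel positions.
    flat = rng.choice(h * w, size=n, replace=False)
    rows, cols = np.unravel_index(flat, (h, w))
    out = img.copy()
    # Draw one noise value per selected pixel and channel, clipped to 0..255.
    noise = rng.normal(mean, std, size=(n,) + img.shape[2:])
    out[rows, cols] = np.clip(noise, 0, 255).astype(img.dtype)
    return out

# Example: the 4% noise level reported as still acceptable.
clean = np.zeros((100, 100, 3), dtype=np.uint8)
noisy = replace_with_gaussian_noise(clean, 0.04, seed=0)
```

Replacing whole pixels, rather than adding noise to every pixel, matches the wording "4% of pixels replaced by Gaussian noise"; an additive-noise variant would be a different experiment.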
CONCLUSIONS : The Dlib and OpenPose models failed to detect eye dimensions and body keypoints only at low image resolutions, low light intensities, and low color depths.
CLINICAL IMPACT : The baseline threshold values established in this study will be useful for future work applying computer vision to continuous patient monitoring.
Ye Run Zhou, Subramanian Arun, Diedrich Daniel, Lindroth Heidi, Pickering Brian, Herasevich Vitaly
2022-Dec-19
deep learning, facial feature extraction, image quality, pose estimation