Computer Methods and Programs in Biomedicine
BACKGROUND: Reliable anticipation of a difficult airway can notably enhance safety during anaesthesia. In current practice, clinicians screen patients at the bedside using manual measurements of their morphology.
OBJECTIVE: To develop and evaluate algorithms for the automated extraction of orofacial landmarks that characterize airway morphology.
METHODS: We defined 27 frontal + 13 lateral landmarks. We collected n=317 pairs of pre-surgery photographs from patients undergoing general anaesthesia (140 females, 177 males). As the ground-truth reference for supervised learning, landmarks were independently annotated by two anaesthesiologists. We trained two ad-hoc deep convolutional neural network (DCNN) architectures based on InceptionResNetV2 (IRNet) and MobileNetV2 (MNet) to predict simultaneously: (a) whether each landmark is visible or not (occluded, out of frame); and (b) its 2D coordinates (x, y). We implemented successive stages of transfer learning combined with data augmentation, adding custom top layers whose weights were fully tuned for our application, as sketched below. Landmark-extraction performance was evaluated by 10-fold cross-validation (CV) and compared against five state-of-the-art deformable models.
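A minimal sketch of such a dual-output network follows, assuming a tf.keras implementation (the abstract does not name the framework; the top-layer sizes, dropout rate, optimizer, and loss choices are illustrative assumptions, not the authors' exact configuration):

    # Hypothetical sketch of the dual-head landmark network described in METHODS.
    import tensorflow as tf
    from tensorflow.keras import layers, Model

    N_LANDMARKS = 27  # frontal view; the lateral model would use 13

    def build_landmark_model(n_landmarks: int = N_LANDMARKS) -> Model:
        # Pre-trained backbone for transfer learning; MobileNetV2 could be
        # swapped in for InceptionResNetV2 with a 224x224 input.
        base = tf.keras.applications.InceptionResNetV2(
            include_top=False, weights="imagenet", input_shape=(299, 299, 3)
        )
        base.trainable = False  # first transfer-learning stage: freeze the backbone

        x = layers.GlobalAveragePooling2D()(base.output)
        x = layers.Dense(512, activation="relu")(x)  # custom top layers (fully tuned)
        x = layers.Dropout(0.3)(x)

        # Head (a): per-landmark visibility (visible vs. occluded/out of frame)
        visibility = layers.Dense(n_landmarks, activation="sigmoid", name="visibility")(x)
        # Head (b): per-landmark normalized 2D coordinates (x, y)
        coords = layers.Dense(2 * n_landmarks, activation="sigmoid", name="coords")(x)

        model = Model(inputs=base.input, outputs=[visibility, coords])
        model.compile(
            optimizer=tf.keras.optimizers.Adam(1e-3),
            loss={"visibility": "binary_crossentropy", "coords": "mse"},
        )
        return model

In a later transfer-learning stage, one would typically set base.trainable = True and continue training with a lower learning rate, mirroring the "successive stages" the abstract describes.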
RESULTS: With the annotators' consensus as the 'gold standard', our IRNet-based network performed comparably to humans in the frontal view: median CV loss L=1.277·10⁻³, inter-quartile range (IQR) [1.001, 1.660]; versus median 1.360, IQR [1.172, 1.651], and median 1.352, IQR [1.172, 1.619], for each annotator against consensus, respectively. MNet yielded slightly worse results: median 1.471, IQR [1.139, 1.982]. In the lateral view, both networks attained statistically poorer performance than humans: median CV loss L=2.141·10⁻³, IQR [1.676, 2.915], and median 2.611, IQR [1.898, 3.535], respectively; versus median 1.507, IQR [1.188, 1.988], and median 1.442, IQR [1.147, 2.010], for the two annotators. However, standardized effect sizes in CV loss were small: 0.0322 and 0.0235 (non-significant) for IRNet; 0.1431 and 0.1518 (p<0.05) for MNet; hence the networks remained quantitatively similar to humans. The best-performing state-of-the-art model (a deformable regularized Supervised Descent Method, SDM) behaved comparably to our DCNNs in the frontal scenario, but markedly worse in the lateral view.
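As a minimal sketch of how such a comparison could be computed from per-sample CV losses, assuming a Mann-Whitney/Wilcoxon rank test and the standardized effect size r = Z/√N (the abstract does not specify which test or effect-size measure was used; compare_losses is a hypothetical helper):

    # Hypothetical comparison of per-sample CV losses: network vs. annotator.
    import numpy as np
    from scipy import stats

    def compare_losses(model_losses: np.ndarray, annotator_losses: np.ndarray):
        med = np.median(model_losses)
        q1, q3 = np.percentile(model_losses, [25, 75])  # inter-quartile range (IQR)
        u, p = stats.mannwhitneyu(model_losses, annotator_losses,
                                  alternative="two-sided")
        # Recover Z from the U statistic (no tie correction, for simplicity)
        # and derive the standardized effect size r = |Z| / sqrt(N).
        n1, n2 = len(model_losses), len(annotator_losses)
        mu_u = n1 * n2 / 2.0
        sigma_u = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
        z = (u - mu_u) / sigma_u
        r = abs(z) / np.sqrt(n1 + n2)
        return med, (q1, q3), p, r

Under this convention, r values near 0.03 (as for IRNet) would indicate a negligible difference even where medians differ, consistent with the "quantitatively similar to humans" reading above.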
CONCLUSIONS: We successfully trained two DCNN models to recognize the 27 + 13 orofacial landmarks pertaining to the airway. Using transfer learning and data augmentation, they generalized without overfitting, reaching expert-like performance in CV. Our IRNet-based methodology achieved satisfactory identification and location of landmarks, particularly in the frontal view, where it performed at the level of anaesthesiologists. In the lateral view its performance degraded, although with a non-significant effect size. Independent authors have also reported lower performance in the lateral view, as certain landmarks may not be clearly salient points, even to a trained human eye.
García-García Fernando, Lee Dae-Jin, Mendoza-Garcés Francisco J, Irigoyen-Miró Sofía, Legarreta-Olabarrieta María J, García-Gutiérrez Susana, Arostegui Inmaculada
2023-Feb-25
Anaesthesia, Deep learning, Difficult airway, Facial landmarks, Transfer learning