In Future science OA
Aim : We propose a method for screening full blood count metadata for evidence of communicable and noncommunicable diseases using machine learning (ML).
Materials & methods : High-dimensional hematology metadata from 43,761 patients were extracted from Sysmex hematology analyzers over an 11-month period. Predictive models for age, sex and individuality were developed to demonstrate the personalized nature of hematology data. Numeric and raw flow cytometry data were used in both supervised and unsupervised ML to predict the presence of pneumonia, urinary tract infection and COVID-19. Heart failure was used as an additional prediction target to demonstrate the generalizability of the method.
Results : Chronological age was predicted by a deep neural network with R²: 0.59, mean absolute error: 12; sex with area under the receiver operating characteristic curve (AUROC): 0.83, phi: 0.47; individuality with 99.7% accuracy, phi: 0.97; pneumonia with AUROC: 0.74, sensitivity 58%, specificity 79%, 95% CI: 0.73-0.75, p < 0.0001; urinary tract infection with AUROC: 0.68, sensitivity 52%, specificity 79%, 95% CI: 0.67-0.68, p < 0.0001; COVID-19 with AUROC: 0.8, sensitivity 82%, specificity 75%, 95% CI: 0.79-0.8, p = 0.0006; and heart failure with AUROC: 0.78, sensitivity 72%, specificity 72%, 95% CI: 0.77-0.78, p < 0.0001.
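For readers less familiar with the metrics reported above, the following is a minimal sketch of how sensitivity, specificity, the phi coefficient and AUROC are conventionally computed for a binary classifier. It is not the authors' pipeline: the function names `binary_metrics` and `auroc` are hypothetical, and the counts and scores in the usage lines are invented toy values, not study data.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, phi) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    # Phi coefficient (equivalent to the Matthews correlation coefficient
    # for a 2x2 table); defined as 0 here when any margin is empty.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    phi = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, phi


def auroc(scores, labels):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only (assumptions, not taken from the study).
sens, spec, phi = binary_metrics(tp=72, fp=28, tn=72, fn=28)
area = auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The pairwise AUROC shown here is quadratic in the number of cases, which keeps the definition visible; for a cohort of tens of thousands of patients a rank-based implementation would be the more practical choice.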
Conclusion : ML applied to hematology data could predict communicable and noncommunicable diseases at both local and global levels.
Gladding Patrick A, Ayar Zina, Smith Kevin, Patel Prashant, Pearce Julia, Puwakdandawa Shalini, Tarrant Dianne, Atkinson Jon, McChlery Elizabeth, Hanna Merit, Gow Nick, Bhally Hasan, Read Kerry, Jayathissa Prageeth, Wallace Jonathan, Norton Sam, Kasabov Nick, Calude Cristian S, Steel Deborah, Mckenzie Colin
COVID-19, biological age, full blood count, heart failure, hematology, machine learning, pneumonia