In Schizophrenia Research; h5-index 61.0
Numerous studies have applied machine learning to neuroimaging data to identify individuals with a schizophrenia diagnosis. However, inconsistent results have limited the ability of the psychiatric community to objectively judge and accept the value of this approach. One factor that has contributed to the inconsistency, but has long been ignored, is randomness in the practice of machine learning: executing the same machine learning pipeline multiple times on the same dataset can yield different results. In the current study, a dataset of anatomical MRI scans from 158 patients with first-episode medication-naïve schizophrenia and 166 matched controls was used to investigate the effect of randomness on classifier performance estimates under different levels of algorithm complexity and different data splitting ratios. The maximum discriminatory accuracy that could be reached was 62.6% ± 4.7% (43.5%-79.3%), obtained when using extra-trees classifiers without feature normalization. Regions contributing to discrimination were located in the bilateral temporal lobes and the right frontal lobe. The results show that randomness has a significant impact on the precision of model performance estimates, especially when the test set is small. Current neuroimaging feature engineering combined with machine learning still falls short of being able to make diagnoses in the clinical context, but has value in revealing patterns of regional brain alteration associated with the illness. The current results indicate that the effects of randomness on model performance should be reported and considered when interpreting model utility, and that models must be evaluated on large test sets to obtain valid estimates of model performance.
Sun Huaiqiang, Lui Su, Huang Xiaoqi, Sweeney John, Gong Qiyong
2023-Jan-20
Machine learning, Performance, Psychoradiology, Randomness, Schizophrenia
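
The abstract's point that accuracy estimates become less precise as the test set shrinks can be illustrated with a small simulation. The sketch below is not the authors' pipeline: it trains no classifier and uses no imaging data. It assumes a hypothetical classifier whose true accuracy is fixed at the reported mean of 0.626, and it assumes illustrative test fractions (0.1 to 0.5) applied to the 324 subjects mentioned in the abstract. It captures only the variability that comes from which subjects land in the test set, not the randomness introduced during training.

```python
import random
import statistics

N_SUBJECTS = 158 + 166                  # patients + controls, from the abstract
TRUE_ACCURACY = 0.626                   # assumption: fixed at the reported mean accuracy
TEST_FRACTIONS = [0.1, 0.2, 0.3, 0.5]   # hypothetical split ratios, for illustration only


def simulate_accuracy_estimates(true_accuracy, n_test, n_runs, seed=0):
    """Return n_runs simulated accuracy estimates, each from a test set of size n_test."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_runs):
        # Each test subject is classified correctly with probability true_accuracy.
        correct = sum(rng.random() < true_accuracy for _ in range(n_test))
        estimates.append(correct / n_test)
    return estimates


def summarize(n_runs=1000, seed=0):
    """Summarize the spread of accuracy estimates for each hypothetical test fraction."""
    rows = []
    for frac in TEST_FRACTIONS:
        n_test = round(N_SUBJECTS * frac)
        est = simulate_accuracy_estimates(TRUE_ACCURACY, n_test, n_runs, seed)
        rows.append((frac, n_test, statistics.mean(est),
                     statistics.stdev(est), min(est), max(est)))
    return rows


if __name__ == "__main__":
    print("frac  n_test  mean   sd     min    max")
    for frac, n_test, mean, sd, lo, hi in summarize():
        print(f"{frac:.1f}   {n_test:5d}  {mean:.3f}  {sd:.3f}  {lo:.3f}  {hi:.3f}")
```

Under these assumptions the standard deviation of the estimate is roughly sqrt(p(1 - p) / n) for true accuracy p and test-set size n, so the spread across runs narrows as the test set grows. In a real pipeline, randomness in training (for example, the seed of an extra-trees classifier) would add further variability on top of this.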