In Computers in Biology and Medicine
Non-Small Cell Lung Cancer (NSCLC) exhibits intrinsic heterogeneity at the molecular level that aids in distinguishing between its two prominent subtypes: Lung Adenocarcinoma (LUAD) and Lung Squamous Cell Carcinoma (LUSC). This paper proposes a novel explainable AI (XAI)-based deep learning framework to discover a small set of NSCLC biomarkers. The proposed framework comprises three modules: an autoencoder to shrink the input feature space, a feed-forward neural network to classify NSCLC instances into LUAD and LUSC, and a biomarker discovery module that leverages the combined network comprising the autoencoder and the feed-forward neural network. In the biomarker discovery module, XAI methods uncover a set of 52 relevant biomarkers for NSCLC subtype classification. To evaluate the classification performance of the discovered biomarkers, multiple machine-learning models are constructed using these biomarkers. Using 10-fold cross-validation, a Multilayer Perceptron achieves an accuracy of 95.74% (±1.27) at a 95% confidence interval. Further, using the Drug-Gene Interaction Database, we observe that 14 of the discovered biomarkers are druggable. In addition, 28 biomarkers aid in predicting patient survivability. Of the 52 discovered biomarkers, 45 have been reported in previous studies on distinguishing between the two NSCLC subtypes. To the best of our knowledge, the remaining seven biomarkers have not yet been reported for NSCLC subtyping and could be further explored for their contribution to targeted therapy of lung cancer.
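The abstract reports accuracy as a mean with a ± half-width at a 95% confidence interval over 10 folds, but does not say how the interval was computed. The sketch below shows one common way to produce such a figure, assuming a Student's t interval on the per-fold accuracies. The function name `mean_with_ci` and the fold accuracies are hypothetical and are not taken from the paper.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 9 degrees of freedom,
# which corresponds to 10 cross-validation folds.
T_CRIT_95_DF9 = 2.262


def mean_with_ci(fold_accuracies, t_crit=T_CRIT_95_DF9):
    """Return (mean, half_width) of a t-based confidence interval.

    Assumption: the interval is built from the sample standard deviation
    of the per-fold accuracies; the paper may have used another method.
    """
    n = len(fold_accuracies)
    mean = statistics.mean(fold_accuracies)
    # Standard error of the mean, using the sample (n - 1) standard deviation.
    sem = statistics.stdev(fold_accuracies) / math.sqrt(n)
    return mean, t_crit * sem


# Hypothetical per-fold accuracies in percent (illustrative only).
fold_accuracies = [94.1, 96.3, 95.0, 97.2, 93.8, 96.9, 95.5, 94.7, 97.0, 96.2]

mean_acc, half_width = mean_with_ci(fold_accuracies)
print(f"{mean_acc:.2f}% (±{half_width:.2f})")
```

The default critical value only applies to 10 folds; for a different number of folds, pass the matching `t_crit` explicitly.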
Dwivedi Kountay, Rajpal Ankit, Rajpal Sheetal, Agarwal Manoj, Kumar Virendra, Kumar Naveen
2023-Jan-12
Biomarkers, Classification, Explainable AI, Machine learning, Neural network, Non-Small Cell Lung Cancer