In Frontiers in Bioengineering and Biotechnology
One of the most well-known cancer subtypes worldwide is triple-negative breast cancer (TNBC), which has a poor prognosis because of its aggressive biological behavior and its lack of therapeutic targets. Long non-coding RNAs (lncRNAs) exert profound biological functions and are widely applied as prognostic features in cancer. The current work uses computational approaches to explore the expression profiles and possible roles of lncRNAs in TNBC, with the aim of identifying a prognostic lncRNA signature. First, we filtered out samples with inadequate tumor purity and retrieved lncRNA expression data from the TANRIC database. TNBC patients were divided into two prognostic classes according to their survival time (shorter or longer than 3 years). Random forest was used to select lncRNA features based on the differential expression of lncRNAs between the shorter- and longer-survival groups, and stochastic gradient boosting was used to construct the predictive model. In total, 353 lncRNAs were differentially expressed between the two groups. Recursive feature elimination then narrowed these down to two lncRNAs. The model trained with stochastic gradient boosting reached a highest accuracy of 69.69% and an area under the curve (AUC) of 0.6475. Our findings suggest that the two-lncRNA signature may serve as a potential biomarker for the prognostic grouping of TNBC patients. Many lncRNAs are dysregulated in TNBC, and most of them likely play a role in cancer biology. Some of these lncRNAs were linked to TNBC prognosis, which makes them promising biomarker candidates.
Kaushik Aman Chandra, Mehmood Aamir, Wang Xiangeng, Wei Dong-Qing, Dai Xiaofeng
lncRNA, long non-coding RNA, mRNA, machine learning, miRNA, triple-negative breast cancer
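
The abstract groups patients by a 3-year survival cutoff and reports accuracy and area under the curve (AUC). The sketch below illustrates only those two steps in plain Python; it is not the authors' code. The function names, the toy survival times and scores, the 0.5 decision threshold, and the choice to place a survival time of exactly 3 years in the shorter group are all assumptions made for illustration.

```python
def label_by_survival(survival_years, cutoff=3.0):
    """Return 1 for the longer-survival group (> cutoff years), else 0.

    Assumption: a survival time exactly equal to the cutoff is labeled 0.
    """
    return [1 if t > cutoff else 0 for t in survival_years]


def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample from each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
survival_years = [1.2, 4.5, 2.8, 6.0, 3.5, 0.9]
scores = [0.30, 0.80, 0.55, 0.70, 0.40, 0.20]  # model scores for "longer"

y_true = label_by_survival(survival_years)
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy = {acc:.4f}, AUC = {auc:.4f}")
```

Accuracy depends on the chosen decision threshold, whereas AUC summarizes how well the scores rank patients across all thresholds, which is why the two figures are usually reported together.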