In Journal of bioinformatics and computational biology
Gleason score (GS) is a powerful prognostic factor in prostate cancer (PCa). In a GS-7 tumor, the primary Gleason (architectural) pattern and the second most prevalent pattern are typically graded 3 and 4 (or 4 and 3), respectively. Because different patterns are well known to occur multifocally within a single tumor, a biological sample from a GS-7 tumor used in a molecular experiment is uncertain with respect to the pattern it actually represents unless special attention is given to specimen preparation. In this study, through an integrative analysis of several published gene expression datasets, one of which profiles paired GP-3 (Gleason pattern 3) and GP-4 (Gleason pattern 4) specimens from 13 GS-7 tumors, we demonstrate that such uncertainty is frequently observed in published data. More specifically, our results suggest that the GS-7 specimens used to generate the frequently cited The Cancer Genome Atlas (TCGA) data and the Gene Expression Omnibus (GEO) dataset GSE21032 are largely individual GP-3 or GP-4 specimens rather than "intermediate" specimens of GP-3 and GP-4. This indicates a pitfall in existing GS-related molecular research on prostate tumors and in GS-related molecular biomarker identification based on previously documented data.
Zhang Wensheng, Dong Yan, Zhang Kun
Gleason pattern, Gleason score, Prostate cancer, gene expression, integrative analysis, machine learning