In Genomics, Proteomics & Bioinformatics
Epithelial and stromal tissues are components of the tumor microenvironment and play a major role in tumor initiation and progression. Distinguishing stromal from epithelial tissues is critically important for spatial characterization of the tumor microenvironment. We propose BrcaSeg, an image analysis pipeline based on a convolutional neural network (CNN) model to classify epithelial and stromal regions in whole-slide hematoxylin and eosin (H&E) stained histopathological images. The CNN model was trained using well-annotated breast cancer tissue microarrays and validated with images from The Cancer Genome Atlas (TCGA) Program. BrcaSeg achieves a classification accuracy of 91.02%, outperforming other state-of-the-art methods. Using this model, we generated pixel-level epithelial/stromal tissue maps for 1000 TCGA breast cancer slide images that are paired with gene expression data. We then estimated the epithelial and stromal ratios for each slide and performed correlation analysis to model the relationship between gene expression and tissue ratios. Gene Ontology (GO) enrichment analyses of the genes most highly correlated with tissue ratios suggest that the same tissue is associated with similar biological processes across breast cancer subtypes, whereas each subtype also has its own idiosyncratic biological processes governing the development of these tissues. Taken together, our approach can lead to new insights into the relationships between image-based phenotypes and their underlying genomic events and biological processes for all types of solid tumors. BrcaSeg can be accessed at https://github.com/Serian1992/ImgBio.
Lu Zixiao, Zhan Xiaohui, Wu Yi, Cheng Jun, Shao Wei, Ni Dong, Han Zhi, Zhang Jie, Feng Qianjin, Huang Kun
Breast cancer, Computational pathology, Deep learning, Integrative genomics, Whole-slide tissue image
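
The ratio estimation and correlation steps described in the abstract can be illustrated with a minimal sketch. This is not the BrcaSeg code. The label codes, the function names, the definition of a ratio as a fraction of tissue pixels (background excluded), and the use of Pearson correlation are all assumptions made for illustration; the abstract does not specify these details.

```python
from math import sqrt

# Hypothetical label codes for a pixel-level tissue map (not taken from the paper).
BACKGROUND, EPITHELIUM, STROMA = 0, 1, 2


def tissue_ratios(label_map):
    """Return (epithelial_ratio, stromal_ratio) computed over tissue pixels only.

    Assumption: ratios are fractions of epithelium + stroma pixels,
    with background pixels ignored.
    """
    counts = {EPITHELIUM: 0, STROMA: 0}
    for row in label_map:
        for label in row:
            if label in counts:
                counts[label] += 1
    total = counts[EPITHELIUM] + counts[STROMA]
    if total == 0:
        raise ValueError("label map contains no tissue pixels")
    return counts[EPITHELIUM] / total, counts[STROMA] / total


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def rank_genes(expression, ratios):
    """Rank genes by absolute correlation with a per-slide tissue ratio.

    expression: dict mapping gene name -> list of expression values,
                one per slide, in the same slide order as `ratios`.
    Returns a list of (gene, r) pairs, strongest correlation first.
    """
    scored = [(gene, pearson(values, ratios)) for gene, values in expression.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)
```

In use, `tissue_ratios` would be applied to each slide's tissue map, and the resulting per-slide ratios passed to `rank_genes` together with the paired expression values. The top-ranked genes would then be the input to an enrichment analysis. A rank-based coefficient such as Spearman could be substituted for `pearson` without changing the rest of the sketch.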