In Genomics, proteomics & bioinformatics
Genome-wide transcriptome profiling identifies genes that are prone to differential expression across contexts, as well as genes with changes specific to the experimental manipulation. Distinguishing common differentially expressed genes (DEGs) from those that are specifically changed in a context of interest allows more efficient prediction of which genes are specific to a given biological process under scrutiny. Currently, common DEGs or pathways can only be identified through the laborious manual curation of experiments, an inordinately time-consuming endeavor. Here we pioneer an approach, Specific cOntext Pattern Highlighting In Expression data (SOPHIE), for distinguishing common and specific transcriptional patterns using a generative neural network to create a background set of experiments from which a null distribution of gene and pathway changes can be generated. We apply SOPHIE to diverse datasets including those from human, human cancer, and the bacterial pathogen Pseudomonas aeruginosa. SOPHIE identifies common DEGs in concordance with previously described, manually and systematically determined common DEGs. Further, molecular validation indicates that SOPHIE detects biologically relevant transcriptional changes that are highly specific but low in magnitude. SOPHIE's measure of specificity can complement log fold change values generated from traditional differential expression analyses. For example, by filtering the set of DEGs, one can identify those genes that are specifically relevant to the experimental condition of interest. Consequently, these results can inform future research directions. All scripts used in these analyses are available at https://github.com/greenelab/generic-expression-patterns. To run SOPHIE on your own data, use https://github.com/greenelab/sophie.
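The idea of scoring a gene against a null distribution built from background experiments, and then using that score alongside log fold change to filter DEGs, can be illustrated with a minimal Python sketch. This is not the SOPHIE implementation: the generative neural network step is omitted, the background log fold changes are a small hand-written toy matrix, the z-score is one plausible choice of specificity measure, and the function names and thresholds (`specificity_z_scores`, `filter_specific_genes`, `min_abs_lfc`, `min_z`) are hypothetical.

```python
import numpy as np


def specificity_z_scores(observed_lfc, background_lfc):
    """Score each gene's observed |log fold change| against a background.

    observed_lfc: shape (n_genes,), from the experiment of interest.
    background_lfc: shape (n_background_experiments, n_genes), assumed here
    to come from some set of background experiments (toy data in this sketch).
    """
    observed = np.abs(np.asarray(observed_lfc, dtype=float))
    background = np.abs(np.asarray(background_lfc, dtype=float))
    mean = background.mean(axis=0)
    std = background.std(axis=0, ddof=1)
    # Avoid division by zero for genes with no background variation.
    std = np.where(std == 0, np.nan, std)
    return (observed - mean) / std


def filter_specific_genes(gene_ids, observed_lfc, z_scores,
                          min_abs_lfc=1.0, min_z=2.0):
    """Keep genes that change strongly AND more than the background suggests.

    The thresholds are illustrative assumptions, not recommended values.
    """
    observed = np.asarray(observed_lfc, dtype=float)
    z = np.asarray(z_scores, dtype=float)
    keep = (np.abs(observed) >= min_abs_lfc) & (z >= min_z)
    return [gene for gene, kept in zip(gene_ids, keep) if kept]


# Toy data (invented for illustration only).
gene_ids = ["geneA", "geneB", "geneC"]
background_lfc = np.array([
    [2.0, 0.1, 0.0],
    [-2.2, -0.2, 0.1],
    [1.8, 0.3, -0.1],
    [2.1, 0.2, 0.2],
])
observed_lfc = np.array([2.0, 1.5, 0.1])

z = specificity_z_scores(observed_lfc, background_lfc)
specific = filter_specific_genes(gene_ids, observed_lfc, z)
print(specific)  # geneB is the only gene kept in this toy example
```

In the toy data, geneA has a large log fold change but changes just as much in every background experiment, so its score is near zero and it is treated as a common DEG; geneB changes far more than its background and is retained as specific.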
Lee Alexandra J, Mould Dallas L, Crawford Jake, Hu Dongbo, Powers Rani K, Doing Georgia, Costello James C, Hogan Deborah A, Greene Casey S
2022-Oct-07
Differential expression analysis, Machine learning, Neural network, Software, Transcriptomics