Box C/D small nucleolar RNAs (snoRNAs) are a conserved class of RNA known for their role in guiding ribosomal RNA 2'-O-ribose methylation through base pairing with targeted sequences. Recently, C/D snoRNAs were also implicated in regulating the expression of non-ribosomal genes through different modes of binding. Large-scale RNA-RNA interaction datasets detect many snoRNAs binding messenger RNA. However, these studies provide a narrow portrait of snoRNA targets, capturing only the interactions that form under specific experimental conditions. To enable a more comprehensive study of C/D snoRNA interactions, we created snoGloBe, a machine learning predictor of human C/D snoRNA interactions based on a gradient boosting classifier. SnoGloBe considers the target type as well as the position and sequence of the interactions, enabling it to outperform existing predictors. Interestingly, for specific snoRNAs, snoGloBe identifies strong enrichment of interactions near gene expression regulatory elements, including splice sites. Abundance and splicing of predicted targets were altered upon the knockdown of their associated snoRNA. Strikingly, the predicted snoRNA interactions often overlap with the binding sites of functionally related RNA binding proteins, reinforcing their role in gene expression regulation. The interactions of snoRNAs are not randomly distributed but often accumulate in functionally related transcripts sharing common regulatory elements, suggesting a coordinated regulatory function. The wide scope of snoGloBe also makes it well suited to discovering viral RNA targets, as shown by its capacity to identify snoRNAs targeting SARS-CoV-2 RNA, which is known to be heavily methylated. Overall, snoGloBe is capable of identifying experimentally validated binding sites and predicting novel sites with shared regulatory function.
Deschamps-Francoeur, G.; Couture, S.; Abou Elela, S.; Scott, M. S.