In Computational and Structural Biotechnology Journal

N6-methyladenine (6mA) plays a critical role in various epigenetic processes, including DNA replication, DNA repair, silencing, and transcription, and in diseases such as cancer. To understand these epigenetic mechanisms, 6mA has been detected on a genome-wide scale at single-base resolution by high-throughput technologies, together with conventional methods such as immunoprecipitation, mass spectrometry, and capillary electrophoresis, but these experimental approaches are time-consuming and laborious. To complement these experimental approaches, we have developed a CNN-based 6mA site predictor, named CNN6mA, which introduces two new architectures: a position-specific 1-D convolutional layer and a cross-interactive network. In the position-specific 1-D convolutional layer, position-specific filters with different window sizes were applied to an inquiry sequence, instead of sharing the same filters over all positions, in order to extract position-specific features at different levels. The cross-interactive network explored the relationships between all the nucleotide patterns within the inquiry sequence. Consequently, CNN6mA outperformed the existing state-of-the-art models in many species and produced a contribution score vector that intelligibly interprets the prediction mechanism. The source code and web application of CNN6mA are freely accessible at https://github.com/kuratahiroyuki/CNN6mA.git and http://kurata35.bio.kyutech.ac.jp/CNN6mA/, respectively.
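
To make the idea of position-specific filters concrete, the following is a minimal NumPy sketch of an unshared (locally connected) 1-D convolution over a one-hot encoded DNA sequence. It is an illustration under stated assumptions, not the CNN6mA implementation: the function names (`one_hot`, `position_specific_conv1d`, `multi_window_features`), the window sizes, the number of filters, the random weights, and the example sequence are all hypothetical, and the nonlinearity, training, and cross-interactive network are omitted.

```python
import numpy as np

NUCLEOTIDES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as an (L, 4) one-hot matrix."""
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    x = np.zeros((len(seq), len(NUCLEOTIDES)))
    for pos, base in enumerate(seq):
        x[pos, index[base]] = 1.0
    return x


def position_specific_conv1d(x, weights, bias):
    """Linear part of an unshared 1-D convolution (no padding, stride 1).

    x:       (L, C) input sequence features
    weights: (P, k, C, F), a separate filter bank for each of the
             P = L - k + 1 window positions (hypothetical layout)
    bias:    (P, F)
    returns: (P, F)
    """
    n_pos, k, _, n_filters = weights.shape
    if n_pos != x.shape[0] - k + 1:
        raise ValueError("weights do not match the sequence length")
    out = np.empty((n_pos, n_filters))
    for p in range(n_pos):
        window = x[p:p + k]  # (k, C) slice seen at position p
        # Contract the window against the filters that belong to position p only.
        out[p] = np.tensordot(window, weights[p], axes=([0, 1], [0, 1])) + bias[p]
    return out


def multi_window_features(x, window_sizes=(3, 5), n_filters=2, seed=0):
    """Concatenate position-specific features from several window sizes.

    Weights are random placeholders; in a real model they would be learned.
    """
    rng = np.random.default_rng(seed)
    features = []
    for k in window_sizes:
        n_pos = x.shape[0] - k + 1
        w = rng.normal(size=(n_pos, k, x.shape[1], n_filters))
        b = np.zeros((n_pos, n_filters))
        features.append(position_specific_conv1d(x, w, b).ravel())
    return np.concatenate(features)


if __name__ == "__main__":
    x = one_hot("ACGTACGTAC")  # hypothetical 10-nt inquiry sequence
    print(multi_window_features(x).shape)
```

In an ordinary convolution the trinucleotide `ACG` would produce the same activation wherever it occurs. Here each position has its own filters, so the same pattern can be scored differently depending on its location relative to the candidate site, which is the property the abstract describes.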

Tsukiyama Sho, Hasan Md Mehedi, Kurata Hiroyuki

2023

DNA modification, Deep learning, Interpretable prediction, Machine learning, N6-methyladenine

6mA, N6-methyladenine; AUCs, Area under the curves; BERT, Bidirectional Encoder Representations from Transformers; CNN, Convolutional neural network; LSTM, Long short-term memory; MCC, Matthews correlation coefficient; RF, Random forest; SMRT, Single-molecule real-time; SN, Sensitivity; SP, Specificity; UMAP, Uniform manifold approximation and projection; t-SNE, t-distributed stochastic neighbor embedding