{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T18:33:09Z","timestamp":1732041189004},"reference-count":50,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2022,2,28]],"date-time":"2022-02-28T00:00:00Z","timestamp":1646006400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science","doi-asserted-by":"publisher","award":["19H04208","19F19377"],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","award":["U19AG05537301","R01AR069055"],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,3,10]]},"abstract":"N6-methyladenine (6mA) is associated with important roles in DNA replication, DNA repair, transcription and regulation of gene expression. Several experimental methods were used to identify DNA modifications. However, these experimental methods are costly and time-consuming. To detect the 6mA and complement these shortcomings of experimental methods, we proposed a novel, deep learning approach called BERT6mA. To compare the BERT6mA with other deep learning approaches, we used the benchmark datasets including 11 species. The BERT6mA presented the highest AUCs in eight species in independent tests. Furthermore, BERT6mA showed higher and comparable performance with the state-of-the-art models while the BERT6mA showed poor performances in a few species with a small sample size. 
To overcome this issue, pretraining and fine-tuning between two species were applied to the BERT6mA. The pretrained and fine-tuned models on specific species presented higher performances than other models even for the species with a small sample size. In addition to the prediction, we analyzed the attention weights generated by BERT6mA to reveal how the BERT6mA model extracts critical features responsible for the 6mA prediction. To facilitate biological sciences, the BERT6mA online web server and its source codes are freely accessible at https:\/\/github.com\/kuratahiroyuki\/BERT6mA.git, respectively.","DOI":"10.1093\/bib\/bbac053","type":"journal-article","created":{"date-parts":[[2022,2,1]],"date-time":"2022-02-01T20:10:51Z","timestamp":1643746251000},"source":"Crossref","is-referenced-by-count":31,"title":["BERT6mA: prediction of DNA N6-methyladenine site using deep learning-based approaches"],"prefix":"10.1093","volume":"23","author":[{"ORCID":"http:\/\/orcid.org\/0000-0002-7871-0868","authenticated-orcid":false,"given":"Sho","family":"Tsukiyama","sequence":"first","affiliation":[{"name":"Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-4952-0739","authenticated-orcid":false,"given":"Md Mehedi","family":"Hasan","sequence":"additional","affiliation":[{"name":"Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA 70112, USA"}]},{"given":"Hong-Wen","family":"Deng","sequence":"additional","affiliation":[{"name":"Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. 
Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA 70112, USA"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-4254-2214","authenticated-orcid":false,"given":"Hiroyuki","family":"Kurata","sequence":"additional","affiliation":[{"name":"Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan"}]}],"member":"286","published-online":{"date-parts":[[2022,2,28]]},"reference":[{"key":"2022031506403542400_ref1","doi-asserted-by":"crossref","first-page":"516","DOI":"10.1016\/j.cbpa.2012.10.002","article-title":"Nucleic acid modifications with epigenetic significance","volume":"16","author":"Fu","year":"2012","journal-title":"Curr Opin Chem Biol"},{"key":"2022031506403542400_ref2","doi-asserted-by":"crossref","first-page":"967","DOI":"10.1016\/0092-8674(90)90271-F","article-title":"Coli oriC and the dnaA gene promoter are sequestered from dam methyltransferase following the passage of the chromosomal replication fork","volume":"62","author":"Campbell","year":"1990","journal-title":"Cell"},{"key":"2022031506403542400_ref3","doi-asserted-by":"crossref","first-page":"7027","DOI":"10.1128\/JB.187.20.7027-7037.2005","article-title":"Analysis of global gene expression and double-strand-break formation in DNA adenine methyltransferase- and mismatch repair-deficient Escherichia coli","volume":"187","author":"Robbins-Manke","year":"2005","journal-title":"J Bacteriol"},{"key":"2022031506403542400_ref4","doi-asserted-by":"crossref","first-page":"571","DOI":"10.1093\/genetics\/104.4.571","article-title":"Effects of high levels of DNA adenine methylation on methyl-directed mismatch repair in Escherichia coli","volume":"104","author":"Pukkila","year":"1983","journal-title":"Genetics"},{"key":"2022031506403542400_ref5","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1038\/nrmicro1350","article-title":"N6-methyl-adenine: an epigenetic signal for DNA-protein 
interactions","volume":"4","author":"Wion","year":"2006","journal-title":"Nat Rev Microbiol"},{"key":"2022031506403542400_ref6","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1128\/MMBR.00044-12","article-title":"Diverse functions of restriction-modification systems in addition to cellular defense","volume":"77","author":"Vasu","year":"2013","journal-title":"Microbiol Mol Biol Rev"},{"key":"2022031506403542400_ref7","doi-asserted-by":"crossref","first-page":"306","DOI":"10.1016\/j.molcel.2018.06.015","article-title":"N6-Methyladenine DNA modification in the human genome","volume":"71","author":"Xiao","year":"2018","journal-title":"Mol Cell"},{"key":"2022031506403542400_ref8","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1038\/nmeth.1459","article-title":"Direct detection of DNA methylation during single-molecule, real-time sequencing","volume":"7","author":"Flusberg","year":"2010","journal-title":"Nat Methods"},{"key":"2022031506403542400_ref9","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1126\/science.1162986","article-title":"Real-time DNA sequencing from single polymerase molecules","volume":"323","author":"Eid","year":"2009","journal-title":"Science"},{"key":"2022031506403542400_ref10","doi-asserted-by":"crossref","first-page":"1122","DOI":"10.1038\/s41467-017-01195-y","article-title":"DNA N6-methyladenine is dynamically regulated in the mouse brain following environmental stress","volume":"8","author":"Yao","year":"2017","journal-title":"Nat Commun"},{"key":"2022031506403542400_ref11","first-page":"79","article-title":"Detection of DNA methylation in genomic DNA by UHPLC-MS\/MS, methods in molecular biology","volume":"2198","author":"Boulias","year":"2021","journal-title":"Clifton, NJ"},{"key":"2022031506403542400_ref12","doi-asserted-by":"crossref","first-page":"100991","DOI":"10.1016\/j.isci.2020.100991","article-title":"iDNA-MS: an integrated computational tool for detecting DNA modification sites in multiple 
genomes","volume":"23","author":"Lv","year":"2020","journal-title":"iScience"},{"key":"2022031506403542400_ref13","doi-asserted-by":"crossref","first-page":"733","DOI":"10.1016\/j.omtn.2019.04.019","article-title":"Meta-4mCpred: a sequence-based meta-predictor for accurate DNA 4mC site prediction using effective feature representation","volume":"16","author":"Manavalan","year":"2019","journal-title":"Mol Ther Nucleic Acids"},{"key":"2022031506403542400_ref14","doi-asserted-by":"crossref","first-page":"1332","DOI":"10.3390\/cells8111332","article-title":"4mCpred-EL: an ensemble learning framework for identification of DNA N4-methylcytosine sites in the mouse genome","volume":"8","author":"Manavalan","year":"2019","journal-title":"Cell"},{"key":"2022031506403542400_ref15","doi-asserted-by":"crossref","first-page":"e10813","DOI":"10.7717\/peerj.10813","article-title":"6mA-Pred: identifying DNA N6-methyladenine sites based on deep learning","volume":"9","author":"Huang","year":"2021","journal-title":"PeerJ"},{"key":"2022031506403542400_ref16","doi-asserted-by":"crossref","first-page":"456","DOI":"10.1186\/s12859-019-3006-z","article-title":"PTPD: predicting therapeutic peptides by deep learning and word2vec","volume":"20","author":"Wu","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2022031506403542400_ref17","doi-asserted-by":"crossref","first-page":"2009","DOI":"10.1093\/bioinformatics\/bty937","article-title":"Identifying antimicrobial peptides using word embedding with deep recurrent neural networks","volume":"35","author":"Hamid","year":"2019","journal-title":"Bioinformatics"},{"key":"2022031506403542400_ref18","doi-asserted-by":"crossref","first-page":"178577","DOI":"10.1109\/ACCESS.2019.2958618","article-title":"iIM-CNN: intelligent identifier of 6mA sites on different species by using convolution neural network","volume":"7","author":"Wahab","year":"2019","journal-title":"IEEE Access"},{"key":"2022031506403542400_ref19","article-title":"SICD6mA: 
identifying 6mA sites using deep memory network","author":"Liu","year":"2002","journal-title":"bioRxiv"},{"key":"2022031506403542400_ref20","doi-asserted-by":"crossref","first-page":"e1008767","DOI":"10.1371\/journal.pcbi.1008767","article-title":"Deep6mA: a deep learning framework for exploring similar patterns in DNA N6-methyladenine sites across different species","volume":"17","author":"Li","year":"2021","journal-title":"PLoS Comput Biol"},{"key":"2022031506403542400_ref21","article-title":"BERT: pre-training of deep bidirectional transformers for language understanding","author":"Devlin","year":"2018"},{"key":"2022031506403542400_ref22","article-title":"On the application of BERT models for nanopore methylation detection","author":"Zhang","year":"2021","journal-title":"bioRxiv"},{"key":"2022031506403542400_ref23","doi-asserted-by":"crossref","first-page":"4603","DOI":"10.1093\/bioinformatics\/btab677","article-title":"iDNA-ABT: advanced deep learning model for detecting DNA methylation with adaptive features and transductive information maximization","volume":"37","author":"Yu","year":"2021","journal-title":"Bioinformatics"},{"key":"2022031506403542400_ref24","doi-asserted-by":"crossref","first-page":"1071","DOI":"10.3389\/fgene.2019.01071","article-title":"SNNRice6mA: a deep learning method for predicting DNA N6-methyladenine sites in Rice genome","volume":"10","author":"Yu","year":"2019","journal-title":"Front Genet"},{"key":"2022031506403542400_ref25","doi-asserted-by":"crossref","first-page":"bbaa124","DOI":"10.1093\/bib\/bbaa124","article-title":"DeepTorrent: a deep learning-based approach for predicting DNA N4-methylcytosine sites","volume":"22","author":"Liu","year":"2021","journal-title":"Brief Bioinform"},{"key":"2022031506403542400_ref26","doi-asserted-by":"crossref","first-page":"bbaa202","DOI":"10.1093\/bib\/bbaa202","article-title":"Meta-i6mA: an interspecies predictor for identifying DNA N6-methyladenine sites of plant genomes by exploiting 
informative features in an integrative machine-learning framework","volume":"22","author":"Hasan","year":"2020","journal-title":"Brief Bioinform"},{"key":"2022031506403542400_ref27","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1007\/s11103-020-00988-y","article-title":"i6mA-fuse: improved and robust prediction of DNA 6mA sites in the Rosaceae genome by fusing multiple feature representation","volume":"103","author":"Hasan","year":"2020","journal-title":"Plant Mol Biol"},{"key":"2022031506403542400_ref28","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1016\/j.omtn.2019.08.011","article-title":"SDM6A: a web-based integrative machine-learning framework for predicting 6mA sites in the Rice genome","volume":"18","author":"Basith","year":"2019","journal-title":"Mol Ther Nucleic Acids"},{"key":"2022031506403542400_ref29","doi-asserted-by":"crossref","first-page":"96","DOI":"10.1016\/j.ygeno.2018.01.005","article-title":"iDNA6mA-PseKNC: identifying DNA N6-methyladenosine sites by incorporating nucleotide physicochemical properties into PseKNC","volume":"111","author":"Feng","year":"2019","journal-title":"Genomics"},{"key":"2022031506403542400_ref30","doi-asserted-by":"crossref","first-page":"4","DOI":"10.3389\/fpls.2020.00004","article-title":"6mA-RicePred: a method for identifying DNA N (6)-methyladenine sites in the Rice genome based on feature fusion","volume":"11","author":"Huang","year":"2020","journal-title":"Front Plant Sci"},{"key":"2022031506403542400_ref31","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1111\/tpj.14159","article-title":"De novo genome assembly of the stress tolerant forest species Casuarina equisetifolia provides insight into secondary growth","volume":"97","author":"Ye","year":"2019","journal-title":"Plant J"},{"key":"2022031506403542400_ref32","doi-asserted-by":"crossref","first-page":"D85","DOI":"10.1093\/nar\/gkw950","article-title":"MethSMRT: an integrative database for DNA N6-methyladenine and 
N4-methylcytosine generated by single-molecular real-time sequencing","volume":"45","author":"Ye","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2022031506403542400_ref33","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1038\/s41438-019-0160-4","article-title":"MDR: an integrative DNA N6-methyladenine and N4-methylcytosine modification database for Rosaceae","volume":"6","author":"Liu","year":"2019","journal-title":"Horticulture Res"},{"key":"2022031506403542400_ref34","doi-asserted-by":"crossref","first-page":"11594","DOI":"10.1093\/nar\/gkx883","article-title":"N6-adenine DNA methylation is associated with the linker DNA of H2A.Z-containing well-positioned nucleosomes in pol II-transcribed genes in Tetrahymena","volume":"45","author":"Wang","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2022031506403542400_ref35","doi-asserted-by":"crossref","first-page":"1266","DOI":"10.1089\/cmb.2018.0004","article-title":"iRNA-2OM: a sequence-based predictor for identifying 2'-O-methylation sites in Homo sapiens","volume":"25","author":"Yang","year":"2018","journal-title":"J Comput Biol"},{"key":"2022031506403542400_ref36","article-title":"Efficient estimation of word representations in vector space","author":"Mikolov","year":"2013"},{"key":"2022031506403542400_ref37","article-title":"Distributed representations of words and phrases and their compositionality","author":"Mikolov","year":"2013"},{"key":"2022031506403542400_ref38","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput"},{"key":"2022031506403542400_ref39","doi-asserted-by":"crossref","first-page":"602","DOI":"10.1016\/j.neunet.2005.06.042","article-title":"Framewise phoneme classification with bidirectional LSTM and other neural network architectures","volume":"18","author":"Graves","year":"2005","journal-title":"Neural 
Netw"},{"key":"2022031506403542400_ref40","article-title":"Empirical evaluation of gated recurrent neural networks on sequence Modeling","author":"Chung","year":"2014"},{"key":"2022031506403542400_ref41","doi-asserted-by":"crossref","first-page":"145395","DOI":"10.1109\/ACCESS.2019.2939947","article-title":"A deep bidirectional GRU network model for biometric electrocardiogram classification based on recurrent neural networks","volume":"7","author":"Lynn","year":"2019","journal-title":"IEEE Access"},{"key":"2022031506403542400_ref42","first-page":"473","volume-title":"Proceedings of the conference. Association for Computational Linguistics. North American Chapter. Meeting","author":"Jagannatha","year":"2016"},{"key":"2022031506403542400_ref43","article-title":"Attention is all you need","author":"Vaswani","year":"2017"},{"key":"2022031506403542400_ref44","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btab133","article-title":"BERT4Bitter: a bidirectional encoder representations from transformers (BERT)-based model for improving the prediction of bitter peptides","volume":"37","author":"Charoenkwan","year":"2021","journal-title":"Bioinformatics"},{"key":"2022031506403542400_ref45","doi-asserted-by":"crossref","first-page":"1211","DOI":"10.1038\/nmeth.2646","article-title":"pLogo: a probabilistic approach to visualizing sequence motifs","volume":"10","author":"O'Shea","year":"2013","journal-title":"Nat Methods"},{"key":"2022031506403542400_ref46","doi-asserted-by":"crossref","first-page":"W534","DOI":"10.1093\/nar\/gkx323","article-title":"kpLogo: positional k-mer analysis reveals hidden specificity in biological sequences","volume":"45","author":"Wu","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2022031506403542400_ref47","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/W19-4828","article-title":"What does BERT look at? 
An analysis of BERT's attention","author":"Clark","year":"2019"},{"key":"2022031506403542400_ref48","volume-title":"Statistical Power Analysis for the Behavioral Sciences","author":"Cohen","year":"1988"},{"key":"2022031506403542400_ref49","doi-asserted-by":"crossref","first-page":"388","DOI":"10.1093\/bioinformatics\/btz556","article-title":"MM-6mAPred: identifying DNA N6-methyladenine sites based on Markov model","volume":"36","author":"Pian","year":"2019","journal-title":"Bioinformatics"},{"key":"2022031506403542400_ref50","doi-asserted-by":"crossref","first-page":"793","DOI":"10.3389\/fgene.2019.00793","article-title":"iDNA6mA-Rice: a computational tool for detecting N6-methyladenine sites in Rice","volume":"10","author":"Lv","year":"2019","journal-title":"Front Genet"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/2\/bbac053\/42806331\/bbac053.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/2\/bbac053\/42806331\/bbac053.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,16]],"date-time":"2023-11-16T18:28:15Z","timestamp":1700159295000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbac053\/6539171"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,28]]},"references-count":50,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,3,10]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbac053","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,3]]},"published":{"date-parts":[[2022,2,28]]}}}