Abstract
Enhancers play an important role in gene expression regulation, and identifying them and predicting their strength is currently an active area of research. However, identifying enhancers through experimental approaches is extremely time-consuming and labor-intensive. Several machine learning methodologies have been proposed to accurately discriminate enhancers from other regulatory elements and to estimate their strength. Existing approaches use various statistical measures for feature encoding, which capture residue-specific physico-chemical properties to a certain extent but ignore the semantic and positional information of residues. This paper presents “Enhancer-DSNet”, a two-layer deep neural network that uses a novel k-mer based sequence representation scheme prepared by fusing associations between k-mer positions and sequence type. The proposed Enhancer-DSNet methodology is evaluated on a publicly available benchmark dataset and an independent test set. Experimental results on the benchmark independent test set indicate that Enhancer-DSNet outperforms the most recent predictor by 2%, 1%, 2%, and 5% in terms of accuracy, specificity, sensitivity, and Matthews correlation coefficient for the enhancer identification task, and by 15%, 21%, and 39% in terms of accuracy, specificity, and Matthews correlation coefficient for the strong/weak enhancer prediction task.
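The abstract relies on two generic building blocks: splitting a DNA sequence into k-mers, and scoring a binary classifier by accuracy, specificity, sensitivity, and Matthews correlation coefficient. The sketch below illustrates both under stated assumptions and is not the authors' implementation. The function names `kmers` and `binary_metrics`, the stride-1 overlapping windows, and the confusion-matrix counts in the usage example are hypothetical, chosen only for illustration. The abstract does not give the value of k or describe how positions and sequence type are fused.

```python
from math import sqrt


def kmers(sequence, k):
    """Return overlapping k-mers of a sequence.

    Assumption: windows slide one residue at a time (stride 1).
    """
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def binary_metrics(tp, tn, fp, fn):
    """Compute standard binary-classification metrics from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Illustrative inputs only; these are not results from the paper.
    print(kmers("ACGTAC", 3))
    print(binary_metrics(tp=40, tn=45, fp=5, fn=10))
```

Matthews correlation coefficient uses all four confusion-matrix cells, so it can move much more than accuracy when one class is predicted poorly.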
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Asim, M.N., Ibrahim, M.A., Malik, M.I., Dengel, A., Ahmed, S. (2020). Enhancer-DSNet: A Supervisedly Prepared Enriched Sequence Representation for the Identification of Enhancers and Their Strength. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science, vol 12534. Springer, Cham. https://doi.org/10.1007/978-3-030-63836-8_4
DOI: https://doi.org/10.1007/978-3-030-63836-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63835-1
Online ISBN: 978-3-030-63836-8
eBook Packages: Computer Science, Computer Science (R0)