Abstract
With the growth in the number of sequenced genomes, many algorithms for gene identification have been developed. These algorithms rely on fixed gene features chosen from observation or experience, which may not be the major features of a given genome. In this paper, we present several candidate features and propose a dynamic feature-choosing algorithm that determines the major features. We describe each nucleotide sequence by a feature vector and apply discriminant analysis to these vectors to decide whether the sequence is coding or non-coding. To test the algorithm, we apply it to the S. cerevisiae genome and achieve an accuracy above 98%.
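The abstract does not state which candidate features are used or how the major ones are selected, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes, purely for illustration, that the candidate features are the frequencies of each base at each codon position, and that features are ranked by a univariate Fisher score (squared difference of class means over the sum of class variances). The names `codon_position_freqs`, `fisher_score`, and `choose_features`, and the toy sequences, are hypothetical.

```python
from statistics import mean, pvariance

BASES = "ACGT"


def codon_position_freqs(seq):
    """Assumed candidate features: frequency of A, C, G, T at each of the
    three codon positions (12 values). Trailing partial codons are ignored."""
    seq = seq.upper()
    n_codons = len(seq) // 3
    if n_codons == 0:
        raise ValueError("sequence is shorter than one codon")
    feats = []
    for pos in range(3):
        column = seq[pos:n_codons * 3:3]  # every base at this codon position
        feats.extend(column.count(b) / n_codons for b in BASES)
    return feats


def fisher_score(coding_vals, noncoding_vals, eps=1e-9):
    """Univariate Fisher score: squared mean gap over summed class variances.
    eps is an illustrative guard against division by zero."""
    gap = (mean(coding_vals) - mean(noncoding_vals)) ** 2
    spread = pvariance(coding_vals) + pvariance(noncoding_vals)
    return gap / (spread + eps)


def choose_features(coding, noncoding, k):
    """Return the indices of the k candidate features that best separate
    the two training sets according to the Fisher score."""
    pos = [codon_position_freqs(s) for s in coding]
    neg = [codon_position_freqs(s) for s in noncoding]
    n_feats = len(pos[0])
    scores = [
        fisher_score([v[j] for v in pos], [v[j] for v in neg])
        for j in range(n_feats)
    ]
    return sorted(range(n_feats), key=lambda j: scores[j], reverse=True)[:k]


if __name__ == "__main__":
    # Toy training sequences, invented for this example only.
    coding = ["GCAGCAGCAGCA", "GCTGCAGCTGCA"]
    noncoding = ["ACGTACGTACGT", "TTTTAAAACCCC"]
    print(choose_features(coding, noncoding, k=3))
```

Because the ranking is recomputed from whatever training sequences are supplied, the selected features can differ from genome to genome, which is the sense in which such a selection is "dynamic". The selected components would then be passed to a discriminant classifier; that step is omitted here because the abstract gives no details of it.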
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Luo, JW., Yang, L., Zhang, XZ. (2007). A Method for Gene Identification by Dynamic Feature Choosing. In: Duffy, V.G. (eds) Digital Human Modeling. ICDHM 2007. Lecture Notes in Computer Science, vol 4561. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73321-8_78
DOI: https://doi.org/10.1007/978-3-540-73321-8_78
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73318-8
Online ISBN: 978-3-540-73321-8
eBook Packages: Computer Science (R0)