Abstract
Probabilistic phylogenetic models which relax the site independence evolution assumption often face the problem of infeasible likelihood computations, for example for the task of selecting suitable parameters for the model. We present a new approximation method, applicable for a wide range of probabilistic models, which guarantees to upper bound the true likelihood of data, and apply it to the problem of probabilistic phylogenetic models. The new method is complementary to known variational methods that lower bound the likelihood, and it uses similar methods to optimize the bounds from above and below. We applied our method to aligned DNA sequences of various lengths from human in the region of the CFTR gene and homologous from eight mammals, and found the upper bounds to be appreciably close to the true likelihood whenever it could be computed. When computing the exact likelihood was not feasible, we demonstrated the proximity of the upper and lower variational bounds, implying a tight approximation of the likelihood.
Preview
Unable to display preview. Download preview PDF.
We’re sorry, something doesn't seem to be working properly.
Please try refreshing the page. If that doesn't work, please contact support so we can address the problem.
References
Bishop, C., Winn, J.: Structured variational distributions in VIBES. In: Artificial Intelligence and Statistics, Key West, Florida, USA, 2003, Society for Artificial Intelligence and Statistics (2003)
Cooper, G.: Probabilistic inference using belief networks is NP-hard. Artificial Intelligence 42, 393–405 (1990)
Dagum, P., Luby, M.: Approximating probabilistic inference in Bayesian belief networks is NP-hard. Artificial Intelligence 60(1), 141–153 (1993)
Dechter, R.: Bucket elimination: A unifying framework for reasoning. Artificial Intelligence 113(1-2), 41–85 (1999)
Felsenstein, J.: Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution 17, 368–376 (1981)
Fishelson, M., Geiger, D.: Exact genetic linkage computations for general pedigrees. Bioinformatics 18, S189–S198 (2002)
Geiger, D., Meek, C., Wexler, Y.: A variational inference procedure allowing internal structure for overlapping clusters and deterministic constraints. Journal of Artificial Intelligence Research (JAIR) 27, 1–23 (2006)
Ghahramani, Z., Jordan, M.I.: Factorial hidden Markov models. Machine Learning 29, 245–273 (1997)
Jensen, F.V.: Bayesian Networks and Decision Graphs. Springer, New York (2001)
Jojic, V., Jojic, N., Meek, C., Geiger, D., Siepel, A., Haussler, D., Heckerman, D.: Efficient approximations for learning phylogenetic HMM models from data. Bioinformatics 20, 161–168 (2004)
Jordan, M.I., Ghahramani, Z., Jaakkola, T.S., Saul, L.K.: An introduction to variational methods for graphical models. In: Learning Graphical Models, MIT Press, Cambridge (1999)
Kschischang, F.R., Frey, B.J., Loeliger, H.A.: Factor graphs and the sum-product algorithm. IEEE Transactions on information theory 47(2), 498–519 (2001)
Neyman, J.: Molecular studies of evolution: a source of novel statistical problems. In: Gupta, S.S., Yackel, J. (eds.) Statistical desicion theory and related topics, pp. 1–27. Academic Press, New York (1971)
Pearl, J.: Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, San Mateo (1988)
Saul, L.K., Jordan, M.I.: Exploiting tractable substructures in intractable networks. In: Touretzky, D.S., Mozer, M.C., Hasselmo, M.E. (eds.) Advances in Neural Information Processing Systems (NIPS), vol. 8, pp. 486–492. MIT Press, Cambridge (1996)
Siepel, A., Haussler, D.: Combining phylogenetic and hidden markov models in biosequence analysis. In: RECOMB ’03: Proceedings of the seventh annual international conference on Research in computational molecular biology, pp. 277–286. ACM Press, New York (2003)
Wiegerinck, W.: Variational approximations between mean field theory and the junction tree algorithm. In: UAI ’00: Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, pp. 626–633. Morgan Kaufmann Publishers, San Francisco (2000)
Xing, E.P., Jordan, M.I., Russell, S.: Graph partition strategies for generalized mean field inference. In: AUAI ’04: Proceedings of the 20th conference on Uncertainty in artificial intelligence, pp. 602–610. AUAI Press, Arlington (2004)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Wexler, Y., Geiger, D. (2007). Variational Upper Bounds for Probabilistic Phylogenetic Models. In: Speed, T., Huang, H. (eds) Research in Computational Molecular Biology. RECOMB 2007. Lecture Notes in Computer Science(), vol 4453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71681-5_16
Download citation
DOI: https://doi.org/10.1007/978-3-540-71681-5_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71680-8
Online ISBN: 978-3-540-71681-5
eBook Packages: Computer ScienceComputer Science (R0)