Abstract
We have developed a non-heuristic tool (LASA) for the multiple sequence alignment problem (MSA), one of the most important problems in computational molecular biology. It is based on a dynamic programming algorithm for solving a Lagrangian relaxation of an integer linear programming (ILP) formulation for MSA. The objective function that is optimized by LASA models the sum-of-pairs scoring scheme and “truly” affine gap costs. Due to a reformulation w.r.t. additionally introduced variables prior to relaxation we improve the convergence rate dramatically while at the same time being able to solve the Lagrangian problem efficiently. Our experiments show that our implementation LASA outperforms all exact algorithms for the multiple sequence alignment problem. Furthermore, the quality of the alignments ranks among the best computed so far.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Althaus, E., Canzar, S.: A lagrangian relaxation approach for the multiple sequence alignment problem. In: Dress, A.W.M., Xu, Y., Zhu, B. (eds.) COCOA. LNCS, vol. 4616. Springer, Heidelberg (2007)
Althaus, E., Caprara, A., Lenhof, H.-P., Reinert, K.: Aligning multiple sequences by cutting planes. Mathematical Programming 105, 387–425 (2006)
Beasley, J.: Lagrangian Relaxation. In: Modern heuristic techniques for combinatorial problems. Blackwell Scientific Publications (1993)
Caprara, A., Fischetti, M., Toth, P.: A heuristic method for the set cover problem. Operations Research 47, 730–743 (1999)
Edgar, R.C.: Muscle: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32(5), 1792–1797 (2004)
Elias, I.: Settling the intractability of multiple alignment. In: Ibaraki, T., Katoh, N., Ono, H. (eds.) ISAAC 2003. LNCS, vol. 2906, pp. 352–363. Springer, Heidelberg (2003)
Eppstein, D.: Sequence comparison with mixed convex and concave costs. Journal of Algorithms (11), 85–101 (1990)
Gupta, S., Kececioglu, J., Schaeffer, A.: Improving the practical space and time efficiency of the shortest-paths approach to sum-of-pairs multiple sequence alignment. J. Comput. Biol. 2, 459–472 (1995)
Gusfield, D.: Algorithms on strings, trees and sequences: computer science and computational biology. Cambridge University Press, Cambridge (1997)
Held, M., Karp, R.: The traveling salesman problem and minimum spanning trees: part ii. Mathematical Programming 1, 6–25 (1971)
Katoh, K., Ichi Kuma, K., Toh, H., Miyata, T.: MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Research 33, 511 (2005)
Lee, C., Grasso, C., Sharlow, M.F.: Multiple sequence alignment using partial order graphs. Bioinformatics 18(3), 452–464 (2002)
Lipman, D., Altschul, S., Kececioglu, J.: A tool for multiple sequence alignment. Proc. Natl. Acad. Sci. U.S.A. 86, 4412–4415 (1989)
Lucena, A.: Steiner problem in graphs: Lagrangean relaxation and cutting-planes. COAL Bulletin 21, 2–7 (1993)
Mehlhorn, K., Näher, S.: The LEDA Platform of Combinatorial and Geometric Computing. Cambridge University Press, Cambridge (1999)
Morgenstern, B.: DIALIGN: multiple DNA and protein sequence alignment at BiBiServ. Nucl. Acids Res. 32(2), 33–36 (2004)
Notredame, C., Higgins, D.G., Heringa, J.: T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302(1), 205–217 (2000)
Reinert, K.: A Polyhedral Approach to Sequence Alignment Problems. PhD thesis, Universität des Saarlandes (1999)
Stoye, J., Evers, D., Meyer, F.: Rose: generating sequence families (1998)
Thompson, J.D., Higgins, D.G., Gibson, T.J.: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22(22), 4673–4680 (1994)
Thompson, J.D., Plewniak, F., Poch, O.: BAliBASE: A benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics 15(1), 87–88 (1999)
Walle, I.V., Lasters, I., Wyns, L.: SABmark - a benchmark for sequence alignment that covers the entire known fold space. Bioinformatics 21, 1267–1268 (2005)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Althaus, E., Canzar, S. (2008). LASA: A Tool for Non-heuristic Alignment of Multiple Sequences. In: Elloumi, M., Küng, J., Linial, M., Murphy, R.F., Schneider, K., Toma, C. (eds) Bioinformatics Research and Development. BIRD 2008. Communications in Computer and Information Science, vol 13. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70600-7_38
Download citation
DOI: https://doi.org/10.1007/978-3-540-70600-7_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70598-7
Online ISBN: 978-3-540-70600-7
eBook Packages: Computer ScienceComputer Science (R0)