Affiliations: EML Research, Schloss-Wolfsbrunnenweg 33, D-69118
Heidelberg, Germany. E-mail: {wegner, gauges,
kummer}@eml-r.villa-bosch.de | F. Hoffmann-La Roche Ltd, Grenzacherstr. 124,
CH-4070 Basel, Switzerland. E-mail: [email protected] | Department of Physics, University of Notre Dame,
225 Nieuwland Science Hall, Notre Dame, IN 46556, USA. E-mail: [email protected]
Note: [] Corresponding author
Abstract: Contemporary sequence alignment algorithms treat sequence
comparison as a strictly linear problem, which limits their ability to detect
non-order-conserving recombinations. Here, we introduce the algorithm combAlign,
which assesses pairwise sequence similarity in the presence of
non-order-conserving recombinations on a large scale. Following a two-level
approach, combAlign first detects locally well-conserved subsequences in a
target and a source sequence. The relative placement of these local alignments
is then mapped to a graph. The maximum-scoring path through the graph
concatenates local alignments so as to reassemble the target sequence to the
fullest extent, and it denotes the best attainable combAlignment. Parameters
influencing this process can be set to meet the user's specific demands. We
apply combAlign to examples which demonstrate that it can reflect the
evolutionary kinship of proteins even if their domains and motifs are strongly
rearranged.
Keywords: point mutations, shuffling events, dynamic programming, graph theory, DAG
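The second level described in the abstract (mapping local alignments to a graph and taking its maximum-scoring path) can be illustrated with a minimal sketch. It is not the combAlign implementation: the abstract does not specify how the graph or its scores are built, so everything here is an assumption made for illustration. We assume local alignments are already given with half-open target intervals and a score, that an edge joins two alignments when they do not overlap in the target, and that an optional `join_penalty` (a hypothetical parameter) is charged per concatenation. The names `LocalAlignment` and `best_chain` are likewise hypothetical. Source positions are deliberately not constrained, which is what permits non-order-conserving chains.

```python
from typing import NamedTuple


class LocalAlignment(NamedTuple):
    t_start: int   # start in the target (inclusive)
    t_end: int     # end in the target (exclusive), assumed > t_start
    s_start: int   # start in the source; not used to order the chain
    score: float   # score of the local alignment


def best_chain(alignments, join_penalty=0.0):
    """Return (total_score, chain) for the maximum-scoring path in the DAG
    whose edges connect alignments that do not overlap in the target."""
    if not alignments:
        return 0.0, []
    # Sorting by target start yields a topological order, because an edge
    # a -> b requires a.t_end <= b.t_start and hence a.t_start < b.t_start.
    nodes = sorted(alignments, key=lambda a: (a.t_start, a.t_end))
    best = [a.score for a in nodes]   # best path score ending at each node
    prev = [None] * len(nodes)        # predecessor index on that path
    for j, b in enumerate(nodes):
        for i in range(j):
            if nodes[i].t_end <= b.t_start:
                candidate = best[i] + b.score - join_penalty
                if candidate > best[j]:
                    best[j] = candidate
                    prev[j] = i
    # Trace back from the highest-scoring end node.
    end = max(range(len(nodes)), key=best.__getitem__)
    total = best[end]
    chain = []
    while end is not None:
        chain.append(nodes[end])
        end = prev[end]
    chain.reverse()
    return total, chain


# Toy input: the best chain covers target 5..30 with two blocks whose
# source positions (40, then 0) appear in reversed order.
alignments = [
    LocalAlignment(0, 10, 50, 8.0),
    LocalAlignment(5, 20, 40, 12.0),
    LocalAlignment(10, 25, 5, 9.0),
    LocalAlignment(20, 30, 0, 7.0),
]
total, chain = best_chain(alignments)
print(total, [(a.t_start, a.t_end, a.s_start) for a in chain])
```

The quadratic double loop keeps the sketch short; since the chain is ordered only along the target, the same recurrence could be evaluated faster with a sorted lookup, at the cost of readability.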