Abstract
This study introduces a method for building a draft de novo assembly of a complex genome when a collection of well-assembled long-insert pools is available. Sequencing and assembling such pools separately reduces the complexity of the assembly problem, and recent sequencing projects have shown it to be a viable strategy for enabling downstream analyses. We outline a framework to tackle this problem: we propose a novel fingerprinting technique to speed up overlap detection, and we describe a merging technique, based on the well-established string graph structure, to carry out the reconciliation step. Finally, we present preliminary results on simulated data sets based on human chromosome 14, obtained with an early implementation of a tool we call Hierarchical Assemblies Merger.
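To give a rough idea of how fingerprints can speed up overlap detection, the following is a minimal sketch and not the technique proposed in the paper, whose details the abstract does not give. It assumes uppercase A/C/G/T sequences and uses Karp-Rabin rolling fingerprints of k-mers; the names `kmer_fingerprints`, `candidate_overlaps`, `k`, and `min_shared` are hypothetical. Pairs of pool contigs that share enough fingerprints become candidates for a more expensive alignment.

```python
from collections import defaultdict
from itertools import combinations

# Assumption: sequences contain only uppercase A, C, G, T.
BASE = 4
MOD = (1 << 61) - 1
ENC = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_fingerprints(seq, k):
    """Return the set of Karp-Rabin fingerprints of all k-mers in seq."""
    if len(seq) < k:
        return set()
    top = pow(BASE, k - 1, MOD)  # weight of the leftmost base in the window
    h = 0
    for ch in seq[:k]:
        h = (h * BASE + ENC[ch]) % MOD
    fps = {h}
    for i in range(k, len(seq)):
        # Slide the window: drop the leftmost base, append the next one.
        h = (h - ENC[seq[i - k]] * top) % MOD
        h = (h * BASE + ENC[seq[i]]) % MOD
        fps.add(h)
    return fps


def candidate_overlaps(contigs, k=5, min_shared=3):
    """Map each contig pair to its number of shared k-mer fingerprints,
    keeping only pairs that share at least min_shared of them."""
    index = defaultdict(set)  # fingerprint -> names of contigs containing it
    for name, seq in contigs.items():
        for fp in kmer_fingerprints(seq, k):
            index[fp].add(name)
    shared = defaultdict(int)
    for names in index.values():
        for a, b in combinations(sorted(names), 2):
            shared[(a, b)] += 1
    return {pair: n for pair, n in shared.items() if n >= min_shared}


contigs = {
    "pool1": "ACGTACGGTCA",
    "pool2": "CGGTCATTGAC",
    "pool3": "TTTTTTTTTTT",
}
print(candidate_overlaps(contigs, k=5, min_shared=2))
```

Only the pairs returned by the filter would need to be aligned, which is the point of fingerprinting: an all-against-all comparison is replaced by hash lookups. In a real setting reverse complements, sequencing errors, and repeats would also have to be handled.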
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Vicedomini, R., Vezzi, F., Scalabrin, S., Arvestad, L., Policriti, A. (2015). Hierarchical Assembly of Pools. In: Ortuño, F., Rojas, I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2015. Lecture Notes in Computer Science(), vol 9044. Springer, Cham. https://doi.org/10.1007/978-3-319-16480-9_21
DOI: https://doi.org/10.1007/978-3-319-16480-9_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16479-3
Online ISBN: 978-3-319-16480-9
eBook Packages: Computer Science (R0)