{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,7,8]],"date-time":"2024-07-08T13:21:53Z","timestamp":1720444913213},"reference-count":20,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,12,16]],"date-time":"2022-12-16T00:00:00Z","timestamp":1671148800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,12,16]],"date-time":"2022-12-16T00:00:00Z","timestamp":1671148800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100006229","name":"Oak Ridge Institute for Science and Education","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100006229","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"Abstract<\/jats:title>\n Background<\/jats:title>\n The Basic Local Alignment Search Tool (BLAST) is a suite of commonly used algorithms for identifying matches between biological sequences. The user supplies a database file and query file of sequences for BLAST to find identical sequences between the two. The typical millions of database and query sequences make BLAST computationally challenging but also well suited for parallelization on high-performance computing clusters. The efficacy of parallelization depends on the data partitioning, where the optimal data partitioning relies on an accurate performance model. In previous studies, a BLAST job was sped up by 27 times by partitioning the database and query among thousands of processor nodes. However, the optimality of the partitioning method was not studied. Unlike BLAST performance models proposed in the literature that usually have problem size and hardware configuration as the only variables, the execution time of a BLAST job is a function of database size, query size, and hardware capability. In this work, the nucleotide BLAST application BLASTN was profiled using three methods: shell-level profiling with the Unix \u201ctime\u201d command, code-level profiling with the built-in \u201cprofiler\u201d module, and system-level profiling with the Unix \u201cgprof\u201d program. The runtimes were measured for six node types, using six different database files and 15 query files, on a heterogeneous HPC cluster with 500+ nodes. The empirical measurement data were fitted with quadratic functions to develop performance models that were used to guide the data parallelization for BLASTN jobs.<\/jats:p>\n <\/jats:sec>\n Results<\/jats:title>\n Profiling results showed that BLASTN contains more than 34,500 different functions, but a single function, RunMTBySplitDB, takes 99.12% of the total runtime. Among its 53 child functions, five core functions were identified to make up 92.12% of the overall BLASTN runtime. Based on the performance models, static load balancing algorithms can be applied to the BLASTN input data to minimize the runtime of the longest job on an HPC cluster. Four test cases being run on homogeneous and heterogeneous clusters were tested. Experiment results showed that the runtime can be reduced by 81% on a homogeneous cluster and by 20% on a heterogeneous cluster by re-distributing the workload.<\/jats:p>\n <\/jats:sec>\n Discussion<\/jats:title>\n Optimal data partitioning can improve BLASTN\u2019s overall runtime 5.4-fold in comparison with dividing the database and query into the same number of fragments. The proposed methodology can be used in the other applications in the BLAST+ suite or any other application as long as source code is available.<\/jats:p>\n <\/jats:sec>","DOI":"10.1186\/s12859-022-05029-7","type":"journal-article","created":{"date-parts":[[2022,12,16]],"date-time":"2022-12-16T15:03:22Z","timestamp":1671203002000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Profiling the BLAST bioinformatics application for load balancing on high-performance computing clusters"],"prefix":"10.1186","volume":"23","author":[{"given":"Trinity","family":"Cheng","sequence":"first","affiliation":[]},{"given":"Pei-Ju","family":"Chin","sequence":"additional","affiliation":[]},{"given":"Kenny","family":"Cha","sequence":"additional","affiliation":[]},{"given":"Nicholas","family":"Petrick","sequence":"additional","affiliation":[]},{"given":"Mike","family":"Mikailov","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,12,16]]},"reference":[{"issue":"3","key":"5029_CR1","doi-asserted-by":"publisher","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","volume":"215","author":"SF Altschul","year":"1990","unstructured":"Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403\u201310.","journal-title":"J Mol Biol"},{"issue":"1","key":"5029_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2105-10-421","volume":"10","author":"C Camacho","year":"2009","unstructured":"Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. Blast+: architecture and applications. BMC Bioinformatics. 2009;10(1):1\u20139.","journal-title":"BMC Bioinformatics"},{"key":"5029_CR3","volume-title":"BLAST \u00aecommand line applications user manual [Internet]","author":"T Madden","year":"2021","unstructured":"Madden T. BLAST \u00aecommand line applications user manual [Internet]. National Center for Biotechnology Information (US), Bethesda: MD; 2021."},{"key":"5029_CR4","volume-title":"The design, implementation, and evaluation of mpiblast","author":"AE Darling","year":"2003","unstructured":"Darling AE, Carey L, Feng WC. The design, implementation, and evaluation of mpiblast. Los Alamos National Laboratory: Technical report; 2003."},{"issue":"8","key":"5029_CR5","doi-asserted-by":"publisher","first-page":"740","DOI":"10.1109\/TPDS.2006.112","volume":"17","author":"C Oehmen","year":"2006","unstructured":"Oehmen C, Nieplocha J. Scalablast: a scalable implementation of blast for high-performance data-intensive bioinformatics analysis. IEEE Trans Parallel Distrib Syst. 2006;17(8):740\u20139.","journal-title":"IEEE Trans Parallel Distrib Syst"},{"issue":"13","key":"5029_CR6","doi-asserted-by":"publisher","first-page":"4335","DOI":"10.1093\/nar\/gki739","volume":"33","author":"YJ Kim","year":"2005","unstructured":"Kim YJ, Boyd A, Athey BD, Patel JM. miBLAST: scalable evaluation of a batch of nucleotide sequence queries with BLAST. Nucleic Acids Res. 2005;33(13):4335\u201344. https:\/\/doi.org\/10.1093\/nar\/gki739.","journal-title":"Nucleic Acids Res."},{"key":"5029_CR7","doi-asserted-by":"publisher","first-page":"3486","DOI":"10.7717\/peerj.3486","volume":"5","author":"WC Yim","year":"2017","unstructured":"Yim WC, Cushman JC. Divide and conquer (dc) blast: fast and easy blast execution within hpc environments. PeerJ. 2017;5:3486.","journal-title":"PeerJ"},{"issue":"1","key":"5029_CR8","doi-asserted-by":"publisher","first-page":"12","DOI":"10.1016\/S1741-8364(04)02383-2","volume":"2","author":"R Luethy","year":"2004","unstructured":"Luethy R, Hoover C. Hardware and software systems for accelerating common bioinformatics sequence analysis algorithms. Drug Discov Today: BIOSILICO. 2004;2(1):12\u20137.","journal-title":"Drug Discov Today: BIOSILICO"},{"issue":"6","key":"5029_CR9","doi-asserted-by":"publisher","first-page":"1678","DOI":"10.1109\/TCBB.2011.33","volume":"8","author":"W Liu","year":"2011","unstructured":"Liu W, Schmidt B, Muller-Wittig W. Cuda-blastp: accelerating blastp on cuda-enabled graphics hardware. IEEE\/ACM Trans Comput Biol Bioinf. 2011;8(6):1678\u201384.","journal-title":"IEEE\/ACM Trans Comput Biol Bioinf"},{"issue":"2","key":"5029_CR10","doi-asserted-by":"publisher","first-page":"182","DOI":"10.1093\/bioinformatics\/btq644","volume":"27","author":"PD Vouzis","year":"2011","unstructured":"Vouzis PD, Sahinidis NV. Gpu-blast: using graphics processors to accelerate protein sequence alignment. Bioinformatics. 2011;27(2):182\u20138.","journal-title":"Bioinformatics"},{"issue":"10","key":"5029_CR11","doi-asserted-by":"publisher","first-page":"1384","DOI":"10.1093\/bioinformatics\/btu047","volume":"30","author":"K Zhao","year":"2014","unstructured":"Zhao K, Chu X. G-blastn: accelerating nucleotide alignment by graphics processors. Bioinformatics. 2014;30(10):1384\u201391.","journal-title":"Bioinformatics"},{"issue":"14","key":"5029_CR12","first-page":"163","volume":"18","author":"M Mikailov","year":"2017","unstructured":"Mikailov M, Luo F-J, Barkley S, Valleru L, Whitney S, Liu Z, Thakkar S, Tong W, Petrick N. Scaling bioinformatics applications on hpc. BMC Bioinformatics. 2017;18(14):163\u20139.","journal-title":"BMC Bioinformatics"},{"key":"5029_CR13","unstructured":"Lastovetsky A, Reddy R: Data partitioning with a realistic performance model of networks of heterogeneous computers with task size limits. In: third international symposium on parallel and distributed computing\/third international workshop on algorithms, models and tools for parallel computing on heterogeneous networks, pp. 133\u2013140 (2004). IEEE"},{"key":"5029_CR14","unstructured":"Lastovetsky A, Reddy R: Data partitioning with a realistic performance model of networks of heterogeneous computers. In: 18th international parallel and distributed processing symposium, 2004. Proceedings., p. 104 (2004). IEEE"},{"key":"5029_CR15","doi-asserted-by":"crossref","unstructured":"Zhong Z, Rychkov V, Lastovetsky A: Data partitioning on heterogeneous multicore and multi-gpu systems using functional performance models of data-parallel applications. In: 2012 IEEE international conference on cluster computing, pp. 191\u2013199 (2012). IEEE","DOI":"10.1109\/CLUSTER.2012.34"},{"issue":"10","key":"5029_CR16","doi-asserted-by":"publisher","first-page":"2176","DOI":"10.1109\/TPDS.2018.2827055","volume":"29","author":"H Khaleghzadeh","year":"2018","unstructured":"Khaleghzadeh H, Manumachu RR, Lastovetsky A. A novel data-partitioning algorithm for performance optimization of data-parallel applications on heterogeneous hpc platforms. IEEE Trans Parallel Distrib Syst. 2018;29(10):2176\u201390.","journal-title":"IEEE Trans Parallel Distrib Syst"},{"key":"5029_CR17","doi-asserted-by":"crossref","unstructured":"Gu Y, Huang Z: Robinia-blast: an extensible parallel blast based on data-intensive distributed computing. In: 2014 IEEE 12th international conference on dependable, autonomic and secure computing, pp. 1\u20136 (2014). IEEE","DOI":"10.1109\/DASC.2014.10"},{"key":"5029_CR18","unstructured":"Cha IE, Rouchka EC: Comparison of current blast software on nucleotide sequences. In: 19th IEEE international parallel and distributed processing symposium, p. 8 (2005). IEEE"},{"issue":"5","key":"5029_CR19","doi-asserted-by":"publisher","first-page":"00307","DOI":"10.1128\/mSphere.00307-17","volume":"2","author":"AS Khan","year":"2017","unstructured":"Khan AS, Ng SH, Vandeputte O, Aljanahi A, Deyati A, Cassart J-P, Charlebois RL, Taliaferro LP. A multicenter study to evaluate the performance of high-throughput sequencing for virus detection. Msphere. 2017;2(5):00307\u201317.","journal-title":"Msphere"},{"issue":"6","key":"5029_CR20","doi-asserted-by":"publisher","first-page":"120","DOI":"10.1145\/872726.806987","volume":"17","author":"SL Graham","year":"1982","unstructured":"Graham SL, Kessler PB, McKusick MK. Gprof: a call graph execution profiler. ACM Sigplan Notices. 1982;17(6):120\u20136.","journal-title":"ACM Sigplan Notices"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-022-05029-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-022-05029-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-022-05029-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,16]],"date-time":"2022-12-16T15:03:46Z","timestamp":1671203026000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-022-05029-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,16]]},"references-count":20,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,12]]}},"alternative-id":["5029"],"URL":"https:\/\/doi.org\/10.1186\/s12859-022-05029-7","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,12,16]]},"assertion":[{"value":"19 April 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"31 October 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 December 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}}],"article-number":"544"}}