Abstract
The serverless computing paradigm simplifies operations and offers highly parallel execution and high scalability without the need to manually manage the underlying infrastructure. This paper evaluates whether recent advancements in AWS Lambda, such as container support and increased computing resource limits, allow it to serve as an underlying platform for running bioinformatics workflows such as basecalling of nanopore reads. For this purpose, we developed a sample workflow centered on the Guppy basecaller and tested it in multiple scenarios. The experimental results showed that AWS Lambda is a viable platform for basecalling: it can basecall nanopore reads from multiple sequencing reads at the same time while keeping infrastructure maintenance overhead low. We also believe that recent improvements to AWS Lambda make it an interesting choice for a growing number of bioinformatics applications.
Acknowledgments
The research was supported by the Polish Ministry of Science and Higher Education as a part of the CyPhiS program at the Silesian University of Technology, Gliwice, Poland (Contract No. POWR.03.02.00-00-I007/17-00) and by Statutory Research funds of the Department of Applied Informatics, Silesian University of Technology, Gliwice, Poland (grant No. BK-221/RAu7/2021).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Grzesik, P., Mrozek, D. (2021). Serverless Nanopore Basecalling with AWS Lambda. In: Paszynski, M., Kranzlmüller, D., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2021. ICCS 2021. Lecture Notes in Computer Science, vol 12743. Springer, Cham. https://doi.org/10.1007/978-3-030-77964-1_44
DOI: https://doi.org/10.1007/978-3-030-77964-1_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-77963-4
Online ISBN: 978-3-030-77964-1
eBook Packages: Computer Science (R0)