Abstract
Technology enhancements and the growing breadth of application workflows running on high-performance computing (HPC) platforms drive the development of new data services that provide high performance on these new platforms, provide capable and productive interfaces and abstractions for a variety of applications, and are readily adapted when new technologies are deployed. The Mochi framework enables composition of specialized distributed data services from a collection of connectable modules and subservices. Rather than forcing all applications to use a one-size-fits-all data staging and I/O software configuration, Mochi allows each application to use a data service specialized to its needs and access patterns. This paper introduces the Mochi framework and methodology. The Mochi core components and microservices are described. Examples of the application of the Mochi methodology to the development of four specialized services are detailed. Finally, a performance evaluation of a Mochi core component, a Mochi microservice, and a composed service providing an object model is performed. The paper concludes by positioning Mochi relative to related work in the HPC space and indicating directions for future work.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Venkatesan S, Aoulaiche M. Overview of 3D NAND technologies and outlook invited paper. In Proc. the 2018 Non-Volatile Memory Technology Symposium, Oct. 2018, Article No. 15.
Hady F T, Foong A, Veal B, Williams D. Platform storage performance with 3D XPoint technology. Proceedings of the IEEE, 2017, 105(9): 1822-1833.
Kim J, Dally W J, Scott S, Abts D. Technology-driven, highly-scalable dragonfly topology. ACM SIGARCH Comput. Architecture News, 2008, 36(3): 77-88.
Besta M, Hoeer T. Slim Fly: A cost effective low-diameter network topology. In Proc. the Int. Conf. for High Performance Comput., Networking, Storage and Anal., November 2014, pp.348-359.
Flajslik M, Borch E, Parker M A. Megafly: A topology for exascale systems. In Proc. the 33rd International Conference on High Performance Computing, June 2018, pp.289-310.
Shpiner A, Haramaty Z, Eliad S, Zdornov V, Gafni B, Zahavi E. Dragonfly+: Low cost topology for scaling datacenters. In Proc. the 3rd IEEE International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era, February 2017, pp.1-8.
Sivaraman G, Beard E, Vazquez-Mayagoitia A, Vishwanath V, Cole J. UV/vis absorption spectra database autogenerated for optical applications via the Argonne data science program. In Proc. the 2019 APS March Meeting, March 2019.
Lockwood G K, Hazen D, Koziol Q et al. Storage 2020: A vision for the future of HPC storage. Technical Report, National Energy Research Scientific Computing Center, 2017. https://escholarship.org/content/qt744479dp/qt744479dp.pdf, Sept. 2019.
Seo S, Amer A, Balaji P et al. Argobots: A lightweight lowlevel 1threading and tasking framework. IEEE Transactions on Parallel and Distributed Systems, 2018, 29(3): 512-526.
Soumagne J, Kimpe D, Zounmevo J, Chaarawi M, Koziol Q, Afsahi A, Ross R. Mercury: Enabling remote procedure call for high-performance computing. In Proc. the 2013 IEEE International Conference on Cluster Computing, September 2013, Article No. 50.
Das A, Gupta I, Motivala A. SWIM: Scalable weaklyconsistent infection-style process group membership protocol. In Proc. the 2002 International Conference on Dependable Systems and Networks, June 2002, pp.303-312.
Rudoff A. Persistent memory programming. Login: The Usenix Magazine, 2017, 42(2): 34-40.
Carns P, Jenkins J, Cranor C, Atchley S, Seo S, Snyder S, Hoeer T, Ross R. Enabling NVM for data-intensive scientific services. In Proc. the 4th Workshop on Interactions of NVM/Flash with Operating Systems and Workloads, November 2016, Article No. 4.
Ghemawat S, Dean J. Level DB — A fast and lightweight key/value database library by Google. https://github.com/google/leveldb, Sept. 2019.
Olson M A, Bostic K, Seltzer M I. Berkeley DB. In Proc. the 1999 USENIX Annual Technical Conference, June 1999, pp.183-191.
Dorier M, Carns P, Harms K et al. Methodology for the rapid development of scalable HPC data services. In Proc. the 3rd Joint International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems, November 2018, pp.76-87.
van der Walt S, Colbert S C, Varoquaux G. The NumPy array: A structure for efficient numerical computation. Computing in Science & Engineering, 2011, 13(2): 22-30.
Rosenblum M, Ousterhout J K. The design and implementation of a log-structured file system. ACM Transactions on Computer Systems, 1992, 10(1): 26-52.
Brun R, Rademakers F. ROOT — An object oriented data analysis framework. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 1997, 389(1/2): 81-86.
Perez D, Cubuk E D, Waterland A, Kaxiras E, Voter A F. Long-time dynamics through parallel trajectory splicing. Journal of Chemical Theory and Computation, 2015, 12(1): 18-28.
Sevilla M A, Maltzahn C, Alvaro P, Nasirigerdeh R, Settlemyer B W, Perez D, Rich D, Shipman G M. Programmable caches with a data management language and policy engine. In Proc. the 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 2018, pp.203-212.
Zheng Q, Cranor C D, Guo D H, Ganger G R, Amvrosiadis G, Gibson G A, Settlemyer B W, Grider G, Guo F. Scaling embedded in-situ indexing with deltaFS. In Proc. the 2018 International Conference for High Performance Computing, Networking, Storage and Analysis, November 2018, Article No. 3.
Greenberg H, Bent J, Grider G. MDHIM: A parallel key/value framework for HPC. In Proc. the 7th USENIX Workshop on Hot Topics in Storage and File Systems, July 2015, Article No. 10.
Weil S A, Leung A W, Brandt S A, Maltzahn C. RADOS: A scalable, reliable storage service for petabyte-scale storage clusters. In Proc. the 2nd International Petascale Data Storage Workshop, November 2007, pp.35-44.
Weil S A, Brandt S A, Miller E L, Long D D E, Maltzahn C. Ceph: A scalable, high-performance distributed file system. In Proc. the 7th USENIX Symposium on Operating Systems Design and Implementation, November 2006, pp.307-320.
Liu J L, Koziol Q, Butler G F, Fortner N, Chaarawi M, Tang H J, Byna S, Lockwood G K, Cheema R, Kallback-Rose K A, Hazen D, Prabhat. Evaluation of HPC application I/O on object storage systems. In Proc. the 3rd IEEE/ACM International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems, November 2018, pp.24-34.
Escriva R, Sirer E G. The design and implementation of the warp transactional file system. In Proc. the 13th USENIX Symposium on Networked Systems Design and Implementation, March 2016, pp.469-483.
Kunkel J, Betke E. An MPI-IO in-memory driver for nonvolatile pooled memory of the Kove XPD. In Proc. the 2017 International Workshops on High Performance Computing, June 2017, pp.679-690.
Latham R, Ross R B, Thakur R. Can MPI be used for persistent parallel services? In Proc. the 13th European PVM/MPI Users’ Group Meeting, September 2006, pp.275-284.
VefM A,Moti N, Süß T, Tocci T, Nou R,Miranda A, Cortes T, Brinkmann A. GekkoFS — A temporary distributed file system for HPC applications. In Proc. the 2018 IEEE International Conference on Cluster Computing, September 2018, pp.319-324.
Wang T, Mohror K, Moody A, Sato K, Yu W K. An ephemeral burst-buffer file system for scientific applications. In Proc. the 2016 International Conference for High Performance Computing, Networking, Storage and Analysis, November 2016, pp.807-818.
Tang H J, Byna S, Tessier F et al. Toward scalable and asynchronous object-centric data management for HPC. In Proc. the 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 2018, pp.113-122.
Intel Corporation. DAOS: Revolutionizing high-performance storage with Intel Optane technology. https://www.intel.com/content/dam/www/public/us/en/documents/solution-briefs/high-performance-storage-brief.pdf, June 2019.
Zhao D F, Zhang Z, Zhou X B, Li T L, Wang K, Kimpe D, Carns P, Ross R, Raicu I. FusionFS: Toward supporting data-intensive scientific applications on extreme-scale high-performance computing systems. In Proc. the 2014 IEEE International Conference on Big Data, October 2014, pp.61-70.
Docan C, Parashar M, Klasky S. DataSpaces: An interaction and coordination framework for coupled simulation workflows. Cluster Computing, 2011, 15(2): 163-181.
Docan C, Parashar M, Klasky S. Enabling high-speed asynchronous data extraction and transfer using DART. Concurrency and Computation: Practice and Experience, 2010, 22(9): 1181-1204.
Duro F R, Blas J G, Isaila F, Pérez J C, Wozniak J M, Ross R. Exploiting data locality in Swift/T workflows using Hercules. In Proc. the 1st Network for Sustainable Ultrascale Computing Workshop, October 2014.
Fitzpatrick B. Distributed caching with Memcached. Linux Journal, 2004, 2004(124): 72-76.
Kim J, Lee S, Vetter J S. PapyrusKV: A high-performance parallel key-value store for distributed NVM architectures. In Proc. the 2017 International Conference for High Performance Computing, Networking, Storage and Analysis, November 2017, Article No. 57.
FringsW, Ahn D H, LeGendre M, Gamblin T, de Supinski B R, Wolf F. Massively parallel loading. In Proc. the 27th International ACM Conference on International Conference on Supercomputing, June 2013, pp.389-398.
Kougkas A, Devarajan H, Lofstead J, Sun X H. LABIOS: A distributed label-based I/O system. In Proc. the 28th International Symposium on High-Performance Parallel and Distributed Computing, June 2019, pp.13-24.
Anwar A, Cheng Y, Huang H, Han J, Sim H, Lee D, Douglis F, Butt A R. BESPOKV: Application tailored scale-out key-value stores. In Proc. the 2018 International Conference for High Performance Computing, Networking, Storage and Analysis, November 2018, Article No. 2.
Ulmer C, Mukherjee S, Templet G, Levy S, Lofstead J, Widener P, Kordenbrock T, Lawson M. Faodel: Data management for next-generation application workflows. In Proc. the 9th Workshop on Scientific Cloud Computing, June 2018, Article No. 8.
Sevilla M A, Watkins N, Jimenez I, Alvaro P, Finkelstein S, LeFevre J, Maltzahn C. Malacology: A programmable storage system. In Proc. the 12th European Conference on Computer Systems, April 2017, pp.175-190.
Author information
Authors and Affiliations
Corresponding author
Electronic supplementary material
ESM 1
(PDF 1345 kb)
Rights and permissions
About this article
Cite this article
Ross, R.B., Amvrosiadis, G., Carns, P. et al. Mochi: Composing Data Services for High-Performance Computing Environments. J. Comput. Sci. Technol. 35, 121–144 (2020). https://doi.org/10.1007/s11390-020-9802-0
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11390-020-9802-0