Abstract
Parallel file systems serve a growing number of applications from a variety of fields. These applications differ in their I/O workload characteristics and therefore place diverse demands on storage resources. However, parallel file systems often adopt a “one-size-fits-all” solution, which fails to meet application-specific needs and prevents the full performance potential from being exploited. This paper presents a framework that enables fine-grained, dynamic file I/O path selection at runtime. The framework adopts a file handle-rich scheme that allows file systems to choose the corresponding optimizations when serving I/O requests. Consistency control algorithms are proposed to ensure data consistency while optimizations are changed at runtime. One case study on our prototype shows that choosing proper optimizations can improve I/O performance for small files and large files by up to 40 % and 64.4 %, respectively. Another case study shows that data prefetch performance for real-world application traces can be improved by up to 193 % by selecting the correct prefetch patterns. Simulations of large-scale environments also show that our method is scalable and that both the memory consumption and the consistency control overhead can be negligible.
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grant Nos. 61232009 and 61370059, the Doctoral Fund of Ministry of Education of China under Grant No. 20101102110018, the fund of the State Key Laboratory of Software Development Environment under Grant No. SKLSDE-2012ZX-23, the Hi-tech Research and Development Program of China (863 Program) under Grant No. 2011AA01A205, and the Beijing Natural Science Foundation under Grant No. 4122042. Prof. Qiu is supported by NSF CNS-1359557 and NSFC 61071061.
Cite this article
Li, X., Xiao, L., Qiu, M. et al. Enabling dynamic file I/O path selection at runtime for parallel file system. J Supercomput 68, 996–1021 (2014). https://doi.org/10.1007/s11227-013-1077-6
DOI: https://doi.org/10.1007/s11227-013-1077-6