Nscale: an efficient RAID-6 online scaling via optimizing data migration

The Journal of Supercomputing

Abstract

The emergence of novel storage media relieves the pressure that massive data places on large-scale data centers. However, storage cost remains a challenge that cannot be ignored. As a trade-off between capacity and cost, RAID offers large capacity, low cost, high reliability, and flexible scaling, and it occupies a large share of the storage market. Today, RAID scaling is the most frequent operation in storage systems. Nevertheless, it still suffers from long scaling times and a poor user experience. Therefore, we propose Nscale, an approach for N-Code-based RAID-6 scaling. Nscale shortens the total scaling time by optimizing the data migration process and reducing the amount of data migrated. Meanwhile, it ensures that data moves only in the horizontal direction and within the same parity chain. In addition, it keeps the diagonal parity chains intact as far as possible. Experimental results show that Nscale reduces data migration by 81.05–92.35% and shortens the total scaling time by 54.5–62.4% in offline scaling. During and after the scaling process, Nscale also demonstrates excellent average user response time under different workloads, providing a favorable user experience.
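
As a rough illustration of why the amount of migrated data matters during scaling, the sketch below compares naive round-robin restriping with the lower bound on data that must move if the new disks are to hold an equal share. It is not the Nscale algorithm: it ignores parity and the N-Code layout entirely, and the function names and disk counts are assumptions chosen only for illustration.

```python
from fractions import Fraction


def round_robin_moved_fraction(old_disks: int, added_disks: int, blocks: int) -> Fraction:
    """Fraction of data blocks that change disk under naive round-robin restriping.

    Block b sits on disk (b mod old_disks) before scaling and on
    disk (b mod (old_disks + added_disks)) afterwards. Parity is ignored.
    """
    total = old_disks + added_disks
    moved = sum(1 for b in range(blocks) if b % old_disks != b % total)
    return Fraction(moved, blocks)


def minimum_moved_fraction(old_disks: int, added_disks: int) -> Fraction:
    """Lower bound on moved data if every disk must end up with an equal share."""
    return Fraction(added_disks, old_disks + added_disks)


if __name__ == "__main__":
    # Hypothetical scenario: grow a 4-disk array by 1 disk, 20 data blocks.
    print(round_robin_moved_fraction(4, 1, 20))
    print(minimum_moved_fraction(4, 1))
```

In this toy setting, round-robin restriping relocates most blocks, while an equal-share layout needs to move only the share destined for the new disk. The reductions reported in the abstract were measured against other scaling schemes and should not be read off this sketch.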

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work is supported by the Key Laboratory Foundation of IoT of Qinghai under Grant 2022-ZJ-Y21. Ping Xie and Zhu Yuan contributed equally to the work and should be regarded as co-first authors.

Funding

This work was funded by the Key Laboratory Foundation of IoT of Qinghai under Grant 2022-ZJ-Y21.

Author information

Corresponding author

Correspondence to Ping Xie.

Ethics declarations

Conflict of interest

This manuscript falls within the scope of engineering and does not involve human or animal research. All authors of this manuscript have given their informed consent.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xie, P., Yuan, Z. & Hu, Y. Nscale: an efficient RAID-6 online scaling via optimizing data migration. J Supercomput 79, 2383–2403 (2023). https://doi.org/10.1007/s11227-022-04752-5

  • DOI: https://doi.org/10.1007/s11227-022-04752-5

Keywords