The Optimal ChunkSize Pair Choosing Method for Dual Layered Deduplication Backup System

  • Conference paper
Beyond Databases, Architectures, and Structures (BDAS 2014)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 424)

Abstract

This paper proposes a multi-layer deduplication system for backup processes in IT environments. The proposed system eliminates duplication in the target data by applying a series of different ChunkSizes. Its effectiveness is evaluated with a simulator and a prototype running on a Linux server, in comparison with a conventional single-chunk system. The system achieves better deduplication capability with far fewer duplicated-chunk reductions. The paper then proposes a method for choosing the optimal ChunkSizes to assign to the layers, which maximizes the deduplication capability and minimizes the performance degradation caused by chunk reductions.
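
The idea of applying two ChunkSizes in sequence can be illustrated with a minimal sketch. The abstract does not specify the chunking algorithm, the fingerprint function, or the order of the layers, so everything below is an assumption made for illustration: fixed-size chunking, SHA-256 fingerprints, a large-chunk layer followed by a small-chunk layer, and the hypothetical names `chunks` and `dual_layer_dedup`. It is not the authors' implementation.

```python
import hashlib


def chunks(data: bytes, size: int):
    """Yield consecutive fixed-size chunks (the last one may be shorter)."""
    for i in range(0, len(data), size):
        yield data[i:i + size]


def dual_layer_dedup(data: bytes, large: int, small: int, store: set) -> int:
    """Return the number of bytes that must be newly stored.

    Assumed layer 1: skip large chunks whose fingerprint is already known.
    Assumed layer 2: re-chunk the remaining large chunks at the small size
    and skip small chunks whose fingerprint is already known.
    """
    new_bytes = 0
    for big in chunks(data, large):
        big_key = (large, hashlib.sha256(big).digest())
        if big_key in store:
            continue  # whole large chunk is a duplicate
        store.add(big_key)
        for piece in chunks(big, small):
            piece_key = (small, hashlib.sha256(piece).digest())
            if piece_key in store:
                continue  # small chunk is a duplicate
            store.add(piece_key)
            new_bytes += len(piece)
    return new_bytes


# Sixteen distinct 256-byte blocks, 4096 bytes in total.
first = b"".join(bytes([i]) * 256 for i in range(16))
second = bytearray(first)
second[300] = 0xFF  # change one byte inside the second 256-byte block

store = set()
print(dual_layer_dedup(first, 1024, 256, store))          # 4096
print(dual_layer_dedup(bytes(second), 1024, 256, store))  # 256
```

In this toy run the second backup skips three of the four large chunks with a single lookup each, and only the one changed small chunk is stored. The trade-off between lookup count and bytes saved is the kind of balance a ChunkSize pair has to strike, though how the paper quantifies it is not stated in the abstract.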

Author information

Corresponding author

Correspondence to Mikito Ogata.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Ogata, M., Komoda, N. (2014). The Optimal ChunkSize Pair Choosing Method for Dual Layered Deduplication Backup System. In: Kozielski, S., Mrozek, D., Kasprowski, P., Małysiak-Mrozek, B., Kostrzewa, D. (eds) Beyond Databases, Architectures, and Structures. BDAS 2014. Communications in Computer and Information Science, vol 424. Springer, Cham. https://doi.org/10.1007/978-3-319-06932-6_38

  • DOI: https://doi.org/10.1007/978-3-319-06932-6_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06931-9

  • Online ISBN: 978-3-319-06932-6

  • eBook Packages: Computer Science, Computer Science (R0)
