Abstract
Owing to the exponential growth of real-time data generation, stream processing is becoming increasingly important. However, its data processing paradigm differs considerably from that of conventional workloads, so the memory systems used in existing data centers cannot be expected to deliver high performance. To solve this problem, this paper proposes two main solutions. First, a hybrid main memory architecture with a small buffer is designed to reflect the execution characteristics of stream processing. Second, a hardware-based prefetch module supports correlation prefetching. Because stream processing tends to accept incoming data in main memory, the prefetch module moves data from the main memory layer to the buffer layer based on an intelligent clustering algorithm that adapts to the rapidly changing data access patterns of stream processing applications. With heterogeneous main memories, the system benefits not only from the fast access latency of DRAM but also from the nonvolatility, scalability, and low power consumption of nonvolatile memory. The proposed hybrid memory architecture with our prefetch buffer structure can improve the buffer hit rate by 9–14% over other prefetch methods, reduce energy consumption by 26% relative to the conventional DRAM-only model, and achieve similar execution time with only 1/8 of the DRAM capacity of the DRAM-only model.
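As a rough illustration of the correlation prefetching idea mentioned in the abstract, the following minimal sketch models a small LRU buffer in front of main memory, with a table that records which page tends to follow which. It is not the clustering algorithm proposed in the paper, which the abstract does not specify; it substitutes a simple pairwise successor count in the spirit of Markov predictors. The names `CorrelationPrefetchBuffer`, `run_trace`, `capacity`, and `degree` are hypothetical and chosen only for this example.

```python
from collections import Counter, OrderedDict, defaultdict


class CorrelationPrefetchBuffer:
    """Toy model (assumption, not the paper's design): an LRU buffer
    filled by demand accesses and by pairwise correlation prefetches."""

    def __init__(self, capacity=4, degree=1):
        self.capacity = capacity                 # number of pages the buffer holds
        self.degree = degree                     # successors prefetched per access
        self.buffer = OrderedDict()              # page -> None, oldest entry first
        self.successors = defaultdict(Counter)   # page -> counts of pages seen next
        self.last_page = None
        self.hits = 0
        self.accesses = 0

    def _insert(self, page):
        """Place a page in the buffer, evicting the least recently used one."""
        if page in self.buffer:
            self.buffer.move_to_end(page)
            return
        if len(self.buffer) >= self.capacity:
            self.buffer.popitem(last=False)
        self.buffer[page] = None

    def access(self, page):
        """Access one page; return True on a buffer hit."""
        self.accesses += 1
        hit = page in self.buffer
        if hit:
            self.hits += 1
        self._insert(page)

        # Learn the correlation "last_page was followed by page".
        if self.last_page is not None:
            self.successors[self.last_page][page] += 1
        self.last_page = page

        # Prefetch the most frequently observed successors of this page.
        for nxt, _count in self.successors[page].most_common(self.degree):
            self._insert(nxt)
        return hit


def run_trace(trace, capacity, degree):
    """Replay a page-access trace and return the number of buffer hits."""
    model = CorrelationPrefetchBuffer(capacity=capacity, degree=degree)
    for page in trace:
        model.access(page)
    return model.hits
```

On the cyclic trace `[1, 2, 3] * 3` with a two-page buffer, `run_trace` reports 5 hits out of 9 accesses with `degree=1` and 0 hits with `degree=0`, because plain LRU always evicts the page that is needed next. The figures describe this toy model only and say nothing about the hit rates reported in the paper.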















Acknowledgements
This research was partially supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (NRF-2015M3C4A7065522) and by an Industry-Academy joint research program between Samsung Electronics and Yonsei University.
Cite this article
Lee, S.M., Yoon, SK., Kim, JG. et al. Adaptive correlated prefetch with large-scale hybrid memory system for stream processing. J Supercomput 74, 4746–4770 (2018). https://doi.org/10.1007/s11227-018-2466-7
DOI: https://doi.org/10.1007/s11227-018-2466-7