On Automated Feedback-Driven Data Placement in Multi-tiered Memory

  • Conference paper
Architecture of Computing Systems – ARCS 2018 (ARCS 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10793)

Abstract

The recent emergence of systems with multiple performance and capacity tiers of memory invites a fresh consideration of strategies for optimal placement of data into the various tiers. This work explores a variety of cross-layer strategies for managing application data in multi-tiered memory. We propose new profiling techniques based on the automatic classification of program allocation sites, with the goal of using those classifications to guide memory tier assignments. We evaluate our approach with different profiling inputs and application strategies, and show that it outperforms other state-of-the-art management techniques.

Notes

  1. This allocation-site-based strategy for optimizing accesses-per-byte is designed to obviate tracing or sampling on an object-by-object basis.

  2. The primary goal of this work is to study the potential benefits of automated application guidance. While our simulation-based evaluation neglects the overhead of profiling, Sect. 4 covers how, in practice, allocation-site-based guidance can be generated (either online or offline) and applied in direct execution with negligible overhead.

  3. Phase transitions may be detected online by several means, including models of instruction and data access behavior and hardware event ratios.

  4. For direct execution, an alternative to Pin-based instrumentation is to use LLVM-inserted wrappers (as described in Sect. 4.2.1) and to sample access rates through hardware-based counters (e.g., using the PEBS facility on modern Intel processors).

  5. Other, more compact encodings of the allocation sites may also be employed; e.g., a low-overhead approximate method in direct execution is to use a hash over the (call-return) last branch records (LBR) recorded by a processor’s monitoring unit.

  6. We had to omit zeusmp due to an incompatibility with our adopted basic block vector collection tool [23].
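
As a rough illustration of the accesses-per-byte idea in Note 1, the sketch below ranks allocation sites by accesses per byte and greedily fills a capacity-limited fast tier. It is a minimal sketch under stated assumptions, not the authors' implementation: the `SiteProfile` fields, the `assign_tiers` function, the greedy fill policy, and the sample numbers are all hypothetical and chosen only to make the ranking concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteProfile:
    """Profile totals for one allocation site (hypothetical fields)."""
    site: str             # label for the site, e.g. a call-stack hash
    bytes_allocated: int  # bytes attributed to this site
    accesses: int         # accesses observed to data from this site

    @property
    def accesses_per_byte(self) -> float:
        # Guard against empty sites so the ratio is always defined.
        if self.bytes_allocated == 0:
            return 0.0
        return self.accesses / self.bytes_allocated


def assign_tiers(profiles, fast_capacity_bytes):
    """Greedy policy (an assumption): the hottest sites by accesses per
    byte go to the fast tier while they still fit; the rest go to slow."""
    placement = {}
    remaining = fast_capacity_bytes
    ranked = sorted(profiles, key=lambda p: p.accesses_per_byte, reverse=True)
    for p in ranked:
        if p.bytes_allocated <= remaining:
            placement[p.site] = "fast"
            remaining -= p.bytes_allocated
        else:
            placement[p.site] = "slow"
    return placement


# Made-up profile data, for illustration only.
profiles = [
    SiteProfile("site_a", 4096, 400_000),       # small and very hot
    SiteProfile("site_b", 1 << 20, 2_000_000),  # large, moderately hot
    SiteProfile("site_c", 8192, 16_384),        # small, moderately hot
]
placement = assign_tiers(profiles, fast_capacity_bytes=16_384)
print(placement)
```

Because the decision is made once per allocation site rather than per object, the runtime only needs to identify the site of each allocation, which is the property Note 1 highlights.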

References

  1. Intel: 3D XPoint (2016). http://www.intel.com/content/www/us/en/architecture-and-technology/3d-xpoint-unveiled-video.html

  2. Mittal, S., Vetter, J.S.: A survey of techniques for architecting DRAM caches. IEEE Trans. Parallel Distrib. Syst. 27(6), 1852–1863 (2016)

  3. Meswani, M., Blagodurov, S., Roberts, D., Slice, J., Ignatowski, M., Loh, G.: Heterogeneous memory architectures: a HW/SW approach for mixing die-stacked and off-package memories. In: HPCA (February 2015)

  4. Li, Y., Ghose, S., Choi, J., Sun, J., Wang, H., Mutlu, O.: Utility-based hybrid memory management. In: IEEE CLUSTER (September 2017)

  5. Cantalupo, C., Venkatesan, V., Hammond, J.R.: User extensible heap manager for heterogeneous memory platforms and mixed memory policies (2015). http://memkind.github.io/memkind/memkind_arch_20150318.pdf

  6. Dulloor, S.R., et al.: Data tiering in heterogeneous memory systems. In: Eleventh European Conference on Computer Systems, p. 15. ACM (2016)

  7. Agarwal, N., et al.: Page placement strategies for GPUs within heterogeneous memory systems. SIGPLAN Not. 50(4), 607–618 (2015)

  8. Luk, C.K., et al.: Pin: building customized program analysis tools with dynamic instrumentation. SIGPLAN Not. 40(6), 190–200 (2005)

  9. Evans, J.: A scalable concurrent malloc (3) implementation for FreeBSD (2006)

  10. Kim, Y., Yang, W., Mutlu, O.: Ramulator: a fast and extensible DRAM simulator. IEEE Comput. Archit. Lett. 15(1), 45–49 (2016). https://doi.org/10.1109/LCA.2015.2414456

  11. Giardino, M., Doshi, K., Ferri, B.H.: Soft2LM: application guided heterogeneous memory management. In: IEEE International Conference on Networking, Architecture and Storage (NAS), USA, pp. 1–10 (2016)

  12. Agarwal, N., Wenisch, T.F.: Thermostat: application-transparent page management for two-tiered main memory. In: ASPLOS 2017, pp. 631–644. ACM, New York (2017)

  13. Peng, I.B., Gioiosa, R., Kestor, G., Cicotti, P., Laure, E., Markidis, S.: RTHMS: a tool for data placement on hybrid memory system. In: ISMM (2017)

  14. Servat, H., Peña, A.J., Llort, G., Mercadal, E., Hoppe, H., Labarta, J.: Automating the application data placement in hybrid memory systems. In: 2017 IEEE International Conference on Cluster Computing (CLUSTER) (September 2017)

  15. Dashti, M., Fedorova, A., Funston, J., Gaud, F., Lachaize, R., Lepers, B., Quema, V., Roth, M.: Traffic management: a holistic approach to memory placement on NUMA systems. SIGPLAN Not. 48(4), 381–394 (2013)

  16. Jantz, M.R., et al.: A framework for application guidance in virtual memory systems. In: Virtual Execution Environments. VEE 2013, pp. 155–166 (2013)

  17. Jantz, M.R., et al.: Cross-layer memory management for managed language applications. In: ACM/SIGPLAN OOPSLA. ACM, New York (2015)

  18. Guo, R., Liao, X., Jin, H., Yue, J., Tan, G.: NightWatch: integrating lightweight and transparent cache pollution control into dynamic memory allocation systems. In: 2015 USENIX Annual Technical Conference (USENIX ATC 15), pp. 307–318 (2015)

  19. Lattner, C., Adve, V.: LLVM: a compilation framework for lifelong program analysis & transformation. In: Code Generation and Optimization (2004)

  20. Sodani, A.: Knights Landing (KNL): 2nd generation Intel® Xeon Phi processor. In: 2015 IEEE Hot Chips 27 Symposium (HCS), pp. 1–24. IEEE (2015)

  21. Hamerly, G., Perelman, E., Lau, J., Calder, B.: Simpoint 3.0. J. Instr. Level Parallelism 7(4), 1–28 (2005)

  22. Henning, J.L.: SPEC CPU2006 benchmark descriptions. ACM SIGARCH Comput. Archit. News 34(4), 1–17 (2006)

  23. Nethercote, N., Seward, J.: Valgrind: a framework for heavyweight dynamic binary instrumentation. In: PLDI (2007)

Acknowledgements

This research is supported in part by the National Science Foundation under CCF-1619140, CCF-1617954, and CNS-1464288, as well as a grant from the Software and Services Group (SSG) at Intel®.

Author information

Corresponding author

Correspondence to Michael R. Jantz.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Effler, T.C., Howard, A.P., Zhou, T., Jantz, M.R., Doshi, K.A., Kulkarni, P.A. (2018). On Automated Feedback-Driven Data Placement in Multi-tiered Memory. In: Berekovic, M., Buchty, R., Hamann, H., Koch, D., Pionteck, T. (eds) Architecture of Computing Systems – ARCS 2018. ARCS 2018. Lecture Notes in Computer Science, vol 10793. Springer, Cham. https://doi.org/10.1007/978-3-319-77610-1_14

  • DOI: https://doi.org/10.1007/978-3-319-77610-1_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77609-5

  • Online ISBN: 978-3-319-77610-1

  • eBook Packages: Computer Science, Computer Science (R0)
