Deterministic Memory Abstraction and Supporting Multicore System Architecture

Authors: Farzad Farshchi, Prathap Kumar Valsan, Renato Mancuso, Heechul Yun



File

LIPIcs.ECRTS.2018.1.pdf
  • Filesize: 2.42 MB
  • 25 pages

Author Details

Farzad Farshchi
  • University of Kansas, USA
Prathap Kumar Valsan
  • Intel, USA
Renato Mancuso
  • Boston University, USA
Heechul Yun
  • University of Kansas, USA

Cite As

Farzad Farshchi, Prathap Kumar Valsan, Renato Mancuso, and Heechul Yun. Deterministic Memory Abstraction and Supporting Multicore System Architecture. In 30th Euromicro Conference on Real-Time Systems (ECRTS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 106, pp. 1:1-1:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018) https://doi.org/10.4230/LIPIcs.ECRTS.2018.1

Abstract

Poor time predictability of multicore processors has been a long-standing challenge in the real-time systems community. In this paper, we make the case that a fundamental problem preventing efficient and predictable real-time computing on multicore is the lack of a proper memory abstraction to express memory criticality, which cuts across various layers of the system: the application, OS, and hardware. We therefore propose a new holistic resource management approach driven by a new memory abstraction, which we call Deterministic Memory. The key characteristic of deterministic memory is that the platform (the OS and hardware) guarantees small and tightly bounded worst-case memory access timing. In contrast, we call the conventional memory abstraction best-effort memory, for which only highly pessimistic worst-case bounds can be achieved. We propose to utilize both abstractions to achieve high time predictability without significantly sacrificing performance. We present deterministic memory-aware OS and architecture designs, including designs for an OS-level page allocator, a hardware-level cache, and a DRAM controller. We implement the proposed OS and architecture extensions on Linux and the gem5 simulator. Our evaluation results, using a set of synthetic and real-world benchmarks, demonstrate the feasibility and effectiveness of our approach.
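
The abstract names the two memory abstractions but does not spell out the mechanisms behind them. As an illustration only, the following Python toy model sketches one way a shared cache set might treat deterministic and best-effort lines differently. Everything in it is an assumption made for this sketch and is not taken from the paper: the class `ToyCacheSet`, the per-core way partition, the LRU victim choice, and the rule that best-effort fills never evict a deterministic line.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Line:
    tag: Optional[int] = None      # None means the way is empty
    deterministic: bool = False    # hypothetical per-line "deterministic memory" flag
    last_used: int = 0             # logical timestamp used for LRU


class ToyCacheSet:
    """One set of a shared cache with a hypothetical deterministic-aware policy."""

    def __init__(self, num_ways: int, partition: Dict[int, Set[int]]):
        # partition maps a core id to the ways it may use for deterministic lines.
        self.lines: List[Line] = [Line() for _ in range(num_ways)]
        self.partition = partition
        self.clock = 0

    def access(self, core: int, tag: int, deterministic: bool) -> bool:
        """Return True on a hit, False on a miss (filling the line if possible)."""
        self.clock += 1
        for line in self.lines:
            if line.tag == tag:
                line.last_used = self.clock
                return True
        if deterministic:
            # Deterministic fills stay inside the requesting core's own ways.
            candidates = sorted(self.partition.get(core, set()))
        else:
            # Best-effort fills may use any way that holds no deterministic line.
            candidates = [i for i, l in enumerate(self.lines) if not l.deterministic]
        if not candidates:
            return False  # nothing evictable: the access bypasses the cache
        # Prefer an empty way, otherwise evict the least recently used candidate.
        victim = min(candidates,
                     key=lambda i: (self.lines[i].tag is not None,
                                    self.lines[i].last_used))
        self.lines[victim] = Line(tag, deterministic, self.clock)
        return False


def demo(deterministic: bool) -> List[bool]:
    """Core 0 loads two lines, core 1 floods the set, core 0 re-reads its lines."""
    cache_set = ToyCacheSet(num_ways=4, partition={0: {0, 1}, 1: {2, 3}})
    for tag in (10, 11):
        cache_set.access(core=0, tag=tag, deterministic=deterministic)
    for tag in range(100, 120):
        cache_set.access(core=1, tag=tag, deterministic=False)
    return [cache_set.access(core=0, tag=t, deterministic=deterministic)
            for t in (10, 11)]
```

In this toy model, `demo(True)` reports hits for both of core 0's lines after the flood, while `demo(False)` reports misses, because best-effort lines compete for all four ways. The point is only to make the contrast between the two abstractions concrete; real hardware would also need the OS to convey which pages are deterministic.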

Subject Classification

ACM Subject Classification
  • Computer systems organization → Real-time systems
Keywords
  • multicore processors
  • real-time
  • shared cache
  • DRAM controller
  • Linux
