54th MICRO 2021: Virtual Event, Greece
- MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, Virtual Event, Greece, October 18-22, 2021. ACM 2021, ISBN 978-1-4503-8557-2
Session 1: Best Paper Session
- Zhiyao Xie, Xiaoqing Xu, Matt Walker, Joshua Knebel, Kumaraguru Palaniswamy, Nicolas Hebert, Jiang Hu, Huanrui Yang, Yiran Chen, Shidhartha Das:
APOLLO: An Automated Power Modeling Framework for Runtime Power Introspection in High-Volume Commercial Microprocessors. 1-14
- Björn Gottschall, Lieven Eeckhout, Magnus Jahre:
TIP: Time-Proportional Instruction Profiling. 15-27
- Yu-Chia Liu, Hung-Wei Tseng:
NDS: N-Dimensional Storage. 28-45
- Harini Muthukrishnan, Daniel Lustig, David W. Nellans, Thomas F. Wenisch:
GPS: A Global Publish-Subscribe Model for Multi-GPU Memory Management. 46-58
Session 2A: Non-Volatile Memory
- Congming Gao, Xin Xin, Youyou Lu, Youtao Zhang, Jun Yang, Jiwu Shu:
ParaBit: Processing Parallel Bitwise Operations in NAND Flash Memory based SSDs. 59-70
- Apostolos Kokolis, Antonis Psistakis, Benjamin Reidys, Jian Huang, Josep Torrellas:
Distributed Data Persistency. 71-85
- Marina Vemmou, Alexandros Daglis:
COSPlay: Leveraging Task-Level Parallelism for High-Throughput Synchronous Persistence. 86-99
- Minh S. Q. Truong, Eric Chen, Deanyone Su, Liting Shen, Alexander Glass, L. Richard Carley, James A. Bain, Saugata Ghose:
RACER: Bit-Pipelined Processing Using Resistive Memory. 100-116
- Md Hafizul Islam Chowdhuryy, Muhammad Rashedul Haq Rashed, Amro Awad, Rickard Ewetz, Fan Yao:
LADDER: Architecting Content and Location-aware Writes for Crossbar Resistive Memories. 117-130
Session 2B: Energy Efficiency & Low Power
- Seunghak Lee, Ki-Dong Kang, Hwanjun Lee, Hyungwon Park, Young Hoon Son, Nam Sung Kim, Daehoon Kim:
GreenDIMM: OS-assisted DRAM Power Management for DRAM with a Sub-array Granularity Power-Down State. 131-142
- Ki-Dong Kang, Gyeongseo Park, Hyosang Kim, Mohammad Alian, Nam Sung Kim, Daehoon Kim:
NMAP: Power Management Based on Network Packet Processing Mode Transition for Latency-Critical Workloads. 143-154
- Jawad Haj-Yahya, Jisung Park, Rahul Bera, Juan Gómez-Luna, Efraim Rotem, Taha Shahroodi, Jeremie S. Kim, Onur Mutlu:
BurstLink: Techniques for Energy-Efficient Video Display for Conventional and Virtual Reality Systems. 155-169
- Jianping Zeng, Jongouk Choi, Xinwei Fu, Ajay Paddayuru Shreepathi, Dongyoon Lee, Changwoo Min, Changhee Jung:
ReplayCache: Enabling Volatile Caches for Energy Harvesting Systems. 170-182
- Young Geun Kim, Carole-Jean Wu:
AutoFL: Enabling Heterogeneity-Aware Energy Efficient Federated Learning. 183-198
Session 3A: Security & Privacy I
- Luyi Kang, Yuqi Xue, Weiwei Jia, Xiaohao Wang, Jongryool Kim, Changhwan Youn, Myeong Joon Kang, Hyung Jin Lim, Bruce L. Jacob, Jian Huang:
IceClave: A Trusted Execution Environment for In-Storage Computing. 199-211
- Hanieh Hashemi, Yongqin Wang, Murali Annavaram:
DarKnight: An Accelerated Framework for Privacy and Integrity Preserving Deep Learning Using Trusted Hardware. 212-224
- Yonggan Fu, Yang Zhao, Qixuan Yu, Chaojian Li, Yingyan Lin:
2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency. 225-237
- Nikola Samardzic, Axel Feldmann, Aleksandar Krastev, Srinivas Devadas, Ronald G. Dreslinski, Christopher Peikert, Daniel Sánchez:
F1: A Fast and Programmable Accelerator for Fully Homomorphic Encryption. 238-252
- Michael LeMay, Joydeep Rakshit, Sergej Deutsch, David M. Durham, Santosh Ghosh, Anant Nori, Jayesh Gaur, Andrew Weiler, Salmin Sultana, Karanvir Grewal, Sreenivas Subramoney:
Cryptographic Capability Computing. 253-267
Session 3B: Processing In/Near Memory
- Jaehyun Park, Byeongho Kim, Sungmin Yun, Eojin Lee, Minsoo Rhu, Jung Ho Ahn:
TRiM: Enhancing Processor-Memory Interfaces with Scalable Tensor Reduction in Memory. 268-281
- Maciej Besta, Raghavendra Kanakagiri, Grzegorz Kwasniewski, Rachata Ausavarungnirun, Jakub Beránek, Konstantinos Kanellopoulos, Kacper Janda, Zur Vonarburg-Shmaria, Lukas Gianinazzi, Ioana Stefan, Juan Gómez-Luna, Jakub Golinowski, Marcin Copik, Lukas Kapp-Schwoerer, Salvatore Di Girolamo, Nils Blach, Marek Konieczny, Onur Mutlu, Torsten Hoefler:
SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems. 282-297
- Anirban Nag, Rajeev Balasubramonian:
OrderLight: Lightweight Memory-Ordering Primitive for Efficient Fine-Grained PIM Computations. 298-310
- Elaheh Sadredini, Reza Rahimi, Mohsen Imani, Kevin Skadron:
Sunder: Enabling Low-Overhead and Scalable Near-Data Pattern Matching Acceleration. 311-323
- Xin Xin, Yanan Guo, Youtao Zhang, Jun Yang:
SAM: Accelerating Strided Memory Accesses. 324-336
Session 4A: Parallelism
- Eduardo José Gómez-Hernández, Juan M. Cebrian, J. Rubén Titos Gil, Stefanos Kaxiras, Alberto Ros:
Efficient, Distributed, and Non-Speculative Multi-Address Atomic Operations. 337-349
- Joseph Zuckerman, Davide Giri, Jihye Kwon, Paolo Mantovani, Luca P. Carloni:
Cohmeleon: Learning-Based Orchestration of Accelerator Coherence in Heterogeneous SoCs. 350-365
- Vanshika Baoni, Adarsh Mittal, Gurindar S. Sohi:
Fat Loads: Exploiting Locality Amongst Contemporaneous Load Operations to Optimize Cache Accesses. 366-379
- Aniket Anand Deshmukh, Yale N. Patt:
Criticality Driven Fetch. 380-391
- Philip Bedoukian, Neil Adit, Edwin Peguero, Adrian Sampson:
Software-Defined Vector Processing on Manycore Fabrics. 392-406
Session 4B: Accelerators I
- Arash Pourhabibi Zarandi, Mark Sutherland, Alexandros Daglis, Babak Falsafi:
Cerebros: Evading the RPC Tax in Datacenters. 407-420
- Mario Drumond, Louis Coulon, Arash Pourhabibi Zarandi, Ahmet Caner Yüzügüler, Babak Falsafi, Martin Jaggi:
Equinox: Training (for Free) on a Custom Inference Accelerator. 421-433
- Seongyoung Kang, Jiyoung An, Jinpyo Kim, Sang-Woo Jun:
: Near-Storage Accelerator for High-Performance Log Analytics. 434-448
- Yujun Lin, Zhekai Zhang, Haotian Tang, Hanrui Wang, Song Han:
PointAcc: Efficient Point Cloud Accelerator. 449-461
- Sagar Karandikar, Chris Leary, Chris Kennelly, Jerry Zhao, Dinesh Parimi, Borivoje Nikolic, Krste Asanovic, Parthasarathy Ranganathan:
A Hardware Accelerator for Protocol Buffers. 462-478
Session 5A: Accelerators II
- Weizhuang Liu, Bo Yu, Yiming Gan, Qiang Liu, Jie Tang, Shaoshan Liu, Yuhao Zhu:
Archytas: A Framework for Synthesizing and Dynamically Optimizing Accelerators for Robotic Localization. 479-493
- Shulin Zhao, Haibo Zhang, Cyan Subhra Mishra, Sandeepa Bhuyan, Ziyu Ying, Mahmut Taylan Kandemir, Anand Sivasubramaniam, Chita R. Das:
HoloAR: On-the-fly Optimization of 3D Holographic Processing for Augmented Reality. 494-506
- David Trilla, John-David Wellman, Alper Buyuktosunoglu, Pradip Bose:
NOVIA: A Framework for Discovering Non-Conventional Inline Accelerators. 507-521
- Ameer M. S. Abdelhadi, Eugene Sha, Ciaran Bannon, Hendrik Steenland, Andreas Moshovos:
Noema: Hardware-Efficient Template Matching for Neural Population Pattern Detection. 522-534
- Timothy Dunn, Harisankar Sadasivan, Jack Wadden, Kush Goliya, Kuan-Yu Chen, David T. Blaauw, Reetuparna Das, Satish Narayanasamy:
SquiggleFilter: An Accelerator for Portable Virus Detection. 535-549
Session 5B: Security & Privacy II
- Joonsung Kim, Hamin Jang, Hunjun Lee, Seungho Lee, Jangwoo Kim:
UC-Check: Characterizing Micro-operation Caches in x86 Processors and Implications in Security and Performance. 550-564
- Jaeguk Ahn, Jiho Kim, Hans Kasan, Zhixian Jin, Leila Delshadtehrani, WonJun Song, Ajay Joshi, John Kim:
Network-on-Chip Microarchitecture-based Covert Channel in GPUs. 565-577
- Pablo Buiras, Hamed Nemati, Andreas Lindner, Roberto Guanciale:
Validation of Side-Channel Models via Observation Refinement. 578-591
- Sam Ainsworth:
GhostMinion: A Strictness-Ordered Cache System for Spectre Mitigation. 592-606
- Rutvik Choudhary, Jiyong Yu, Christopher W. Fletcher, Adam Morrison:
Speculative Privacy Tracking (SPT): Leaking Information From Speculative Execution Without Compromising Privacy. 607-622
Session 6A: Reliability & Verification
- Minesh Patel, Geraldo F. Oliveira, Onur Mutlu:
HARP: Practically and Effectively Identifying Uncorrectable Errors in Memory Chips That Use On-Die Error-Correcting Codes. 623-640
- Michael B. Sullivan, Nirmal R. Saxena, Mike O'Connor, Donghyuk Lee, Paul Racunas, Saurabh Hukerikar, Timothy Tsai, Siva Kumar Sastry Hari, Stephen W. Keckler:
Characterizing and Mitigating Soft Errors in GPU DRAM. 641-653
- Jianping Zeng, Hongjune Kim, Jaejin Lee, Changhee Jung:
Turnpike: Lightweight Soft Error Resilience for In-Order Cores. 654-666
- Nursultan Kabylkas, Tommy Thorn, Shreesha Srinath, Polychronis Xekalakis, Jose Renau:
Effective Processor Verification with Logic Fuzzer Enhanced Co-simulation. 667-678
- Yao Hsiao, Dominic P. Mulligan, Nikos Nikoleris, Gustavo Petri, Caroline Trippel:
Synthesizing Formal Models of Hardware from RTL for Efficient Verification of Memory Model Implementations. 679-694
Session 6B: GPGPU
- Jie Zhang, Myoungsoo Jung:
Ohm-GPU: Integrating New Optical Network and Heterogeneous Memory into GPU Multi-Processors. 695-708
- Lufei Liu, Wesley Chang, Francois Demoullin, Yuan-Hsi Chou, Mohammadreza Saed, David Pankratz, Tyler Nowicki, Tor M. Aamodt:
Intersection Prediction for Accelerated GPU Ray Tracing. 709-723
- Cesar Avalos Baddouh, Mahmoud Khairy, Roland N. Green, Mathias Payer, Timothy G. Rogers:
Principal Kernel Analysis: A Tractable Methodology to Simulate Scaled GPU Workloads. 724-737
- Vijay Kandiah, Scott Peverelle, Mahmoud Khairy, Junrui Pan, Amogh Manjunath, Timothy G. Rogers, Tor M. Aamodt, Nikos Hardavellas:
AccelWattch: A Power Modeling Framework for Modern GPUs. 738-753
- Blaise Tine, Krishna Praveen Yalamarthy, Fares Elsabbagh, Hyesoon Kim:
Vortex: Extending the RISC-V ISA for GPGPU and 3D-Graphics. 754-766
Session 7A: Microarchitecture I
- Stijn Eyerman, Wim Heirman, Sam Van den Steen, Ibrahim Hur:
Enabling Branch-Mispredict Level Parallelism by Selectively Flushing Instructions. 767-778
- Niranjan K. Soundararajan, Peter Braun, Tanvir Ahmed Khan, Baris Kasikci, Heiner Litz, Sreenivas Subramoney:
PDede: Partitioned, Deduplicated, Delta Branch Target Buffer. 779-791
- Arthur Perais:
Leveraging Targeted Value Prediction to Unlock New Hardware Strength Reduction Potential. 792-803
- Stephen Pruett, Yale N. Patt:
Branch Runahead: An Alternative to Branch Prediction for Impossible to Predict Branches. 804-815
- Tanvir Ahmed Khan, Nathan Brown, Akshitha Sriraman, Niranjan K. Soundararajan, Rakesh Kumar, Joseph Devietti, Sreenivas Subramoney, Gilles A. Pokam, Heiner Litz, Baris Kasikci:
Twig: Profile-Guided BTB Prefetching for Data Center Applications. 816-829
Session 7B: Accelerators III
- Thierry Tambe, Coleman Hooper, Lillian Pentecost, Tianyu Jia, En-Yu Yang, Marco Donato, Victor Sanh, Paul N. Whatmough, Alexander M. Rush, David Brooks, Gu-Yeon Wei:
EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference. 830-844
- Yaoyu Tao, Zhengya Zhang:
HiMA: A Fast and Scalable History-based Memory Access Engine for Differentiable Neural Computer. 845-856
- Omar Mohamed Awad, Mostafa Mahmoud, Isak Edo, Ali Hadi Zadeh, Ciaran Bannon, Anand Jayarajan, Gennady Pekhimenko, Andreas Moshovos:
FPRaker: A Processing Element For Accelerating Neural Network Training. 857-869
- Udit Gupta, Samuel Hsia, Jeff Zhang, Mark Wilkening, Javin Pombra, Hsien-Hsin Sean Lee, Gu-Yeon Wei, Carole-Jean Wu, David Brooks:
RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance. 870-884
- Qiyu Wan, Haojun Xia, Xingyao Zhang, Lening Wang, Shuaiwen Leon Song, Xin Fu:
Shift-BNN: Highly-Efficient Probabilistic Bayesian Neural Network Training via Memory-Friendly Pattern Retrieving. 885-897
Session 8A: Superconducting & Quantum
- Mengyu Zhang, Lei Xie, Zhenxing Zhang, Qiaonian Yu, Guanglei Xi, Hualiang Zhang, Fuming Liu, Yarui Zheng, Yicong Zheng, Shengyu Zhang:
Exploiting Different Levels of Parallelism in the Quantum Control Microarchitecture for Superconducting Qubits. 898-911
- Farzaneh Zokaee, Lei Jiang:
SMART: A Heterogeneous Scratchpad Memory Architecture for Superconductor SFQ-based Systolic CNN Accelerators. 912-924
- Fei Hua, Yan-Hao Chen, Yuwei Jin, Chi Zhang, Ari B. Hayes, Youtao Zhang, Eddy Z. Zhang:
AutoBraid: A Framework for Enabling Efficient Surface Code Communication in Quantum Computing. 925-936
- Poulami Das, Swamit S. Tannu, Moinuddin K. Qureshi:
JigSaw: Boosting Fidelity of NISQ Programs via Measurement Subsetting. 937-949
- Poulami Das, Swamit S. Tannu, Siddharth Dangwal, Moinuddin K. Qureshi:
ADAPT: Mitigating Idling Errors in Qubits via Adaptive Dynamical Decoupling. 950-962
Session 8B: Sparse Processing
- Hang Lu, Liang Chang, Chenglong Li, Zixuan Zhu, Shengjian Lu, Yanhuan Liu, Mingzhe Zhang:
Distilling Bit-level Sparsity Parallelism for General Purpose Deep Learning Acceleration. 963-976
- Liqiang Lu, Yicheng Jin, Hangrui Bi, Zizhang Luo, Peng Li, Tao Wang, Yun Liang:
Sanger: A Co-Design Framework for Enabling Sparse Attention using Reconfigurable Architecture. 977-991
- Shiyu Li, Edward Hanson, Xuehai Qian, Hai (Helen) Li, Yiran Chen:
ESCALATE: Boosting the Efficiency of Sparse CNN Accelerator with Kernel Decomposition. 992-1004
- Subhankar Pal, Aporva Amarnath, Siying Feng, Michael F. P. O'Boyle, Ronald G. Dreslinski, Christophe Dubach:
SparseAdapt: Runtime Control for Sparse Linear Algebra on a Reconfigurable Accelerator. 1005-1021
- Alexander Rucker, Matthew Vilim, Tian Zhao, Yaqi Zhang, Raghu Prabhakar, Kunle Olukotun:
Capstan: A Vector RDA for Sparsity. 1022-1035
Session 9A: Graph Processing
- Abanti Basak, Zheng Qu, Jilan Lin, Alaa R. Alameldeen, Zeshan Chishti, Yufei Ding, Yuan Xie:
Improving Streaming Graph Processing Performance using Input Knowledge. 1036-1050
- Tong Geng, Chunshu Wu, Yongan Zhang, Cheng Tan, Chenhao Xie, Haoran You, Martin C. Herbordt, Yingyan Lin, Ang Li:
I-GCN: A Graph Convolutional Network Accelerator with Runtime Locality Enhancement through Islandization. 1051-1063
- Quan M. Nguyen, Daniel Sánchez:
Fifer: Practical Acceleration of Irregular Applications on Reconfigurable Architectures. 1064-1077
- Jie-Fang Zhang, Zhengya Zhang:
Point-X: A Spatial-Locality-Aware Architecture for Energy-Efficient Graph-Based Point-Cloud Deep Learning. 1078-1090
- Shafiur Rahman, Mahbod Afarin, Nael B. Abu-Ghazaleh, Rajiv Gupta:
JetStream: Graph Analytics on Streaming Data with Event-Driven Hardware Accelerator. 1091-1105
Session 9B: Virtual Memory & Prefetching
- Venkat Sri Sai Ram, Ashish Panwar, Arkaprava Basu:
Trident: Harnessing Architectural Resources for All Page Sizes in x86 Processors. 1106-1120
- Rahul Bera, Konstantinos Kanellopoulos, Anant Nori, Taha Shahroodi, Sreenivas Subramoney, Onur Mutlu:
Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning. 1121-1137
- Georgios Vavouliotis, Lluc Alvarez, Boris Grot, Daniel A. Jiménez, Marc Casas:
Morrigan: A Composite Instruction TLB Prefetcher. 1138-1153
- Bingyao Li, Jieming Yin, Youtao Zhang, Xulong Tang:
Improving Address Translation in Multi-GPUs via Sharing and Spilling aware TLB Design. 1154-1168
- Jagadish B. Kotra, Michael LeBeane, Mahmut T. Kandemir, Gabriel H. Loh:
Increasing GPU Translation Reach by Leveraging Under-Utilized On-Chip Resources. 1169-1181
Session 10A: Security & Privacy III
- Lois Orosa, Abdullah Giray Yaglikçi, Haocong Luo, Ataberk Olgun, Jisung Park, Hasan Hassan, Minesh Patel, Jeremie S. Kim, Onur Mutlu:
A Deeper Look into RowHammer's Sensitivities: Experimental Analysis of Real DRAM Chips and Implications on Future Attacks and Defenses. 1182-1197
- Hasan Hassan, Yahya Can Tugrul, Jeremie S. Kim, Victor van der Veen, Kaveh Razavi, Onur Mutlu:
Uncovering In-DRAM RowHammer Protection Mechanisms: A New Methodology, Custom RowHammer Patterns, and Implications. 1198-1213
- Kazi Abu Zubair, Sudhanva Gurumurthi, Vilas Sridharan, Amro Awad:
Soteria: Towards Resilient Integrity-Protected and Encrypted Non-Volatile Memories. 1214-1226
- Alexander Freij, Huiyang Zhou, Yan Solihin:
Bonsai Merkle Forests: Efficiently Achieving Crash Consistency in Secure Persistent Memory. 1227-1240
- Xijing Han, James Tuck, Amro Awad:
Dolos: Improving the Performance of Persistent Applications in ADR-Supported Secure Memory. 1241-1253
Session 10B: Microarchitecture II
- Vasileios Tsoutsouras, Orestis Kaparounakis, Bilgesu Arif Bilgin, Chatura Samarakoon, James Timothy Meech, Jan Heck, Phillip Stanley-Marbell:
The Laplace Microarchitecture for Tracking Data Uncertainty and Its Implementation in a RISC-V Processor. 1254-1269
- Chanchal Kumar, Anirudh Seshadri, Aayush Chaudhary, Shubham Bhawalkar, Rohit Singh, Eric Rotenberg:
Post-Fabrication Microarchitecture. 1270-1281
- Yuanchao Xu, Mehmet Esat Belviranli, Xipeng Shen, Jeffrey S. Vetter:
PCCS: Processor-Centric Contention-aware Slowdown Model for Heterogeneous System-on-Chips. 1282-1295
- Josué Feliu, Alberto Ros, Manuel E. Acacio, Stefanos Kaxiras:
ITSLF: Inter-Thread Store-to-Load Forwarding in Simultaneous Multithreading. 1296-1308
- Liu Liu, Jilan Lin, Zheng Qu, Yufei Ding, Yuan Xie:
ENMC: Extreme Near-Memory Classification via Approximate Screening. 1309-1322