33rd IPDPS 2019: Rio de Janeiro, Brazil
- 2019 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2019, Rio de Janeiro, Brazil, May 20-24, 2019. IEEE 2019, ISBN 978-1-7281-1246-6
Keynote 1
- Ian T. Foster:
Coding the Continuum. 1
Session 1: Graph Algorithms
- Ariful Azad, Aydin Buluç:
LACC: A Linear-Algebraic Algorithm for Finding Connected Components in Distributed Memory. 2-12
- Monika Henzinger, Alexander Noe, Christian Schulz:
Shared-Memory Exact Minimum Cuts. 13-22
- Udit Agarwal, Vijaya Ramachandran:
Distributed Weighted All Pairs Shortest Paths Through Pipelining. 23-32
- Philipp Bamberger, Fabian Kuhn, Yannic Maus:
Local Distributed Algorithms in Highly Dynamic Networks. 33-42
Session 2: HPC Systems
- Alvaro Frank, Tim Süß, André Brinkmann:
Effects and Benefits of Node Sharing Strategies in HPC Batch Systems. 43-53
- Constantino Gómez, Francesc Martínez, Adrià Armejach, Miquel Moretó, Filippo Mantovani, Marc Casas:
Design Space Exploration of Next-Generation HPC Machines. 54-65
- Tal Ben-Nun, Maciej Besta, Simon Huber, Alexandros Nikolaos Ziogas, Daniel Peter, Torsten Hoefler:
A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning. 66-77
- Jens Domke, Kazuaki Matsumura, Mohamed Wahib, Haoyu Zhang, Keita Yashima, Toshiki Tsuchikawa, Yohei Tsuji, Artur Podobas, Satoshi Matsuoka:
Double-Precision FPUs in High-Performance Computing: An Embarrassment of Riches? 78-88
Session 3: Numerical Algorithms
- Edward Hutter, Edgar Solomonik:
Communication-Avoiding Cholesky-QR2 for Rectangular Matrices. 89-100
- Jordi Wolfson-Pou, Edmond Chow:
Asynchronous Multigrid Methods. 101-110
- Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra:
Fast Batched Matrix Multiplication for Small Sizes Using Half-Precision Arithmetic on GPUs. 111-122
- Israt Nisa, Jiajia Li, Aravind Sukumaran-Rajam, Richard W. Vuduc, P. Sadayappan:
Load-Balanced Sparse MTTKRP on GPUs. 123-133
Session 4: Scheduling and Load Balancing I
- Kunal Agrawal, I-Ting Angelina Lee, Jing Li, Kefu Lu, Benjamin Moseley:
Practically Efficient Scheduler for Minimizing Average Flow Time of Parallel Jobs. 134-144
- Klaus Jansen, Marten Maack, Alexander Mäcker:
Scheduling on (Un-)Related Machines with Setup Times. 145-154
- M. Yusuf Özkaya, Anne Benoit, Bora Uçar, Julien Herrmann, Ümit V. Çatalyürek:
A Scalable Clustering-Based Task Scheduler for Homogeneous Processors Using DAG Partitioning. 155-165
- Guillaume Aupy, Ana Gainaru, Valentin Honoré, Padma Raghavan, Yves Robert, Hongyang Sun:
Reservation Strategies for Stochastic Jobs. 166-175
Session 5: Accelerating Neural Networks
- Bruno R. C. Magalhães, Thomas Sterling, Felix Schürmann, Michael L. Hines:
Exploiting Flow Graph of System of ODEs to Accelerate the Simulation of Biologically-Detailed Neural Networks. 176-187
- Jiawen Liu, Dong Li, Gokcen Kestor, Jeffrey S. Vetter:
Runtime Concurrency Control and Operation Scheduling for High Performance Neural Network Training. 188-199
- Shriram S. B, Anshuj Garg, Purushottam Kulkarni:
Dynamic Memory Management for GPU-Based Training of Deep Neural Networks. 200-209
- Nikoli Dryden, Naoya Maruyama, Tom Benson, Tim Moon, Marc Snir, Brian Van Essen:
Improving Strong-Scaling of CNN Training by Exploiting Finer-Grained Parallelism. 210-220
Session 6: GPU Computing I
- Pengyu Wang, Lu Zhang, Chao Li, Minyi Guo:
Excavating the Potential of GPU for Accelerating Graph Traversal. 221-230
- Hartwig Anzt, Tobias Ribizel, Goran Flegar, Edmond Chow, Jack J. Dongarra:
ParILUT - A Parallel Threshold ILU for GPUs. 231-241
- Jie Zhang, Xiaoyi Lu, Ching-Hsiang Chu, Dhabaleswar K. Panda:
C-GDR: High-Performance Container-Aware GPUDirect MPI Communication Schemes on RDMA Networks. 242-251
- Tyler N. Allen, Xizhou Feng, Rong Ge:
Slate: Enabling Workload-Aware Efficient Multiprocessing for Modern GPGPUs. 252-261
Session 7: Learning and Prediction Systems
- Jielong Xu, Jian Tang, Zhiyuan Xu, Chengxiang Yin, Kevin A. Kwiat, Charles A. Kamhoua:
A Deep Recurrent Neural Network Based Predictive Control Framework for Reliable Distributed Stream Data Processing. 262-272
- Adrian Colaso, Pablo Prieto, Pablo Abad Fidalgo, José-Ángel Gregorio, Valentin Puente:
Architecting Racetrack Memory Preshift through Pattern-Based Prediction Mechanisms. 273-282
- Ryan Chard, Zhuozhao Li, Kyle Chard, Logan T. Ward, Yadu N. Babuji, Anna Woodard, Steven Tuecke, Ben Blaiszik, Michael J. Franklin, Ian T. Foster:
DLHub: Model and Data Serving for Science. 283-292
- Huizhang Luo, Dan Huang, Qing Liu, Zhenbo Qiao, Hong Jiang, Jing Bi, Haitao Yuan, Mengchu Zhou, Jinzhen Wang, Zhenlu Qin:
Identifying Latent Reduced Models to Precondition Lossy Compression. 293-302
Session 8: Multicore Computing
- Mehrzad Nejat, Miquel Pericàs, Per Stenström:
QoS-Driven Coordinated Management of Resources to Save Energy in Multi-core Systems. 303-313
- Md. Vasimuddin, Sanchit Misra, Heng Li, Srinivas Aluru:
Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems. 314-324
- Stephanie Labasan, Matthew Larsen, Hank Childs, Barry Rountree:
Power and Performance Tradeoffs for Visualization Algorithms. 325-334
- Shuai Che, Jieming Yin:
Northup: Divide-and-Conquer Programming in Systems with Heterogeneous Memories and Processors. 335-344
Plenary Session: Best Papers
- T.-H. Hubert Chan, Mauro Sozio, Bintao Sun:
Distributed Approximate k-Core Decomposition and Min-Max Edge Orientation: Breaking the Diameter Barrier. 345-354
- Jahanzeb Maqbool Hashmi, Sourav Chakraborty, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. Panda:
FALCON: Efficient Designs for Zero-Copy MPI Datatype Processing on Emerging Architectures. 355-364
- Pankaj Khanchandani, Roger Wattenhofer:
Two Elementary Instructions Make Compare-and-Swap. 365-374
- James Gentry, Chavit Denninnart, Mohsen Amini Salehi:
Robust Dynamic Resource Allocation via Probabilistic Task Pruning in Heterogeneous Computing Systems. 375-384
Keynote 2
- Lawrence Rauchwerger:
Two Roads to Parallelism: From Serial Code to Programming with STAPL. 385
Session 9: Cloud Computing
- Zhichao Yan, Hong Jiang, Yujuan Tan, Stan Skelton, Hao Luo:
Z-Dedup: A Case for Deduplicating Compressed Contents in Cloud. 386-395
- Petar Kochovski, Rizos Sakellariou, Marko Bajec, Pavel D. Drobintsev, Vlado Stankovski:
An Architecture and Stochastic Method for Database Container Placement in the Edge-Fog-Cloud Continuum. 396-405
- Nikos Tziritas, Thanasis Loukopoulos, Samee Khan, Cheng-Zhong Xu, Albert Y. Zomaya:
Online Live VM Migration Algorithms to Minimize Total Migration Time and Downtime. 406-417
- Nishant Saurabh, Julian Remmers, Dragi Kimovski, Radu Prodan, Jorge G. Barbosa:
Semantics-Aware Virtual Machine Image Management in IaaS Clouds. 418-427
Session 10: Graph Algorithms II
- Yongzhe Zhang, Zhenjiang Hu:
Composing Optimization Techniques for Vertex-Centric Graph Processing via Communication Channels. 428-438
- Loc Hoang, Roshan Dathathri, Gurbinder Gill, Keshav Pingali:
CuSP: A Customizable Streaming Edge Partitioner for Distributed Graph Analytics. 439-450
- Chirag Jain, Sanchit Misra, Haowen Zhang, Alexander T. Dilthey, Srinivas Aluru:
Accelerating Sequence Alignment to Graphs. 451-461
- Hanqing Zeng, Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, Viktor K. Prasanna:
Accurate, Efficient and Scalable Graph Embedding. 462-471
Session 11: Linear Algebra
- Ichitaro Yamazaki, Zhaojun Bai, Ding Lu, Jack J. Dongarra:
Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation. 472-481
- Roy Nissim, Oded Schwartz:
Revisiting the I/O-Complexity of Fast Matrix Multiplication with Recomputations. 482-490
- Elad Weiss, Oded Schwartz:
Computation of Matrix Chain Products on Parallel Machines. 491-500
- Hua Huang, Edmond Chow:
Overlapping Communications with Other Communications and Its Application to Distributed Dense Matrix Computations. 501-510
Session 12: Storage Systems
- Woong Shin, Christopher Brumgard, Bing Xie, Sudharshan S. Vazhkudai, Devarshi Ghoshal, Sarp Oral, Lavanya Ramakrishnan:
Data Jockey: Automatic Data Management for HPC Multi-tiered Storage Systems. 511-522
- Hao Fan, Song Wu, Shadi Ibrahim, Ximing Chen, Hai Jin, Jiang Xiao, Haibing Guan:
NCQ-Aware I/O Scheduling for Conventional Solid State Drives. 523-532
- Junqing Gu, Chentao Wu, Xin Xie, Han Qiu, Jie Li, Minyi Guo, Xubin He, Yuanyuan Dong, Yafei Zhao:
Optimizing the Parity Check Matrix for Efficient Decoding of RS-Based Cloud Storage Systems. 533-544
- Zhipeng Li, Min Lv, Yinlong Xu, Yongkun Li, Liangliang Xu:
D3: Deterministic Data Distribution for Efficient Data Reconstruction in Erasure-Coded Distributed Storage Systems. 545-556
Session 13: Applications I
- Zhao Liu, Xuesen Chu, Xiaojing Lv, Hongsong Meng, Shupeng Shi, Wenji Han, Jingheng Xu, Haohuan Fu, Guangwen Yang:
SunwayLB: Enabling Extreme-Scale Lattice Boltzmann Method Based Computing Fluid Dynamics Simulations on Sunway TaihuLight. 557-566
- Oleksandr Rudyy, Marta Garcia-Gasulla, Filippo Mantovani, Alfonso Santiago, Raül Sirvent, Mariano Vázquez:
Containers in HPC: A Scalability and Portability Study in Production Biological Simulations. 567-577
- Priyanka Ghosh, Sriram Krishnamoorthy, Ananth Kalyanaraman:
PaKman: Scalable Assembly of Large Genomes on Distributed Memory Machines. 578-589
- Md. Mostofa Ali Patwary, Milind Chabbi, Heewoo Jun, Jiaji Huang, Greg Diamos, Kenneth Church:
Language Modeling at Scale. 590-599
Session 14: File Systems
- Simbarashe Dzinamarira, Florin Dinu, T. S. Eugene Ng:
DYRS: Bandwidth-Aware Disk-to-Memory Migration of Cold Data in Big-Data File Systems. 600-609
- Bharti Wadhwa, Arnab Kumar Paul, Sarah Neuwirth, Feiyi Wang, Sarp Oral, Ali Raza Butt, Jon Bernard, Kirk W. Cameron:
iez: Resource Contention Aware Load Balancing for Large-Scale Parallel File Systems. 610-620
- Salvatore Di Girolamo, Pirmin Schmid, Thomas C. Schulthess, Torsten Hoefler:
SimFS: A Simulation Data Virtualizing File System Interface. 621-630
- Guillaume Aupy, Olivier Beaumont, Lionel Eyraud-Dubois:
Sizing and Partitioning Strategies for Burst-Buffers to Reduce IO Contention. 631-640
Session 15: GPU Computing II
- Prashant Singh Rawat, Miheer Vaidya, Aravind Sukumaran-Rajam, Atanas Rountev, Louis-Noël Pouchet, P. Sadayappan:
On Optimizing Complex Stencils on GPUs. 641-652
- Wenyi Zhao, Quan Chen, Hao Lin, Jianfeng Zhang, Jingwen Leng, Chao Li, Wenli Zheng, Li Li, Minyi Guo:
Themis: Predicting and Reining in Application-Level Slowdown on Spatial Multitasking GPUs. 653-663
- Mohammad Khavari Tavana, Yifan Sun, Nicolas Bohm Agostini, David R. Kaeli:
Exploiting Adaptive Data Compression to Improve Performance and Energy-Efficiency of Compute Workloads in Multi-GPU Systems. 664-674
- Kyung Hoon Kim, Priyank Devpura, Abhishek Nayyar, Andrew Doolittle, Ki Hwan Yum, Eun Jung Kim:
Dual Pattern Compression Using Data-Preprocessing for Large-Scale GPU Architectures. 675-685
Session 16: Scheduling and Load Balancing II
- Arnaud Legrand, Denis Trystram, Salah Zrigui:
Adapting Batch Scheduling to Workload Characteristics: What Can We Expect From Online Learning? 686-695
- Heng Wu, Wenbo Zhang, Yuanjia Xu, Hao Xiang, Tao Huang, Haiyang Ding, Zheng Zhang:
Aladdin: Optimized Maximum Flow Management for Shared Production Clusters. 696-707
- Pawel Garncarek, Tomasz Jurdzinski, Dariusz R. Kowalski, Miguel A. Mosteiro:
mmWave Wireless Backhaul Scheduling of Stochastic Packet Arrivals. 708-717
- Petra Berenbrink, Tom Friedetzky, Dominik Kaaser, Peter Kling:
Tight & Simple Load Balancing. 718-726
Keynote 3
- Luiz DeRose:
The Path to Delivering Programable Exascale Systems. 727
Session 17: Managing Data
- Philip Dexter, Kenneth Chiu, Bedri Sendir:
An Error-Reflective Consistency Model for Distributed Data Stores. 728-737
- Jason Arnold, Boris Glavic, Ioan Raicu:
A High-Performance Distributed Relational Database System for Scalable OLAP Processing. 738-748
- Ondrej Meca, Lubomír Ríha, Tomás Brzobohatý:
An Approach for Parallel Loading and Pre-Processing of Unstructured Meshes Stored in Spatially Scattered Fashion. 749-760
Session 18: Message Passing
- Sayan Ghosh, Mahantesh Halappanavar, Ananth Kalyanaraman, Arif Khan, Assefaw H. Gebremedhin:
Exploring MPI Communication Models for Graph Applications Using Graph Matching as a Case Study. 761-770
- Zhiqiang Zuo, Rong Gu, Xi Jiang, Zhaokang Wang, Yihua Huang, Linzhang Wang, Xuandong Li:
BigSpa: An Efficient Interprocedural Static Analysis Engine in the Cloud. 771-780
- S. Mahdieh Ghazimirsaeed, Seyed Hessam Mirsadeghi, Ahmad Afsahi:
An Efficient Collaborative Communication Mechanism for MPI Neighborhood Collectives. 781-792
Session 19: Managing Power and Energy
- Srinivasan Ramesh, Swann Perarnau, Sridutt Bhalachandra, Allen D. Malony, Peter H. Beckman:
Understanding the Impact of Dynamic Power Capping on Application Progress. 793-804
- Mohak Chadha, Michael Gerndt:
Modelling DVFS and UFS for Region-Based Energy Aware Tuning of HPC Applications. 805-814
- Wenli Zheng, Xiaorui Wang, Yue Ma, Chao Li, Hao Lin, Bin Yao, Jianfeng Zhang, Minyi Guo:
SprintCon: Controllable and Efficient Computational Sprinting for Data Center Servers. 815-824
- Mathieu Bacou, Grégoire Todeschi, Alain Tchana, Daniel Hagimont, Baptiste Lepers, Willy Zwaenepoel:
Drowsy-DC: Data Center Power Management System. 825-834
Session 20: Networks
- Dongxiao Yu, Yifei Zou, Yong Zhang, Feng Li, Jiguo Yu, Yu Wu, Xiuzhen Cheng, Francis C. M. Lau:
Distributed Dominating Set and Connected Dominating Set Construction Under the Dynamic SINR Model. 835-844
- Linghui Luo, Christian Scheideler, Thim Strothmann:
MULTISKIPGRAPH: A Self-Stabilizing Overlay Network that Maintains Monotonic Searchability. 845-854
- Soumyottam Chatterjee, Gopal Pandurangan, Peter Robinson:
Network Size Estimation in Small-World Networks Under Byzantine Faults. 855-865
- Corentin Hardy, Erwan Le Merrer, Bruno Sericola:
MD-GAN: Multi-Discriminator Generative Adversarial Networks for Distributed Datasets. 866-877
Session 21: Dealing with Faults
- Luanzheng Guo, Dong Li:
MOARD: Modeling Application Resilience to Transient Faults on Data Objects. 878-889
- Giorgis Georgakoudis, Ignacio Laguna, Hans Vandierendonck, Dimitrios S. Nikolopoulos, Martin Schulz:
SAFIRE: Scalable and Accurate Fault Injection for Parallel Multithreaded Applications. 890-899
- Zaeem Hussain, Taieb Znati, Rami G. Melhem:
Optimal Placement of In-memory Checkpoints Under Heterogeneous Failure Likelihoods. 900-910
- Bogdan Nicolae, Adam Moody, Elsa Gonsiorowski, Kathryn M. Mohror, Franck Cappello:
VeloC: Towards High Performance Adaptive Asynchronous Checkpointing at Large Scale. 911-920
Session 22: Optimizing Memory Behavior
- Wen Pan, Tao Xie, Xiaojia Song:
HART: A Concurrent Hash-Assisted Radix Tree for DRAM-PM Hybrid Memory Systems. 921-931
- Evangelos Vasilakis, Vassilis Papaefstathiou, Pedro Trancoso, Ioannis Sourdis:
LLC-Guided Data Migration in Hybrid Memory Systems. 932-942
- Matthias Hauck, Marcus Paradies, Holger Fröning:
Software-Based Buffering of Associative Operations on Random Memory Addresses. 943-952
- Gongjin Sun, Junjie Shen, Alexander V. Veidenbaum:
Combining Prefetch Control and Cache Partitioning to Improve Multicore Performance. 953-962
Session 23: Programming Languages
- John Bachan, Scott B. Baden, Steven A. Hofmeyr, Mathias Jacquelin, Amir Kamil, Dan Bonachea, Paul H. Hargrove, Hadia Ahmed:
UPC++: A High-Performance Communication Framework for Asynchronous Computation. 963-973
- Tsung-Wei Huang, Chun-Xun Lin, Guannan Guo, Martin D. F. Wong:
Cpp-Taskflow: Fast Task-Based Parallel Programming Using Modern C++. 974-983
- Laleh Aghababaie Beni, Saikiran Ramanan, Aparna Chandramowlishwaran:
Portal: A High-Performance Language and Compiler for Parallel N-Body Problems. 984-995
- Thomas Macht, Clemens Grelck:
SAC Goes Cluster: Fully Implicit Distributed Computing. 996-1006
Session 24: Accelerating Graph Processing
- Scott Sallinen, Roger Pearce, Matei Ripeanu:
Incremental Graph Processing for On-line Analytics. 1007-1018
- Timothy A. K. Zakian, Ludovic Anthony Richard Capelli, Zhenjiang Hu:
Incrementalization of Vertex-Centric Programs. 1019-1029
- Wole Jaiyeoba, Kevin Skadron:
GraphTinker: A High Performance Data Structure for Dynamic Graph Processing. 1030-1041
Session 25: Applications II
- Shunjie Zhou, Fan Zhang, Hanhua Chen, Hai Jin, Bing Bing Zhou:
FastJoin: A Skewness-Aware Distributed Stream Join System. 1042-1052
- Philipp Habermann, Chi Ching Chi, Mauricio Alvarez-Mesa, Ben H. H. Juurlink:
A Bin-Based Bitstream Partitioning Approach for Parallel CABAC Decoding in Next Generation Video Coding. 1053-1062
- Yujing Ma, Florin Rusu, Martin Torres:
Stochastic Gradient Descent on Modern Hardware: Multi-core CPU or GPU? Synchronous or Asynchronous? 1063-1072
Session 26: Security and Reliability
- Thorsten Götte, Vipin Ravindran Vijayalakshmi, Christian Scheideler:
Always be Two Steps Ahead of Your Enemy. 1073-1082
- Diksha Gupta, Jared Saia, Maxwell Young:
Peace Through Superior Puzzling: An Asymmetric Sybil Defense. 1083-1094
- Swarnendu Biswas, Rui Zhang, Michael D. Bond, Brandon Lucia:
Rethinking Support for Region Conflict Exceptions. 1095-1106