20th PPOPP 2015: San Francisco, CA, USA
- Albert Cohen, David Grove:
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015, San Francisco, CA, USA, February 7-11, 2015. ACM 2015, ISBN 978-1-4503-3205-7
Concurrency
- Vincent Gramoli:
More than you ever wanted to know about synchronization: synchrobench, measuring the impact of the synchronization on concurrent algorithms. 1-10
- Dan Alistarh, Justin Kopinsky, Jerry Li, Nir Shavit:
The SprayList: a scalable relaxed priority queue. 11-20
- Maya Arbel, Adam Morrison:
Predicate RCU: an RCU for scalable concurrent updates. 21-30
- Guy Golan-Gueta, G. Ramalingam, Mooly Sagiv, Eran Yahav:
Automatic scalable atomicity via semantic locking. 31-41
Code Generation
- Austin R. Benson, Grey Ballard:
A framework for practical parallel fast matrix multiplication. 42-53
- Aravind Acharya, Uday Bondhugula:
PLUTO+: near-complete modeling of affine transformations for parallelism and locality. 54-64
- Mahesh Ravishankar, Roshan Dathathri, Venmugil Elango, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, P. Sadayappan:
Distributed memory code generation for mixed Irregular/Regular computations. 65-75
Transactional Memory
- Lingxiang Xiang, Michael L. Scott:
Software partitioning of hardware transactions. 76-86
- Alexandro Baldassin, Edson Borin, Guido Araujo:
Performance implications of dynamic memory allocators on transactional memory systems. 87-96
- Minjia Zhang, Jipeng Huang, Man Cao, Michael D. Bond:
Low-overhead software transactional memory with progress guarantees and strong semantics. 97-108
Large Scale Parallelism
- Milind Chabbi, Wim Lavrijsen, Wibe de Jong, Koushik Sen, John M. Mellor-Crummey, Costin Iancu:
Barrier elision for production parallel programs. 109-119
- Loïc Thébault, Eric Petit, Quang Dinh:
Scalable and efficient implementation of 3d unstructured meshes computation: a case study on matrix assembly. 120-129
- Nathan R. Tallent, Abhinav Vishnu, Hubertus Van Dam, Jeff Daily, Darren J. Kerbyson, Adolfy Hoisie:
Diagnosing the causes and severity of one-sided message contention. 130-139
Verification and Accelerators
- Yen-Jung Chang, Vijay K. Garg:
A parallel algorithm for global states enumeration in concurrent systems. 140-149
- Tiago Cogumbreiro, Raymond Hu, Francisco Martins, Nobuko Yoshida:
Dynamic deadlock verification for general barrier synchronisation. 150-160
- Yi-Ping You, Hen-Jung Wu, Yeh-Ning Tsai, Yen-Ting Chao:
VirtCL: a framework for OpenCL device abstraction and management. 161-172
- Arash Ashari, Shirish Tatikonda, Matthias Boehm, Berthold Reinwald, Keith Campbell, John Keenleyside, P. Sadayappan:
On optimizing machine learning workloads via kernel fusion. 173-182
Algorithms
- Kaiyuan Zhang, Rong Chen, Haibo Chen:
NUMA-aware graph-structured analytics. 183-193
- Chenning Xie, Rong Chen, Haibing Guan, Binyu Zang, Haibo Chen:
SYNC or ASYNC: time to fuse for distributed graph-parallel computation. 194-204
- Yuan Tang, Ronghui You, Haibin Kan, Jesmin Jahan Tithi, Pramod Ganapathi, Rezaul Alam Chowdhury:
Cache-oblivious wavefront: improving parallelism of recursive dynamic programming algorithms without losing cache-efficiency. 205-214
Locking and Locality
- Milind Chabbi, Michael W. Fagan, John M. Mellor-Crummey:
High performance locks for multi-level NUMA systems. 215-226
- Zoltan Majó, Thomas R. Gross:
A library for portable and composable data locality optimizations for NUMA systems. 227-238
- Abdelhalim Amer, Huiwei Lu, Yanjie Wei, Pavan Balaji, Satoshi Matsuoka:
MPI+Threads: runtime contention and remedies. 239-248
Poster Abstracts
- Andrew J. McPherson, Vijay Nagarajan, Susmit Sarkar, Marcelo Cintra:
Fence placement for legacy data-race-free programs via synchronization read detection. 249-250
- Xianglan Piao, Channoh Kim, Younghwan Oh, Huiying Li, Jincheon Kim, Hanjun Kim, Jae W. Lee:
JAWS: a JavaScript framework for adaptive CPU-GPU work sharing. 251-252
- Hyunseok Seo, Jinwook Kim, Min-Soo Kim:
GStream: a graph streaming processing method for large-scale graphs on GPUs. 253-254
- Nabeel AlSaber, Milind Kulkarni:
SemCache++: semantics-aware caching for efficient multi-GPU offloading. 255-256
- Jungwon Kim, Seyong Lee, Jeffrey S. Vetter:
An OpenACC-based unified programming model for multi-accelerator systems. 257-258
- Paul Thomson, Alastair F. Donaldson:
The lazy happens-before relation: better partial-order reduction for systematic concurrency testing. 259-260
- Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Towards batched linear solvers on accelerated hardware platforms. 261-262
- Saurav Muralidharan, Michael Garland, Bryan Catanzaro, Albert Sidelnik, Mary W. Hall:
A collection-oriented programming model for performance portability. 263-264
- Yangzihao Wang, Andrew A. Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, John D. Owens:
Gunrock: a high-performance graph processing library on the GPU. 265-266
- Olga Pearce, Todd Gamblin, Bronis R. de Supinski, Martin Schulz, Nancy M. Amato:
Decoupled load balancing. 267-268
- Ye Jin, Mingliang Liu, Xiaosong Ma, Qing Liu, Jeremy Logan, Norbert Podhorszki, Jong Youl Choi, Scott Klasky:
Combining phase identification and statistic modeling for automated parallel benchmark generation. 269-270
- Xuanhua Shi, Junling Liang, Sheng Di, Bingsheng He, Hai Jin, Lu Lu, Zhixiang Wang, Xuan Luo, Jianlong Zhong:
Optimization of asynchronous graph processing on GPU with hybrid coloring model. 271-272
- Scott West, Sebastian Nanz, Bertrand Meyer:
Efficient and reasonable object-oriented concurrency. 273-274
- Vassilis Vassiliadis, Konstantinos Parasyris, Charalambos Chalios, Christos D. Antonopoulos, Spyros Lalis, Nikolaos Bellas, Hans Vandierendonck, Dimitrios S. Nikolopoulos:
A programming model and runtime system for significance-aware energy-efficient computing. 275-276
- Martin Wimmer, Jakob Gruber, Jesper Larsson Träff, Philippas Tsigas:
The lock-free k-LSM relaxed priority queue. 277-278
- Emmanuelle Saillard, Patrick Carribault, Denis Barthou:
Static/Dynamic validation of MPI collective communications in multi-threaded context. 279-280
- Arunmoezhi Ramachandran, Neeraj Mittal:
CASTLE: fast concurrent internal binary search tree using edge-based locking. 281-282
- Madan Mohan Das, Gabriel Southern, Jose Renau:
Section based program analysis to reduce overhead of detecting unsynchronized thread communication. 283-284
- Harshvardhan, Nancy M. Amato, Lawrence Rauchwerger:
A hierarchical approach to reducing communication in parallel graph algorithms. 285-286
- Yifeng Chen, Xiang Cui, Hong Mei:
Tiles: a new language mechanism for heterogeneous parallelism. 287-288
- Cosmin Radoi, Stephan Herhut, Jaswanth Sreeram, Danny Dig:
Are web applications ready for parallelism? 289-290