- Scaling parallel 3-D FFT with non-blocking MPI collectives. Sukhyun Song, Jeffrey K. Hollingsworth. 1-8
- Exploiting data representation for fault tolerance. James Elliott, Mark Hoemmen, Frank Mueller. 9-16
- VCube: a provably scalable distributed diagnosis algorithm. Elias Procópio Duarte Jr., Luis Carlos Erpen De Bona, Vinicius K. Ruoso. 17-22
- TX: algorithmic energy saving for distributed dense matrix factorizations. Li Tan, Zizhong Chen. 23-30
- CholeskyQR2: a simple and communication-avoiding algorithm for computing a tall-skinny QR factorization on a large-scale parallel system. Takeshi Fukaya, Yuji Nakatsukasa, Yuka Yanagisawa, Yusaku Yamamoto. 31-38
- Deflation strategies to improve the convergence of communication-avoiding GMRES. Ichitaro Yamazaki, Stanimire Tomov, Jack J. Dongarra. 39-46
- A framework for parallel genetic algorithms for distributed memory architectures. Dobromir Georgiev, Emanouil I. Atanassov, Vassil N. Alexandrov. 47-53
- The anatomy of Mr. Scan: a dissection of performance of an extreme scale GPU-based clustering algorithm. Benjamin Welton, Barton P. Miller. 54-60
- Performance and portability with OpenCL for throughput-oriented HPC workloads across accelerators, coprocessors, and multicore processors. Chongxiao Cao, Mark Gates, Azzam Haidar, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki, Jack J. Dongarra. 61-68
- A hierarchical tridiagonal system solver for heterogenous supercomputers. Xinliang Wang, Yangtong Xu, Wei Xue. 69-76