- Redesigning CAM-SE for peta-scale climate modeling performance and ultra-high resolution on Sunway TaihuLight. Haohuan Fu, Junfeng Liao, Nan Ding, Xiaohui Duan, Lin Gan, Yishuang Liang, Xinliang Wang, Jinzhe Yang, Yan Zheng, Weiguo Liu, Lanning Wang, Guangwen Yang.
- 18.9-Pflops nonlinear earthquake simulation on Sunway TaihuLight: enabling depiction of 18-Hz and 8-meter scenarios. Haohuan Fu, Conghui He, Bingwei Chen, Zekun Yin, Zhenguo Zhang, Wenqiang Zhang, Tingjian Zhang, Wei Xue, Weiguo Liu, Wanwang Yin, Guangwen Yang, Xiaofei Chen.
- Massively parallel 3D image reconstruction. Xiao Wang, Amit Sabne, Putt Sakdhnagool, Sherman J. Kisner, Charles A. Bouman, Samuel P. Midkiff.
- LocoFS: a loosely-coupled metadata service for distributed file systems. Siyang Li, Youyou Lu, Jiwu Shu, Yang Hu, Tao Li.
- Tagit: an integrated indexing and search service for file systems. Hyogi Sim, Youngjae Kim, Sudharshan S. Vazhkudai, Geoffroy R. Vallée, Seung-Hwan Lim, Ali Raza Butt.
- A configurable rule based classful token bucket filter network request scheduler for the Lustre file system. Yingjin Qian, Xi Li, Shuichi Ihara, Lingfang Zeng, Jürgen Kaiser, Tim Süß, André Brinkmann.
- Deep learning at 15PF: supervised and semi-supervised classification for scientific data. Thorsten Kurth, Jian Zhang, Nadathur Satish, Evan Racah, Ioannis Mitliagkas, Md. Mostofa Ali Patwary, Tareq M. Malas, Narayanan Sundaram, Wahid Bhimji, Mikhail Smorkalov, Jack Deslippe, Mikhail Shiryaev, Srinivas Sridharan, Prabhat, Pradeep Dubey.
- Understanding error propagation in deep learning neural network (DNN) accelerators and applications. Guanpeng Li, Siva Kumar Sastry Hari, Michael Sullivan, Timothy Tsai, Karthik Pattabiraman, Joel S. Emer, Stephen W. Keckler.
- Scaling deep learning on GPU and Knights Landing clusters. Yang You, Aydin Buluç, James Demmel.
- Egeria: a framework for automatic synthesis of HPC advising tools through multi-layered natural language processing. Hui Guan, Xipeng Shen, Hamid Krim.
- DataRaceBench: a benchmark suite for systematic evaluation of data race detection tools. Chunhua Liao, Pei-Hung Lin, Joshua Asplund, Markus Schordan, Ian Karlin.
- Optimizing the query performance of block index through data analysis and I/O modeling. Tzu-Hsien Wu, Jerry Chi-Yuan Chou, Shyng Hao, Bin Dong, Scott Klasky, Kesheng Wu.
- Sympiler: transforming sparse matrix codes by decoupling symbolic analysis. Kazem Cheshmi, Shoaib Kamil, Michelle Mills Strout, Maryam Mehri Dehnavi.
- Control replication: compiling implicit parallelism to efficient SPMD with logical regions. Elliott Slaughter, Wonchan Lee, Sean Treichler, Wen Zhang, Michael Bauer, Galen M. Shipman, Patrick S. McCormick, Alex Aiken.
- Optimizing geometric multigrid method computation using a DSL approach. Vinay Vasista, Kumudha Narasimhan, Siddharth Bhat, Uday Bondhugula.
- Efficient process mapping in geo-distributed cloud data centers. Amelie Chi Zhou, Yifan Gong, Bingsheng He, Jidong Zhai.
- Topology-aware GPU scheduling for learning workloads in cloud environments. Marcelo Amaral, Jordà Polo, David Carrera, Seetharami R. Seelam, Malgorzata Steinder.
- Probabilistic guarantees of execution duration for Amazon spot instances. Rich Wolski, John Brevik, Ryan Chard, Kyle Chard.
- A framework for scalable biophysics-based image analysis. Amir Gholami, Andreas Mang, Klaudius Scheufele, Christos Davatzikos, Miriam Mehl, George Biros.
- Galactos: computing the anisotropic 3-point correlation function for 2 billion galaxies. Brian Friesen, Md. Mostofa Ali Patwary, Brian Austin, Nadathur Satish, Zachary Slepian, Narayanan Sundaram, Deborah Bard, Daniel J. Eisenstein, Jack Deslippe, Pradeep Dubey, Prabhat.
- Extreme scale multi-physics simulations of the tsunamigenic 2004 Sumatra megathrust earthquake. Carsten Uphoff, Sebastian Rettenberger, Michael Bader, Elizabeth H. Madden, Thomas Ulrich, Stephanie Wollherr, Alice-Agnes Gabriel.
- GPU triggered networking for intra-kernel communications. Michael LeBeane, Khaled Hamidouche, Brad Benton, Mauricio Breternitz, Steven K. Reinhardt, Lizy K. John.
- Gravel: fine-grain GPU-initiated network messages. Marc S. Orr, Shuai Che, Bradford M. Beckmann, Mark Oskin, Steven K. Reinhardt, David A. Wood.
- Toward standardized near-data processing with unrestricted data placement for GPUs. Gwangsun Kim, Niladrish Chatterjee, Mike O'Connor, Kevin Hsieh.
- Understanding object-level memory access patterns across the spectrum. Xu Ji, Chao Wang, Nosayba El-Sayed, Xiaosong Ma, Youngjae Kim, Sudharshan S. Vazhkudai, Wei Xue, Daniel Sánchez.
- Exploring and analyzing the real impact of modern on-package memory on HPC scientific kernels. Ang Li, Weifeng Liu, Mads Ruben Burgdorff Kristensen, Brian Vinter, Hao Wang, Kaixi Hou, Andres Marquez, Shuaiwen Leon Song.
- Large-scale adaptive mesh simulations through non-volatile byte-addressable memory. Bao Nguyen, Hua Tan, Xuechen Zhang.
- Experimental and analytical study of Xeon Phi reliability. Daniel A. G. de Oliveira, Laércio Lima Pilla, Nathan DeBardeleben, Sean Blanchard, Heather Quinn, Israel Koren, Philippe O. A. Navaux, Paolo Rech.
- REFINE: realistic fault injection via compiler-based instrumentation for accuracy, portability and speed. Giorgis Georgakoudis, Ignacio Laguna, Dimitrios S. Nikolopoulos, Martin Schulz.
- Correcting soft errors online in fast Fourier transform. Xin Liang, Jieyang Chen, Dingwen Tao, Sihuan Li, Panruo Wu, Hongbo Li, Kaiming Ouyang, Yuanlai Liu, Fengguang Song, Zizhong Chen.
- Performance modeling under resource constraints using deep transfer learning. Aniruddha Marathe, Rushil Anirudh, Nikhil Jain, Abhinav Bhatele, Jayaraman J. Thiagarajan, Bhavya Kailkhura, Jae-Seung Yeom, Barry Rountree, Todd Gamblin.
- Obtaining dynamic scheduling policies with simulation and machine learning. Danilo Carastan-Santos, Raphael Y. de Camargo.
- 0.5 petabyte simulation of a 45-qubit quantum circuit. Thomas Häner, Damian S. Steiger.
- Representative paths analysis. Nathan R. Tallent, Darren J. Kerbyson, Adolfy Hoisie.
- ScrubJay: deriving knowledge from the disarray of HPC performance data. Alfredo Giménez, Todd Gamblin, Abhinav Bhatele, Chad Wood, Kathleen Shoga, Aniruddha Marathe, Peer-Timo Bremer, Bernd Hamann, Martin Schulz.
- Charliecloud: unprivileged containers for user-defined software stacks in HPC. Reid Priedhorsky, Tim Randles.
- Securing HPC: development of a low cost, open source multi-factor authentication infrastructure. W. Cyrus Proctor, Patrick Storm, Matthew R. Hanlon, Nathaniel Mendoza.
- Embracing a new era of highly efficient and productive quantum Monte Carlo simulations. Amrita Mathuriya, Ye Luo, Raymond C. Clay III, Anouar Benali, Luke Shulenburger, Jeongnim Kim.
- ™ processor. Vladimir Mironov, Yuri Alexeev, Kristopher Keipert, Michael D'mello, Alexander Moskovsky, Mark S. Gordon.
- Efficient and scalable calculation of complex band structure using Sakurai-Sugiura method. Shigeru Iwase, Yasunori Futamura, Akira Imakura, Tetsuya Sakurai, Tomoya Ono.
- Towards fine-grained dynamic tuning of HPC applications on modern multi-core architectures. Mohammed Sourouri, Espen Birger Raknes, Nico Reissmann, Johannes Langguth, Daniel Hackenberg, Robert Schöne, Per Gunnar Kjeldsberg.
- CAPES: unsupervised storage performance tuning using neural network-based deep reinforcement learning. Yan Li, Kenneth Chang, Oceane Bel, Ethan L. Miller, Darrell D. E. Long.
- Input-aware auto-tuning of compute-bound HPC kernels. Philippe Tillet, David Cox.
- Failures in large scale systems: long-term measurement, analysis, and implications. Saurabh Gupta, Tirthak Patel, Christian Engelmann, Devesh Tiwari.
- GUIDE: a scalable information directory service to collect, federate, and analyze logs for operational insights into a leadership HPC facility. Sudharshan S. Vazhkudai, Ross Miller, Devesh Tiwari, Christopher Zimmer, Feiyi Wang, Sarp Oral, Raghul Gunasekaran, Deryl Steinert.
- Scientific user behavior and data-sharing trends in a petascale file system. Seung-Hwan Lim, Hyogi Sim, Raghul Gunasekaran, Sudharshan S. Vazhkudai.
- Scaling betweenness centrality using communication-efficient sparse matrix multiplication. Edgar Solomonik, Maciej Besta, Flavio Vella, Torsten Hoefler.
- Distributed Southwell: an iterative method with low communication costs. Jordi Wolfson-Pou, Edmond Chow.
- Tessellating stencils. Liang Yuan, Yunquan Zhang, Peng Guo, Shan Huang.
- Predicting the performance impact of different fat-tree configurations. Nikhil Jain, Abhinav Bhatele, Louis H. Howell, David Böhme, Ian Karlin, Edgar A. León, Misbah Mubarak, Noah Wolfe, Todd Gamblin, Matthew L. Leininger.
- A comparative study of SDN and adaptive routing on dragonfly networks. Peyman Faizian, Md Atiqul Mollah, Zhou Tong, Xin Yuan, Michael Lang.
- Run-to-run variability on Xeon Phi based Cray XC systems. Sudheer Chunduri, Kevin Harms, Scott Parker, Vitali A. Morozov, Samuel Oshin, Naveen Cherukuri, Kalyan Kumaran.
- Geometry-oblivious FMM for compressing dense SPD matrices. Chenhan D. Yu, James Levitt, Severin Reiz, George Biros.
- Low communication FMM-accelerated FFT on GPUs. Cris Cecka.
- Designing vector-friendly compact BLAS and LAPACK kernels. Kyungjoo Kim, Timothy B. Costa, Mehmet Deveci, Andrew M. Bradley, Simon D. Hammond, Murat Efe Guney, Sarah Knepper, Shane Story, Sivasankaran Rajamanickam.
- Transactional NVM cache with high performance and crash consistency. Qingsong Wei, Chundong Wang, Cheng Chen, Yechao Yang, Jun Yang, Mingdi Xue.
- PapyrusKV: a high-performance parallel key-value store for distributed NVM architectures. Jungwon Kim, Seyong Lee, Jeffrey S. Vetter.
- Unimem: runtime data management on non-volatile memory-based heterogeneous main memory. Kai Wu, Yingchao Huang, Dong Li.
- sPIN: high-performance streaming processing in the network. Torsten Hoefler, Salvatore Di Girolamo, Konstantin Taranov, Ryan E. Grant, Ron Brightwell.
- Leveraging near data processing for high-performance checkpoint/restart. Abhinav Agrawal, Gabriel H. Loh, James Tuck.
- Melissa: large scale in transit sensitivity analysis avoiding intermediate files. Théophile Terraz, Alejandro Ribés, Yvan Fournier, Bertrand Iooss, Bruno Raffin.
- Why is MPI so slow?: analyzing the fundamental limits in implementing MPI-3.1. Ken Raffenetti, Abdelhalim Amer, Lena Oden, Charles Archer, Wesley Bland, Hajime Fujita, Yanfei Guo, Tomislav Janjusic, Dmitry Durnov, Michael Blocksome, Min-Si, Sangmin Seo, Akhil Langer, Gengbin Zheng, Masamichi Takagi, Paul Coffman, Jithin Jose, Sayantan Sur, Alexander Sannikov, Sergey Oblomov, Michael Chuvelev, Masayuki Hatanaka, Xin Zhao, Paul F. Fischer, Thilina Rathnayake, Matthew Otten, Misun Min, Pavan Balaji.
- Parastack: efficient hang detection for MPI programs at large scale. Hongbo Li, Zizhong Chen, Rajiv Gupta.
- Scalable reduction collectives with data partitioning-based multi-leader design. Mohammadreza Bayatpour, Sourav Chakraborty, Hari Subramoni, Xiaoyi Lu, Dhabaleswar K. Panda.