- DozzNoC: Reducing Static and Dynamic Energy in NoCs with Low-latency Voltage Regulators using Machine Learning. Mark Clark, Yingping Chen, Avinash Karanth, Dongsheng Brian Ma, Ahmed Louri. 1-11
- Neksus: An Interconnect for Heterogeneous System-In-Package Architectures. Vidushi Goyal, Xiaowei Wang, Valeria Bertacco, Reetuparna Das. 12-21
- Accelerated Reply Injection for Removing NoC Bottleneck in GPGPUs. Yunfan Li, Lizhong Chen. 22-31
- Machine-agnostic and Communication-aware Designs for MPI on Emerging Architectures. Jahanzeb Maqbool Hashmi, Shulei Xu, Bharath Ramesh 0005, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. (DK) Panda. 32-41
- ClusterSR: Cluster-Aware Scattered Repair in Erasure-Coded Storage. Zhirong Shen, Jiwu Shu, Zhijie Huang, Yingxun Fu. 42-51
- Stitch It Up: Using Progressive Data Storage to Scale Science. Jay F. Lofstead, John Mitchell, Enze Chen. 52-61
- HFetch: Hierarchical Data Prefetching for Scientific Workflows in Multi-Tiered Storage Environments. Hariharan Devarajan, Anthony Kougkas, Xian-He Sun. 62-72
- CanarIO: Sounding the Alarm on IO-Related Performance Degradation. Michael R. Wyatt II, Stephen Herbein, Kathleen Shoga, Todd Gamblin, Michela Taufer. 73-83
- A Study of Graph Analytics for Massive Datasets on Distributed Multi-GPUs. Vishwesh Jatala, Roshan Dathathri, Gurbinder Gill, Loc Hoang, V. Krishna Nandivada, Keshav Pingali. 84-94
- A Highly Efficient Dynamical Core of Atmospheric General Circulation Model based on Leap-Format. Hang Cao, Liang Yuan, He Zhang, Baodong Wu, Shigang Li, Pengqi Lu, Yunquan Zhang, Yongjun Xu, Minghua Zhang. 95-104
- Understanding GPU-Based Lossy Compression for Extreme-Scale Cosmological Simulations. Sian Jin, Pascal Grosset, Christopher M. Biwer, Jesus Pulido, Jiannan Tian, Dingwen Tao, James P. Ahrens. 105-115
- Optimizing High Performance Markov Clustering for Pre-Exascale Architectures. Oguz Selvitopi, Md Taufique Hussain, Ariful Azad, Aydin Buluç. 116-126
- Tightening Up the Incentive Ratio for Resource Sharing Over the Rings. Yukun Cheng, Xiaotie Deng, Yuhao Li 0002. 127-136
- Communication-Efficient String Sorting. Timo Bingmann, Peter Sanders 0001, Matthias Schimek. 137-147
- SCSL: Optimizing Matching Algorithms to Improve Real-time for Content-based Pub/Sub Systems. Tianchen Ding, Shiyou Qian, Jian Cao, Guangtao Xue, Minglu Li 0001. 148-157
- Distributed Graph Realizations. John Augustine, Keerti Choudhary, Avi Cohen, David Peleg, Sumathi Sivasubramaniam, Suman Sourav. 158-167
- Transaction-Based Core Reliability. Sang Wook Stephen Do, Michel Dubois. 168-179
- Understanding the Interplay between Hardware Errors and User Job Characteristics on the Titan Supercomputer. Seung-Hwan Lim, Ross G. Miller, Sudharshan S. Vazhkudai. 180-190
- EC-Fusion: An Efficient Hybrid Erasure Coding Framework to Improve Both Application and Recovery Performance in Cloud Storage Systems. Han Qiu 0003, Chentao Wu, Jie Li 0002, Minyi Guo, Tong Liu, Xubin He, Yuanyuan Dong, Yafei Zhao. 191-201
- Learning an Effective Charging Scheme for Mobile Devices. Tang Liu 0001, Baijun Wu, Wenzheng Xu, Xianbo Cao, Jian Peng 0002, Hongyi Wu. 202-211
- Optimize Scheduling of Federated Learning on Battery-powered Mobile Devices. Cong Wang, Xin Wei, Pengzhan Zhou. 212-221
- Harnessing Deep Learning via a Single Building Block. Evangelos Georganas, Kunal Banerjee 0001, Dhiraj D. Kalamkar, Sasikanth Avancha, Anand Venkat, Michael J. Anderson, Greg Henry, Hans Pabst, Alexander Heinecke. 222-233
- Experience-Driven Computational Resource Allocation of Federated Learning by Deep Reinforcement Learning. Yufeng Zhan, Peng Li 0017, Song Guo 0001. 234-243
- An Active Learning Method for Empirical Modeling in Performance Tuning. Jiepeng Zhang, Jingwei Sun, Wenju Zhou, Guangzhong Sun. 244-253
- DASSA: Parallel DAS Data Storage and Analysis for Subsurface Event Detection. Bin Dong 0002, Verónica Rodríguez Tribaldos, Xin-xing, Suren Byna, Jonathan Ajo-Franklin, Kesheng Wu. 254-263
- Scaling of Union of Intersections for Inference of Granger Causal Networks from Observational Data. Mahesh Balasubramanian, Trevor D. Ruiz, Brandon Cook 0001, Prabhat, Sharmodeep Bhattacharyya, Aviral Shrivastava, Kristofer E. Bouchard. 264-273
- GPU-Based Static Data-Flow Analysis for Fast and Scalable Android App Vetting. Xiaodong Yu, Fengguo Wei, Xinming Ou, Michela Becchi, Tekin Bicer, Danfeng Daphne Yao. 274-284
- Robust Server Placement for Edge Computing. Dongyu Lu, Yuben Qu, Fan Wu, Haipeng Dai, Chao Dong, Guihai Chen. 285-294
- EdgeIso: Effective Performance Isolation for Edge Devices. Yoonsung Nam, YongJun Choi, Byeonghun Yoo, Hyeonsang Eom, Yongseok Son. 295-305
- Busy-Time Scheduling on Heterogeneous Machines. Runtian Ren, Xueyan Tang. 306-315
- Scheduling Malleable Jobs Under Topological Constraints. Evripidis Bampis, Konstantinos Dogeas, Alexander V. Kononov, Giorgio Lucarelli, Fanny Pascual. 316-325
- XSP: Across-Stack Profiling and Analysis of Machine Learning Models on GPUs. Cheng Li, Abdul Dakkak, Jinjun Xiong, Wei Wei 0021, Lingjie Xu, Wen-mei Hwu. 326-327
- Exploring the Binary Precision Capabilities of Tensor Cores for Epistasis Detection. Ricardo Nobre, Aleksandar Ilic, Sergio Santander-Jiménez, Leonel Sousa. 338-347
- Understanding and Improving Persistent Transactions on Optane™ DC Memory. Pantea Zardoshti, Michael F. Spear, Aida Vosoughi, Garret Swart. 348-357
- CycLedger: A Scalable and Secure Parallel Protocol for Distributed Ledger via Sharding. Mengqian Zhang, Jichen Li, Zhaohua Chen, Hongyin Chen, Xiaotie Deng. 358-367
- Mitigating Large Response Time Fluctuations through Fast Concurrency Adapting in Clouds. Jianshu Liu, Shungeng Zhang, Qingyang Wang, Jinpeng Wei. 368-377
- DAG-Aware Joint Task Scheduling and Cache Management in Spark Clusters. Yinggen Xu, Liu Liu, Zhijun Ding. 378-387
- Solving the Container Explosion Problem for Distributed High Throughput Computing. Tim Shaffer, Nicholas L. Hazekamp, Jakob Blomer, Douglas Thain. 388-398
- Amoeba: QoS-Awareness and Reduced Resource Usage of Microservices with Serverless Computing. Zijun Li, Quan Chen, Shuai Xue, Tao Ma, Yong Yang, Zhuo Song, Minyi Guo. 399-408
- Efficient I/O for Neural Network Training with Compressed Data. Zhao Zhang 0007, Lei Huang, J. Gregory Pauloski, Ian T. Foster. 409-418
- Not All Explorations Are Equal: Harnessing Heterogeneous Profiling Cost for Efficient MLaaS Training. Jun Yi, Chengliang Zhang, Wei Wang 0030, Cheng Li, Feng Yan 0001. 419-428
- ASYNC: A Cloud Engine with Asynchrony and History for Distributed Machine Learning. Saeed Soori, Bugra Can, Mert Gürbüzbalaban, Maryam Mehri Dehnavi. 429-439
- Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs. Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu. 440-450
- Adaptive Page Migration for Irregular Data-intensive Applications under GPU Memory Oversubscription. Debashis Ganguly, Ziyu Zhang, Jun Yang, Rami G. Melhem. 451-461
- LOGAN: High-Performance GPU-Based X-Drop Long-Read Alignment. Alberto Zeni, Giulia Guidi, Marquita Ellis, Nan Ding, Marco D. Santambrogio, Steven A. Hofmeyr, Aydin Buluç, Leonid Oliker, Katherine A. Yelick. 462-471
- Coordinated Page Prefetch and Eviction for Memory Oversubscription Management in GPUs. Qi Yu 0003, Bruce R. Childers, Libo Huang, Cheng Qian, Hui Guo 0004, Zhiying Wang. 472-482
- A Study of Single and Multi-device Synchronization Methods in Nvidia GPUs. Lingqi Zhang, Mohamed Wahib, Haoyu Zhang, Satoshi Matsuoka. 483-493
- DPF-ECC: Accelerating Elliptic Curve Cryptography with Floating-Point Computing Power of GPUs. Lili Gao, Fangyu Zheng, Niall Emmart, Jiankuo Dong, Jingqiang Lin, Charles C. Weems. 494-504
- Scalability Challenges of an Industrial Implicit Finite Element Code. Francois-Henry Rouet, Cleve Ashcraft, Jef Dawson, Roger Grimes, Erman Guleryuz, Seid Koric, Robert F. Lucas, James S. Ong, Todd A. Simons, Ting-Ting Zhu. 505-514
- ETH: An Architecture for Exploring the Design Space of In-situ Scientific Visualization. Gregory D. Abram, Vignesh Adhinarayanan, Wu-chun Feng, David H. Rogers, James P. Ahrens. 515-526
- Scaling Betweenness Approximation to Billions of Edges by MPI-based Adaptive Sampling. Alexander van der Grinten, Henning Meyerhenke. 527-535
- Improved Intermediate Data Management for MapReduce Frameworks. Haoyu Wang, Haiying Shen, Charles Reiss, Arnim Jain, Yunqiao Zhang. 536-545
- Bandwidth-Aware Page Placement in NUMA. David Gureya, João Neto, Reza Karimi, João Barreto 0001, Pramod Bhatotia, Vivien Quéma, Rodrigo Rodrigues, Paolo Romano 0002, Vladimir Vlassov. 546-556
- HCompress: Hierarchical Data Compression for Multi-Tiered Storage Environments. Hariharan Devarajan, Anthony Kougkas, Luke Logan, Xian-He Sun. 557-566
- FRaZ: A Generic High-Fidelity Fixed-Ratio Lossy Compression Framework for Scientific Floating-point Data. Robert Underwood, Sheng Di, Jon C. Calhoun, Franck Cappello. 567-577
- DELTA: Distributed Locality-Aware Cache Partitioning for Tile-based Chip Multiprocessors. Nadja Holtryd, Madhavan Manivannan, Per Stenström, Miquel Pericàs. 578-589
- Coordinated Management of Processor Configuration and Cache Partitioning to Optimize Energy under QoS Constraints. Mehrzad Nejat, Madhavan Manivannan, Miquel Pericàs, Per Stenström. 590-601
- StragglerHelper: Alleviating Straggling in Computing Clusters via Sharing Memory Access Patterns. Wenjie Liu 0002, Ping Huang 0001, Xubin He. 602-611
- Evaluating the Numerical Stability of Posit Arithmetic. Nicholas Buoncristiani, Sanjana Shah, David Donofrio, John Shalf. 612-621
- Varity: Quantifying Floating-Point Variations in HPC Systems Through Randomized Testing. Ignacio Laguna. 622-633
- Demystifying Tensor Cores to Optimize Half-Precision Matrix Multiply. Da Yan 0002, Wei Wang 0030, Xiaowen Chu. 634-643
- Data Collection of IoT Devices Using an Energy-Constrained UAV. Yuchen Li, Weifa Liang, Wenzheng Xu, Xiaohua Jia. 644-653
- Argus: Multi-Level Service Visibility Scoping for Internet-of-Things in Enterprise Environments. Qian Zhou, Omkant Pandey, Fan Ye 0003. 654-663
- G-PBFT: A Location-based and Scalable Consensus Protocol for IoT-Blockchain Applications. Laphou Lao, Xiaohai Dai, Bin Xiao 0001, Songtao Guo. 664-673
- Byzantine Generalized Lattice Agreement. Giuseppe Antonio Di Luna, Emmanuelle Anceaume, Leonardo Querzoni. 674-683
- A Heterogeneous PIM Hardware-Software Co-Design for Energy-Efficient Graph Processing. Yu Huang 0013, Long Zheng 0003, Pengcheng Yao, Jieshan Zhao, Xiaofei Liao, Hai Jin 0001, Jingling Xue. 684-695
- Spara: An Energy-Efficient ReRAM-Based Accelerator for Sparse Graph Analytics Applications. Long Zheng 0003, Jieshan Zhao, Yu Huang 0013, Qinggang Wang, Zhen Zeng, Jingling Xue, Xiaofei Liao, Hai Jin 0001. 696-707
- Optimal Encoding and Decoding Algorithms for the RAID-6 Liberation Codes. Zhijie Huang, Hong Jiang, Zhirong Shen, Hao Che, Nong Xiao, Ning Li. 708-717
- Sturgeon: Preference-aware Co-location for Improving Utilization of Power Constrained Computers. Pu Pang, Quan Chen, Deze Zeng, Chao Li, Jingwen Leng, Wenli Zheng, Minyi Guo. 718-727
- A High-Throughput Solver for Marginalized Graph Kernels on GPU. Yu-Hang Tang, Oguz Selvitopi, Doru-Thom Popovici, Aydin Buluç. 728-738
- Dynamic Graphs on the GPU. Muhammad A. Awad, Saman Ashkiani, Serban D. Porumbescu, John D. Owens. 739-748
- Accelerating Parallel Hierarchical Matrix-Vector Products via Data-Driven Sampling. Lucas Erlandson, Difeng Cai, Yuanzhe Xi, Edmond Chow. 749-758
- NC Algorithms for Popular Matchings in One-Sided Preference Systems and Related Problems. Changyong Hu, Vijay K. Garg. 759-768
- Smartly Handling Renewable Energy Instability in Supporting A Cloud Datacenter. Jiechao Gao, Haoyu Wang, Haiying Shen. 769-778
- A Self-Optimized Generic Workload Prediction Framework for Cloud Computing. Vinodh Kumaran Jayakumar, Jaewoo Lee, In Kee Kim, Wei Wang 0054. 779-788
- SeeSAw: Optimizing Performance of In-Situ Analytics Applications under Power Constraints. Ivana Marincic, Venkatram Vishwanath, Henry Hoffmann. 789-798
- What does Power Consumption Behavior of HPC Jobs Reveal? : Demystifying, Quantifying, and Predicting Power Consumption Characteristics. Tirthak Patel, Adam Wagenhäuser, Christopher Eibel, Timo Hönig, Thomas Zeiser, Devesh Tiwari. 799-809
- Efficient Parallel and Adaptive Partitioning for Load-balancing in Spatial Join. Jie Yang, Satish Puri. 810-820
- Union: An Automatic Workload Manager for Accelerating Network Simulation. Xin Wang, Misbah Mubarak, Yao Kang, Robert B. Ross, Zhiling Lan. 821-830
- Auto-tuning Parameter Choices in HPC Applications using Bayesian Optimization. Harshitha Menon, Abhinav Bhatele, Todd Gamblin. 831-840
- Inter-Job Scheduling of High-Throughput Material Screening Applications. Zhihui Du, Xinning Hui, Yurui Wang, Jun Jiang, Jason Liu, Baokun Lu, Chongyu Wang. 841-852
- Reservation and Checkpointing Strategies for Stochastic Jobs. Ana Gainaru, Brice Goglin, Valentin Honoré, Guillaume Pallez Aupy, Padma Raghavan, Yves Robert, Hongyang Sun. 853-863
- A Scheduling Approach to Incremental Maintenance of Datalog Programs. Shikha Singh 0002, Sergey Madaminov, Michael A. Bender, Michael Ferdman, Ryan Johnson, Benjamin Moseley, Hung Q. Ngo 0001, Dung Nguyen, Soeren Olesen, Kurt Stirewalt, Geoffrey Washburn. 864-873
- Dynamic Scheduling in Distributed Transactional Memory. Costas Busch, Maurice Herlihy, Miroslav Popovic, Gokarna Sharma. 874-883
- Learning Cost-Effective Sampling Strategies for Empirical Performance Modeling. Marcus Ritter, Alexandru Calotoiu, Sebastian Rinke, Thorsten Reimann, Torsten Hoefler, Felix Wolf 0001. 884-895
- The Case of Performance Variability on Dragonfly-based Systems. Abhinav Bhatele, Jayaraman J. Thiagarajan, Taylor Groves, Rushil Anirudh, Staci A. Smith, Brandon Cook 0001, David K. Lowenthal. 896-905
- Predicting and Comparing the Performance of Array Management Libraries. Donghe Kang, Oliver Rübel, Suren Byna, Spyros Blanas. 906-915
- Demystifying the Performance of HPC Scientific Applications on NVM-based Memory Systems. Ivy Bo Peng, Kai Wu, Jie Ren 0015, Dong Li, Maya B. Gokhale. 916-925
- Packet-in Request Redirection for Minimizing Control Plane Response Time. Rui Xia, Haipeng Dai, Jiaqi Zheng, Hong Xu 0001, Meng Li 0010, Guihai Chen. 926-935
- PCGCN: Partition-Centric Processing for Accelerating Graph Convolutional Network. Chao Tian, Lingxiao Ma, Zhi Yang, Yafei Dai. 936-945
- ConMidbox: Consolidated Middleboxes Selection and Routing in SDN/NFV-Enabled Networks. Guiyan Liu, Songtao Guo, Pan Li, Liang Liu. 946-955
- Scalable and Memory-Efficient Kernel Ridge Regression. Gustavo Chávez, Yang Liu, Pieter Ghysels, Xiaoye Sherry Li, Elizaveta Rebrova. 956-965
- SSDKeeper: Self-Adapting Channel Allocation to Improve the Performance of SSD Devices. Renping Liu, Xianzhang Chen, Yujuan Tan, Runyu Zhang, Liang Liang 0002, Duo Liu. 966-975
- FlashKey: A High-Performance Flash Friendly Key-Value Store. Madhurima Ray, Krishna Kant 0001, Peng Li, Sanjeev Trika. 976-985
- Pacon: Improving Scalability and Efficiency of Metadata Service through Partial Consistency. Yubo Liu, Yutong Lu, Zhiguang Chen, Ming Zhao. 986-996
- XPlacer: Automatic Analysis of Data Access Patterns on Heterogeneous CPU/GPU Systems. Peter Pirkelbauer, Pei-Hung Lin, Tristan Vanderbruggen, Chunhua Liao. 997-1007
- Improving Transactional Code Generation via Variable Annotation and Barrier Elision. João P. L. de Carvalho, Bruno Chinelato Honorio, Alexandro Baldassin, Guido Araujo. 1008-1017
- Evaluating Thread Coarsening and Low-cost Synchronization on Intel Xeon Phi. Hancheng Wu, Michela Becchi. 1018-1029
- AnySeq: A High Performance Sequence Alignment Library based on Partial Evaluation. André Müller, Bertil Schmidt, Andreas Hildebrandt 0001, Richard Membarth, Roland Leißa, Matthis Kruse, Sebastian Hack. 1030-1040
- Analysis of a List Scheduling Algorithm for Task Graphs on Two Types of Resources. Lionel Eyraud-Dubois, Suraj Kumar. 1041-1050
- Optimal Convex Hull Formation on a Grid by Asynchronous Robots with Lights. Rory Hector, Ramachandran Vaidyanathan, Gokarna Sharma, Jerry L. Trahan. 1051-1060
- On the Complexity of Conditional DAG Scheduling in Multiprocessor Systems. Alberto Marchetti-Spaccamela, Nicole Megow, Jens Schlöter, Martin Skutella, Leen Stougie. 1061-1070
- Weaver: Efficient Coflow Scheduling in Heterogeneous Parallel Networks. Xin Sunny Huang, Yiting Xia, T. S. Eugene Ng. 1071-1081
- Fault-Tolerant Containers Using NiLiCon. Diyu Zhou, Yuval Tamir. 1082-1091
- Aarohi: Making Real-Time Node Failure Prediction Feasible. Anwesha Das, Frank Mueller, Barry Rountree. 1092-1101
- FP4S: Fragment-based Parallel State Recovery for Stateful Stream Applications. Pinchao Liu, Hailu Xu, Dilma Da Silva, Qingyang Wang, Sarker Tanzir Ahmed, Liting Hu. 1102-1111
- Implementation and Evaluation of a Hardware Decentralized Synchronization Lock for MPSoCs. Maxime France-Pillois, Jérôme Martin, Frédéric Rousseau. 1112-1121
- Communication-Efficient Jaccard similarity for High-Performance Distributed Genome Comparisons. Maciej Besta, Raghavendra Kanakagiri, Harun Mustafa, Mikhail Karasikov, Gunnar Rätsch, Torsten Hoefler, Edgar Solomonik. 1122-1132
- Engineering Worst-Case Inputs for Pairwise Merge Sort on GPUs. Kyle Berney, Nodari Sitchinava. 1133-1142
- The Impossibility of Fast Transactions. Karolos Antoniadis, Diego Didona, Rachid Guerraoui, Willy Zwaenepoel. 1143-1154