- Tessellating Star Stencils. Liang Yuan, Shan Huang, Yunquan Zhang, Hang Cao.
- Transfer Learning based Failure Prediction for Minority Disks in Large Data Centers of Heterogeneous Disk Systems. Ji Zhang, Ke Zhou 0001, Ping Huang, Xubin He, Zhili Xiao, Bin Cheng, Yongguang Ji, Yinhu Wang.
- QLEC: A Machine-Learning-Based Energy-Efficient Clustering Algorithm to Prolong Network Lifespan for IoT in High-Dimensional Space. Ke Li, Haowei Huang, Xiaofeng Gao, Fan Wu, Guihai Chen.
- Refactoring and Optimizing WRF Model on Sunway TaihuLight. Kai Xu, Zhenya Song, Yuandong Chan, Shida Wang, Xiangxu Meng, Weiguo Liu, Wei Xue.
- CPpf: a prefetch aware LLC partitioning approach. Jun Xiao, Andy D. Pimentel, Xu Liu.
- Gossip: Efficient Communication Primitives for Multi-GPU Systems. Robin Kobus, Daniel Jünger, Christian Hundt 0002, Bertil Schmidt.
- Speculative Scheduling for Stochastic HPC Applications. Ana Gainaru, Hongyang Sun, Guillaume Pallez Aupy, Padma Raghavan.
- Runtime Adaptive Task Inlining on Asynchronous Multitasking Runtime Systems. Bibek Wagle, Mohammad Alaul Haque Monil, Kevin A. Huck, Allen D. Malony, Adrian Serio, Hartmut Kaiser.
- Nested Virtualization Without the Nest. Mathieu Bacou, Alain Tchana, Daniel Hagimont.
- Accelerating All-Edge Common Neighbor Counting on Three Processors. Yulin Che, Zhuohang Lai, Shixuan Sun, Qiong Luo 0001, Yue Wang 0012.
- OSP: Overlapping Computation and Communication in Parameter Server for Fast Machine Learning. Haozhao Wang, Song Guo, Ruixuan Li.
- Predictable GPUs Frequency Scaling for Energy and Performance. Kaijie Fan, Biagio Cosenza, Ben H. H. Juurlink.
- A Unified Optimization Approach for CNN Model Inference on Integrated GPUs. Leyuan Wang, Zhi Chen, Yizhi Liu, Yao Wang, Lianmin Zheng, Mu Li, Yida Wang.
- Breaking Band: A Breakdown of High-performance Communication. Rohit Zambre, Megan Grodowitz, Aparna Chandramowlishwaran, Pavel Shamis.
- On Integration of Appends and Merges in Log-Structured Merge Trees. Caixin Gong, Shuibing He, Yili Gong, Yingchun Lei.
- TEA: A Traffic-efficient Erasure-coded Archival Scheme for In-memory Stores. Bin Xu, Jianzhong Huang 0001, Qiang Cao, Xiao Qin 0001.
- Accelerated Work Stealing. D. Brian Larkins, John Snyder, James Dinan.
- SaC: Exploiting Execution-Time Slack to Save Energy in Heterogeneous Multicore Systems. Muhammad Waqar Azhar, Miquel Pericàs, Per Stenström.
- diBELLA: Distributed Long Read to Long Read Alignment. Marquita Ellis, Giulia Guidi, Aydin Buluç, Leonid Oliker, Katherine A. Yelick.
- Distributed Join Algorithms on Multi-GPU Clusters with GPUDirect RDMA. Chengxin Guo, Hong Chen, Feng Zhang, Cuiping Li.
- DICER: Diligent Cache Partitioning for Efficient Workload Consolidation. Konstantinos Nikas, Nikela Papadopoulou, Dimitra Giantsidi, Vasileios Karakostas, Georgios I. Goumas, Nectarios Koziris.
- EMBA: Efficient Memory Bandwidth Allocation to Improve Performance on Intel Commodity Processor. Yaocheng Xiang, Chencheng Ye, Xiaolin Wang, Yingwei Luo, Zhenlin Wang.
- Incorporating Probabilistic Optimizations for Resource Provisioning of Data Processing Workflows. Amelie Chi Zhou, Yao Xiao, Bingsheng He, Shadi Ibrahim, Reynold Cheng.
- Artemis: A Practical Low-latency Naming and Routing System. Xuebing Li, Bingyang Liu, Yang Chen, Yu Xiao, Jiaxin Tang, Xin Wang.
- BPP: A Realtime Block Access Pattern Mining Scheme for I/O Prediction. Chunjie Zhu, Fang Wang, Binbing Hou.
- Cosin: Controllable Social Influence Maximization and Its Distributed Implementation in Large-scale Social Networks. Jingya Zhou, Jianxi Fan, Jin Wang 0009.
- Approximate Code: A Cost-Effective Erasure Coding Framework for Tiered Video Storage in Cloud Systems. Huayi Jin, Chentao Wu, Xin Xie, Jie Li 0002, Minyi Guo, Hao Lin, Jianfeng Zhang.
- Dynamic Load Balancing in Hybrid Switching Data Center Networks with Converters. Jiaqi Zheng, Qiming Zheng, Xiaofeng Gao, Guihai Chen.
- Adaptive Learning for Concept Drift in Application Performance Modeling. Sandeep Madireddy, Prasanna Balaprakash, Philip H. Carns, Robert Latham, Glenn K. Lockwood, Robert B. Ross, Shane Snyder, Stefan M. Wild.
- Stage Delay Scheduling: Speeding up DAG-style Data Analytics Jobs with Resource Interleaving. Wujie Shao, Fei Xu, Li Chen, Haoyue Zheng, Fangming Liu.
- Data and Thread Placement in NUMA Architectures: A Statistical Learning Approach. Nicolas Denoyelle, Brice Goglin, Emmanuel Jeannot, Thomas Ropars.
- Gravitational Octree Code Performance Evaluation on Volta GPU. Yohei Miki.
- Unleashing the Scalability Potential of Power-Constrained Data Center in the Microservice Era. Xiaofeng Hou, Jiacheng Liu, Chao Li 0009, Minyi Guo.
- AVR: Reducing Memory Traffic with Approximate Value Reconstruction. Albin Eldstål-Damlin, Pedro Trancoso, Ioannis Sourdis.
- Fast Recovery Techniques for Erasure-coded Clusters in Non-uniform Traffic Network. Yunren Bai, Zihan Xu, Haixia Wang, Dongsheng Wang.
- The Communication-Overlapped Hybrid Decomposition Parallel Algorithm for Multi-Scale Fluid Simulations. Yi Liu, Xiaowei Guo, Chao Li, Canqun Yang, Xinbiao Gan, Peng Zhang, Yi Wang, Ran Zhao, Sijiang Fan.
- Massively Parallel ANS Decoding on GPUs. André Weißenberger, Bertil Schmidt.
- Faster parallel collision detection at high resolution for CNC milling applications. Xin Chen, Dmytro Konobrytskyi, Thomas M. Tucker, Thomas R. Kurfess, Richard W. Vuduc.
- PhSIH: A Lightweight Parallelization of Event Matching in Content-based Pub/Sub Systems. Zhengyu Liao, Shiyou Qian, Jian Cao, Yanhua Cao, Guangtao Xue, Jiadi Yu, Yanmin Zhu, Minglu Li.
- Design Exploration of Multi-tier Interconnection Networks for Exascale Systems. Javier Navaridas, Joshua Lant, Jose Antonio Pascual, Mikel Luján, John Goodacre.
- Performance, Energy, and Scalability Analysis and Improvement of Parallel Cancer Deep Learning CANDLE Benchmarks. Xingfu Wu, Valerie E. Taylor, Justin M. Wozniak, Rick Stevens, Thomas S. Brettin, Fangfang Xia.
- Improving Short Job Latency Performance in Hybrid Job Schedulers with Dice. Wei Zhou, K. Preston White, Hongfeng Yu.
- On Max-min Fair Resource Allocation for Distributed Job Execution. Yitong Guan, Chuanyou Li, Xueyan Tang.
- RFPL: A Recovery Friendly Parity Logging Scheme for Reducing Small Write Penalty of SSD RAID. Gaoxiang Xu, Dan Feng, Zhipeng Tan, Xinyan Zhang, Jie Xu, Xi Shu, Yifeng Zhu.
- Reducing Kernel Surface Areas for Isolation and Scalability. Daniel Zahka, Brian Kocoloski, Kate Keahey.
- Adaptive Routing Reconfigurations to Minimize Flow Cost in SDN-Based Data Center Networks. Akbar Majidi, Xiaofeng Gao, Shunjia Zhu, Nazila Jahanbakhsh, Guihai Chen.
- A Specialized Concurrent Queue for Scheduling Irregular Workloads on GPUs. David Troendle, Tuan Ta, Byunghyun Jang.
- I/O Characterization and Performance Evaluation of BeeGFS for Deep Learning. Fahim Chowdhury, Yue Zhu, Todd Heer, Saul Paredes, Adam Moody, Robin Goldstone, Kathryn Mohror, Weikuan Yu.
- Machine Learning for Fine-Grained Hardware Prefetcher Control. Jason Hiebel, Laura E. Brown, Zhenlin Wang.
- A 2D Parallel Triangle Counting Algorithm for Distributed-Memory Architectures. Ancy Sarah Tom, George Karypis.
- Accelerating Long Read Alignment on Three Processors. Zonghao Feng, Shuang Qiu, Lipeng Wang, Qiong Luo 0001.
- Optimized Execution of Parallel Loops via User-Defined Scheduling Policies. Seonmyeong Bak, Yanfei Guo, Pavan Balaji, Vivek Sarkar.
- Cooperative Job Scheduling and Data Allocation for Busy Data-Intensive Parallel Computing Clusters. Guoxin Liu, Haiying Shen, Haoyu Wang.
- FuncyTuner: Auto-tuning Scientific Applications With Per-loop Compilation. Tao Wang, Nikhil Jain, David Beckingsale, David Böhme, Frank Mueller, Todd Gamblin.
- A Network-aware and Partition-based Resource Management Scheme for Data Stream Processing. Yidan Wang, Zahir Tari, Xiaoran Huang, Albert Y. Zomaya.
- Holistic Slowdown Driven Scheduling and Resource Management for Malleable Jobs. Marco D'Amico, Ana Jokanovic, Julita Corbalán.
- VScan: Efficiently Analyzing Surveillance Videos via Model-joint Mechanism. Chen Zhang, Qiang Cao, Jie Yao, Yuanyuan Dong, Puyuan Yang.
- LFOC: A Lightweight Fairness-Oriented Cache Clustering Policy for Commodity Multicores. Adrian Garcia-Garcia, Juan Carlos Saez, Fernando Castro, Manuel Prieto-Matías.
- An Efficient Design Flow for Accelerating Complicated-connected CNNs on a Multi-FPGA Platform. Deguang Wang, Junzhong Shen, Mei Wen, Chunyuan Zhang.
- The Case for Water-Immersion Computer Boards. Michihiro Koibuchi, Ikki Fujiwara, Naoya Niwa, Tomohiro Totoki, Shoichi Hirasawa.
- Compiler-Assisted GPU Thread Throttling for Reduced Cache Contention. Hyunjun Kim, Sungin Hong, Hyeonsu Lee, Euiseong Seo, Hwansoo Han.
- CostPI: Cost-Effective Performance Isolation for Shared NVMe SSDs. Jiahao Liu, Fang Wang 0001, Dan Feng 0001.
- Efficient Data-Parallel Primitives on Heterogeneous Systems. Zhuohang Lai, Qiong Luo 0001, Xiaolong Xie.
- Cynthia: Cost-Efficient Cloud Resource Provisioning for Predictable Distributed Deep Neural Network Training. Haoyue Zheng, Fei Xu, Li Chen, Zhi Zhou, Fangming Liu.
- Exploiting Vector Processing in Dynamic Binary Translation. Chih-Min Lin, Sheng-Yu Fu, Ding-Yong Hong, Yu-Ping Liu, Jan-Jan Wu, Wei-Chung Hsu.
- Multi-Objective Reinforcement Learning for Reconfiguring Data Stream Analytics on Edge Computing. Alexandre Da Silva Veith, Felipe Rodrigo de Souza, Marcos Dias de Assunção, Laurent Lefèvre, Julio Cesar Santos dos Anjos.
- Automatic Differentiation for Adjoint Stencil Loops. Jan Hückelheim, Navjot Kukreja, Sri Hari Krishna Narayanan, Fabio Luporini, Gerard Gorman, Paul D. Hovland.
- Solving All-Pairs Shortest-Paths Problem in Large Graphs Using Apache Spark. Frank Schoeneman, Jaroslaw Zola.
- SAFE: Service Availability via Failure Elimination Through VNF Scaling. Rui Xia, Haipeng Dai, Jiaqi Zheng, Rong Gu, Xiaoyu Wang, Guihai Chen.
- A Read-leveling Data Distribution Scheme for Promoting Read Performance in SSDs with Deduplication. Mengting Lu, Fang Wang 0001, Dan Feng 0001, Yuchong Hu.
- DeepHash: An End-to-End Learning Approach for Metadata Management in Distributed File Systems. Yuanning Gao, Xiaofeng Gao, Guihai Chen.
- Near-Data Processing-Enabled and Time-Aware Compaction Optimization for LSM-tree-based Key-Value Stores. Hui Sun, Wei Liu, Jianzhong Huang 0001, Song Fu, Zhi Qiao, Weisong Shi.
- Modeling the Performance of Atomic Primitives on Modern Architectures. Fazeleh Hoseini, Aras Atalar, Philippas Tsigas.
- HPAS: An HPC Performance Anomaly Suite for Reproducing Performance Variations. Emre Ates, Yijia Zhang, Burak Aksar, Jim M. Brandt, Vitus J. Leung, Manuel Egele, Ayse K. Coskun.
- Lightweight Fault Tolerance in Pregel-Like Systems. Da Yan, James Cheng, Hongzhi Chen, Cheng Long, Purushotham Bangalore.
- How to Make the Preconditioned Conjugate Gradient Method Resilient Against Multiple Node Failures. Carlos Pachajoa, Markus Levonyak, Wilfried N. Gansterer, Jesper Larsson Träff.
- HyperPRAW: Architecture-Aware Hypergraph Restreaming Partition to Improve Performance of Parallel Applications Running on High Performance Computing Systems. Carlos Fernandez Musoles, Daniel Coca, Paul Richmond.
- JobPacker: Job Scheduling for Data-Parallel Frameworks with Hybrid Electrical/Optical Datacenter Networks. Zhuozhao Li, Haiying Shen.
- N-Code: An Optimal RAID-6 MDS Array Code for Load Balancing and High I/O Performance. Ping Xie, Zhu Yuan, Jianzhong Huang 0001, Xiao Qin 0001.
- ECoST: Energy-Efficient Co-Locating and Self-Tuning MapReduce Applications. Maria Malik, Hassan Ghasemzadeh, Tinoosh Mohsenin, Rosario Cammarota, Liang Zhao 0002, Avesta Sasan, Houman Homayoun, Setareh Rafatirad.
- swATOP: Automatically Optimizing Deep Learning Operators on SW26010 Many-Core Processor. Wei Gao, Jiarui Fang, Wenlai Zhao, Jinzhe Yang, Long Wang 0014, Lin Gan, Haohuan Fu, Guangwen Yang.
- Parallel Algorithms for Evaluating Matrix Polynomials. Sivan Toledo, Amit Waisel.
- Network Congestion Avoidance through Packet-chaining Reservation. Ke Wu, Dezun Dong, Cunlu Li, Shan Huang, Yi Dai.
- Controlled Asynchronous GVT: Accelerating Parallel Discrete Event Simulation on Many-Core Clusters. Ali Eker, Barry Williams, Kenneth Chiu, Dmitry Ponomarev.
- HOPE: A Parallel Execution Model Based on Hierarchical Omission. Masahiro Yasugi, Daisuke Muraoka, Tasuku Hiraishi, Seiji Umatani, Kento Emoto.
- TLB: Traffic-aware Load Balancing with Adaptive Granularity in Data Center Networks. Jinbin Hu, Jiawei Huang, Wenjun Lv, Weihe Li, Jianxin Wang, Tian He 0001.
- A Plugin Architecture for the TAU Performance System. Allen D. Malony, Srinivasan Ramesh, Kevin A. Huck, Nicholas Chaimov, Sameer Shende.
- Express Link Placement for NoC-Based Many-Core Platforms. Yunfan Li, Di Zhu 0002, Lizhong Chen.
- NFV-Enabled Multicasting in Mobile Edge Clouds with Resource Sharing. Zichuan Xu, Yutong Zhang, Weifa Liang, Qiufen Xia, Omer Rana, Alex Galis, Guowei Wu, Pan Zhou.
- Performance Models for Data Transfers: A Case Study with Molecular Chemistry Kernels. Suraj Kumar, Lionel Eyraud-Dubois, Sriram Krishnamoorthy.
- BCL: A Cross-Platform Distributed Data Structures Library. Benjamin Brock, Aydin Buluç, Katherine A. Yelick.
- Building Scalable NVM-based B+tree with HTM. Mengxing Liu, Jiankai Xing, Kang Chen, Yongwei Wu.
- MAC: Memory Access Coalescer for 3D-Stacked Memory. Xi Wang, Antonino Tumeo, John D. Leidel, Jie Li, Yong Chen 0001.
- AdaM: An Adaptive Fine-Grained Scheme for Distributed Metadata Management. Shiyi Cao, Yuanning Gao, Xiaofeng Gao, Guihai Chen.
- A Parallel Graph Algorithm for Detecting Mesh Singularities in Distributed Memory Ice Sheet Simulations. Ian Bogle, Karen D. Devine, Mauro Perego, Sivasankaran Rajamanickam, George M. Slota.
- DLBooster: Boosting End-to-End Deep Learning Workflows with Offloading Data Preprocessing Pipelines. Yang Cheng, Dan Li, Zhiyuan Guo, Binyao Jiang, Jiaxin Lin, Xi Fan, Jinkun Geng, Xinyi Yu, Wei Bai 0001, Lei Qu, Ran Shu, Peng Cheng, Yongqiang Xiong, Jianping Wu.
- Network Congestion-aware Online Service Function Chain Placement and Load Balancing. Xiaojun Shang, Zhenhua Liu, Yuanyuan Yang.
- A Tale of Two (Flow) Tables: Demystifying Rule Caching in OpenFlow Switches. Rui Li, Yu Pang, Jin Zhao, Xin Wang.
- When Power Oversubscription Meets Traffic Flood Attack: Re-Thinking Data Center Peak Load Management. Xiaofeng Hou, Mingyu Liang, Chao Li, Wenli Zheng, Quan Chen, Minyi Guo.
- FlowCon: Elastic Flow Configuration for Containerized Deep Learning Applications. Wenjia Zheng, Michael Tynes, Henry Gorelick, Ying Mao, Long Cheng, Yantian Hou.
- Spatially-aware Parallel I/O for Particle Data. Sidharth Kumar, Steve Petruzza, Will Usher, Valerio Pascucci.
- Cartesian Collective Communication. Jesper Larsson Träff, Sascha Hunold.
- A Practical, Scalable, Relaxed Priority Queue. Tingzhe Zhou, Maged M. Michael, Michael F. Spear.
- Massively Parallel Automated Software Tuning. Jakub Kurzak, Yaohung M. Tsai, Mark Gates, Ahmad Abdelfattah, Jack J. Dongarra.
- COMBFT: Conflicting-Order-Match based Byzantine Fault Tolerance Protocol with High Efficiency and Robustness. Yingyao Rong, Weigang Wu, Zhiguang Chen.
- Improved Unconstrained Energy Functional Method for Eigensolvers in Electronic Structure Calculations. Mauro Del Ben, Osni Marques, Andrew Canning.