- Introduction to HCW Workshop. Erik Saule, Emmanuel Jeannot. 1
- Message from the HCW Steering Committee Chair. Behrooz Shirazi. 2
- Message from the HCW General Chair. Erik Saule. 3
- Message from the HCW Program Committee Chair. Emmanuel Jeannot. 4
- HCW Keynote Talk. Ricky Yu-Kwong Kwok. 5
- Portable Implementation of Advanced Driver-Assistance Algorithms on Heterogeneous Architectures. Oliver Jakob Arndt, Fabian David Trager, Tobias Moß, Holger Blume. 6-17
- Improving CPU Performance Through Dynamic GPU Access Throttling in CPU-GPU Heterogeneous Processors. Siddharth Rai, Mainak Chaudhuri. 18-29
- Transparent Heterogeneous Backing Store for File Systems. Benjamin Marks, Tia Newhall. 30-41
- Alternative Processor Within Threshold: Flexible Scheduling on Heterogeneous Systems. Sonia López, Stavan Satish Karia. 42-53
- Preemptive Resource Management for Dynamically Arriving Tasks in an Oversubscribed Heterogeneous Computing System. Dylan Machovec, Sudeep Pasricha, Anthony A. Maciejewski, Howard Jay Siegel, Gregory A. Koenig, Michael Wright, Marcia Hilton, Rajendra Rambharos, Thomas Naughton, Neena Imam. 54-64
- Modeling of Applications and Hardware to Explore Task Mapping and Scheduling Strategies on a Heterogeneous Micro-Server System. Lilia Zaourar, Massinissa Ait Aba, David Briand, Jean-Marc Philippe. 65-76
- Consumer-and-Provider-Oriented Efficient IaaS Resource Allocation. Thibaud Ecarot, Djamal Zeghlache, Cedric Brandily. 77-85
- Introduction to RAW Workshop. Marco D. Santambrogio, Ramachandran Vaidyanathan. 86-87
- RAW Keynote Speakers. Ronald F. DeMara, Georgi Gaydadjiev. 88-89
- A Pipelined and Scalable Dataflow Implementation of Convolutional Neural Networks on FPGA. Marco Bacis, Giuseppe Natale, Emanuele Del Sozzo, Marco Domenico Santambrogio. 90-97
- On-Chip Memory Based Binarized Convolutional Deep Neural Network Applying Batch Normalization Free Technique on an FPGA. Haruyoshi Yonekawa, Hiroki Nakahara. 98-105
- A Modified Sliding Window Architecture for Efficient BRAM Resource Utilization. Murad Qasaimeh, Joseph Zambreno, Phillip H. Jones. 106-114
- Automatic Flow Selection and Quality-of-Result Estimation for FPGA Placement. G. Grewal, Shawki Areibi, Matthew Westrik, Ziad Abuowaimer, B. Zhao. 115-123
- Exploiting Decoupled OpenCL Work-Items with Data Dependencies on FPGAs: A Case Study. Javier Alejandro Varela, Norbert Wehn, Qian Liang, Songyin Tang. 124-131
- Exploiting FPGAs from Higher Level Languages A Signal Analysis Case Study. L. Stornaiuolo, A. Parravicini, Gianluca Durelli, Marco D. Santambrogio. 132-140
- ReEP: A Toolset for Generation and Programming of Reconfigurable Datapaths for Event Processing. Philip Gottschling, Christian Hochberger. 141-149
- A Scalable Dataflow Implementation of Curran's Approximation Algorithm. Anna Maria Nestorov, Enrico Reggiani, Hristina Palikareva, Pavel Burovskiy, Tobias Becker, Marco D. Santambrogio. 150-157
- A Generic Approach to the Development of Coprocessors for Elliptic Curve Cryptosystems. Rabia Shahid, Ted Winograd, Kris Gaj. 158-167
- A Hardware Acceleration for Surface EMG Non-Negative Matrix Factorization. Luca Cerina, Pierandrea Cancian, Giuseppe Franco, Marco Domenico Santambrogio. 168-174
- On-FPGA Real-Time Processing of Biological Signals From High-Density MEAs: a Design Space Exploration. Giovanni Pietro Seu, Gian Nicola Angotzi, Giuseppe Tuveri, Luigi Raffo, Luca Berdondini, Alessandro Maccione, Paolo Meloni. 175-183
- Combining Boolean Gates and Branching Programs in One Model can Lead to Faster Circuits. Yosi Ben-Asher, Esti Stein, Ramachandran Vaidyanathan. 184-191
- Efficient Totally-Ordered Subset Generation, with Application in Partial Reconfiguration. Utsav Agarwal, Ramachandran Vaidyanathan. 192-201
- FAReP: Fragmentation-Aware Replacement Policy for Task Reuse on Reconfigurable FPGAs. Godwin Enemali, Adewale Adetomi, Tughrul Arslan. 202-206
- Power Analysis of HLS-Designed Customized Instruction Set Architectures. Tejaswini Ananthanarayana, Sonia López, Marcin Lukowiak. 207-212
- A Near Optimal Integrated Solution for Resource Constrained Scheduling, Binding and Routing on CGRAs. Tajas Ruschke, Lukas Johannes Jung, Christian Hochberger. 213-218
- Clock Buffers, Nets, and Trees for On-Chip Communication: A Novel Network Access Technique in FPGAs. Adewale Adetomi, Godwin Enemali, Tughrul Arslan. 219-222
- Pearson Correlation Coefficient Acceleration for Modeling and Mapping of Neural Interconnections. Enrico Reggiani, Eleonora D'Arnese, Andrea Purgato, Marco D. Santambrogio. 223-228
- Out-of-Order Execution of Buffered Function Units in Exposed Data Path Architectures. Tripti Jain, Klaus Schneider, Frederik Walk. 229-234
- Dynamic Dual Fixed-Point CORDIC Implementation. Andres Jacoby, Daniel Llamocca. 235-240
- A Highly Scalable and Efficient Parallel Design of N-Body Simulation on FPGA. Emanuele Del Sozzo, Lorenzo Di Tucci, Marco D. Santambrogio. 241-246
- Feasibility Study of Real-Time Spiking Neural Network Simulations on a Swarm Intelligence Based Digital Architecture. Francesca Palumbo, Carlo Sau, Danilo Pani, Paolo Meloni, Luigi Raffo. 247-250
- Introduction to HiCOMB Workshop. Alex Pothen, Ananth Grama. 251
- HiCOMB Keynote. Radu Marculescu. 252
- Scalable FRaC Variants: Anomaly Detection for Precision Medicine. Cyrus Cousins, Christopher M. Pietras, Donna K. Slonim. 253-262
- Exploratory Modeling and Simulation of the Evolutionary Dynamics of Single-Stranded RNA Virus Populations. Jae-Seung Yeom, Tanya Kostova-Vassilevska, Peter D. Barnes Jr., David R. Jefferson, Tomas Oppelstrup. 263-272
- Parallel NGS Assembly Using Distributed Assembly Graphs Enriched with Biological Knowledge. Julia D. Warnke-Sommer, Hesham H. Ali. 273-282
- Parallel and Memory-Efficient Preprocessing for Metagenome Assembly. Vasudevan Rengasamy, Paul Medvedev, Kamesh Madduri. 283-292
- Scalable Parallelization of a Markov Coalescent Genealogy Sampler. Philip E. Davis, Adam M. Terwilliger, David Zeitler, Greg Wolffe. 293-302
- Par-eXpress: A Tool for Analysis of Sequencing Experiments With Ambiguous Assignment of Fragments in Parallel. Mucahid Kutlu, Gagan Agrawal, James S. Blachly. 303-310
- Introduction to EduPar Workshop. Sheikh Ghafoor, Sushil K. Prasad, Satish Puri. 311-313
- EduPar Keynote. Jack Dongarra. 314
- RAI: A Scalable Project Submission System for Parallel Programming Courses. Abdul Dakkak, Carl Pearson, Cheng Li, Wen-mei W. Hwu. 315-322
- Introducing Parallel and Distributed Computing to K12. Brian Broll, Ákos Lédeczi, Péter Völgyesi, János Sallai, Miklós Maróti, Chris Vanags. 323-330
- Log Visualization Tool for Message-Passing Programming in Pilot. Tianyi Bao, William B. Gardner. 331-338
- I Can Has Supercomputer? A Novel Approach to Teaching Parallel and Distributed Computing Concepts Using a Meme-Based Programming Language. David A. Richie, James A. Ross. 339-345
- Teaching Future Big Data Analysts: Curriculum and Experience Report. Joshua Eckroth. 346-351
- Hacking at the Divide Between Polar Science and HPC: Using Hackathons as Training Tools. Jane Wyngaard, Heather Lynch, Jaroslaw Nabrzyski, Allen Pope, Shantenu Jha. 352-359
- Preparing an Online Java Parallel Computing Course. Vivek Sarkar, Max Grossman, Zoran Budimlic, Shams Imam. 360-366
- A Laboratory Based Course on GPU Programming: Methods, Practices, and Lessons. Jawwad Ahmed Shamsi. 367-374
- Introduction to ParLearning Workshop. Anand Panangadan. 375-376
- ParLearning Keynotes. John Feo, Wei Tan. 377-378
- ExtDict: Extensible Dictionaries for Data- and Platform-Aware Large-Scale Learning. Azalia Mirhoseini, Bita Darvish Rouhani, Ebrahim M. Songhori, Farinaz Koushanfar. 379-388
- Coded TeraSort. Songze Li, Sucha Supittayapornpong, Mohammad Ali Maddah-Ali, Salman Avestimehr. 389-398
- Scaling Deep Learning Workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing. Nitin A. Gawande, Joshua B. Landwehr, Jeff A. Daily, Nathan R. Tallent, Abhinav Vishnu, Darren J. Kerbyson. 399-408
- Efficient and Portable ALS Matrix Factorization for Recommender Systems. Jing Chen, Jianbin Fang, Weifeng Liu, Tao Tang, Xuhao Chen, Canqun Yang. 409-418
- Large-Scale Stochastic Learning Using GPUs. Thomas P. Parnell, Celestine Dünner, Kubilay Atasu, Manolis Sifalakis, Haris Pozidis. 419-428
- Distributed and in-Situ Machine Learning for Smart-Homes and Buildings: Application to Alarm Sounds Detection. Amaury Durand, Yanik Ngoko, Christophe Cérin. 429-432
- The New Large-Scale RNNLM System Based on Distributed Neuron. Dejiao Niu, Rui Xue, Tao Cai, Hai Li, Kingsley Effah, Hang Zhang. 433-436
- Cache Friendly Parallelization of Neural Encoder-Decoder Models Without Padding on Multi-core Architecture. Yuchen Qiao, Kazuma Hashimoto, Akiko Eriguchi, Haixia Wang, Dongsheng Wang, Yoshimasa Tsuruoka, Kenjiro Taura. 437-440
- Introduction to PDCO Workshop. Grégoire Danoy, Didier El Baz. 441
- A Parallel Approximation Algorithm for Scheduling Parallel Identical Machines. Laleh Ghalami, Daniel Grosu. 442-451
- Communication Aware task Placement for Workflow Scheduling on DaaS-Based Cloud. Hadrien Croubois, Eddy Caron. 452-461
- Dynamic Mapping of Application Workflows in Heterogeneous Computing Environments. Muhammad Qasim, Touseef Iqbal, Ehsan Ullah Munir, Nikos Tziritas, Samee U. Khan, Laurence T. Yang. 462-471
- Load-Aware Strategies for Cloud-Based VoIP Optimization with VM Startup Prediction. Jorge M. Cortés-Mendoza, Andrei Tchernykh, Igor Bychkov, Alexander Feoktistov, Pascal Bouvry, Loic Didelot. 472-481
- Multiobjective Vehicle-type Scheduling in Urban Public Transport. David Pena, Andrei Tchernykh, Sergio Nesmachnow, Renzo Massobrio, Alexander Feoktistov, Igor Bychkov. 482-491
- A new Co-evolutionary Algorithm Based on Constraint Decomposition. Emmanuel Kieffer, Grégoire Danoy, Pascal Bouvry, Anass Nagih. 492-500
- Training Many Neural Networks in Parallel via Back-Propagation. Javier A. Cruz-Lopez, Vincent Boyer, Didier El Baz. 501-509
- Design of Metaheuristic Based on Machine Learning: A Unified Approach. Amir Nakib, Mohamed Hilia, Frederic Heliodore, El-Ghazali Talbi. 510-518
- Shared Memory Parallel Subgraph Enumeration. Raphael Kimmig, Henning Meyerhenke, Darren Strash. 519-529
- Exploration of de Bruijn Graph Filtering for de novo Assembly Using GraphLab. Julien Collet, Tanguy Sassolas, Yves Lhuillier, Renaud Sirdey, Jacques Carlier. 530-539
- An Efficient CPP Solution for Resilience-Oriented SDN Controller Deployment. He Li, Robson Eduardo De Grande, Azzedine Boukerche. 540-549
- Optimal Bandwidth Selection for Kernel Regression Using a Fast Grid Search and a GPU. Chris Rohlfs, Mohamed Zahran. 550-556
- Space-Efficient Pointwise Computation of the Distance Transform on GPUs. Numair Khan, Mohamed Zahran. 557-566
- Optimizing One-Sided Communication of Parallel Applications Using Critical Path Methods. Christian Herold, Olaf Krzikalla, Andreas Knüpfer. 567-576
- Introduction to GABB Workshop. Aydin Buluç, Tim Mattson. 577
- GABB Keynote. Ümit V. Çatalyürek. 578
- Breadth-First Search with A Multi-Core Computer. Maryia Belova, Ming Ouyang. 579-587
- Order or Shuffle: Empirically Evaluating Vertex Order Impact on Parallel Graph Computations. George M. Slota, Sivasankaran Rajamanickam, Kamesh Madduri. 588-597
- A Study of Graph Decomposition Algorithms for Parallel Symmetry Breaking. Sayyad Nayyaroddeen, Mahak Gambhir, Kishore Kothapalli. 598-607
- Constructing Adjacency Arrays from Incidence Arrays. Hayden Jananthan, Karia Dibert, Jeremy Kepner. 608-615
- Mini-Gunrock: A Lightweight Graph Analytics Framework on the GPU. Yangzihao Wang, Sean Baxter, John D. Owens. 616-626
- Algebraic Multigrid for Least Squares Problems on Graphs with Applications to HodgeRank. Charles Colley, Junyuan Lin, Xiaozhe Hu, Shuchin Aeron. 627-636
- Deriving Streaming Graph Algorithms from Static Definitions. David Ediger, James P. Fairbanks. 637-642
- Design of the GraphBLAS API for C. Aydin Buluç, Tim Mattson, Scott McMillan, José E. Moreira, Carl Yang. 643-652
- A Linear Algebra-Based Programming Interface for Graph Computations in Scala and Spark. William P. Horn, Gabriel Tanase, Hao Yu, Pratap Pattnaik. 653-659
- Introduction to AsHES Workshop. Sunita Chandrasekaran. 660
- AsHES Keynote. Tim Mattson. 661
- Implementing the OpenACC Data Model. Michael Wolfe, Seyong Lee, Jungwon Kim, Xiaonan Tian, Rengan Xu, Sunita Chandrasekaran, Barbara M. Chapman. 662-672
- Exploring Translation of OpenMP to OpenACC 2.5: Lessons Learned. Sergio Pino, Lori L. Pollock, Sunita Chandrasekaran. 673-682
- Exploring the Performance Benefit of Hybrid Memory System on HPC Environments. Ivy Bo Peng, Roberto Gioiosa, Gokcen Kestor, Pietro Cicotti, Erwin Laure, Stefano Markidis. 683-692
- Performance-Portable Sparse Matrix-Matrix Multiplication for Many-Core Architectures. Mehmet Deveci, Christian Trott, Sivasankaran Rajamanickam. 693-702
- Time and Energy to Solution Evaluation for the Three-Point Angular Correlation Function. Antonio Gómez-Iglesias, Miguel Cárdenas Montes. 703-712
- Auto-Tuning Strategies for Parallelizing Sparse Matrix-Vector (SpMV) Multiplication on Multi- and Many-Core Processors. Kaixi Hou, Wu-chun Feng, Shuai Che. 713-722
- A Pluggable Framework for Composable HPC Scheduling Libraries. Max Grossman, Vivek Kumar, Nick Vrvilo, Zoran Budimlic, Vivek Sarkar. 723-732
- Static Versus Dynamic Task Scheduling of the LU Factorization on ARM big.LITTLE Architectures. Sandra Catalán, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí, José R. Herrero. 733-742
- Benchmarking SW26010 Many-Core Processor. Zhigeng Xu, James Lin, Satoshi Matsuoka. 743-752
- Introduction to HIPS Workshop. Bo Wu, Andreas Knüpfer. 753-754
- HIPS Keynote. Zizhong Chen. 755
- Performance Study of Multithreaded MPI and OpenMP Tasking in a Large Scientific Code. Dana Akhmetova, Roman Iakymchuk, Örjan Ekeberg, Erwin Laure. 756-765
- Comparison of Threading Programming Models. Solmaz Salehian, Jiawen Liu, YongHong Yan. 766-774
- Annotation-Based Parallelization of Java Code. Mostafa Mehrabi, Nasser Giacaman, Oliver Sinnen. 775-784
- Using LLVM for Optimized Lightweight Binary Re-Writing at Runtime. Alexis Engelke, Josef Weidendorfer. 785-794
- Snowflake: A Lightweight Portable Stencil DSL. Nathan Zhang, Michael Driscoll, Charles Markley, Samuel Williams, Protonu Basu, Armando Fox. 795-804
- Enabling One-Sided Communication Semantics on ARM. Pavel Shamis, M. Graham Lopez, Gilad Shainer. 805-813
- Towards a Language Framework for Thick Control Flows. Jari-Matti Mäkelä, Martti Forsell, Ville Leppänen. 814-823
- Pure Concurrent Programming. Benjamin J. L. Wang, Uwe R. Zimmer. 824-831
- Introduction to APDCM Workshop. Oscar H. Ibarra, Koji Nakano. 832
- APDCM Keynote. Hong Shen. 833
- Complete Visibility for Mobile Agents with Lights Tolerating a Faulty Agent. Aisha Aljohani, Gokarna Sharma. 834-843
- A Self-Stabilizing Algorithm for Constructing (1, 1)-Maximal Directed Acyclic Graph. YongHwan Kim, Haruka Ohno, Yoshiaki Katayama, Toshimitsu Masuzawa. 844-853
- Fault Tolerance for Cooperative Lifeline-Based Global Load Balancing in Java with APGAS and Hazelcast. Jonas Posner, Claudia Fohry. 854-863
- Applications of Ear Decomposition to Efficient Heterogeneous Algorithms for Shortest Path/Cycle Problems. Debarshi Dutta, Meher Chaitanya, Kishore Kothapalli, Debajyoti Bera. 864-873
- Co-Scheduling Algorithms for Cache-Partitioned Systems. Guillaume Aupy, Anne Benoit, Loïc Pottier, Padma Raghavan, Yves Robert, Manu Shantharam. 874-883
- Minimizing I/Os in Out-of-Core Task Tree Scheduling. Loris Marchal, Samuel McCauley, Bertrand Simon, Frédéric Vivien. 884-893
- Approximate Count and Queue Objects in Transactional Memory. Basem Assiri, Costas Busch. 894-903
- Assessing NUMA Performance Based on Hardware Event Counters. Max Plauth, Christoph Sterz, Felix Eberhardt, Frank Feinbube, Andreas Polze. 904-913
- An Analysis of Resilience Techniques for Exascale Computing Platforms. Daniel Dauwe, Sudeep Pasricha, Anthony A. Maciejewski, Howard Jay Siegel. 914-923
- A Compression Method for Storage Formats of a Sparse Matrix in Solving the Large-Scale Linear Systems. Tomoki Kawamura, Yoneda Kazunori, Takashi Yamazaki, Takashi Iwamura, Masahiro Watanabe, Yasushi Inoguchi. 924-931
- Accelerating the Smith-Waterman Algorithm Using Bitwise Parallel Bulk Computation Technique on GPU. Takahiro Nishimura, Jacir Luiz Bordim, Yasuaki Ito, Koji Nakano. 932-941
- Photomosaic Generation by Rearranging Subimages, with GPU Acceleration. Yi Yang, Yasuaki Ito, Koji Nakano. 942-951
- HPPAC Workshop Introduction. Shuaiwen Leon Song, Richard W. Vuduc. 952
- HPPAC Keynote Talk. Kirk W. Cameron. 953
- Using Machine Learning for Data Center Cooling Infrastructure Efficiency Prediction. Hayk Shoukourian, Torsten Wilde, Detlef Labrenz, Arndt Bode. 954-963
- Design of an Energy Aware Petaflops Class High Performance Cluster Based on Power Architecture. Wissam Abu Ahmad, Andrea Bartolini, Francesco Beneventi, Luca Benini, Andrea Borghesi, Marco Cicala, Privato Forestieri, Cosimo Gianfreda, Daniele Gregori, Antonio Libri, Filippo Spiga, Simone Tinti. 964-973
- Towards a Unified Monitoring Framework for Power, Performance and Thermal Metrics: A Case Study on the Evaluation of HPC Cooling Systems. Aniruddha Marathe, Ghaleb Abdulla, Barry L. Rountree, Kathleen Shoga. 974-983
- When Good Enough Is Better: Energy-Aware Scheduling for Multicore Servers. Xinning Hui, Zhihui Du, Jason Liu, Hongyang Sun, Yuxiong He, David A. Bader. 984-993
- A Runtime Workload Distribution with Resource Allocation for CPU-GPU Heterogeneous Systems. Shouq Alsubaihi, Jean-Luc Gaudiot. 994-1003
- Power Measurements of Hartree-Fock Algorithms Using Different Storage Devices. Vladimir Mironov, Alexander Moskovsky, Yuri Alexeev. 1004-1011
- A Statistical Approach to Power Estimation for x86 Processors. Mohak Chadha, Thomas Ilsche, Mario Bielert, Wolfgang E. Nagel. 1012-1019
- Introduction to HPBDC Workshop. Xiaoyi Lu, Jianfeng Zhan, Dhabaleswar K. Panda. 1020
- Performance Evaluation of Scale-Free Graph Algorithms in Low Latency Non-volatile Memory. Manu Shantharam, Keita Iwabuchi, Pietro Cicotti, Laura Carrington, Maya Gokhale, Roger A. Pearce. 1021-1028
- High-Performance Data Analytics Beyond the Relational and Graph Data Models with GEMS. Vito Giovanni Castellana, Marco Minutoli, Shreyansh Bhatt, Khushbu Agarwal, Arthur Bleeker, John Feo, Daniel G. Chavarría-Miranda, David J. Haglin. 1029-1038
- Graph Analytics: Complexity, Scalability, and Architectures. Peter M. Kogge. 1039-1047
- Spark and HPC for High Energy Physics Data Analyses. Saba Sehrish, Jim Kowalkowski, Marc F. Paterno. 1048-1057
- The Consistency Analysis of Secondary Index on Distributed Ordered Tables. Houliang Qi, Xu Chang, Xingwu Liu, Li Zha. 1058-1067
- BigDataBench-S: An Open-Source Scientific Big Data Benchmark Suite. Xinhui Tian, Shaopeng Dai, Zhihui Du, Wanling Gao, Rui Ren, Yaodong Cheng, Zhifei Zhang, Zhen Jia, Peijian Wang, Jianfeng Zhan. 1068-1077
- Scalable Architecture for Anomaly Detection and Visualization in Power Generating Assets. Paras Jain, Chirag Tailor, Sam Ford, Liexiao Ding, Michael Phillips, Fang Cherry Liu, Nagi Gebraeel, Duen Horng Chau. 1078-1082
- Introduction to CHIUW Workshop. Tom MacDonald, Michael Ferguson. 1083-1084
- CHIUW Keynote. Jonathan Dursi. 1085
- Identifying Use-After-Free Variables in Fire-and-Forget Tasks. Jyothi Krishna V. S, Vassily Litvinov. 1086-1094
- Towards a GraphBLAS Library in Chapel. Ariful Azad, Aydin Buluç. 1095-1104
- Comparative Performance and Optimization of Chapel in Modern Manycore Architectures. Engin Kayraklioglu, Wo Chang, Tarek A. El-Ghazawi. 1105-1114
- Introduction to PDSEC Workshop. Peter E. Strazdins, Keita Teranishi, Raphaël Couturier, Joseph Antony, Thomas Rauber, Gudula Rünger, Laurence T. Yang. 1115-1116
- PDSEC Keynote. Pavan Balaji. 1117
- Improving Performance of GMRES by Reducing Communication and Pipelining Global Collectives. Ichitaro Yamazaki, Mark Hoemmen, Piotr Luszczek, Jack Dongarra. 1118-1127
- Simultaneously Solving Swarms of Small Sparse Systems on SIMD Silicon. Bryce Adelstein-Lelbach, Hans Johansen, Samuel Williams. 1128-1137
- Sparse Supernodal Solver Using Block Low-Rank Compression. Gregoire Pichon, Eric Darve, Mathieu Faverge, Pierre Ramet, Jean Roman. 1138-1147
- Task-Parallel LU Factorization of Hierarchical Matrices Using OmpSs. José Ignacio Aliaga, Rocio Carratala-Saez, Ronald Kriemann, Enrique S. Quintana-Ortí. 1148-1157
- Parallel Particle-in-Cell Performance Optimization: A Case Study of Electrospray Simulation. Ramachandran Kodanganallur Narayanan, Kamesh Madduri. 1158-1167
- Efficient Data Structures for a Hybrid Parallel and Vectorized Particle-in-Cell Code. Yann Barsamian, Sever A. Hirstoaga, Eric Violard. 1168-1177
- A Locality-Based Threading Algorithm for the Configuration-Interaction Method. Hongzhang Shan, Samuel Williams, Calvin W. Johnson, Kenneth McElvain. 1178-1187
- Architecting the Discontinuous Deformation Analysis Method Pipeline on the GPU. Yunfan Xiao, Min Huang, Qinghai Miao, Jun Xiao, Ying Wang. 1188-1197
- Redesigning OP2 Compiler to Use HPX Runtime Asynchronous Techniques. Zahra Khatami, Hartmut Kaiser, J. Ramanujam. 1198-1207
- Automated Dynamic Data Redistribution. Thomas Marrinan, Joseph A. Insley, Silvio Rizzi, Francois Tessier, Michael E. Papka. 1208-1215
- An Application-Aware Data Replacement Policy for Interactive Large-Scale Scientific Visualization. Lina Yu, Hongfeng Yu, Hong Jiang, Jun Wang. 1216-1225
- Scalable Hierarchical Multipole Methods Using an Asynchronous Many-Tasking Runtime System. Jackson DeBuhr, Bo Zhang, Luke Dalessandro. 1226-1234
- Introduction to JSSPP Workshop. Walfredo Cirne, Narayan Desai, Dalibor Klusácek. 1235-1236
- Introduction to DPDNS Workshop. Dimiter R. Avresky, Erik Maehle. 1237
- Reliability Calculation of P2P Streaming Systems with Bottleneck Links. Satoshi Fujita. 1238-1244
- Lifetime and Full-View Coverage Guarantees Through Distributed Algorithms in Camera Sensor Networks. Chaoyang Li, Anu G. Bourgeois. 1245-1250
- A Small-Scale Testbed for Large-Scale Reliable Computing. Jason St. John, Thomas J. Hacker. 1251-1258
- LSTM-Based Memory Profiling for Predicting Data Attacks in Distributed Big Data Systems. Santosh Aditham, Nagarajan Ranganathan, Srinivas Katkoori. 1259-1267
- An Outlook on Volunteer and Croudsourcing Based Computing. Salvatore Distefano, Samuele Rodi. 1268-1273
- Exploring the Effect of Compiler Optimizations on the Reliability of HPC Applications. Rizwan Ashraf, Roberto Gioiosa, Gokcen Kestor, Ronald F. DeMara. 1274-1283
- IPDRM Workshop Introduction. Shuaiwen Leon Song, Torsten Hoefler. 1284
- Characterizing and Improving the Performance of Many-Core Task-Based Parallel Programming Runtimes. Jaume Bosch, Xubin Tan, Carlos Álvarez, Daniel Jiménez-González, Xavier Martorell, Eduard Ayguadé. 1285-1292
- A Memory Heterogeneity-Aware Runtime System for Bandwidth-Sensitive HPC Applications. Kavitha Chandrasekar, Xiang Ni, Laxmikant V. Kalé. 1293-1300
- SmartBlock: An Approach to Standardizing In Situ Workflow Components. Alexis Champsaur, Jay F. Lofstead, Jai Dayal, Matthew Wolf, Greg Eisenhauer, Patrick M. Widener, Ada Gavrilovska. 1301-1308
- A Case Study in Computational Caching Microservices for HPC. John Jenkins, Galen M. Shipman, Jamaludin Mohd-Yusof, Kipton Barros, Philip H. Carns, Robert B. Ross. 1309-1316
- A Load-Balanced Parallel and Distributed Sorting Algorithm Implemented with PGX.D. Zahra Khatami, Sungpack Hong, Jinsoo Lee, Siegfried Depner, Hassan Chafi, J. Ramanujam, Hartmut Kaiser. 1317-1324
- Performance Prediction of HPC Applications on Intel Processors. Carlos Rosales, Antonio Gómez-Iglesias, Si Liu, Feng Chen, Lei Huang, Hang Liu, Antia Lamas-Linares, John Cazes. 1325-1332
- vPHI: Enabling Xeon Phi Capabilities in Virtual Machines. Stefanos Gerangelos, Nectarios Koziris. 1333-1340
- Introduction to iWAPT Workshop. Osni Marques, Reiji Suda. 1341
- A Sampling Based Strategy to Automatic Performance Tuning of GPU Programs. Wilson Feng, Tarek S. Abdelrahman. 1342-1349
- Use of Synthetic Benchmarks for Machine-Learning-Based Performance Auto-Tuning. Tianyi David Han, Tarek S. Abdelrahman. 1350-1361
- Automating Compiler-Directed Autotuning for Phased Performance Behavior. Tharindu Rusira, Mary W. Hall, Protonu Basu. 1362-1371
- A Customizable Auto-Tuning Scenario with User-Defined Code Transformations. Hiroyuki Takizawa, Daichi Sato, Shoichi Hirasawa, Daisuke Takahashi. 1372-1378
- Online-Autotuning in the Presence of Algorithmic Choice. Philip Pfaffe, Martin Tillmann, Sigmar Walter, Walter F. Tichy. 1379-1388
- Performance Analysis and Optimization of Sparse Matrix-Vector Multiplication on Intel Xeon Phi. Athena Elafrou, Georgios I. Goumas, Nectarios Koziris. 1389-1398
- Auto-Tuning on NUMA and Many-Core Environments with an FDM Code. Takahiro Katagiri, Satoshi Ohshima, Masaharu Matsumoto. 1399-1407
- Autotuning Batch Cholesky Factorization in CUDA with Interleaved Layout of Matrices. Mark Gates, Jakub Kurzak, Piotr Luszczek, Yu Pei, Jack Dongarra. 1408-1417
- Quadruple-Precision BLAS Using Bailey's Arithmetic with FMA Instruction: Its Performance and Applications. Susumu Yamada, Toshiyuki Imamura, Takuya Ina, Narimasa Sasa, Yasuhiro Idomura, Masahiko Machida. 1418-1425
- Fast Multidimensional Performance Parameter Estimation with Multiple One-Dimensional d-Spline Parameter Search. Masayoshi Mochizuki, Akihiro Fujii, Teruo Tanaka. 1426-1433
- Algorithmic Performance-Accuracy Trade-off in 3D Vision Applications Using HyperMapper. Luigi Nardi, Bruno Bodin, Sajad Saeedi, Emanuele Vespa, Andrew J. Davison, Paul H. J. Kelly. 1434-1443
- Introduction to ParSocial Workshop. Eunice E. Santos, John Korah. 1444-1445
- ParSocial Keynote. Boleslaw Szymanski. 1446
- Predicting Viral News Events in Online Media. Xiaoyan Lu, Boleslaw K. Szymanski. 1447-1456
- Mobile Crowdsensing from a Selfish Routing Perspective. Julia Buwaya, José D. P. Rolim. 1457-1463
- Parallel Computing for Machine Learning in Social Network Analysis. George Cybenko. 1464-1471
- Work Partitioning on Parallel and Distributed Agent-Based Simulation. Gennaro Cordasco, Carmine Spagnuolo, Vittorio Scarano. 1472-1481
- Parallel k-Core Decomposition on Multicore Platforms. Humayun Kabir, Kamesh Madduri. 1482-1491
- Endogenous Social Networks from Large-Scale Agent-Based Models. Eric Tatara, Nicholson T. Collier, Jonathan Ozik, Charles M. Macal. 1492-1499
- Fast Parallel Graph Triad Census and Triangle Counting on Shared-Memory Platforms. Sindhuja Parimalarangan, George M. Slota, Kamesh Madduri. 1500-1509
- Efficient Anytime Anywhere Algorithms for Vertex Additions in Large and Dynamic Graphs. Eunice E. Santos, John Korah, Vairavan Murugappan, Suresh Subramanian. 1510-1519
- Accelerating Topic Exploration of Multi-Dimensional Documents. Wen-Jing Hsu, You Lu, Zhuo Qi Lee. 1520-1527
- Introduction to BigDataEco Workshop. Chaitan Baru, Fen Zhao, Joanna Chan. 1528
- Introduction to GraML Workshop. Antonino Tumeo, Mahantesh Halappanavar, John Feo. 1529-1530
- GraML Keynote. Sujith Ravi. 1531
- Learning on Graphs for Predictions of Fracture Propagation, Flow and Transport. Hristo Djidjev, Daniel O'Malley, Hari S. Viswanathan, Jeffrey D. Hyman, Satish Karra, Gowri Srinivasan. 1532-1539
- Analyzing Community Structure in Networks. Hongyuan Zhan, Kamesh Madduri. 1540-1549
- Compound Analytics: Templates for Integrating Graph Algorithms and Machine Learning. Ronald D. Hagan, Charles A. Phillips, Bradley J. Rhodes, Michael A. Langston. 1550-1556
- Introduction to EMBRACE Workshop. David Bader. 1557
- EMBRACE Keynote. Torsten Hoefler. 1558
- Introduction to REPPAR Workshop. Sascha Hunold, Arnaud Legrand, Lucas Nussbaum. 1559
- REPPAR Keynote. Todd Gamblin. 1560
- The Popper Convention: Making Reproducible Systems Evaluation Practical. Ivo Jimenez, Michael Sevilla, Noah Watkins, Carlos Maltzahn, Jay F. Lofstead, Kathryn Mohror, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau. 1561-1570
- Towards Trustworthy Testbeds Thanks to Throughout Testing. Lucas Nussbaum. 1571-1578
- Examining the Reproducibility of Using Dynamic Loop Scheduling Techniques in Scientific Applications. Franziska Hoffeins, Florina M. Ciorba, Ioana Banicescu. 1579-1587
- Characterizing the Performance of Modern Architectures Through Opaque Benchmarks: Pitfalls Learned the Hard Way. Luka Stanisic, Lucas Mello Schnorr, Augustin Degomme, Franz C. Heinrich, Arnaud Legrand, Brice Videau. 1588-1597
- Towards Reproducible Blocked LU Factorization. Roman Iakymchuk, Enrique S. Quintana-Ortí, Erwin Laure, Stef Graillat. 1598-1607