- Introduction to HCW 2018. Alexey L. Lastovetsky, Sudeep Pasricha. 1
- Message from the HCW Steering Committee Chair. Behrooz A. Shirazi. 2
- Message from the HCW General Chair. Alexey L. Lastovetsky. 3
- Message from the HCW Program Committee Chair. Sudeep Pasricha. 4
- HCW 2018 Keynote Talk 1. Manish Parashar. 5
- HCW 2018 Keynote Talk 2. Ümit V. Çatalyürek. 6
- User-Transparent Translation of Machine Instructions to Programmable Hardware. Leslie Barron, Tarek S. Abdelrahman. 7-14
- Budget-Aware Scheduling Algorithms for Scientific Workflows with Stochastic Task Weights on Heterogeneous IaaS Cloud Platforms. Yves Caniou, Eddy Caron, Aurelie Kong Win Chang, Yves Robert. 15-26
- Optimizing Parallel Reduction on OpenCL FPGA Platform - A Case Study of Frequent Pattern Compression. Zheming Jin, Hal Finkel. 27-35
- Approximation Algorithm for Scheduling Applications on Hybrid Multi-core Machines with Communications Delays. Massinissa Ait Aba, Lilia Zaourar, Alix Munier. 36-45
- Exploration and Design of a Synchronous Message Passing Framework for a CPU-NPU Heterogeneous Architecture. Sean Pennefather, Karen Bradshaw, Barry Irwin. 46-56
- Large Scale Data Centers Simulation Based on Baseline Test Model. Fei Lei, Lei Yu, Bing Shao, Fei Teng, Bo Zhou. 57-68
- Application Performance on a Cluster-Booster System. Anke Kreuzer, Norbert Eicker, Jorge Amaya, Estela Suarez. 69-78
- Introduction to RAW 2018. Marco D. Santambrogio, Diana Goehringer, Dirk Stroobandt, Ken Eguro. 79-80
- RAW 2018 Invited Talks. Jürgen Becker, Viktor K. Prasanna, Markus Weimer, Wayne Luk, Kaveh Aasaraai, Derek Chiou. 81-82
- Transport-Triggered Soft Cores. Pekka Jääskeläinen, Aleksi Tervo, Guillermo Payá Vayá, Timo Viitanen, Nicolai Behmann, Jarmo Takala, Holger Blume. 83-90
- OXiGen: A Tool for Automatic Acceleration of C Functions Into Dataflow FPGA-Based Kernels. Francesco Peverelli, Marco Rabozzi, Emanuele Del Sozzo, Marco D. Santambrogio. 91-98
- RAM as a Network Managed Resource. William E. Allcock, Bennett Bernardoni, Colleen Bertoni, Neil Getty, Joseph A. Insley, Michael E. Papka, Silvio Rizzi, Brian R. Toonen. 99-106
- MAX-PolyMem: High-Bandwidth Polymorphic Parallel Memories for DFEs. Catalin Bogdan Ciobanu, Giulio Stramondo, Cees de Laat, Ana Lucia Varbanescu. 107-114
- An FPGA-Based Acceleration Methodology and Performance Model for Iterative Stencils. Enrico Reggiani, Giuseppe Natale, Carlo Moroni, Marco D. Santambrogio. 115-122
- High-Performance High-Order Stencil Computation on FPGAs Using OpenCL. Hamid Reza Zohouri, Artur Podobas, Satoshi Matsuoka. 123-130
- TiReX: Tiled Regular eXpression Matching Architecture. Alessandro Comodi, Davide Conficconi, Alberto Scolari, Marco D. Santambrogio. 131-137
- Hardware Implementation of POSITs and Their Application in FPGAs. Artur Podobas, Satoshi Matsuoka. 138-145
- Robustness of Surface EMG Classifiers with Fixed-Point Decomposition on Reconfigurable Architecture. Luca Cerina, Giuseppe Franco, Pierandrea Cancian, Marco D. Santambrogio. 146-153
- Hardware/Software Codesign for Convolutional Neural Networks Exploiting Dynamic Partial Reconfiguration on PYNQ. Florian Kastner, Benedikt Janßen, Frederik Kautz, Michael Hübner, Giulio Corradi. 154-161
- Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform. Chaim Baskin, Natan Liss, Evgenii Zheltonozhskii, Alexander M. Bronstein, Avi Mendelson. 162-169
- A Framework with Cloud Integration for CNN Acceleration on FPGA Devices. Niccolo Raspa, Giuseppe Natale, Marco Bacis, Marco D. Santambrogio. 170-177
- Kibo: An Open-Source Fixed-Point Tool-kit for Training and Inference in FPGA-Based Deep Learning Networks. Daniel Holanda Noronha, Philip Heng Wai Leong, Steven J. E. Wilton. 178-185
- A Reconfigurable Accelerator for Morphological Operations. Menbere Kina Tekleyohannes, Christian Weis, Norbert Wehn, Martin Klein, Michael Siegrist. 186-193
- MP-STREAM: A Memory Performance Benchmark for Design Space Exploration on Heterogeneous HPC Devices. Syed Waqar Nabi, Wim Vanderbauwhede. 194-197
- FIDA: A Framework to Automatically Integrate FPGA Kernels Within Data-Science Applications. Luca Stornaiuolo, Alberto Parravicini, Donatella Sciuto, Marco D. Santambrogio. 198-201
- High-Level Reliability Evaluation of Reconfiguration-Based Fault Tolerance Techniques. Tien Thanh Nguyen, Mathieu Thevenin, Anthony Mouraud, Gwenolé Corre, Olivier Pasquier, Sébastien Pillement. 202-205
- Dynamic Reconfiguration for Real-Time Automotive Embedded Systems in Fail-Operational Context. Florian Oszwald, Jürgen Becker, Philipp Obergfell, Matthias Traub. 206-209
- FPGA Implementation of Pattern Matching for Industrial Control Systems. Peter Rouget, Benoît Badrignans, Pascal Benoit, Lionel Torres. 210-213
- A Parallel, Energy Efficient Hardware Architecture for the merAligner on FPGA Using Chisel HCL. Lorenzo Di Tucci, Davide Conficconi, Alessandro Comodi, Steven A. Hofmeyr, David Donofrio, Marco D. Santambrogio. 214-217
- Redundant Binary to Two's Complement Converter on FPGAs Through Fabric Aware Scan Based Encoding Approach for Fault Localization Support. Ayan Palchaudhuri, Anindya Sundar Dhar. 218-221
- An Application-Specific Memory Management Unit for FPGA-SoCs. Matthias Göbel, Ilja Behnke, Ahmed Elhossini, Ben H. H. Juurlink. 222-225
- Introduction to HiCOMB 2018. Srinivas Aluru, David A. Bader, Paul Medvedev. 226
- HiCOMB Keynote 1. James Taylor. 227
- HiCOMB Keynote 2. Onur Mutlu. 228
- GraphNER: Using Corpus Level Similarities and Graph Propagation for Named Entity Recognition. Golnar Sheikhshab, Elizabeth Starks, Aly Karsan, Readman Chiu, Anoop Sarkar, Inanç Birol. 229-238
- Modifying HMMER3 to Run Efficiently on the Cori Supercomputer Using OpenMP Tasking. William Arndt. 239-246
- Rerooting Trees Increases Opportunities for Concurrent Computation and Results in Markedly Improved Performance for Phylogenetic Inference. Daniel L. Ayres, Michael P. Cummings. 247-256
- Sequence Alignment Through the Looking Glass. Raja Appuswamy, Jacques Fellay, Nimisha Chaturvedi. 257-266
- Introduction to GABB 2018. Tim Mattson. 267
- Graph Algorithms in the Language of Linear Algebra: How Did We Get Here, and Where Do We Go Next? John R. Gilbert. 268
- Spectral Graph Drawing: Building Blocks and Performance Analysis. Shad Kirmani, Kamesh Madduri. 269-277
- Parallel Generation of Large-Scale Random Graphs. Anil Kumar S. Vullikanti. 278
- Design, Generation, and Validation of Extreme Scale Power-Law Graphs. Jeremy Kepner, Siddharth Samsi, William Arcand, David Bestor, Bill Bergeron, Tim Davis, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Anna Klein, Peter Michaleas, Roger Pearce, Lauren Milechin, Julie Mullen, Andrew Prout, Antonio Rosa, Geoffrey Sanders, Charles Yee, Albert Reuther. 279-286
- On Large-Scale Graph Generation with Validation of Diverse Triangle Statistics at Edges and Vertices. Geoffrey Sanders, Roger Pearce, Timothy La Fond, Jeremy Kepner. 287-296
- Patterns of GraphBLAS Algorithms: Tales from the Trenches. Scott McMillan. 297
- Implementing the GraphBLAS C API. José E. Moreira, Manoj Kumar, William P. Horn. 298-309
- PyGB: GraphBLAS DSL in Python with Dynamic Compilation Into Efficient C++. Jesse Chamberlin, Marcin Zalewski, Scott McMillan, Andrew Lumsdaine. 310-319
- A Survey of Modern Analysis on Graphs: Open Problems. Chris Long. 320
- Introduction to EduPar 2018. Martina Barnas, Sushil K. Prasad, Satish Puri. 321-322
- EduPar 2018 Keynote. Alexandru Iosup. 323
- ParallelAR: An Augmented Reality App and Instructional Approach for Learning Parallel Programming Scheduling Concepts. Marin Abernethy, Oliver Sinnen, Joel Adams, Giuseppe De Ruvo, Nasser Giacaman. 324-331
- Learning from Optimizing Matrix-Matrix Multiplication. Devangi N. Parikh, Jianyu Huang, Margaret E. Myers, Robert A. van de Geijn. 332-339
- An Entertaining Approach to Parallel Programming Education. Emanuel Buzek, Martin Krulis. 340-346
- Predicting Success in Undergraduate Parallel Programming via Probabilistic Causality Analysis. Sunny Raj, Sumit Kumar Jha. 347-352
- A Comprehensive Course on Big Data for Undergraduate Students. Jawwad Ahmed Shamsi, Syed Zain ul Hassan, Narmeen Bawany, Nausheen Shoaib. 353-360
- Experiences on Teaching Parallel and Distributed Computing for Undergraduates. Erik Saule. 361-368
- Teaching Parallel Programming with Active Learning. Mohammad Amin Kuhail, Spencer Cook, Joshua W. Neustrom, Praveen Rao. 369-376
- Teaching Big Data and Cloud Computing: A Modular Approach. Debzani Deb, Sebastian Cousins, M. Muztaba Fuad. 377-383
- Introduction to HIPS 2018. Karl Fuerlinger, Philip C. Roth. 384-385
- HIPS 2018 Keynote. Christian Trott. 386
- Visualization of Multi-layer I/O Performance in Vampir. Hartmut Mix, Christian Herold, Matthias Weber. 387-394
- An Operational Semantic Basis for Building an OpenMP Data Race Checker. Simone Atzeni, Ganesh Gopalakrishnan. 395-404
- Unobtrusive Support for Asynchronous GUI Operations with Java Annotations. Mostafa Mehrabi, Nasser Giacaman, Oliver Sinnen. 405-414
- Non-intrusively Avoiding Scaling Problems in and out of MPI Collectives. Hongbo Li, Zizhong Chen, Rajiv Gupta, Min Xie. 415-424
- Modular Programming of Synchronization and Communication Among Tasks in Parallel Programs. Bernie van Veen, Sung-Shik Jongmans. 425-435
- Scalable Collectives for Distributed Asynchronous Many-Task Runtimes. Matthew Whitlock, Hemanth Kolla, Sean Treichler, Philippe P. Pébay, Janine C. Bennett. 436-445
- Introduction to HPBDC 2018. Xiaoyi Lu, Jianfeng Zhan, Dhabaleswar K. Panda. 446
- HPBDC 2018 Keynote. Geoffrey C. Fox. 447
- Improving I/O Performance Through Colocating Interrelated Input Data and Near-Optimal Load Balancing. Felix Seibert, Mathias Peters, Florian Schintke. 448-457
- How Well do CPU, GPU and Hybrid Graph Processing Frameworks Perform? Tanuj kr Aasawat, Tahsin Reza, Matei Ripeanu. 458-466
- EASIS: An Optimized Information Service for High Performance Computing Environment. Can Wu, Xiaoning Wang, Haili Xiao, Rongqiang Cao, Yining Zhao, Xuebin Chi. 467-476
- GPU Accelerated Self-Join for the Distance Similarity Metric. Michael Gowanlock, Ben Karsin. 477-486
- Implementing a Parallel Graph Clustering Algorithm with Sparse Matrix Computation. Jun Chen, Peigang Zou. 487-496
- atSNPInfrastructure, a Case Study for Searching Billions of Records While Providing Significant Cost Savings over Cloud Providers. Christopher Harrison, Sündüz Keles, Rebecca Hudson, Sunyoung Shin, Inês Dutra. 497-506
- Improvement of the Log Pattern Extracting Algorithm Using Text Similarity. Yining Zhao, Xiaodong Wang, Haili Xiao, Xuebin Chi. 507-514
- The Performance Analysis of Cache Architecture Based on Alluxio over Virtualized Infrastructure. Xu Chang, Li Zha. 515-519
- Introduction to AsHES 2018. Sunita Chandrasekaran, Antonio J. Peña, Min-Si. 520
- AsHES 2018 Keynote. Michael Wolfe. 521
- NVIDIA Tensor Core Programmability, Performance & Precision. Stefano Markidis, Steven Wei Der Chien, Erwin Laure, Ivy Bo Peng, Jeffrey S. Vetter. 522-531
- Optimizing an Atomics-Based Reduction Kernel on OpenCL FPGA Platform. Zheming Jin, Hal Finkel. 532-539
- Leveraging Data-Flow Task Parallelism for Locality-Aware Dynamic Scheduling on Heterogeneous Platforms. Osman Seckin Simsek, Andi Drebes, Antoniu Pop. 540-549
- Tacho: Memory-Scalable Task Parallel Sparse Cholesky Factorization. Kyungjoo Kim, H. Carter Edwards, Sivasankaran Rajamanickam. 550-559
- Sorting Large Datasets with Heterogeneous CPU/GPU Architectures. Michael Gowanlock, Ben Karsin. 560-569
- Improving Performance of Genomic Aligners on Intel Xeon Phi-Based Architectures. Shaolong Chen, Miquel A. Senar. 570-578
- An Initial Characterization of the Emu Chick. Eric Hein, Tom Conte, Jeffrey Young, Srinivas Eswar, Jiajia Li, Patrick Lavin, Richard W. Vuduc, E. Jason Riedy. 579-588
- Exploring the Vision Processing Unit as Co-Processor for Inference. Sergio Rivas-Gomez, Antonio J. Peña, David Moloney, Erwin Laure, Stefano Markidis. 589-598
- Introduction to PDCO 2018. Grégoire Danoy, Didier El Baz, Vincent Boyer, Bernabé Dorronsoro. 599-600
- On Integrating Population-Based Metaheuristics with Cooperative Parallelism. Jheisson López, Danny Munera, Daniel Diaz, Salvador Abreu. 601-608
- A Competitive Approach for Bi-Level Co-Evolution. Emmanuel Kieffer, Grégoire Danoy, Pascal Bouvry, Anass Nagih. 609-618
- A GPU Parallel Approximation Algorithm for Scheduling Parallel Identical Machines to Minimize Makespan. Yuanzhe Li, Laleh Ghalami, Loren Schwiebert, Daniel Grosu. 619-628
- A Survey on Parallel Genetic Algorithms for Shop Scheduling Problems. Jia Luo, Didier El Baz. 629-636
- Scalable b-Matching on GPUs. Md. Naim, Fredrik Manne. 637-646
- Automated Analysis of Task-Parallel Execution Behavior Via Artificial Neural Networks. Richard Neill, Andi Drebes, Antoniu Pop. 647-656
- Data Stream Processing at Network Edges. Thanasis Loukopoulos, Nikos Tziritas, Maria G. Koziri, George Stamoulis, Samee U. Khan, Cheng-Zhong Xu, Albert Y. Zomaya. 657-665
- WA-RRNS: Reliable Data Storage System Based on Multi-cloud. Andrei Tchernykh, Mikhail G. Babenko, Vanessa Miranda-López, Alexander Yu. Drozdov, Arutyun Avetisyan. 666-673
- Introduction to HPPAC 2018. Shuaiwen Leon Song, Natalie J. Bates, Ang Li. 674
- HPPAC 2018 Keynote. Gregory A. Koenig. 675
- DEEP-Mon: Dynamic and Energy Efficient Power Monitoring for Container-Based Infrastructures. Rolando Brondolin, Tommaso Sardelli, Marco D. Santambrogio. 676-684
- Energy and Power Aware Job Scheduling and Resource Management: Global Survey - Initial Analysis. Matthias Maiterth, Gregory A. Koenig, Kevin Pedretti, Siddhartha Jana, Natalie J. Bates, Andrea Borghesi, Dave Montoya, Andrea Bartolini, Milos Puzovic. 685-693
- Mitigating Critical Path Decompression Latency in Compressed L1 Data Caches Via Prefetching. Sean Rea, Ehsan Atoofian. 694-701
- Quality Assessment of GPU Power Profiling Mechanisms. Satyabrata Sen, Neena Imam, Chung-Hsing Hsu. 702-711
- System Monitoring with lo2s: Power and Runtime Impact of C-State Transitions. Thomas Ilsche, Robert Schöne, Philipp Joram, Mario Bielert, Andreas Gocht. 712-715
- Power and Performance Tradeoff of a Floating-Point Intensive Kernel on OpenCL FPGA Platform. Zheming Jin, Hal Finkel. 716-720
- Making a Case for Green High-Performance Visualization Via Embedded Graphics Processors. Vignesh Adhinarayanan, Bishwajit Dutta, Wu-chun Feng. 721-724
- A Comparison of Power Management Mechanisms: P-States vs. Node-Level Power Cap Control. Kevin T. Pedretti, Ryan E. Grant, James H. Laros III, Michael Levenhagen, Stephen L. Olivier, Lee Ward, Andrew J. Younge. 725-729
- Introduction to APDCM 2018. Oscar H. Ibarra, Koji Nakano, Akihiro Fujiwara, Susumu Matsumae. 730-731
- APDCM 2018 Keynote. Yuji Shinano. 732
- Survey: Computational Models for Asymmetric Read and Write Costs. Yan Gu. 733-743
- Implementation of Multioperations in Thick Control Flow Processors. Martti Forsell, Jussi Roivainen, Ville Leppänen, Jesper Larsson Träff. 744-752
- A Block Streaming Model for Irregular Applications. Anup Zope, Edward Luke. 753-762
- An Optimal Parallel Algorithm for Computing the Summed Area Table on the GPU. Yutaro Emoto, Shunji Funasaka, Hiroki Tokura, Takumi Honda, Koji Nakano, Yasuaki Ito. 763-772
- Barrier Synchronization: Simplified, Generalized, and Solved Without Mutual Exclusion. Alex Aravind. 773-782
- An Analysis of Multilevel Checkpoint Performance Models. Daniel Dauwe, Sudeep Pasricha, Anthony A. Maciejewski, Howard Jay Siegel. 783-792
- Combining Checkpointing and Replication for Reliable Execution of Linear Workflows. Anne Benoit, Aurélien Cavelan, Florina M. Ciorba, Valentin Le Fèvre, Yves Robert. 793-802
- Optimal Cooperative Checkpointing for Shared High-Performance Computing Platforms. Thomas Hérault, Yves Robert, Aurelien Bouteiller, Dorian C. Arnold, Kurt B. Ferreira, George Bosilca, Jack J. Dongarra. 803-812
- A Population Protocol for Uniform k-Partition Under Global Fairness. Hiroto Yasumi, Naoki Kitamura, Fukuhito Ooshita, Taisuke Izumi, Michiko Inoue. 813-819
- On the Cost of Cloud-Assistance in Tree-Structured P2P Live Streaming. Satoshi Fujita. 820-828
- Mutual Visibility for Robots with Lights Tolerating Light Faults. Gokarna Sharma. 829-836
- Joint Cooperative Protocols and Distributed Beamforming Design with Efficient Secondary User Selection for Multi-hop Cognitive Radio Networks. Wei Chen, Liang Hong, Sudeep Bhattarai, Tony Sanchez, Ebholo Ijieh, Stacie Severyn, Leonard E. Lightfoot. 837-844
- A Novel Handover Control Strategy Combined with Multi-hop Routing in LEO Satellite Networks. Chaofan Duan, Jing Feng, Haotian Chang, Bin Song, Zhikang Xu. 845-851
- Introduction to ParLearning 2018. Henri E. Bal, Arindam Pal, Azalia Mirhoseini, Thomas P. Parnell. 852-853
- ParLearning 2018 Invited Talk 1. Abhinav Vishnu. 854
- ParLearning 2018 Invited Talk 2. Azalia Mirhoseini. 855
- ParLearning 2018 Invited Talk 3. Thomas P. Parnell. 856
- Near-Optimal Straggler Mitigation for Distributed Gradient Methods. Songze Li, Seyed Mohammadreza Mousavi Kalan, Amir Salman Avestimehr, Mahdi Soltanolkotabi. 857-866
- Streaming Tiles: Flexible Implementation of Convolution Neural Networks Inference on Manycore Architectures. Nesma M. Rezk, Madhura Purnaprajna, Zain-ul-Abdin. 867-876
- Parallel Huge Matrix Multiplication on a Cluster with GPGPU Accelerators. Seungyo Ryu, Dongseung Kim. 877-882
- A Study of Clustering Techniques and Hierarchical Matrix Formats for Kernel Ridge Regression. Elizaveta Rebrova, Gustavo Chavez, Yang Liu, Pieter Ghysels, Xiaoye Sherry Li. 883-892
- Introduction to CHIUW 2018. Michael Ferguson, Nikhil Padmanabhan, Brad Chamberlain. 893-894
- CHIUW 2018 Keynote. Katherine A. Yelick. 895
- Parallel Sparse Tensor Decomposition in Chapel. Thomas B. Rolinger, Tyler A. Simon, Christopher D. Krieger. 896-905
- Iterator-Based Optimization of Imperfectly-Nested Loops. Daniel Feshbach, Mary Glaser, Michelle Strout, David G. Wonnacott. 906-914
- Investigating Data Layout Transformations in Chapel. Apan Qasem, Ashwin M. Aji, Michael L. Chu. 915-924
- RCUArray: An RCU-Like Parallel-Safe Distributed Resizable Array. Louis Jenkins. 925-933
- Purity: An Integrated, Fine-Grain, Data-Centric, Communication Profiler for the Chapel Language. Richard B. Johnson, Jeffrey K. Hollingsworth. 934-942
- Introduction to PDSEC 2018 and Keynotes. Peter Strazdins, Keita Teranishi, Raphaël Couturier, Joseph Antony, Thomas Rauber, Gudula Rünger, Laurence T. Yang. 943-946
- DM-HEOM: A Portable and Scalable Solver-Framework for the Hierarchical Equations of Motion. Matthias Noack, Alexander Reinefeld, Tobias Kramer, Thomas Steinke. 947-956
- Optimization of Reordering Procedures in HOTRG for Distributed Parallel Computing. Haruka Yamada, Akira Imakura, Toshiyuki Imamura, Tetsuya Sakurai. 957-966
- Energy and Performance Improvement of Parallel ODE Solvers by Application-Specific Program Transformations. Thomas Rauber, Gudula Rünger. 967-976
- The Scalability of Embedded Structured Grids and Unstructured Grids in Large Scale Ice Sheet Modeling on Distributed Memory Parallel Computers. Phillip M. Dickens, Christopher Dufour, James Fastook. 977-986
- TNT: A Solver for Large Dense Least-Squares Problems that Takes Conjugate Gradient from Bad in Theory, to Good in Practice. Joseph M. Myre, Erich Frahm, David J. Lilja, Martin O. Saar. 987-995
- An Energy-Efficient Asymmetric Multi-Processor for HPC Virtualization. Chung Lee, Peter Strazdins. 996-1005
- A Preliminary Port and Evaluation of the Uintah AMT Runtime on Sunway TaihuLight. Zhang Yang, Damodar Sahasrabudhe, Alan Humphrey, Martin Berzins. 1006-1015
- Improving CADNA Performance on GPUs. Pacôme Eberhart, Baptiste Landreau, Julien Brajard, Pierre Fortin, Fabienne Jézéquel. 1016-1025
- Evaluation of MD5Hash Kernel on OpenCL FPGA Platform. Zheming Jin, Hal Finkel. 1026-1032
- Performance Optimization of Fully Anisotropic Elastic Wave Propagation on 2nd Generation Intel® Xeon Phi(TM) Processors. Albert Farrés, Claudia Rosas, Mauricio Hanzich, Alejandro Duran, Charles Yount. 1033-1042
- Introduction to JSSPP 2018. Walfredo Cirne, Narayan Desai, Dalibor Klusácek. 1043-1044
- JSSPP 2018 Keynote. John Wilkes. 1045-1046
- Introduction to iWAPT 2018. Osni Marques, Reiji Suda, Jakub Kurzak, Akihiro Fujii. 1047
- iWAPT 2018 Invited Speaker 1. Sarah Knepper. 1048
- Use of Code Structural Features for Machine Learning to Predict Effective Optimizations. Yuki Kawarabatake, Mulya Agung, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa. 1049-1055
- Effective Machine Learning Based Format Selection and Performance Modeling for SpMV on GPUs. Israt Nisa, Charles Siegel, Aravind Sukumaran-Rajam, Abhinav Vishnu, P. Sadayappan. 1056-1065
- Tensile: Auto-Tuning GEMM GPU Assembly for All Problem Sizes. David E. Tanner. 1066-1075
- GreedyTalents: An Energy-Aware Auto-Tuning Method for Many-Core Processor. Timothy M. Platt, Zhiliu Yang, Chen Liu. 1076-1083
- Auto-Tuning for the Era of Relatively High Bandwidth Memory Architectures: A Discussion Based on an FDM Application. Takahiro Katagiri. 1084-1092
- Threaded Accurate Matrix-Matrix Multiplications with Sparse Matrix-Vector Multiplications. Shuntaro Ichimura, Takahiro Katagiri, Katsuhisa Ozaki, Takeshi Ogita, Toru Nagai. 1093-1102
- iWAPT 2018 Invited Speaker 2. David E. Tanner. 1103
- Algebraic Multigrid Solver Using Coarse Grid Aggregation with Independent Aggregation. Naoya Nomura, Akihiro Fujii, Teruo Tanaka, Osni Marques, Kengo Nakajima. 1104-1112
- A Case Study on Modeling the Performance of Dense Matrix Computation: Tridiagonalization in the EigenExa Eigensolver on the K Computer. Takeshi Fukaya, Toshiyuki Imamura, Yusaku Yamamoto. 1113-1122
- AutoTuneTMP: Auto-Tuning in C++ With Runtime Template Metaprogramming. David Pfander, Malte Brunn, Dirk Pflüger. 1123-1132
- Methodology for Adaptive Active Message Coalescing in Task Based Runtime Systems. Bibek Wagle, Samuel Kellar, Adrian Serio, Hartmut Kaiser. 1133-1140
- Introduction to ParSocial 2018. Eunice E. Santos, John Korah. 1141
- ParSocial 2018 Keynote. V. S. Subrahmanian. 1142
- Using Activity Patterns to Place Electric Vehicle Charging Stations in Urban Regions. Anamitra Pal, Pavan Rangudu, S. S. Ravi, Anil Kumar S. Vullikanti. 1143-1152
- Handling Vertex Deletions in Memory Scalable Anytime Anywhere Algorithms for Large and Dynamic Social Networks. Eunice E. Santos, John Korah, Vairavan Murugappan. 1153-1162
- Integrating Cyber Security and Data Science for Social Media: A Position Paper. Bhavani M. Thuraisingham, Murat Kantarcioglu, Latifur Khan. 1163-1165
- Introduction to GraML 2018. Antonino Tumeo, Mahantesh Halappanavar, John Feo, Assefaw Hadish Gebremedhin, Abhinav Vishnu. 1166-1167
- GraML 2018 Keynote. Nesreen Ahmed. 1168
- Classification and Anomaly Detection in Traffic Patterns of New York City Taxis: A Case Study in Compound Analytics. Ronald D. Hagan, Charles A. Phillips, Michael A. Langston, Bradley J. Rhodes. 1169-1174
- V2V: Vector Embedding of a Graph and Applications. Trong Duc Nguyen, Srikanta Tirthapura. 1175-1183
- Network Similarity Prediction in Time-Evolving Graphs: A Machine Learning Approach. Keyvan Sasani, Mohammad Hossein Namaki, Assefaw Hadish Gebremedhin. 1184-1193
- Neural Networks and Graph Algorithms with Next-Generation Processors. Kathleen E. Hamilton, Catherine D. Schuman, Steven R. Young, Neena Imam, Travis S. Humble. 1194-1203
- Introduction to CEBDA 2018. Shadi Ibrahim, Manish Parashar, Anna Queralt, Domenico Talia. 1204
- CEBDA 2018 Keynote. Franck Cappello. 1205
- Data-Locality Aware Dynamic Schedulers for Independent Tasks with Replicated Inputs. Olivier Beaumont, Thomas Lambert, Loris Marchal, Bastien Thomas. 1206-1213
- Transferring Data from High-Performance Simulations to Extreme Scale Analysis Applications in Real-Time. Thomas Marrinan, Silvio Rizzi, Joseph A. Insley, Brian R. Toonen, William E. Allcock, Michael E. Papka. 1214-1220
- Towards a TRansparent I/O Solution. Fotios Nikolaidis, Nick Kossifidis, Thomas Leibovici, Soraya Zertal. 1221-1228
- Introduction to MPP 2018. Leandro A. J. Marzulo, Felipe Maia Galvão França, Cristiana Bentes, Gabriele Mencagli. 1229-1230
- MPP 2018 Keynote. Vladimir Castro Alves, Jae Young Do. 1231
- Invited Paper: How Future Buildings Could Redefine Distributed Computing. Yanik Ngoko, Nicolas Saintherant, Christophe Cérin, Denis Trystram. 1232-1240
- A Smart Disk for In-Situ Face Recognition. Victor C. Ferreira, Alexandre Solon Nery, Felipe Maia Galvão França. 1241-1249
- A DVND Local Search Implemented on a Dataflow Architecture for the Minimum Latency Problem. Rodolfo Pereira Araujo, Igor Machado Coelho, Leandro A. J. Marzulo. 1250-1259
- CompStor: An In-storage Computation Platform for Scalable Distributed Processing. Mahdi Torabzadehkashi, Siavash Rezaei, Vladimir Castro Alves, Nader Bagherzadeh. 1260-1267
- Fog-Assisted Translation: Towards Efficient Software Emulation on Heterogeneous IoT Devices. Vanderson Martins do Rosario, Flavia Pisani, Alexandre Rodrigues Gomes, Edson Borin. 1268-1277
- Introduction to PMAW 2018. Martin Kong, Zoran Budimlic. 1278
- Introduction to ROME 2018. Stefan Lankes, Carsten Clauss, Jens Breitbart. 1279-1280
- ROME 2018 Keynote. Sang-Hoon Kim. 1281
- ROME 2018 Invited Talk. Karl Fuerlinger. 1282
- Memory Footprint of Locality Information on Many-Core Platforms. Brice Goglin. 1283-1292
- Diagnosing Performance Fluctuations of High-Throughput Software for Multi-core CPUs. Soramichi Akiyama, Takahiro Hirofuchi, Ryousei Takano. 1293-1302
- Parallelizing MPI Using Tasks for Hybrid Programming Models. Surabhi Jain, Gengbin Zheng, Maria Garzaran, James H. Cownie, Taru Doodi, Terry L. Wilmarth. 1303-1312
- A Study of Network Quality of Service in Many-Core MPI Applications. Lee Savoie, David K. Lowenthal, Bronis R. de Supinski, Kathryn Mohror. 1313-1322
- Custom Machine Learning Architectures: Towards Realtime Anomaly Detection for Flight Testing. Di Wu, Zhanrui Sun, Yongxin Zhu, Li Tian, Hanlin Zhu, Peng Xiong, Zihao Cao, Menglin Wang, Yu Zheng, Chao Xiong, Hao Jiang, Kuen Hung Tsoi, Xinyu Niu, Wei Mao, Can Feng, Xiaowen Zha, Guobao Deng, Wayne Luk. 1323-1330
- Multi-start Simulated Annealing for Partially-Reconfigurable FPGA Floorplanning. François Galea, Sergiu Carpov, Lilia Zaourar. 1335-1338