- HCW 2020 Keynote Speaker Edge Intelligence Empowering IoT Data AnalyticsAlbert Y. Zomaya. 1 [doi]
- Message from the HCW Steering Committee ChairBehrooz A. Shirazi. 2 [doi]
- Message from the HCW General ChairJohn K. Antonio. 3 [doi]
- Message from the HCW Technical Program Committee ChairFlorina M. Ciorba. 4 [doi]
- MigHEFT: DAG-based Scheduling of Migratable Tasks on Heterogeneous Compute NodesAchim Lösch, Marco Platzner. 6-16 [doi]
- Autonomous Task Dropping Mechanism to Achieve Robustness in Heterogeneous Computing SystemsAli Mokhtari, Chavit Denninnart, Mohsen Amini Salehi. 17-26 [doi]
- I/O Performance of the SX-Aurora TSUBASAMitsuo Yokokawa, Ayano Nakai, Kazuhiko Komatsu, Yuta Watanabe, Yasuhisa Masaoka, Yoko Isobe, Hiroaki Kobayashi. 27-35 [doi]
- (Special Topic Submission) Enabling Domain-Specific Architectures with an Open-Source Soft-Core GPGPUMarcelo Brandalero, Hector Gerardo Muñoz Hernandez, Mitko Veleski, Muhammed Al Kadi, Paolo Rech, Michael Hübner. 36-43 [doi]
- User-Space Emulation Framework for Domain-Specific SoC DesignJoshua Mack, Nirmal Kumbhare, Anish Nk, Ümit Y. Ogras, Ali Akoglu. 44-53 [doi]
- Improving Inference Latency and Energy of Network-on-Chip based Convolutional Neural Networks through Weights CompressionGiuseppe Ascia, Vincenzo Catania, John Jose, Salvatore Monteleone, Maurizio Palesi, Davide Patti. 54-63 [doi]
- CNN-based Monocular Decentralized SLAM on embedded FPGAJincheng Yu, Feng Gao, Jianfei Cao, Chao Yu, Zhaoliang Zhang, Zhengfeng Huang, Yu Wang, Huazhong Yang. 66-73 [doi]
- Improving HLS Generated Accelerators Through Relaxed Memory Access SchedulingJohanna Rohde, Karsten Müller, Christian Hochberger. 74-81 [doi]
- Real-time Automatic Modulation Classification using RFSoCStephen Tridgell, David Boland, Philip H. W. Leong, Ryan Kastner, Alireza Khodamoradi, Siddhartha 0003. 82-89 [doi]
- FPGA Based Emulation Environment for Neuromorphic ArchitecturesSpencer Valancius, Edward Richter, Ruben Purdy, Kris Rockowitz, Michael Inouye, Joshua Mack, Nirmal Kumbhare, Kaitlin Fair, John Mixter, Ali Akoglu. 90-97 [doi]
- PHRYCTORIA: A Messaging System for Transprecision OpenCAPI-attached FPGA AcceleratorsDionysios Diamantopoulos, Mitra Purandare, Burkhard Ringlein, Christoph Hagleitner. 98-106 [doi]
- QTAccel: A Generic FPGA based Design for Q-Table based Reinforcement Learning AcceleratorsYuan Meng, Sanmukh R. Kuppannagari, Rachit Rajat, Ajitesh Srivastava, Rajgopal Kannan, Viktor K. Prasanna. 107-114 [doi]
- SpiderWeb - High Performance FPGA NoCMartin Langhammer, Gregg Baeckler, Sergey Gribok. 115-118 [doi]
- Secure acceleration on cloud-based FPGAs - FPGA enclavesHåkan Englund, Niklas Lindskog. 119-122 [doi]
- A FPGA-Based Post-Processing and Validation Platform for Random Number GeneratorsLaurent Gantel, Alexandre Duc, Lucie Steiner, Fabien Vannel, Andres Upegui, Florent Gluck. 123-126 [doi]
- An Interval-based Mapping Algorithm for Multi-shape Tasks on Dynamic Partial Reconfigurable FPGAsTingyu Zhou, Tieyuan Pan, Michael Conrad Meyer, Yiping Dong, Takahiro Watanabe. 127-130 [doi]
- EMPhASIS: An EMbedded Public Attention Stress Identification SystemJessica Leoni, Asia Ciallella, Luca Stornaiuolo, Marco D. Santambrogio, Donatella Sciuto. 131-134 [doi]
- Hardware resources analysis of BNNs splitting for FARD-based multi-FPGAs Distributed SystemsGiorgia Fiscaletti, Marco Speziali, Luca Stornaiuolo, Marco D. Santambrogio, Donatella Sciuto. 135-138 [doi]
- A Microcode-based Control Unit for Deep Learning ProcessorsQian Zhao 0001, Yasuhiro Nakahara, Motoki Amagasaki, Masahiro Iida, Takaichi Yoshida. 139-142 [doi]
- Fast Monocular Depth Estimation on an FPGAYouki Sada, Naoto Soga, Masayuki Shimoda, Akira Jinguji, Shimpei Sato, Hiroki Nakahara. 143-146 [doi]
- SALSA: A Domain Specific Architecture for Sequence AlignmentLorenzo Di Tucci, Riyadh Baghdadi, Saman P. Amarasinghe, Marco D. Santambrogio. 147-150 [doi]
- Optimizing OpenCL Kernels and Runtime for DNN Inference on FPGAsSeung-Hun Chung, Tarek S. Abdelrahman. 151-154 [doi]
- Leveraging Succinct Data Structures for DNA Sequence Mapping on FPGAGuido Walter Di Donato, Alberto Zeni, Lorenzo Di Tucci, Marco D. Santambrogio. 155-158 [doi]
- A Tropical Semiring Multiple Matrix-Product Library on GPUs: (not just) a step towards RNA-RNA Interaction ComputationsBrandon Gildemaster, Prerana Ghalsasi, Sanjay V. Rajopadhye. 160-169 [doi]
- Fast and High Quality Graph Alignment via TreeletsMorgan Lee, George M. Slota. 170-173 [doi]
- GPU accelerated partial order multiple sequence alignment for long reads self-correctionFrancesco Peverelli, Lorenzo Di Tucci, Marco D. Santambrogio, Nan Ding, Steven A. Hofmeyr, Aydin Buluç, Leonid Oliker, Katherine A. Yelick. 174-182 [doi]
- Optimizing High-Performance Computing Systems for Biomedical WorkloadsPatricia A. Kovatch, Lili Gai, Hyung Min Cho, Eugene Fluder, Dansha Jiang. 183-192 [doi]
- Kcollections: A Fast and Efficient Library for K-mersM. Stanley Fujimoto, Cole A. Lyman, Mark J. Clement. 193-198 [doi]
- Message from the workshop chairsScott McMillan, Manoj Kumar 0006, Danai Koutra, Mahantesh Halappanavar, Tim Mattson, Antonino Tumeo. 199-200 [doi]
- GrAPL 2020 Keynote Speaker Deep Graph Library: Overview, Updates, and Future DevelopmentsGeorge Karypis. 201 [doi]
- GrAPL 2020 Keynote Speaker The GraphIt Universal Graph Framework: Achieving High Performance across Algorithms, Graph Types, and ArchitecturesSaman P. Amarasinghe. 202 [doi]
- An incremental GraphBLAS solution for the 2018 TTC Social Media case studyMárton Elekes, Gábor Szárnyas. 203-206 [doi]
- 75,000,000,000 Streaming Inserts/Second Using Hierarchical Hypersparse GraphBLAS MatricesJeremy Kepner, Tim Davis 0001, Chansup Byun, William Arcand, David Bestor, William Bergeron, Vijay Gadepally, Matthew Hubbell, Michael Houle, Michael Jones 0001, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Antonio Rosa, Siddharth Samsi, Charles Yee, Albert Reuther. 207-210 [doi]
- Parallelizing Maximal Clique Enumeration on Modern Manycore ProcessorsJovan Blanusa, Radu Stoica, Paolo Ienne, Kubilay Atasu. 211-214 [doi]
- Considerations for a Distributed GraphBLAS APIBenjamin Brock, Aydin Buluç, Timothy G. Mattson, Scott McMillan, José E. Moreira, Roger Pearce, Oguz Selvitopi, Trevor Steil. 215-218 [doi]
- A Roadmap for the GraphBLAS C++ APIBenjamin Brock, Aydin Buluç, Timothy G. Mattson, Scott McMillan, José E. Moreira. 219-222 [doi]
- Linear Algebraic Louvain Method in PythonTze Meng Low, Daniele G. Spampinato, Scott McMillan, Michel Pelletier. 223-226 [doi]
- A scalable graph generation algorithm to sample over a given shell distributionM. Yusuf Özkaya, M. Fatih Balin, Ali Pinar, Ümit V. Çatalyürek. 227-236 [doi]
- Kronecker Graph Generation with Ground Truth for 4-Cycles and Dense Structure in Bipartite GraphsTrevor Steil, Scott McMillan, Geoffrey Sanders, Roger Pearce, Benjamin Priest. 237-246 [doi]
- Message from the EduPar-20 Workshop ChairsSushil K. Prasad, Tia Newhall, David P. Bunde, Martina Barnas, Satish Puri. 247-249 [doi]
- EduPar-20 Keynote SpeakerMartin Langhammer. 250 [doi]
- EduPar-20 Invited PanelHenry A. Gabb, Andrew Lumsdaine, Margaret Martonosi, Arnold L. Rosenberg, Martina Barnas. 251 [doi]
- Retrospective: A Look Back at 20+ Years of Experience in Parallel Computing EducationJoel C. Adams. 252-260 [doi]
- A Framework for the Evaluation of Parallel and Distributed Computing Educational ResourcesDavid W. Brown, Vitaly Ford, Sheikh K. Ghafoor. 261-268 [doi]
- NumbaSummarizer: A Python Library for Simplified Vectorization ReportsNeftali Watkinson, Preston Tai, Alexandru Nicolau, Alexander V. Veidenbaum. 269-275 [doi]
- EASYPAP: a Framework for Learning Parallel ProgrammingAlice Lasserre, Raymond Namyst, Pierre-André Wacrenier. 276-283 [doi]
- PDCunplugged: A Free Repository of Unplugged Parallel Distributed Computing ActivitiesSuzanne J. Matthews. 284-291 [doi]
- Teaching Modern Multithreading in CS2 with ActorsMark C. Lewis, Lisa L. Lacher. 292-299 [doi]
- Teaching Cloud Computing: Motivations, Challenges and ToolsCosimo Anglano, Massimo Canonico, Marco Guazzone. 300-306 [doi]
- Using Embedded Xinu and the Raspberry Pi 3 to Teach Operating SystemsPatrick J. McGee, Rade Latinovich, Dennis Brylow. 307-315 [doi]
- Workshop 6: HIPS High-level Parallel Programming Models and Supportive EnvironmentsDong Li, Heike Jagode. 316 [doi]
- Compile-time Parallelization of Subscripted Subscript PatternsAkshay Bhosale, Rudolf Eigenmann. 317-325 [doi]
- Online Scheduling with Redirection for Parallel JobsAdrien Fauré, Giorgio Lucarelli, Olivier Richard, Denis Trystram. 326-329 [doi]
- Performance Portability Evaluation of OpenCL Benchmarks across Intel and NVIDIA PlatformsColleen Bertoni, JaeHyuk Kwack, Thomas Applencourt, Yasaman Ghadar, Brian Homerding, Christopher Knight, Brice Videau, Huihuo Zheng, Vitali A. Morozov, Scott Parker. 330-339 [doi]
- Scalable Crash Consistency for Staging-based In-situ Scientific WorkflowsShaohua Duan, Manish Parashar. 340-348 [doi]
- Automatic Selection of Tuning Plugins in PTF Using Machine LearningRobert Mijakovic, Michael Gerndt. 349-358 [doi]
- Porting a Legacy CUDA Stencil Code to oneAPISteffen Christgau, Thomas Steinke. 359-367 [doi]
- A Case Study on the HACCmk Routine in SYCL on Integrated GraphicsZheming Jin, Vitali Morozov, Hal Finkel. 368-374 [doi]
- Enhancing Java Streams API with PowerList ComputationVirginia Niculescu, Darius Bufnea, Adrian Sterca. 375-384 [doi]
- Workshop 7: HPBDC High-Performance Big Data and Cloud ComputingXiaoyi Lu, Jianfeng Zhan. 385 [doi]
- Two-Pass Softmax AlgorithmMarat Dukhan, Artsiom Ablavatski. 386-395 [doi]
- Smart Streaming: A High-Throughput Fault-tolerant Online Processing SystemJia Guo, Gagan Agrawal. 396-405 [doi]
- Parallel Query Service for Object-centric Data Management SystemsHoujun Tang, Suren Byna, Bin Dong 0002, Quincey Koziol. 406-415 [doi]
- Pinocchio: A Blockchain-Based Algorithm for Sensor Fault Tolerance in Low Trust EnvironmentChen Zeng, Yifan Wang, Fan Liang, Xiaohui Peng 0002. 416-425 [doi]
- Scaling Optimizations for Large-Scale Distributed Data with Lightweight CoresetsDaniel Nobre Pinheiro, Samuel Xavier de Souza, Daniel Aloise. 426-429 [doi]
- Workshop 8: AsHES Accelerators and Hybrid Exascale SystemsMin-Si, Lena Oden, Simon Garcia De Gonzalo. 430 [doi]
- AsHES 2020 Keynote Speaker (5:30 pm CDT)Taisuke Boku. 431 [doi]
- Population Count on Intel® CPU, GPU and FPGAZheming Jin, Hal Finkel. 432-439 [doi]
- SPHYNX: Spectral Partitioning for HYbrid aNd aXelerator-enabled systemsSeher Acer, Erik G. Boman, Sivasankaran Rajamanickam. 440-449 [doi]
- Performance Evaluation of Pipelined Communication Combined with Computation in OpenCL Programming on FPGANorihisa Fujita, Ryohei Kobayashi, Yoshiki Yamaguchi, Tomohiro Ueno, Kentaro Sano, Taisuke Boku. 450-459 [doi]
- In-Depth Optimization with the OpenACC-to-FPGA Framework on an Arria 10 FPGAJacob Lambert, Seyong Lee, Jeffrey S. Vetter, Allen D. Malony. 460-470 [doi]
- Unified data movement for offloading Charm++ applicationsMatthias Diener, Laxmikant V. Kalé. 471-474 [doi]
- Towards automated kernel selection in machine learning systems: A SYCL case studyJohn Lawson. 475-478 [doi]
- Understanding the Performance of Elementary NLA Kernels in FPGAsFederico Favaro, Juan P. Oliver, Ernesto Dufrechou, Pablo Ezzatti. 479-482 [doi]
- Scalability of Sparse Matrix Dense Vector Multiply (SpMV) on a Migrating Thread ArchitectureBrian A. Page, Peter M. Kogge. 483-488 [doi]
- Workshop 9: PDCO Parallel / Distributed Combinatorics and OptimizationGrégoire Danoy, Didier El Baz, Vincent Boyer 0002, Bernabé Dorronsoro, Laurence T. Yang, Keqin Li 0001. 489 [doi]
- Load Balancing Run-Times and Space Usage for Computing the Power SetRoger L. Goodwin. 490-501 [doi]
- Implementing Central Force optimization on the Intel Xeon PhiThomas Charest, Robert C. Green. 502-511 [doi]
- Parallel/distributed implementation of cellular training for generative adversarial neural networksEmiliano Pérez, Sergio Nesmachnow, Jamal Toutouh, Erik Hemberg, Una-May O'Reilly. 512-518 [doi]
- Predicting near-optimal skin distance in Verlet buffer approach for Discrete Element MethodAbdoul Wahid Mainassara Checkaraou, Xavier Besseron, Alban Rousset, Emmanuel Kieffer, Bernhard Peters. 519-527 [doi]
- Competitive Evolution of a UAV Swarm for Improving Intruder Detection RatesDaniel H. Stolfi, Matthias R. Brust, Grégoire Danoy, Pascal Bouvry. 528-535 [doi]
- Workshop 10: APDCM Advances in Parallel and Distributed Computational ModelsJacir Luiz Bordim, Koji Nakano, Susumu Matsumae, Masahiro Shibata. 536-537 [doi]
- Debugging strongly-compartmentalized distributed systemsHenry Zhu, Nik Sultana, Boon Thau Loo. 538-547 [doi]
- An Efficient Multicore CPU Implementation for Convolution-Pooling Computation in CNNsHiroki Kataoka, Kohei Yamashita, Yasuaki Ito, Koji Nakano, Akihiko Kasagi, Tsuguchika Tabaru. 548-556 [doi]
- A Work-Time Optimal Parallel Exhaustive Search Algorithm for the QUBO and the Ising model, with GPU implementationMasaki Tao, Koji Nakano, Yasuaki Ito, Ryota Yasudo, Masaru Tatekawa, Ryota Katsuki, Takashi Yazane, Yoko Inaba. 557-566 [doi]
- Design and Comparison of Resilient Scheduling Heuristics for Parallel JobsAnne Benoit, Valentin Le Fèvre, Padma Raghavan, Yves Robert, Hongyang Sun. 567-576 [doi]
- Optimizing Memory Access in TCF Processors with Compute-Update OperationsMartti Forsell, Jussi Roivainen, Jesper Larsson Träff. 577-586 [doi]
- TOSS: A Topology-based Scheduler for Storm ClustersYi Zhou 0009, Yangyang Liu, Chaowei Zhang, Xiaopu Peng, Xiao Qin. 587-596 [doi]
- Revisiting dynamic DAG scheduling under memory constraints for shared-memory platformsGabriel Bathie, Loris Marchal, Yves Robert, Samuel Thibault. 597-606 [doi]
- Optimal Randomized Complete Visibility on a Grid for Asynchronous Robots with LightsGokarna Sharma, Ramachandran Vaidyanathan, Jerry L. Trahan. 607-616 [doi]
- An Initial Assessment of NVSHMEM for High Performance ComputingChung-Hsing Hsu, Neena Imam, Akhil Langer, Sreeram Potluri, Chris J. Newburn. 617-626 [doi]
- A Model Checking Method for Secure Routing Protocols by SPIN with State Space ReductionHideharu Kojima, Naoto Yanai. 627-635 [doi]
- Methods and Experiences for Developing Abstractions for Data-intensive, Scientific ApplicationsAndré Luckow, Shantenu Jha. 636-645 [doi]
- JSSPP 2020 - 23rd Workshop on Job Scheduling Strategies for Parallel ProcessingDalibor Klusácek, Walfredo Cirne, Narayan Desai. 646-647 [doi]
- CHIUW 2020 The Seventh Annual Chapel Implementers and Users WorkshopBenjamin Robbins. 648-649 [doi]
- CHIUW 2020 Keynote Arkouda: Chapel-Powered, Interactive Supercomputing for Data ScienceWilliam Reus. 650 [doi]
- Development of Parallel CFD Applications on Distributed Memory with ChapelMatthieu Parenteau, Simon Bourgault-Cote, Frederic Plante, Eric Laurendeau. 651-658 [doi]
- Paving the way for Distributed Non-Blocking Algorithms and Data Structures in the Partitioned Global Address Space modelGarvit Dewan, Louis Jenkins. 659-666 [doi]
- Computing Hypergraph Homology in ChapelJesun Sahariar Firoz, Louis Jenkins, Cliff A. Joslyn, Brenda Praggastis, Emilie Purvine, Mark Raugas. 667-670 [doi]
- An Automated Machine Learning Approach for Data Locality Optimizations in ChapelEngin Kayraklioglu, Tarek A. El-Ghazawi. 671 [doi]
- Exploring Chapel Productivity Using Some Graph AlgorithmsRichard F. Barrett, Jeanine E. Cook, Stephen L. Olivier, Omar Aaziz, Christopher D. Jenkins, Courtenay T. Vaughan. 672 [doi]
- Visibility Control: Use and Import Statement ImprovementsLydia Duncan. 673 [doi]
- Towards Stability in the Chapel LanguageMichael P. Ferguson. 674 [doi]
- Exploring a multi-resolution GPU programming model for ChapelAkihiro Hayashi, Sri Raj Paul, Vivek Sarkar. 675 [doi]
- Random Forests in ChapelBenjamin Albrecht. 676 [doi]
- Squeezing performance out of ArkoudaElliot Ronaghan. 677 [doi]
- Simulating Ultralight Dark Matter in ChapelNikhil Padmanabhan, Elliot Ronaghan, J. Luna Zagorac, Richard Easther. 678 [doi]
- Chapel on AcceleratorsRahul Ghangas, Josh Milthorpe. 679 [doi]
- Workshop 13: PDSEC Parallel and Distributed Scientific and Engineering ComputingRaphaël Couturier, Peter Strazdins, Eric Aubanel, Sabine Roller, Laurence T. Yang, Thomas Rauber, Gudula Rünger. 680-681 [doi]
- Comparison of MPI and Spark for Data Science ApplicationsManvi Saxena, Shweta Jha, Saba Khan, John Rodgers, Peggy Lindner, Edgar Gabriel. 682-690 [doi]
- Improving MPI Application Communication Time with an Introspection Monitoring LibraryEmmanuel Jeannot, Richard Sartori. 691-700 [doi]
- A GPU-Accelerated Barycentric Lagrange TreecodeNathan Vaughn, Leighton Wilson, Robert Krasny. 701-710 [doi]
- Vectorization and Minimization of Memory Footprint for Linear High-Order Discontinuous Galerkin SchemesJean-Matthieu Gallard, Leonhard Rannabauer, Anne Reinarz, Michael Bader. 711-720 [doi]
- Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based RuntimeYu Pei, Qinglei Cao, George Bosilca, Piotr Luszczek, Victor Eijkhout, Jack J. Dongarra. 721-729 [doi]
- Implementing an Attack Graph Generator in CUDAMing Li, Peter J. Hawrylak, John Hale. 730-738 [doi]
- Tri-Objective Workflow Scheduling and Optimization in Heterogeneous Cloud EnvironmentsHuda Alrammah, Yi Gu, Zhifeng Liu. 739-748 [doi]
- Identifying Optimization Opportunities Using Memory Access Tracing in OpenSHMEM Runtimes with the TAU Performance SystemNicholas Chaimov, Sameer Shende, Allen D. Malony, Neena Imam. 749-756 [doi]
- Tiled Algorithms for Efficient Task-Parallel ℌ-Matrix SolversRocío Carratalá-Sáez, Mathieu Faverge, Grégoire Pichon, Guillaume Sylvand, Enrique S. Quintana-Ortí. 757-766 [doi]
- Workshop 14: iWAPT Automatic Performance TuningI-Hsin Chung, Kazuhiko Komatsu. 767-768 [doi]
- Machine Learning-Based Prefetching for SCM Main Memory SystemMayuko Koezuka, Yusuke Shirota, Satoshi Shirai, Tatsunori Kanai. 769-776 [doi]
- Acceleration of Structural Analysis Simulations using CNN-based Auto-Tuning of Solver ToleranceAmir Haderbache, Koichi Shirahata, Takuji Yamamoto, Yasumoto Tomita, Hiroshi Okuda. 777-786 [doi]
- Using Small-Scale History Data to Predict Large-Scale Performance of HPC ApplicationWenju Zhou, Jiepeng Zhang, Jingwei Sun, Guangzhong Sun. 787-795 [doi]
- Node-Aware Stencil Communication for Heterogeneous SupercomputersCarl Pearson, Mert Hidayetoglu, Mohammad Almasri, Omer Anjum, I-Hsin Chung, Jinjun Xiong, Wen-mei W. Hwu. 796-805 [doi]
- Task Priority Control for the HPX Runtime SystemSuhang Jiang, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa. 806-813 [doi]
- Improving Collective I/O Performance with Machine Learning Supported Auto-tuningAyse Bagbaba. 814-821 [doi]
- Automatically Avoiding Memory Access Conflicts on SX-Aurora TSUBASANaoki Ebata, Ryusuke Egawa, Yoko Isobe, Ryoji Takaki, Hiroyuki Takizawa. 822-829 [doi]
- Importance of Selecting Data Layouts in the Tsunami Simulation CodeTakumi Kishitani, Kazuhiko Komatsu, Masayuki Sato 0001, Akihiro Musa, Hiroaki Kobayashi. 830-837 [doi]
- Workshop 15: MPP Parallel Programming Models - Special Edition Machine Learning Performance and SecurityLeandro A. J. Marzulo, Tiago A. O. Alves, Cristiana Bentes, Gabriele Mencagli. 838-839 [doi]
- Enhancing the Utilization of Dot-Product Engines in Deep Learning AcceleratorsTaha Soliman, Armin Runge, Leonardo Ecco. 840-843 [doi]
- Weightless Neural Networks Applied to Nonintrusive Load MonitoringGuilherme C. De Lello, Juliano F. Caldeira, Mauricio Aredes, Felipe M. G. França, Priscila M. V. Lima. 844-851 [doi]
- Tangle Ledger for Decentralized LearningRobert Schmid, Bjarne Pfitzner, Jossekin Beilharz, Bert Arnrich, Andreas Polze. 852-859 [doi]
- Regression WiSARD application of controller on DC STATCOM converter under fault conditionsRaphael N. C. B. Rocha, Leopoldo Lusquino Filho, Mauricio Aredes, Felipe M. G. França, Priscila M. V. Lima. 860-867 [doi]
- Workshop 16: SNACS Scalable Networks for Advanced Computing SystemsIlkay Altintas, Dorian C. Arnold, Martin Schulz 0001, Matthew G. F. Dosanjh, Ryan E. Grant, Taylor L. Groves. 868 [doi]
- Analyzing and Understanding the Impact of Interconnect Performance on HPC, Big Data, and Deep Learning Applications: A Case Study with InfiniBand EDR and HDRAmit Ruhela, Shulei Xu, Karthik Vadambacheri Manian, Hari Subramoni, Dhabaleswar K. Panda. 869-878 [doi]
- The Case for Explicit Reuse Semantics for RDMA CommunicationScott Levy, Patrick M. Widener, Craig Ulmer, Todd Kordenbrock. 879-888 [doi]
- Performance of MPI Sends of Non-Contiguous DataVictor Eijkhout. 889-895 [doi]
- Performance Characterization of Network Mechanisms for Non-Contiguous Data Transfers in MPIKaushik Kandadi Suresh, Bharath Ramesh 0005, Seyedeh Mahdieh Ghazimirsaeed, Mohammadreza Bayatpour, Jahanzeb Maqbool Hashmi, Hari Subramoni, Dhabaleswar K. Panda. 896-905 [doi]
- Workshop 17: PAISE Parallel AI and Systems for the EdgePeter H. Beckman, Rajesh Sankaran. 906-907 [doi]
- Analyzing Deep Learning Model Inferences for Image Classification using OpenVINOZheming Jin, Hal Finkel. 908-911 [doi]
- Energy-Efficient Machine Learning on the EdgesMohit Kumar, Xingzhou Zhang, Liangkai Liu, Yifan Wang 0005, Weisong Shi. 912-921 [doi]
- Indirect Deconvolution AlgorithmMarat Dukhan. 922-926 [doi]
- Multiperspective Automotive LabelingLuke Jacobs, Akhil Kodumuri, Jim James, Seongha Park, Yongho Kim. 927-936 [doi]
- Integrating DOTS With Blockchain Can Secure Massive IoT SensorsSyed Badruddoja, Ram Dantu, Logan Widick, Zachary Zaccagni, Kritagya Upadhyay. 937-946 [doi]
- Workshop on Resource Arbitration for Dynamic Runtimes (RADR)Peter H. Beckman, Emmanuel Jeannot, Swann Perarnau. 947-949 [doi]
- NUMA-aware CPU core allocation in cooperating dynamic applicationsJirí Dokulil, Siegfried Benkner. 950-957 [doi]
- Overlapping MPI communications with Intel TBB computationCassandra Rocha Barbosa, Pierre Lemarinier, Marc Sergent, Guillaume Papauré, Marc Pérache. 958-966 [doi]
- System Software for Resource Arbitration on Future Many-* ArchitecturesFlorian Schmaus, Sebastian Maier, Tobias Langer, Jonas Rabenstein, Timo Hönig, Wolfgang Schröder-Preikschat, Lars Bauer, Jörg Henkel. 967-975 [doi]
- An Implementation of User-Level Processes using Address Space SharingAtsushi Hori, Balazs Gerofi, Yutaka Ishikawa. 976-984 [doi]
- Workshop 19: ScaDL Scalable Deep Learning over Parallel and Distributed InfrastructuresAshish Verma, Christopher D. Carothers, K. R. Jayaram, Parijat Dube. 985-986 [doi]
- "A Stitch in Time": A Grand Challenge for Distributed Machine LearningManish Gupta. 987 [doi]
- High Performance Computing: From Deep Learning to Data EngineeringGeoffrey C. Fox. 988 [doi]
- Advancing Computing Infrastructure for Very Large-Scale Deep Learning at C3SRWen-mei Hwu. 989 [doi]
- Scalable Deep Learning Inference: Algorithmic ApproachMinsik Cho. 990 [doi]
- Neural Network Molecular Dynamics at ScalePankaj Rajak, Kuang Liu, Aravind Krishnamoorthy, Rajiv K. Kalia, Aiichiro Nakano, Ken-ichi Nomura, Subodh C. Tiwari, Priya Vashishta. 991-994 [doi]
- Asynchronous SGD for DNN training on Shared-memory Parallel ArchitecturesFlorent Lopez, Edmond Chow, Stanimire Tomov, Jack J. Dongarra. 995-998 [doi]
- Accelerating Towards Larger Deep Learning Models and Datasets - A System Platform View PointSaritha Vinod, M. Naveen, Asis K. Patra, Anto Ajay Raj John. 999-1005 [doi]
- Data Parallel Large Sparse Deep Neural Network on GPUNaw Safrin Sattar, Shaikh Anfuzzaman. 1006-1014 [doi]
- Efficient Training of Semantic Image Segmentation on Summit using Horovod and MVAPICH2-GDRQuentin Anthony, Ammar Ahmad Awan, Arpan Jain, Hari Subramoni, Dhabaleswar K. Panda. 1015-1023 [doi]
- First IEEE International Workshop on High-Performance Storage (HPS)Kathryn Mohror, Marc Snir. 1024-1026 [doi]
- Dynamic Provisioning of Storage Resources: A Case Study with Burst BuffersFrancois Tessier, Maxime Martinasso, Matteo Chesi, Mark Klein, Miguel Gila. 1027-1035 [doi]
- Optimizing Asynchronous Multi-Level Checkpoint/Restart Configurations with Machine LearningTonmoy Dey, Kento Sato, Bogdan Nicolae, Jian Guo, Jens Domke, Weikuan Yu, Franck Cappello, Kathryn Mohror. 1036-1043 [doi]
- On Overlapping Communication and File I/O in Collective Write OperationRaafat Feki, Edgar Gabriel. 1044-1051 [doi]
- Recorder 2.0: Efficient Parallel I/O Tracing and AnalysisChen Wang, Jinghan Sun, Marc Snir, Kathryn Mohror, Elsa Gonsiorowski. 1052-1059 [doi]
- Silent Data Access Protocol for NVRAM + RDMA Distributed StorageQingyue Liu, Peter Varman. 1060-1069 [doi]
- Enhancing Endurance of SSD Based high-performance Storage Systems using Emerging NVM TechnologiesTanaya Roy, Krishna Kant. 1070-1079 [doi]
- Design of Locality-aware MPI-IO for Scalable Shared File Write PerformanceKohei Sugihara, Osamu Tatebe. 1080-1089 [doi]
- Parallel Generation of Simple Null Graph ModelsJack Garbus, Christopher Brissette, George M. Slota. 1091-1100 [doi]
- A System for High Performance Mining on GDELT DataKonstantin Pogorelov, Daniel Thilo Schroeder, Petra Filkukova, Johannes Langguth. 1101-1111 [doi]
- A Parallel LFR-like Benchmark for Evaluating Community Detection AlgorithmsGeorge M. Slota, Jack Garbus. 1112-1115 [doi]
- The Role of Artificial Intelligence and Cyber Security for Social MediaBhavani M. Thuraisingham. 1116-1118 [doi]
- YouTube Data Collection Using Parallel ProcessingJoseph Kready, Shishila Awung Shimray, Muhammad Nihal Hussain, Nitin Agarwal. 1119-1122 [doi]
- New Approaches for Performance Optimization and Analysis of Large-Scale Dynamic Social Network Analysis using Anytime Anywhere AlgorithmsEunice E. Santos, Vairavan Murugappan, John Korah. 1123-1128 [doi]