- StressBench: A Configurable Full System Network and I/O Benchmark Framework. Dean G. Chester, Taylor Groves, Simon D. Hammond, Tim Law, Steven A. Wright, Richard P. Smedley-Stevenson, Suhaib A. Fahmy, Gihan R. Mudalige, Stephen A. Jarvis.
- Scaling of Evolutionary Search of Algorithm Space to Speed-Up Scientific Image Understanding Workflows. Nicholas Grabill, Kai Pinckard, Dirk Colbry. 1-6
- Delayed Asynchronous Iterative Graph Algorithms. Mark P. Blanco, Scott McMillan, Tze Meng Low. 1-7
- Low-Communication Asynchronous Distributed Generalized Canonical Polyadic Tensor Decomposition. Cannada Lewis, Eric Phipps. 1-5
- Large Scale String Analytics in Arkouda. Zhihui Du, Oliver Alvarado Rodriguez, David A. Bader. 1-7
- Design of Asynchronous Polymorphic Logic Gates for Hardware Security. Chandler Bernard, William Bryant, Richard Becker, Jia Di. 1-5
- System-Level Modeling of GPU/FPGA Clusters for Molecular Dynamics Simulations. Chunshu Wu, Sahan Bandara, Tong Geng, Vipin Sachdeva, Woody Sherman, Martin C. Herbordt. 1-8
- An All-at-Once CP Decomposition Method for Count Tensors. Teresa M. Ranadive, Muthu M. Baskaran. 1-8
- Toward HDL Extensions for Rapid AI/ML Accelerator Generation. Ryan Kabrick, John D. Leidel, David Donofrio. 1-6
- Performance Evaluation of Mixed-Precision Runge-Kutta Methods. Ben Burnett, Sigal Gottlieb, Zachary J. Grant, Alfa R. H. Heryudono. 1-6
- Embedded Compute Matrix Processing and FFTs using Floating Point FPGAs. Michael Parker. 1-5
- Inverse-Deletion BFS - Revisiting Static Graph BFS Traversals with Dynamic Graph Operations. Oded Green. 1-7
- Rapid Configuration of Asynchronous Recurrent Neural Networks for ASIC Implementations. Spencer Nelson, Wassim Khalil, Sang Yun Kim, Jia Di, Zhe Zhou, Zhihang Yuan, Guang-Yu Sun. 1-6
- Performance Study of GPU applications using SYCL and CUDA on Tesla V100 GPU. Goutham Kalikrishna Reddy Kuncham, Rahul Vaidya, Mahesh Barve. 1-7
- Deluge: Achieving Superior Efficiency, Throughput, and Scalability with Actor Based Streaming on Migrating Threads. Brian A. Page, Peter M. Kogge. 1-6
- Serving Machine Learning Inference Using Heterogeneous Hardware. Baolin Li, Vijay Gadepally, Siddharth Samsi, Mark S. Veillette, Devesh Tiwari. 1-8
- Efficient Scheduling of Dependent Tasks in Many-Core Real-Time System Using a Hardware Scheduler. Amin Norollah, Zahra Kazemi, Niloufar Sayadi, Hakem Beitollahi, Mahdi Fazeli, David Hély. 1-7
- Towards Distributed Square Counting in Large Graphs. Trevor Steil, Geoffrey Sanders, Roger Pearce. 1-7
- WASP: A Wearable Super-Computing Platform for Distributed Intelligence in Multi-Agent Systems. Chinmaya Patnayak, James E. McClure, Ryan K. Williams. 1-7
- Realizing Forward Defense in the Cyber Domain. Sandeep Pisharody, Jonathan Bernays, Vijay Gadepally, Michael Jones, Jeremy Kepner, Chad R. Meiners, Peter Michaleas, Adam Tse, Doug Stetson. 1-7
- Reconfigurable Hardware Root-of-Trust for Secure Edge Processing. Alan Ehret, Eliakin Del Rosario, Carsten Schwicking, Karen Gettings, Michel A. Kinsy. 1-7
- HyPC-Map: A Hybrid Parallel Community Detection Algorithm Using Information-Theoretic Approach. Md Abdul Motaleb Faysal, Shaikh Arifuzzaman, Cy P. Chan, Maximilian H. Bremer, Doru Popovici, John Shalf. 1-8
- Exploring the Tradeoff Between Reliability and Performance in HPC Systems. Craig Walker, Braeden Slade, Gavin Bailey, Nicklaus Przybylski, Nathan DeBardeleben, William M. Jones. 1-7
- Survey and Future Trends for FPGA Cloud Architectures. Hafsah Shahzad, Ahmed Sanaullah, Martin C. Herbordt. 1-10
- Maneuver Identification Challenge. Kaira Samuel, Vijay Gadepally, David Jacobs, Michael Jones, Kyle McAlpin, Kyle Palko, Ben Paulk, Sid Samsi, Ho Chit Siu, Charles Yee, Jeremy Kepner. 1-7
- Efficient Neighbor-Sampling-based GNN Training on CPU-FPGA Heterogeneous Platform. Bingyi Zhang, Sanmukh R. Kuppannagari, Rajgopal Kannan, Viktor K. Prasanna. 1-7
- Modeling Data Movement Performance on Heterogeneous Architectures. Amanda Bienz, Luke N. Olson, William D. Gropp, Shelby Lockhart. 1-7
- Workload Imbalance in HPC Applications: Effect on Performance of In-Network Processing. Pouya Haghi, Anqi Guo, Tong Geng, Anthony Skjellum, Martin C. Herbordt. 1-8
- Graph Embedding and Field Based Detection of Non-Local Webs in Large Scale Free Networks. Michael E. Franusich, Franz Franchetti. 1-7
- Streaming Detection and Classification Performance of a POWER9 Edge Supercomputer. Wesley Brewer, Chris Geyer, Dardo Kleiner, Connor Horne. 1-7
- Using Monitoring Data to Improve HPC Performance via Network-Data-Driven Allocation. Yijia Zhang, Burak Aksar, Omar Aaziz, Benjamin Schwaller, Jim M. Brandt, Vitus J. Leung, Manuel Egele, Ayse K. Coskun. 1-7
- Classification frameworks comparison on 3D point clouds. F. Patricia Medina, Randy C. Paffenroth. 1-6
- Model Quantization and Synthetic Aperture Data Analyses Increasing Throughput and Energy Efficiency. Mark Barnell, Courtney Raymond, Anthony Salmin, Daniel Brown, Darrek Isereau. 1-5
- Even Faster SNN Simulation with Lazy+Event-driven Plasticity and Shared Atomics. Dennis Bautembach, Iason Oikonomidis, Antonis A. Argyros. 1-8
- Distributed and Heterogeneous SAR Backprojection with Halide. Connor Imes, Tzu-Mao Li, Mark Glines, Rishi Khan, John Paul Walters. 1-9
- HyKernel: A Hybrid Selection of One/Two-Phase Kernels for Triangle Counting on GPUs. Mohammad Almasri, Neo Vasudeva, Rakesh Nagi, Jinjun Xiong, Wen-mei Hwu. 1-7
- A Novel Approach to Cyber Situational Awareness in Embedded Systems. Kyle Denney, Robert Lychev, Donato Kava, Alice Lee, Michael Vai, Nick Evancich, Richard Clark, David Lide, Kyung Joon Kwak, Jason H. Li, Michael Lynch, Kyle Tillotson, Walt Tirenin, Douglas Schafer. 1-5
- AI Accelerator Survey and Trends. Albert Reuther, Peter Michaleas, Michael Jones, Vijay Gadepally, Siddharth Samsi, Jeremy Kepner. 1-9
- Solving sparse linear systems with approximate inverse preconditioners on analog devices. Vasileios Kalantzis, Anshul Gupta, Lior Horesh, Tomasz Nowicki, Mark S. Squillante, Chai Wah Wu, Tayfun Gokmen, Haim Avron. 1-7
- Knowledge-guided Tensor Decomposition for Baselining and Anomaly Detection. Dimitri Leggas, Christopher J. Coley, Teresa M. Ranadive. 1-7
- Interrogating the performance of quantum annealing for the solution of steady-state subsurface flow. Jessie M. Henderson, Daniel O'Malley, Hari S. Viswanathan. 1-6
- Spatial Temporal Analysis of 40,000,000,000,000 Internet Darkspace Packets. Jeremy Kepner, Michael Jones, Daniel Andersen, Aydin Buluç, Chansup Byun, Kimberly C. Claffy, Timothy Davis, William Arcand, Jonathan Bernays, David Bestor, William Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Anna Klein, Chad R. Meiners, Lauren Milechin, Julie Mullen, Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Doug Stetson, Adam Tse, Charles Yee, Peter Michaleas. 1-8
- 3D Real-Time Supercomputer Monitoring. Bill Bergeron, Matthew Hubbell, Dylan Sequeira, Winter Williams, William Arcand, David Bestor, Chansup Byun, Vijay Gadepally, Michael Houle, Michael Jones, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Jeremy Kepner. 1-7
- Performance of a GPU-Based Radar Processor. Mark Bolding, Saul Crumpton, David Ediger, George Samo. 1-5
- Toward Performance Portable Programming for Heterogeneous Systems on a Chip: A Case Study with Qualcomm Snapdragon SoC. Anthony M. Cabrera, Seth Hitefield, Jungwon Kim, Seyong Lee, Narasinga Rao Miniskar, Jeffrey S. Vetter. 1-7
- A More Portable HeFFTe: Implementing a Fallback Algorithm for Scalable Fourier Transforms. Daniel Sharp, Miroslav Stoyanov, Stanimire Tomov, Jack J. Dongarra. 1-5
- Implications of Reduced Communication Precision in a Collocated Discontinuous Galerkin Finite Element Framework. Marcin Rogowski, Lisandro Dalcín, Matteo Parsani, David E. Keyes. 1-7
- Boundary Integral Solver Approaches for Particle Accelerator Simulation Problems and Deployment on NERSC Hardware. Julia Wei, Matthew Harper Langston, Pierre-David Letourneau, Matthew J. Morse, Larry Weintraub, Aimee Nogoy, Noah Amsel, Richard Lethin. 1-6
- Efficiently Building a Large Scale Dataset for Program Induction. Lauren Milechin, Javier Lopez-Contreras, Ferran Alet. 1-7
- Productive High-Performance k-Truss Decomposition on GPU Using Linear Algebra. Runze Wang, Linchen Yu, Qinggang Wang, Jie Xin, Long Zheng. 1-7
- *Sadasivan Sadas Shankar. 1-8
- The K-Core Decomposition Algorithm Under the Framework of GraphBLAS. Longlong Li, Hu Chen, Ping Li, Jie Han, Guanghui Wang, Gong Zhang. 1-7
- Pragmatic Benchmarking for Research Computing. Dennis Milechin, Ahmed Aly, Josh Bevan, Charlie Jahnke, Yun Shen, Brian Gregor. 1-6
- Vertical, Temporal, and Horizontal Scaling of Hierarchical Hypersparse GraphBLAS Matrices. Jeremy Kepner, Tim Davis, Chansup Byun, William Arcand, David Bestor, William Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Michael Jones, Anna Klein, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Peter Michaleas. 1-6
- DMM-GAPBS: Adapting the GAP Benchmark Suite to a Distributed Memory Model. Zach Hansen, Brody Williams, John D. Leidel, Xi Wang, Yong Chen. 1-8
- HLS Portability from Intel to Xilinx: A Case Study. Zhili Xiao, Roger D. Chamberlain, Anthony M. Cabrera. 1-8
- Spectral Graph Partitioning Using Geodesic Distance-based Projection. Yasunori Futamura, Ryota Wakaki, Tetsuya Sakurai. 1-7
- The GraphBLAS in Julia and Python: the PageRank and Triangle Centralities. Michel Pelletier, Will Kimmerer, Timothy A. Davis, Timothy G. Mattson. 1-7
- Software-Hardware Co-Optimization on Partial-Sum Problem for PIM-based Neural Network Accelerator. Qizhe Wu, Linfeng Tao, Huawen Liang, Wei Yuan, Teng Tian, Shuang Xue, Xi Jin. 1-7
- Non-Volatile Memory Accelerated Geometric Multi-Scale Resolution Analysis. Andrew Wood, Moshik Hershcovitch, Daniel G. Waddington, Sarel Cohen, Meredith Wolf, Hongjun Suh, Weiyu Zong, Peter Chin. 1-7
- A Comparison of Automatic Differentiation and Continuous Sensitivity Analysis for Derivatives of Differential Equation Solutions. Yingbo Ma, Vaibhav Dixit, Michael J. Innes, Xingjian Guo, Christopher Rackauckas. 1-9
- Timing-based side-channel attack and mitigation on PCIe connected distributed embedded systems. Salman Abdul Khaliq, Usman Ali, Omer Khan. 1-7
- The MIT Supercloud Dataset. Siddharth Samsi, Matthew L. Weiss, David Bestor, Baolin Li, Michael Jones, Albert Reuther, Daniel Edelman, William Arcand, Chansup Byun, John Holodnack, Matthew Hubbell, Jeremy Kepner, Anna Klein, Joseph McDonald, Adam Michaleas, Peter Michaleas, Lauren Milechin, Julia S. Mullen, Charles Yee, Benjamin Price, Andrew Prout, Antonio Rosa, Allan Vanterpool, Lindsey McEvoy, Anson Cheng, Devesh Tiwari, Vijay Gadepally. 1-8
- Towards Combining Error-bounded Lossy Compression and Cryptography for Scientific Data. Ruiwen Shan, Sheng Di, Jon C. Calhoun, Franck Cappello. 1-7
- Sparse Deep Neural Network Acceleration on HBM-Enabled FPGA Platform. Abhishek Kumar Jain, Sharan Kumar, Aashish Tripathi, Dinesh Gaitonde. 1-7
- A High-Performance Heterogeneous Critical Path Analysis Framework. Yasin Zamani, Tsung-Wei Huang. 1-7
- GCN Inference Acceleration using High-Level Synthesis. Yi-Chien Lin, Bingyi Zhang, Viktor K. Prasanna. 1-6
- Reconfigurable Low-latency Memory System for Sparse Matricized Tensor Times Khatri-Rao Product on FPGA. Sasindu Wijeratne, Rajgopal Kannan, Viktor K. Prasanna. 1-7
- Improved Compression for Word Embeddings by Scaling Principal Components. Joseph McDonald, Siddharth Samsi, Daniel Edelman, Chansup Byun, Jeremy Kepner, Vijay Gadepally. 1-7
- A GraphBLAS Implementation of Triangle Centrality. Fuhuan Li, David A. Bader. 1-2
- Faster Stochastic Block Partition Using Aggressive Initial Merging, Compressed Representation, and Parallelism Control. Ahsen J. Uppal, Jaeseok Choi, Thomas B. Rolinger, H. Howie Huang. 1-7
- Digraph Clustering by the BlueRed Method. Tiancheng Liu, Dimitris Floros, Nikos Pitsianis, Xiaobai Sun. 1-7
- Fusing Non Element-wise Layers in DNNs. Upasana Sridhar, Tze Meng Low, Martin D. Schatz. 1-2
- Privateer: Multi-versioned Memory-mapped Data Stores for High-Performance Data Science. Karim Youssef, Keita Iwabuchi, Wu-chun Feng, Roger Pearce. 1-7
- Supercomputing Enabled Deployable Analytics for Disaster Response. Kaira Samuel, Jeremy Kepner, Michael Jones, Lauren Milechin, Vijay Gadepally, William Arcand, David Bestor, William Bergeron, Chansup Byun, Matthew Hubbell, Michael Houle, Anna Klein, Victor Lopez, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Sid Samsi, Charles Yee, Peter Michaleas. 1-5
- Filtered Tensor Construction and Decomposition for Drug Repositioning. Dimitri Leggas, Muthu Manikandan Baskaran, James R. Ezick, Brendan von Hofe. 1-7
- Are van Emde Boas trees viable on the GPU? Benedikt Mayr, Alexander Weinrauch, Mathias Parger, Markus Steinberger. 1-7
- Instance Segmentation of Neuronal Nuclei Leveraging Domain Adaptation. Kevin Brady, Pooya Khorrami, Lars Gjesteby, Laura J. Brattain. 1-5
- Using Computation Effectively for Scalable Poisson Tensor Factorization: Comparing Methods Beyond Computational Efficiency. Jeremy M. Myers, Daniel M. Dunlavy. 1-7
- Benchmarking the Processing of Aircraft Tracks with Triples Mode and Self-Scheduling. Andrew J. Weinert, Marc Brittain, Ngaire Underhill, Christine Serres. 1-8
- An interface for multidimensional arrays in Arkouda. Mitesh Kothari, Richard W. Vuduc. 1-2
- Enabling Exploratory Large Scale Graph Analytics through Arkouda. Zhihui Du, Oliver Alvarado Rodriguez, David A. Bader. 1-7
- An Efficient Algorithm for the Construction of Dynamically Updating Trajectory Networks. Deniz Gurevin, Chris J. Michael, Omer Khan. 1-7
- Optimized Quantum Circuit Generation with SPIRAL. Scott Mionis, Franz Franchetti, Jason Larkin. 1-7
- IRIS: A Portable Runtime System Exploiting Multiple Heterogeneous Programming Systems. Jungwon Kim, Seyong Lee, Beau Johnston, Jeffrey S. Vetter. 1-8
- Machine Learning Fairness is Computationally Difficult and Algorithmically Unsatisfactorily Solved. Mike H. M. Teodorescu, Xinyu Yao. 1-8
- HARDROID: Transparent Integration of Crypto Accelerators in Android. Luca Piccolboni, Giuseppe Di Guglielmo, Simha Sethumadhavan, Luca P. Carloni. 1-8
- Detection of Multiple Crop Diseases using Image Processing Techniques. Akanksha Soni, Jeetendra Kumar Soni, Surabhi Hota. 1-6
- A Survey and Taxonomy of Blockchain-based Payment Channel Networks. Haleh Khojasteh, Hirad Tabatabaei. 1-8
- DPGS Graph Summarization Preserves Community Structure. Lisa J. K. Durbeck, Peter Athanas. 1-9
- Performance Portability of an SpMV Kernel Across Scientific Computing and Data Science Applications. Stephen L. Olivier, Nathan D. Ellingwood, Jonathan W. Berry, Daniel M. Dunlavy. 1-8
- Fast Sparse Deep Neural Network Inference with Flexible SpMM Optimization Space Exploration. Jie Xin, Xianqi Ye, Long Zheng, Qinggang Wang, Yu Huang, Pengcheng Yao, Linchen Yu, Xiaofei Liao, Hai Jin. 1-7
- Node-Based Job Scheduling for Large Scale Simulations of Short Running Jobs. Chansup Byun, William Arcand, David Bestor, Bill Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Michael Jones, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Jeremy Kepner. 1-7
- A Survey: Handling Irregularities in Neural Network Acceleration with FPGAs. Tong Geng, Chunshu Wu, Cheng Tan, Chenhao Xie, Anqi Guo, Pouya Haghi, Sarah Yuan He, Jiajia Li, Martin C. Herbordt, Ang Li. 1-8