IEEE 24th International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2012, New York, NY, USA, October 24-26, 2012

IEEE 24th International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2012, New York, NY, USA, October 24-26, 2012. IEEE, 2012. [doi]

Conference: sbac-pad2012

The Network Adapter: The Missing Link between MPI Applications and Network Performance. Germán Rodríguez, Cyriel Minkenberg, Ronald P. Luijten, Ramón Beivide, Patrick Geoffray, Jesús Labarta, Mateo Valero, Steve Poole. 1-8 [doi]

HAT: Heterogeneous Adaptive Throttling for On-Chip Networks. Kevin Kai-Wei Chang, Rachata Ausavarungnirun, Chris Fallin, Onur Mutlu. 9-18 [doi]

On the Efficiency of Register File versus Broadcast Interconnect for Collective Communications in Data-Parallel Hardware Accelerators. Ardavan Pedram, Andreas Gerstlauer, Robert A. van de Geijn. 19-26 [doi]

Network Endpoints for Clusters of SMPs. Gabriel Ilie Tanase, Gheorghe Almasi, Hanhong Xue, Charles Archer. 27-34 [doi]

Assessing Energy Efficiency of Fault Tolerance Protocols for HPC Systems. Esteban Meneses, Osman Sarood, Laxmikant V. Kalé. 35-42 [doi]

Using Heterogeneous Networks to Improve Energy Efficiency in Direct Coherence Protocols for Many-Core CMPs. Alberto Ros, Ricardo Fernández Pascual, Manuel E. Acacio. 43-50 [doi]

Energy Savings via Dead Sub-Block Prediction. Marco Antonio Zanata Alves, Khubaib, Eiman Ebrahimi, Veynu Narasiman, Carlos Villavieja, Philippe Olivier Alexandre Navaux, Yale N. Patt. 51-58 [doi]

Scalable Thread Scheduling in Asymmetric Multicores for Power Efficiency. Rance Rodrigues, Arunachalam Annamalai, Israel Koren, Sandip Kundu. 59-66 [doi]

Divergence Analysis with Affine Constraints. Diogo Sampaio, Rafael Martins, Sylvain Collange, Fernando Magno Quintão Pereira. 67-74 [doi]

Exploiting Concurrent GPU Operations for Efficient Work Stealing on Multi-GPUs. João V. F. Lima, Thierry Gautier, Nicolas Maillard, Vincent Danjean. 75-82 [doi]

Sparse Fast Fourier Transform on GPUs and Multi-core CPUs. Jiaxi Hu, Zhaosen Wang, Qiyuan Qiu, Weijun Xiao, David J. Lilja. 83-91 [doi]

Cloud Workload Analysis with SWAT. Mauricio Breternitz, Keith Lowery, Anton Charnoff, Patryk Kaminski, Leonardo Piga. 92-99 [doi]

Scalable Algorithms for Distributed-Memory Adaptive Mesh Refinement. Akhil Langer, Jonathan Lifflander, Phil Miller, Kuo-Chuan Pan, Laxmikant V. Kalé, Paul M. Ricker. 100-107 [doi]

Compression Speed Enhancements to LZO for Multi-core Systems. Jason Kane, Qing Yang. 108-115 [doi]

Parallelizing Information Set Generation for Game Tree Search Applications. Mark Richards, Abhishek Gupta, Osman Sarood, Laxmikant V. Kalé. 116-123 [doi]

A Parallel Implementation of Gomory-Hu's Cut Tree Algorithm. Jaime Cohen, Luiz A. Rodrigues, Elias Procópio Duarte Jr. 124-131 [doi]

Beyond CPU Frequency Scaling for a Fine-grained Energy Control of HPC Systems. Ghislain Landry Tsafack Chetsa, Laurent Lefèvre, Jean-Marc Pierson, Patricia Stolf, Georges Da Costa. 132-138 [doi]

BTL: A Framework for Measuring and Modeling Energy in Memory Hierarchies. Ioannis Manousakis, Dimitrios S. Nikolopoulos. 139-146 [doi]

Energy-Performance Tradeoffs in Software Transactional Memory. Alexandro Baldassin, Joao P. L. de Carvalho, Leonardo A. G. Garcia, Rodolfo Azevedo. 147-154 [doi]

Runtime Procedure for Energy Savings in Applications with Point-to-Point Communications. Vaibhav Sundriyal, Masha Sosonkina, Alexander Gaenko. 155-162 [doi]

Scalable Triadic Analysis of Large-Scale Graphs: Multi-core vs. Multi-processor vs. Multi-threaded Shared Memory Architectures. George Chin Jr., Andrès Márquez, Sutanay Choudhury, John Feo. 163-170 [doi]

Efficient Sorting on the Tilera Manycore Architecture. Alessandro Morari, Antonino Tumeo, Oreste Villa, Simone Secchi, Mateo Valero. 171-178 [doi]

Level-3 BLAS on the TI C6678 Multi-core DSP. Murtaza Ali, Eric Stotzer, Francisco D. Igual, Robert A. van de Geijn. 179-186 [doi]

Parallel Exact Inference on Multicore Using MapReduce. Nam Ma, Yinglong Xia, Viktor K. Prasanna. 187-194 [doi]

An OS-Hypervisor Infrastructure for Automated OS Crash Diagnosis and Recovery in a Virtualized Environment. Joefon Jann, R. Sarma Burugula, Ching-Farn E. Wu, Kaoutar El Maghraoui. 195-202 [doi]

VPC: Scalable, Low Downtime Checkpointing for Virtual Clusters. Peng Lu, Binoy Ravindran, Changsoo Kim. 203-210 [doi]

FusedOS: Fusing LWK Performance with FWK Functionality in a Heterogeneous Environment. Yoonho Park, Eric Van Hensbergen, Marius Hillenbrand, Todd Inglett, Bryan S. Rosenburg, Kyung Dong Ryu, Robert W. Wisniewski. 211-218 [doi]

Transactional Forwarding: Supporting Highly-Concurrent STM in Asynchronous Distributed Systems. Mohamed M. Saad, Binoy Ravindran. 219-226 [doi]

Exploiting Phase-Change Memory in Cooperative Caches. Luiz E. Ramos, Ricardo Bianchini. 227-234 [doi]

Global Data Re-allocation via Communication Aggregation in Chapel. Alberto Sanz, Rafael Asenjo, Juan López, Rafael Larrosa, Angeles G. Navarro, Vassily Litvinov, Sung-Eun Choi, Bradford L. Chamberlain. 235-242 [doi]

Integrating Dataflow Abstractions into the Shared Memory Model. Vladimir Gajinov, Srdjan Stipic, Osman S. Unsal, Tim Harris, Eduard Ayguadé, Adrián Cristal. 243-251 [doi]

CSHARP: Coherence and SHaring Aware Cache Replacement Policies for Parallel Applications. Biswabandan Panda, Shankar Balachandran. 252-259 [doi]

Low Overhead Instruction-Cache Modeling Using Instruction Reuse Profiles. Muneeb Khan, Andreas Sembrant, Erik Hagersten. 260-269 [doi]

Data and Instruction Uniformity in Minimal Multi-threading. Teo Milanez, Sylvain Collange, Fernando Magno Quintão Pereira, Wagner Meira Jr., Renato Ferreira. 270-277 [doi]

ACCGen: An Automatic ArchC Compiler Generator. Rafael Auler, Paulo Centoducatte, Edson Borin. 278-285 [doi]

Efficiently Handling Memory Accesses to Improve QoS in Multicore Systems under Real-Time Constraints. José Luis March, Salvador Petit, Julio Sahuquillo, Houcine Hassan, José Duato. 286-293 [doi]
