- CUDA-Lite: Reducing GPU Programming Complexity. Sain-Zee Ueng, Melvin Lathara, Sara S. Baghsorkhi, Wen-mei W. Hwu. pp. 1-15
- MCUDA: An Efficient Implementation of CUDA Kernels for Multi-core CPUs. John A. Stratton, Sam S. Stone, Wen-mei W. Hwu. pp. 16-30
- Automatic Pre-Fetch and Modulo Scheduling Transformations for the Cell BE Architecture. Nikola Vujic, Marc González, Xavier Martorell, Eduard Ayguadé. pp. 31-46
- Efficient Set Sharing Using ZBDDs. Mario Méndez-Lojo, Ondřej Lhoták, Manuel V. Hermenegildo. pp. 47-63
- Register Bank Assignment for Spatially Partitioned Processors. Behnam Robatmili, Katherine E. Coons, Doug Burger, Kathryn S. McKinley. pp. 64-79
- Smashing: Folding Space to Tile through Time. Nissa Osheim, Michelle Mills Strout, Dave Rostron, Sanjay V. Rajopadhye. pp. 80-93
- Identification of Heap-Carried Data Dependence Via Explicit Store Heap Models. Mark Marron, Darko Stefanovic, Deepak Kapur, Manuel V. Hermenegildo. pp. 94-108
- On the Scalability of an Automatically Parallelized Irregular Application. Martin Burtscher, Milind Kulkarni, Dimitrios Prountzos, Keshav Pingali. pp. 109-123
- Statistically Analyzing Execution Variance for Soft Real-Time Applications. Tushar Kumar, Romain Cledat, Jaswanth Sreeram, Santosh Pande. pp. 124-140
- Minimum Lock Assignment: A Method for Exploiting Concurrency among Critical Sections. Yuan Zhang, Vugranam C. Sreedhar, Weirong Zhu, Vivek Sarkar, Guang R. Gao. pp. 141-155
- Set-Congruence Dynamic Analysis for Thread-Level Speculation (TLS). Cosmin E. Oancea, Alan Mycroft. pp. 156-171
- Thread Safety through Partitions and Effect Agreements. Nicholas D. Matsakis, Thomas R. Gross. pp. 172-186
- P-Ray: A Software Suite for Multi-core Architecture Characterization. Alexandre Duchateau, Albert Sidelnik, María Jesús Garzarán, David A. Padua. pp. 187-201
- Scalable Implementation of Efficient Locality Approximation. Xipeng Shen, Jonathan Shaw. pp. 202-216
- P-OPT: Program-Directed Optimal Cache Management. Xiaoming Gu, Tongxin Bai, Yaoqing Gao, Chengliang Zhang, Roch Archambault, Chen Ding. pp. 217-231
- Compiler-Driven Dependence Profiling to Guide Program Parallelization. Peng Wu, Arun Kejariwal, Calin Cascaval. pp. 232-248
- gluepy: A Simple Distributed Python Programming Framework for Complex Grid Environments. Ken Hironaka, Hideo Saito, Kei Takahashi, Kenjiro Taura. pp. 249-263
- A Fully Parallel LISP2 Compactor with Preservation of the Sliding Properties. Xiao-Feng Li, Ligang Wang, Chen Yang. pp. 264-278
- A Case Study in Tightly Coupled Multi-paradigm Parallel Programming. Sayantan Chakravorty, Aaron Becker, Terry Wilmarth, Laxmikant V. Kalé. pp. 279-291
- ASYNC Loop Constructs for Relaxed Synchronization. Russell Meyers, Zhiyuan Li. pp. 292-303
- Design for Interoperability in stapl: pMatrices and Linear Algebra Algorithms. Antal A. Buss, Timmie G. Smith, Gabriel Tanase, Nathan L. Thomas, Mauro Bianco, Nancy M. Amato, Lawrence Rauchwerger. pp. 304-315
- Implementation of Sensitivity Analysis for Automatic Parallelization. Silvius Rus, Maikel Pennings, Lawrence Rauchwerger. pp. 316-330
- Just-In-Time Locality and Percolation for Optimizing Irregular Applications on a Manycore Architecture. Guangming Tan, Vugranam C. Sreedhar, Guang R. Gao. pp. 331-342
- Exploring the Optimization Space of Dense Linear Algebra Kernels. Qing Yi, Apan Qasem. pp. 343-355