GPU Programming Models, Optimizations and Tuning

Half-day Tutorial at

The International Symposium on Code Generation and Optimization, CGO 2011
April 2, 2011
Le Majestic Congress Center, Chamonix, France

J. (Ram) Ramanujam
Department of Electrical and Computer Engineering
and Center for Computation and Technology
Louisiana State University
Baton Rouge, LA 70803, USA


P. (Saday) Sadayappan
Department of Computer Science and Engineering
The Ohio State University
Columbus, OH 43210, USA

 
Audience

This tutorial is targeted primarily at application developers, computer/computational scientists, and graduate students interested in programming models and/or compiler optimization issues for GPGPU computing. Knowledge of C programming and basic familiarity with processor architectures will be assumed; no prior parallel programming experience or familiarity with source-to-source transformations is required.

 
Brief Description

GPU-based parallel computing is of tremendous interest today because GPUs offer significantly higher peak performance than general-purpose multicore processors, as well as better energy efficiency. However, harnessing the power of GPUs is more complicated than programming general-purpose multicores. There has been considerable recent interest in two complementary approaches to assist application developers: higher-level programming models and environments for GPUs, and compiler transformations and tuning techniques for generating efficient GPU code.

This tutorial will provide an introductory survey covering both these aspects.

 
Lecture Outline

  1. Introduction
    • CPUs versus GPUs
    • Programming Models for GPUs
    • Compiler Transformations for GPUs
  2. GPU Architectures and Programming
    • GPU architectures
    • General-purpose computation on GPUs
    • Programming models and idioms
    • GPU programming models/environments:
      • OpenCL
      • CUDA
      • CAL
      • PGI Accelerator
      • CAPS/HMPP
    • Code examples on GPUs
    • Examples of CPU vs. GPU performance
  3. Compiler optimizations and tuning for GPUs
    • Performance characterization
    • Optimizing memory accesses
    • Multi-level parallelism exploitation
    • Performance models and empirical search
    • Compiler-driven tuning
    • Examples of application optimization
    • Software managed memory hierarchies
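To give a flavor of the kind of code examples covered in Parts 2 and 3, the following sketch (illustrative only, not taken from the tutorial materials) shows a minimal CUDA vector-addition program. Each GPU thread computes one output element, and consecutive threads access consecutive memory addresses, so the loads and stores coalesce — one of the memory-access optimizations discussed in Part 3.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Kernel: each thread computes one element of c = a + b.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)              // guard: the grid may be larger than n
        c[i] = a[i] + b[i]; // adjacent threads touch adjacent addresses
}                           // (coalesced global-memory access)

int main(void) {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Host-side setup.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Allocate device memory and copy inputs over.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // Copy the result back (implicitly synchronizes with the kernel).
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", h_c[0]);

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

Even this small example exposes the tuning questions the tutorial addresses: the choice of block size, the mapping of threads to data, and the layout of memory accesses all affect performance significantly.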

 
Tutorial Speakers

J. (Ram) Ramanujam received the B. Tech. degree in Electrical Engineering from the Indian Institute of Technology, Madras, India in 1983, and his M.S. and Ph.D. degrees in Computer Science from The Ohio State University in 1987 and 1990 respectively. He is currently the John E. and Beatrice L. Ritter Distinguished Professor in the Department of Electrical and Computer Engineering at Louisiana State University (LSU). In addition, he holds a joint faculty appointment with the LSU Center for Computation and Technology. His research interests are in compilers and runtime systems for high-performance computing, domain-specific languages and compilers for parallel computing, embedded systems, and high-level hardware synthesis. He has participated in several NSF-funded projects including the Tensor Contraction Engine and the Pluto project for automatic parallelization. Additional details can be found at http://www.ece.lsu.edu/jxr/.

P. (Saday) Sadayappan received the B. Tech. degree from the Indian Institute of Technology, Madras, India, and his M.S. and Ph.D. degrees from the State University of New York at Stony Brook, all in Electrical Engineering. He is currently a Professor in the Department of Computer Science and Engineering at The Ohio State University. His research interests include compiler/runtime optimization for parallel computing, and domain-specific languages for high-performance scientific computing. He has led several NSF-funded projects including the Tensor Contraction Engine and the Pluto project for automatic parallelization. Additional details can be found at http://www.cse.ohio-state.edu/~saday/.