Afternoon Tutorial

OpenMP Implementation and Performance Issues

Eduard Ayguade, Mats Brorsson, Sven Karlsson, Xavier Martorell, Marc Gonzalez

OpenMP has become an important tool for bringing parallel computing to a larger community. However, for the supercomputing community, the first attempts to use OpenMP are sometimes discouraging in terms of performance, since users often believe that all they need to do is insert some directives at suitable places, e.g. at for/do-loops with independent iterations.
To understand the performance of an OpenMP program, it is important to understand how an OpenMP implementation can be built and how the synchronization and communication of a shared memory program are actually carried out on real hardware.
In this tutorial, we will cover the design and implementation of an OpenMP compilation system consisting of a source-to-source OpenMP translator and a run-time library. The source code for this system is freely available and will be distributed to tutorial participants. We will also discuss shared memory architecture details that affect performance, as well as issues in the performance analysis of OpenMP applications.

The software described in this tutorial has been developed partly within the EU project Intone under contract number IST-1999-20252.

 

Part I: OpenMP implementation: a case study

·        Run-time library implementation         
In this section we will describe a run-time library specifically targeted for OpenMP program execution. This part of the tutorial will go through some of the implementation details of this library.

·        OpenMP compiler implementation         
In this section we will go through the OpenMP constructs one by one and show how the OpenMP translator maps each of them to calls to the run-time library.

·        OdinMP: a hacker's guide        
The OpenMP compilation system has been developed partly to provide a free OpenMP implementation for application developers at universities and elsewhere. It has also been developed to allow experimentation with OpenMP extensions. This part of the tutorial will briefly introduce OdinMP, an OpenMP translator for C (with some C++ extensions), and show how it has been designed so that you can start modifying it yourself. The Fortran NanosCompiler, also available in the distribution, will be briefly discussed.
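
As a rough illustration of what a source-to-source translation of this kind can look like, the sketch below outlines a parallel loop into a separate function and hands it to a run-time library. It is not the code generated by OdinMP or the NanosCompiler: the names `rt_fork`, `rt_static_chunk`, `saxpy_outlined` and `struct saxpy_args` are hypothetical, and the toy `rt_fork` runs the team members one after another instead of dispatching them to worker threads, so that the example stays free of platform dependencies.

```c
#include <assert.h>
#include <stddef.h>

/* Source form written by the programmer:
 *
 *     #pragma omp parallel for schedule(static)
 *     for (i = 0; i < n; i++)
 *         y[i] = a * x[i] + y[i];
 */

/* Shared variables of the region, packed so they can be passed by pointer. */
struct saxpy_args {
    long n;
    float a;
    const float *x;
    float *y;
};

/* Hypothetical run-time call: compute the half-open iteration range
 * [*lo, *hi) that a static schedule gives to thread tid of nthreads. */
static void rt_static_chunk(long n, int tid, int nthreads, long *lo, long *hi)
{
    long base = n / nthreads;   /* iterations every thread receives */
    long extra = n % nthreads;  /* the first `extra` threads receive one more */
    *lo = tid * base + (tid < extra ? tid : extra);
    *hi = *lo + base + (tid < extra ? 1 : 0);
}

/* Outlined body of the parallel region, as a translator might emit it. */
static void saxpy_outlined(void *p, int tid, int nthreads)
{
    struct saxpy_args *s = p;
    long lo, hi;
    rt_static_chunk(s->n, tid, nthreads, &lo, &hi);
    for (long i = lo; i < hi; i++)
        s->y[i] = s->a * s->x[i] + s->y[i];
}

/* Hypothetical fork/join call. ASSUMPTION: this stand-in executes the
 * team members sequentially; a real run-time library would run them on
 * worker threads and wait for all of them at the end of the region. */
static void rt_fork(void (*fn)(void *, int, int), void *args, int nthreads)
{
    assert(nthreads > 0);
    for (int tid = 0; tid < nthreads; tid++)
        fn(args, tid, nthreads);
}

/* What the annotated loop becomes after translation. */
void saxpy_translated(long n, float a, const float *x, float *y, int nthreads)
{
    struct saxpy_args s = { n, a, x, y };
    rt_fork(saxpy_outlined, &s, nthreads);
}
```

Calling `saxpy_translated(n, a, x, y, 4)` computes the same result as the original loop, with the iteration space split into four contiguous chunks. The point of the sketch is the shape of the generated code: the loop body moves into its own function, the shared variables travel through a struct, and scheduling decisions become library calls.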

 

Part II: OpenMP performance issues

·        Performance analysis tools and techniques
In this section we will discuss the mechanisms available in the distribution for carrying out performance analysis of OpenMP applications.

·        Shared memory architectures  
In this section we will go through some of the issues that influence performance on shared memory architectures, both hardware-coherent (SMP and ccNUMA) and software-coherent (SDSM).
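
One well-known issue of this kind on cache-coherent machines, not named in the outline above and given here only as an example, is false sharing: threads that update distinct variables lying in the same cache line force that line to move between processors. The sketch below pads per-thread accumulators to one line each. The 64-byte line size, the slot count and the names `padded_counter` and `count_positive` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64   /* ASSUMPTION: cache line size in bytes */
#define NSLOTS 4        /* ASSUMPTION: number of independent work slots */

/* One accumulator per slot, aligned so that no two slots share a line. */
struct padded_counter {
    alignas(CACHE_LINE) long value;
};

static_assert(sizeof(struct padded_counter) == CACHE_LINE,
              "each counter should occupy exactly one cache line");

/* Count the positive entries of data[0..n). Each slot t scans its own
 * contiguous block and writes only to partial[t]. Without OpenMP support
 * the pragma is ignored and the slots are processed serially. */
long count_positive(const int *data, long n)
{
    struct padded_counter partial[NSLOTS] = {{0}};

    #pragma omp parallel for
    for (int t = 0; t < NSLOTS; t++) {
        long lo = n * t / NSLOTS;
        long hi = n * (t + 1) / NSLOTS;
        for (long i = lo; i < hi; i++)
            if (data[i] > 0)
                partial[t].value++;
    }

    long total = 0;
    for (int t = 0; t < NSLOTS; t++)
        total += partial[t].value;
    return total;
}
```

With a plain `long partial[NSLOTS]` the four counters would typically sit in a single line, and every increment by one thread could invalidate the copies held by the others. Whether the padding pays off depends on the machine and on how the compiler treats the inner loop, so it is worth measuring rather than assuming.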




The organisers can be contacted at ewomp2002@caspur.it