OpenMP topic: Work sharing

19.1 : Sections
19.2 : Single/master
19.3 : Fortran array syntax parallelization

19 OpenMP topic: Work sharing

The declaration of a parallel region establishes a team of threads. This offers the possibility of parallelism, but to actually get meaningful parallel activity you need something more. OpenMP uses the concept of a work sharing construct: a way of dividing parallelizable work over a team of threads. The work sharing constructs are:

  • for (for C) or do (for Fortran). The threads divide up the loop iterations among themselves; see section 18.1.
  • sections The threads divide a fixed number of sections between themselves; see section 19.1.
  • single The section is executed by a single thread; section 19.2.
  • task See section 22.4.
  • workshare Can parallelize Fortran array syntax; section 19.3.

19.1 Sections


A parallel loop is an example of independent work units that are numbered. If you have a pre-determined number of independent work units, the sections construct is more appropriate. A sections construct can contain any number of section constructs. These need to be independent, and they can be executed by any available thread in the current team, including having multiple sections done by the same thread. Note that a sections construct has to appear inside a parallel region, or be combined with it as parallel sections.

#pragma omp sections
{
#pragma omp section
  // one calculation
#pragma omp section
  // another calculation
}
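
For instance, a minimal runnable version (the print statements and the call to omp_get_thread_num are my own additions, not from the text) shows which thread picks up each section:

#include <stdio.h>
#include <omp.h>

int main() {
#pragma omp parallel sections
  {
#pragma omp section
    printf("section 1 executed by thread %d\n",omp_get_thread_num());
#pragma omp section
    printf("section 2 executed by thread %d\n",omp_get_thread_num());
  }
  return 0;
}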

This construct can be used to divide large blocks of independent work. Suppose that in the following line, both f(x) and g(x) are big calculations:

  y = f(x) + g(x)

You could then write

double y1,y2;
#pragma omp parallel sections
{
#pragma omp section
  y1 = f(x);
#pragma omp section
  y2 = g(x);
}
y = y1+y2;

Instead of using two temporaries, you could also use a critical section (see section 22.2.1); a sketch of that alternative appears after the reduction example below. However, the best solution is to have a reduction clause on the parallel sections directive. For the sum

  y = f(x) + g(x)

you could then write

// sectionreduct.c
#pragma omp parallel reduction(+:y)
#pragma omp sections
    {
#pragma omp section
      y += f();
#pragma omp section
      y += g();
    }
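
The critical-section alternative mentioned above could look like the following sketch (assuming f, g, and x as before; the temporary t is my addition, so that the expensive calculation stays outside the critical region):

double y = 0.;
#pragma omp parallel sections
{
#pragma omp section
  {
    double t = f(x);
#pragma omp critical
    y += t;
  }
#pragma omp section
  {
    double t = g(x);
#pragma omp critical
    y += t;
  }
}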

19.2 Single/master


The single and master pragmas limit the execution of a block to a single thread. This can for instance be used to print tracing information or to perform I/O operations.

#pragma omp parallel
{
#pragma omp single
  printf("We are starting this section!\n");
  // parallel stuff
}

Another use of single is to perform initializations in a parallel region:

int a;
#pragma omp parallel
{
  #pragma omp single
    a = f(); // some computation
  #pragma omp sections
    // various different computations using a
}

The point of the single directive in this last example is that the computation needs to be done only once, because of the shared memory. Since it's a work sharing construct there is an implicit barrier after it, which guarantees that all threads have the correct value in their local memory (see section 25.3).

Exercise

What is the difference between this approach and how the same computation would be parallelized in MPI?

The master directive also enforces execution on a single thread, specifically the master thread of the team, but it does not have the synchronization of an implicit barrier.
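
For instance, tracing output can be restricted to the master thread without forcing the other threads to wait; a minimal sketch (the print statement is my own illustration):

#pragma omp parallel
{
  // ... parallel work ...
#pragma omp master
  printf("progress report from the master thread\n");
  // no implicit barrier: the other threads do not wait here
}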

Exercise

Modify the above code to read:

int a;
#pragma omp parallel
{
  #pragma omp master
    a = f(); // some computation
  #pragma omp sections
    // various different computations using a
}

This code is no longer correct. Explain.

Above we motivated the single directive as a way of initializing shared variables. It is also possible to use single to initialize private variables. In that case you add the copyprivate clause, which broadcasts the value computed in the single block to the private copies of the other threads. This is a good solution if setting the variable requires I/O.
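
A minimal sketch of this mechanism (the function read_config_value is hypothetical):

int a;
#pragma omp parallel private(a)
{
#pragma omp single copyprivate(a)
  a = read_config_value(); // done by one thread, for instance reading a file
  // after the implicit barrier, every thread's private copy of a has this value
}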

Exercise

Give two other ways to initialize a private variable, with all threads receiving the same value. Can you give scenarios where each of the three strategies would be preferable?

19.3 Fortran array syntax parallelization


The parallel do (Fortran) and parallel for (C) directives are used to parallelize explicit loops. However, Fortran also has implied loops in its array syntax. To parallelize these you can use the workshare directive, which exists only in Fortran; it applies to the implied loops of array syntax, as well as to forall loops.
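
A minimal Fortran sketch (the arrays and the addition are my own example):

real, dimension(1000) :: a,b,c
a = 1. ; b = 2.
!$omp parallel
!$omp workshare
c = a + b   ! the implied element-wise loop is divided over the threads
!$omp end workshare
!$omp end parallel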
