OpenMP topic: Memory model

Experimental html version of downloadable textbook, see http://www.tacc.utexas.edu/~eijkhout/istc/istc.html

25 OpenMP topic: Memory model

25.1 Thread synchronization


Let's look at a producer-consumer model. (This example is from Intel's excellent OMP course by Tim Mattson.) It can be implemented with sections: one section, the producer, sets a flag when data is available, and the other, the consumer, waits until the flag is set.

#pragma omp parallel sections
{
  // the producer
  #pragma omp section
  {
    ... do some producing work ...
    flag = 1;
  }
  // the consumer
  #pragma omp section
  {
    while (flag==0) { }
    ... do some consuming work ...
  }
}

One reason this does not work is that the compiler sees that the flag is never read in the producing section, and never written in the consuming section, so it may optimize these statements, to the point of optimizing them away.

The producer then needs to do:

... do some producing work ...
#pragma omp flush
#pragma omp atomic write
  flag = 1;
#pragma omp flush(flag)

and the consumer does:

#pragma omp flush(flag)
while (flag==0) {
  #pragma omp flush(flag)
}
#pragma omp flush

Strictly speaking, this code still has a race condition on the flag variable.

The solution is to make every access to the flag an atomic operation, using the atomic pragma: the producer has

#pragma omp atomic write
  flag = 1;

and the consumer:

while (1) {
  #pragma omp flush(flag)
  #pragma omp atomic read
    flag_read = flag;
  if (flag_read==1) break;
}

25.2 Data races


OpenMP, being based on shared memory, has a potential for race conditions. These happen when two threads access the same data item without synchronization, and at least one of the accesses is a write. The problem with race conditions is that programmer convenience runs counter to efficient execution. For this reason, OpenMP simply does not allow some things that would be desirable.

For a simple example:

// race.c
#pragma omp parallel for shared(counter)
  for (int i=0; i<count; i++)
    counter++;
  printf("Counter should be %d, is %d\n",
         count,counter);

The basic rule about multiple-thread access of a single data item is:

Any memory location that is written by one thread cannot be read by another thread in the same parallel region, if no synchronization is done.

To start with that last clause: any worksharing construct ends with an implicit barrier, so data written before that barrier can safely be read after it.

As an illustration of a possible problem:

c = d = 0;
#pragma omp sections
{
#pragma omp section
  { a = 1; c = b; }
#pragma omp section
  { b = 1; d = a; }
}

Under any reasonable interpretation of parallel execution, the possible values for c,d are $1,1$; $0,1$; or $1,0$. This is known as sequential consistency : the parallel outcome is consistent with a sequential execution that interleaves the parallel computations, respecting their local statement orderings. (See also Eijkhout:IntroHPC.)

However, without synchronization, threads are allowed to maintain a value for a variable locally that is not the same as the stored value. In this example, that means that the thread executing the first section need not write its value of  a to memory, and likewise b  in the second thread, so $0,0$ is in fact a possible outcome.

In order for a write by one thread to be visible to a read by another, the accesses have to be resolved as follows:

  1. Thread one writes the variable.
  2. Thread one flushes the variable.
  3. Thread two flushes the variable.
  4. Thread two reads the variable.

25.3 Relaxed memory model


A flush synchronizes a thread's temporary view of memory with main memory. A flush of all thread-visible variables happens implicitly in a number of places:

  • There is an implicit flush of all variables at the start and end of a parallel region .
  • There is a flush at each barrier, whether explicit or implicit, such as at the end of a worksharing construct.
  • There is a flush at entry and exit of a critical section .
  • There is a flush when a lock is set or unset.
