# OpenMP topic: Memory model

##### Experimental html version of downloadable textbook, see http://www.tacc.utexas.edu/~eijkhout/istc/istc.html

## 25.1 Thread synchronization


Let's do a producer-consumer model. (This example is from Intel's excellent OMP course by Tim Mattson.) This can be implemented with sections, where one section, the producer, sets a flag when data is available, and the other, the consumer, waits until the flag is set.

#pragma omp parallel sections
{
  // the producer
  #pragma omp section
  {
    ... do some producing work ...
    flag = 1;
  }
  // the consumer
  #pragma omp section
  {
    while (flag==0) { }
    ... do some consuming work ...
  }
}


One reason this doesn't work is that the compiler sees that the flag is never read in the producing section, and never written in the consuming section, so it may cache or reorder these accesses, to the point of optimizing them away; the consumer can then spin forever on a stale value.

The producer then needs to do:

... do some producing work ...
#pragma omp flush
#pragma omp atomic write
flag = 1;
#pragma omp flush(flag)


and the consumer does:

#pragma omp flush(flag)
while (flag==0) {
  #pragma omp flush(flag)
}
#pragma omp flush
#pragma omp flush


This code strictly speaking has a race condition on the flag variable.

The solution is to make the flag accesses atomic operations, using the atomic pragma: the producer has

#pragma omp atomic write
flag = 1;


and the consumer:

while (1) {
  #pragma omp flush(flag)
  #pragma omp atomic read
  flg_tmp = flag;   // flg_tmp: a local copy of the flag
  if (flg_tmp==1) break;
}


## 25.2 Data races


OpenMP, being based on shared memory, has a potential for race conditions. These happen when two threads access the same data item, at least one of them writing, without intervening synchronization. The problem with race conditions is that programmer convenience runs counter to efficient execution. For this reason, OpenMP simply does not allow some things that would be desirable.

For a simple example:

// race.c
#pragma omp parallel for shared(counter)
for (int i=0; i<count; i++)
  counter++;
printf("Counter should be %d, is %d\n",
       count,counter);


The basic rule about multiple-thread access of a single data item is:

Any memory location that is written by one thread cannot be read by another thread in the same parallel region, if no synchronization is done.

To start with that last clause: any worksharing construct ends with an implicit barrier, so data written before that barrier can safely be read after it.

As an illustration of a possible problem:

c = d = 0;
#pragma omp sections
{
  #pragma omp section
  { a = 1; c = b; }
  #pragma omp section
  { b = 1; d = a; }
}


Under any reasonable interpretation of parallel execution, the possible values for c,d are $1,1$; $0,1$; or $1,0$. This is known as sequential consistency: the parallel outcome is consistent with a sequential execution that interleaves the parallel computations, respecting their local statement orderings. (See also Eijkhout:IntroHPC.)

However, without synchronization, threads are allowed to maintain a value for a variable locally that is not the same as the stored value. In this example, that means that the thread executing the first section need not write its value of a to memory, and likewise for b in the second thread, so $0,0$ is in fact a possible outcome.

In order to resolve multiple accesses, the following order of events is needed:

1. Thread one writes the variable.
2. Thread one flushes the variable.
3. Thread two flushes the variable.
4. Thread two reads the variable.