This chapter explains the basic concepts of CAF, and helps you get started on running your first program.
CAF is built on the same SPMD design as MPI. Where MPI talks about processes or ranks, CAF calls the running instances of your program images.
The Intel compiler uses the flag -coarray=xxx with values single, shared, distributed, gpu.
It is possible to bake the number of images into the executable, but by default this is not done: it is determined at runtime by the environment variable FOR_COARRAY_NUM_IMAGES. CAF cannot be mixed with OpenMP.
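For instance, a shared-memory build that runs with four images might look as follows (the file and executable names are illustrative):

ifort -coarray=shared hello.F90 -o hello
FOR_COARRAY_NUM_IMAGES=4 ./hello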
Co-arrays are defined by giving them, in addition to the dimension, a codimension:
Complex,codimension[*] :: number
Integer,dimension(10,20,30),codimension[-1:1,*] :: grid
This means we are respectively declaring a single number on each image, and a three-dimensional grid spread over a two-dimensional processor grid.
A more traditional-looking syntax can also be used:
Complex :: number[*]
Integer :: grid(10,20,30)[-1:1,*]
Unlike MPI, which normally only supports a linear process numbering, CAF allows for multi-dimensional process grids. The last codimension is always specified as *, meaning that it is determined at runtime.
As in other models, in CAF one can ask how many images/processes there are, and what the number of the current one is, with num_images() and this_image() respectively.
! hello.F90
write(*,*) "Hello from image ", this_image(), &
     " out of ", num_images(), " total images"
If you call this_image with a co-array as argument, it will return the image index as a tuple of cosubscripts, rather than a linear index. Given such a set of subscripts, image_index will return the linear index.
The functions lcobound and ucobound give the lower and upper bounds on the image cosubscripts: a tuple if called with just a co-array variable, or a single bound if a codimension is also specified.
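As a minimal sketch of these inquiry functions, reusing the grid declaration from above (the program name is illustrative):

program inquiry
  implicit none
  integer :: grid(10,20,30)[-1:1,*]
  integer :: cosub(2), linear
  cosub  = this_image(grid)          ! cosubscript tuple of this image
  linear = image_index(grid, cosub)  ! back to the linear image index
  print *, "cosubscripts:", cosub, " linear index:", linear
  print *, "cobounds:", lcobound(grid), " to ", ucobound(grid)
end program inquiry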
The appeal of CAF is that moving data between images looks (almost) like an ordinary copy operation:
real :: x(2)[*]
integer :: p
p = this_image()
x(1)[ p+1 ] = x(2)[ p ]
Exchanging grid boundaries is elegantly done with array syntax:
Real,Dimension( 0:N+1,0:N+1 ),Codimension[*] :: grid
grid( N+1,: )[p] = grid( 1,: )[p+1]
grid( 0,: )[p]   = grid( N,: )[p-1]
The Fortran standard forbids race conditions:
If a variable is defined on an image in a segment, it shall not be referenced, defined or become undefined in a segment on another image unless the segments are ordered.
That is, you should not cause them to happen. The language and runtime are certainly not going to help you with that.
Well, a little. After remote updates you can synchronize images with the sync statement. The easiest variant is a global synchronization:

sync all
Compare this to a wait call after MPI nonblocking calls.
More fine-grained, one can synchronize with specific images:
sync images( (/ p-1,p,p+1 /) )
While remote operations in CAF are nicely one-sided, synchronization is not: if image p issues a call

sync images( (/ q /) )

then q also needs to issue a mirroring call to synchronize with p.
As an illustration, the following code is not a correct implementation of a ping-pong: between the two global synchronizations, image 1 defines number[2] while image 2 references it, and these segments are unordered.
! pingpong.F90
sync all
if (procid==1) then
  number[procid+1] = number[procid]
else if (procid==2) then
  number[procid-1] = 2*number[procid]
end if
sync all
We can solve this with a global synchronization:
sync all
if (procid==1) &
  number[procid+1] = number[procid]
sync all
if (procid==2) &
  number[procid-1] = 2*number[procid]
sync all
A more fine-grained solution synchronizes only the two images involved:

if (procid==1) &
  number[procid+1] = number[procid]
if (procid<=2) sync images( (/1,2/) )
if (procid==2) &
  number[procid-1] = 2*number[procid]
if (procid<=2) sync images( (/2,1/) )
As an example of how you would synchronize a collective operation:
if ( this_image() .eq. 1 ) sync images( * )
if ( this_image() .ne. 1 ) sync images( 1 )
Here image 1 synchronizes with all others, but the others don't synchronize with each other.
Similarly, each image can synchronize with just its left and right neighbors:

if (procid==1) then
  sync images( (/ procid+1 /) )
else if (procid==nprocs) then
  sync images( (/ procid-1 /) )
else
  sync images( (/ procid-1,procid+1 /) )
end if
Collectives are not part of CAF as of the 2008 Fortran standard; intrinsic collectives such as co_sum and co_broadcast were only added in Fortran 2018.
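In the meantime such operations have to be coded by hand. A minimal sketch of a broadcast from image 1, using only 2008 features (the names are illustrative):

program bcast
  implicit none
  real :: val[*]
  if (this_image() == 1) val = 3.14   ! only image 1 defines the value
  sync all                            ! order the definition before the references
  if (this_image() /= 1) val = val[1] ! every other image pulls from image 1
  sync all
end program bcast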