Parallel I/O


52 Parallel I/O

For a great discussion see [Mendez:ParallelIOpage], from which the figures here are taken.

52.1 Use sequential I/O


MPI processes can do anything a regular process can, including opening a file. This is the simplest form of parallel I/O: every MPI process opens its own file. To prevent write collisions,

  • you use MPI_Comm_rank to generate a unique file name (as sketched below), or
  • you use a local file system, typically /tmp, that is unique per process, or at least per group of processes on a node.

For reading it is actually possible for all processes to open the same file, but for writing this is not really feasible. Hence the unique files.
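
A minimal sketch of this one-file-per-process approach, where the file name pattern output-XXXX.dat is a hypothetical example and the rank guarantees uniqueness:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc,char **argv) {
      MPI_Init(&argc,&argv);
      int procno;
      MPI_Comm_rank(MPI_COMM_WORLD,&procno);

      // the pattern "output-%04d.dat" is an arbitrary example;
      // including the rank means no two processes write to the same file
      char filename[32];
      snprintf(filename,sizeof(filename),"output-%04d.dat",procno);

      FILE *outfile = fopen(filename,"w");
      fprintf(outfile,"Data from process %d\n",procno);
      fclose(outfile);

      MPI_Finalize();
      return 0;
    }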

52.2 MPI I/O


In the chapter MPI topic: File I/O we discussed MPI I/O. This is a way for all processes in a communicator to open a single file, and write to it in a coordinated fashion. This has the big advantage that the end result is an ordinary Unix file.
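
A minimal sketch, assuming a hypothetical file name shared.dat, in which every process writes its rank at its own offset of one collectively opened file:

    #include <mpi.h>

    int main(int argc,char **argv) {
      MPI_Init(&argc,&argv);
      int procno;
      MPI_Comm_rank(MPI_COMM_WORLD,&procno);

      // all processes collectively open one file; "shared.dat" is an example name
      MPI_File mpifile;
      MPI_File_open(MPI_COMM_WORLD,"shared.dat",
                    MPI_MODE_CREATE | MPI_MODE_WRONLY,MPI_INFO_NULL,&mpifile);

      // each process writes one integer at an offset determined by its rank
      MPI_Offset offset = procno*sizeof(int);
      MPI_File_write_at(mpifile,offset,&procno,1,MPI_INT,MPI_STATUS_IGNORE);

      MPI_File_close(&mpifile);
      MPI_Finalize();
      return 0;
    }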

52.3 Higher level libraries


Libraries such as NetCDF or HDF5 offer advantages over MPI I/O:

  • Files can be OS-independent, removing concerns such as whether data is stored in little-endian or big-endian byte order.
  • Files are self-documenting: they contain the metadata describing their contents.
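
As an illustration of the self-documenting aspect, here is a minimal serial HDF5 sketch (the file and dataset names are arbitrary examples) that stores a named, typed, dimensioned dataset; a tool such as h5dump can later recover this metadata without any outside description:

    #include "hdf5.h"

    int main(void) {
      // "example.h5" and "/temperature" are arbitrary example names
      hid_t file = H5Fcreate("example.h5",H5F_ACC_TRUNC,H5P_DEFAULT,H5P_DEFAULT);

      // the dataspace and datatype become part of the file's metadata,
      // so the file describes its own contents
      hsize_t dims[1] = {10};
      hid_t space = H5Screate_simple(1,dims,NULL);
      hid_t dset  = H5Dcreate(file,"/temperature",H5T_NATIVE_DOUBLE,space,
                              H5P_DEFAULT,H5P_DEFAULT,H5P_DEFAULT);

      double data[10];
      for (int i=0; i<10; i++) data[i] = 0.5*i;
      H5Dwrite(dset,H5T_NATIVE_DOUBLE,H5S_ALL,H5S_ALL,H5P_DEFAULT,data);

      H5Dclose(dset); H5Sclose(space); H5Fclose(file);
      return 0;
    }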
