Best Practices with MPI

The MPI system attempts to clean up failed jobs by terminating all processes in the event that any one of the processes fails. However, MPI's automatic cleanup can take time, and cannot always detect failures. This sometimes leads to orphaned processes remaining in the system unbeknownst to the scheduler, which can cause problems for other jobs.

Hence, it is each programmer's responsibility to make sure that their MPI programs do not leave orphaned processes hanging around on the cluster. How this is accomplished depends on the particular program; however, every MPI program should follow these general rules:

  1. Check the return status of every MPI function/subroutine call and of every other statement in the program that could fail in a way that would prevent the program from running successfully. Other examples include memory allocations and file opens, reads, and writes. These are only examples, however. It is the programmer's responsibility to examine every line of code and check for possible failures. There should be no exceptions to this rule.
  2. Whenever a statement fails, the program should detect the failure and take appropriate action. If you cannot determine a course of action that would allow the program to continue, then simply perform any necessary cleanup work and terminate the process immediately. MPI programs that do not self-terminate may leave orphaned processes running on a cluster that interfere with other users' jobs.

Self-test

  1. How can an MPI program ensure that it doesn't leave orphaned processes running?