Chapter 34. Programming for HPC

Table of Contents

34.1. Introduction
34.2. Common Uses for HPC
34.3. Real HPC Examples
34.4. HPC Programming Languages and Libraries
34.5. Shared Memory Parallel Programming with OpenMP
OMP Parallel
OMP Loops
Shared and Private Variables
Critical and Atomic Sections
34.6. Shared Memory Parallelism with POSIX Threads
34.7. Message Passing Interface (MPI)
34.8. MPI vs Other Parallel Paradigms
34.9. Structure of an MPI Job
34.10. A Simple MPI Program
34.11. Best Practices with MPI
34.12. Higher Level MPI Features
Parallel Message Passing
34.13. Process Distribution
34.14. Parallel Disk I/O

Before You Begin

Before reading this chapter, you should be familiar with basic Unix concepts (Chapter 3, Using Unix), the Unix shell (the section called “Command Line Interfaces (CLIs): Unix Shells”), redirection (the section called “Redirection and Pipes”), shell scripting (Chapter 4, Unix Shell Scripting), and have some experience with computer programming.


There are many computational problems that cannot be decomposed into completely independent subproblems.

High Performance Computing (HPC) refers to a class of distributed parallel computing problems where the processes are not completely independent of each other, but must cooperate in order to solve a problem. The processes within a job are more tightly coupled, i.e., they exchange information with each other while running.
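To make "tightly coupled" concrete, here is a minimal sketch using only Python's standard library (the tools actually used for HPC, OpenMP and MPI, are covered later in this chapter; the function names here are invented for illustration). Two processes each compute a partial sum, and each must wait for the other's partial result before it can produce the total:

```python
from multiprocessing import Pipe, Process

def worker(data, peer_conn, result_conn):
    """Compute a partial sum, then trade partials with a peer process.

    Neither process can finish alone: each blocks until the other's
    partial arrives -- a toy example of tightly coupled processes.
    """
    partial = sum(data)
    peer_conn.send(partial)             # hand our partial to the peer
    total = partial + peer_conn.recv()  # block until the peer's partial arrives
    result_conn.send(total)             # report the combined result

def run_pair():
    a_end, b_end = Pipe()        # communication link between the two workers
    res_a, child_a = Pipe()      # each worker reports its total back here
    res_b, child_b = Pipe()
    procs = [
        Process(target=worker, args=(range(0, 50), a_end, child_a)),
        Process(target=worker, args=(range(50, 100), b_end, child_b)),
    ]
    for p in procs:
        p.start()
    totals = (res_a.recv(), res_b.recv())
    for p in procs:
        p.join()
    return totals

if __name__ == "__main__":
    print(run_pair())   # both workers arrive at the same total
```

An HTC job would simply have each worker report its piece independently; here the exchange between the two workers is what makes the job HPC-style.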

Because HPC processes are not independent of each other, the programming is more complex than HTC programming. Due to this complexity, it's usually worth the effort to search for previously written solutions. Most well-known mathematical functions used in science and engineering that can be solved in a distributed parallel fashion have already been implemented. Many complex tasks such as finite element analysis, fluid modeling, and common engineering simulations have also been implemented in both open source and commercial software packages. Chances are that you won't need to reinvent the wheel in order to utilize HPC.

HPC jobs may also require a high-speed dedicated network to avoid a communication bottleneck. Hence, HPC models are generally restricted to clusters; few will run effectively on a grid.

HPC problems do not scale as easily as HTC. Generally, more communication between processes means less scalability, but the reality is not so simple. Some HPC models cannot effectively utilize more than a dozen cores. Attempting to use more will increase communication overhead to the point that the job takes as long as or longer than it does using fewer cores. On the other hand, some HPC models can scale to hundreds or thousands of cores. Each model is unique, and there is no simple way to predict how a model will scale.
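The trade-off can be sketched with a toy cost model (the constants below are hypothetical, not measurements; real models must be benchmarked): dividing the work among more cores shrinks compute time, but coordination cost grows with the number of cores, so total run time eventually rises again.

```python
def job_time(cores, work=1000.0, comm=1.0):
    """Toy cost model: work is split evenly across cores, while
    communication overhead grows with the core count.
    work and comm are made-up constants, not measurements.
    """
    return work / cores + comm * cores

if __name__ == "__main__":
    for n in (1, 4, 16, 32, 64, 256):
        print(n, job_time(n))
```

Under these made-up constants, total time bottoms out near 32 cores and climbs again beyond that, matching the behavior described above; a model with a different compute/communication ratio would have a very different sweet spot.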

Practice

  1. Explain the basic difference between HTC and HPC.
  2. What are the advantages of HTC?
  3. What are the advantages of HPC?