Chapter 6. Parallel Computing

Table of Contents

6.1. Introduction
Computing is not Programming
6.2. Shared Memory and Multi-Core Technology
6.3. Distributed Parallel Computing
Clusters and HPC
Grids and HTC
6.4. Multiple Parallel Jobs
6.5. Graphics Processors (Accelerators)
6.6. Best Practices in Parallel Computing
Parallelize as a Last Resort
Make it Quick
Monitor Your Jobs
Development and Testing Servers



In 1976, Los Alamos National Laboratory purchased a Cray-1 supercomputer for $8.8 million. As the world's fastest computer at the time, it had 8 mebibytes (a little over 8 million bytes) of main memory and was capable of 160 million floating point operations per second.

The first draft of this manual was written in September 2010 on a $700 desktop computer with a gibibyte (a little over 1 billion bytes) of main memory, and was capable of over 2 billion floating point operations per second.

It may seem that today's computers have made the supercomputer obsolete. However, there are still, and always will be, many problems that require far more computing power than any single processor can provide. There are many examples of highly optimized programs that take months to run even with thousands of today's processors. The volume and resolution of raw data awaiting analysis have exploded in recent years due to advances in both research techniques and technology. Enormous amounts of new data are being generated every day, and new ways to analyze old data are constantly being discovered.

Computing is not Programming

It is important to understand the difference between parallel computing and parallel programming. Parallel computing includes any situation where multiple computations are being done at the same time.

This often involves running multiple instances of the same serial (single-processor) program at the same time, each with different inputs. Typically, there is no communication or cooperation between the multiple instances as they run. This scenario is known as embarrassingly parallel computing.
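As a minimal sketch of this idea, the following Python example applies the same serial function to many independent inputs at once. The function `analyze` and its inputs are hypothetical stand-ins for a real serial analysis program; the workers never communicate with one another.

```python
# Embarrassingly parallel computing: the same serial computation is run
# on many independent inputs at the same time. No worker exchanges any
# information with any other worker.
from multiprocessing import Pool

def analyze(sample):
    # Hypothetical stand-in for a serial analysis program.
    # Each call depends only on its own input.
    return sample * sample

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # Each worker processes a different input; results are simply
        # collected after every independent task finishes.
        results = pool.map(analyze, range(8))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

In practice, the same effect is often achieved by submitting many copies of a serial program to a scheduler, each with a different input file, rather than by writing any parallel code at all.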

Parallel programming, on the other hand, involves writing a special program that will utilize multiple processors. The code running on each processor will exchange information with the others in order to work together toward a common goal. This is much more complex than embarrassingly parallel computing, but is necessary for many problems.
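To contrast with the previous scenario, the sketch below shows processes cooperating on a single problem: each worker sums one slice of the data and communicates its partial result back to be combined. This uses Python's multiprocessing module rather than a message-passing library such as MPI, and the function names and worker count are illustrative assumptions.

```python
# Parallel programming: multiple processes work together toward a common
# goal, exchanging information (here, partial sums sent through a queue).
from multiprocessing import Process, Queue

def partial_sum(chunk, queue):
    # Each worker handles one slice of the data and communicates its
    # partial result back to the parent process.
    queue.put(sum(chunk))

def parallel_sum(data, nworkers=4):
    queue = Queue()
    size = (len(data) + nworkers - 1) // nworkers
    workers = [
        Process(target=partial_sum,
                args=(data[i * size:(i + 1) * size], queue))
        for i in range(nworkers)
    ]
    for w in workers:
        w.start()
    # Combine the partial results exchanged through the queue.
    total = sum(queue.get() for _ in workers)
    for w in workers:
        w.join()
    return total

if __name__ == "__main__":
    print(parallel_sum(list(range(100))))  # 4950
```

Even in this tiny example, the code must decide how to divide the work and how to collect the pieces, which is exactly the kind of coordination that embarrassingly parallel computing avoids.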

There are many types of parallel architectures, and each is suited for specific types of problems. Writing parallel programs is not a process that can be easily generalized. Understanding specific algorithms is crucial in determining if and how they can be decomposed into independent subtasks, and which parallel architecture is most suitable for running them. Some of the common parallel architectures are outlined in the following sections.



Be sure to thoroughly review the instructions in Section 2, “Practice Problem Instructions” before doing the practice problems below.
  1. How much has computing power increased since the 1970s?

  2. Is there still a need for parallel computing, given the advances in computing?