The first rule in parallel programming: Don't, unless you really have to.
Anything you can do in serial, you can do in parallel with a lot more effort.
People often look to parallelize their code simply because it's slow, without realizing that it could be made to run hundreds of times faster without being parallelized. Using better algorithms, reducing memory use, or switching from an interpreted language to a compiled language will usually be far easier than parallelizing the code. Parallel programming is difficult, and running programs on a cluster is inconvenient compared to running them on your PC.
Some people are motivated by ego or even a pragmatic desire to impress someone, rather than a desire to solve a real problem.
Using a $1,000,000 cluster to make inefficient programs run faster or to make yourself look smart is an unethical abuse of resources. If an organization spends this much money on computing resources, it's important to ensure that they are used productively. Programmers have a responsibility to ensure that their code is well optimized before running it on a cluster.
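As a small illustration of the kind of serial optimization meant here, the sketch below checks a list for duplicates in two ways. The function names and the task itself are hypothetical, chosen only for this example. The first version compares every pair of items, while the second remembers the items it has already seen in a set. For n items, the first does roughly n/2 times as many comparisons as the second does lookups, so the gap widens as the input grows, with no parallelism involved.

```python
import time


def has_duplicate_quadratic(items):
    """Compare every pair: about n*n/2 comparisons for n items."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicate_linear(items):
    """Remember what has been seen in a set: about n set lookups for n items."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


if __name__ == "__main__":
    data = list(range(2000))  # no duplicates, so both versions scan everything
    for func in (has_duplicate_quadratic, has_duplicate_linear):
        start = time.perf_counter()
        func(data)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__}: {elapsed:.4f} seconds")
```

Actual timings depend on the machine and the input, so it is worth measuring both versions on your own data before drawing conclusions.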
As explained in Chapter 33, Programming for HTC, you don't necessarily need to write a parallel program in order to utilize parallel computing resources.
In many cases, you can simply run multiple instances of a serial program at the same time. This is known as "embarrassingly parallel" computing. It is not only far easier than writing a parallel program, it also achieves better speedup in most cases, since there are no communication bottlenecks between the processes. This type of parallel computing also scales almost without limit. While some parallel programs can't effectively utilize more than a few processors, embarrassingly parallel computing jobs can usually utilize thousands and achieve nearly perfect speedup. (Running N independent processes at once reduces the total computation time by a factor of nearly N, provided there are at least N runs to do and enough cores to run them.)
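The idea of running several instances of a serial program at once can be sketched on a single machine as follows. This is a minimal illustration under stated assumptions: `SERIAL_PROGRAM` is a hypothetical one-line stand-in for a real serial program, and `run_all` is a helper invented for this example. On a cluster, a job scheduler would normally launch the instances instead.

```python
import subprocess
import sys

# Hypothetical stand-in for a serial program: it reads a limit from the
# command line and prints the sum of the squares of the integers below it.
SERIAL_PROGRAM = (
    "import sys; n = int(sys.argv[1]); print(sum(i * i for i in range(n)))"
)


def run_all(inputs):
    """Start one independent serial process per input, then collect outputs."""
    # Launch every instance before waiting on any of them, so that the
    # operating system can run them at the same time on separate cores.
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", SERIAL_PROGRAM, str(n)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for n in inputs
    ]
    results = []
    for proc in procs:
        out, _ = proc.communicate()  # wait for this instance to finish
        results.append(int(out))
    return results


if __name__ == "__main__":
    print(run_all([10, 100, 1000]))
```

The instances never communicate with each other, which is why this style of computing avoids the bottlenecks described above.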
Think about parallelizing your entire computing project rather than an individual program. If an individual run takes only hours or days, and you have to do many runs, then embarrassingly parallel computing will serve your needs very well. Parallelizing a program is only worthwhile when you have a very long-running program (usually weeks or more) that will only be run a few times.
You don't need parallel computing resources to develop and test parallel programs.
Parallel code can be developed and tested on a single machine, even with a single core. This is much easier and faster than developing in the more complex scheduled environment of a cluster or grid, and avoids wasting valuable resources on test runs that won't produce useful output.
You can control the number of threads or processes used by OpenMP, MPI, and other parallel computing systems, even running more than one thread or process per core.
Of course, you won't be able to measure speedup until you run on multiple cores, but that doesn't matter through most of the development cycle.
Develop small test cases to run on your development and testing system, which could be a server, a workstation, or even your laptop. This will be sufficient to test for correctness, which accounts for the vast majority of the development effort.
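A small correctness test of this kind might look like the sketch below. It is only an illustration: it uses Python threads as a stand-in for OpenMP threads or MPI processes, and all function names are hypothetical. The test splits a small input among a chosen number of workers, including more workers than the machine has cores, and checks each parallel result against a serial reference.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def serial_sum_of_squares(values):
    """Reference serial implementation used to check the parallel version."""
    return sum(v * v for v in values)


def parallel_sum_of_squares(values, workers):
    """Split the input into one chunk per worker and add the partial sums."""
    # Strided slicing gives every worker a chunk; some chunks may be empty.
    chunks = [values[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = pool.map(serial_sum_of_squares, chunks)
        return sum(partial_sums)


def check_small_case():
    """Compare parallel and serial results for several worker counts."""
    values = list(range(1, 101))  # small test case that runs instantly
    expected = serial_sum_of_squares(values)
    cores = os.cpu_count() or 1
    # Deliberately include worker counts larger than the number of cores.
    for workers in (1, 2, cores, 4 * cores):
        assert parallel_sum_of_squares(values, workers) == expected
    return expected


if __name__ == "__main__":
    print(check_small_case())
```

A test like this says nothing about speedup, but it can catch decomposition errors, such as lost or double-counted elements, long before the code reaches a cluster.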