Write It

Writing software is time-consuming (i.e., expensive), although not nearly as hard as most people make it for themselves. Those who have both the programming skills and esoteric software needs may choose to write their own software.

For these people, choosing the right operating system and the right programming language is critical. All software should be written to be portable (to run on any operating system and hardware), and computational software must be performant (run as fast and efficiently as possible). Programs written in a compiled language can run on the order of 100 times faster than the same program in an interpreted language. See Chapter 3, Using Unix for more details about operating systems. The section called “Compiled vs Interpreted Languages” discusses language performance in detail.
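The gap between interpreted and compiled code can be glimpsed without leaving Python. The following is a minimal sketch, assuming the standard CPython interpreter, in which the built-in `sum()` is implemented in compiled C while an explicit `for` loop is executed by the interpreter. The function names `interpreted_sum` and `time_both` are hypothetical, chosen only for this illustration. This is not a true language-to-language benchmark, and the ratio it reports will vary with the machine and Python version.

```python
import timeit


def interpreted_sum(values):
    """Add the values one at a time in an explicit, interpreted Python loop."""
    total = 0
    for v in values:
        total += v
    return total


def time_both(n=1_000_000, repeats=5):
    """Return (loop_seconds, builtin_seconds), each the best of `repeats` runs."""
    values = list(range(n))
    loop_time = min(
        timeit.repeat(lambda: interpreted_sum(values), number=1, repeat=repeats)
    )
    # Assumption: under CPython, sum() runs as compiled C code.
    builtin_time = min(
        timeit.repeat(lambda: sum(values), number=1, repeat=repeats)
    )
    return loop_time, builtin_time


if __name__ == "__main__":
    loop_time, builtin_time = time_both()
    print(f"explicit loop: {loop_time:.4f} s")
    print(f"built-in sum:  {builtin_time:.4f} s")
    print(f"ratio:         {loop_time / builtin_time:.1f}x")
```

Both versions compute the same result. Only the time spent differs, and the difference grows more dramatic when the loop body does real numerical work.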

Needs may dictate which compiled language you use, but if you have a choice, start with C. It's much simpler and more portable than C++ or Fortran. You can learn the language quickly and then focus on improving the quality of your code rather than getting bogged down in learning more language features. Mastering C++ is a career in and of itself. More on this in Part III, “High Performance Programming”.

Interactive software that does minimal computation and is mainly an interface for visualizing data may not need to offer high performance. In this case, the programming language is chosen for convenience rather than speed. Interpreted languages such as Python and Matlab are far slower than compiled languages, but offer convenient plotting libraries such as Python's Matplotlib that make it easy to create beautiful plots and graphs of your data.

Figure 2.2, “Visualizing Gene Neighborhoods with Matplotlib” shows how gene neighborhoods can be visualized using Matplotlib to understand the changes that have occurred over the course of evolution. This is part of a software suite called Microsynteny Tools, which includes several computational programs written in C for optimal performance, and visualization tools written in Python to leverage the convenience and power of Matplotlib.

Figure 2.2. Visualizing Gene Neighborhoods with Matplotlib


The main goals when writing a program should always be as follows:

Your time will be much better spent finding established and well-tested software to incorporate into your programs, rather than writing everything yourself. For example, if your program involves typical matrix operations, there are many highly efficient math libraries available that your program can use, such as BLAS, LAPACK, Eigen, ARPACK, and METIS, just to name a few. Writing your own matrix multiplication routine would be an enormous waste of both your time and computer time, since the prewritten routines in these libraries are probably much faster than anything you would write yourself.
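As a rough illustration of this point, the sketch below contrasts a hand-written matrix multiplication with a call into an existing library. NumPy is used here as an assumption of convenience; it is not one of the libraries named above, although it typically hands matrix multiplication off to an optimized BLAS implementation when one is installed. The function names `naive_matmul` and `library_matmul` are hypothetical.

```python
import numpy as np


def naive_matmul(a, b):
    """Multiply two matrices (lists of rows) with a hand-written triple loop."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    result = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0.0
            for k in range(m):
                total += a[i][k] * b[k][j]
            result[i][j] = total
    return result


def library_matmul(a, b):
    """Let the library do the work: convert to arrays and use the @ operator."""
    return np.asarray(a, dtype=float) @ np.asarray(b, dtype=float)


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(naive_matmul(a, b))
    print(library_matmul(a, b))
```

The library version is one line, has already been tested by a large community, and on large matrices is usually far faster than the triple loop. The hand-written version offers none of these advantages and is simply more code to debug and maintain.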

Beware: Bad advice on choosing operating systems and languages abounds in most professions. Many people choose things for irrational reasons, such as a pleasing appearance or popularity among friends. A smart selection is based on objective measures such as portability (does it run on any operating system and processor type?), performance, reliability, and price.



Be sure to thoroughly review the instructions in Section 2, “Practice Problem Instructions” before doing the practice problems below.
  1. What are the primary goals in writing any software?

  2. What is the main advantage of compiled languages over interpreted languages?

  3. What is the main advantage of C over other compiled languages for busy researchers?

  4. How much should we rely on the advice of others when choosing a language or operating system? Why?