Nope, it's not the study of computation. That would be computer science.
Computational science is any method of scientific exploration involving the use of computers. This may involve using computer models directly for experimentation or using computers to analyze data from experiments performed by other means.
Computation has been a core part of mathematics, physics, chemistry, and engineering research for decades. It is rapidly gaining popularity in other areas of research such as biology, psychology, economics, political science, and just about any other discipline you can think of.
This trend is due in part to the introduction of other technologies into these fields, such as rapid gene sequencers and imaging technology such as MRI (Magnetic Resonance Imaging). These new technologies generate vast amounts of data that require significant computing resources to store and process. Researchers in these fields often spend the majority of their time on the computer and only a small fraction in the wet lab. If you don't like computer work, you may want to reconsider becoming a geneticist or MRI researcher.
The trend is also due to computer technology itself facilitating the storage and use of vast amounts of data in all walks of life. The evolution of fast, cheap computer technology and the Internet has made it possible to archive detailed records of things like election results and sales records and make them easily available to almost anyone in the world. There are many researchers these days who don't collect their own data, but simply use archives collected by others in the past.
In doing computational science, we ultimately have a few key goals:
Minimize man-hours, the amount of human labor we invest in the project.
Minimize computer time, the amount of time we wait for the computer to do its part.
These goals almost always go hand-in-hand, but occasionally there may be a trade-off, where we sacrifice more man-hours to get results sooner, or accept a delay in results to save precious man-hours.
Table 1.1 below represents the time line of a computational science project, showing typical time requirements for developing the software, deploying (installing) the software, learning the software, and finally running the software.
Any one of these steps could end up taking the majority of your time, so we need to consider all of them when devising a strategy for achieving our goals. This text will discuss ways to minimize the time required for each step as well as potential trade-offs involved.
Table 1.1. Computation Time Line
| Development Time | Deployment Time             | Learning Time  | Run Time         |
|------------------|-----------------------------|----------------|------------------|
| Hours to years   | Hours to months (or never)  | Hours to weeks | Hours to months  |
Developing software is inherently time-consuming. Large programs may require many thousands of man-hours to specify, design, implement, and test.
Fortunately, most researchers do not need to write major software of their own. There are many commercial and open source programs available to assist in a wide range of research methods. Many researchers will need to do some level of programming, however. If no quality free software exists for your research, the cost of commercial software or of hiring a programmer may be beyond your means, and either option is likely to cause major delays even if you can afford it. It is a good idea to learn programming now and practice regularly so you're ready when you have to write some code of your own.
Deploying software can and should be quick and easy. Unfortunately, many researchers are not aware of efficient deployment methods and often end up wasting time or even failing entirely to get software installed. This is a major obstacle to important research, which we will discuss in detail a bit later.
All software installations should be doable in seconds or minutes. They should not require hours, days, weeks, or months. If they do, then either you or the software developers are doing something wrong. Don't accept difficult software installations as a normal part of doing research.
Learning time is largely a matter of focus and quality of documentation. All I can offer here is some advice: Do your homework, locate the best sources of documentation, and invest some time in studying it without distraction.
Run time depends on many factors, such as the algorithms used by the program, the language used to implement it (compiled languages are many times faster than interpreted ones, as discussed in the section called “Compiled vs Interpreted Languages”), and the hardware it runs on.
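As a small illustration of how much the choice of algorithm and data structure matters, consider the following Python sketch (not from the text; the numbers and names are hypothetical). It times membership tests against a list, which must be scanned element by element, versus a set, which uses hashing:

```python
# Hypothetical illustration: data-structure choice dominates run time.
# Searching a Python list is O(n); searching a set is O(1) on average.
import timeit

n = 100_000
data_list = list(range(n))
data_set = set(data_list)
target = n - 1  # worst case for the list scan

list_time = timeit.timeit(lambda: target in data_list, number=100)
set_time = timeit.timeit(lambda: target in data_set, number=100)

print(f"list search: {list_time:.4f} seconds")
print(f"set search:  {set_time:.6f} seconds")
```

On typical hardware the set lookups finish thousands of times faster, without changing the language or the machine at all.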
Many scientific programs are of extremely poor quality and could be made to run hundreds or even thousands of times faster with the right programming skills. Optimizing the software should always be done before throwing more hardware resources at it. When it is not feasible to make the program any faster, one might consider using parallel computing resources. However, parallelizing before optimizing the software would be an unwise and possibly unethical waste of costly resources. There is also a steep learning curve involved in using parallel resources that can be avoided by improving the program first.
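The point about optimizing before parallelizing can be sketched with a deliberately simple, hypothetical example: both functions below compute 1 + 2 + ... + n, but the second uses Gauss's closed-form formula and does constant work regardless of n. No number of parallel cores running the loop version can compete with that algorithmic improvement:

```python
# Hypothetical illustration: a better algorithm beats more hardware.

def sum_loop(n):
    # O(n): one addition per term
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_formula(n):
    # O(1): Gauss's closed form, n * (n + 1) / 2
    return n * (n + 1) // 2

n = 1_000_000
assert sum_loop(n) == sum_formula(n)
print(sum_formula(n))  # 500000500000
```

Real scientific codes are rarely this clear-cut, but the same principle applies: a change from an O(n²) algorithm to an O(n log n) one routinely outpaces any realistic hardware upgrade.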
What kind of research are you currently conducting, and how might computers be used to further your goals?
How does computational science differ from computer science?
What areas of research benefit from computational science?
Describe two reasons that computational science is growing so rapidly.
What are the major goals in the computational time line?
What are the major steps in the computational time line? Which one takes the longest?
Can researchers avoid programming entirely? Why or why not?