Pitfalls and Best Practices

A very common and very bad practice in shell scripting is making decisions based on the wrong information. One of the most common forms of this mistake is making assumptions based on the operating system in use. Take the following code segment, for example:

if [ "$(uname)" = 'Linux' ]; then
    compiler='gcc'      # Compiler to use
    endian='little'     # Byte order for this CPU
fi
        

Both of the assumptions made about Linux in the code above were taken from real examples!

Setting the compiler to gcc because we're running on Linux is simply wrong, because Linux can run other compilers such as clang or icc. Compiler selection should be based on the user's wishes or the needs of the program being compiled, not the operating system. Also, cc is simply another name for gcc on most Linux systems (generally a link to the same executable file). Hence, there is no reason to explicitly use gcc on Linux. The cc command is available on virtually any Unix system with a compiler installed, being gcc on Linux and clang on FreeBSD, for example. So just using cc to compile programs is the safest default.
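The advice above can be sketched in a few lines of POSIX shell. This is only an illustration: it assumes the user expresses a compiler preference through the conventional CC environment variable, and the function name choose_compiler and the file prog.c are hypothetical.

```sh
#!/bin/sh

# Print the compiler to use: the user's CC if it is set and non-empty,
# otherwise the generic cc command.  (choose_compiler is a hypothetical
# helper name, not part of any standard tool.)
choose_compiler() {
    printf '%s\n' "${CC:-cc}"
}

compiler=$(choose_compiler)
printf 'Compiling with %s\n' "$compiler"

# A real script would now run something like the following, where
# prog.c is a hypothetical source file:
# "$compiler" -o prog prog.c
```

Nothing here depends on the output of uname, so the same script behaves sensibly on Linux, FreeBSD, or any other Unix-compatible system.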

Assuming the machine is little-endian is wrong because Linux runs on a variety of CPU types, some of which are big-endian. The user who wrote this code assumed that if the computer is running Linux, it must be a PC with an x86 processor, which is not a valid assumption. The alternative for that user was an SGI IRIX workstation using a big-endian MIPS processor. Even if an operating system runs only on little-endian processors today, that does not mean the same will be true tomorrow. Hence, a check like this is a time bomb, even if it's valid at the moment you write it.

There are simple ways to find out the actual endianness of a system, so why would we instead try to infer it from an unrelated fact? We should instead use something like the open source endian program, which runs on any Unix-compatible system.

if [ "$(endian)" = 'little' ]; then
    # Commands for little-endian systems go here
    :
fi
        

Moreover, users at this level should never have to worry about the endianness of a system. The fact that the user needed to check for this at the shell level indicates a serious design flaw in one of the programs they were using.
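Where a byte-order check truly cannot be avoided and the endian program is not installed, the hardware can still be asked directly. The following is a minimal sketch, assuming an od command that supports the POSIX -A and -t options. The function name byte_order is hypothetical. It writes two known bytes and lets od read them back as a single 16-bit integer in the host's native byte order.

```sh
#!/bin/sh

# Print "little" or "big" according to how the host interprets the
# two-byte sequence 0x01 0x02 as one 16-bit integer.
# (byte_order is a hypothetical helper name.)
byte_order() {
    word=$(printf '\001\002' | od -An -tx2 | tr -d ' \n')
    case "$word" in
        0201) echo little ;;    # Low-order byte stored first
        0102) echo big ;;       # High-order byte stored first
        *)    echo unknown ;;   # Unexpected od output
    esac
}

if [ "$(byte_order)" = 'little' ]; then
    echo 'This system is little-endian.'
fi
```

Like the endian program, this tests the actual property of interest rather than inferring it from the name of the operating system.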

We can find out the exact CPU type using uname -m or uname -p. Of the two, only -m is specified by POSIX. They usually report the same string, but on some platforms may produce different but equivalent strings such as "amd64" and "x86_64" or "arm64" and "aarch64".
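Because equivalent CPUs can be reported under different names, a script that branches on the CPU type may want to map the aliases to one spelling first. The following is a rough sketch covering only the two pairs mentioned above. The function name normalize_cpu and the choice of canonical spellings are assumptions made for illustration.

```sh
#!/bin/sh

# Map equivalent machine names to a single canonical spelling.
# Names not listed here are passed through unchanged.
# (normalize_cpu is a hypothetical helper name.)
normalize_cpu() {
    case "$1" in
        amd64|x86_64)   echo x86_64 ;;
        arm64|aarch64)  echo aarch64 ;;
        *)              echo "$1" ;;
    esac
}

cpu=$(normalize_cpu "$(uname -m)")
printf 'CPU type: %s\n' "$cpu"
```

A script can then compare $cpu against a single name per architecture instead of listing every alias at each test.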

Practice

Note

Be sure to thoroughly review the instructions in Section 2, “Practice Problem Instructions” before doing the practice problems below.

  1. What can we infer about the hardware on which our script is running based on the name of the operating system?

  2. How should a script decide what compiler to use to build programs?

  3. How should a script go about determining what type of CPU your system uses?