2.11. Shell and Environment Variables

Variables are essential to any programming language, and scripting languages are no exception. Variables are useful for holding user input, for driving control structures, and for giving short names to commonly used values such as long path names.

Most programming languages distinguish between variables and constants, but in shell scripting, we use variables for both.

Shell processes have access to two separate sets of string variables.

Recall from Section 1.16, “Environment Variables” that every Unix process has a set of string variables called the environment, which is handed down from the parent process in order to communicate important information.

For example, the TERM variable, which identifies the type of terminal a user is using, is used by top, vi, nano, more, and other programs that need to manipulate the terminal screen (move the cursor, highlight certain characters, etc.). The TERM environment variable is usually set by the shell process so that all of the shell's child processes (those running vi, nano, etc.) will inherit the variable.

Unix shells also keep another set of variables that are not part of the environment. These variables are used only for the shell's own purposes and are not handed down to child processes.

There are some special shell variables such as "prompt" and "PS1" (which control the appearance of the shell prompt in C shell and Bourne shell, respectively).

Most shell variables, however, are defined by the user for use in scripts, just like variables in any other programming language.

2.11.1. Assignment Statements

In all Bourne Shell derivatives, a shell variable is created or modified using the same simple syntax:

varname=value
            

Caution

In Bourne shell and its derivatives, there can be no spaces around the '='. If there were, the shell would think that 'varname' is a command, and '=' and 'value' are arguments.

bash-4.2$ name = Fred
bash: name: command not found
bash-4.2$ name=Fred
bash-4.2$ printf "$name\n"
Fred
            

When assigning a string that contains white space, it must be enclosed in quotes or the white space characters must be escaped:

#!/usr/bin/env bash

name=Joe Sixpack    # Error: tries to run a command named Sixpack
name="Joe Sixpack"  # OK
name=Joe\ Sixpack   # OK
            
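As a quick check that the two accepted forms are equivalent, the following minimal sketch assigns the same string both ways and prints each value between brackets so that the embedded space is visible. The variable names quoted_name and escaped_name are invented for this illustration.

```shell
#!/bin/sh

# Both assignments store the same string, including the space.
quoted_name="Joe Sixpack"
escaped_name=Joe\ Sixpack

# The brackets make any leading, trailing, or embedded spaces easy to see.
printf "[%s]\n" "$quoted_name"
printf "[%s]\n" "$escaped_name"
```

Both printf commands print [Joe Sixpack].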

C shell and T-shell use the set command for assigning variables.

#!/bin/csh

set name="Joe Sixpack"
            

Caution

Note that Bourne family shells also have a set command, but it has a completely different meaning, so take care not to get confused. The Bourne set command is used to set shell options and positional parameters, not named variables.

Unlike variables in some languages, shell variables need not be declared before they are assigned a value. Declaring variables is unnecessary, since there is only one data type in shell scripts.

All variables in a shell script are character strings. There are no integers, Booleans, enumerated types, or floating point variables, although there are some facilities for interpreting shell variables as integers, assuming they contain only digits.
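
One such facility in POSIX-compatible Bourne shells is arithmetic expansion, written $(( )), which interprets strings of digits as integers just long enough to evaluate an expression. The following is a minimal sketch; the variable names are invented for this example.

```shell
#!/bin/sh

# Both variables hold strings of digits, not numeric types.
jobs_done=7
jobs_left=5

# $(( )) treats the strings as integers and yields a new string of digits.
jobs_total=$(($jobs_done + $jobs_left))
printf "Total jobs: %s\n" "$jobs_total"
```

This prints "Total jobs: 12". The result stored in jobs_total is again a character string.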

If you must manipulate real numbers in a shell script, you could accomplish it by piping an expression through bc, the Unix arbitrary-precision calculator:

printf "scale=5\n243.9 * $variable\n" | bc
            

Such facilities are very inefficient compared to other languages, however, partly because shell languages are interpreted, not compiled, and partly because they must convert each string to a number, perform arithmetic, and convert the results back to a string. Shell scripts are meant to automate sequences of Unix commands and other programs, not perform numerical computations.

In Bourne shell family shells, environment variables are set by first setting a shell variable of the same name and then exporting it.

TERM=xterm
export TERM
            

Modern Bourne shell derivatives such as bash (Bourne Again Shell) can do it in one line:

export TERM=xterm
            

Note

Exporting a shell variable permanently tags it as exported. Any future changes to the variable's value will automatically be copied to the environment. This type of linkage between two objects is very rare in programming languages: Usually, modifying one object has no effect on any other.
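
The linkage can be observed with a short experiment. This is only a sketch: it changes TERM after exporting it, then starts a child sh process that prints the value it inherited. The value vt100 is an arbitrary choice for the demonstration.

```shell
#!/bin/sh

# Create a shell variable and tag it as exported.
TERM=xterm
export TERM

# Change the shell variable after exporting it; no second export is needed.
TERM=vt100

# The single quotes delay expansion, so $TERM is expanded by the child
# process, which sees only what it inherited through the environment.
sh -c 'printf "Child process sees TERM=%s\n" "$TERM"'
```

The child reports vt100, not xterm, which shows that the later assignment was copied to the environment automatically.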

C shell derivatives use the setenv command to set environment variables:

setenv TERM xterm
            

Caution

Note that unlike the 'set' command, setenv requires white space, not an '=', between the variable name and the value.

2.11.2. Variable References

To reference a shell variable or an environment variable in a shell script, we must precede its name with a '$'. The '$' tells the shell that the following text is to be interpreted as a variable name rather than a string constant. The variable reference is then expanded, i.e. replaced by the value of the variable. This occurs anywhere in a command except inside a string bounded by single quotes or following an escape character (\), as explained in Section 2.9, “Strings”.

These rules are basically the same for all Unix shells.

#!/usr/bin/env bash

name="Joe Sixpack"
printf "Hello, name!\n"     # Not a variable reference!
printf "Hello, $name!\n"    # References variable "name"
            

Output:

Hello, name!
Hello, Joe Sixpack!
            
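The exceptions mentioned above (single quotes and the escape character) can be seen side by side in the following sketch, which prints the same text three ways in a Bourne family shell.

```shell
#!/bin/sh

name="Joe Sixpack"

printf "Hello, $name\n"     # Double quotes: $name is expanded
printf 'Hello, $name\n'     # Single quotes: $name is left as typed
printf "Hello, \$name\n"    # Backslash: the $ is an ordinary character
```

Only the first command prints Joe Sixpack. The other two print the literal text $name.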

Practice Break

Type in and run the following scripts:

#!/bin/sh

first_name="Bob"
last_name="Newhart"
printf "%s %s is the man.\n" $first_name $last_name
                

CSH version:

#!/bin/csh

set first_name="Bob"
set last_name="Newhart"
printf "%s %s is the man.\n" $first_name $last_name
                

Note

If both a shell variable and an environment variable with the same name exist, a normal variable reference will expand the shell variable.

In Bourne shell derivatives, a shell variable and environment variable of the same name always have the same value, since exporting is the only way to set an environment variable. Hence, it doesn't really matter which one we reference.

In C shell derivatives, a shell variable and environment variable of the same name can have different values. If you want to reference the environment variable rather than the shell variable, you can use the printenv command:

Darwin heron bacon ~ 319: set name=Sue
Darwin heron bacon ~ 320: setenv name Bob
Darwin heron bacon ~ 321: echo $name
Sue
Darwin heron bacon ~ 322: printenv name
Bob
            

There are some special C shell variables that are automatically linked to environment counterparts. For example, the shell variable path is always the same as the environment variable PATH. The C shell man page is the ultimate source for a list of these variables.

If a variable reference is immediately followed by a character that could be part of a variable name, we could have a problem:

#!/usr/bin/env bash

name="Joe Sixpack"
printf "Hello to all the $names of the world!\n"
            

Instead of printing "Hello to all the Joe Sixpacks of the world!", the script will not produce the intended output, because the shell looks for a variable called "names", which does not exist. In Bourne Shell derivatives, non-existent variables are treated as empty strings, so this script will print "Hello to all the  of the world!", with nothing where the name should be. C shell will complain that the variable "names" does not exist.

We can correct this by delimiting the variable name in curly braces:

#!/usr/bin/env bash

name="Joe Sixpack"
printf "Hello to all the ${name}s of the world!\n"
            

This syntax works for all shells.
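
Because Bourne shell derivatives silently substitute an empty string, a misspelled variable name can go unnoticed. One optional safeguard, not used in the examples above, is the POSIX shell option set -u, which makes a reference to an unset variable an error. The sketch below runs the faulty reference in a subshell (the parentheses) so that the error does not stop the rest of the script. The variable status is introduced only for this illustration.

```shell
#!/bin/sh

name="Joe Sixpack"
unset names     # Make sure "names" really is undefined

# With set -u, referencing the unset variable "names" is an error, so the
# subshell stops before printf runs and returns a non-zero exit status.
( set -u; printf "Hello to all the $names of the world!\n" ) 2> /dev/null
status=$?

printf "Exit status with set -u: %s\n" "$status"
```

The exact non-zero status varies from shell to shell, so a script should test only whether it is zero.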

2.11.3. Using Variables for Code Quality

Another very good use for shell variables is in eliminating redundant string constants from a script:

#!/usr/bin/env bash

output_value=`myprog`
printf "$output_value\n" >> Run2/Output/results.txt
more Run2/Output/results.txt
cp Run2/Output/results.txt latest-results.txt
            

If for any reason the relative path Run2/Output/results.txt should change, then you'll have to search through the script and make sure that all instances are updated. This is a tedious and error-prone process, which can be avoided by using a variable:

#!/usr/bin/env bash

output_value=`myprog`
output_file="Run2/Output/results.txt"
printf "$output_value\n" >> $output_file
more $output_file
cp $output_file latest-results.txt
            

In the second version of the script, if the path name of results.txt changes, then only one change must be made to the script.

Avoiding redundancy is one of the primary goals of any good programmer.

In a more general programming language such as C or Fortran, this role would be served by a constant, not a variable. However, shells do not support constants, so we use a variable for this.

In most shells, a variable can be marked read-only in an assignment to prevent accidental subsequent changes. Bourne family shells use the readonly command for this, while T-shell uses set -r. The -r flag is documented as a tcsh feature and may not be available in other C shell implementations.

#!/bin/sh

readonly output_file="Run2/Output/results.txt"
output_value=`myprog`
printf "$output_value\n" >> $output_file
more $output_file
cp $output_file latest-results.txt

CSH version:

#!/bin/csh

set -r output_file="Run2/Output/results.txt"
set output_value=`myprog`
printf "$output_value\n" >> $output_file
more $output_file
cp $output_file latest-results.txt
            
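To see what the read-only mark does in a Bourne family shell, the following sketch attempts to change the variable afterward. The attempt is made in a subshell (the parentheses) because some shells stop the whole script when an assignment to a read-only variable fails. The Run3 path and the variable status are hypothetical names used only for this demonstration.

```shell
#!/bin/sh

readonly output_file="Run2/Output/results.txt"

# Try to overwrite the read-only variable; the shell refuses and the
# subshell returns a non-zero exit status.
( output_file="Run3/Output/results.txt" ) 2> /dev/null
status=$?

printf "Exit status of the attempted change: %s\n" "$status"
printf "output_file is still %s\n" "$output_file"
```

The second line of output shows the original path, Run2/Output/results.txt, confirming that the accidental change was blocked.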

2.11.4. Output Capture

Output from a command can be captured and used as a string in another command or in a variable assignment by enclosing the command in back-quotes (``). In Bourne-compatible shells, we can use $() in place of back-quotes.

#!/bin/sh -e

# Using output capture in a command.  The double quotes keep the
# multi-word output of date together as a single argument to printf.
printf "Today is %s.\n" "`date`"
printf "Today is %s.\n" "$(date)"

# Using a variable.  If using the output more than once, this will
# avoid running the command multiple times.
today=`date`
printf "Today is %s.\n" "$today"
            
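Captured output is often used to build other strings. The following sketch captures the date once and reuses it in two file names, so date runs only once and both names are guaranteed to match. The date format and the file names are arbitrary choices made for this illustration.

```shell
#!/bin/sh

# Capture the date once, in year-month-day form, and reuse it.
today=$(date +%Y-%m-%d)

# Hypothetical file names built from the captured value.
log_file="run-$today.log"
backup_file="run-$today.bak"

printf "Log:    %s\n" "$log_file"
printf "Backup: %s\n" "$backup_file"
```

Had each file name called date separately, a script running across midnight could end up with two different dates.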

2.11.5. Self-test

  1. Describe two purposes for shell variables.
  2. Are any shell variable names reserved? If so, describe two examples.
  3. Show how to assign the value "Roger" to the variable "first_name" in both Bourne shell and C shell.
  4. Why can there be no spaces around the '=' in a Bourne shell variable assignment?
  5. How can you avoid problems when assigning values that contain white space?
  6. Do shell variables need to be declared before they are used? Why or why not?
  7. Show how to assign the value "xterm" to the environment variable TERM in both Bourne shell and C shell.
  8. Why do we need to precede variable names with a '$' when referencing them?
  9. How can we output a letter immediately after a variable reference (no spaces between them)? For example, show a printf statement that prints the contents of the variable fruit immediately followed by the letter 's'.
    fruit=apple
    # Show a printf that will produce the output "I have 10 apples", using
    # the variable fruit.
                    
  10. How can variables be used to enhance code quality? From what kinds of errors does this protect you?
  11. How can a variable be made read-only in Bourne shell? In C shell?