Command the Command Line

Part II - Getting Your Bearings

Command Invocation

vm$ ls
my-amazing-subdirectory
my-normal-file.txt
vm$ ls my-amazing-subdirectory
oh-boy-another-directory
just-another-file.txt
vm$ 

So far, we've been issuing commands in a relatively intuitive way: the name of the program, optionally followed by some specific "target."

vm$ LS_COLORS="di=42" ls -lr --all --sort=size music/*.mp3
Ace of Base - I Saw the Sign.mp3
Santana - Smooth.mp3
Mahler - Symphony No. 2 in C minor - 04 - Urlicht.mp3
vm$ 

Commands can get far more cryptic. In this section, we'll step through the common conventions around invoking Unix commands.

It's important to remember that every tool has its own set of options. For instance, the option -l does one thing to the ls utility, but it does something very different to tree, and it isn't even recognized for the cat utility.

We'll cover a handful of tools, but we're not looking to learn how to do everything with every utility. We just want to be able to understand what's going on in these arcane incantations.

The Executable

vm$ pwd
/home/sally
vm$ /bin/pwd
/home/sally
vm$ 

When we talk about "invoking a command," we normally mean running a program.

This is the terminal equivalent of double-clicking an application icon on a macOS or Windows desktop.

Programs like pwd are just executable files that reside in special locations in the file system (more detail on that later in this chapter).

The "essential utilities" we cover here (like pwd and ls) have been written by many people over many years. They usually stick to a set of conventions around how they are run--that's what this section is all about. Just remember: these utilities serve different purposes and were created independently, so they all have unique aspects as well.

Path options

vm$ ls ~/video ~/music
/home/sally/music:
Ace of Base - I Saw the Sign.mp3
Mahler - Symphony No. 2 in C minor - 04 - Urlicht.mp3
Santana - Smooth.mp3
Stallman - Free Software Song.ogg

/home/sally/video:
vm$ 

Many utilities accept one or more path references. ls, for example, will display the contents of every directory that is specified as an option.

We've already seen how the shell replaces the tilde (~) character with the complete path to our "home" directory. We'll cover this behavior in more detail in just a moment.

Named options

vm$ cat my-normal-file.txt
     This is the first line of just-another-file.txt
     This is the second line of the file!
     The file only has three lines, and this is the last one!
vm$ cat --number my-normal-file.txt
     1 This is the first line of just-another-file.txt
     2 This is the second line of the file!
     3 The file only has three lines, and this is the last one!
vm$ 

Other options are "named"--they don't describe any particular file. Instead, their presence alters the behavior of the command. You can identify these because they usually start with two dashes (--).

In this example, we're invoking cat with a named option and a path option. The --number option changes how cat displays the file contents.

vm$ ls --sort=size ~/music
Stallman - Free Software Song.ogg
Ace of Base - I Saw the Sign.mp3
Santana - Smooth.mp3
Mahler - Symphony No. 2 in C minor - 04 - Urlicht.mp3
vm$ 

Some options accept "arguments"--these allow for finer control over the option's behavior.

By supplying the argument size to the sort option, we can cause the ls utility to arrange the list of directory contents in order of file size.
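As a rough sketch of an option argument in action, the following builds a scratch directory with three files of different sizes and asks ls to sort them by size. It assumes GNU ls; the directory and file names are made up for illustration.

```shell
# Hypothetical scratch directory and file names, for illustration only.
scratch=$(mktemp -d)
printf 'a' > "$scratch/small.txt"          # 1 byte
printf '%010d' 0 > "$scratch/medium.txt"   # 10 bytes
printf '%0100d' 0 > "$scratch/large.txt"   # 100 bytes

# "size" is the argument to the --sort option; GNU ls lists the largest file first.
by_size=$(ls --sort=size "$scratch")
largest=$(echo "$by_size" | head -n 1)

# In GNU ls, -S is the short spelling of --sort=size.
by_size_short=$(ls -S "$scratch")

echo "$by_size"
rm -r "$scratch"
```

Both spellings should produce the same listing, with large.txt on the first line.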

vm$ cat --number --show-ends my-normal-file.txt
     1 This is the first line of just-another-file.txt$
     2 This is the second line of the file!$
     3 The file only has three lines, and this is the last one!$
vm$ cat --show-ends --number my-normal-file.txt
     1 This is the first line of just-another-file.txt$
     2 This is the second line of the file!$
     3 The file only has three lines, and this is the last one!$
vm$ 

Multiple options can be specified at the same time. If the effects of the options are independent (like --number and --show-ends for the cat utility), then you can usually list them in whatever order you like. However, the order may matter in some cases; when in doubt, check the utility's documentation.

vm$ cat -n --show-ends my-normal-file.txt
     1 This is the first line of just-another-file.txt$
     2 This is the second line of the file!$
     3 The file only has three lines, and this is the last one!$
vm$ cat --number -E my-normal-file.txt
     1 This is the first line of just-another-file.txt$
     2 This is the second line of the file!$
     3 The file only has three lines, and this is the last one!$
vm$ cat -n -E my-normal-file.txt
     1 This is the first line of just-another-file.txt$
     2 This is the second line of the file!$
     3 The file only has three lines, and this is the last one!$
vm$ cat -nE my-normal-file.txt
     1 This is the first line of just-another-file.txt$
     2 This is the second line of the file!$
     3 The file only has three lines, and this is the last one!$
vm$ 

Many options can also be specified with an abbreviated form, usually just a single letter. You'll prefix these with a single dash character (-). For example, the option -n has the same effect as --number on the cat utility.

To make things even more compact, you can combine any "short" options together and just use one dash. To cat, -n -E is the same as -nE (or -En for that matter).
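One way to convince yourself that these spellings are interchangeable is to capture the output of each and compare. This is a minimal sketch that assumes GNU cat (the long options --number and --show-ends are not available in every implementation); the scratch file is hypothetical.

```shell
# Hypothetical scratch file, for illustration only.
demo_file=$(mktemp)
printf 'first line\nsecond line\n' > "$demo_file"

long_form=$(cat --number --show-ends "$demo_file")
separate_short=$(cat -n -E "$demo_file")
combined_short=$(cat -nE "$demo_file")
reordered_short=$(cat -En "$demo_file")

# Every spelling should yield identical text.
if [ "$long_form" = "$separate_short" ] &&
   [ "$long_form" = "$combined_short" ] &&
   [ "$long_form" = "$reordered_short" ]; then
    echo "All four spellings produce the same output."
fi

rm -f "$demo_file"
```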

If all these different forms behave the same, how should you choose between them? It comes down to personal preference. Short options are easier to type, and that becomes more important as you get more comfortable with the command line. Long options are easier to remember, and they're also easier to read. Readability is important when you write shell programs that you may share with other people (more on this in Chapter 13 - Scripting).

Anatomy of a Command

                   LS_COLORS="di=42" ls -lr --all --sort=size music/*.mp3
                   ^                 ^  ^   ^     ^      ^    ^     ^
                   |                 |  |   |     |      |    |     |
            ??? ---+                 |  |   |     |      |    |     |
     executable ---------------------+  |   |     |      |    |     |
  short options ------------------------+   |     |      |    |     |
   long options ----------------------------+-----+      |    |     |
option argument -----------------------------------------+    |     |
   file options ----------------------------------------------+     |
            ??? ----------------------------------------------------+

Options account for a majority of the complexity you will run across in shell commands.

Command Options

dance instructions

"Dance Steps on Broadway" by javacolleen is licensed under CC BY-NC-ND 2.0.

We've seen a bunch of examples of options for various programs, but you may be wondering, "How do we know what's available?" Even if you remember that ls is short for "list," there is nothing intuitive about the usage ls -lah.

Fortunately, there are tools available for discovery.

man

Reference manual pages (documentation)

vm$ man ls
LS(1)                User Commands                 LS(1)

NAME
       ls - list directory contents

SYNOPSIS
       ls [OPTION]... [FILE]...

DESCRIPTION
       List  information  about  the  FILEs  (the  current
       directory by default).  Sort entries alphabetically
       if none of -cftuvSUX nor --sort is specified.

       Mandatory  arguments  to long options are mandatory
       for short options too.

       -a, --all
              do not ignore entries starting with .

help

Find information on built-in commands

vm$ help cd
cd: cd [-L|[-P [-e]] [-@]] [dir]
    Change the shell working directory.
    
    Change the current directory to DIR.  The default DIR is the value of the
    HOME shell variable.

If a command is not available in man, you can try help. The distinction here is beyond the scope of this chapter, although we'll revisit it in the next chapter. For now, you can use help as a fallback when man doesn't have what you're looking for.

The --help option

vm$ cat --help
Usage: cat [OPTION]... [FILE]...
Concatenate FILE(s), or standard input, to standard output.

  -A, --show-all           equivalent to -vET

Finally, many programs support a --help option. If man and help don't have any information, --help is a good "last ditch" place to try.

The Shell

                                 +-------+      +--------+
command text (via keyboard) ---> |       | ---> |        |
                                 | shell |      | system |
output (via display)        <--- |       | <--- |        |
                                 +-------+      +--------+

There's more to invocation than just options and arguments, though!

The command prompt we've been using is provided by a program called a "shell." When you enter text into the prompt, the shell is responsible for wrangling together all the necessary executables, options, and/or files. This is why it is sometimes referred to as "the interpreter."

Shell Expansion

The ~ character

                   +-------+
"vm$"         <--- |       |                                    +--------+
"ls ~/movies" ---> |       | -> "/bin/ls /home/sally/movies" -> |        |
                   | shell |                                    | system |
"hellboy.mp4" <--- |       | <------- "hellboy.mp4" <---------- |        |
"vm$"         <--- |       |                                    +--------+
                   +-------+

This might sound pretty abstract, but we've already been using the shell for its "interpreter" functionality.

The shell is responsible for translating the name "ls" to the absolute path to the executable at /bin/ls. It also translates the tilde character (~) into the path to the current user's "home" directory.

It's All Around You

photograph of trees extending into the sky

"Surrounded" by Arvin Asadi is licensed under CC BY 2.0

Understanding the way shell expansion works helps to explain why it is different from other invocation patterns. Individual applications can differ in the functionality they provide (for example, we've already seen that a given application may or may not implement the --help flag). On the other hand, because shell expansion happens before program invocation, it will work for every application.

echo

Print text to the screen

vm$ echo Whatever we type here will be printed to the screen.
Whatever we type here will be printed to the screen.
vm$ echo Please expand the tilde character ~ there.
Please expand the tilde character /home/sally there.
vm$ 

echo is a tool for displaying text. It can be helpful when writing shell scripts (more on that in the next chapter), but it's also a useful way to experiment with shell substitution.
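As a small experiment along those lines, the sketch below captures what echo actually receives when we type a tilde. It assumes the HOME variable is set, which is the case in a typical login session; the music sub-directory does not need to exist, because the shell rewrites the text without checking the file system.

```shell
# The shell rewrites the leading ~ before echo ever runs,
# so echo only sees the expanded path.
expanded=$(echo ~/music)

echo "echo received: $expanded"
echo "HOME is:       $HOME"
```

The first line printed should be the value of HOME followed by /music.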

Shell Expansion

The * character

vm$ ls music
Ace of Base - I Saw the Sign.mp3
Mahler - Symphony No. 2 in C minor - 04 - Urlicht.mp3
Santana - Smooth.mp3
Stallman - Free Software Song.ogg
vm$ ls music/*.mp3
music/Ace of Base - I Saw the Sign.mp3
music/Mahler - Symphony No. 2 in C minor - 04 - Urlicht.mp3
music/Santana - Smooth.mp3
vm$ ls music/S*
music/Santana - Smooth.mp3
music/Stallman - Free Software Song.ogg
vm$ ls music/S*.mp3
music/Santana - Smooth.mp3
vm$ 

The asterisk character (*) is another powerful shell substitution tool for expressing paths. Sometimes called the "wildcard" pattern, the operator allows you to specify a group of files that share some common path "part" (e.g. file name or sub-directory name).

Whenever the shell encounters that character in a path, it replaces the path with a list of files that match the rest of the characters.
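Because the shell performs this replacement, echo is enough to preview what a pattern will match. The following sketch uses a hypothetical scratch directory with shortened, made-up file names so that it can run anywhere.

```shell
# Hypothetical scratch directory and file names, for illustration only.
scratch=$(mktemp -d)
touch "$scratch/Mahler.mp3" "$scratch/Santana.mp3" "$scratch/Stallman.ogg"

# Each pattern is replaced by the matching names before echo runs.
mp3_files=$(cd "$scratch" && echo *.mp3)      # names ending in .mp3
s_files=$(cd "$scratch" && echo S*)           # names starting with S
s_mp3_files=$(cd "$scratch" && echo S*.mp3)   # names that satisfy both

echo "*.mp3  -> $mp3_files"
echo "S*     -> $s_files"
echo "S*.mp3 -> $s_mp3_files"

rm -r "$scratch"
```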

Shell Expansion

Opting out

vm$ echo I have ~ 2 oranges
I have /home/sally 2 oranges
vm$ echo I have \~ 2 oranges
I have ~ 2 oranges
vm$ 

In some cases, you will need to use these special characters for their "literal" value. To do this, write a \ just before them.
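The same backslash works for the asterisk. Here is a short sketch that captures what echo receives with and without the escape; it assumes the HOME variable is set.

```shell
# Without a backslash, the shell expands the tilde to the home directory.
unescaped=$(echo I have ~ 2 oranges)

# With a backslash, the character is passed through literally.
escaped=$(echo I have \~ 2 oranges)
star=$(echo 3 \* 4)

echo "$unescaped"
echo "$escaped"
echo "$star"
```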

Anatomy of a Command (continued)

                   LS_COLORS="di=42" ls -lr --all --sort=size music/*.mp3
                   ^                 ^  ^   ^     ^      ^    ^     ^
                   |                 |  |   |     |      |    |     |
            ??? ---+                 |  |   |     |      |    |     |
     executable ---------------------+  |   |     |      |    |     |
  short options ------------------------+   |     |      |    |     |
   long options ----------------------------+-----+      |    |     |
option argument -----------------------------------------+    |     |
   file options ----------------------------------------------+     |
file "wildcard" ----------------------------------------------------+

There's just one more piece in this dissection of command patterns.

Environment Variables

The "process environment" describes a set of key-value pairs associated with a given process. Each member of the set is an "environment variable", where the key is the variable's name and the value is the variable's value.

Just like the current working directory, environment variables are a "hidden" aspect of the system that affect the behavior of many commands.

Environment Variables

Syntax for Definition

vm$ export myVariable=my-variable-value
vm$ echo Okay. Now what?
Okay. Now what?
vm$ 

We can use the shell's export command to create and modify environment variables.

Environment Variables

Syntax for Inspection

vm$ export myVariable=variable-value
vm$ echo The value of the variable "myVariable" is: $myVariable
The value of the variable "myVariable" is: variable-value
vm$ export mistake=value with spaces
vm$ echo The value of the variable "mistake" is: $mistake
The value of the variable "mistake" is: value
vm$ export correct='value with spaces'
vm$ echo The value of the variable "correct" is: $correct
The value of the variable "correct" is: value with spaces
vm$ 

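To inspect them, another shell substitution feature comes to the rescue: the shell replaces a dollar sign followed by a variable's name (such as $myVariable) with that variable's value. Note that a value containing spaces must be quoted when it is defined; otherwise, only the first word is stored.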
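The sketch below shows two ways to read a variable back: shell substitution, and the printenv utility, which reports what is actually stored in the process environment. The variable name follows the example above.

```shell
export myVariable='value with spaces'

# Shell substitution: the shell replaces $myVariable before echo runs.
# The double quotes keep the value together as a single piece of text.
via_expansion=$(echo "$myVariable")

# printenv looks the name up in the environment it was started with.
via_printenv=$(printenv myVariable)

echo "expansion: $via_expansion"
echo "printenv:  $via_printenv"
```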

Environment Variables

Process Isolation

vm$ export foo=bar
vm$ echo $foo
bar
vm$ 

In a new terminal window:

vm$ echo $foo

vm$ 

Each process has an independent variable environment. This means we can feel confident experimenting with the environment; if anything goes wrong, we can just close the terminal window and try again with a new one.
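The isolation also applies between a shell and the programs it starts. A shell started separately (for example, in another terminal window) never sees foo, but a program launched from the current shell receives its own copy of every exported variable. In this sketch, sh -c stands in for an arbitrary child process; that choice is an assumption made for illustration.

```shell
export foo=bar

# A child process receives its own copy of the exported variable...
child_view=$(sh -c 'echo "$foo"')

# ...and changing that copy inside the child does not reach back into this shell.
child_changed=$(sh -c 'foo=changed; echo "$foo"')
parent_view=$foo

echo "child saw:                 $child_view"
echo "child changed its copy to: $child_changed"
echo "parent still has:          $parent_view"
```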

Shells: under the hood

The PATH environment variable

vm$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/sbin
vm$ which ls
/bin/ls
vm$ which man
/usr/bin/man
vm$ export PATH=garbage
vm$ ls
Command 'ls' is available in '/bin/ls'
The command could not be located because '/bin' is not included in the PATH environment variable.
ls: command not found
vm$ 

For most commands we execute on the command-line, we specify the name of an executable file. For example, when we write ls, we're actually saying "please execute the file stored on the file system at /bin/ls." In fact, we could even write the command as /bin/ls, and the program would run exactly as if we'd written ls.

It's the shell's job to figure out that when we type ls, we mean /bin/ls. That saves us time typing, and it also means we don't have to remember where every executable is stored on the file system.

The shell does this using the PATH environment variable. It is a colon-separated list of directories. Every time we type a command, the shell looks for an executable file with that name in each of the directories until it finds a match, and then it runs that executable.
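The search described above can be approximated in a few lines of shell. This is only a sketch of the idea, not how any particular shell implements it (real shells also handle built-in commands and remember earlier results); find_in_path is a hypothetical helper name.

```shell
# find_in_path is a hypothetical helper written for this sketch;
# it is not a standard command.
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:                               # split $PATH on colons instead of spaces
    for dir in $PATH; do
        candidate=${dir:-.}/$name       # an empty entry means the current directory
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            IFS=$old_ifs
            echo "$candidate"           # first match wins, as in the shell
            return 0
        fi
    done
    IFS=$old_ifs
    return 1                            # no directory contained the name
}

ls_location=$(find_in_path ls)
echo "ls would run from: $ls_location"

if ! find_in_path no-such-command-for-this-demo > /dev/null; then
    missing=yes
    echo "no-such-command-for-this-demo: command not found"
fi
```

On a typical system the first line printed should agree with the output of which ls.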

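We can modify this value, but only at our peril: as the example shows, once PATH no longer lists the directory that contains ls, the shell can no longer find the program by name.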

Shells: under the hood

The PS1 environment variable

vm$ echo Prompt: $PS1
Prompt: vm$
vm$ export PS1="my awesome prompt "
my awesome prompt echo Strange...
Strange...
my awesome prompt 

Remember that the shell is itself a program, so it follows all these same rules. It uses the variable PS1 to display the command prompt. If we modify that variable, then we can change the behavior of the current shell.

Anatomy of a Command (continued)

                   LS_COLORS="di=42" ls -lr --all --sort=size music/*.mp3
                   ^                 ^  ^   ^     ^      ^    ^     ^
                   |                 |  |   |     |      |    |     |
environment var ---+                 |  |   |     |      |    |     |
     executable ---------------------+  |   |     |      |    |     |
  short options ------------------------+   |     |      |    |     |
   long options ----------------------------+-----+      |    |     |
option argument -----------------------------------------+    |     |
   file options ----------------------------------------------+     |
file "wildcard" ----------------------------------------------------+

In Review

Exercise

invoker is the name of a program installed in the virtual machine. You can execute it as follows:

vm$ invoker

You'll find that it is picky about its expected input. If you get lost, invoke the program with the --help option, as in:

vm$ invoker --help

...or read the instructions below. If you'd like to start from the beginning, invoke the program with the --reset option, as in:

vm$ invoker --reset

  1. Please invoke me with every file in the "/bin" directory that contains the letter "t" and/or that ends with the letter "s".

  2. Please invoke me with the value "Making the big $bucks."

  3. I support the "--word" option and the "short" version "-w". I also support the "-x" option. There are six unique ways to specify these options, even though each version means the same thing. Can you invoke me with all of them?

  4. Please invoke me with the "foo" environment variable set to "bar" and the "LIMIT" environment variable to "max".

Solution

  1. We could invoke the program and type out each file name as an option. A quick look inside the directory (ls /bin) shows close to 150 files, though. Finding all the ones that satisfy the criteria would be pretty boring. This looks like a job for shell expansion.

    We've already seen how the shell expands * to potentially include many files. Let's split the task into two sub-problems: finding the files that contain a t and finding the files that end in s.

    • The pattern /bin/*s matches all the files that end in s.
    • The pattern /bin/*t* matches all the files that contain t.

    If we specify both patterns, then invoker will "see" the files in both sets:

    vm$ invoker /bin/*s /bin/*t*
    
  2. The shell will replace the text $bucks with the value of the environment variable named bucks unless we take special precautions to prevent it. We can do this by "escaping" the special dollar sign character ($) by writing a "backslash" character (\) just before it:

    vm$ invoker Making the big \$bucks.
    

    We could also wrap the string in single quotation mark characters (') to get a similar effect:

    vm$ invoker 'Making the big $bucks.'
    

    Notice how using double quotation mark characters (") does not work here. That's because shell expansion still occurs between those characters.

    We're taking all this special care to prevent shell expansion from occurring, but we could also solve the problem by embracing shell expansion. The goal is to provide the string "Making the big $bucks." to the invoker process; we can utilize environment variables and shell expansion to do this if we want to be tricky:

    vm$ bucks=\$bucks
    vm$ invoker Making the big $bucks.
    

    First, we store the value '$bucks' in the variable named 'bucks' (note the "backslash" character). Then we invoke the program using the variable in the options. The shell does replace the string $bucks, but since the replacement value is itself $bucks, we still satisfy the invoker program.

  3. We're looking to use all the possible combinations of these two options. Remember that the "short" version of an option is equivalent to its "long" form. That's why -w --word is not a valid solution. Also recall that "short" options may be specified separately or together, so -w -x and -wx are both valid. The valid solutions are:

    • -w -x
    • -x -w
    • -wx
    • -xw
    • --word -x
    • -x --word

  4. If we set the variables in our current environment, they may not be "exported" to the invoker child process. For instance:

    vm$ foo=bar
    vm$ LIMIT=max
    vm$ echo $foo $LIMIT
    bar max
    vm$ invoker
    'foo' isn't set.
    'LIMIT' isn't set.
    vm$
    

    ...so we'll need to formally "export" the values:

    vm$ export foo=bar
    vm$ export LIMIT=max
    vm$ echo $foo $LIMIT
    bar max
    vm$ invoker
    Success!
    vm$
    

    Alternatively, we could specify the variables as we invoke the command. With this syntax, the variables will not be defined in the current environment.

    vm$ foo=bar LIMIT=max invoker
    Success!
    vm$ echo $foo $LIMIT
    
    vm$
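The three approaches from the last solution can be compared side by side. This is a sketch that uses sh -c as a stand-in for invoker (an assumption, since invoker only exists in the virtual machine). The form ${foo:-unset} prints the word "unset" when the variable is empty or missing.

```shell
unset foo LIMIT     # start from a clean slate

foo=bar             # a plain shell variable: visible here, not passed to children
without_export=$(sh -c 'echo "${foo:-unset}"')

export foo          # now child processes receive a copy
with_export=$(sh -c 'echo "${foo:-unset}"')

# An assignment on the same line as the command: only that one child sees LIMIT.
per_command=$(LIMIT=max sh -c 'echo "${LIMIT:-unset}"')
afterwards=${LIMIT:-unset}

echo "child, before export:      foo=$without_export"
echo "child, after export:       foo=$with_export"
echo "child, per-command:        LIMIT=$per_command"
echo "this shell, afterwards:    LIMIT=$afterwards"
```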