Command the Command Line

Part III - Improving Your Workflow

File Management

Unix-like systems trade in files. A great deal of their configuration and even their state is stored in files spread across the file system. This is distinct from Microsoft Windows, for instance, where much of the analogous information is stored in a global "registry" database.

So not only is comfort with files an essential part of improving your workflow, it's also crucial to understanding and interacting with the operating system itself.

Overview

We're going to cover a whole bunch of utilities in this section. With the exception of nano, these programs are included in the POSIX standard. For this course, we're concerned with the mainline use cases; you can consult the man pages for details on all the capabilities of each tool.

mv

Move files and directories

vm$ ls good-guys
dent-harvey.jpg
fries-victor.jpg
gordon-jim.jpg
kyle-selina.jpg
vm$ ls bad-guys
vm$ mv good-guys/dent-harvey.jpg bad-guys
vm$ ls good-guys
fries-victor.jpg
gordon-jim.jpg
kyle-selina.jpg
vm$ ls bad-guys
dent-harvey.jpg
vm$ 

We'll use the mv utility to move files and directories around. The first option is the source and the second is the destination.

vm$ mv good-guys/fries-victor.jpg good-guys/kyle-selina.jpg bad-guys
vm$ ls good-guys
gordon-jim.jpg
vm$ ls bad-guys
dent-harvey.jpg
fries-victor.jpg
kyle-selina.jpg
vm$ 

We can use mv to move multiple files at once. If there are more than two file options, the very last will be interpreted as the destination directory for all of the others. This feature can be very helpful when used with the * shell expansion character.
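As a minimal sketch of mv combined with the * character, the commands below move every .jpg file out of one directory with a single command. The scratch directory and the names heroes, villains, and notes.txt are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"
mkdir heroes villains
touch heroes/a.jpg heroes/b.jpg heroes/notes.txt

# The shell expands heroes/*.jpg to "heroes/a.jpg heroes/b.jpg"
# before mv runs, so mv sees two sources and one destination.
mv heroes/*.jpg villains

ls heroes     # only notes.txt is left behind
ls villains   # a.jpg and b.jpg
```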

vm$ cd bad-guys
vm$ mv dent-harvey.jpg twoface.jpg
vm$ ls
fries-victor.jpg
kyle-selina.jpg
twoface.jpg
vm$ 

mv is also commonly used to rename files. This might seem a little counter-intuitive, though. Just remember that renaming a file is no different from "moving" it between two names.

vm$ cd ..
vm$ mv bad-guys/twoface.jpg good-guys/dent-harvey.jpg
vm$ 

...and there is nothing stopping us from renaming a file as we change its directory.

rm

Remove files and directories

vm$ ls
my-directory
my-first-file
my-second-file
vm$ rm my-first-file my-second-file
vm$ ls
my-directory
vm$ 

We'll use rm to remove files and directories.

vm$ ls
my-directory
vm$ rm my-directory
rm: cannot remove ‘my-directory’: Is a directory
vm$ rm -r my-directory
vm$ ls
vm$ 

By default, rm will not remove a directory--we'll need to use the -r option to remove it recursively, along with everything inside. This extra option may seem inconvenient, but it often protects you from accidentally deleting things.
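One cautious habit, sketched below with made-up names (build, cache, keep.txt), is to preview what a pattern matches with echo before handing the same path to rm -r.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"
mkdir build build/cache
touch build/cache/a.tmp build/cache/b.tmp keep.txt

# Preview: echo prints what the shell expands the pattern to.
echo build/*

# Remove the directory and everything inside it.
rm -r build

ls   # keep.txt is untouched
```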

mkdir

Make directories

vm$ mkdir my-new-directory
vm$ ls
my-new-directory
vm$ ls my-new-directory
vm$ 

mkdir creates a new empty directory.
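mkdir only creates the last directory in a path unless told otherwise. The sketch below uses the -p option to create missing parent directories as well; the path projects/site/src is hypothetical.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"

# Without -p, mkdir fails when a parent directory is missing.
mkdir projects/site/src 2>/dev/null || echo "parent directories are missing"

# -p creates every missing directory along the path.
mkdir -p projects/site/src

# -p also makes it safe to run the same command a second time.
mkdir -p projects/site/src
ls projects/site
```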

Overview

Now that we know a little more about organizing files, we'll learn about a few ways to inspect their contents.

wc

Calculate word counts

vm$ cat just-another-file.txt
This is the first line of just-another-file.txt
This is the second line of the file!
The file only has three lines, and this is the last one!
vm$ wc just-another-file.txt
  3  27 142 just-another-file.txt
vm$ 

The wc utility (short for "word count") displays the number of newlines, words, and bytes in a given file. The output is pretty terse, though!
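Each of the three columns can also be requested on its own, which makes the output easier to read. This is a small sketch using a made-up file named sample.txt.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"
printf 'one two\nthree\n' > sample.txt

wc -l sample.txt   # newlines only
wc -w sample.txt   # words only
wc -c sample.txt   # bytes only

# Reading from standard input leaves the file name out of the output.
wc -l < sample.txt
```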

grep

Search for patterns in text

vm$ grep CSSConf index.html
<h3 class="work-hed"><a href="http://2016.cssconf.com">CSSConf</a></h3>
CSSConf is a conference dedicated to the designers, developers and engineers
edge techniques, and tools. CSSConf US part of the international family of
CSSConfs, and is organized by Bocoup in collaboration with conference founder
vm$ 

grep allows us to search for text inside a file. By default, it displays each line of text that contains the provided value.

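grep's name comes from its historic context: it is based on the ed text editor command g/re/p ("globally search for a regular expression and print"). The name gives little hint of the purpose, so this is a case where rote memorization may be necessary.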
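A few options change what grep reports for each match. The sketch below shows three common ones against a made-up file named menu.txt.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"
printf 'Apple pie\nbanana bread\napple tart\n' > menu.txt

grep -n apple menu.txt   # prefix each matching line with its line number
grep -i apple menu.txt   # ignore case, so "Apple" matches too
grep -c apple menu.txt   # print only the number of matching lines
```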

vm$ grep -E '<h[1-5]' index.html
<h1 class="logo">
<h2 class="mission">Open Design & Technology Services for
<h3 class="section-hed"><strong>Partner with us</strong>
<h3 class="work-hed"><a href="https://bocoup.com/services
<h3 class="work-hed"><a href="https://bocoup.com/services
<h3 class="work-hed"><a href="https://bocoup.com/services
<h3 class="section-hed">We work across industries to brin
<h3 class="work-hed"><a href="https://bocoup.com/work/lyr
<h3 class="work-hed"><a href="https://bocoup.com/work/jsi
<h3 class="work-hed"><a href="https://bocoup.com/work/hbr
<h3 class="work-hed"><a href="https://bocoup.com/work/rue
<h3 class="section-hed">We create events to bring communi
<h3 class="work-hed"><a href="http://2016.cssconf.com">CS
<h3 class="section-hed">Our team <strong>creates, champio
<h2 class="section-hed">We'd love to hear from you. <stro
<h2>Join our newsletter for Bocoup news you can use!</h2>
vm$ 

grep also accepts regular expressions, which are a very powerful way to express text queries. Regular expressions are beyond the scope of this course, though, so don't worry if you're not comfortable using them.
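To give a rough feel for the pattern used above, here is the same expression run against a tiny made-up file named page.html. It is only a sketch of one regular expression feature, the bracketed character range.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"
printf '<h1>Title</h1>\n<p>Text</p>\n<h3>Section</h3>\n<hr>\n' > page.html

# [1-5] matches exactly one character in the range 1 to 5,
# so the <h1> and <h3> lines match while <hr> and <p> do not.
grep -E '<h[1-5]' page.html
```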

find

Locate files by meta-data

vm$ find src -name index.html
src/index.html
src/birds/index.html
src/birds/penguins/index.html
src/birds/puffins/index.html
src/cereal/index.html
src/cereal/capn-crunch/index.html
src/cereal/fruit-loops/index.html
vm$ 

find is also useful for locating files. Unlike grep (which is geared towards searching through files by their content), find is built to search for files according to meta-data such as file name and file type. If the provided path is a directory, find will search through all the files and directories inside.
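File type is another piece of meta-data that find can test. The sketch below uses the -type option on a made-up directory tree named site.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"
mkdir -p site/images site/docs
touch site/index.html site/docs/index.html site/images/logo.png

find site -type d                    # directories only
find site -type f                    # regular files only
find site -type f -name index.html   # both tests must match
```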

Remember that every utility defines its own rules for command-line options. As mentioned in the previous chapter, this allows for some inconsistency. find is an example--notice how -name is a "long" option with only one leading hyphen character.

This is one small case where Unix's complicated history has negatively impacted the user experience.

vm$ find documents/recipes -name *carrot*
vm$ echo *carrot*
giant-carrot.jpg
vm$ echo \*carrot\*
*carrot*
vm$ find documents/recipes -name \*carrot\*
documents/recipes/appetizers/cold-carrot-soup.pdf
documents/recipes/carrot-free
documents/recipes/desserts/carrot-cake.odt
documents/recipes/sides/carrots.pdf
vm$ 

find interprets the asterisk character (*) in the same way that our shell does. As we've seen, though, the shell normally substitutes that pattern before invoking the command. Because we want the asterisk character to make it all the way to the find program, we need to escape it with the backslash character (\).
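Single quotes are another way to keep the shell from expanding the pattern, and many people find them easier to read than backslashes. The sketch below runs both forms against a made-up recipes directory.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"
mkdir recipes
touch recipes/carrot-cake.odt recipes/carrots.pdf recipes/pie.pdf

# Backslashes and single quotes both stop the shell from expanding *,
# so find receives the literal pattern *carrot* in each case.
find recipes -name \*carrot\*
find recipes -name '*carrot*'
```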

Overview

Finally, we'll take a look at a few tools for modifying file contents.

nano

Edit text files interactively

vm$ nano hello.txt
  GNU nano 2.2.6        File: hello.txt

Hello, world!




                       [ Read 1 line ]
^G Get Hel^O WriteOu^R Read Fi^Y Prev Pa^K Cut Tex^C Cur Pos
^X Exit   ^J Justify^W Where I^V Next Pa^U UnCut T^T To Spell

nano is a text editor program that is available on many systems. There are a great number of alternative editors available (for instance, vim and emacs), but nano is commonly considered the most straightforward. This makes it a great choice when first getting started.

Editor commands are performed using the Ctrl modifier key. nano displays the most essential commands at the very bottom of the terminal window. These are somewhat truncated in the example above due to space limitations.

sed

Replace text in "streams"

vm$ cat quote.txt
It's 106 miles to Chicago, we got a full tank of gas, half a pack of
cigarettes, it's dark... and we're wearing sunglasses. 
vm$ sed s/Chicago/Boston/ quote.txt
It's 106 miles to Boston, we got a full tank of gas, half a pack of
cigarettes, it's dark... and we're wearing sunglasses. 
vm$ 

The sed utility allows us to replace text matching one pattern with another. The search pattern is the string between the first and second forward slash character (/), and the replacement is the string between the second and third forward slash character.

vm$ cat quote.txt
It's 106 miles to Chicago, we got a full tank of gas, half a pack of
cigarettes, it's dark... and we're wearing sunglasses. 
vm$ sed --in-place s/Chicago/Boston/ quote.txt
vm$ cat quote.txt
It's 106 miles to Boston, we got a full tank of gas, half a pack of
cigarettes, it's dark... and we're wearing sunglasses. 
vm$ 

By default, sed prints the result on the terminal. Because it is such a powerful tool, we may want to verify the effect of our command before actually changing any files. Once we're confident that the change meets our expectation, we can instruct sed to modify the file "in place" with the -i/--in-place option.
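The preview-then-apply workflow might look like the sketch below. It assumes GNU sed (for --in-place) and a made-up file named notes.txt. It also uses the g flag, which tells sed to replace every match on a line rather than only the first.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"
printf 'stats, stats, and more stats\n' > notes.txt

# Preview: without g, only the first match on each line is replaced.
sed s/stats/scores/ notes.txt

# Preview again with the g flag, which replaces every match on the line.
sed s/stats/scores/g notes.txt

# The file is still unchanged; --in-place writes the result back to it.
sed --in-place s/stats/scores/g notes.txt
cat notes.txt
```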

vm$ sed -r 's/(moz|webkit)R(equestAnimationFrame)/r\2/' -i src/utils/raf.js
vm$ 

Like grep, sed recognizes regular expressions. Actually, the behavior of sed can be controlled with an entire scripting language! The topic of writing sed scripts is outside the scope of this course, but it is an extremely powerful way to automate text file transformations.
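To see what a command like that one does, it can help to run it without -i on a small sample first. This sketch assumes GNU sed (for -r) and a made-up two-line file named raf.js.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"
printf 'window.mozRequestAnimationFrame(step);\nwindow.webkitRequestAnimationFrame(step);\n' > raf.js

# The quotes keep the shell from interpreting ( ) | and \ itself.
# \2 refers to the text matched by the second parenthesized group,
# so both vendor-prefixed names become requestAnimationFrame.
sed -r 's/(moz|webkit)R(equestAnimationFrame)/r\2/' raf.js
```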

awk

Display and modify text using a powerful scripting language

vm$ cat src/foo.css
body {
  margin-left: 0;
  padding: 0;
}
table.data {
  color: red;
  margin: 0 1em;
}
vm$ awk '/\{/ { s = $0 } /margin/ { print s "\n" $0 "\n}" }' src/foo.css
body {
  margin-left: 0;
}
table.data {
  margin: 0 1em;
}
vm$ 

The awk utility performs a similar function to sed, but it implements a programming language that some feel is much more maintainable. We won't go into the details of using awk in this course, but it is a good tool to reach for when you need to perform advanced text manipulation.
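For a rough idea of how awk programs are put together, here is a minimal sketch using a made-up file named ages.txt. Each program is a condition followed by an action in braces, and awk runs it once per line.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"
printf 'alice 30\nbob 25\ncarol 41\n' > ages.txt

# awk splits each line into fields: $1 is the first, $2 the second.
awk '{ print $1 }' ages.txt

# A condition before the braces limits the action to matching lines.
awk '$2 > 28 { print $1 }' ages.txt

# END runs once after the last line; here it prints the accumulated sum.
awk '{ total += $2 } END { print total }' ages.txt
```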

In Review

Exercise

The project in the /var/www/sportistician directory is a mess! Let's clean it up a bit.

  1. The files layout.html and style.css look like temporary work. Maybe you were in the middle of an experiment? We don't want to lose that work, so let's get them out of the way for now. Create a temporary directory and move those two files into it.

  2. The rest of the HTML files look good, but they probably shouldn't be strewn about in this directory. Make a directory within src named templates, and move all the HTML files there.

  3. The src directory has a flash sub-directory that no one has used in eight years. Let's delete it and everything inside.

  4. This directory has a bunch of hidden files. They aren't being used for anything, so please remove them.

  5. We maintain the src/styles/legacy.css file to support older web browsers. How many declarations does that file have for the <blink> tag?

  6. Another programmer left their editor's temporary files hanging around. They're all over the place, but at least all their names look similar--they all end in the tilde character (~). Could you remove them?

  7. This used to be a web application for tracking sports statistics, but the company pivoted and it is now a baking application. Let's update the readme.md file to reflect this--replace all occurrences of the word "stats" with "cupcakes".

  8. Finally, let's retrieve those experimental files. Move them from the temporary directory you created back to the project directory.

Solution

  1. First, we need to make a temporary directory using the mktemp utility:

    $ mktemp -d
    

    That command should print a directory name to the screen. This is the path to the new temporary directory. We'll use that as the destination for the files--replace {destination} with the path to the temporary directory in the following command:

    $ mv layout.html style.css {destination}
    
  2. The mkdir utility will create a directory at whatever path we provide:

    $ mkdir src/templates
    

    ...now we can move the HTML files. That's a lot of files, though! We could type them all out, but it will be much quicker to use the * character (remember: that tells our shell to substitute all the files that match):

    $ mv *.html src/templates
    
  3. This sounds like a job for rm. We'll get an error if we try to use it on a directory without any options, so we need to explicitly say we're looking to delete a directory:

    $ rm -r src/flash
    
  4. It's tempting to use the * character here, too, but that won't quite work. If we use echo to see how the expansion works...

    $ echo .*
    . .. .a .b .c .d
    

    ...we can see that this includes the . and .. directories. Practically speaking, this is okay because rm will fail to delete those, report errors, and continue deleting the others. But if we want to be precise, we're better off explicitly deleting each of the others:

    $ rm -r .a .b .c .d
    
  5. We'll use grep to find the references:

    $ grep blink src/styles/legacy.css
    

    There are a bunch of false positives here, though. They are easy to spot, but we can use a more advanced query to do that work for us:

    $ grep -E '\bblink\b' src/styles/legacy.css
    
  6. This is a two-parter. First, we'll use find to locate the files we want to remove:

    $ find src -name '*~'

    Next, we'll remove each of those files:

    This was better than manually inspecting the contents of each directory with ls, but it was still a little bit of a hassle to remove each file. Later on, we'll see how we can wire these two commands together.

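  7. Whenever you want to replace text, sed should come to mind. The g flag at the end of the expression replaces every occurrence on a line rather than only the first, and the --in-place option writes the change to the file instead of printing it:

    $ sed --in-place s/stats/cupcakes/g readme.md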

  8. This is the reverse of the first step; just remember the name of the temporary directory!

    $ mv {destination}/layout.html .
    $ mv {destination}/style.css .
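Since step 5 asks "how many", one option is to let grep do the counting with its -c option. The sketch below assumes GNU grep (for the \b word boundary used in the solution) and a made-up stand-in for legacy.css. Note that -c counts matching lines, not individual occurrences.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"
printf 'blink { color: red; }\n.blinking { color: blue; }\nblink.old { display: none; }\n' > legacy.css

# -c prints the number of matching lines instead of the lines themselves.
# The .blinking line is a false positive that \b filters out.
grep -c -E '\bblink\b' legacy.css
```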