Part III - Improving Your Workflow
File Management
Unix-like systems trade in files. A great deal of their configuration and even their state is stored in files spread across the file system. This is distinct from Microsoft Windows, for instance, where much of the analogous information is stored in a global "registry" database.
So not only is comfort with files an essential part of improving your workflow, its also crucial to understanding and interacting with the operating system itself.
Overview
- File organization:
mv
,rm
,mkdir
- File inspection:
wc
,grep
,find
- File manipulation:
nano
,sed
,awk
We're going to cover a whole bunch of utilities in this section. With the
exception of nano
, these programs are included in the POSIX standard. For
this course, we're concerned with the mainline use cases; you can consult the
man
pages for details on all the capabilities of each tool.
mv
Move files and directories
vm$ ls good-guys
dent-harvey.jpg
fries-victor.jpg
gordon-jim.jpg
kyle-selina.jpg
nigma-edward.jpg
vm$ ls bad-guys
vm$ mv good-guys/dent-harvey.jpg bad-guys
vm$ ls good-guys
fries-victor.jpg
gordon-jim.jpg
kyle-selina.jpg
vm$ ls bad-guys
dent-harvey.jpg
vm$
We'll use the mv
utility to move files and directories around. The
first option is the source and the second is the destination.
vm$ mv good-guys/fries-victor.jpg good-guys/kyle-selina.jpg bad-guys
vm$ ls good-guys
gordon-jim.jpg
vm$ ls bad-guys
dent-harvey.jpg
fries-victor.jpg
kyle-selina.jpg
vm$
We can use mv
to move multiple files at once. If there are more than two file
options, the very last will be interpreted as the destination directory for all
of the others. This feature can be very helpful when used with the *
shell
expansion character.
vm$ cd bad-guys
vm$ mv dent-harvey.jpg twoface.jpg
vm$ ls
fries-victor.jpg
kyle-selina.jpg
twoface.jpg
vm$
mv
is also commonly used to rename files. This might seem a little
counter-intuitive, though. Just remember that renaming a file is no different
from "moving" it between two names.
vm$ cd ..
vm$ mv bad-guys/twoface.jpg good-guys/dent-harvey.jpg
vm$
...and there is nothing stopping us from renaming a file as we change its directory.
rm
Remove files and directories
vm$ ls
my-directory
my-first-file
my-second-file
vm$ rm my-first-file my-second-file
vm$ ls
my-directory
vm$
We'll use rm
to remove files and directories.
vm$ ls
my-directory
vm$ rm my-directory
rm: cannot remove ‘dir’: Is a directory
vm$ rm -r my-directory
vm$ ls
vm$
By default, rm
will not remove directory--we'll need to use the r
option
to remove recursively. This extra option may seem inconvenient, but it
often protects you from accidentally deleting things.
mkdir
Make directories
vm$ mkdir my-new-directory
vm$ ls
my-new-directory
vm$ ls my-new-directory
vm$
mkdir
creates a new empty directory.
Overview
-
File organization:
mv
- change the location of files and directories (as in "mov")rm
- delete files and directories (as in "remove")mkdir
- create directories (as in "make directories")
- File inspection:
wc
,grep
,find
- File manipulation:
nano
,sed
,awk
Now that we know a little more about organizing files, we'll learn about a few ways to inspect their contents.
wc
Calculate word counts
vm$ cat just-another-file.txt
This is the first line of just-another-file.txt
This is the second line of the file!
The file only has three lines, and this is the last one!
vm$ wc just-another-file.txt
3 27 142 just-another-file.txt
vm$
The wc
utility (short for "word count") displays the number of
newlines, words, and bytes in a given file. The output is pretty terse, though!
grep
Search for patterns in text
vm$ grep CSSConf index.html
<h3 class="work-hed"><a href="http://2016.cssconf.com">CSSConf</a></h3>
CSSConf is a conference dedicated to the designers, developers and engineers
edge techniques, and tools. CSSConf US part of the international family of
CSSConfs, and is organized by Bocoup in collaboration with conference founder
vm$
grep
allows us to search for text inside a file. By default, it displays each
line of text that contains the provided value.
grep
's name comes from its historic context (based on a command for the ed
text editor), so this is a case where rote memorization may be necessary.
vm$ grep -E '<h[1-5]' index.html
<h1 class="logo">
<h2 class="mission">Open Design & Technology Services for
<h3 class="section-hed"><strong>Partner with us</strong>
<h3 class="work-hed"><a href="https://bocoup.com/services
<h3 class="work-hed"><a href="https://bocoup.com/services
<h3 class="work-hed"><a href="https://bocoup.com/services
<h3 class="section-hed">We work across industries to brin
<h3 class="work-hed"><a href="https://bocoup.com/work/lyr
<h3 class="work-hed"><a href="https://bocoup.com/work/jsi
<h3 class="work-hed"><a href="https://bocoup.com/work/hbr
<h3 class="work-hed"><a href="https://bocoup.com/work/rue
<h3 class="section-hed">We create events to bring communi
<h3 class="work-hed"><a href="http://2016.cssconf.com">CS
<h3 class="section-hed">Our team <strong>creates, champio
<h2 class="section-hed">We'd love to hear from you. <stro
<h2>Join our newsletter for Bocoup news you can use!</h2>
vm$
grep
also accepts regular expressions, which are a very powerful way to
express text queries. Regular expressions are beyond the scope of this course,
though, so don't worry if you're not comfortable using them.
find
Locate files by meta-data
vm$ find src -name index.html
src/index.html
src/birds/index.html
src/birds/penguins/index.html
src/birds/puffins/index.html
src/cereal/index.html
src/cereal/capn-crunch/index.html
src/cereal/fruit-loops/index.html
vm$
find
is also useful for locating files. Unlike grep
(which is geared
towards searching through files by their content), find
is built to search
for files according to meta-data such as file name and file type. If the
provided path is a directory, find
will search through all the files and
directories inside.
Remember that every utility defines its own rules for command-line options. As
mentioned in the previous chapter, this allows for some inconsistency. find
is an example--notice how -name
is a "long" option with only one leading
hyphen character.
This is one small case where Unix's complicated history has negatively impacted the user experience.
vm$ find documents/recipes -name *carrot*
vm$ echo *carrot*
giant-carrot.jpg
vm$ echo \*carrot\*
*carrot*
vm$ find documents/recipes -name \*carrot\*
documents/recipes/appetizers/cold-carrot-soup.pdf
documents/recipes/carrot-free
documents/recipes/desserts/carrot-cake.odt
documents/recipes/sides/carrots.pdf
vm$
find
interprets the asterisk character (*
) in the same way that our shell
does. As we've seen, though, the shell normally substitutes that pattern
before invoking the command. Because we want the asterisk character to make
it all the way to the find
program, we need to escape it with the backslash
character (\
).
Overview
-
File organization:
mv
- change the location of files and directories (as in "mov")rm
- delete files and directories (as in "remove")mkdir
- create directories (as in "make directories")
-
File inspection
wc
- display statistics for a text file, including details like number of words and number of lines ("word count")grep
- find text within a filefind
- find files and directories based on meta data
- File manipulation:
nano
,sed
,awk
Finally, we'll take a look at a few tools for modifying file contents.
nano
Edit text files interactively
vm$ nano hello.txt
GNU nano 2.2.6 File: hello.txt
Hello, world!
[ Read 1 line ]
^G Get Hel^O WriteOu^R Read Fi^Y Prev Pa^K Cut Tex^C Cur Pos
^X Exit ^J Justify^W Where I^V Next Pa^U UnCut T^T To Spell
nano
is a text editor program that is available on many systems. There are a
great number of alternative editors available (for instance, vim
and emacs
),
but nano
is commonly considered the most straightforward. This makes it a great
choice when first getting started.
Editor commands are performed using the Ctrl
modifier key. nano
displays
the most essential commands at the very bottom of the terminal window. These are
somewhat truncated in the example above due to space limitations.
sed
Replace text in "streams"
vm$ cat quote.txt
It's 106 miles to Chicago, we got a full tank of gas, half a pack of
cigarettes, it's dark... and we're wearing sunglasses.
vm$ sed s/Chicago/Boston/ quote.txt
It's 106 miles to Boston, we got a full tank of gas, half a pack of
cigarettes, it's dark... and we're wearing sunglasses.
vm$
The sed
utility allows us to replace text matching one pattern with another.
The search pattern is the string between the first and second forward slash
character (/
), and the replacement pattern the string between the second and
third forward slash character.
vm$ cat quote.txt
It's 106 miles to Chicago, we got a full tank of gas, half a pack of
cigarettes, it's dark... and we're wearing sunglasses.
vm$ sed --in-place s/Chicago/Boston/ quote.txt
vm$ cat quote.txt
It's 106 miles to Boston, we got a full tank of gas, half a pack of
cigarettes, it's dark... and we're wearing sunglasses.
vm$
By default, sed
prints the result on the terminal. Because it is such a
powerful tool, we may want to verify the effect of our command before actually
changing any files. Once we're confident that the change meets our expectation,
we can instruct sed
to modify the file "in place" with the -i
/--in-place
option.
vm$ sed -r s/(moz|webkit)R(equestAnimationFrame)/r\2/ -i src/utils/raf.js
vm$
Like grep
, sed
recognizes regular expressions. Actually, the behavior of
sed
can be controlled with an entire scripting language! The topic of writing
sed
scripts is outside the scope of this course, but it is an extremely
powerful way to automate text file transformations.
awk
Display and modify text using a powerful scripting language
vm$ cat src/foo.css
body {
margin-left: 0;
padding: 0;
}
table.data {
color: red;
margin: 0 1em;
}
vm$ awk '/\{/ { s = $0 } /margin/ { print s "\n" $0 "\n}" }' src/foo.css
body {
margin-left: 0;
}
table.data {
margin: 0 1em;
}
vm$
The awk
utility performs a similar function to sed
, but it implements a
programming language that some feel is much more maintainable. We won't go into
the details of using awk
in the course, but this is a good tool to reach for
when you need to perform advanced text manipulation.
In Review
-
File organization:
mv
- change the location of files and directories (as in "mov")rm
- delete files and directories (as in "remove")mkdir
- create directories (as in "make directories")
-
File inspection
wc
- display statistics for a text file, including details like number of words and number of lines ("word count")grep
- find text within a filefind
- find files and directories based on meta data
-
File manipulation
nano
- text editorsed
- find and replace text (as in "stream editor")awk
- find and replace text
Exercise
The project in the /var/www/sportistician
directory is a mess! Let's clean it
up a bit.
-
The files
layout.html
andstyle.css
look like temporary work. Maybe you were in the middle of an experiment? We don't want to lose that work, so lets get them out of the way for now. Create a temporary directory and move those two files into it. -
The rest of the HTML files look good, but they probably shouldn't be strewn about in this directory. Make a directory within
src
namedtemplates
, and move all the HTML files there. -
The
src
directory has aflash
sub-directory that no one has used in eight years. Lets delete it and everything inside. -
This directory has a bunch of hidden files. They aren't being used for anything, so please remove them.
-
We maintain the
src/styles/legacy.css
file to support older web browsers. How many declarations does that file have for the<blink>
tag? -
Another programmer left their editor's temporary files hanging around. They're all over the place, but at least all their names look similar--they all end in the tilde character (
~
). Could you remove them? -
This used to be a web application for tracking sports statistics, but the company pivoted and it is now a baking application. Let's update the
readme.md
file to reflect this--replace all occurrences of the word "stats" with "cupcakes". -
Finally, lets retrieve those experimental files. Move them from the temporary directory you created back to the project directory.
Solution
-
First, we need to make a temporary directory using the
mktemp
utility$ mktemp -d
That command should print a directory name to the screen. This is the path to the new temporary directory. We'll use that as the destination for the files--replace
{destination}
with the path to the temporary directory in the following command:$ mv layout.html style.css {destination}
-
The
mkdir
utility will create a directory at whatever path we provide:$ mkdir src/templates
...now we can move the HTML files. That's a lot of files, though! We could type them all out, but it will be much quicker to use the
*
character (remember: that tells our shell to substitute all the files that match)$ mv *.html src/templates
-
This sounds like a job for
rm
. We'll get an error if we try to use it on a directory without any options, so we need to explicitly say we're looking to delete a directory:$ rm -r src/flash
-
It's tempting to use the
*
character here, too, but that won't quite work. If we useecho
to see how the expansion works...$ echo .* . .. .a .b .c .d
...we can see that this includes the
.
and..
directories. Practically speaking, this is okay becauserm
will fail to delete those, report errors, and continue deleting the others. But if we want to be precise, we're better off explicitly deleting each of the others:$ rm -r .a .b .c .d
-
We'll use
grep
to find the references:$ grep blink src/styles/legacy.css
There are a bunch of false positives here, though. They are easy to spot, but we can use a more advanced query to do that work for us:
$ grep -E '\bblink\b' src/styles/legacy.css
-
This is a two-parter. First, we'll use
find
to locate the files we want to remove:$ find src -name '*~'
Next, we'll remove each of those files:
This was better than manually inspecting the contents of each directory with
ls
, but it was still a little bit of a hassle to remove each file. Later
on, we'll see how we can wire these two commands together.
-
Whenever you want to replace text,
sed
should come to mind.$ sed s/stats/cupcakes/ readme.md
-
This is the reverse of the first step, just remember the name of the temporary directory!
$ mv {destination}/layout.html . $ mv {destination}/style.css .