Composability
Objectives
Save the output of a program to a file
Connect the output of a program to the input of another
Use variables to store values
Isolate parts of a script with subshells
To follow along
Please navigate to the examples/composability directory.
Streams and redirections
Every process has 3 streams associated with it:
stdin: the input stream coming into the program, for example from your keyboard;
stdout: the stream coming out of the program, i.e. most of the text that is printed out in a terminal;
stderr: similar to stdout, but with messages that should be treated differently (e.g., error messages).
Clarification: stdin vs command arguments
Many commands will accept input from stdin and treat it as if it were a file.
For example,
$ wc -l myfile
22 myfile
This prints the number of lines in the file myfile.
But the command wc -l without an argument will wait for you to type in text, that is, for input data on its stdin stream.
$ wc -l
line one
line two
line three # PRESS CTRL-D here
3 # get the number of lines here
$
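The same difference can be seen without typing at the keyboard. The following is a minimal sketch, assuming a throwaway file called sample.txt (a hypothetical name, created in a temporary directory) and using < to connect that file to stdin: with a file argument, wc opens the file itself and reports its name; with <, it only sees a stream of lines.

```shell
# Work in a temporary directory so nothing in the lesson folder is touched.
tmpdir="$(mktemp -d)"
# sample.txt is a hypothetical file name used only for this sketch.
printf 'line one\nline two\nline three\n' > "$tmpdir/sample.txt"

# File given as an argument: wc opens the file and prints its name too.
with_arg="$(wc -l "$tmpdir/sample.txt")"

# File connected to stdin with '<': wc only sees the lines, not the name.
from_stdin="$(wc -l < "$tmpdir/sample.txt")"

echo "argument: $with_arg"
echo "stdin:    $from_stdin"

rm -r "$tmpdir"
```

In both cases the count is the same; only the way the data reaches the program differs.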
We can take the stdout of a process and redirect it to a file.
For example, to get the list of all files in the data directory:
find ./data
If we want to save the output into a new file, we use redirection. For example:
find ./data > all_files_here
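One detail worth keeping in mind is that > replaces the contents of the target file if it already exists. The sketch below illustrates this under stated assumptions: it builds a small stand-in data directory in a temporary location (the file names are made up for the example) rather than using the lesson's own data.

```shell
# Hypothetical stand-in for the lesson's data directory, built in a temp dir.
tmpdir="$(mktemp -d)"
mkdir "$tmpdir/data"
touch "$tmpdir/data/data1.dat" "$tmpdir/data/data2.dat"

# Nothing is printed here: the listing goes into the file instead.
find "$tmpdir/data" -name '*.dat' > "$tmpdir/all_files_here"

# Redirecting again with '>' replaces the previous contents of the file.
echo "only this line remains" > "$tmpdir/all_files_here"
contents="$(cat "$tmpdir/all_files_here")"
echo "$contents"

rm -r "$tmpdir"
```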
This does not capture everything a command prints: only the stdout of a process is redirected to the file.
To see this, try the script stream_example.sh:
$ ./stream_example.sh
This message goes to STDOUT
This message goes to STDERR instead
If we try to redirect its output to a file, the messages on STDERR are still printed to the terminal and do not end up in the file:
$ ./stream_example.sh > test_output
This message goes to STDERR instead
$ cat test_output
This message goes to STDOUT
If you want to redirect both streams, you can use &>:
$ ./stream_example.sh &> test_output
$ cat test_output
This message goes to STDOUT
This message goes to STDERR instead
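The two streams can also be redirected separately. The sketch below is an illustration only: the contents of stream_example.sh are not shown here, so a hypothetical function emit_both stands in for it. It uses 2> for stderr and the explicit form > file 2>&1, which in bash has the same effect as the &> shorthand.

```shell
# Hypothetical stand-in for stream_example.sh: one line on each stream.
emit_both() {
    echo "This message goes to STDOUT"
    echo "This message goes to STDERR instead" >&2
}

tmpdir="$(mktemp -d)"

# '>' captures stdout only; '2>' captures stderr only.
emit_both > "$tmpdir/out.txt" 2> "$tmpdir/err.txt"

# '> file 2>&1' sends stdout to the file, then points stderr at the same place.
emit_both > "$tmpdir/both.txt" 2>&1

out_only="$(cat "$tmpdir/out.txt")"
err_only="$(cat "$tmpdir/err.txt")"
both_count="$(grep -c 'This message' "$tmpdir/both.txt")"

echo "out.txt:  $out_only"
echo "err.txt:  $err_only"
echo "both.txt: $both_count lines"

rm -r "$tmpdir"
```

The order matters in the explicit form: 2>&1 has to come after the > file part, otherwise stderr is pointed at the terminal before stdout is redirected.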
Pipes
If the stdin of a program can be the stdout of another program, we can connect them together with a | character, called a pipe.
For example, we can check how many files are in the data directory with the command
$ find ./data | wc -l
5
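Pipes can be chained, so that each program works on the output of the previous one. A minimal sketch, assuming the standard sort and head utilities are available and using made-up input lines instead of the lesson's data:

```shell
# Three lines of made-up input, sorted alphabetically, then only the first kept.
first="$(printf 'pear\napple\nfig\n' | sort | head -n 1)"
echo "$first"

# Counting the lines that flow through the pipe, as with find | wc -l.
count="$(printf 'pear\napple\nfig\n' | wc -l)"
echo "$count"
```

No intermediate file is created: the data flows directly from one process to the next.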
Variables
There is a way to store small amounts of information in the shell without using a file: shell variables.
They can be set with the = sign (no spaces around it):
MYVAR=42
FILENAME=data/data1.dat
The value of a variable can be accessed by putting $ in front of its name.
It can be displayed with echo:
$ echo "$MYVAR"
42
Many environment variables are already defined in the shell. They typically influence the behaviour of the shell and of the programs you launch from it.
To see the variables that are visible to the programs you launch in the shell, you can use the env (or printenv) command.
There is a lot of output, so it is better to pipe it into the less command:
env | less
Variables can also be set from the output of a command, using command substitution:
FNAME="$(basename "$FILENAME")"
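As a small worked example of command substitution, the sketch below splits the FILENAME value used earlier into its two parts. DNAME is a variable name introduced here for illustration; basename and dirname are standard utilities.

```shell
FILENAME=data/data1.dat

# basename strips the directory part; the result is stored in FNAME.
FNAME="$(basename "$FILENAME")"
# dirname keeps only the directory part (DNAME is a name chosen for this sketch).
DNAME="$(dirname "$FILENAME")"

echo "$FNAME"    # prints data1.dat
echo "$DNAME"    # prints data
```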
Tip
If you do not use spaces and special characters in your filenames, using double quotes "..." is not always necessary, and your life will be easier.
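To see why the quotes matter when a name does contain a space, here is a minimal sketch. The helper count_args and the file name "my data.dat" are hypothetical and exist only for this illustration; no file is actually created.

```shell
# Helper (hypothetical name) that reports how many arguments it received.
count_args() { echo "$#"; }

# A made-up filename containing a space.
FILENAME="my data.dat"

# Quoted: the shell passes the value as a single argument.
quoted="$(count_args "$FILENAME")"
# Unquoted: the shell splits the value at the space into two arguments.
unquoted="$(count_args $FILENAME)"

echo "quoted:   $quoted argument(s)"
echo "unquoted: $unquoted argument(s)"
```

A command such as wc or cat would, in the unquoted case, look for two separate files called my and data.dat.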
Subshells and export
In order to define variables that are only visible to a part of a script, we can use a subshell. We can create one with ( ...list of commands... ):
$ (
C=12
echo $C
)
12
The C variable is defined in a subshell and is not visible outside:
$ echo $C
In a script, it is convenient to use a subshell if you need to change directory with cd.
At the end of the subshell, execution automatically returns to the previous directory.
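A minimal sketch of this pattern, assuming a temporary directory as the place to work in (the file name created_in_subshell is made up for the example):

```shell
start_dir="$PWD"
tmpdir="$(mktemp -d)"

(
    # The change of directory only affects the commands inside the parentheses.
    cd "$tmpdir" || exit 1
    touch created_in_subshell
)

# After the closing parenthesis we are still in the original directory.
end_dir="$PWD"
created="$(ls "$tmpdir")"

echo "before: $start_dir"
echo "after:  $end_dir"
echo "file created in the other directory: $created"

rm -r "$tmpdir"
```

There is no need for a matching cd back at the end, which also means a failed cd cannot leave the rest of the script in the wrong directory.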
For interactive work, we can also start a new child shell by invoking bash, and we can exit it with CTRL-D or the exit command.
Note: variables defined in the parent shell will not be visible in child shells unless we export them.
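A minimal sketch of export in action, assuming bash is installed; GREETING is a variable name chosen only for this illustration:

```shell
# GREETING is a made-up variable name for this sketch.
GREETING=hello
export GREETING

# bash -c starts a separate child shell; it sees GREETING because it was exported.
seen_by_child="$(bash -c 'echo "$GREETING"')"
echo "$seen_by_child"
```

The assignment and the export can also be written in one step, as export GREETING=hello.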
Variables in a subshell
Try this:
$ A=42
$ ( echo $A )
What is the output?
Is it expected, even if A is not exported?
Try this instead:
$ A=42
$ bash -c 'echo $A'