More Bash features and their composability
Objectives
Recognize and read advanced shell features
Isolate parts of a script with subshells
Modularize and organize a script with functions
To follow along
Please navigate to the examples/composability directory.
This episode builds upon the composability episode from the Intermediate part, discussing more sophisticated (and more niche) bash features.
More streams and redirections
Use a file as stdin for a process: <
The < redirection operator takes a file
and puts its contents in the stdin of a process,
as if we were typing it ourselves:
$ wc -l < myfile
17
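As a small sketch of what changes when the file arrives through stdin rather than as an argument (the file name demo_lines.txt is made up for this illustration): with <, wc never learns the file name, so it prints only the count.

```shell
# Create a small three-line file (hypothetical name, used only here).
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/demo_lines.txt"

# With a file argument, wc prints the count followed by the file name.
with_arg=$(wc -l "$tmpdir/demo_lines.txt")

# With <, wc reads stdin and prints only the count.
from_stdin=$(wc -l < "$tmpdir/demo_lines.txt")

echo "argument: $with_arg"
echo "stdin:    $from_stdin"
rm -r "$tmpdir"
```

This is why the < form is handy in scripts when only the number is wanted.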
stderr into stdout or vice versa
&> can be used to redirect both streams
2>&1 redirects stderr to stdout
1>&2 does the opposite
Note: 2>&1 (or 1>&2)
goes after any redirection to files with >,
but before a pipe (|):
$ ./stream_example.sh > stdout_and_stderr 2>&1
$ ./stream_example.sh 2>&1 | wc -l
2
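The reason for this ordering is that redirections are processed from left to right. The sketch below illustrates it with a hypothetical function, emit_both, standing in for a program that writes one line to each stream (the contents of stream_example.sh are not shown here, so this is not a copy of it).

```shell
# Hypothetical stand-in for a program that writes to both streams.
emit_both() {
    echo "to stdout"
    echo "to stderr" >&2
}

tmpfile=$(mktemp)

# stdout is sent to the file first, then stderr is pointed at the same place.
emit_both > "$tmpfile" 2>&1
both_in_file=$(wc -l < "$tmpfile")

# Opposite order: stderr is duplicated from the original stdout
# (here, the command substitution), and only then stdout goes to the file.
leaked=$(emit_both 2>&1 > "$tmpfile")
only_stdout=$(cat "$tmpfile")

# Before a pipe, 2>&1 sends both streams through the pipe.
piped=$(emit_both 2>&1 | wc -l)

echo "lines in file: $both_in_file, leaked: $leaked, piped lines: $piped"
rm "$tmpfile"
```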
Process substitution: <()
Makes the output (stdout)
of the list of commands
inside the parentheses
available through a file name,
as if it had been written to a temporary file.
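A minimal sketch of what the shell actually hands to the command, assuming bash: the <() expression is replaced by a file name (its exact form depends on the system), which any command expecting a file argument can open. The comm example and its input lines are made up for illustration.

```shell
# <(...) expands to a file name that the command can open and read.
path=$(echo <(true))
echo "the shell passed: $path"

# comm expects two sorted files; here both are produced on the fly.
# -12 keeps only the lines common to both inputs.
common=$(comm -12 <(printf 'a\nb\nc\n') <(printf 'b\nc\nd\n'))
echo "$common"
```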
Check the difference in content between two files
In the current directory (examples/composability)
there are two files named file1 and file2.
How can you compare
only the first 3 lines of both files
with the diff command?
Solution
$ diff <(head -n 3 file1) <(head -n 3 file2)
3c3
< the lazy dog.
---
> the white dog.
Compound commands
Group a list of commands together with { }.
The exit code of the compound command is the one of the last command executed.
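A short sketch of two common uses, with made-up variable names: applying one redirection to several commands at once, and checking the exit code of the group. Note that { must be followed by a space and the last command must end with ; or a newline before }.

```shell
tmpfile=$(mktemp)

# One redirection applies to every command in the group.
{
    echo "header"
    echo "body"
} > "$tmpfile"
lines=$(wc -l < "$tmpfile")

# The exit code of the group is that of the last command executed.
status_a=0; { true; false; } || status_a=$?   # last command is false
status_b=0; { false; true; } || status_b=$?   # last command is true

# A group runs in the current shell, so assignments inside it persist.
{ grouped_var="still here"; }

echo "$lines $status_a $status_b $grouped_var"
rm "$tmpfile"
```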
Subshells and export
In order to define variables that are only visible to a subset of a program,
we can use a subshell. We can create one with ( ...list of commands...):
$ (
C=12
echo $C
)
12
The C variable is defined in a subshell and is not visible outside:
$ echo $C
In a script, it is convenient to use a subshell
if you need to change directory with cd.
At the end of a subshell,
the execution will return automatically
to the previous directory.
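The cd pattern can be sketched as follows; the working directory is a throwaway one created with mktemp purely for the illustration.

```shell
start_dir=$PWD
workdir=$(mktemp -d)   # hypothetical working directory

(
    cd "$workdir" || exit 1
    touch created_in_subshell   # runs inside $workdir
)

# Back in the parent shell, the working directory is unchanged.
after_dir=$PWD
created=$(ls "$workdir")
echo "still in: $after_dir, file created: $created"
rm -r "$workdir"
```

No explicit cd back is needed, and a failed cd only ends the subshell rather than letting the following commands run in the wrong directory.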
For interactive work,
we can also create a subshell by invoking bash,
and we can exit it with CTRL-D or the exit command.
Note: variables defined in the parent shell will not be visible in a new shell started by invoking bash unless we export them.
Variables in a subshell
Try this:
$ A=42
$ (
echo $A
)
What is the output? Is it expected, even if A is not exported?
Try this instead:
$ A=42
$ bash -c 'echo $A'
And what is different when A is instead exported?
Solution
In the first case the value assigned to A is seen, even though A is not exported.
A subshell created with ( ) is a copy of the current shell,
so it inherits all the variables of its parent, exported or not,
and $A is expanded to 42 inside the subshell.
In the second case, bash -c starts a new, separate shell process, which receives only exported variables.
We use single quotes (') to make sure that the expression $A is not evaluated to 42 by the current shell before the bash command is executed.
Therefore, since in the new shell A is not defined, we get an empty line.
By exporting A instead, we get 42:
$ export A=42
$ bash -c 'echo $A'
42
Bash functions
Functions can be created with the following syntax:
function myfunction(){
    # $1 is the first argument passed to the function
    FIRST_ARG="$1"
    echo "$FIRST_ARG"
}
(the function keyword is actually optional in this case).
They can then be used by calling them and passing them arguments as normal bash commands:
myfunction 1 2 3
Bash functions do not define a scope
Any change a function makes to a variable affects the whole shell in which the function is executing, unless the variable has been declared inside the function with local.
This is actually by design.
What about the module command?
Subshells instead do define a scope.
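The difference can be sketched with three hypothetical functions that all try to increment the same variable. The local builtin and the ( ) function body are standard bash, but the function and variable names are invented for this example.

```shell
counter=0

# Assignments inside a function change the variable in the calling shell.
increment() {
    counter=$((counter + 1))
}

# local restricts a variable to the function (and the functions it calls).
increment_local() {
    local counter=100
    counter=$((counter + 1))
}

# Using ( ) instead of { } as the function body runs it in a subshell.
increment_isolated() (
    counter=$((counter + 1))
)

increment;          after_plain=$counter
increment_local;    after_local=$counter
increment_isolated; after_isolated=$counter
echo "$after_plain $after_local $after_isolated"
```

Only the first call changes counter in the calling shell; the other two leave it untouched.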
Arrays
Arrays are a bash feature that allows us to store a list of elements.
They are generated with the syntax
$ A=(first_element
second_element
...
)
The list of elements can be generated in various ways.
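A few of these ways, as a sketch: all file names and strings below are made up, and mapfile assumes bash 4 or newer.

```shell
# From brace expansion:
numbers=({1..5})

# From a glob (one element per matching file, spaces in names preserved):
tmpdir=$(mktemp -d)
touch "$tmpdir/a.dat" "$tmpdir/b c.dat"
files=("$tmpdir"/*.dat)

# From the lines of a command's output (one element per line):
mapfile -t lines < <(printf 'first line\nsecond line\n')

# From a command substitution (the output is split on whitespace):
words=($(echo "first line"))

echo "${#numbers[@]} ${#files[@]} ${#lines[@]} ${#words[@]}"
rm -r "$tmpdir"
```

The last form splits on every space, so it is only appropriate when the elements cannot contain whitespace.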
An element of the array (e.g., the second) can be accessed with the syntax
$ echo "${A[1]}"
second_element
and the whole array can be accessed,
for example in a for loop, in this way:
$ for element in "${A[@]}"
> do
> echo "$element"
> done
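The quotes around "${A[@]}" matter when elements contain spaces. The sketch below, using a made-up array, counts loop iterations with and without them, and shows how to get the number of elements and append to an array.

```shell
A=("first element" second third)

count=${#A[@]}     # number of elements
A+=(fourth)        # append one element
last=${A[${#A[@]}-1]}

# Quoted "${A[@]}" keeps "first element" as a single item;
# unquoted ${A[@]} splits it into two words.
quoted=0
for element in "${A[@]}"; do quoted=$((quoted + 1)); done
unquoted=0
for element in ${A[@]}; do unquoted=$((unquoted + 1)); done

echo "$count $last $quoted $unquoted"
```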
Note
Note the similarity and differences between the syntax for arrays and the syntax for subshells…
Arrays and command substitution
Create a variable named “outputfile” (which might represent the name of a file used for output in a script) that is composed of 3 strings:
Environment variable $LOGNAME
Arbitrary string of 4 characters generated in a subshell via: mktemp -u XXXX
First 2 characters of the current month (→ use the date command) using a bash array
Solution
A possible solution reads:
array=($(date))
month=${array[1]:0:2}
declare -r outputfile="${LOGNAME}_$(mktemp -u XXXX)_${month}.log"
echo "${outputfile}"
Since outputfile is declared read-only with declare -r, trying to change it gives an error:
outputfile="new"
Dealing with repetition: pipe into xargs instead of for
If the loop body is a one-liner, an alternative is particularly convenient.
Typically, we first use a command to generate the list
we want to iterate over.
For example:
find data -name '*.dat'
We can use a pipe and the xargs command to run ./process.sh once for each file in the list:
find data -name '*.dat' | xargs -I{} ./process.sh {}
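For comparison, here is a sketch of the loop and the xargs version side by side, with echo standing in for ./process.sh and a throwaway data directory created just for the example. The -print0 and -0 options (supported by GNU and BSD find and xargs) are an addition to the command above: they separate file names with a NUL byte, so that names containing spaces are passed intact.

```shell
tmpdir=$(mktemp -d)
mkdir "$tmpdir/data"
touch "$tmpdir/data/a.dat" "$tmpdir/data/b c.dat"

# The loop version (echo stands in for ./process.sh):
loop_out=$(for f in "$tmpdir"/data/*.dat; do echo "processing $f"; done)

# The xargs version: one echo invocation per file name found.
# sort is only there because find does not guarantee any order.
xargs_out=$(find "$tmpdir/data" -name '*.dat' -print0 |
    xargs -0 -I{} echo "processing {}" | sort)

echo "$xargs_out"
n_lines=$(echo "$xargs_out" | wc -l)
rm -r "$tmpdir"
```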