Shell basics

A shell is a program that acts as an intermediary between the user and the operating system. Its main job is to interpret the user's commands and ask the OS to execute them. There are many different shells: the Bourne Again Shell (bash), the TC Shell (tcsh), the Z Shell (zsh), and others. In these lectures we will discuss bash.

Writing a simple script

To write a simple script we can just put several commands together in one file. Let's write a simple script that prints the message "Today is" and then the current date.
echo Today is
date
We will save these two lines in the file simple. Now to execute this script we can either type the name of the shell and then the name of the file as the argument:
$ bash simple
or type a dot followed by a space and the file name, which runs the commands inside the current shell rather than in a new one:
$ . simple
Both of these methods will execute the simple script and we'll see something like
Today is
Thu Apr 24 15:07:56 EDT 2003
If we want all of the output on one line, we can use the -n option of the echo command, which prevents echo from printing the trailing newline:
echo -n "Today is "
date
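
The two ways of running a script shown above differ in where the commands run: bash starts a new shell process, while the dot command uses the current shell. The following is a minimal sketch of that difference, assuming a hypothetical file setvar (not part of the lecture) that does nothing but assign a variable:

```shell
#!/bin/bash
# Hypothetical helper file "setvar", created in a temporary directory.
unset greeting
tmp=$(mktemp -d)
echo 'greeting=hello' > "$tmp/setvar"

bash "$tmp/setvar"                  # runs in a child shell
after_bash=${greeting:-unset}       # the assignment did not reach this shell

. "$tmp/setvar"                     # runs in the current shell
after_dot=${greeting:-unset}        # now the variable is set here

echo "after bash: $after_bash"
echo "after dot:  $after_dot"
rm -r "$tmp"
```

For a script that only prints something, such as simple, both methods give the same result.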

Another way to execute a script is to make the script file executable. To do that, we first need to specify which shell should execute this particular script by adding one more line to it: the very first line must contain #! followed by the full path of the shell. For example, on the server we use in our classes the shell is /bin/bash, so the script becomes:

#!/bin/bash
echo -n "Today is "
date
Once we have done this, we need to make the file containing the script executable. The chmod command does that: use the option +x and give the file name as the argument:
$ chmod +x simple
After this step we can execute the script like any other command or program, just by typing its path and name:
$ ./simple

Redirecting input and output

Traditionally, scripts and programs written for the shell are relatively simple and designed to perform one small task. More complicated scripts are built by combining these small tasks, and to combine them we need to control the input and output of each one. There are several ways to redefine input and output for scripts and programs. The first is to make a program read from a file instead of the standard input (the console). To do that, we put the symbol < followed by the name of the file to read from after the program name. For example, the program sort reads all lines from the standard input, sorts them, and prints the sorted lines to the standard output (the screen). To sort a file and print it on the screen we can use
$ sort < simple

#!/bin/bash
date
echo -n "Today is "
To make the sort program write its output to another file instead of the screen, we can redefine the standard output. We use the symbol > followed by the name of the file to write to:
$ sort < simple > sorted
$ cat sorted

#!/bin/bash
date
echo -n "Today is "
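
The two redirections can be tried without touching any real files. The following is a small sketch with made-up data; the file names names and sorted_names are illustrative assumptions, and everything happens in a temporary directory:

```shell
#!/bin/bash
# Sketch with made-up data in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' mike bob bill > "$tmp/names"

# Read from names instead of the keyboard,
# write to sorted_names instead of the screen.
sort < "$tmp/names" > "$tmp/sorted_names"

result=$(cat "$tmp/sorted_names")
echo "$result"
rm -r "$tmp"
```

Note that > replaces the previous contents of the target file if it already exists.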
Another way to redirect input and output is called piping. By using the symbol | we can not only execute several commands in one line but also redirect the standard output of the first command to the standard input of the second. For example, we would like to know how many times students visited the ist334 folder of the department web server. Information about all visits is stored in the file access_log located in the /usr/local/apache/logs directory. We will use the command grep, which takes a regular expression and a file name as arguments and prints all the lines of the given file that contain a match. Since grep is case-sensitive, we spell the folder name exactly as it appears in the log:
$ grep ist334 /usr/local/apache/logs/access_log
208.58.169.98 - - [02/Apr/2003:13:26:46 -0500] "GET /ist334/wsh_intro.html HTTP/1.1" 200 9628
10.102.44.122 - - [02/Apr/2003:13:31:57 -0500] "GET /ist334/ HTTP/1.1" 200 2249
10.102.44.122 - - [02/Apr/2003:13:31:58 -0500] "GET /ist334/perl_regexp.html HTTP/1.1" 200 17547
10.102.44.122 - - [02/Apr/2003:14:25:51 -0500] "GET /ist334/ HTTP/1.1" 304 -
10.102.44.122 - - [02/Apr/2003:14:28:13 -0500] "GET /ist334/ HTTP/1.1" 304 -
10.102.44.122 - - [02/Apr/2003:14:28:20 -0500] "GET /ist334/wsh_basics.html HTTP/1.1" 200 15730
10.102.44.122 - - [02/Apr/2003:14:28:58 -0500] "GET /ist334/code/mix.wsf HTTP/1.1" 200 330
10.102.44.122 - - [02/Apr/2003:14:44:03 -0500] "GET /ist334/ HTTP/1.1" 200 2452
...
However, we do not need to see all of this information. All we want to know is how many of these lines there are in the file. To count them we will use the small program wc, which reads all lines from the standard input and counts the number of bytes, words, and lines; with the -l option it prints only the number of lines. Using the symbol | we redirect the grep output to the wc input:
$ grep ist334 /usr/local/apache/logs/access_log | wc -l
   1350
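
The same pipeline can be tried on any machine. The sketch below assumes a tiny made-up log that stands in for the real access_log; the addresses and sizes in it are invented for illustration only:

```shell
#!/bin/bash
# Sketch: a three-line made-up log replaces the real access_log.
tmp=$(mktemp -d)
cat > "$tmp/access_log" <<'EOF'
10.0.0.1 - - "GET /ist334/ HTTP/1.1" 200 2249
10.0.0.2 - - "GET /ist263/ HTTP/1.1" 200 1200
10.0.0.1 - - "GET /ist334/wsh_basics.html HTTP/1.1" 200 15730
EOF

# grep prints the matching lines; wc -l counts what arrives on its input.
count=$(grep ist334 "$tmp/access_log" | wc -l)

# grep -c counts the matching lines itself, without a pipe.
count_c=$(grep -c ist334 "$tmp/access_log")

echo "piped count: $((count)), grep -c count: $((count_c))"
rm -r "$tmp"
```

Both forms should report the same number; the pipe is shown because it generalizes to any pair of commands.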
Knowing this, we can write a small script that computes the number of visits for all classes offered this semester.
#!/bin/bash
echo -n "Scripts "
grep ist334 /usr/local/apache/logs/access_log | wc -l
echo -n "WEB     "
grep ist263 /usr/local/apache/logs/access_log | wc -l
echo -n "DB      "
grep ist480adbp /usr/local/apache/logs/access_log | wc -l
echo -n "C++     "
grep ist480acp /usr/local/apache/logs/access_log | wc -l
The output of the new script visits will look like
$ ./visits
Scripts    1350
WEB        1657
DB          550
C++         593
Now we can send the output of this script to the input of the sort program and sort these lines by the numbers of visits:
$ ./visits | sort -k 2 -n -r
WEB        1657
Scripts    1350
C++         593
DB          550

Arguments and variables

Arguments

We can pass arguments to a script just by typing them after the script name on the command line. Inside the script we refer to the arguments through the special variables $1, $2, etc. The special variable $0 holds the script name, and $# holds the number of arguments. The following script prints the script name, the number of arguments, and the first seven arguments:
#!/bin/bash
echo Script name: $0
echo Number of arguments: $#
echo Arguments:
echo 1: $1
echo 2: $2
echo 3: $3
echo 4: $4
echo 5: $5
echo 6: $6
echo 7: $7
Its output may look like
$ ./args Please test these arguments
Script name: ./args
Number of arguments: 4
Arguments:
1: Please
2: test
3: these
4: arguments
5:
6:
7:
Please note that with names of the form $n we can access only the first nine arguments; from the tenth on, the number must be enclosed in braces, as in ${10}. The special built-in command shift moves every command line argument one position down: the first argument is discarded, the second becomes the first, the third becomes the second, and so on. The special variable $@ contains all arguments as a list.
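
The effect of shift and $@ is easier to see in a short example. This is only a sketch: it uses set -- to simulate eleven command line arguments, and the for loop and the variable names are illustrative assumptions rather than part of the lecture scripts.

```shell
#!/bin/bash
# Sketch: set -- replaces the command line arguments of this script,
# which lets us try things out without typing them on the command line.
set -- one two three four five six seven eight nine ten eleven

tenth=${10}              # braces are needed past $9
count_before=$#

shift                    # drop "one"; "two" becomes $1
first_after_shift=$1
count_after=$#

# Walk over the remaining arguments, one per iteration.
joined=""
for arg in "$@"; do
    joined="$joined<$arg>"
done

echo "tenth=$tenth before=$count_before after=$count_after first=$first_after_shift"
echo "$joined"
```

Writing "$@" in double quotes keeps each argument intact even if it contains spaces.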

Local variables

We can also create user variables just by assigning values to valid identifiers. For example, to create a new variable dir we do this

dir=/usr/local/apache/logs/
There must be no spaces on either side of the assignment operator (=). If the value contains spaces, including leading ones, enclose it in double quotes:
names="bob bill mike"
To access the value of a variable inside the script we need to use the dollar sign followed by the variable identifier. For example, to print the value of the variable dir, we use
echo $dir
If a variable identifier must be immediately followed by some text, for instance when we want to use the variable dir defined above as the directory containing the file access_log, then we cannot write
grep ist334 $diraccess_log
because the shell will try to use the value of the variable diraccess_log. In situations such as this, we need to put the curly braces around the variable name:
grep ist334 ${dir}access_log
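
The difference between the two spellings can be checked without running grep at all. A minimal sketch, in which the variables wrong and right are illustrative names:

```shell
#!/bin/bash
# Sketch: compare the two spellings by storing what the shell expands them to.
dir=/usr/local/apache/logs/
unset diraccess_log

wrong="$diraccess_log"          # the shell looks up a variable named diraccess_log
right="${dir}access_log"        # the braces end the name after "dir"

echo "without braces: '$wrong'"
echo "with braces:    '$right'"
```

The first expansion is empty because no variable called diraccess_log exists.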
Using variables and arguments we can write a small script visit that takes only one argument, a directory name, and prints how many times this web directory was visited over the last month:
#!/bin/bash
dir=/usr/local/apache/logs
echo -n $1
grep $1 ${dir}/access_log | wc -l
Now we can use this script to get some statistics:
$ ./visit ist334
ist334   1352

Using this script, the previous script visits can be rewritten as

#!/bin/bash
locdir=/home/username/shell
${locdir}/visit ist334
${locdir}/visit ist263
${locdir}/visit ist489
${locdir}/visit ist362

Global variables

All variables we define in a script are local to this script. That is, we can see and use them only inside this script. Another script can define a variable with the same name but a different value, and this has no influence on the variable of the first script. Variables that are declared in the shell itself and are visible in all scripts are called global variables. You are, of course, familiar with some of them, such as PATH, HOSTNAME, and USER. To get a list of all variables known to the shell we can run the set command. We can also redirect the output of this command to a file and read the list from there:
$ set > globals.txt
Each script can use a global variable, but it works with its own copy: once a script modifies a global variable, the shell that started the script and scripts started elsewhere do not see the new value. A variable that a script defines itself is not visible even to the scripts it calls. To demonstrate this, let's write a short script gl_change that defines the variable MYHOME, prints its value, changes the value and prints it again, and then calls a script subscript that does exactly the same.
#!/bin/bash
# gl_change
MYHOME="my test home"
echo "Value of the variable MYHOME = '$MYHOME'"
MYHOME="'no home'"
echo "Now its value is $MYHOME"
./subscript
The script subscript is
#!/bin/bash
# subscript
echo "subscript: Value of the variable MYHOME = $MYHOME"
MYHOME="'subscript: no home'"
echo "subscript: Now its value is $MYHOME"
When we execute this script we'll get the following output
$ ./gl_change
Value of the variable MYHOME = 'my test home'
Now its value is 'no home'
subscript: Value of the variable MYHOME =
subscript: Now its value is 'subscript: no home'
As we can see the new variable MYHOME is visible inside the script, but not to its child scripts. To make the changes made in a script visible to the children, we need to make a local variable global for them. To do that, we need to use the export command. This command takes a variable name or a list of variable names (separated with spaces) and makes them visible to all programs called from this script.
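
The effect of export can be sketched in a few lines. In this sketch bash -c '...' plays the role of a child script such as subscript; the variable names before, after, and parent are illustrative assumptions.

```shell
#!/bin/bash
# Sketch: bash -c '...' stands in for a child script.
unset MYHOME
MYHOME="my test home"

# Not exported yet: the child process sees an empty value.
before=$(bash -c 'echo "child sees: [$MYHOME]"')

export MYHOME
# Exported: the child receives a copy of the variable.
after=$(bash -c 'echo "child sees: [$MYHOME]"')

# A change made in the child does not travel back to the parent.
bash -c 'MYHOME="changed in child"'
parent=$MYHOME

echo "$before"
echo "$after"
echo "parent still has: [$parent]"
```

Exporting therefore works in one direction only, from a script to the programs it starts.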

For example, if we do not want to type ./ in front of the script name (we have to, because by default Unix/Linux looks for programs only in the directories listed in the PATH variable), we need to add the current directory to the value of the global variable PATH. The current directory in Unix/Linux systems is denoted by a dot. That is, we need to change the value of PATH like this:

PATH=$PATH:.
export PATH
If we add these two lines to the file .bashrc located in our home directory, the new value of the global variable PATH will be visible to all scripts, because the shell itself runs the commands from this file every time we log in to the system.

The declare command

We can also define a new variable using the built-in declare command. The general syntax of the command is
declare [option_list] variable_name[=variable_value]
where option_list is a list of the following options:
Option  Description
-a      declares the variable as an array
-f      restricts the command to function names
-i      gives the variable the integer attribute, so values assigned to it are evaluated as arithmetic expressions
-r      makes the variable read-only
-x      marks the variable for export
For example, to define a variable that is not allowed to change we can write
declare -r constant="unchangeable"
echo "constant=$constant"
constant="new value"
echo "constant=$constant"
When we execute this script, the assignment fails with an error message and the value stays the same:
constant=unchangeable
bash: constant: readonly variable
constant=unchangeable
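
The other options can be tried in the same way. The following sketch assumes made-up variable names (total, classes, limit) and shows the -i, -a, and -r options together; the attempt to change the read-only variable is made in a subshell so that the error does not disturb the rest of the script.

```shell
#!/bin/bash
# Sketch of three declare options with made-up variable names.
declare -i total=5              # integer attribute
total=total+10                  # the right-hand side is evaluated arithmetically

declare -a classes=(ist334 ist263)   # an array with two elements
second=${classes[1]}                 # indexes start at 0

declare -r limit=3              # read-only
# Try the assignment in a subshell and record whether it succeeded.
if ( limit=4 ) 2>/dev/null; then
    changed=yes
else
    changed=no
fi

echo "total=$total second=$second limit=$limit changed=$changed"
```

Without the -i option the second assignment would store the literal text total+10.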

User input and program output

Most shells have special tools for accepting user input and capturing program output. These allow us not only to print something for the user, but also to get the user's response and to bring the output of programs executed inside a script back into the script itself.

User input

The special command read allows us to read user input and put it into a script variable. This command reads one line from the standard input and assigns this line to one or more script variables, which are passed as arguments of the read command. If there are more words in the line than variables, read assigns one word to each variable and the remaining words all go to the last variable. Here are two examples:
#!/bin/bash
# script: readname
echo -n "Please enter your name: "
read name
echo "You entered '$name'"
$ readname
Please enter your name: Bob Smith
You entered 'Bob Smith'
And the same example with two variables:
#!/bin/bash
# modified script readname
echo -n "Please enter your name: "
read firstname name
echo "      Your name: '$name'"
echo "Your first name: '$firstname'"
$ readname
Please enter your name: Bobby Fisher
      Your name: 'Fisher'
Your first name: 'Bobby'
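
How read distributes words over variables can be checked without typing anything. This sketch assumes bash's here-string syntax (<<<) to feed a fixed line to read instead of the keyboard; the names and sample lines are made up.

```shell
#!/bin/bash
# Sketch: <<< supplies the text on its right as the standard input of read.

# More words than variables: the last variable takes everything left over.
read first second rest <<< "Robert James Bobby Fischer"
echo "first='$first' second='$second' rest='$rest'"

# Fewer words than variables: the extra variable is left empty.
read a b c <<< "only two"
echo "a='$a' b='$b' c='$c'"
```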

Capturing program output

The special construction `...` or $(...) allows us to execute the command written in place of the dots and capture its output. We can either assign this output to a variable, immediately print it, or parse it. The following example prints the current date and time:
#!/bin/bash
today=`date`
echo "Today is $today"

By combining the read command and output capturing tool we can parse the output. For example, if we do not need to know the complete date, but only the day of the week we can write a small script fw (first word) that reads the whole input line, but prints back only the first word:

#!/bin/bash
# script: fw
read first rest
echo $first
We then pipe the output of the date command into this script. The result of the combination is the day of the week:
#!/bin/bash
# script: weekday
weekday=`date | fw`
echo "Today is $weekday"
$ weekday
Today is Sat
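
The same idea can be written without a separate fw file. In the sketch below a fixed string stands in for the output of date so that the result is predictable; the variable names sample and length are illustrative assumptions.

```shell
#!/bin/bash
# Sketch: a fixed sample line replaces the real output of date.
sample="Sat Apr 26 10:15:00 EDT 2003"

# The fw logic written inline: read takes the first word,
# echo prints it, and $(...) captures that output.
weekday=$(echo "$sample" | { read first rest; echo "$first"; })

# $(...) can be nested; the inner substitution runs first.
length=$(echo -n "$(echo "$weekday")" | wc -c)

echo "Today is $weekday ($((length)) characters)"
```

Nesting is the main practical reason to prefer $(...) over backquotes, which need extra escaping when nested.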

