Many of us have used the shell for years, but not all of us know how it actually works. That’s why today we will talk about what happens under the hood of the shell and how it is designed.
What is the shell anyway?
According to Wikipedia, a Unix shell is a command-line interpreter that provides a traditional Unix-like command-line user interface. The first Unix shell, written by Ken Thompson, was introduced in the first version of Unix in 1971. It was a simple command interpreter, not designed for scripting, but it nonetheless introduced several innovative features to the command-line interface and led to the development of the later Unix shells (including bash, zsh, fish and many more).
Why on Earth do I need a command-line interface?
Among hardcore computer types, the command line persists. If you’re a developer or a sysadmin / DevOps, there are times when it makes more sense to use the command line interface built into your operating system. All computing, at some level, is an abstraction and yet deep down beneath there are hardware instructions doing the job. In some cases, command line interfaces provide access to lower levels of a machine’s software and hardware. And they’re often easier to manipulate with “scripts,” mini text programs that automate processes for system administrators and others. (You can read more about it in this Wired.com article)
Okay, you’ve convinced me. So how does it actually work?
To explain that, we should start from the fact that the shell is a program written in the C language (aka “High-level Assembly”). To understand how it works, we need to talk about how it is designed, what it uses, and what challenges its authors solved. And because projects like bash are often too large to read comfortably (1792 lines of code in the shell.c file alone!), we will create our own simple version of the shell.
Here’s the source code of the shell.c file from our Simple Shell project, which we will use for demonstration purposes:
Everything in C starts with the main() function, which returns the program’s exit code. In our shell.c that is the zero at the very end of the file (a non-zero value indicates that the program has exited with an error). In our case, all non-zero exits are hidden under the hood of the functions called from main().
But enough about errors, let’s get back to our code. As we can see, the heart of our shell is the while loop, which is infinite in our case. Let’s go through it step by step to understand what it does and why it needs to be infinite.
At the very beginning of the loop, we have the signal function:
This call registers a handler for the POSIX SIGINT signal, which represents an interruption. The second parameter, sig_handler, works as a callback: it handles the Ctrl-C hotkey while the shell is in interactive mode.
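To make the callback idea concrete, here is a minimal sketch of registering a SIGINT handler. The body of sig_handler is an assumption (the project's real handler may do something different), and got_sigint and install_sigint_handler are hypothetical names used only for illustration.

```c
#include <signal.h>
#include <unistd.h>

/* Hypothetical flag so the rest of the program can see that Ctrl-C arrived. */
volatile sig_atomic_t got_sigint = 0;

/* Assumed handler body: remember the signal and redraw the prompt.
 * Only write() is used here because it is async-signal-safe. */
void sig_handler(int signo)
{
    if (signo == SIGINT) {
        ssize_t n = write(STDOUT_FILENO, "\n$ ", 3);
        (void)n; /* nothing useful can be done if the write fails */
        got_sigint = 1;
    }
}

/* Register the callback for SIGINT; returns 0 on success, -1 on failure. */
int install_sigint_handler(void)
{
    return signal(SIGINT, sig_handler) == SIG_ERR ? -1 : 0;
}
```

With the handler installed, pressing Ctrl-C no longer terminates the shell itself; the callback runs and the loop carries on.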
Then we have the if (isatty(STDIN_FILENO)) condition, which checks whether our shell is running interactively and, if so, writes a dollar sign as its prompt.
The next group of code contains the getline() function, which pauses the while loop and waits for the user’s input. It then reads whatever arrives on stdin into the line variable, using the buf buffer. The next function, _strtok(), creates an array of string tokens from the line variable and places it into a
The next group of code (lines 29–34) starts with an if statement that checks whether the user’s input is one of the built-in commands, such as env, which prints the OS environment in our case. If the env_handler() function returns true, the code inside the if condition is skipped, and the environment is printed.
Our loop reaches its end here and repeats.
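The prompt, read and tokenize steps of one loop iteration can be sketched roughly as follows. This is not the project's code: it uses the standard strtok() instead of the project's own _strtok(), and split_line, read_command and MAX_TOKENS are hypothetical names chosen for this example.

```c
#define _POSIX_C_SOURCE 200809L /* for getline() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_TOKENS 64 /* assumption: fixed upper bound for this sketch */

/* Split `line` in place on spaces, tabs and newlines.
 * Stores at most max - 1 pointers in `tokens`, NULL-terminates the
 * array, and returns the number of tokens stored. */
size_t split_line(char *line, char **tokens, size_t max)
{
    size_t n = 0;
    char *tok = strtok(line, " \t\n");

    while (tok != NULL && n + 1 < max) {
        tokens[n++] = tok;
        tok = strtok(NULL, " \t\n");
    }
    tokens[n] = NULL;
    return n;
}

/* One pass of the loop body: prompt if interactive, read a line, tokenize.
 * Returns the token count, or -1 at end of input. The caller owns
 * `*line` and `*cap`, so getline() can reuse and grow the buffer. */
long read_command(FILE *in, char **line, size_t *cap, char **tokens)
{
    if (isatty(STDIN_FILENO)) {
        fputs("$ ", stdout);
        fflush(stdout);
    }
    if (getline(line, cap, in) == -1)
        return -1;
    return (long)split_line(*line, tokens, MAX_TOKENS);
}
```

A caller would start with a NULL line pointer and a zero capacity, call read_command(stdin, &line, &cap, tokens) inside its loop, and free(line) once after the loop ends.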
This time we will type the ls -l command at our terminal’s prompt.
Because this time env_handler() did not find the ls command in its list of built-ins, it returns false, and execution continues to exit_handler(), which checks whether our token is the exit command or the EOL character (more about EOL). This function will not stop the program, so it will continue to the
The Path Resolver
Any operating system has environment variables. They are named, dynamic values that can affect the way running processes behave on a computer. They are part of the environment in which a process runs, and each new process inherits them from its parent. One of them, PATH, lists the directories in which the system looks for executable files (aka programs). This variable makes our life easier because without it we would have to remember the full path to each executable file and type it every time (e.g. /bin/ls instead of just ls).
path_resolver() to the rescue:
The function above iterates through the env pointer (which points to the current environment) line by line and looks for the line that begins exactly with “PATH=” (by convention, environment variables have the key=value format, so a function can rely on that). If it is found, the function stores the line in the path variable (as an array of characters).
In the next step, it gets the length of the path string and checks whether it is equal to 5 (this would mean that path equals “PATH=”, that is, the PATH environment variable is empty), and exits if that is true.
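The lookup and the length check just described could look something like the sketch below. It is an illustration, not the project's path_resolver(): find_path_value is a hypothetical name, and returning NULL for an empty PATH is an assumption about how the caller wants that case reported.

```c
#include <stddef.h>
#include <string.h>

/* Return a pointer to the value part of the "PATH=..." entry in `env`,
 * or NULL if PATH is missing or empty. `env` is a NULL-terminated array
 * of "key=value" strings, laid out like the global `environ`. */
const char *find_path_value(char *const *env)
{
    const char *prefix = "PATH=";
    size_t plen = strlen(prefix); /* 5 characters */

    for (size_t i = 0; env[i] != NULL; i++) {
        if (strncmp(env[i], prefix, plen) == 0) {
            if (strlen(env[i]) == plen)
                return NULL;      /* the entry is just "PATH=": empty */
            return env[i] + plen; /* skip past "PATH=" */
        }
    }
    return NULL;                  /* no PATH entry at all */
}
```

Comparing exactly five characters from the start of each entry is what keeps variables such as XPATH or PATHEXT from matching by accident.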
Then the function gets the resolved path via the path_resolve_helper() function and returns it to the main function.
In the long path_resolve_helper() listing above (sorry for its length) we can see that the function tries to parse each part of the path string (if our system’s PATH contains more than one path to executables, those paths are delimited by a colon, :). It finds the string sequence from the beginning up to the colon (line 19), then allocates memory on the heap both for the string token tok (line 22) and for the final path to the binary, represented by the bin variable (line 25).
The next operation, the get_str_seq() function, fills the tok variable with the sequence from the path string (line 29), which is then copied to the bin variable (line 30). After that, the _strcat() function is called twice (lines 31 & 32) to build the path to a binary executable by appending the slash delimiter (/) and the command that the user has typed (e.g.
Now we have our path built. But we still don’t know whether our executable exists at this path. To check that, we have a while loop that calls the access() function, which checks whether the file exists at the path and whether the current process has the rights to access it. If the check comes back false, we have to repeat with the next sequence from the path. Whenever it comes back true, the path exists and is accessible, so the function proceeds to line 48 and returns the resolved path back to our path_resolver(). And if no executable is available (the program knows this by checking whether the index of the last token is equal to the length of the whole PATH string), the function returns the command as is.
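The search described above can be condensed into a short sketch. It swaps the project's helpers (get_str_seq(), _strcat()) for the standard strcspn() and snprintf(), and it assumes the access() check is made with X_OK. The name resolve_in_path is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Search each colon-separated directory of `path` for `cmd`.
 * Returns a malloc'd "dir/cmd" for the first executable match, or a
 * malloc'd copy of `cmd` if nothing matches. Returns NULL only if an
 * allocation fails. The caller frees the result. */
char *resolve_in_path(const char *path, const char *cmd)
{
    const char *start = path;
    char *copy;

    while (*start != '\0') {
        size_t dlen = strcspn(start, ":");        /* up to next ':' or end */
        size_t need = dlen + 1 + strlen(cmd) + 1; /* dir + '/' + cmd + NUL */
        char *bin = malloc(need);

        if (bin == NULL)
            return NULL;
        snprintf(bin, need, "%.*s/%s", (int)dlen, start, cmd);
        /* Empty entries are skipped here for simplicity. */
        if (dlen > 0 && access(bin, X_OK) == 0)
            return bin;                           /* exists and is executable */
        free(bin);

        start += dlen;
        if (*start == ':')
            start++;                              /* step over the delimiter */
    }

    /* No directory contained the command: hand it back unchanged. */
    copy = malloc(strlen(cmd) + 1);
    if (copy != NULL)
        strcpy(copy, cmd);
    return copy;
}
```

For instance, resolve_in_path("/usr/local/bin:/bin", "ls") would yield "/bin/ls" on a system where ls lives only in /bin. Note that access() returns 0 on success and -1 on failure, so the "true" case in the text corresponds to a return value of 0.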
The Process Handler
At this point we have our path resolved, and for our ls -l command on Ubuntu Linux it looks like /bin/ls (as it does on most other Linux distributions). What happens next is the most interesting part: the execution of the shell goes to the proc_handler() function, which is responsible for the creation and management of the subprocess:
But what is a process, and why should we use one? The simple answer is that the commands we are able to execute (e.g. cat, etc.) are executables written in C, bash, Python etc., and they have their own exit codes and error messages. If one of these commands ran inside the shell itself and exited with an error, it would crash our shell and lead to its exit (that’s fine while we are playing with it, but imagine the case when this shell is used for our ssh session). Moreover, without processes, we would not be able to use semicolons (;), logical ANDs (&&), pipes (|), and redirections (<<) to combine our commands.
The proc_handler() function starts by assigning the return value of the fork() system call. In the context of the Unix operating system, fork is an operation whereby a process creates a copy of itself. It is usually a system call, implemented in the kernel. Fork is the primary (and historically, only) method of process creation on Unix-like operating systems. The process created by fork is a copy of the parent program, and the two processes are related as parent and child. The child process starts off with a copy of its parent’s file descriptors. It is important for us to know that the value returned by fork() (stored in the pid variable) is positive only in the parent process, where it is the child’s process ID, while in the child it is always zero (a return value of -1 means the fork failed). In the code above we have a check for that (line 19). If the process is a child, it tries to execute the command we provided via the execve() function; otherwise, it is the parent process, and it waits for its child process to complete.
The execve() function replaces the child process with the executable, which writes its output to the standard output inherited from the shell. When the child finishes, execution of the shell goes back to the shell.c file (line 26, with the getline()). That’s what the execution cycle of the shell looks like.
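For readers who want to see the fork, execve and wait steps side by side, here is a minimal sketch. It is not the project's proc_handler(): run_command is a hypothetical name, and the exit status 127 for a failed execve() is borrowed from common shell convention.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Run argv[0] (a full path) in a child process and wait for it.
 * Returns the child's exit status, or -1 if fork/waitpid failed or
 * the child was terminated by a signal. */
int run_command(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1) {        /* fork failed: no child was created */
        perror("fork");
        return -1;
    }
    if (pid == 0) {         /* child: replace this process image */
        execve(argv[0], argv, environ);
        perror(argv[0]);    /* reached only if execve() failed */
        _exit(127);
    }

    /* parent: pid holds the child's process ID */
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argument vector such as {"/bin/ls", "-l", NULL} would list the current directory and return the exit status of ls. Because the command runs in the child, a failing command only ends the child and the shell's loop keeps going.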
Of course, we didn’t touch the POSIX API of the kernel, but things are already 100% nerdy. Let’s talk about the kernel next time.
That’s all for today, folks, I hope it was interesting.
P.S. The link to a full GitHub repo