What happens when you type ls -l in the shell.

Many of us use the shell for years, but not all of us know how it actually works. That’s why today we will talk about what happens under the hood of the shell and how it is designed.

What is the shell anyway?

According to Wikipedia, a Unix shell is a command-line interpreter that provides a traditional Unix-like command-line user interface. The first Unix shell, written by Ken Thompson, was introduced in the first version of Unix in 1971. It was a simple command interpreter, not designed for scripting, but it nonetheless introduced several innovative features to the command-line interface and led to the development of the later Unix shells (including bash, zsh, fish and many more).

Why on Earth do I need a command-line interface?

Among hardcore computer types, the command line persists. If you’re a developer or a sysadmin / DevOps, there are times when it makes more sense to use the command line interface built into your operating system. All computing, at some level, is an abstraction and yet deep down beneath there are hardware instructions doing the job. In some cases, command line interfaces provide access to lower levels of a machine’s software and hardware. And they’re often easier to manipulate with “scripts,” mini text programs that automate processes for system administrators and others. (You can read more about it in this Wired.com article)

Okay, you’ve convinced me. So how does it actually work?

To explain that, we should start from the fact that the shell is a program written in the C language (aka “High-level Assembly”). To understand how it works, we need to talk about how it is designed, what it uses, and what challenges its authors solved. And because projects like bash are often too large to read in one sitting (1792 lines of code in the shell.c file alone!), we will create our own simple version of the shell.

Here’s the source code of the shell.c file from our Simple Shell project, which we will use for demonstration purposes:

shell.c

Everything in C starts with the main() function, which returns the program’s exit code. Here it is the zero at the very end of the file (a non-zero value indicates that the program has exited with an error). In our case, all non-zero exits are hidden under the hood of the functions called from main().

But enough about errors, let’s get back to our code. As we can see, the heart of our shell is the while loop, which in our case is infinite. Let’s go through it step by step to understand what it does and why we need it to be infinite.
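To make the shape of that loop concrete, here is a minimal sketch. It is not the project’s actual shell.c: the name shell_loop and the idea of simply counting the handled lines are our own assumptions for illustration, and a real shell would tokenize, resolve and run each line instead.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() under strict compilers */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical skeleton of a shell's read loop.
 * Reads lines from `in` until end of input or the "exit" command
 * and returns the number of non-empty lines it would have executed.
 */
int shell_loop(FILE *in)
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;
    int handled = 0;

    while (1) {
        ssize_t nread = getline(&line, &cap, in);

        if (nread == -1)                    /* end of input or read error */
            break;
        line[strcspn(line, "\n")] = '\0';   /* drop the trailing newline */
        if (strcmp(line, "exit") == 0)
            break;
        if (line[0] == '\0')                /* empty line: prompt again */
            continue;
        /* a real shell would tokenize, resolve and run the command here */
        handled++;
    }
    free(line);
    return handled;
}
```

The loop has no natural end because a shell does not know in advance how many commands it will receive; it keeps going until the input ends or the user asks to leave.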

At the very beginning of the loop, we have the signal function:

signal(SIGINT, sig_handler);

This call registers a handler for the POSIX SIGINT signal, which represents an interruption. The second parameter, sig_handler, works as a callback: it is invoked when the user presses Ctrl-C while the shell is in interactive mode.
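As an illustration, a handler of this kind could look roughly like the sketch below. The body of sig_handler here is our guess, not the project’s code, and the got_sigint flag and install_sigint_handler() helper are hypothetical names added for the example.

```c
#include <signal.h>
#include <unistd.h>

/* Hypothetical flag so the rest of the program can see that Ctrl-C arrived. */
static volatile sig_atomic_t got_sigint = 0;

/* Assumed handler body: remember the signal and print a fresh prompt. */
void sig_handler(int signo)
{
    if (signo == SIGINT) {
        ssize_t n;

        got_sigint = 1;
        /* write() is async-signal-safe, unlike printf() */
        n = write(STDOUT_FILENO, "\n$ ", 3);
        (void)n;
    }
}

/* Registers the handler; returns 0 on success, -1 on failure. */
int install_sigint_handler(void)
{
    return signal(SIGINT, sig_handler) == SIG_ERR ? -1 : 0;
}
```

With such a handler installed, Ctrl-C no longer kills the shell itself; it only interrupts the current line and brings the prompt back.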

Then we have the if (isatty(STDIN_FILENO)) condition, which checks whether our shell is running interactively and, if so, writes a dollar sign as its prompt.
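A small sketch of that check follows. The function name maybe_print_prompt and its in_fd parameter are assumptions made for the example (the shell itself passes STDIN_FILENO directly); taking the descriptor as an argument just makes the behaviour easy to try with a pipe.

```c
#include <unistd.h>

/*
 * Writes "$ " to standard output only when `in_fd` refers to a terminal.
 * Returns 1 if the prompt was written, 0 otherwise.
 */
int maybe_print_prompt(int in_fd)
{
    ssize_t n;

    if (!isatty(in_fd))      /* input is a pipe or a file: stay quiet */
        return 0;
    n = write(STDOUT_FILENO, "$ ", 2);
    return n == 2;
}
```

This is why a command such as echo "ls" | ./shell prints no prompt: standard input is a pipe there, so isatty() reports that nobody is typing.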

The simple shell prompt

The next group of code contains the getline() function, which pauses the while loop and waits for the user’s input. It then stores whatever it reads from stdin in the line variable, using the buf buffer. The next function, _strtok(), creates an array of string tokens from the line variable and places it into toks.
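The tokenizing step can be sketched with the standard strtok() function. This is a stand-in for the project’s own _strtok(), whose code is not shown here; split_line and MAX_TOKS are hypothetical names chosen for the example.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_TOKS 64  /* arbitrary cap chosen for this sketch */

/*
 * Splits `line` in place on spaces, tabs and newlines.
 * Stores up to MAX_TOKS - 1 pointers in `toks`, terminates the array
 * with NULL (the form execve() expects) and returns the token count.
 */
size_t split_line(char *line, char *toks[MAX_TOKS])
{
    size_t n = 0;
    char *tok = strtok(line, " \t\n");

    while (tok != NULL && n < MAX_TOKS - 1) {
        toks[n++] = tok;
        tok = strtok(NULL, " \t\n");
    }
    toks[n] = NULL;
    return n;
}
```

For the input "ls -l" this yields toks[0] = "ls", toks[1] = "-l" and toks[2] = NULL. Note that strtok() modifies the buffer it is given, so the line must be writable.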

The next group of code (lines 29–34) starts with an if statement that checks whether the user’s input is one of the built-in commands, such as env, which in our case prints the OS environment. If env_handler() returns true, the environment has been printed and the code inside the if block is skipped.
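A built-in of this kind can be sketched in a few lines. The function below is only an illustration of the idea: env_builtin and its parameters are hypothetical, and the real env_handler() may have a different signature.

```c
#include <stdio.h>
#include <string.h>

/*
 * If `cmd` is "env", prints every KEY=value entry of `envp`
 * (a NULL-terminated array) and returns 1.
 * Returns 0 for any other command, so the caller can keep going.
 */
int env_builtin(const char *cmd, char *const envp[])
{
    size_t i;

    if (cmd == NULL || strcmp(cmd, "env") != 0)
        return 0;
    for (i = 0; envp[i] != NULL; i++)
        puts(envp[i]);
    return 1;
}
```

The important design point is that a built-in runs inside the shell’s own process, so no new process has to be created for it.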

Our loop reaches its end here and repeats.

This time we will type the ls -l command at our terminal’s prompt.

Because this time env_handler() did not find ls among its list of built-ins, it returned false, and execution continues to exit_handler(), which checks whether our token is the exit command or the EOL character (more about EOL). For ls -l this function does not stop the program, so execution continues to the path_resolver() and proc_handler() functions.
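The exit check could be sketched as follows. This is a guess at the logic rather than the project’s exit_handler(): should_exit is a hypothetical name, and we assume that the end-of-input case shows up as getline() returning -1.

```c
#include <string.h>
#include <sys/types.h>

/*
 * Returns 1 when the shell should quit: either getline() reported the
 * end of input (nread == -1) or the first token is "exit".
 * Returns 0 for every other command, such as ls.
 */
int should_exit(ssize_t nread, char *const toks[])
{
    if (nread == -1)
        return 1;
    return toks[0] != NULL && strcmp(toks[0], "exit") == 0;
}
```

For our ls -l input the function returns 0, which is why the loop carries on to the path lookup.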

The Path Resolver

Any operating system has environment variables. They are dynamic named values that can affect the way running processes behave on a computer. They are part of the environment in which a process runs, and each new process inherits a copy of them from its parent. One of them, PATH, stores the list of directories in which the shell looks for executable files (aka programs). This variable makes our life easier, because without it we would be forced to remember the full path to each executable file. Moreover, we would be forced to type it every time (e.g. /bin/ls instead of just ls).

The path_resolver() to the rescue:

The function above iterates through the env pointer (which points to the current environment) line by line and looks for the line that begins exactly with “PATH=” (by convention, environment variables have the key=value format, so a function can rely on that). If the line is found, the function stores it in the path variable (as an array of characters).

In the next step, it gets the length of the path string and checks whether it is equal to 5. That would mean that path equals “PATH=”, that is, the environment variable PATH is empty, and the function exits in that case.
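The lookup and the length check described above can be sketched like this. The name find_path_value is hypothetical, and unlike the project’s function this version returns a pointer to the value instead of exiting when PATH is empty.

```c
#include <string.h>

/*
 * Scans the NULL-terminated environment array for the "PATH=" entry.
 * Returns a pointer to the text after "PATH=", or NULL when the
 * variable is missing or empty (the entry is exactly 5 characters long).
 */
const char *find_path_value(char *const envp[])
{
    size_t i;

    for (i = 0; envp[i] != NULL; i++) {
        if (strncmp(envp[i], "PATH=", 5) == 0) {
            if (strlen(envp[i]) == 5)   /* "PATH=" with nothing after it */
                return NULL;
            return envp[i] + 5;
        }
    }
    return NULL;
}
```

Comparing the first five characters, including the equals sign, is what keeps an unrelated variable such as MYPATH or PATHEXT from matching.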

Then the function gets the resolved path via the path_resolve_helper() function and returns it to the main function.

In the long listing above (sorry for its length) we can see that the function tries to parse each part of the path string (if our system’s PATH contains more than one directory of executables, those directories are delimited by the colon character :). It finds the string sequence from the beginning up to the colon (line 19), then allocates memory on the heap both for the string token tok (line 22) and for the final path to the binary, represented by the bin variable (line 25).
The next operation, performed by the get_str_seq() function, fills the tok variable with the sequence from the path string (line 29), which is then copied to the bin variable (line 30). After that, the _strcat() function is called twice (lines 31 & 32) to build the path to a binary executable by appending the slash delimiter (/) and the command that the user has typed (e.g. /usr/bin/ls).

Now we have our path built, but we still don’t know whether our executable exists at this path. To check that, we have a while loop that calls the access() function, which checks whether the file exists at the path and whether the current process has the rights to access it.

If the access() check fails (access() itself returns -1 on failure), we have to repeat with the next sequence from the path. Whenever the check succeeds (access() returns 0), the path exists and is accessible, so the function proceeds to line 48 and returns the resolved path back to our path_resolver(). And if no executable is found (the program knows that by checking whether the index of the last token is equal to the length of the whole PATH string), the function returns the command as is.
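The whole search can be condensed into one short function. This is a sketch under our own assumptions, not the project’s path_resolve_helper(): the name resolve_in_path is hypothetical, it writes into a caller-supplied buffer instead of allocating on the heap, it uses snprintf() in place of _strcat(), and it skips empty PATH entries for simplicity.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Tries "<dir>/<cmd>" for every colon-separated directory in `path`.
 * On the first executable match, leaves the full path in `out` and
 * returns 0. If nothing matches, copies `cmd` into `out` unchanged
 * and returns -1.
 */
int resolve_in_path(const char *path, const char *cmd,
                    char *out, size_t outsz)
{
    const char *start = path;

    while (*start != '\0') {
        size_t len = strcspn(start, ":");  /* length up to ':' or the end */

        if (len > 0) {
            int n = snprintf(out, outsz, "%.*s/%s", (int)len, start, cmd);

            /* access() returns 0 when the file exists and is executable */
            if (n > 0 && (size_t)n < outsz && access(out, X_OK) == 0)
                return 0;
        }
        start += len;
        if (*start == ':')                 /* step over the delimiter */
            start++;
    }
    snprintf(out, outsz, "%s", cmd);       /* fall back to the command as is */
    return -1;
}
```

With a PATH of /usr/bin:/bin and the command ls, the function tries /usr/bin/ls first and /bin/ls second, and stops at the first one that access() accepts.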

The Process Handler

At this point we have our path resolved, and for our ls -l command on Ubuntu Linux (as well as on most other Linux distributions) it looks like /bin/ls. What happens next is the most interesting part: the execution of the shell goes to the proc_handler() function, which is responsible for the creation and management of the subprocess:

But what is a process, and why should we use one? The simple answer is that the commands we are able to execute (e.g. ls, cp, cat, etc.) are executables written in C, bash, Python, etc., and they have their own exit codes and error messages. If we ran such a command inside the shell’s own process and it exited with an error, it would crash our shell and lead to its exit (that’s fine while we are playing with it, but imagine the case when this shell is used for our ssh session). Moreover, without processes we would not be able to use semicolons (;), logical ANDs (&&), pipes (|) and redirections (>, >>, <, <<) to combine our commands.

The proc_handler() function starts by storing the return value of the fork() system call. In the context of the Unix operating system, fork is an operation whereby a process creates a copy of itself. It is usually a system call, implemented in the kernel. Fork is the primary (and historically, only) method of process creation on Unix-like operating systems. The process created by fork is a copy of the parent program, and the two processes are related as parent and child. The child process starts off with a copy of its parent’s file descriptors. It is important for us to know that fork() returns a positive value (the child’s process identifier, stored in the pid variable) only in the parent process, while in the child it always returns zero; a negative value means that the fork failed. In the code above we have a check for that (line 19): if the process is the child, it tries to execute the command provided by us via the execve() function; otherwise it is the parent process, and it waits for its child process to complete.
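The fork, execve and wait sequence can be sketched as below. This is an illustration rather than the project’s proc_handler(): the name run_command is hypothetical, and the exit status 127 for a command that cannot be executed is a common shell convention that we assume here.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;  /* the current process environment */

/*
 * Runs argv[0] (a full path) with the given NULL-terminated arguments
 * in a child process and waits for it.
 * Returns the child's exit status, or -1 if it could not be obtained.
 */
int run_command(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {                   /* fork failed, no child exists */
        perror("fork");
        return -1;
    }
    if (pid == 0) {                  /* child: become the command */
        execve(argv[0], argv, environ);
        perror(argv[0]);             /* reached only if execve() failed */
        _exit(127);
    }
    /* parent: block until the child finishes */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with the array {"/bin/ls", "-l", NULL}, this prints the listing from the child while the shell itself stays alive and only receives the exit status. A failing command therefore cannot take the shell down with it.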

The execve() function replaces the child process with the executable, whose output goes to the standard output inherited from the shell. Once the child finishes, execution of the shell goes back to the shell.c file (line 26, with the getline()). That’s what the execution cycle of the shell looks like.

Of course, we didn’t touch the POSIX API of the kernel, but things are already 100% nerdy. Let’s talk about the kernel next time.

That’s all for today, folks, I hope it was interesting.

P.S. The link to a full GitHub repo

Hacker. 500 Startups Alumni. Envato Elite. School of AI Dean. Runner. Ukrainian. I write about software, AI & life. SF 🌁 https://cu7io.us
