About Shell And Echo

My first exposure to the UNIX shell was when I was told that I had to share access to the Internet with my younger siblings. Not keen on turning my computer into a time-sharing station, I had to build my first router. My dad worked in telecommunications and made sure every kid had their own landline phone and a PC in their room. But he wanted us to share a single Internet dial-up to save money. Thanks, Dad, you paved the road! ❤️

Growing up with the Intel 8086, MS-DOS, and later Windows, I knew nothing about UNIX except that everyone on the still-small Internet thought it was the superior operating system.

It is always advisable to trust people on the Internet 😉 and so I started installing FreeBSD on a spare machine. And then I built a router. I have absolutely no idea how I made it work. At that time my knowledge of computer networks was practically zero. The Internet was still delivered via dial-up lines. On top of that, I had no clue how the UNIX shell worked. But I made it work. Somehow.

Since that early exposure to the UNIX shell, I have occasionally uncovered little wonders and surprises. In this article, I’d like to share a few of the “uhm… what?” moments I had using echo and the shell. I’ll be using the Bourne Again Shell (bash) on Ubuntu Bionic for the demos. So it is not really a UNIX shell, but close enough.

Let’s first look at echo and how it is invoked by the shell:

$ echo

$

If we don’t provide an argument, it prints a newline. We can suppress the newline by adding -n:

$ echo -n
$

That is a feature of echo, not the shell, but we will come back to this later. Let’s look at a popular shell feature now: comments. We can add a comment to a command using the # symbol. Comments are stripped by the shell and never become part of the arguments that a binary is called with.

$ echo foo # bar
foo
$ echo foo #bar
foo

The string bar is never echoed because it is part of a comment. There is one important thing to notice: the # only starts a comment at the beginning of a word. It cannot appear in the middle of another word. If it does, it is not a comment anymore:

$ echo foo#bar
foo#bar
$ echo foo# bar
foo# bar

Besides comments, there is more pre-processing the shell can do for us. We all know about glob patterns, right? The patterns are matched against the files in a directory and save us a lot of time typing file names.

Let’s try that in an empty directory:

$ echo *
*

Uhm… wait? Isn’t the asterisk supposed to be replaced with the file names in that directory? Well, it is. But if there are no files to match, the shell keeps the asterisk and hands it over to echo. I find this surprising, and it can be a real problem in scripts: a loop like for f in *.txt will run once with the literal string *.txt if nothing matches.
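Bash can be told to behave differently: with the nullglob shell option set, an unmatched pattern expands to nothing instead. In our still-empty directory (switching the option back off afterwards, so the following demos behave as before):

$ shopt -s nullglob
$ echo *

$ shopt -u nullglob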

Ok, let’s add a file named * to that directory and see the difference:

$ touch '*'
$ echo *
*

๐Ÿง I can’t tell the difference. Can you? So how do we know if this is an actual filename or just the asterisk? Let’s check with ls if that file really exists:

$ ls
'*'

It does. And the filename is…? Is the file named * or '*'? Let’s find out using stat:

$ stat '*'
  File: *
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: 801h/2049d	Inode: 3147147     Links: 1
✂️

The filename is *. ls is just friendly enough to quote the name; that is why it appears as '*'. So when we call echo * we are actually seeing the filename *, not an unexpanded asterisk; the two just look identical. We can prove that the shell does the globbing by adding another file to the directory and checking that we get both files:

$ touch hello
$ echo *
* hello

Yeah, all the files are there. All the files? Well, not really. There is a convention that the shell ignores files starting with a . when matching a glob pattern. In every directory there is a self-link named . and a link to the parent directory named .. (in the root directory /, both . and .. point to the root itself). So how do we get all the files now? If we match for files that start with a dot, we do not get the other files. If we match for * the shell will hide the dot-files from us. The trick is to use both:

$ echo .* *
. .. * hello

The shell expands both patterns for us. The first one matched the dot-files only, the second one all files but the dot files. All results are passed to echo.
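The result above includes . and .., which is rarely what we want. A common (if cryptic) refinement is the pattern .[!.]*, which matches dot-files whose second character is not a dot and therefore skips both links (to be thorough, one would add ..?* for exotic names like ..foo). Adding a .hidden file to our directory:

$ touch .hidden
$ echo .[!.]* *
.hidden * hello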

But enough about glob patterns. Remember the -n option from earlier? Let’s say we want echo to print -n. How would we do that?

$ echo -n
$ 

😕 Clearly, that does not work. How about this?

$ echo "-n"
$

🙁 Nope. And this?

$ echo '-n'
$

😣 Nada. Nein. Njet. But why? The shell processes the words and hands them to echo afterward. It makes no difference whether we quote them: echo always sees -n and thinks it is a command-line option. So, how about using some force?

$ echo -n -n
$

😫 Impossible! Now echo thinks we are a bit out of sync, passing the same option twice. Forgiving as it is, it simply ignores the duplicate. But hey, I remember something about the double dash in bash! We can use it to mark the end of a parameter list. Let’s give that a shot:

$ echo -- -n
-- -n

😤 So close. But still not there. The shell won’t help us here. Luckily, the authors of echo have built in something that we can leverage: the -e option, which enables interpretation of backslash escape sequences. Let me quickly show you why the option alone will not save us:

$ echo -e "-n"
$

😡 echo still thinks we are passing an option. However, if we make the argument not look like an option, echo will treat it as a string.

$ echo -e "-n\n"
-n

$ 

🤨 Almost there! Now we have the output we want. And an extra newline. With escape interpretation enabled, we have unlocked the special escape sequence \c. It tells echo to stop producing output at the point where it appears in the string. Let’s combine this:

$ echo -e '-n\n\c'
-n
$

🤩 Hooray! We made it. We can even apply our new knowledge to make echo print -n but without a newline:

$ echo -e '-n\c'
-n$
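As an aside, the portable way around this whole dance is to sidestep echo’s option parsing entirely and use printf, which takes an explicit format string:

$ printf '%s\n' '-n'
-n
$ printf '%s' '-n'
-n$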

Given what we just learned, what do you think this command does?

$ >>'>' echo **'*'

UNIX-like processes and the imaginary disco ball

Earlier this week I found myself discussing UNIX process states with my colleague Michael. When it came to orphaned processes and daemonizing, I vaguely remembered that I had to fork() twice to get a daemon process. That statement was immediately challenged by Michael. Rightfully so, because forking twice is not the key action here. The missing piece was obtaining a new session between the first and the second fork. This ensures that a properly daemonized process cannot acquire a controlling terminal again. 🤯

But let’s slow down a little. What are all these forks and sessions I am talking about? Time for a quick refresher on operating system internals.

This is a wonderful chance to write some good ol’ C code! Why C? Because it is the right language for the job. Most operating system kernels are written in C (and assembler). Furthermore, according to my other colleague Robert, the moment you start writing C “the light dims, an imaginary disco ball lowers from the ceiling, and an encouraging atmosphere is created”. I could not have described the feeling of coding in C better!

coding in C as described by Robert

I prepared a couple of Docker images to make it easier to follow the article. Please find the source code on GitHub in the process-fun repository. Small, ready-to-run images are available on Docker Hub in the (surprise!) process-fun repository. Run the images on your local Docker-enabled machine as you like.

A Natural Process State

Processes don’t exist just for fun. They are there to get work done, play music, mine crypto coins, train a neural net, or send spam emails. A process’ natural state is running or runnable. That means everything is more or less in order and the kernel may grant some computing time to the process. Writing a program that does something meaningful is hard. Writing a program that does just something is much easier. Let’s stick with easy. 😉

#include <time.h>

int main(int argc, char *argv[]) {
    // run for 10 seconds
    time_t end = time(0) + 10;

    // do something (busy-loop to burn CPU time)
    volatile int i = 0;
    while (time(0) < end) {
        i++;
    }

    // exit ok
    return 0;
}

This program just runs for ten seconds, incrementing an integer to make sure some computing time is wasted. Let’s run it and see what the state of the corresponding process is:

$ docker run danrl/process-fun:running
✂️
Starting process-fun... done!
Process list:
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
    7     1     1     1 R    process-fun
    9     1     1     1 R    ps

We find the process state in the STAT column. Little surprise here, it is R which stands for running or runnable. Boring!

Just in case you are wondering: The tool used to generate the process list is ps but with a custom output format.
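I will spare you the details, but the invocation is presumably something along these lines (the exact flags used by state.sh may differ):

$ ps -A -o pid,ppid,pgid,sess,stat,comm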

Sleeping And Waiting

A wise person once told me that the best things in life are worth waiting for. I don’t know if that holds true for the following piece of code. 🤔

#include <unistd.h>

int main(int argc, char *argv[]) {
    // wait
    sleep(10);

    // exit ok
    return 0;
}

This program just sleeps for ten seconds. Let’s run it and see what the state of the corresponding process is:

$ docker run danrl/process-fun:waiting
✂️
Starting process-fun... done!
Process list:
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
    6     1     1     1 S    process-fun
    8     1     1     1 R    ps

Again, the expected result: the process is in the interruptible sleep state, which is indicated by S. Still pretty boring, right? Let’s wake the undead, that should be more fun! 🧟‍♀️🧟‍♂️

Bad Parenting

Once ready and started, we expect a process to be either doing something, waiting for something, being stopped (e.g. for debugging), or terminated. But there is another state: the defunct or zombie state. This happens when a child process that was forked off from a parent process has terminated, but the parent process has not yet collected its exit status. The exit status of a child can be collected using the wait() call (we call that reaping). However, a parent process may be busy doing something else or may simply decide not to reap. In that case the child process, although not alive, still has a process control block maintained by the kernel. The process is neither dead nor alive. A true zombie!

Here are a few lines of code that create a zombie process:

#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char *argv[]) {
    pid_t pid = fork();
    if (pid == 0) {
        // child exits immediately
        return 0;
    }
    // parent sleeping, not eager to call wait()
    sleep(10);

    // yeah, maybe now...
    wait(NULL);

    // exit ok
    return 0;
}

This program forks, and the child process exits immediately. The parent process is lazy: it sleeps for ten seconds before it bothers to reap the child. During that timeframe, between the child’s death and the parent’s call to wait(), we run ps:

$ docker run danrl/process-fun:zombie
✂️
Starting process-fun... done!
Process list:
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
    7     1     1     1 S    process-fun
    9     7     1     1 Z    process-fun <defunct>
   10     1     1     1 R    ps

The child process is in the defunct or zombie state. In the PID column, we see each unique process identifier (PID); the PPID column shows the respective parent’s process identifier. The output of ps confirms that process number 9 is a zombie child of process number 7. Once process number 7 calls wait() or exits, the zombie child will be removed from the process list.
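If a program truly never cares about its children’s exit status, zombies can be avoided altogether: POSIX specifies that explicitly setting the SIGCHLD disposition to SIG_IGN tells the kernel not to turn terminated children into zombies at all. A minimal sketch:

#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    // never turn terminated children into zombies (POSIX behavior)
    signal(SIGCHLD, SIG_IGN);

    pid_t pid = fork();
    if (pid == 0) {
        // child exits immediately, but leaves no zombie behind
        return 0;
    }
    // parent never calls wait(); run ps now and no Z state shows up
    sleep(10);
    return 0;
}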

The Forgotten Child

Although not waiting for a child process may be considered bad parenting, it is not the worst that can happen to a process. What happens if we fork off a child process and the parent process then terminates while the child is still out there?

Here is the code for that:

#include <unistd.h>
#include <sys/types.h>

int main(int argc, char *argv[]) {
    pid_t pid = fork();
    if (pid == 0) {
        // child being lazy
        sleep(30);
        return 0;
    }
    // parent exiting without waiting for the child
    sleep(5);
    return 0;
}

After about five seconds the parent process exits. The child process becomes orphaned, meaning it no longer has a valid parent process identifier. Orphaned processes become foster children of the process with PID 1, which is usually the init process. In our container, the PID 1 process is the state.sh shell script (the CMD configured in the Dockerfile).

$ docker run danrl/process-fun:orphaned
✂️
Starting process-fun... done!
Process list (before orphaned):
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
    6     1     1     1 S    process-fun
    8     6     1     1 S    process-fun
    9     1     1     1 R    ps

The parent process has PID 6 and the child process is identified by PID 8. The child’s parent is, therefore, the process with PID 6 (see column PPID). Once the parent process terminates, the situation changes:

Process list (after orphaned):
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
    8     1     1     1 S    process-fun
   11     1     1     1 R    ps

Now the process with PID 6 is gone and the child process is assigned to a new parent process. The PPID now reads 1.

Summoning The Daemon

In the previous example, the child process was successfully detached from the parent process. But it was still part of the same (terminal) session. This means the process could theoretically attach to the terminal again. We call a process a daemon process when it cannot attach to a terminal anymore. To get there, it has to migrate to a new session so that it is fully detached from the parent process and the parent process’ session.

Let’s have a look at how we can make that happen:

#include <unistd.h>
#include <sys/types.h>

int main(int argc, char *argv[]) {
    pid_t pid = fork();
    if (pid == 0) {
        // child 1 migrates to a new session
        setsid();
        pid_t pid2 = fork();
        if (pid2 == 0) {
            // child 2 (daemon) being lazy
            sleep(30);
            return 0;
        }
        // child 1 exiting ok
        return 0;
    }
    // parent exiting without waiting for the child
    sleep(5);
    return 0;
}

First, we fork off a child process and exit the parent. This orphans the child process; its new parent will be the init process. In the child, we request a new session via setsid(). By doing so, the child becomes the session leader. We don’t want the daemon itself to be a session leader, because a session leader is able to acquire a controlling terminal. That is why we fork a second time: the second child (the child of the first child) belongs to the new session but is not its leader, so it can never attach to a terminal. It is fully detached from the (terminal) session we originally ran the parent process from.
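A textbook daemon usually performs a little more housekeeping after the second fork. The demo code above skips this for brevity; a minimal sketch of the extra steps might look like this:

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// typical extra housekeeping inside the second child
static void finish_daemonizing(void) {
    umask(0);        // reset the file mode creation mask
    chdir("/");      // don't keep any mounted directory busy
    // detach the standard streams from the old terminal
    int fd = open("/dev/null", O_RDWR);
    dup2(fd, STDIN_FILENO);
    dup2(fd, STDOUT_FILENO);
    dup2(fd, STDERR_FILENO);
    if (fd > STDERR_FILENO) {
        close(fd);
    }
}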

$ docker run danrl/process-fun:daemonized
✂️
Starting process-fun... done!
Process list (before daemonizing):
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
    7     1     1     1 S    process-fun
    9     7     9     9 Zs   process-fun <defunct>
   10     1     9     9 S    process-fun
   11     1     1     1 R    ps

Here we see all three processes at once:

  • PID 7: The parent process.
  • PID 9: The first child. It has already exited, but the still-running parent has not reaped it yet, so it shows up as a zombie. It is also the leader of the new session, indicated by the lowercase s in the STAT column.
  • PID 10: The second child and the new daemonized process.

After waiting for the parent and the first child to terminate, the process table looks like this:

Process list (after daemonizing):
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
   10     1     9     9 S    process-fun
   13     1     1     1 R    ps

The process with PID 10 is now a true daemon. 😈

And this is why we fork twice. Mystery solved!

Further Reading

This was a quick, practical roundup of the most common process states that I see in my daily life as an SRE. There is more on Process States on Wikipedia and in the UNIX Internals book.

A Concurrency-safe, Iterable Binary Search Tree

In today’s article, we will implement a simple, unbalanced, concurrency-safe binary search tree in Golang. We will use some nice patterns along the way that are worth knowing.

I assume that the valued reader has a basic understanding of the Go programming language and knows what a binary search tree is. What I want to show is how easy it is to create a concurrency-safe version of this basic data structure in Golang.

Let’s begin! The API for our binary search tree should offer the following actions:

  • Value(key) returns a node’s associated data (the value)
  • Upsert(key, value) updates or inserts the value for a node identified by key
  • Iter() iterates over all nodes of the tree in order

The key shall be a string unique to the instance of the tree; it identifies an individual node. Regarding the value that is stored, we will be more agnostic and use an empty interface, thus allowing all kinds of data to be stored in our tree. From the data structure’s point of view, we do not care about the value. So why should we care about its type?

The Tree

For now, the tree is represented by a simple type that holds just the root node:

type BSTree struct {
	root *node
}

The Nodes

For the nodes we will be using a simple struct with fields for the key, value, and pointers to child nodes:

type node struct {
  key   string
  val   interface{}
  left  *node
  right *node
}

Retrieving the value of a node is, no surprise here, a recursive function. It leverages the fact that a binary search tree is, by definition, sorted. If we are searching for a key that is smaller than the key of the node we are looking at, we descend into the left child. If the key is larger, we descend into the right child. If neither applies, we have found the node we were looking for.

// ErrorNotFound is returned when a key does not exist in the tree.
var ErrorNotFound = errors.New("key not found")

func (n *node) value(key string) (interface{}, error) {
	if n == nil {
		return nil, ErrorNotFound
	}
	if key < n.key {
		return n.left.value(key)
	} else if key > n.key {
		return n.right.value(key)
	}
	return n.val, nil
}

Upsert

Upsert, the union of the update and insert operations, works very similarly. The only difference is that if we cannot find a node, we create it.

func (n *node) upsert(key string, val interface{}) {
	if key < n.key {
		if n.left == nil {
			n.left = &node{key: key, val: val}
		} else {
			n.left.upsert(key, val)
		}
	} else if key > n.key {
		if n.right == nil {
			n.right = &node{key: key, val: val}
		} else {
			n.right.upsert(key, val)
		}
	} else {
		n.val = val
	}
}

Concurrency-safe API using a Read-Write-Mutex

We will take two steps to ensure a concurrency-safe API:

  • Protect the tree from being written to by multiple writers at the same time by introducing a Read-Write-Mutex (a lock)
  • Create public wrapper methods that honor the mutex and call the private methods. For example, the method Upsert() is concurrency-safe while the method upsert() is not, so Upsert() has to take care of locking and unlocking.

It is considered good style to place the synchronizing entity directly above the field that it is protecting:

type BSTree struct {
	lock sync.RWMutex // <-- This one is protecting 'root'
	root *node
}

Here the sync.RWMutex is protecting the root node and all its descendants.

The public method Upsert() of the BSTree type acquires the lock and then calls the private method upsert() on the root node. The release of the lock is deferred for convenience.

func (b *BSTree) Upsert(key string, val interface{}) {
	b.lock.Lock()
	defer b.lock.Unlock()
	// if the tree is empty, the new node becomes the root
	if b.root == nil {
		b.root = &node{key: key, val: val}
	} else {
		b.root.upsert(key, val)
	}
}

Using this simple pattern we can keep the actual data-changing methods free of synchronization artifacts. At the same time, we have all the locking and unlocking concentrated in the public methods. I like this pattern because it helps me gain a quick overview of the entry points from calling code into my library code.
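The read path follows the same pattern. The full version lives in the repository; a sketch of the corresponding Value() wrapper might look like this, taking only the read lock so that multiple readers can proceed in parallel:

func (b *BSTree) Value(key string) (interface{}, error) {
	b.lock.RLock()
	defer b.lock.RUnlock()
	return b.root.value(key)
}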

Iterating using a Channel

Iterating over a binary tree in order is relatively easy. We recursively walk through all the nodes and emit the left subtree’s values, then the node’s value, and finally the right subtree’s values. Assuming we want to send the values into a channel, the code might look like this:

func (n *node) iter(ch chan<- Item) {
	if n == nil {
		return
	}
	n.left.iter(ch)
	ch <- Item{
		Key: n.key,
		Val: n.val,
	}
	n.right.iter(ch)
}

The channel is of type Item, a simple struct that holds the key and the value of a node. It looks like this:

type Item struct {
	Key string
	Val interface{}
}

Now comes the tricky part. We want the iter() method to be usable by external code in a concurrency-safe way. This means we can allow one or more readers, each with its own channel for Item retrieval, but no writer until the last reading channel is closed. The read-write synchronization is taken care of by the mutex already. Nice! However, parallel, non-blocking reading through multiple channels is not that easy. It asks for a smart use of channels and goroutines.

Let’s start simple: A reader calling Iter() shall receive a read-only channel. Obviously, we have to create this channel at the beginning of the method and return the channel when the method ends. This is easy:

func (b *BSTree) Iter() <-chan Item {
	ch := make(chan Item)
	b.lock.RLock()
	// here be dragons
	return ch
}

But wait, don’t we have to close the channel eventually? And how do we keep the program from blocking, given that the reader cannot receive from the channel before we have returned it? Furthermore, since the channel is unbuffered, every send blocks until the reader receives on the other end. And when do we unlock the mutex?

The solution is to delegate all these tasks to a goroutine. The goroutine is a closure and has access to the scope in which it was created. This allows us to call the iter() method on the root node, unlock the mutex, and close the channel, all inside the goroutine. Suddenly, everything we need is taken care of, and we can return the channel we just created right away.

func (b *BSTree) Iter() <-chan Item {
	ch := make(chan Item)
	b.lock.RLock()
	go func() {
		b.root.iter(ch)
		b.lock.RUnlock()
		close(ch)
	}()
	return ch
}

It is OK to be confused by the code at first. A lot is happening here and the pattern might not be clear from the beginning. Feel free to take your time to grasp it.
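To see the whole API in action, here is a small usage sketch, assuming it sits in the same package as the code above (package main here):

package main

import "fmt"

func main() {
	t := &BSTree{}
	t.Upsert("b", 2)
	t.Upsert("a", 1)
	t.Upsert("c", 3)

	if v, err := t.Value("a"); err == nil {
		fmt.Println("a =", v) // a = 1
	}

	// the range loop ends once Iter()'s goroutine closes the channel
	for item := range t.Iter() {
		fmt.Printf("%s=%v ", item.Key, item.Val) // a=1 b=2 c=3
	}
	fmt.Println()
}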

Conclusion

Implementing a binary search tree in Golang is straightforward and does not require deep language knowledge. By adding synchronizing tools like mutexes and channels, data structures can be made ready for showtime in concurrent programs.

Source

A full implementation including a Delete() method and 100% test coverage can be found in the golibby repository.