SREcon Asia/Australia Day 1 (Report)

It is SREcon time again! This iteration takes place in beautiful Singapore! 🌴 I flew in yesterday on a half-empty A380. First time on an A380 (amazing piece of engineering!), first time visiting Asia. It’s also the first time I’m staying in the same hotel where the conference I’m attending takes place. Not the first time I’m investing my savings into getting up to speed with a new role, but maybe the first time I’ve been very generous to myself. Anyway, here’s my report from day one.

The organizers urged us attendees to show up early to pick up badges and have breakfast. A minor hiccup with the network prevented the downloading of the conference badges, so we all headed to the much-needed coffee and breakfast first. Some, I learned, flew in at midnight and were up again at 7 am to help organize. Wow. Badge pickup started a few minutes later than planned but, as far as I understand, was still within the agreed-on SLO range. 😉

Opening Remarks

Paul Cowan and Xiao Li welcomed us and I learned this is the second SREcon in Asia. The organizers set up an on-call room for those of us who could not get ourselves out of rotation. Awesome service! A quick reminder of the USENIX Code of Conduct and the SREcon Slack followed. Some stats: 58 speakers and over 300 attendees this time. More companies, more diversity, and 2% more engineers. Hooray for engineers! 👩‍💻👩‍🔧

The Evolution of Site Reliability Engineering

Benjamin Purgason (LinkedIn) shared his experience with running an SRE team. When he joined the team, on-call was in-office and regular site outages were happening whenever the sun rose over California. It was incredibly helpful to learn that the big players had problems like this; it’s not only us. The founding principles for SRE at LinkedIn are:

  • Site Up (website and backend services)
  • Empower Developer Ownership
  • Operations is an Engineering Problem (They don’t want heroic actions in Ops, but rather build reliable software in the first place.)

I learned about the evolutionary steps of an SRE:

  • The Firefighter: Purely reactive, Incident Management all the time
  • The Gatekeeper: Change control. Protect “our” (SRE) site from “them” (Software Engineers). It is an evolutionary dead end; a team can get stuck there. Don’t do that!
  • The Advocate: Creating a reliability culture. Rebuilding trusted relationships. Still reactive to Software Engineering plans.
  • The Partner: Empowering intelligent risk. Proactive and joint planning with Software Engineering. Collaborating to magnify the impact.
  • The Engineer: Reliability throughout the software lifecycle. Proactive, one plan for SRE and SWE. Everyone has the same job: Help the company win.

Money Quotes:

  • Every day is Monday in Operations.
  • What gets measured gets fixed!
  • If you solve your biggest problem every day, you start with 100 problems and still have 100 problems a year later. But they have a smaller scope by then.
  • Human gatekeeping doesn’t scale.
  • Attack the problem, not the person.
  • There is no such thing as ‘the hole is in your side of the boat.’ (Fred Kofman)
  • How do you want to spend your time? Help me build a reliable site or help me fight the fire at 3 am?
  • Do not insulate, share the pain.
  • Contribute where it counts.
  • Unify SWE and SRE planning and priorities.

Link to the talk: The Evolution of Site Reliability Engineering

Safe Client Behavior

Ariel Goh from Google Sydney dug into the problem of handling over 2 billion Android clients with a significantly lower number of servers. Essentially, safe client behavior means: Do Not DDoS. Unsafe behaviors include periodic retries without proper backoff and unintentionally synchronized requests. The worst thing that can happen is the backend (servers) going down. Here’s what Ariel suggested for safe client behavior:

  • Add jitter to client code, do not sync periodically without having at least some randomness in the backoff time.
  • A synchronized startup does not seem like a problem, because not everyone starts their app at the same time, right? Well, some apps do background tasks that are bound to a specific time, e.g. synchronizing at 4 am. Adding jitter to the startup can help here.
  • Do not retry by default!
  • Retry with jitter and capped, exponential backoff and you are a much better behaving citizen.
  • Do not retry on out of quota or client errors (e.g. HTTP 400 errors)
  • Do (carefully) retry on network and server errors (e.g. HTTP 500 errors)
  • Implement Retry-After header in client and server.
  • Improve debugging by adding tags to requests including client name and version, the feature that triggered the request, if the request is the initial one or a retry.
  • On the server side: Prioritize interactive requests over background requests.
  • Additional tips for microservices: Have retry budgets and adaptive throttling. (The reasoning here is that microservices in your managed infrastructure probably have more insight into the state of the overall system than some random clients out there in the wild.)

Example code for adding jitter:
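The actual example lives in the slides. Since those are not published yet, here is a minimal Go sketch of the idea, capped exponential backoff with full jitter. The function name and constants are my own, not from the talk:

package main

import (
    "fmt"
    "math/rand"
    "time"
)

// backoffWithJitter returns a random wait time in [0, min(limit, base*2^attempt)).
// This is the "full jitter" variant: exponential growth, capped, fully randomized.
// base must be > 0; the cap guard is deliberately simplistic for this sketch.
func backoffWithJitter(attempt uint, base, limit time.Duration) time.Duration {
    d := base
    for i := uint(0); i < attempt && d < limit; i++ {
        d *= 2 // exponential growth
    }
    if d > limit {
        d = limit // cap the backoff
    }
    return time.Duration(rand.Int63n(int64(d))) // full jitter: pick uniformly from [0, d)
}

func main() {
    for attempt := uint(0); attempt < 5; attempt++ {
        wait := backoffWithJitter(attempt, 100*time.Millisecond, 10*time.Second)
        fmt.Printf("attempt %d: waiting %v before retrying\n", attempt, wait)
        // time.Sleep(wait) and retry the request here
    }
}

The same trick works for scheduled work: add a random offset to that 4 am sync instead of firing every client at exactly the same second.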

Make sure to get your hands on the slides once they are published. There are a lot of graphs in there showing the effects of different variants of jitter and backoff code. Eye-opening!

Ariel summarized the talk as follows:

  • Jitter everything
  • Don’t retry
  • If you retry, back off
  • Move control to the server
  • Expose info to the server
  • Use retry budgets
  • Use adaptive throttling

Link to the talk: Safe Client Behavior

Service Monitoring Manual - 2018 Edition

Nikola Dipanov from Facebook’s Production Engineering talked about monitoring in production. First, we have to ask the right question: What do we want to monitor? You may want to monitor different things depending on whether you are collecting data for a developer audience or for customers who are more interested in an SLA.

Levels on which data collection happens:

  • Host level
  • Service level
  • Mesh level (referring to the service mesh, the networking layer in a sense)
  • Rack/Cluster/Pod/… level (higher levels, failure domains)

Most of the talk was pretty basic, suggesting to use a time series database (what else?). However, there were interesting insights into how Facebook deals with monitoring challenges. They open sourced a couple of their tools, believe in structured logs, and are able to aggregate and query structured data using an internal tool called SCUBA. 📈

My highlight of the talk: War stories from Facebook Production Engineering. But I won’t spoil those, watch the recording once it is out. 🤫

Money Quotes:

  • Data hopefully become the lingua franca in your engineering organization.
  • Monitoring should be like git: Init on project start and be there for the whole lifecycle.
  • Do not wake up people for noise.

Link to the talk: Service Monitoring Manual - 2018 Edition

Doing Things the Hard Way

The more forgiving right-after-lunch time slot was taken by Chris Sinjakli from GoCardless. He did not need any forgiveness for the talk’s content, which was great. But the AV wasn’t forgiving of his USB-C MacBook. I gave him my older MacBook for the presentation and used his shiny new one to take notes. (I want my old keyboard back…)

The dangers of hiring a DevOps engineer when you have an infrastructure problem: it creates a new bottleneck, because everything then goes through DevOps. Instead, make contributions to infrastructure easier. Make it obvious to developers what to change, and how, in order to modify the infrastructure. That is what enabled developers to contribute to the infrastructure code. So when hiring someone for infrastructure, make sure they have a developer background.

Observability pays off in the longer term. It has to permeate everything you do to provide more value. Results include:

  • Faster debugging
  • Shorter outages

Another point I took home was: Once you change the core of your infrastructure, you may end up with an Everything project. A change that touches everything risks not changing anything at all in the end. So where to start? Stop building with the new world in mind. Build the smallest version possible.

Money Quotes:

  • In reality, the hard problems are not necessarily the most important problems.
  • Features are not done when shipped, but done when measured.
  • The one leap into the perfect infrastructure is ludicrous.
  • Do not rewrite everything from scratch.
  • You won’t avoid every mistake. It’s perfectly fine to correct…

Link to the talk: Doing Things the Hard Way

Achieving Observability into Your Application with OpenCensus

OpenCensus developer and former Google SRE Emil Mikulic introduced the OpenCensus framework. My team recently started using OpenCensus in new Golang microservices and we love it. The talk was about distributed tracing and explained traces and spans, and that for good propagation you have to generate the Trace ID and Span IDs as early as possible. This metadata is then propagated using HTTP headers. (I use gRPC often and get this for free there. Can highly recommend!) One probably wants to add application-level metrics (e.g. queue lengths) to the data that comes out of OpenCensus.
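To make that concrete, here is a minimal sketch of span creation with the OpenCensus Go tracing API. The handler, span names, and attributes are invented for illustration; in a real service you would also register an exporter and a sampler to see any of this data:

package main

import (
    "context"

    "go.opencensus.io/trace"
)

// fetchProfile is a hypothetical handler. The new span becomes a child of
// whatever span is already in ctx; the trace and span IDs travel with the context.
func fetchProfile(ctx context.Context, userID string) error {
    ctx, span := trace.StartSpan(ctx, "profile.Fetch")
    defer span.End()

    // Application-level metadata ends up on the span as attributes.
    span.AddAttributes(trace.StringAttribute("user_id", userID))

    return queryBackend(ctx, userID) // pass ctx on so the child span nests correctly
}

func queryBackend(ctx context.Context, userID string) error {
    _, span := trace.StartSpan(ctx, "backend.Query")
    defer span.End()
    // real work would happen here
    return nil
}

func main() {
    _ = fetchProfile(context.Background(), "42")
}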

If you are just starting with tracing, look into OpenCensus. I think it is the new standard and we use it all the time on my team.

There was a cool demo. The code is on GitHub.

Link to the talk: Achieving Observability into Your Application with OpenCensus

Comprehensive Container-Based Service Monitoring with Kubernetes and ISTIO

Being a huge fan of ISTIO, I had to go to Fred Moyer’s talk about Kubernetes and ISTIO. Fred works for Circonus. Fun fact: he wrote the very first ISTIO adapter and was awarded a ship in a bottle for that. ⛴

After a quick overview of the ISTIO components, Fred demonstrated the book shop example app. If you have, like me, played a bit with ISTIO already, this specific part of the talk will not provide too many new insights. I liked that he put the kubectl output on the slides rather than showing it in a small terminal window. That makes it more approachable to people watching the recording later.

Much has been said about the Four Golden Signals. Fred showed a different set of metrics, called RED (Rate, Errors, Duration), that can be gathered with ISTIO:

  • Rate: We have the number of requests and also get the ops per second on the ISTIO standard dashboard. That was easy!
  • Errors: We have the number of requests by HTTP status code. From that, we can derive the errors easily.
  • Duration: The best approximation may be the request duration percentiles. However, there are some dangers to that. They are an aggregated metric and may hide some bad tail.

The way to go for measuring durations may be the histogram. Histograms make some effects visible that would be hidden by percentiles. Also use heatmaps, of course. I love heatmaps! I also learned that writing custom metrics adapters for ISTIO is not very hard.
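To see why a single percentile can hide a bad tail, here is a toy Go simulation of my own (the numbers are invented, nothing to do with the demo app): 95% of requests are fast and 5% hit a slow path. The p90 looks perfectly healthy, while even a crude fixed-bucket histogram makes the slow mode impossible to miss.

package main

import (
    "fmt"
    "math/rand"
    "sort"
    "time"
)

func main() {
    // Simulate request durations: 95% fast (20-50ms), 5% slow (900-1200ms).
    samples := make([]time.Duration, 0, 10000)
    for i := 0; i < 10000; i++ {
        if rand.Float64() < 0.95 {
            samples = append(samples, time.Duration(20+rand.Intn(30))*time.Millisecond)
        } else {
            samples = append(samples, time.Duration(900+rand.Intn(300))*time.Millisecond)
        }
    }

    sort.Slice(samples, func(i, j int) bool { return samples[i] < samples[j] })
    fmt.Println("p90:", samples[len(samples)*90/100]) // looks great, hides the slow mode

    // Histogram with 100ms buckets: the second mode around one second shows up clearly.
    buckets := map[int]int{}
    for _, s := range samples {
        buckets[int(s/(100*time.Millisecond))]++
    }
    for b := 0; b <= 12; b++ {
        fmt.Printf("%4d-%4dms: %d\n", b*100, (b+1)*100, buckets[b])
    }
}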

Fun story: With the metrics that ISTIO provides, we can measure the number of rage clicks (user-induced retries). An indirect indicator of customer satisfaction. 😂

If you deal with SLIs or SLOs, you want to watch this talk. Highly recommended!

Money Quotes:

  • Percentiles are an output, not an input!
  • If you work with percentiles as SLI, ask yourself: Can you do better?
  • “The code is hosted on Microsoft…” (opens github.com) Good one! 🙃
  • Monitor services, not containers!

Link to the talk: Comprehensive Container-Based Service Monitoring with Kubernetes and ISTIO

Randomized Load Balancing, Caching, and Big-O-Math

Julius Plenz from Google started by letting us know that he won’t do the hard math on the slides but rather use visualizations. Very much appreciated! He began with bins of servers receiving requests. With random load balancing, those requests are not uniformly distributed, so we derive a metric from that called the peak-to-average ratio. We have to provision for peak load, so the natural thing to do is to reduce the peak-to-average ratio.

We can, with a high probability, predict the peak value for a server. One way to reduce the peak-to-average ratio is to scale vertically instead of horizontally. That’s not always possible, though. When you scale horizontally, the peak-to-average ratio becomes statistically worse. Typical peak-to-average ratios range from 1.25 to 1.4. The more you scale out your systems, the worse it gets (if you provision for peak load).
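Here is a quick Go simulation of my own (not from the talk) that reproduces the effect: distribute the same total load at random over more and more servers and watch the peak-to-average ratio climb.

package main

import (
    "fmt"
    "math/rand"
)

func main() {
    const totalRequests = 1000000
    for _, servers := range []int{10, 100, 1000, 10000} {
        load := make([]int, servers)
        for i := 0; i < totalRequests; i++ {
            load[rand.Intn(servers)]++ // randomized load balancing
        }
        peak := 0
        for _, l := range load {
            if l > peak {
                peak = l
            }
        }
        avg := float64(totalRequests) / float64(servers)
        fmt.Printf("%5d servers: peak-to-average = %.2f\n", servers, float64(peak)/avg)
    }
}

With 10 servers the ratio stays close to 1; with 10,000 servers (and thus far fewer requests per server) it lands in the 1.3 to 1.5 range, which matches the numbers from the talk.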

Money quotes:

  • Math to the rescue!
  • Don’t scale instances with traffic 1:1.
  • Moore’s law makes non-linear scaling more affordable over time.
  • Randomized load balancing is good if you have many things.
  • Randomized load balancing becomes worse if you scale your system in the wrong way.
  • Pay attention to the size of the (frontend) cache.

From the Q&A:

  • Usually, we cannot scale sublinearly. So the question here is not how to scale sublinearly, but how to design the system so that it does not scale too far above linear.
  • There are better load balancing strategies than randomization. However, beware of feedback loops! This is, in the end, an engineering question: are you willing to spend another roundtrip to learn about a server’s load before sending a request there?

Link to the talk: Randomized Load Balancing, Caching, and Big-O-Math

Cultural Nuance and Effective Collaboration for Multicultural Teams

Another talk I was super excited about. I spent the better part of my career in the military. While I learned some unique crisis-solving skills there, working in a multicultural team was not a strong focus in that environment. Unless you consider the western-dominated NATO a multicultural institution.

Ayyappadas Ravindran from LinkedIn presented three stories of intercultural experiences from his career. I am going to spoil one of them and leave the other two for the interested reader to check out by watching the recording.

Ayyappadas had his first one-on-one with his manager via phone. His manager always asked “What do you mean?” when they talked. That made him feel insulted. Did the manager think he was not capable of understanding what he was talking about? All of that was perceived as rude by Ayyappadas. When he met his manager in person, however, he learned that his manager was a really nice person and had a very high opinion of Ayyappadas. How come? The key is the cultural difference here: his manager, coming from a low-context culture, really wanted to know what Ayyappadas meant when he asked that question. But Ayyappadas, coming from a high-context culture, interpreted the question and understood it in a very different way.

I can highly recommend this talk!

Money quotes:

  • Look for what people mean and not what people say.
  • When in doubt, ask and do not assume.

Link to the talk: Cultural Nuance and Effective Collaboration for Multicultural Teams

My Summary

This is my personal summary which comes without further explanation. Think of this as a note to myself that accidentally went public:

  • Develop SRE to become a partner in crime with the devs, not police their code. Kind of hard when the code is barely production ready. No one said it would be easy, right?
  • ISTIO and OpenCensus are the way to go. I’m glad we are already on it and gaining experience with those frameworks.
  • Really cool how the community builds these flexible frameworks (Kubernetes, ISTIO, OpenCensus) which are inclusive of all kinds of underlying systems and connected TSDBs and log storage systems.
  • Histograms! We need more histograms! Don’t be afraid of non-uniform bin sizes.
  • Lee Kuan Yew, the founding father of Singapore, once claimed that air conditioning enabled Singapore’s success as much as multicultural tolerance. But does that mean every room must be chilled down that much? I did not pack warm clothes, but I wish I had. I think it is freezing cold in the conference rooms. ⛄️

About Shell And Echo

My first exposure to the UNIX shell was when I was told that I had to share access to the Internet with my younger siblings. Not keen on turning my computer into a time-sharing station, I had to build my first router. My dad worked in telecommunications and made sure every kid had their own landline phone and a PC in their room. But he wanted us to share a single Internet dial-up to save money. Thanks, Dad, you paved the road! ❤️

Growing up with the Intel 8086, MS-DOS and later Windows I knew nothing about UNIX except that everyone on the still small Internet thought it was the superior operating system.

It is always advisable to trust people on the Internet 😉 and so I started installing FreeBSD on a spare machine. And then I built a router. I have absolutely no idea how I made it work. At that time my knowledge about computer networks was practically zero. The Internet was still delivered via dial-up lines. On top of that, I had not a clue how the UNIX shell worked. But I made it work. Somehow.

Since that early exposure to the UNIX shell, I occasionally uncover little wonders and surprises. In this article, I’d like to share a few of these “uhm… what?” moments that I had when I used echo and the shell. I’ll be using the Bourne Again Shell on Ubuntu Bionic for the demos. So it is not really a UNIX shell but close enough.

Let’s first look at echo and how it is invoked by the shell:

$ echo

$

If we don’t provide an argument, it prints a newline. We can suppress the newline by adding -n:

$ echo -n
$

That is a feature of echo, not the shell, but we will come back to this later. Let’s look at a popular shell feature now: Comments. We can add a comment to a command by using the # symbol. Comments will not be interpreted and are not part of the arguments that a binary is called with.

$ echo foo # bar
foo
$ echo foo #bar
foo

The string bar is never echoed because it is part of a comment. There is one important thing to notice: the # only starts a comment at the beginning of a word. If it appears in the middle of another word, it is not a comment anymore:

$ echo foo#bar
foo#bar
$ echo foo# bar
foo# bar

Besides comments, there is more pre-processing the shell can do for us. We all know about the glob patterns, right? The patterns are applied to all files in a directory and save us a lot of time typing file names.

Let’s try that in an empty directory:

$ echo *
*

Uhm… wait? Isn’t the asterisk supposed to be replaced with the file names in that directory? Well, it is. But if there are no files to match, the shell keeps the asterisk and hands it over to echo. I find this surprising, and it could easily become a problem in scripts.

Ok, let’s add a file named * to that directory and see the difference:

$ touch '*'
$ echo *
*

🧐 I can’t tell the difference. Can you? So how do we know if this is an actual filename or just the asterisk? Let’s check with ls if that file really exists:

$ ls
'*'

It does. And the filename is…? Is the file named * or '*'? Let’s find out using stat:

$ stat '*'
  File: *
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: 801h/2049d	Inode: 3147147     Links: 1
✂️

The filename is *. ls is just friendly enough to quote the name; that is why it appears as '*'. So if we call echo * we should see the filename * and not a literal asterisk. We can prove that the shell does the globbing by adding another file to the directory and seeing if we get both files:

$ touch hello
$ echo *
* hello

Yeah, all the files are there. All the files? Well, not really. There is a convention that the shell ignores files that start with a . when matching the glob pattern. In every directory, there is a self-link named . and a link to the parent directory named .. (a special case is the root directory /, in which both . and .. are self-links). So how do we get all the files now? If we match for files that start with a dot, we do not get the other files. If we match for *, the shell will hide the dot-files from us. The trick is to use both:

$ echo .* *
. .. * hello

The shell expands both patterns for us. The first one matches the dot-files only, the second one matches all files except the dot-files. All results are passed to echo.

But enough about glob patterns. Remember the -n option from earlier? Let’s say we want to echo the string -n. How would we do that?

$ echo -n
$ 

😕 Clearly, that does not work. How about this?

$ echo "-n"
$

🙁 Nope. And this?

$ echo '-n'
$

😣 Nada. Nein. Njet. But why? The shell processes the words and hands them to echo afterward. It does not make a difference whether we quote them; echo always sees -n and thinks it is a command line option. So, how about using some force?

$ echo -n -n
$

😫 Impossible! Now echo thinks we are a bit out of sync by passing the same option twice. Forgivable as it is, it ignores one of the options. But hey, I remember something about the double dash in bash! We can use it to mark the end of a parameter list. Let’s give that a shot:

$ echo -- -n
-- -n

😤 So close. But still not there. The shell won’t help us here. Luckily, the authors of echo have built in something we can leverage: with the -e option, echo interprets backslash escape sequences in its arguments. Let me quickly show you why -e alone will not save us:

$ echo -e "-n"
$

😡 echo still thinks we are passing an option. However, if we make the argument not look like an option, echo will treat it as a string.

$ echo -e "-n\n"
-n

$ 

🤨 Almost there! Now we have the output we want. And an extra newline. With escape interpretation enabled, we also unlocked the special escape sequence \c. It tells echo to stop producing output at the point where it appears in a string. Let’s combine this:

$ echo -e '-n\n\c'
-n
$

🤩 Hooray! We made it. We can even apply our new knowledge to make echo print -n but without a newline:

$ echo -e '-n\c'
-n$

Given what we just learned, what do you think this command does?

$ >>'>' echo **'*'

UNIX-like processes and the imaginary disco ball

Earlier this week I found myself discussing UNIX process states with my colleague Michael. When it came to orphaned processes and daemonizing, I vaguely remembered that I had to fork() twice to get a daemon process. That statement was immediately challenged by Michael. Rightfully so, because forking twice is not the key action here. The missing piece was to obtain a new session between the first and the second fork. This ensures a properly daemonized process cannot acquire a controlling terminal again. 🤯

But let’s slow down a little. What is all this forking and sessions that I am talking about? Time for a quick refresher in operating system internals.

That is a wonderful chance to write some good ol’ C code! Why C? Because it is the right language for the job. Most operating system kernels are written in C (and assembler). Furthermore, according to my other colleague Robert, the moment you start writing C “the light dims, an imaginary disco ball lowers from the ceiling, and an encouraging atmosphere is created”. I could not have described the coding C feeling better!

coding in C as described by Robert

I prepared a couple of Docker images to make it easier to follow the article. Please find the source code on GitHub in the process-fun repository. Small, ready-to-run images are available on Docker Hub in the (surprise!) process-fun repository. Run the images on your local Docker-enabled machine as you like.

A Natural Process State

Processes don’t exist just for fun. They are there to get work done, play music, mine crypto coins, train a neural net, or send spam emails. A process’ natural state is running or runnable. That means everything is more or less in order and the Kernel may grant some computing time to the process. Writing a program that does something meaningful is hard. Writing a program that does just something is much easier. Let’s stick with easy. 😉

#include <time.h>

int main(int argc, char *argv[]) {
    // run for 10 seconds
    time_t end = time(0) + 10;

    // do something
    volatile int i = 0;
    while (time(0) < end) {
        i++;
    }

    // exit ok
    return 0;
}

This program just runs for ten seconds, incrementing an integer to make sure some computing time is wasted. Let’s run it and see what the state of the corresponding process is:

$ docker run danrl/process-fun:running
✂️
Starting process-fun... done!
Process list:
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
    7     1     1     1 R    process-fun
    9     1     1     1 R    ps

We find the process state in the STAT column. Little surprise here, it is R which stands for running or runnable. Boring!

Just in case you are wondering: The tool used to generate the process list is ps but with a custom output format.

Sleeping And Waiting

A wise person once told me that the best things in life are worth waiting for. I don’t know if that holds true for the following piece of code. 🤔

#include <unistd.h>

int main(int argc, char *argv[]) {
    // wait
    sleep(10);

    // exit ok
    return 0;
}

This program just sleeps for ten seconds. Let’s run it and see what the state of the corresponding process is:

$ docker run danrl/process-fun:waiting
✂️
Starting process-fun... done!
Process list:
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
    6     1     1     1 S    process-fun
    8     1     1     1 R    ps

Again, an expected result: the process is in the interruptible sleep state, which is indicated by S. Still pretty boring, right? Let’s wake the undead, that should be more fun! 🧟‍♀️🧟‍♂️

Bad Parenting

Once ready and started, we expect a process to be either doing something, waiting for something, being stopped (e.g. for debugging), or terminated. But there is another state: the defunct or zombie state. This happens when a child process that was forked off from a parent process has terminated, but the parent process has not yet collected its return state. The return state of a child can be collected using the wait() call (we call that reaping). However, a parent process may be busy doing something else or may simply decide not to reap. In that case the child process, although not alive, still has a process control block maintained by the Kernel. The process is neither dead nor alive. A true zombie!

Here are a few lines of code that create a zombie process:

#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char *argv[]) {
    pid_t pid = fork();
    if (pid == 0) {
        // child exits immediately
        return 0;
    }
    // parent sleeping, not eager to call wait()
    sleep(10);

    // yeah, maybe now...
    wait(NULL);

    // exit ok
    return 0;
}

This program forks and immediately exits the child process. The parent process is lazy: it sleeps for a couple of seconds before it bothers to reap the child. During that timeframe, in between the child’s death and the parent’s call to wait(), we run ps:

$ docker run danrl/process-fun:zombie
✂️
Starting process-fun... done!
Process list:
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
    7     1     1     1 S    process-fun
    9     7     1     1 Z    process-fun <defunct>
   10     1     1     1 R    ps

The child process is in the defunct or zombie state. In the PID column, we can see the unique process identifier (PID). The PPID column shows us the parent’s process identifier (PPID) respectively. The output of ps confirms that process number 9 is a zombie child of process number 7. Once process number 7 calls wait() or exits, the zombie child will be removed from the process list.

The Forgotten Child

Although not waiting for a child process may be considered bad parenting, it is not the worst that can happen to a process. What happens if we fork off a child process and the parent process terminates while the child is still out there?

Here is the code for that:

#include <unistd.h>
#include <sys/types.h>

int main(int argc, char *argv[]) {
    pid_t pid = fork();
    if (pid == 0) {
        // child being lazy
        sleep(30);
        return 0;
    }
    // parent exiting without waiting for the child
    sleep(5);
    return 0;
}

After about five seconds the parent process exits. The child process becomes orphaned, meaning it has no valid parent process identifier anymore. Orphaned processes become foster children of the process with the PID 1 which is often the init process. In our container, the PID 1 process is the state.sh shell script (the CMD configured in the Dockerfile).

$ docker run danrl/process-fun:orphaned
✂️
Starting process-fun... done!
Process list (before orphaned):
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
    6     1     1     1 S    process-fun
    8     6     1     1 S    process-fun
    9     1     1     1 R    ps

The parent process has PID 6 and the child process is identified by PID 8. The child’s parent is, therefore, the process with PID 6 (see column PPID). Once the parent process terminates, the situation changes:

Process list (after orphaned):
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
    8     1     1     1 S    process-fun
   11     1     1     1 R    ps

Now the process with PID 6 is gone and the child process is assigned to a new parent process. The PPID now reads 1.

Summoning The Daemon

In the previous example, the child process was successfully detached from the parent process. But it was still part of the same (terminal) session. This means the process could theoretically attach to the terminal again. We call a process a daemon process when it cannot attach to a terminal session. To achieve this, it has to migrate to a new session, fully detached from the parent process and the parent process’ session.

Let’s have a look at how we can make that happen:

#include <unistd.h>
#include <sys/types.h>

int main(int argc, char *argv[]) {
    pid_t pid = fork();
    if (pid == 0) {
        // child 1 migrates to a new session
        setsid();
        pid_t pid2 = fork();
        if (pid2 == 0) {
            // child 2 (daemon) being lazy
            sleep(30);
            return 0;
        }
        // child 1 exiting ok
        return 0;
    }
    // parent exiting without waiting for the child
    sleep(5);
    return 0;
}

First, we fork off a child process and exit the parent. This orphans the child process; its new parent will be the init process. In the child, we call setsid() to create a new session, and by doing so the child becomes the session leader. We do not want a daemon to be a session leader, because a session leader could acquire a controlling terminal. That is why we fork a second time: the second child (the child of the first child) belongs to the new session but is not its leader, so it can never attach to a terminal. It is fully detached from the (terminal) session we originally ran the parent process from.

$ docker run danrl/process-fun:daemonized
✂️
Starting process-fun... done!
Process list (before daemonizing):
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
    7     1     1     1 S    process-fun
    9     7     9     9 Zs   process-fun <defunct>
   10     1     9     9 S    process-fun
   11     1     1     1 R    ps

Here we see all three processes at once:

  • PID 7: The parent process.
  • PID 9: The first child. It is a zombie process since the parent is still running. It is the session leader of the new session. This is indicated by the lowercase s in the STAT column.
  • PID 10: The second child and the new daemonized process.

After waiting for the parent and the first child to terminate, the process table looks like this:

Process list (after daemonizing):
  PID  PPID  PGID  SESS STAT COMMAND
    1     0     1     1 Ss   state.sh
   10     1     9     9 S    process-fun
   13     1     1     1 R    ps

The process with PID 10 is now a true daemon. 😈

And this is why we fork twice. Mystery solved!

Further Reading

This was a quick, practical roundup of the most common process states that I see in my daily life as an SRE. There is more on Process States on Wikipedia and in the UNIX Internals book.