IPv6 and Let's Encrypt TLS on Google Kubernetes Engine

In a previous article I described how I deployed my blog on kubernetes and served it over HTTP. Today I’d like to add three more pieces:

  • Automate Let’s Encrypt certificate retrieval (and renewal)
  • Add a TLS-capable load balancer
  • Add IPv6 support (because it’s 2017)

Automating certificate management

Thanks to Let’s Encrypt, web servers can request trusted, signed certificates for free in a fully automated manner. A web traffic load balancer is basically a proxy server, acting like a web server on the frontend and like an HTTP client towards the backend. So why not let the load balancer’s frontend (the web server part) take care of fetching a certificate from Let’s Encrypt? We have seen other web servers, such as Caddy, taking care of certificate management.

Unfortunately, this feature is not available on Google Cloud Platform (GCP). Furthermore, I can imagine this working fine with a single load balancer, but failing at scale in a multi-balancer setup. The reason is that Let’s Encrypt rate-limits its API: one can request only so many certificates per week. But even if we had access to an unlimited API, it would still be a non-trivial task to make sure the right load balancer responds to the HTTP challenge request from Let’s Encrypt.
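To illustrate why the HTTP challenge is tricky with several balancers, here is a minimal sketch in Python. It assumes each balancer only knows the challenge tokens it requested itself; the token, the key authorization and the function name are hypothetical, and this is not how any particular load balancer implements it. Only the well-known path prefix is taken from the ACME protocol.

```python
ACME_PREFIX = "/.well-known/acme-challenge/"


def answer_challenge(path, pending):
    """Return (status, body) for a request that reaches one balancer.

    pending maps challenge tokens to key authorizations known to THIS balancer.
    """
    if not path.startswith(ACME_PREFIX):
        return 404, ""
    token = path[len(ACME_PREFIX):]
    if token in pending:
        return 200, pending[token]
    # The token was requested by some other balancer: validation fails here.
    return 404, ""


# hypothetical token and key authorization, for illustration only
balancer_a = {"token123": "token123.thumbprint"}
balancer_b = {}

print(answer_challenge(ACME_PREFIX + "token123", balancer_a))
print(answer_challenge(ACME_PREFIX + "token123", balancer_b))
```

If the validation request happens to land on the second balancer, the challenge fails even though the first one could have answered it.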

What we need to address the problem is software that retrieves and renews certificates and deploys them to our load balancer(s) whenever a relevant change occurs. A relevant change in this sense could be a modified hostname, a new subdomain, or the nearing expiration date of a currently deployed certificate. Fortunately, there is a tool for that already. There are actually multiple tools, and they run on kubernetes, making deployment really straightforward.

In this article we will use kube-lego, but I can highly recommend cert-manager, too. Of course, for non-production use cases only. 😉
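The notion of a "relevant change" mentioned above can be made concrete with a small sketch in Python. This is only an illustration under my own assumptions, not kube-lego's or cert-manager's actual logic; the function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def needs_new_certificate(wanted_hosts, cert_hosts, not_after, now, min_validity):
    """Return True if a certificate has to be requested or renewed."""
    # A modified hostname or a new subdomain: the certificate no longer
    # covers every host we want to serve.
    if not set(wanted_hosts) <= set(cert_hosts):
        return True
    # Nearing expiration: less than min_validity is left.
    return not_after - now < min_validity


now = datetime(2017, 11, 1, tzinfo=timezone.utc)
expires = now + timedelta(days=60)
print(needs_new_certificate(["test.danrl.com"], ["test.danrl.com"],
                            expires, now, timedelta(days=30)))
```

With 60 days of validity left and an unchanged host list, nothing needs to happen. Adding a host or moving closer to the expiration date flips the result.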

Note: If your kubernetes cluster has Role Based Access Control (RBAC) enabled then apply a profile to kube-lego that grants the required privileges before you proceed!

Deploying kube-lego

Like every other workload, we like to cage kube-lego into a dedicated namespace. We define the namespace in k8s/kube-lego.ns.yaml:

apiVersion: v1
kind: Namespace
metadata:
  name: kube-lego

And create it via the command line tool kubectl:

$ kubectl create -f k8s/kube-lego.ns.yaml

The next step is to define and configure the kube-lego deployment in k8s/kube-lego.deployment.yaml. For the initial deployment of kube-lego, I recommend setting LEGO_LOG_LEVEL to debug:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-lego
  namespace: kube-lego
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: kube-lego
    spec:
      containers:
      - name: kube-lego
        image: jetstack/kube-lego:0.1.5
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
        env:
        - name: LEGO_LOG_LEVEL
          value: info  # more verbose: debug
        - name: LEGO_EMAIL
          value: mail@example.com  # change this!
        - name: LEGO_URL
          value: https://acme-v01.api.letsencrypt.org/directory
        - name: LEGO_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: LEGO_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 50m
            memory: 50Mi
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 1

Once the namespace is ready we can deploy and check if the deployment succeeded:

$ kubectl create -f k8s/kube-lego.deployment.yaml
$ kubectl -n kube-lego get deployments
NAME        DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-lego   1         1         1            1           10m

Tip: Consider using a configmap as an alternative to hard-coding configuration parameters into a deployment.

Adding a TLS-enabled load balancer

With kube-lego there are two different ways of defining a load balancer. The easier (but more expensive) one is to use a load balancer provided by GCP. The alternative is deploying an nginx ingress pod and using that as the load balancer. I got good results from both in my experiments. For the sake of brevity, we will use the quicker GCP way in this article.

First, we need to create a kubernetes ingress object to balance and proxy incoming web traffic. The important part here is that we can influence the behavior of the ingress object by providing annotations.

  • kubernetes.io/ingress.class: "gce" This annotation lets kubernetes know that we want to use a GCP load balancer for ingress traffic. Obviously, this annotation does not make sense on kubernetes installations which do not run on GCP.
  • kubernetes.io/tls-acme: "true" This annotation allows kube-lego to manage the domains and certificates referenced in this ingress object for us. If we leave out this annotation, kube-lego will refrain from touching it or its associated kubernetes secrets.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/tls-acme: "true"
    kubernetes.io/ingress.class: "gce"
  name: website
  namespace: website
spec:
  rules:
  - host: test.danrl.com
    http:
      paths:
      - backend:
          serviceName: website
          servicePort: 80
        path: /
  tls:
  - hosts:
    - test.danrl.com
    secretName: test-danrl-com-certificate

$ kubectl create -f k8s/website.ingress.yaml

It may take a while for the ingress object to become fully visible. GCP is not the fastest fellow to spin up new load balancers in my experience. ⏱

$ kubectl -n website get ingress
NAME      HOSTS            ADDRESS       PORTS     AGE
website   test.danrl.com   80, 443   3m

Very soon after the load balancer is up and running, kube-lego should jump in and notice the lack of a certificate. It will fetch one and deploy it automatically. Awesome! We can watch this process in the logs. I use Stackdriver for collecting logs from kubernetes workloads, but there are many other options as well. Wherever your logs are, look out for a line similar to this one:

level=info msg="requesting certificate for test.danrl.com" context="ingress_tls" name=website namespace=website

Once the requested certificate has been received, kube-lego will create or update the secret for it. We can verify the existence of the secret:

$ kubectl -n website get secrets
NAME                          TYPE                                  DATA      AGE
test-danrl-com-certificate    kubernetes.io/tls                     2         22m

From now on, kube-lego will monitor the certificate and renew and replace it as necessary. The certificate should also show up in the load balancer configuration on the GCP console at Network Services → Load balancing → Certificates (you may have to enable the advanced menu at the bottom):

[Image: initial certificate]

To test the automation further we could trigger a certificate renewal by tweaking the LEGO_MINIMUM_VALIDITY environment variable (optional). For reference, here is the automatically retrieved follow-up certificate I got:

[Image: follow-up certificate]

Adding IPv6 to the load balancer

In the standard configuration GCP load balancers are started without an IPv6 address assigned. Technically, they can handle IPv6 traffic and we are free to assign IPv6 addresses to a GCP load balancer. To do this, we first have to reserve a static IPv6 address. This is done at VPC network → External IP addresses.

[Image: VPC external addresses]

Reserving an address means that it cannot be used by anyone else on the platform. If we reserve addresses but don’t use them, charges will apply.

[Image: reserve static address]

Once the address is reserved, we can assign it to the load balancer. To do that, we have to add an additional frontend for every address and every protocol (HTTP, HTTPS). That is, two frontends for each additional address.

[Image: add IPv6 to load balancer]

We have to do the same for HTTPS, too, of course. When setting the IPv6 HTTPS frontend, we select the current certificate from the dropdown menu.

Almost automated… 😤

And now I have some bad news for you. ☹️ IPv6 load balancer frontends, certificate renewal via kube-lego, and GCP load balancers do not go very well together (as of the time of writing). When kube-lego renews the certificates, it ignores manually added frontends. This means the certificate for the IPv6 address will not be replaced automatically. Very frustrating!

[Image: certificates differ]

In the screenshot we can see the new certificate k8s-ssl-1-website2-website2--a02b6ae745a706f8 alongside the old one k8s-ssl-website2-website2--a02b6ae745a706f8. The certificate was replaced only for the IPv4 frontend.
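One way to notice this kind of drift is to compare the certificate attached to each frontend with the one that was deployed most recently. Here is a minimal sketch in Python, assuming the frontend-to-certificate mapping has already been collected somehow. All names below are hypothetical, and nothing here talks to the GCP API.

```python
def stale_frontends(frontends, current_certificate):
    """Return the names of frontends that do not serve the current certificate."""
    return sorted(name for name, cert in frontends.items()
                  if cert != current_certificate)


# hypothetical frontend and certificate names, for illustration only
frontends = {
    "ipv4-https": "cert-new",
    "ipv6-https": "cert-old",
}
print(stale_frontends(frontends, "cert-new"))
```

Any frontend reported by such a check still has to be updated by hand, exactly like the IPv6 frontend in the screenshot.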

My Blog on Kubernetes

In my job as a Site Reliability Engineer I deploy new or updated services with zero downtime multiple times per day. In this article I’d like to explain how I usually perform this task by using my website as an example service.

The idea to put my website on kubernetes, as an exercise in applied over-engineering, came from this tweet by @dexhorthy.


As you can see in the picture he tweeted, running a small website on a planet-scale orchestration platform is like driving around a small load on a flatbed truck. However, using a service with limited complexity as an example allows us to concentrate on the important aspects of this article: the build pipeline and kubernetes. Kubernetes is open source container orchestration software that was inspired by Google’s famous job scheduler Borg.


So here is what we are going to do:

Here is a visualization to sprinkle some color into this topic:



I previously described how my website’s source is written in Markdown and compiled into static HTML using Hugo. Many of my articles in this blog contain syntax-highlighted code examples. The syntax highlighting is not performed by Hugo itself but by an external library called Pygments. Rendering my website’s static HTML therefore requires both tools to be installed, Hugo and the Pygments library. Although essential for rendering, neither is needed for serving the website. They are considered build tools in this context, comparable to a compiler for a program. Once a deployable artifact has been created, the build tools are no longer needed and should not be deployed or be part of the service in production.

To containerize the website we need two stages:

  • The first stage is a defined build environment containing all required build tools and the source of the website.
  • The second stage is the build artifact (HTML and assets) and a webserver to serve the artifact over HTTP.

Probably the most famous container building solution is Docker. Starting from version 17.05 Docker supports multi-stage builds. Multi-stage builds allow us to have a fully fledged build environment and still produce a lean artifact by moving specific files from one stage to the next and throwing away the rest of the stage.

Here is the multi-stage Dockerfile I used to containerize my website:

FROM ubuntu:latest as STAGEONE

# assumption: the Hugo version is supplied at build time, e.g.
#   docker build --build-arg HUGO_VERSION=<version> .
ARG HUGO_VERSION

# install hugo
ADD https://github.com/gohugoio/hugo/releases/download/v${HUGO_VERSION}/hugo_${HUGO_VERSION}_Linux-64bit.tar.gz /tmp/
RUN tar -xf /tmp/hugo_${HUGO_VERSION}_Linux-64bit.tar.gz -C /usr/local/bin/

# install syntax highlighting
RUN apt-get update && apt-get install -y python3-pygments

# build site
COPY source /source
RUN hugo --source=/source/ --destination=/public/

FROM nginx:stable-alpine
COPY --from=STAGEONE /public/ /usr/share/nginx/html/

In stage one we fetch the latest Ubuntu Linux image and name the stage STAGEONE. Then we install a newer version of Hugo from its GitHub releases and fetch Pygments via the distribution’s repositories. Unfortunately, the Hugo version that is in the repositories does not support a feature that my website needs, otherwise we would have installed Hugo via apt-get, too. Once the software is in place, we build the website by running hugo. Here the resulting folder /public is the build artifact. The second stage of the build is described in the last two lines. It starts with a minimal image containing the nginx webserver and adds the build artifact from the previous stage using the --from=STAGEONE flag. Everything else from stage one is thrown away. The resulting container is solely based on the last stage’s commands.

The Dockerfile is stored in the root directory of the private git repository that contains the website’s source code.

Build Pipeline

When a change gets merged into the master branch it is considered production ready. For a simple service like my website a change is usually a fixed typo or a new blog article resulting in a new version of the website. For more complex services I tend to build an image for every change in every branch and automatically push the resulting image through regression tests and a smoke test. But let’s not over-engineer an already over-engineered example. 🤓 So for every commit in the master branch we want a new image to be built.

Google Container Builder has become my favorite tool for building images. I especially like that I do not have to run a fully fledged continuous integration framework including build slaves and related maintenance work. All we need to create a new image is a build trigger and a source repository that is accessible from Google Cloud Platform (GCP). This can either be a repository hosted directly on the platform or one hosted by a third party that supports OAuth.

Fortunately, GitHub and GCP work well together. Here is the config for our build trigger:

[Image: container builder trigger]

We can manually trigger the build or just commit to the master branch. Once the build has succeeded, we can pull the image from within kubernetes (or any other container orchestration platform) via gcr.io/danrl-com/website:master. The tag master indicates the latest build of that branch. Also very popular is the use of a tag named latest to indicate the latest build, which may or may not be stable.

[Image: container builder images]

We can also pull specific versions, e.g. for performing a roll-back, using the corresponding tags instead of master. If we wanted to pin the deployment to the last version that was created on Nov 1 2017, we could use this URL: gcr.io/danrl-com/website:2409248a51dd (see tags in screenshot).

Kubernetes Cluster

Now that we have our service nicely packed in an image, it is time to fire up the underlying infrastructure: A kubernetes (k8s) cluster.

For that we head over to Google Kubernetes Engine (GKE) and either use an existing cluster or create a new one. Here is my cluster configuration for reference:

[Image: GKE cluster config]

My cluster is backed by one node pool of three nodes. It is possible to run smaller node pools. If you want to make use of automatic node updates without downtime, it is advised to use at least three nodes in a node pool. One node pool is usually enough, though.

[Image: GKE node pool config]

Note: Be aware that while kubernetes itself is pretty cheap on GCP (even free for small clusters), the node pools may be surprisingly costly. For every node the standard Google Compute Engine pricing applies. The cluster used in this article, for example, costs me between $40 and $50 per month.

Cloud SDK and Cloud Shell

For the following parts of the article make sure you have installed the Google Cloud SDK and authorized it with your Google account. If you don’t want to go through the hassle of installing the Cloud SDK you can also fall back to the Cloud Shell, an in-browser command line connected to a virtual machine with the Cloud SDK already installed.

[Image: Cloud Shell]

However, you would need to create the YAML files on the ephemeral Cloud Shell machine instead of your computer. Don’t forget to back them up, though, as the Cloud Shell machine will disappear for good after some idle time!


I like to store my kubernetes files in a k8s subfolder in the repository of the related service. This puts them under version control and keeps them close to the service they belong to. For multi-container applications I tend to create an exclusive repository only for kubernetes files, often distinguishing between production and test namespaces.

For our simple and small website service it is sufficient to keep the files in a subfolder of the root directory.

In kubernetes we use namespacing to separate workloads from each other. Our first action is therefore creating a namespace for the website service by defining it in k8s/ns.yaml:

apiVersion: v1
kind: Namespace
metadata:
  name: website

To create the namespace we use kubectl from the command line.

$ kubectl create -f k8s/ns.yaml

We can always list our namespaces via:

$ kubectl get ns
NAME            STATUS    AGE
default         Active    19d
kube-public     Active    19d
kube-system     Active    19d
website         Active    1m

Let’s now deploy our application into the new namespace. For that we need another YAML file. This time we will create a so-called deployment. A deployment consists of one or more containers which make up an application. That application is then started with a replication factor. For our service we will use a replication factor of three. That is, we will run three instances of the website container. This allows us to do rolling updates to the application as well as to the underlying nodes. Given that the containers are spread across the nodes in the node pool, we can safely pull a node out of rotation for system upgrades without harming the availability of our application. We can also replace the containers with new images one by one if we want to update the application. This is slightly simplified; there is much more that affects availability here. I highly recommend reading the kubernetes documentation if you would like to learn more about deployments.
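To see why spreading the replicas across nodes matters, here is a toy model in Python. It is a deliberately simplified sketch and ignores how the kubernetes scheduler actually places and reschedules pods; the pod and node names are hypothetical.

```python
def replicas_left_after_drain(placement, drained_node):
    """placement maps pod name -> node name; count the pods that keep running."""
    return sum(1 for node in placement.values() if node != drained_node)


# hypothetical pod and node names, for illustration only
spread = {"website-1": "node-a", "website-2": "node-b", "website-3": "node-c"}
stacked = {"website-1": "node-a", "website-2": "node-a", "website-3": "node-a"}

print(replicas_left_after_drain(spread, "node-a"))
print(replicas_left_after_drain(stacked, "node-a"))
```

With one replica per node, taking a node out of rotation still leaves two instances serving traffic. If all replicas sit on the drained node, none are left until they have been started elsewhere.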

Moving on, here is the k8s/deployment.yaml file for the website application.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: website
  namespace: website
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: website
    spec:
      containers:
      - name: website
        image: gcr.io/danrl-com/website:master
        imagePullPolicy: Always
        ports:
        - containerPort: 80

After applying the configuration the application will be available internally to kubernetes on port 80.

$ kubectl create -f k8s/deployment.yaml

Let’s check the result:

$ kubectl -n website get deployments
NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
website   3         3         3            3           5m

The workload should now look similar to this on GCP:

[Image: GKE workload config]

Now that we have an application up and running it is time to turn it into a service. For a website, that means making it available to the general public. There are different ways to publish an application in kubernetes. In our example we will use a service for that.

The corresponding YAML file looks like this:

kind: Service
apiVersion: v1
metadata:
  name: website
  namespace: website
spec:
  selector:
    app: website
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: LoadBalancer

You may wonder: How does the service know which application it is supposed to serve? After all, there could be multiple applications running on port 80. This is what the selector is for. Compare the selector app: website from service.yaml with the label app: website from deployment.yaml. This is where the connection is being made.
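The matching rule itself is simple enough to sketch in a few lines of Python. This only illustrates equality-based selectors as used in this article and is not kubernetes code; the pod-template-hash value is made up.

```python
def selector_matches(selector, labels):
    """A selector matches when each of its key/value pairs is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


service_selector = {"app": "website"}

# Labels as they might appear on two pods; the hash value is hypothetical.
website_pod = {"app": "website", "pod-template-hash": "1234567890"}
kube_lego_pod = {"app": "kube-lego"}

print(selector_matches(service_selector, website_pod))
print(selector_matches(service_selector, kube_lego_pod))
```

Extra labels on a pod do not prevent a match. Only the pairs named in the selector have to be present.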

Let’s create the service:

$ kubectl create -f k8s/service.yaml

This time we have to be a bit patient with kubernetes. It takes a while for kubernetes to allocate an external IPv4 address from GCP. After a minute, though, the service should be ready and look similar to this:

$ kubectl -n website get service
NAME      TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)        AGE
website   LoadBalancer   80:32005/TCP   45s

The website is now available under the given IPv4 address over HTTP. Hooray! 🎉


Kubernetes is a state-of-the-art way of running services and orchestrating containerized applications. Running a small, static website via Kubernetes is a nice example project and total overkill at the same time. 🤩🤪 In one of the next articles I will show how we can bring the website service into the IPv6 world using a cloud load balancer and an ingress object. We will also add TLS certificates and automate their renewal.

How I continuously deploy my static website

How I configured a deployment pipeline for my website.


The source of my website is managed in a local git repository. It consists of markdown and image files for content, HTML, CSS and JS files for the theme and layout, and some files for visitors to download such as PDFs. Everything is compiled using a static website generator. I switched from Jekyll to Hugo for that because I liked Hugo’s theme engine more. I don’t exactly remember why, but for some reason I compile the source files into static HTML locally. It would probably make more sense to do that on a server as part of the deployment pipeline. After compilation I usually check the resulting files into the repository as well, placing them in a folder named public. That unnecessarily blows up the repository, so I do not really recommend it.

Once the website is compiled and the result is pushed to a private GitHub repository, magic kicks in. My website is served from a virtual machine running on Vultr infrastructure hosted in New York. The virtual machine runs Ubuntu Server as its operating system and Caddy as its webserver. Caddy is an awesome, extensible HTTP/2 webserver written in Go with built-in TLS certificate management. By default, Caddy will automatically request (and renew) TLS certificates from Let’s Encrypt for all the domains and subdomains it serves. I think bringing a TLS-secured website online has never been easier.

For my website, I use the git plugin for Caddy. The git plugin exposes a webhook that can be triggered by GitHub. I configured GitHub to trigger the webhook for every push to the website repository. Once triggered, Caddy will pull the latest changes, thus automatically updating the files it serves.

This leaves me with very little to do when I want to publish new articles: edit the markdown source file, compile the website, commit and push. Everything after that is taken care of by the pipeline I just described.

I’ll use the following paragraphs to share my configuration. Feel free to build upon it and improve the pipeline.

Configuring GitHub

Caddy expects the webhook payload to be in JSON format. Setting the right content type makes it easier for Caddy to recognize the payload. I highly recommend using a webhook secret! Without the secret, anyone knowing the webhook URL could trigger the webserver to pull the repository. As there would be no changes in that case, no visible damage would happen and no information would leak. But it is a waste of resources and a nice distributed denial of service (DDoS) attack surface.
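To show what the secret buys us: GitHub signs each webhook payload with the shared secret and sends the result in the X-Hub-Signature header, so the receiver can reject requests that were not signed with it. Below is a minimal sketch of such a check in Python. It is not Caddy's implementation, and the function name, secret and payload are made up for illustration.

```python
import hashlib
import hmac


def signature_is_valid(secret, body, header):
    """Check a GitHub-style X-Hub-Signature header of the form "sha1=<hexdigest>"."""
    expected = "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, header)


# hypothetical secret and payload, for illustration only
secret = b"WEBHOOKSECRET"
body = b'{"ref": "refs/heads/master"}'
good_header = "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()

print(signature_is_valid(secret, body, good_header))
print(signature_is_valid(b"wrong secret", body, good_header))
```

A request from someone who only knows the webhook URL fails this check, so no pull is triggered.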

Note: If you are using a private repository, make sure to allow the machine running Caddy to connect to your GitHub account. You can use the per-repository deploy keys for that. Just place the SSH pubkey of the webserver in read-only mode there.

Configuring Caddy

Configuring Caddy is usually pretty straightforward. The important parts for the pipeline to work are the lines starting with hook and hook_type. The first parameter for hook defines the webhook’s absolute URL. The second parameter defines the secret. Both parameters must match the GitHub webhook configuration.

danrl.com, www.danrl.com {
    root /srv/danrl.com/public/
    git {
        repo git@github.com:danrl/danrl.com.git
        path /srv/danrl.com/
        hook /webhook WEBHOOKSECRET
        hook_type generic
    }
}

This is what updating the website looks like in the webserver’s log file:

caddy[2431]: Received pull notification for the tracking branch, updating...
caddy[2431]: From github.com:danrl/danrl.com
caddy[2431]:  * branch            master     -> FETCH_HEAD
caddy[2431]:    d3b5f7b..3fb9302  master     -> origin/master
caddy[2431]: Updating d3b5f7b..3fb9302
caddy[2431]: Fast-forward
caddy[2431]: 2 files changed, 2 insertions(+), 2 deletions(-)
caddy[2431]: ssh://git@github.com:danrl/danrl.com.git pulled.

There is more

If you are considering Caddy as your webserver, I recommend having a look at the other plugins. There is some really useful stuff out there, for example a Hugo administration plugin.