Basic HTTPS on Kubernetes with Traefik

Back in February of 2018 Google’s Security blog announced that Chrome would be start displaying “not secure” for websites starting in July. In doing so they cemented HTTPS as part of the constantly-rising baseline expectations for modern web developers.

These constantly-rising baseline expectations are written into a new generation of tools like Traefik and Caddy. Both are written in Go and both leverage Let’s Encrypt to automate away the requesting and renewal, and by extension the unexpected expiration, of TLS certificates. Kubernetes is another modern tool aimed at meeting some of the other modern baseline expectations around monitoring, scaling and uptime.

Using Traefik and Kubernetes together is a little fiddly, and getting a working deployment on a cloud provider even more so. The aim here is to show how to use Traefik to get Let’s Encrypt based HTTPS working on the Google Kubernetes Engine.
An obvious prerequisite is to have a domain name, and to point it at a static IP you’ve created.

Let’s start with creating our project:

mike@sleepycat:~$ gcloud projects create --name k8s-https
No project id provided.

Use [k8s-https-212614] as project id (Y/n)?  y

Create in progress for [].
Waiting for [operations/cp.6673958274622567208] to finish...done.

Next lets create a static IP.

mike@sleepycat:~$ gcloud beta compute --project=k8s-https-212614 addresses create k8s-https --region=northamerica-northeast1 --network-tier=PREMIUM
Created [].
mike@sleepycat:~$ gcloud beta compute --project=k8s-https-212614 addresses list
NAME       REGION                   ADDRESS        STATUS
k8s-https  northamerica-northeast1  RESERVED

Because I am easily amused, I own the domain In my settings for that domain I created an A record pointing at that IP address. When you have things set up correctly, you can verify the DNS part is working with dig

mike@sleepycat:~$ dig

; <<>> DiG 9.13.0 <<>>
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62565
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

; EDNS: version: 0, flags:; udp: 4096
;		IN	A


;; Query time: 188 msec
;; WHEN: Tue Aug 07 23:02:11 EDT 2018
;; MSG SIZE  rcvd: 62

With that squared away, we need to create our Kubernetes cluster. Before we can do that we need to get a little administrative stuff out of the way. First we need to get our billing details and link them to our project.

mike@sleepycat:~$ gcloud beta billing accounts list
ACCOUNT_ID            NAME                OPEN  MASTER_ACCOUNT_ID
0X0X0X-0X0X0X-0X0X0X  My Billing Account  True
mike@sleepycat:~$ gcloud beta billing projects link k8s-https-212614 --billing-account 0X0X0X-0X0X0X-0X0X0X
billingAccountName: billingAccounts/0X0X0X-0X0X0X-0X0X0X
billingEnabled: true
name: projects/k8s-https-212614/billingInfo
projectId: k8s-https-212614

Then we’ll need to enable the Kubernetes engine for this project.

mike@sleepycat:~$ gcloud services enable --project k8s-https-212614
Waiting for async operation operations/tmo-acf.74966272-39c8-4b7b-b973-8f7fa4dac4fd to complete...
Operation finished successfully. The following command can describe the Operation details:
 gcloud services operations describe operations/tmo-acf.74966272-39c8-4b7b-b973-8f7fa4dac4fd

Let’s create our cluster. Because both Kubernetes and Google move pretty quickly, it’s good to check the current Kubernetes version for your region with something like gcloud container get-server-config --region "northamerica-northeast1". In my case that shows “1.10.5-gke.3” as the newest so I’ll use that for my cluster. If you are interested in beefier machines explore your options with gcloud compute machine-types list --filter="northamerica-northeast1" but for this I’ll slum it with a f1-micro.

mike@sleepycat:~$ gcloud container --project=k8s-https-212614 clusters create "k8s-https" --zone "northamerica-northeast1-a" --username "admin" --cluster-version "1.10.5-gke.3" --machine-type "f1-micro" --image-type "COS" --disk-type "pd-standard" --disk-size "100" --scopes "","","","","","","" --num-nodes "3" --enable-cloud-logging --enable-cloud-monitoring --addons HorizontalPodAutoscaling,HttpLoadBalancing,KubernetesDashboard --enable-autoupgrade --enable-autorepair

Creating cluster k8s-https...done.
Created [].
To inspect the contents of your cluster, go to:
kubeconfig entry generated for k8s-https.
k8s-https  northamerica-northeast1-a  1.10.5-gke.3  f1-micro      1.10.5-gke.3  3          RUNNING

You will notice that kubectl (which you obviously have installed already) is now configured to access this cluster.
As part of the Traefik setup we are about to do we will need to change some RBAC rules. To do that we will need to create a cluster admin role and load that into our cluster.

mike@sleepycat:~$ cat cluster-admin-rolebinding.yaml 
kind: ClusterRoleBinding
  name: owner-cluster-admin-binding
  kind: ClusterRole
  name: cluster-admin
- apiGroup:
  kind: User
  name: <your_username@your_email_you_use_with_google_cloud.whatever>

mike@sleepycat:~$ kubectl apply -f cluster-admin-rolebinding.yaml

With that done we can apply the rest of the config I’ve posted in a snippet here with kubectl apply -f https.yaml.
It’s a fair bit of yaml, but a few things are worth pointing out.

First, we are running a single pod with my helloworld image. It’s just the output of create-react-app that I use for testing stuff.
If you look at the traefik-ingress-service, you will notice we are telling Google we want the service mapped to the static IP we created earlier using loadBalancerIP.

apiVersion: v1
kind: Service
  name: traefik-ingress-service
  namespace: kube-system
  - name: http
    port: 80
    protocol: TCP
  - name: https
    port: 443
    protocol: TCP
  - name: admin
    port: 8080
    protocol: TCP
    k8s-app: traefik-ingress-lb
  type: LoadBalancer

When looking at the traefik-ingress-controller itself, it’s worth noting the choice of kind: Deployment instead of kind: DaemonSet. This choice was made for simplicity’s sake (only a single pod will read/write to my certs-claim volume so no Multi-Attach errors), and means that I will have a single pod acting as my ingress controller. Read more about the tradeoffs here.

Here is the traefik-ingress-controller in it’s entirety. It’s a long chunk of code, but I find this helps see everything in context.

Special note about the args being passed to the container; make sure they are strings. You can end up with some pretty baffling errors if you don’t. Other than that, it’s the full set of options to get you TLS certs and automatic redirects to HTTPS.

apiVersion: extensions/v1beta1
kind: Deployment
    k8s-app: traefik-ingress-lb
  name: traefik-ingress-controller
  namespace: kube-system
        k8s-app: traefik-ingress-lb
      - args:
        - "--api"
        - "--kubernetes"
        - "--logLevel=DEBUG"
        - "--debug"
        - "--defaultentrypoints=http,https"
        - "--entrypoints=Name:http Address::80 Redirect.EntryPoint:https"
        - "--entrypoints=Name:https Address::443 TLS"
        - "--acme"
        - "--acme.onhostrule"
        - "--acme.entrypoint=https"
        - ""
        - ""
        - ""
        - "--acme.httpchallenge"
        - "--acme.httpchallenge.entrypoint=http"
        image: traefik:1.7
        name: traefik-ingress-lb
        - containerPort: 80
          hostPort: 80
          name: http
        - containerPort: 443
          hostPort: 443
          name: https
        - containerPort: 8080
          hostPort: 8080
          name: admin
            - NET_BIND_SERVICE
            - ALL
        - mountPath: /certs
          name: certs-claim
      serviceAccountName: traefik-ingress-controller
      terminationGracePeriodSeconds: 60
      - name: certs-claim
          claimName: certs-claim

The contents of the snippet should be all you need to get up and running. You should be able to visit your domain and see the reassuring green of the TLS lock in the URL bar.

A TLS certificate from Let's Encrypt

If things aren’t working you can get a sense of what’s up with the following commands:

kubectl get all --all-namespaces
kubectl logs --namespace=kube-system traefik-ingress-controller-...

Where to go from here

As you can see, there is a fair bit going on here. We have DNS, Kubernetes, Traefik and the underlying Google Cloud Platform all interacting and it’s not easy to get a minimal “hello world” style demo going when that is the case. Hopefully this shows enough to give people a jumping off point so they can start refining this into a more robust configuration. The next steps for me will be exploring DaemonSets and storing the acme.json in a way that multiple copies of Traefik can access, maybe a key/value like consul. We’ll see what the next layer of learning brings.