Back in February of 2018, Google’s Security blog announced that Chrome would start marking all HTTP websites as “not secure” in July. In doing so they cemented HTTPS as part of the constantly rising baseline expectations for modern web developers.
These constantly-rising baseline expectations are written into a new generation of tools like Traefik and Caddy. Both are written in Go and both leverage Let’s Encrypt to automate away the requesting and renewal, and by extension the unexpected expiration, of TLS certificates. Kubernetes is another modern tool aimed at meeting some of the other modern baseline expectations around monitoring, scaling and uptime.
Using Traefik and Kubernetes together is a little fiddly, and getting a working deployment on a cloud provider even more so. The aim here is to show how to use Traefik to get Let’s Encrypt based HTTPS working on the Google Kubernetes Engine.
An obvious prerequisite is to have a domain name, and to point it at a static IP you’ve created.
Let’s start with creating our project:
mike@sleepycat:~$ gcloud projects create --name k8s-https
No project id provided.
Use [k8s-https-212614] as project id (Y/n)? y
Create in progress for [https://cloudresourcemanager.googleapis.com/v1/projects/k8s-https-212614].
Waiting for [operations/cp.6673958274622567208] to finish...done.
Next, let’s create a static IP.
mike@sleepycat:~$ gcloud beta compute --project=k8s-https-212614 addresses create k8s-https --region=northamerica-northeast1 --network-tier=PREMIUM
Created [https://www.googleapis.com/compute/beta/projects/k8s-https-212614/regions/northamerica-northeast1/addresses/k8s-https].
mike@sleepycat:~$ gcloud beta compute --project=k8s-https-212614 addresses list
NAME       REGION                   ADDRESS        STATUS
k8s-https  northamerica-northeast1  35.203.65.136  RESERVED
Because I am easily amused, I own the domain actually.works. In my settings for that domain I created an A record for it.actually.works pointing at that IP address. When you have things set up correctly, you can verify the DNS part is working with dig:
mike@sleepycat:~$ dig it.actually.works
; <<>> DiG 9.13.0 <<>> it.actually.works
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62565
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;it.actually.works.        IN    A
;; ANSWER SECTION:
it.actually.works.    3600    IN    A    35.203.65.136
;; Query time: 188 msec
;; SERVER: 192.168.0.1#53(192.168.0.1)
;; WHEN: Tue Aug 07 23:02:11 EDT 2018
;; MSG SIZE rcvd: 62
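If you would rather script that check than read the full dig output, something like the following might do. Treat it as a sketch: dns_matches is a helper name I made up for illustration, and it assumes the answers arrive one address per line, which is what dig +short prints.

```shell
#!/bin/sh
# Hypothetical helper (not part of dig): succeeds when the expected IP
# appears as a whole line in a list of DNS answers.
dns_matches() {
  expected_ip="$1"   # e.g. the static IP reserved earlier
  answers="$2"       # e.g. the output of: dig +short it.actually.works A
  # -F: literal match (dots are not wildcards), -x: whole line, -q: quiet
  printf '%s\n' "$answers" | grep -Fxq "$expected_ip"
}

# Example (needs network access, so it is left commented out):
# dns_matches "35.203.65.136" "$(dig +short it.actually.works A)" \
#   && echo "DNS is pointing at the static IP"
```

Matching whole lines means a stale record for a similar-looking address will not pass by accident.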
With that squared away, we need to create our Kubernetes cluster. Before we can do that we need to get a little administrative stuff out of the way. First we need to get our billing details and link them to our project.
mike@sleepycat:~$ gcloud beta billing accounts list
ACCOUNT_ID            NAME                OPEN  MASTER_ACCOUNT_ID
0X0X0X-0X0X0X-0X0X0X  My Billing Account  True
mike@sleepycat:~$ gcloud beta billing projects link k8s-https-212614 --billing-account 0X0X0X-0X0X0X-0X0X0X
billingAccountName: billingAccounts/0X0X0X-0X0X0X-0X0X0X
billingEnabled: true
name: projects/k8s-https-212614/billingInfo
projectId: k8s-https-212614
Then we’ll need to enable the Kubernetes Engine API for this project.
mike@sleepycat:~$ gcloud services enable container.googleapis.com --project k8s-https-212614
Waiting for async operation operations/tmo-acf.74966272-39c8-4b7b-b973-8f7fa4dac4fd to complete...
Operation finished successfully. The following command can describe the Operation details:
 gcloud services operations describe operations/tmo-acf.74966272-39c8-4b7b-b973-8f7fa4dac4fd
Let’s create our cluster. Because both Kubernetes and Google move pretty quickly, it’s good to check the current Kubernetes version for your region with something like gcloud container get-server-config --region "northamerica-northeast1". In my case that shows “1.10.5-gke.3” as the newest, so I’ll use that for my cluster. If you are interested in beefier machines, explore your options with gcloud compute machine-types list --filter="northamerica-northeast1", but for this I’ll slum it with an f1-micro.
mike@sleepycat:~$ gcloud container --project=k8s-https-212614 clusters create "k8s-https" \
  --zone "northamerica-northeast1-a" \
  --username "admin" \
  --cluster-version "1.10.5-gke.3" \
  --machine-type "f1-micro" \
  --image-type "COS" \
  --disk-type "pd-standard" \
  --disk-size "100" \
  --scopes "https://www.googleapis.com/auth/compute","https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" \
  --num-nodes "3" \
  --enable-cloud-logging \
  --enable-cloud-monitoring \
  --addons HorizontalPodAutoscaling,HttpLoadBalancing,KubernetesDashboard \
  --enable-autoupgrade \
  --enable-autorepair
Creating cluster k8s-https...done.
Created [https://container.googleapis.com/v1/projects/k8s-https-212614/zones/northamerica-northeast1-a/clusters/k8s-https].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/northamerica-northeast1-a/k8s-https?project=k8s-https-212614
kubeconfig entry generated for k8s-https.
NAME       LOCATION                   MASTER_VERSION  MASTER_IP    MACHINE_TYPE  NODE_VERSION  NUM_NODES  STATUS
k8s-https  northamerica-northeast1-a  1.10.5-gke.3    35.203.64.6  f1-micro      1.10.5-gke.3  3          RUNNING
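Since the version list changes often, picking the newest entry by eye gets old quickly. Here is a small sketch of letting sort do it. It assumes a sort that supports -V (version sort, as in GNU coreutils); newest_version is a made-up helper name, and the two older version strings in the example are invented purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read version strings on stdin, one per line,
# and print the highest one according to version ordering.
newest_version() {
  # Plain lexical sorting would rank 1.9.x above 1.10.x; -V compares
  # the numeric parts as numbers instead.
  sort -V | tail -n 1
}

# Example (only 1.10.5-gke.3 comes from the output above, the rest are made up):
# printf '%s\n' 1.9.7-gke.5 1.10.5-gke.3 1.10.4-gke.2 | newest_version
```

How you get the versions out of gcloud container get-server-config one per line is left open here, since that depends on the output format you ask gcloud for.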
You will notice that kubectl (which you obviously have installed already) is now configured to access this cluster.
As part of the Traefik setup we are about to do we will need to change some RBAC rules. To do that we will need to bind our own Google account to the cluster-admin role and load that binding into our cluster.
mike@sleepycat:~$ cat cluster-admin-rolebinding.yaml
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: owner-cluster-admin-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: <your_username@your_email_you_use_with_google_cloud.whatever>
---
mike@sleepycat:~$ kubectl apply -f cluster-admin-rolebinding.yaml
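If you don’t want to edit the placeholder by hand, one option is to generate the same binding with the account filled in. This is a sketch rather than what I actually ran: render_admin_binding is a hypothetical helper name, and the commented usage line assumes the account gcloud is currently logged in with is the one that should become cluster admin.

```shell
#!/bin/sh
# Hypothetical helper: print the ClusterRoleBinding shown above with the
# given account substituted for the placeholder user name.
render_admin_binding() {
  account="$1"
  cat <<EOF
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: owner-cluster-admin-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: ${account}
EOF
}

# Example (needs gcloud and kubectl, so it is left commented out):
# render_admin_binding "$(gcloud config get-value account)" | kubectl apply -f -
```

Piping into kubectl apply -f - avoids leaving a file with your email address lying around in the repo.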
With that done we can apply the rest of the config I’ve posted in a snippet here with kubectl apply -f https.yaml.
It’s a fair bit of YAML, but a few things are worth pointing out.
First, we are running a single pod with my helloworld image. It’s just the output of create-react-app that I use for testing stuff.
If you look at the traefik-ingress-service, you will notice we are telling Google we want the service mapped to the static IP we created earlier using loadBalancerIP.
---
apiVersion: v1
kind: Service
metadata:
  name: traefik-ingress-service
  namespace: kube-system
spec:
  loadBalancerIP: 35.203.65.136
  ports:
    - name: http
      port: 80
      protocol: TCP
    - name: https
      port: 443
      protocol: TCP
    - name: admin
      port: 8080
      protocol: TCP
  selector:
    k8s-app: traefik-ingress-lb
  type: LoadBalancer
---
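Once the service exists, it can be worth confirming that the load balancer really picked up the reserved address rather than an ephemeral one. A rough sketch of that comparison follows; same_ip is a made-up helper, and the commented commands assume the address and service names used in this post.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when both values are non-empty and equal.
# An empty value usually means the load balancer has no external IP yet.
same_ip() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example (needs gcloud and kubectl, so it is left commented out):
# reserved="$(gcloud compute addresses describe k8s-https \
#   --region northamerica-northeast1 --format='value(address)')"
# assigned="$(kubectl get service traefik-ingress-service --namespace kube-system \
#   -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
# same_ip "$reserved" "$assigned" && echo "service is using the static IP"
```

It can take a minute or two after applying the config before the service reports an external IP, so an early failure here is not necessarily a misconfiguration.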
When looking at the traefik-ingress-controller itself, it’s worth noting the choice of kind: Deployment instead of kind: DaemonSet. This choice was made for simplicity’s sake (only a single pod will read/write to my certs-claim volume so no Multi-Attach errors), and means that I will have a single pod acting as my ingress controller. Read more about the tradeoffs here.
Here is the traefik-ingress-controller in its entirety. It’s a long chunk of code, but I find this helps see everything in context.
A special note about the args being passed to the container: make sure they are strings. You can end up with some pretty baffling errors if you don’t. Other than that, it’s the full set of options to get you TLS certs and automatic redirects to HTTPS.
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: traefik-ingress-lb
  name: traefik-ingress-controller
  namespace: kube-system
spec:
  template:
    metadata:
      labels:
        k8s-app: traefik-ingress-lb
    spec:
      containers:
        - args:
            - "--api"
            - "--kubernetes"
            - "--logLevel=DEBUG"
            - "--debug"
            - "--defaultentrypoints=http,https"
            - "--entrypoints=Name:http Address::80 Redirect.EntryPoint:https"
            - "--entrypoints=Name:https Address::443 TLS"
            - "--acme"
            - "--acme.onhostrule"
            - "--acme.entrypoint=https"
            - "--acme.domains=it.actually.works"
            - "--acme.email=mike@korora.ca"
            - "--acme.storage=/certs/acme.json"
            - "--acme.httpchallenge"
            - "--acme.httpchallenge.entrypoint=http"
          image: traefik:1.7
          name: traefik-ingress-lb
          ports:
            - containerPort: 80
              hostPort: 80
              name: http
            - containerPort: 443
              hostPort: 443
              name: https
            - containerPort: 8080
              hostPort: 8080
              name: admin
          securityContext:
            capabilities:
              add:
                - NET_BIND_SERVICE
              drop:
                - ALL
          volumeMounts:
            - mountPath: /certs
              name: certs-claim
      serviceAccountName: traefik-ingress-controller
      terminationGracePeriodSeconds: 60
      volumes:
        - name: certs-claim
          persistentVolumeClaim:
            claimName: certs-claim
The contents of the snippet should be all you need to get up and running. You should be able to visit your domain and see the reassuring green of the TLS lock in the URL bar.
If things aren’t working you can get a sense of what’s up with the following commands:
kubectl get all --all-namespaces
kubectl logs --namespace=kube-system traefik-ingress-controller-...
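It can also help to test from the outside, checking that plain HTTP really gets bounced to HTTPS. The following is only a sketch: redirects_to_https is a helper name I invented, and it simply looks for a Location header pointing at an https:// URL without caring which redirect status code was used.

```shell
#!/bin/sh
# Hypothetical helper: read raw HTTP response headers on stdin and succeed
# when they contain a Location header that points at an https:// URL.
redirects_to_https() {
  # Strip carriage returns from the header lines, then match the header
  # name case-insensitively at the start of a line.
  tr -d '\r' | grep -iq '^location: https://'
}

# Example (needs network access, so it is left commented out):
# curl -sI http://it.actually.works/ | redirects_to_https \
#   && echo "HTTP is redirected to HTTPS"
#
# To see who issued the certificate being served and when it expires:
# openssl s_client -connect it.actually.works:443 -servername it.actually.works \
#   </dev/null 2>/dev/null | openssl x509 -noout -issuer -dates
```

If the redirect works but the certificate isn’t the one you expect, the Traefik logs from the command above are the next place to look.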
Where to go from here
As you can see, there is a fair bit going on here. We have DNS, Kubernetes, Traefik and the underlying Google Cloud Platform all interacting, and it’s not easy to get a minimal “hello world” style demo going when that is the case. Hopefully this shows enough to give people a jumping off point so they can start refining this into a more robust configuration. The next steps for me will be exploring DaemonSets and storing the acme.json in a way that multiple copies of Traefik can access, maybe a key/value store like Consul. We’ll see what the next layer of learning brings.