Kubernetes config with Kustomize

If you are working with Kubernetes, it’s pretty important to be able to generate variations of your configuration. Your production cluster probably has TLS settings that won’t make sense in while testing locally in Minikube, or you’ll have service types that should be NodePort here and LoadBalancer there.

While tools like Helm tackle this problem with PHP style templates, kustomize offers a different approach based on constructing config via composition.

With each piece of Kubernetes config uniquely identified, composite key style, through a combination of kind, apiVersion and metadata.name, kustomize can generate new config by patching one yaml with others.

This lets us generate config without wondering what command line arguments a given piece of config might have been created from, or having turing complete programming languages embedded in it.

This sounds a little stranger than it is, so let’s make this more concrete with an example.

Lets say you clone a project and see kustomization.yaml files lurking in some of the subdirectories.

mike@sleepycat:~/projects/hello_world$ tree
.
├── base
│   ├── helloworld-deployment.yaml
│   ├── helloworld-service.yaml
│   └── kustomization.yaml
└── overlays
    ├── gke
    │   ├── helloworld-service.yaml
    │   └── kustomization.yaml
    └── minikube
        ├── helloworld-service.yaml
        └── kustomization.yaml

4 directories, 7 files

This project is using kustomize. The folder structure suggests that there is some base configuration, and variations for GKE and Minikube. We can generate the Minikube version with kubectl kustomize overlays/minikube.

This actually shows one of the nice things about kustomize, you probably already have it, since it was built into the kubectl command in version 1.14 after a brief kerfuffle.

mike@sleepycat:~/projects/hello_world$ kubectl kustomize overlays/minikube/
apiVersion: v1
kind: Service
metadata:
  labels:
    app: helloworld
  name: helloworld
spec:
  ports:
  - name: "3000"
    port: 3000
    protocol: TCP
    targetPort: 3000
  selector:
    app: helloworld
  type: NodePort
status:
  loadBalancer: {}
...more yaml...

If you want to get this config into your GKE cluster, it would be as simple as kubectl apply -k overlays/gke.

This tiny example obscures one of the other benefits of kustomize: it sorts the configuration it outputs in the following order to avoid dependency problems:

  • Namespace
  • StorageClass
  • CustomResourceDefinition
  • MutatingWebhookConfiguration
  • ServiceAccount
  • PodSecurityPolicy
  • Role
  • ClusterRole
  • RoleBinding
  • ClusterRoleBinding
  • ConfigMap
  • Secret
  • Service
  • LimitRange
  • Deployment
  • StatefulSet
  • CronJob
  • PodDisruptionBudget

Because it sorts it’s output this way, kustomize makes it far less error prone to get your application up and running.

Setting up your project to use kustomize

To get your project set up with kustomize, you will want a little more than the functionality built into kubectl. There are a few ways to install kustomize, put I think the easiest (assuming you have Go on your system) is go get:

go get sigs.k8s.io/kustomize

With that installed, we can create some folders and use the fd command to give us the lay of the land.

$ mkdir -p {base,overlays/{gke,minikube}}
$ fd
base
overlays
overlays/gke
overlays/minikube

In the base folder we’ll need to create a kustomization file some config. Then we tell kustomize to add the config as resources to be patched.

base$ touch kustomization.yaml
base$ kubectl create deployment helloworld --image=mikewilliamson/helloworld --dry-run -o yaml > helloworld-deployment.yaml
base$ kubectl create service clusterip helloworld --tcp=3000 --dry-run -o yaml > helloworld-service.yaml
base$ kustomize edit add resource helloworld-*

The kustomize edit series of commands (add, fix, remove, set) all exist to modify the kustomization.yaml file.

You can see that kustomize edit add resource helloworld-* added a resources: key with an array of explicit references rather than an implicit file glob.

$ cat base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- helloworld-deployment.yaml
- helloworld-service.yaml

Moving over to the overlays/minikube folder we can do something similar.

minikube$ touch kustomization.yaml
minikube$ kubectl create service nodeport helloworld --tcp=3000 --dry-run -o yaml > helloworld-service.yaml
minikube$ kustomize edit add patch helloworld-service.yaml
minikube$ kustomize edit add base ../../base

Worth noting is the base folder where kustomize will look for the bases to apply the patches to. The resulting kustomization.yaml file looks like the following:

$ cat overlays/minikube/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
patchesStrategicMerge:
- helloworld-service.yaml
bases:
- ../../base

One final jump into the overlays/gke folder gives us everything we will need to see the difference between two configs.

gke$ kubectl create service loadbalancer helloworld --tcp=3000 --dry-run -o yaml > helloworld-service.yaml
gke$ touch kustomization.yaml
gke$ kustomize edit add base ../../base
gke$ kustomize edit add patch helloworld-service.yaml

Finally we can generate the two different configs and diff them to see the changes.

$ diff -u --color <(kustomize build overlays/gke) <(kustomize build overlays/minikube/)
--- /dev/fd/63	2019-05-31 21:38:56.040572159 -0400
+++ /dev/fd/62	2019-05-31 21:38:56.041572186 -0400
@@ -12,7 +12,7 @@
     targetPort: 3000
   selector:
     app: helloworld
-  type: LoadBalancer
+  type: NodePort
 status:
   loadBalancer: {}
 ---

It won't surprise you that what you see here is just scratching the surface. There are many more fields that are possible in a kustomization.yaml file, and nuance in what should go in which file given that kustomize only allows addition not removal.

The approach kustomize is pursuing feels really novel in a field that has been dominated by DSLs (which hide the underlying construct) and templating (with the dangers of embedded languages and concatenating strings).

Working this way really helps deliver on the promise of portability made by Kubernetes; Thanks to kustomize, you’re only a few files and a `kustomize build` away from replatforming if you need to.

Advertisements

Exploring GraphQL(.js)

When Facebook released GraphQL in 2015 they released two separate things; a specification and a working implementation of the specification in JavaScript called GraphQL.js.

GraphQL.js acts as a “reference implementation” for people implementing GraphQL in other languages but it’s also a polished production-worthy JavaScript library at the heart of the JavaScript GraphQL ecosystem.

GraphQL.js gives us, among other things, the graphql function, which is what does the work of turning a query into a response.

graphql(schema, `{ hello }`)
{
  "data": {
    "hello": "world"
  }
}

The graphql function above is taking two arguments, one is the { hello } query, the other, the schema could use a little explanation.

The Schema

In GraphQL you define types for all the things you want to make available.
GraphQL.org has a simple example schema written in Schema Definition Language.

type Book {
  title: String
  author: Author
  price: Float
}

type Author {
  name: String
  books: [Book]
}

type Query {
  books: [Book]
  authors: [Author]
}

There are a few types we defined (Book, Author, Query) and some that GraphQL already knew about (String, Float). All of those types are collectively referred to as our schema.

You can define your schema with Schema Definition Language (SDL) as above or, as we will do, use plain JavaScript. It’s up to you. For our little adventure today I’ll use JavaScript and define a single field called “hello” on the mandatory root Query type.

var { GraphQLObjectType, GraphQLString, GraphQLSchema } = require('graphql')
var query = new GraphQLObjectType({
  name: 'Query',
  fields: {
    hello: {type: GraphQLString, resolve: () => 'world'}
  }
})
var schema = new GraphQLSchema({ query })

The queries we receive are written in the GraphQL language, which will be checked against the types and fields we defined in our schema. In the schema above we’ve defined a single field on the Query type, and mapped a function that returns the string ‘world’ to that field.

GraphQL is a language like JavaScript or Python but the inner workings of other languages aren’t usually as visible or approachable as GraphQL.js make them. Looking at how GraphQL works can tell us a lot about how to use it well.

The life of a GraphQL query

Going from a query like { hello } to a JSON response happens in four phases:

  • Lexing
  • Parsing
  • Validation
  • Execution

Let’s take that little { hello } query and see what running it through that function looks like.

Lexing: turning strings into tokens

The query { hello } is a string of characters that presumably make up a valid query in the GraphQL language. The first step in the process is splitting that string into tokens. This work is done with a lexer.

var {createLexer, Source} = require('graphql/language')
var lexer = createLexer(new Source(`{ hello }`))

The lexer can tell us the current token, and we can advance the lexer to the next token by calling lexer.advance()

lexer.token
Tok {
  kind: '',
  start: 0,
  end: 0,
  line: 0,
  column: 0,
  value: undefined,
  prev: null,
  next: null }

lexer.advance()
Tok {
  kind: '{',
  start: 0,
  end: 1,
  line: 1,
  column: 1,
  value: undefined,
  next: null }

lexer.advance()
Tok {
  kind: 'Name',
  start: 1,
  end: 6,
  line: 1,
  column: 2,
  value: 'hello',
  next: null }

lexer.advance()
Tok {
  kind: '}',
  start: 6,
  end: 7,
  line: 1,
  column: 7,
  value: undefined,
  next: null }

lexer.advance()
Tok {
  kind: '',
  start: 7,
  end: 7,
  line: 1,
  column: 8,
  value: undefined,
  next: null }

It’s important to note that we are advancing by token not by character. Characters like commas, spaces, and new lines are all allowed in GraphQL since they make code nice to read, but the lexer will skip right past them in search of the next meaningful token.
These two queries will produce the same tokens you see above.

createLexer(new Source(`{ hello }`))
createLexer(new Source(`    ,,,\r\n{,\n,,hello,\n,},,\t,\r`))

The lexer also represents the first pass of input validation that GraphQL provides. Invalid characters are rejected by the lexer.

createLexer(new Source("*&^%$")).advance()
Syntax Error: Cannot parse the unexpected character "*"

Parsing: turning tokens into nodes, and nodes into trees

Parsing is about using tokens to build higher level objects called nodes.
node
If you look you can see the tokens in there but nodes have more going on.

If you use a tool like grep or ripgrep to search through the source of GraphQL.js you will see where these nodes are coming from. There specialised parsing functions for each type of node, the majority of which are used internally by the parse function. These functions follow the pattern of accepting a lexer, and returning a node.

$ rg "function parse" src/language/parser.js
124:export function parse(
146:export function parseValue(
168:export function parseType(
183:function parseName(lexer: Lexer): NameNode {
197:function parseDocument(lexer: Lexer): DocumentNode {
212:function parseDefinition(lexer: Lexer): DefinitionNode {
246:function parseExecutableDefinition(lexer: Lexer): ExecutableDefinitionNode {
271:function parseOperationDefinition(lexer: Lexer): OperationDefinitionNode {
303:function parseOperationType(lexer: Lexer): OperationTypeNode

Using the parse function is a simple as passing it a GraphQL string. If we print the output of parse with some spacing we can see the what’s actually happening: it’s constructing a tree. Specifically, it’s an Abstract Syntax Tree (AST).

> var { parse } = require('graphql/language')
> console.log(JSON.stringify(parse("{hello}"), null, 2))
{
  "kind": "Document",
  "definitions": [
    {
      "kind": "OperationDefinition",
      "operation": "query",
      "variableDefinitions": [],
      "directives": [],
      "selectionSet": {
        "kind": "SelectionSet",
        "selections": [
          {
            "kind": "Field",
            "name": {
              "kind": "Name",
              "value": "hello",
              "loc": {
                "start": 1,
                "end": 6
              }
            },
            "arguments": [],
            "directives": [],
            "loc": {
              "start": 1,
              "end": 6
            }
          }
        ],
        "loc": {
          "start": 0,
          "end": 7
        }
      },
      "loc": {
        "start": 0,
        "end": 7
      }
    }
  ],
  "loc": {
    "start": 0,
    "end": 7
  }
}

If you play with this, or a more deeply nested query you can see a patten emerge. You’ll see SelectionSets containing selections containing SelectionSets. With a structure like this, a function that calls itself would be able to walk it’s way down the this entire object. We’re all set up for some recursive evaluation.

Validation: Walking the tree with visitors

The reason for an AST is to enable us to do some processing, which is exactly what happens in the validation step. Here we are looking to make some decisions about the tree and how well it lines up with our schema.

For any of that to happen, we need a way to walk the tree and examine the nodes. For that there is a pattern called the Vistor pattern, which GraphQL.js provides an implementation of.

To use it we require the visit function and make a visitor.

var { visit } = require('graphql')

var depth = 0
var vistor = {
  enter: node => {
    depth++
    console.log(' '.repeat(depth).concat(node.kind))
    return node
  },
  leave: node => {
    depth--
    return node
  },
}

Our vistor above has enter and leave functions attached to it. These names are significant since the visit function looks for them when it comes across a new node in the tree or moves on to the next node.
The visit function accepts an AST and a visitor and you can see our visitor at work printing out the kind of the nodes being encountered.

> visit(parse(`{ hello }`, visitor)
 Document
  OperationDefinition
   SelectionSet
    Field
     Name

With the visit function providing a generic ability to traverse the tree, the next step is to use this ability to determine if this query is acceptable to us.
This happens with the validate function. By default, it seems to know that kittens are not a part of our schema.

var { validate } = require('graphql')
validate(schema, parse(`{ kittens }`))
// GraphQLError: Cannot query field "kittens" on type "Query"

The reason it knows that is that there is a third argument to the validate function. Left undefined, it defaults to an array of rules exported from ‘graphql/validation’. These “specifiedRules” are responsible for all the validations that ensure our query is safe to run.

> var { validate } = require('graphql')
> var { specifiedRules } = require('graphql/validation')
> specifiedRules
[ [Function: ExecutableDefinitions],
  [Function: UniqueOperationNames],
  [Function: LoneAnonymousOperation],
  [Function: SingleFieldSubscriptions],
  [Function: KnownTypeNames],
  [Function: FragmentsOnCompositeTypes],
  [Function: VariablesAreInputTypes],
  [Function: ScalarLeafs],
  [Function: FieldsOnCorrectType],
  [Function: UniqueFragmentNames],
  [Function: KnownFragmentNames],
  [Function: NoUnusedFragments],
  [Function: PossibleFragmentSpreads],
  [Function: NoFragmentCycles],
  [Function: UniqueVariableNames],
  [Function: NoUndefinedVariables],
  [Function: NoUnusedVariables],
  [Function: KnownDirectives],
  [Function: UniqueDirectivesPerLocation],
  [Function: KnownArgumentNames],
  [Function: UniqueArgumentNames],
  [Function: ValuesOfCorrectType],
  [Function: ProvidedRequiredArguments],
  [Function: VariablesInAllowedPosition],
  [Function: OverlappingFieldsCanBeMerged],
  [Function: UniqueInputFieldNames] ]

validate(schema, parse(`{ kittens }`), specifiedRules)
// GraphQLError: Cannot query field "kittens" on type "Query"

In there you can see checks to ensure that the query only includes known types (KnownTypeNames) and things like variables having unique names (UniqueVariableNames).
This is the next level of input validation that GraphQL provides.

Rules are just visitors

If you dig into those rules (all in src/validation/rules/) you will realize that these are all just visitors.
In our first experiment with visitors, we just printed out the node kind. If we look at this again, we can see that even our tiny little query ends up with 5 levels of depth.

visit(parse(`{ hello }`, visitor)
 Document  // 1
  OperationDefinition // 2
   SelectionSet // 3
    Field // 4
     Name // 5

Let’s say for the sake of experimentation that 4 is all we will accept. To do that we’ll write ourselves a visitor, and then pass it into the third argument to validate.

var fourDeep = context => {
  var depth = 0, maxDepth = 4 // 😈
  return {
    enter: node => {
      depth++
      if (depth > maxDepth)
        context.reportError(new GraphQLError('💥', [node]))
      }
      return node
    },
    leave: node => { depth--; return node },
  }
}
validate(schema, parse(`{ hello }`), [fourDeep])
// GraphQLError: 💥

If you are building a GraphQL API server, you can take a rule like this and pass it as one of the options to express-graphql, so your rule will be applied to all queries the server handles.

Execution: run resolvers. catch errors.

This us to the execution step. There isn’t much exported from ‘graphql/execution’. What’s worthy of note is here is the root object, and the defaultFieldResolver. This work in concert to ensure that wherever there isn’t a resolver function, by default you get the value for that fieldname on the root object.

var { execute, defaultFieldResolver } = require('graphql/execution')
var args = {
  schema,
  document: parse(`{ hello }`),
  // value 0 in the "value of the previous resolver" chain
  rootValue: {},
  variableValues: {},
  operationName: '',
  fieldResolver: defaultFieldResolver,
}
execute(args)
{
  "data": {
    "hello": "world"
  }
}

Why all that matters

For me the take-away in all this is a deeper appreciation of what GraphQL being a language implies.

First, giving your users a language is empowering them to ask for what they need. This is actually written directly into the spec:

GraphQL is unapologetically driven by the requirements of views and the front‐end engineers that write them. GraphQL starts with their way of thinking and requirements and builds the language and runtime necessary to enable that.

Empowering your users is always a good idea but server resources are finite, so you’ll need to think about putting limits somewhere. The fact that language evaluation is recursive means the amount of recursion and work your server is doing is determined by the person who writes the query. Knowing the mechanism to set limits on that (validation rules!) is an important security consideration.

That caution comes alongside a big security win. Formal languages and type systems are the most powerful tools we have for input validation. Rigorous input validation is one of the most powerful things we can do to increase the security of our systems. Making good use of the type system means that your code should never be run on bad inputs.

It’s because GraphQL is a language that it let’s us both empower users and increase security, and that is a rare combination indeed.

Tagged template literals and the hack that will never go away

Tagged template literals were added to Javascript as part of ES 2015. While a fair bit has been written about them, I’m going to argue their significance is underappreciated and I’m hoping this post will help change that. In part, it’s significant because it strikes at the root of a problem people had otherwise resigned themselves to living with: SQL injection.

Just so we are clear, before ES 2015, combining query strings with untrusted user input to create a SQL injection was done via concatenation using the plus operator.

let query = 'select * from widgets where id = ' + id + ';'

As of ES 2015, you can create far more stylish SQL injections using backticks.

let query = `select * from widgets where id = ${id};`

By itself this addition is really only remarkable for not being included in the language sooner. The backticks are weird, but it gives us some much-needed multiline string support and a very Rubyish string interpolation syntax. It’s pairing this new syntax with another language feature known as tagged templates that has a real potential to make an impact on SQL injections.

> let id = 1
// define a function to use as a "tag"
> sql = (strings, ...vars) => ({strings, vars})
[Function: sql]
// pass our tag a template literal
> sql`select * from widgets where id = ${id};`
{ strings: [ 'select * from widgets where id = ', ';' ], vars: [ 1 ] }

What you see above is just a function call, but it no longer works like other languages. Instead of doing the variable interpolation first and then calling the sql function with select * from widgets where id = 1;, the sql function is called with an array of strings and the variables that are supposed to be interpolated.

You can see how different this is from the standard evaluation process by adding brackets to make this a standard function invocation. The string is interpolated before being passed to the sql function, entirely loosing the distinction between the variable (which we probably don’t trust) and the string (that we probably do). The result is an injected string and an empty array of variables.

> sql(`select * from widgets where id = ${id};`)
{ strings: 'select * from widgets where id = 1;', vars: [] }

This loss of context is the heart of matter when it comes to SQL injection (or injection attacks generally). The moment the strings and variables are combined you have a problem on your hands.

So why not just use parameterized queries or something similar? It’s generally held that good code expresses the programmers intent. I would argue that our SQL injection example code perfectly expresses the programmers intent; they want the id variable to be included in the query string. As a perfect expression of the programmers intent, this should be acknowledged as “good code”… as well as a horrendous security problem.

let query = sql(`select * from widgets where id = ${id};`)

When the clearest expression of a programmers intent is also a security problem what you have is a systemic issue which requires a systemic fix. This is why despite years of security education, developer shaming and “push left” pep-talks SQL injection stubbornly remains “the hack that will never go away”. It’s also why you find Mike Samuel from Google’s security team as the champion of the “Template Strings” proposal.

You can see the fruits of this labour by noticing library authors leveraging this to deliver a great developer experience while doing the right thing for security. Allan Plum, the driving force behind the Arangodb Javascript driver leveraging tagged template literals to let users query ArangoDB safely.

The aql (Arango Query Language) function lets you write what would in any other language be an intent revealing SQL injection, safely returns an object with a query and some accompanying bindvars.

aql`FOR thing IN collection FILTER thing.foo == ${foo} RETURN thing`
{ query: 'FOR thing IN collection FILTER thing.foo == @value0 RETURN thing',
  bindVars: { value0: 'bar' } }

Mike Samuel himself has a number of node libraries that leverage Tagged Template Literals, among them one to safely handle shell commands.

sh`echo -- ${a} "${b}" 'c: ${c}'`

It’s important to point out that Tagged Template Literals don’t entirely solve SQL injections, since there are no guarantees that any particular tag function will do “the right thing” security-wise, but the arguments the tag function receives set library authors up for success.

Authors using them get to offer an intuitive developer experience rather than the clunkiness of prepared statements, even though the tag function may well be using them under the hood. The best experience is from safest thing; It’s a great example of creating a “pit of success” for people to fall into.

// Good security hinges on devs learning to write
// stuff like this instead of stuff that makes sense.
// Clunky prepared statement is clunky.
const ps = new sql.PreparedStatement(/* [pool] */)
ps.input('param', sql.Int)
ps.prepare('select * from widgets where id = @id;', err => {
    // ... error checks
    ps.execute({id: 1}, (err, result) => {
        // ... error checks
        ps.unprepare(err => {
            // ... error checks
        })
    })
})

It’s an interesting thought that Javascripts deficiencies seem to have become it’s strength. First Ryan Dahl filled out the missing IO pieces to create Node JS and now missing features like multiline string support provide an opportunity for some of the worlds most brilliant minds to insert cutting edge security features along-side these much needed fixes.

I’m really happy to finally see language level fixes for things that are clearly language level problems, and excited to see where Mike Samuel’s mission to “make the easiest way to express an idea in code a secure way to express that idea” takes Javascript next. It’s the only way I can see to make “the hack that will never go away” go away.

Basic HTTPS on Kubernetes with Traefik

Back in February of 2018 Google’s Security blog announced that Chrome would be start displaying “not secure” for websites starting in July. In doing so they cemented HTTPS as part of the constantly-rising baseline expectations for modern web developers.

These constantly-rising baseline expectations are written into a new generation of tools like Traefik and Caddy. Both are written in Go and both leverage Let’s Encrypt to automate away the requesting and renewal, and by extension the unexpected expiration, of TLS certificates. Kubernetes is another modern tool aimed at meeting some of the other modern baseline expectations around monitoring, scaling and uptime.

Using Traefik and Kubernetes together is a little fiddly, and getting a working deployment on a cloud provider even more so. The aim here is to show how to use Traefik to get Let’s Encrypt based HTTPS working on the Google Kubernetes Engine.
An obvious prerequisite is to have a domain name, and to point it at a static IP you’ve created.

Let’s start with creating our project:

mike@sleepycat:~$ gcloud projects create --name k8s-https
No project id provided.

Use [k8s-https-212614] as project id (Y/n)?  y

Create in progress for [https://cloudresourcemanager.googleapis.com/v1/projects/k8s-https-212614].
Waiting for [operations/cp.6673958274622567208] to finish...done.

Next lets create a static IP.

mike@sleepycat:~$ gcloud beta compute --project=k8s-https-212614 addresses create k8s-https --region=northamerica-northeast1 --network-tier=PREMIUM
Created [https://www.googleapis.com/compute/beta/projects/k8s-https-212614/regions/northamerica-northeast1/addresses/k8s-https].
mike@sleepycat:~$ gcloud beta compute --project=k8s-https-212614 addresses list
NAME       REGION                   ADDRESS        STATUS
k8s-https  northamerica-northeast1  35.203.65.136  RESERVED

Because I am easily amused, I own the domain actually.works. In my settings for that domain I created an A record pointing at that IP address. When you have things set up correctly, you can verify the DNS part is working with dig

mike@sleepycat:~$ dig it.actually.works

; <<>> DiG 9.13.0 <<>> it.actually.works
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62565
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;it.actually.works.		IN	A

;; ANSWER SECTION:
it.actually.works.	3600	IN	A	35.203.65.136

;; Query time: 188 msec
;; SERVER: 192.168.0.1#53(192.168.0.1)
;; WHEN: Tue Aug 07 23:02:11 EDT 2018
;; MSG SIZE  rcvd: 62

With that squared away, we need to create our Kubernetes cluster. Before we can do that we need to get a little administrative stuff out of the way. First we need to get our billing details and link them to our project.

mike@sleepycat:~$ gcloud beta billing accounts list
ACCOUNT_ID            NAME                OPEN  MASTER_ACCOUNT_ID
0X0X0X-0X0X0X-0X0X0X  My Billing Account  True
mike@sleepycat:~$ gcloud beta billing projects link k8s-https-212614 --billing-account 0X0X0X-0X0X0X-0X0X0X
billingAccountName: billingAccounts/0X0X0X-0X0X0X-0X0X0X
billingEnabled: true
name: projects/k8s-https-212614/billingInfo
projectId: k8s-https-212614

Then we’ll need to enable the Kubernetes engine for this project.

mike@sleepycat:~$ gcloud services enable container.googleapis.com --project k8s-https-212614
Waiting for async operation operations/tmo-acf.74966272-39c8-4b7b-b973-8f7fa4dac4fd to complete...
Operation finished successfully. The following command can describe the Operation details:
 gcloud services operations describe operations/tmo-acf.74966272-39c8-4b7b-b973-8f7fa4dac4fd

Let’s create our cluster. Because both Kubernetes and Google move pretty quickly, it’s good to check the current Kubernetes version for your region with something like gcloud container get-server-config --region "northamerica-northeast1". In my case that shows “1.10.5-gke.3” as the newest so I’ll use that for my cluster. If you are interested in beefier machines explore your options with gcloud compute machine-types list --filter="northamerica-northeast1" but for this I’ll slum it with a f1-micro.

mike@sleepycat:~$ gcloud container --project=k8s-https-212614 clusters create "k8s-https" --zone "northamerica-northeast1-a" --username "admin" --cluster-version "1.10.5-gke.3" --machine-type "f1-micro" --image-type "COS" --disk-type "pd-standard" --disk-size "100" --scopes "https://www.googleapis.com/auth/compute","https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" --num-nodes "3" --enable-cloud-logging --enable-cloud-monitoring --addons HorizontalPodAutoscaling,HttpLoadBalancing,KubernetesDashboard --enable-autoupgrade --enable-autorepair

Creating cluster k8s-https...done.
Created [https://container.googleapis.com/v1/projects/k8s-https-212614/zones/northamerica-northeast1-a/clusters/k8s-https].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/northamerica-northeast1-a/k8s-https?project=k8s-https-212614
kubeconfig entry generated for k8s-https.
NAME       LOCATION                   MASTER_VERSION  MASTER_IP    MACHINE_TYPE  NODE_VERSION  NUM_NODES  STATUS
k8s-https  northamerica-northeast1-a  1.10.5-gke.3    35.203.64.6  f1-micro      1.10.5-gke.3  3          RUNNING

You will notice that kubectl (which you obviously have installed already) is now configured to access this cluster.
As part of the Traefik setup we are about to do we will need to change some RBAC rules. To do that we will need to create a cluster admin role and load that into our cluster.

mike@sleepycat:~$ cat cluster-admin-rolebinding.yaml 
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: owner-cluster-admin-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: <your_username@your_email_you_use_with_google_cloud.whatever>
---

mike@sleepycat:~$ kubectl apply -f cluster-admin-rolebinding.yaml

With that done we can apply the rest of the config I’ve posted in a snippet here with kubectl apply -f https.yaml.
It’s a fair bit of yaml, but a few things are worth pointing out.

First, we are running a single pod with my helloworld image. It’s just the output of create-react-app that I use for testing stuff.
If you look at the traefik-ingress-service, you will notice we are telling Google we want the service mapped to the static IP we created earlier using loadBalancerIP.

---
apiVersion: v1
kind: Service
metadata:
  name: traefik-ingress-service
  namespace: kube-system
spec:
  loadBalancerIP: 35.203.65.136
  ports:
  - name: http
    port: 80
    protocol: TCP
  - name: https
    port: 443
    protocol: TCP
  - name: admin
    port: 8080
    protocol: TCP
  selector:
    k8s-app: traefik-ingress-lb
  type: LoadBalancer
---

When looking at the traefik-ingress-controller itself, it’s worth noting the choice of kind: Deployment instead of kind: DaemonSet. This choice was made for simplicity’s sake (only a single pod will read/write to my certs-claim volume so no Multi-Attach errors), and means that I will have a single pod acting as my ingress controller. Read more about the tradeoffs here.

Here is the traefik-ingress-controller in it’s entirety. It’s a long chunk of code, but I find this helps see everything in context.

Special note about the args being passed to the container; make sure they are strings. You can end up with some pretty baffling errors if you don’t. Other than that, it’s the full set of options to get you TLS certs and automatic redirects to HTTPS.

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: traefik-ingress-lb
  name: traefik-ingress-controller
  namespace: kube-system
spec:
  template:
    metadata:
      labels:
        k8s-app: traefik-ingress-lb
    spec:
      containers:
      - args:
        - "--api"
        - "--kubernetes"
        - "--logLevel=DEBUG"
        - "--debug"
        - "--defaultentrypoints=http,https"
        - "--entrypoints=Name:http Address::80 Redirect.EntryPoint:https"
        - "--entrypoints=Name:https Address::443 TLS"
        - "--acme"
        - "--acme.onhostrule"
        - "--acme.entrypoint=https"
        - "--acme.domains=it.actually.works"
        - "--acme.email=mike@korora.ca"
        - "--acme.storage=/certs/acme.json"
        - "--acme.httpchallenge"
        - "--acme.httpchallenge.entrypoint=http"
        image: traefik:1.7
        name: traefik-ingress-lb
        ports:
        - containerPort: 80
          hostPort: 80
          name: http
        - containerPort: 443
          hostPort: 443
          name: https
        - containerPort: 8080
          hostPort: 8080
          name: admin
        securityContext:
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - ALL
        volumeMounts:
        - mountPath: /certs
          name: certs-claim
      serviceAccountName: traefik-ingress-controller
      terminationGracePeriodSeconds: 60
      volumes:
      - name: certs-claim
        persistentVolumeClaim:
          claimName: certs-claim

The contents of the snippet should be all you need to get up and running. You should be able to visit your domain and see the reassuring green of the TLS lock in the URL bar.

A TLS certificate from Let's Encrypt

If things aren’t working you can get a sense of what’s up with the following commands:

kubectl get all --all-namespaces
kubectl logs --namespace=kube-system traefik-ingress-controller-...

Where to go from here

As you can see, there is a fair bit going on here. We have DNS, Kubernetes, Traefik and the underlying Google Cloud Platform all interacting and it’s not easy to get a minimal “hello world” style demo going when that is the case. Hopefully this shows enough to give people a jumping off point so they can start refining this into a more robust configuration. The next steps for me will be exploring DaemonSets and storing the acme.json in a way that multiple copies of Traefik can access, maybe a key/value like consul. We’ll see what the next layer of learning brings.

Minimum viable Kubernetes

I remember sitting in the audience at the first Dockercon in 2014 when Google announced Kubernetes and thinking “what kind of a name is that?”. In the intervening years, Kubernetes, or k8s for short, has battled it out with Cattle and Docker swarm and emerged as the last orchestrator standing.

I’ve been watching this happen but have been procrastinating on learning it because from a distance it looks hella complicated. Recently I decided to rip off the bandaid and set myself the challenge of getting a single container running in k8s.

While every major cloud provider is offering k8s, so far Google looks to be the easiest to get started with. So what does it take to get a container running on Google Cloud?

First some assumptions: you’ve installed the gcloud command (I used this) with the alpha commands, and you have a GCP account, and you’ve logged in with gcloud auth login.

If you have that sorted, let’s create a project.

mike@sleepycat:~$ gcloud projects create --name projectfoo
No project id provided.

Use [projectfoo-208401] as project id (Y/n)?  

Create in progress for [https://cloudresourcemanager.googleapis.com/v1/projects/projectfoo-208401].
Waiting for [operations/cp.4790935341316997740] to finish...done.

With a project created we need to enable billing for it, so Google can charge you for the compute resources Kubernetes uses.

                                                                                                                                      
mike@sleepycat:~$ gcloud alpha billing projects link projectfoo-208401 --billing-account 0X0X0X-0X0X0X-0X0X0X
billingAccountName: billingAccounts/0X0X0X-0X0X0X-0X0X0X
billingEnabled: true
name: projects/projectfoo-208401/billingInfo
projectId: projectfoo-208401

Next we need to enable the Kubernetes Engine API for our new project.

mike@sleepycat:~$ gcloud services --project=projectfoo-208401 enable container.googleapis.com
Waiting for async operation operations/tmo-acf.445bb50c-cf7a-4477-831c-371fea91ddf0 to complete...
Operation finished successfully. The following command can describe the Operation details:
 gcloud services operations describe operations/tmo-acf.445bb50c-cf7a-4477-831c-371fea91ddf0

With that done, we are free to fire up a Kubernetes cluster. There is a lot going on here, more than you need, but it’s good to be able to see some of the options available. Probably the only ones to care about initially are the zone and the machine-type.

mike@sleepycat:~$ gcloud beta container --project=projectfoo-208401 clusters create "projectfoo" --zone "northamerica-northeast1-a" --username "admin" --cluster-version "1.8.10-gke.0" --machine-type "f1-micro" --image-type "COS" --disk-type "pd-standard" --disk-size "100" --scopes "https://www.googleapis.com/auth/compute","https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" --num-nodes "3" --enable-cloud-logging --enable-cloud-monitoring --addons HorizontalPodAutoscaling,HttpLoadBalancing,KubernetesDashboard --enable-autoupgrade --enable-autorepair
This will enable the autorepair feature for nodes. Please see
https://cloud.google.com/kubernetes-engine/docs/node-auto-repair for more
information on node autorepairs.

This will enable the autoupgrade feature for nodes. Please see
https://cloud.google.com/kubernetes-engine/docs/node-management for more
information on node autoupgrades.

Creating cluster projectfoo...done.                                                                                                                                                                         
Created [https://container.googleapis.com/v1beta1/projects/projectfoo-208401/zones/northamerica-northeast1-a/clusters/projectfoo].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/northamerica-northeast1-a/projectfoo?project=projectfoo-208401
kubeconfig entry generated for projectfoo.
NAME        LOCATION                   MASTER_VERSION  MASTER_IP     MACHINE_TYPE  NODE_VERSION  NUM_NODES  STATUS
projectfoo  northamerica-northeast1-a  1.8.10-gke.0    35.203.8.163  f1-micro      1.8.10-gke.0  3          RUNNING

With that done we can take a quick peek at what that last command created: a Kubernetes cluster on three f1-micro VMs.

mike@sleepycat:~$ gcloud compute instances --project=projectfoo-208401 list
NAME                                       ZONE                       MACHINE_TYPE  PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP    STATUS
gke-projectfoo-default-pool-190d2ac3-59hg  northamerica-northeast1-a  f1-micro                   10.162.0.4   35.203.87.122  RUNNING
gke-projectfoo-default-pool-190d2ac3-lbnk  northamerica-northeast1-a  f1-micro                   10.162.0.2   35.203.78.141  RUNNING
gke-projectfoo-default-pool-190d2ac3-pmsw  northamerica-northeast1-a  f1-micro                   10.162.0.3   35.203.91.206  RUNNING

Let’s put those f1-micro‘s to work. We are going to use the kubectl run command to run a simple helloworld container that just has the basic output of create-react-app in it.

mike@sleepycat:~$ kubectl run projectfoo --image mikewilliamson/helloworld --port 3000
deployment "projectfoo" created

The result of that is the helloworld container, running inside a pod, inside a replica set inside a deployment, which of course is running inside a VM on Google Cloud. All that’s needed now is to map the port the container is listening on (3000) to port 80 so we can talk to it from the outside world.

mike@sleepycat:~$ kubectl expose deployment projectfoo --type LoadBalancer --port 80 --target-port 3000
service "projectfoo" exposed

This creates a LoadBalancer service, and eventually we get allocated our own IP.

mike@sleepycat:~$ kubectl get services
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP      10.59.240.1    <none>        443/TCP        3m
projectfoo   LoadBalancer   10.59.245.55   <pending>     80:32184/TCP   34s
mike@sleepycat:~$ kubectl get services
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)        AGE
kubernetes   ClusterIP      10.59.240.1    <none>           443/TCP        4m
projectfoo   LoadBalancer   10.59.245.55   35.203.123.204   80:32184/TCP   1m

Then we can use our newly allocated IP and talk to our container. The moment of truth!

mike@sleepycat:~$ curl 35.203.123.204
<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><meta name="viewport" content="width=device-width,initial-scale=1,shrink-to-fit=no"><meta name="theme-color" content="#000000"><link rel="manifest" href="/manifest.json"><link rel="shortcut icon" href="/favicon.ico"><title>React App</title><link href="/static/css/main.c17080f1.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div><script type="text/javascript" src="/static/js/main.61911c33.js"></script></body></html>

After you’ve taken a moment to marvel at the layers of abstractions involved here, it’s worth remembering that you probably don’t want this stuff hanging around if you aren’t really using it, otherwise you’re going to regret connecting your billing information.

mike@sleepycat:~$ gcloud container --project projectfoo-208401 clusters delete projectfoo
The following clusters will be deleted.
 - [projectfoo] in [northamerica-northeast1-a]

Do you want to continue (Y/n)?  y

Deleting cluster projectfoo...done.                                                                                                                                                                         
Deleted [https://container.googleapis.com/v1/projects/projectfoo-208401/zones/northamerica-northeast1-a/clusters/projectfoo].
mike@sleepycat:~$ gcloud projects delete projectfoo-208401
Your project will be deleted.

Do you want to continue (Y/n)?  y

Deleted [https://cloudresourcemanager.googleapis.com/v1/projects/projectfoo-208401].

You can undo this operation for a limited period by running:
  $ gcloud projects undelete projectfoo-208401

There is a lot going on here, and since this is new territory, much of it doesn’t mean lots to me yet. What’s exciting to me is finally being able to get a toe-hold on an otherwise pretty intimidating subject.

Having finally started working with it, I have to say both the kubectl and gcloud CLI tools are thoughtfully designed and pretty intuitive, and Google’s done a nice job making a lot of stuff happen in just a few approachable commands. I’m excited to dig in further.

Customizing your R command line experience

I’ve come to appreciate how powerful R is for working with data, but I find it has some pretty awkward and clunky defaults that make every interaction with the command line kind of aggravating.

mike@sleepycat:~$ R

R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> install.packages("tidyverse")
Warning in install.packages("tidyverse") :
  'lib = "/usr/lib/R/library"' is not writable
Would you like to use a personal library instead?  (y/n)
>
Save workspace image? [y/n/c]: n

Here you can see a few things that I’m not a fan of. First up, uppercase commands are just weird. Second, holy moly that’s a lot of blah blah to give me a repl.

Next, this warning about the non-writable directory; /usr will always be read only and owned by root. Why even try to write there?

Accepting the offer to use a “personal library” translates to creating a directory like ~/R/x86_64-pc-linux-gnu-library/3.4, which is another annoyance. Why isn’t this a hidden directory? Why must I be forced to look at this R directory every time I see my home folder?

Finally, if I’m doing something more than a quick experiment, I’ll write a script in a text file so I can track changes to it with git rather than saving my work in some opaque .RData file. Given that, having R always asking to save my workspace is deeply irritating.

We can pick off a few of these problems with a simple alias, by adding the following to your ~/.bashrc file:

alias r="R --no-save --quiet"

This immediately gives us a far more civilized experience getting in and out of R: I can launch R with a lowercase r command and then exit without a fuss with Ctrl+d (signaling the end of input). The --quiet kills the introductory text while --no-save gets rid of the “Save workspace image?” nag.

mike@sleepycat:~$ r
> 
mike@sleepycat:~$

This is a good start but goofy stuff like where to save your libraries will need to be solved another way: your Rprofile. While starting up R looks for certain configuration files one of them being ~/.Rprofile.

I’ve created a hidden folder called .rlibs folder in my home directory which my Rprofile then sets as my chosen place to save libraries among a few other things likes setting a default mirror and loading and saving my command history.

Here is what’s working for me:

# Load utils so we can use it here.
library(utils)

# Save R libraries into my /home/$USER/.rlibs instead of somewhere that
# requires root privileges:
.libPaths('~/.rlibs')

# Stop asking me about mirrors, and always use the https
# cran mirror at muug.ca:
local({r <- getOption("repos")
      r["CRAN"] <- "https://muug.ca/mirror/cran/"
      options(repos=r)})

# Some reasonable defaults:
options(stringsAsFactors=FALSE)
options(max.print=100)
options(editor="vim")
options(menu.graphics=FALSE)
Sys.setenv(R_HISTFILE="~/.Rhistory")
Sys.setenv(R_HISTSIZE="100000")

# Run at startup
.First <- function(){
  # Load my history if this is an interactive session
  if (interactive()) utils::loadhistory(file = "~/.Rhistory")
  # Load the packages in the tidyverse without warnings.
  # suppressMessages(library(tidyverse))
}

# Run at the end of your session
.Last <- function(){
  if (interactive()) utils:::savehistory(file = "~/.Rhistory")
}

This little foray into R configuration has made R really nice to use from the command line. Hopefully this will be a decent starting point for others as well.

GraphQL i18n

One of the things I love about GraphQL is that it is “self documenting”, but of course, here in Canada the obvious question that follows is “in both languages?”. Since GraphQL is one of the core technologies for me I really wanted to figure out a decent answer when people ask about i18n, but it’s never been clear to me how to handle it.

Since I’m familiar with GraphQL-js, I’ll be using that here but Apollo Server is on my list to explore and may have some different answers. I’ll also be using my favourite i18n library Lingui here, but any other i18n library could be easily substituted.

Just to be clear, the issue at hand is not i18n for the data (that you’d handle with something like ArangoDB’s TRANSLATE) it’s i18n for the description attributes that can be attached to all your schema objects.
Once description strings are added, anyone (and “anyone” could and probably will include developer tools, IDEs or developers themselves) can introspect on the schema by querying the __schema or __type fields to find the descriptions with a query like this:

{
  __schema {
    queryType {
      fields {
        name
        description
        args {
          name
          description
        }
      }
    }
  }
}

To help think this through lets create a basic schema that just returns the current time and returns the DateTime type from above and use it to explore i18n.

const DateTime = new GraphQLObjectType({
  name: 'DateTime',
  description: 'An example date/time object.',
  fields: () => ({
    date: {
      description: 'The current date in DD/MM/YYYY format.',
      type: GraphQLString,
    },
    time: {
      description: 'The current time in HH:MM:SS AM/PM format.',
      type: GraphQLString,
    },
  }),
})

const query = new GraphQLObjectType({
  name: 'Query',
  fields: {
    now: {
      description: 'Returns current time and date values.',
      type: DateTime, // what the resolve function will produce
      resolve: (root, args, context) => {
        let now = new Date()
        let time = now.toLocaleDateString(context.language, {
          timeZone: 'America/Toronto',
        })
        let date = now.toLocaleTimeString(context.language, {
          timeZone: 'America/Toronto',
        })
        return { date, time }
      },
    },
  },
})

With our query type and it’s return type created, we just need to wrap that in a schema and pass it to the express-graphql middleware. It will mount the schema on the url we specify and pass the request object to all our resolvers as the third argument (aka “the context”).

Adding the requestLanguage middleware from the express-request-language library before express-graphql means that incoming requests will be checked for the Accept-Language header, and the best matching language of the languages you specify will be attached to the request as request.language. Remember that express-graphql is passing the request to our resolvers as context so that means that we access the request.language as context.language.

const schema = new GraphQLSchema({ query })

let server = express()

server
  .use(
    requestLanguage({
      languages: ['en', 'fr'], // First locale becomes the default
    }),
  )
  .use('/graphql', graphqlHTTP({ schema }))

server.listen(3000)

With this basic setup, we can already see the outline of the issue: our schema and types with their accompanying descriptions are defined once when the script is run, but the language we want is whatever is appropriate for each request.

It probably won’t surprise you that our middleware can been passed a function that it will execute per request that will produce the configuation needed. With that in place we have a pretty clear path forward.

server
  .use(
    requestLanguage({
      languages: ['en', 'fr'], // First locale becomes the default
    }),
  )
  .use(
    '/graphql',
    graphqlHTTP((request, response, graphQLParams) => {
      return {
        schema: new GraphQLSchema({
          query: // define a schema and types using the request language and pass it in
        }),
      }
    }),
  )

My first step was to define a function that recieves an i18n object and returns a schema.

const createSchema = i18n => {
  // Define a type that describes the data
  const DateTime = new GraphQLObjectType({
    name: 'DateTime',
    description: i18n.t`An example date/time object.`,
    fields: () => ({
      date: {
        description: i18n.t`The current date in DD/MM/YYYY format.`,
        type: GraphQLString,
      },
      time: {
        description: i18n.t`The current time in HH:MM:SS AM/PM format.`,
        type: GraphQLString,
      },
    }),
  })

  const query = new GraphQLObjectType({
    name: 'Query',
    fields: {
      now: {
        description: i18n.t`Returns current time and date values.`,
        type: DateTime, // what the resolve function will produce
        resolve: (root, args, context) => {
          let now = new Date()
          let time = now.toLocaleDateString(context.language, {
            timeZone: 'America/Toronto',
          })
          let date = now.toLocaleTimeString(context.language, {
            timeZone: 'America/Toronto',
          })
          return { date, time }
        },
      },
    },
  })

  return query
}

With that defined I can use it to produce a schema, but because lingui works by scanning for and extracting things like i18n.t`...` from my code, I have to remember not to rename that otherwise lingui extract won’t find my translations. Additionally I don’t want variable shadowing, so I import i18n under a different name and rename it to what Lingui expects only when I go to create the schema:

const express = require('express')
const graphqlHTTP = require('express-graphql')
const { GraphQLSchema, GraphQLObjectType, GraphQLString } = require('graphql')
const { i18n: internationalization, unpackCatalog } = require('lingui-i18n') // import i18n as something else
const requestLanguage = require('express-request-language')

internationalization.load({  // Load our language files
  fr: unpackCatalog(require('./locale/fr/messages.js')),
  en: unpackCatalog(require('./locale/en/messages.js')),
})

// Our function that creates the schema
const createSchema = i18n => {
...
}

let server = express()

server
  .use(
    requestLanguage({
      languages: internationalization.availableLanguages.sort(), // First locale becomes the default
    }),
  )
  .use(
    '/graphql',
    graphqlHTTP(async (request, response, graphQLParams) => {
      internationalization.activate(request.language)
      return {
        schema: new GraphQLSchema({
          query: createSchema(internationalization),
        }),
      }
    }),
  )

server.listen(3000)

Getting Lingui properly integrated took a little fiddling. It brings along it’s own copy of babel which doesn’t seem to see your other babel plugins but does read your .babelrc.
Configuring my projects babel using only command line options, solved the clashes with lingui’s babel, and then making sure Lingui would only look for translations in the src folder was all I needed to get Lingui in working along-side the usual “transpile to dist” workflow (the finished code is available here).
After running Lingui’s lingui extract and doing my translations I’m now able to hit my endpoint and see the translated descriptions:

# Ask for English!
mike@sleepycat:~$ curl -H "Accept-Language: en" -H "Content-Type: application/graphql" -d "{ __schema {queryType { fields { name description } } } }" "localhost:3000/graphql"
{"data":{"__schema":{"queryType":{"fields":[{"name":"now","description":"Returns current time and date values."}]}}}}
# Ask for French!
mike@sleepycat:~$ curl -H "Accept-Language: fr" -H "Content-Type: application/graphql" -d "{ __schema {queryType { fields { name description } } } }" "localhost:3000/graphql"
{"data":{"__schema":{"queryType":{"fields":[{"name":"now","description":"Renvoie les valeurs actuelles de date et d'heure."}]}}}}

Performance

Obviously defining the schema and types before processing each request is going to cost something but it would be good to know what. This is a question some load testing with wrk2 can give us insight into (with the caveat that both the server and the load testing program were running on the same laptop, so take this with a large grain of salt).

First, a version without the schema per request:

mike@sleepycat:~$ wrk2 -t4 -c100 -d30s -R2000 --latency "http://localhost:3000/graphql?query=%7B%0A%20%20now%20%7B%0A%20%20%20%20date%0A%20%20%20%20time%0A%20%20%7D%0A%7D"
Running 30s test @ http://localhost:3000/graphql?query=%7B%0A%20%20now%20%7B%0A%20%20%20%20date%0A%20%20%20%20time%0A%20%20%7D%0A%7D
  4 threads and 100 connections
  Thread calibration: mean lat.: 3309.908ms, rate sampling interval: 11272ms
  Thread calibration: mean lat.: 3310.066ms, rate sampling interval: 11280ms
  Thread calibration: mean lat.: 3308.845ms, rate sampling interval: 11280ms
  Thread calibration: mean lat.: 3310.191ms, rate sampling interval: 11280ms
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    11.83s     3.26s   17.47s    57.94%
    Req/Sec   214.00      0.00   214.00    100.00%

And now the one with our i18n schema per request:

mike@sleepycat:~$ wrk2 -t4 -c100 -d30s -R2000 --latency "http://localhost:3000/graphql?query=%7B%0A%20%20now%20%7B%0A%20%20%20%20date%0A%20%20%20%20time%0A%20%20%7D%0A%7D"
Running 30s test @ http://localhost:3000/graphql?query=%7B%0A%20%20now%20%7B%0A%20%20%20%20date%0A%20%20%20%20time%0A%20%20%7D%0A%7D
  4 threads and 100 connections
  Thread calibration: mean lat.: 3196.913ms, rate sampling interval: 11296ms
  Thread calibration: mean lat.: 3196.537ms, rate sampling interval: 11288ms
  Thread calibration: mean lat.: 3115.362ms, rate sampling interval: 10493ms
  Thread calibration: mean lat.: 3196.062ms, rate sampling interval: 11288ms
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    11.91s     3.35s   17.81s    57.65%
    Req/Sec   207.00      0.00   207.00    100.00%

So generating our schema/types per request drops us from 214 requests per second down to 207. Clearly it’s not free, but in this little example it’s pretty reasonable and in this world of microservices, there are a fair number of services that aren’t much bigger than this example. That said, a ~3% drop for something so simple is probably something you would want to watch carefully. A larger schema with more imports might well be far more costly. Clearly this little performance test is far from rigorous, but it’s nice to have some vague sense of the impact.

i18n and GraphQL

In the end this blog post is as much a question as it is an answer. I’m pretty certain their are futher refinements to be made here and that there are ways to avoid some or maybe all of the performance penalty highlighted above.
I’m also pretty curious about what it would look like to get a similar thing happening with Apollo Server.
Hopefully this will help other people who are trying to do i18n with GraphQL and maybe surface better options.