Graph migrations

One of the things that is not obvious at first glance is how “broad” ArangoDB is. By combining the flexibility of the document model with the joins of the graph model, ArangoDB has become a viable replacement for Postgres or MySQL, which is exactly how I have been using it; for the last two years it’s been my only database.

One of the things that falls out of that type of usage is a need to actually change the structure of your graph. In a graph, structure comes from the edges that connect your vertices. Since both the vertices and edges are just documents, that structure is actually just more data. Changing your graph structure essentially means refactoring your data.

There are definitely patterns that appear in that refactoring, and over the last little while I have been playing with putting the obvious ones into a library called graph_migrations. This is work in progress but there are some useful functions working already and could use some proper documentation.

eagerDelete

One of the first of these is what I have called eagerDelete. If you were wanting to delete Bob from graph below, Charlie and Dave would be orphaned.

Screenshot from 2016-04-06 10-55-54

Deleting Bob with eagerDelete means that Bob is deleted as well as any neighbors whose only neighbor is Bob.

gm = new GraphMigration("test") //use the database named "test"
gm.eagerDelete({name: "Bob"}, "knows_graph")

alice_eve

mergeVertices

Occasionally you will end up with duplicate vertices, which should be merged together. Below you can see we have an extra Charlie vertex.

extra_charlie

gm = new GraphMigration("test")
gm.mergeVertices({name: "CHARLIE"},{name: "Charlie"}, "knows_graph")

merged_charlie

attributeToVertex

One of the other common transformations is needing to make a vertex out of attribute. This process of “promoting” something to be a vertex is sometimes called reifying. Lets say Eve and Charlie are programmers.

knows_graph

Lets add an attribute called job to both Eve and Charlie identifying them as programmers:

adding_job_attr

But lets say that we decide that it makes more sense for job: "programmer" to be a vertex on it’s own (we want to reify it). We can use the attributeToVertex function for that, but because Arango allows us to split our edge collections and it’s good practice to do that, lets add a new edge collection to our “knows_graph” to store the edges that will be created when we reify this attribute.

adding_works_as

With that we can run attributeToVertex, telling it the attribute(s) to look for, the graph (knows_graph) to search and the collection to save the edges in (works_as).

gm = new GraphMigration("test")
gm.attributeToVertex({job: "programmer"}, "knows_graph", "works_as", {direction: "inbound"})

The result is this:

after_attrTo_vertex

vertexToAttribute

Another common transformation is exactly the reverse of what we just did; folding the job: "programmer" vertex into the vertices connected to it.

gm = new GraphMigration("test")
gm.vertexToAttribute({job: "programmer"}, "knows_graph", {direction: "inbound"})

That code puts us right back to where we started, with Eve and Charlie both having a job: "programmer" attribute.

knows_graph

redirectEdges

There are times when things are just not connected the way you want. Lets say in our knows_graph we want all the inbound edges pointing at Bob to point instead to Charlie.
knows_graph

We can use redirectEdges to do exactly that.

gm = new GraphMigration("test")
gm.redirectEdges({_id: "persons/bob"}, {_id: "persons/charlie"}, "knows_graph", {direction: "inbound"})

And now Eve and Alice know Charlie.

redirected_edges

Where to go from here.

As the name “graph migrations” suggests the thinking behind this was to create something similar to the Active Record Migrations library from Ruby on Rails but for graphs.

As more and more of this goes from idea to code and I get a chance to play with it, I’m less sure that a direct copy of Migrations makes sense. Graphs are actually pretty fine-grained data in the end and maybe something more interactive makes sense. It could be that this makes more sense as a Foxx app or perhaps part of Arangojs or ArangoDB’s admin interface. It feels a little to early to tell.

Beyond providing a little documentation the hope here is make this a little more visible to people who are thinking along the same lines and might be interested in contributing.

Back up your data, give it a try and tell me what you think.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s