Tuesday 12 March 2013

Retrieving paths in Neo4JClient

This post relates to using Neo4J, Neo4JClient in .NET applications and how to retrieve paths from your Neo4J databases and turn them straight into POCO objects.

This is a bit more of an intermediate post and is specifically geared at .NET developers who may be similarly struggling with this issue. It MAY help people talking to Neo4J over REST from other languages, but I make no guarantees there. I will at some point try writing a little more on helping move newer people towards a nirvana of graphy greatness, so keep an eye on the usual places for any news on that.

It is assumed that you already have some basic knowledge of Cypher and how to use Neo4JClient to a basic level, so you'll get the gist of what's going on here. Cameron Tinker @CameronJTinker has written a good Neo4JClient tutorial that may help you get started.

Introduction


Recently, I posted about getting started with Neo4J, which was based on my own starting up experiences. This one takes a significant jump forward from what can only comparatively be called "Hello World!". Now, I'm tackling a problem that plagued me for quite some time and that I was only recently helped to solve: the not too trivial idea of getting hold of paths in .NET.

The Neo4JClient is a good way of working with Neo4J within a .NET app. It supports all your basic CRUD operations in a pretty .NETty way, keeps your code clean and understandable and, best of all, handles mapping query results, inserts and updates to POCO objects. It supports both Gremlin and Cypher, and the latter is what I'll be focussing on again. Really, it's not that I don't like Gremlin, or Gremlins, and it does have a cool logo. But Cypher suits our needs much better here.

The Problem


Ok, so we can put stuff in our database and run queries and get a single type of node out of our database, yada yada. Good, sure. But what if, like me, you needed Neo4J for something more path related? Displaying paths on a UI, etc. Well, I consulted the oracle of all knowledge (well, I googled it) in many, many different ways. I read through Neo4JClient's documentation, which exists in the form of its unit tests. I looked at tutorials. I spun around on my chair, thought that was silly, and tried googling again. Nothing, as of this writing.

The Solution


So, I took to StackOverflow in the hopes of attracting the experts' help. I asked about Neo4JClient and paths in general, and with some help, I eventually found that Neo4J itself, when asked to return a path over REST, was giving back a path alright, but as URIs. An example:
200 OK
{
"columns" : [ "p" ],
"data" : [ [ {
"start" : "http://localhost:7474/db/data/node/1",
"nodes" : [ "http://localhost:7474/db/data/node/1", "http://localhost:7474/db/data/node/4", "http://localhost:7474/db/data/node/22" ],
"length" : 2,
"relationships" : [ "http://localhost:7474/db/data/relationship/2", "http://localhost:7474/db/data/relationship/294" ],
"end" : "http://localhost:7474/db/data/node/22"
} ], [ {
"start" : "http://localhost:7474/db/data/node/1",
"nodes" : [ "http://localhost:7474/db/data/node/1", "http://localhost:7474/db/data/node/3", "http://localhost:7474/db/data/node/22" ],
"length" : 2,
"relationships" : [ "http://localhost:7474/db/data/relationship/1", "http://localhost:7474/db/data/relationship/114" ],
"end" : "http://localhost:7474/db/data/node/22"
} ] ]
}

In case you're interested, the above comes from the Dr Who database, running the same query as in my last post, looking for the species of all who the doctor loves. But this time we're returning a path.

Neo4JClient was able to retrieve this, using a class named PathsResult that's relaxing in the Neo4jClient.ApiModels.Cypher namespace. Here's a little mocked up example of using that.

// ... usings
using Neo4jClient.ApiModels.Cypher;

// in the method that's retrieving paths, with query as your fluent Cypher object;
// "p" must match the identifier your query gives the path
var results = query.Return<PathsResult>("p").Results;
// do stuff with your URI based results
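
If the IDs are all you are after, they can be pulled off the end of each URI. The following is only a minimal sketch: PathUriHelper and its methods are hypothetical names of my own, not part of Neo4JClient, and it assumes you already have the URIs as plain strings in the shape shown in the JSON above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Neo4JClient.
public static class PathUriHelper
{
    // Takes a REST URI such as "http://localhost:7474/db/data/node/22"
    // and returns the trailing numeric ID.
    public static long IdFromUri(string uri)
    {
        // Segments splits the URI path into "/", "db/", "data/", "node/", "22".
        var lastSegment = new Uri(uri).Segments.Last().TrimEnd('/');
        return long.Parse(lastSegment);
    }

    // Converts a whole list of node or relationship URIs to their IDs, keeping path order.
    public static List<long> IdsFromUris(IEnumerable<string> uris)
    {
        return uris.Select(IdFromUri).ToList();
    }
}
```

For the last node of the first path above, PathUriHelper.IdFromUri("http://localhost:7474/db/data/node/22") returns 22. That still only gets you IDs, which would each need another round trip to turn into real nodes.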

Chances are, like me, you're not using Neo4J to show collections of REST URLs though, right? So we first need to get around this.

Getting fully represented paths from Neo4J REST

So I asked again on StackOverflow, and it was suggested that I try the EXTRACT function to retrieve the node IDs of the path in my Cypher queries. This worked to get the IDs, so I decided to take it further and started retrieving the nodes themselves using extract, which, to my amazement, got around the REST issues. I used a return statement similar to this:
RETURN EXTRACT(n in nodes(p) : n) AS nodes_in_path;

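As a brief explanation, extract evaluates an expression for each element of a collection in Cypher and returns the results as a new collection. The nodes function, when used on a path, returns just the nodes. Here the expression is simply n, so we get back each node of the path in full rather than as a URI.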

It's not a huge jump to get the relationships as well, using rels, like this:
RETURN EXTRACT(n in nodes(p) : n) AS nodes_in_path, EXTRACT(rel in rels(p) : rel) AS relations_in_path;

You'll now be retrieving 2 columns, containing the nodes and relationships of your path. That's not bad, maybe Neo4JClient can get its data fix from this?

Go Go POCO

The first step is to make a wrapper POCO to handle retrieving paths in this 2 columned format. Here's an example, though you'd replace MyNodeType and MyRelationshipType with your own objects for the expected Nodes and Relations respectively.

public class PathPoco
{
    public List<Node<MyNodeType>> Nodes { get; set; }
    public List<RelationshipInstance<MyRelationshipType>> Relations { get; set; }
}

A little bit of explanation: Each of these PathPocos represents an individual path. The Nodes list contains the nodes; the Relations list, unsurprisingly, contains the relationships. We're getting a Node and RelationshipInstance for each part of the path, so we have access to their reference type information later on, though that is completely up to you. If typing ".Data" to get at the node or relationship goodies gets too irritating, by all means skip the Node<> bit.

Using this, we can get something much more useable in the form of a path from Neo4J.

// using query again as our Cypher fluent object that contains the rest of whatever query we want
var paths = query.Return<PathPoco>("EXTRACT(n in nodes(p) : n) AS Nodes, EXTRACT(rel in rels(p) : rel) AS Relations", CypherResultMode.Projection).Results;

Note: Make sure you give the property names of your PathPoco class as the column aliases, so Neo4JClient knows what to put where.

The Cypher projection part allows you to specify that each column will map to a different property of the returned object, in this case a PathPoco. Nodes to Nodes, Relations to Relations, Face to Face, Back to Back, you get it.

And there, you will have an IEnumerable<PathPoco> containing paths, containing both their nodes and relationships, and these will be nodes and RelationshipInstances of your given types. Remember, there should be 1 more node than relationship, since relationships only exist between the nodes of the path.
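
To make the "one more node than relationship" point concrete, here is a rough sketch of walking a path from start to end. PathWalker and Describe are hypothetical names of my own, not part of Neo4JClient. The sketch is generic, so it should work on the Nodes and Relations lists of a PathPoco as much as on plain strings, as long as you hand it functions that turn a node and a relationship into display text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Neo4JClient.
public static class PathWalker
{
    // Interleaves nodes and relationships in path order:
    // node, relationship, node, ..., relationship, node.
    public static List<string> Describe<TNode, TRel>(
        IList<TNode> nodes,
        IList<TRel> relations,
        Func<TNode, string> nodeLabel,
        Func<TRel, string> relLabel)
    {
        // Relationships only sit between nodes, so a well formed path
        // always has exactly one more node than relationships.
        if (nodes.Count != relations.Count + 1)
            throw new ArgumentException("A path needs exactly one more node than relationships.");

        var steps = new List<string>();
        for (var i = 0; i < relations.Count; i++)
        {
            steps.Add(nodeLabel(nodes[i]));
            steps.Add(relLabel(relations[i]));
        }
        // The final node has no outgoing relationship within the path.
        steps.Add(nodeLabel(nodes[nodes.Count - 1]));
        return steps;
    }
}
```

Given the nodes A, B, C and the relationships r1, r2, Describe returns A, r1, B, r2, C, which you can then join into a string or bind to a UI. With a PathPoco you would pass something along the lines of n => n.Data.ToString() for the nodes, where whatever you read off Data depends entirely on your own node type.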

Now you've got pretty much all you need to do things with your path in a front end of your choosing. Display it on an ASP.NET view, make some nice WPF flashy coloury Window out of it, or simply print it to the console if that's what floats your boat.

Conclusion


With mastery of getting paths and not just nodes or relationships back from Neo, you'll be able to tackle a much broader set of use cases. However, this may have to be adjusted for different scenarios, for example in the likely scenario that your path contains different types of nodes and relationships. One untested suggestion for this case is to have different properties for each type and extract each type of node or relationship in a different matching column. I'd be interested to hear if this works or otherwise.

I hope you find this information as useful and helpful as I did when I discovered it.

Good luck, path finders!

3 comments:

  1. It's great to see that you found my blog post and were able to recommend it to beginners! I may use some of your path finding techniques soon as I flesh out the design of my social graph.

  2. Hi Craig. Al from work here. Thought I'd take Dean's suggestion at face value and comment on you excellent blog. Can't say I understood every word - actually, I did, I just didn't get many of the sentences - but humour me when I tell you that only today I mastered Excel MID functions when extracting post-@ email address strings. Are these vaguely analogous to your paths?

  3. By "you" I of course meant "your". And by "mastered" I meant "stumbled through a thicket of misunderstanding to some moderate success that came about more by accident than design with".


As always, feel free to leave a comment.