Explore the graph with Mule and Neo4j


Mule has very extensive support for data stores, covering pretty much the whole spectrum of what’s available out there, from key/value stores to document-oriented databases. The only missing piece of the puzzle was connectivity to a graph database: with the introduction of the Neo4j connector, that gap is now closed.

Popularized by the advent of social media, the need for efficiently storing, indexing, traversing and querying graphs of objects has become prominent in less than a decade. During this time, Neo4j has risen to the number one graph database on the market, with successful deployments across all types of industries and a strong commitment to open source.

The new connector, presented in this blog post, allows Mule users to leverage the incredibly rich API that Neo4j offers through convenient configuration elements. Read on to discover a simple example built with this connector.

Prerequisites

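In order to follow this example you’ll need Mule Studio, a Neo4j server running locally, and the neo4j-spatial plugin installed on that server (the query used below relies on a spatial index).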

Loading sample data in Neo4j

In this example, we will use the dataset that Neo4j made public for the Hubway Data Challenge (Hubway is a bike sharing service which is currently expanding worldwide). We will expose a Cypher query as a basic HTTP resource: this query will let the caller retrieve the nearby Hubway stations and how “hot” they are (i.e. how many trips started or ended at these stations).

  1. Download the dataset: hubway_data_challenge_boston.zip
  2. Stop your Neo4j server
  3. If you want to keep it, back up your existing data, i.e. the /path/to/neo/data directory
  4. Extract the zip file into /path/to/neo/data/graph.db
  5. Start the server again
  6. Browse the Neo4j dashboard and confirm you have ~500K nodes loaded as shown below:

Getting Mule Studio Ready

The Neo4j module doesn’t come bundled with Mule Studio so we have to install it first. For this, we have to do the following:

  1. Open Mule Studio and, from the “Help” menu, select “Install New Software…”. The installation dialog (shown below) opens.
  2. From the “Work with” drop-down, select “MuleStudio Cloud Connectors Update Site”. The list of available connectors is displayed.
  3. Find and select the Neo4j module in the list of available connectors.
  4. Once the Neo4j module is selected, click the “Next” button. Installation details are shown on the next page. Click “Next” again and accept the terms of the license agreement.
  5. Click the “Finish” button. The Neo4j module is downloaded and installed into Studio. You’ll need to restart Studio to complete the installation.

Setting up the project

Now that we’ve got Mule Studio up and running, it’s time to work on the Mule application. Create a new Mule project by clicking on “File > New > Mule Project”. In the new project dialog box, the only thing you are required to enter is the name of the project: use “neo4j-example” or similar. You can click on “Next” to go through the rest of the pages.

The first thing to do in our new application is to configure the Neo4j connector to connect to our local server.

We assume that you have neither added security (password protection) to the server nor changed the default port it listens on.

For this, in the message flow editor, click on the “Global Elements” tab at the bottom of the page. Then click on the “Create” button at the top right of the tab. In the “Choose Global Element Type” dialog box that opens, select “Neo4j” under “Cloud Connectors” and click “OK”. In the Neo4j configuration dialog box that follows, set the name to “Neo4j”.

You are done with the configuration. Click “OK” to close the dialog box.

Building the HTTP service flow

It’s time to start building the flow that will expose a Cypher query over HTTP (learn more about the Cypher query language).

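Start by dropping an HTTP element on the visual editor and configure it as below, so that it serves the address used by the test URL later in this post (http://localhost:8081/hubway/hotspots):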

The Cypher query we want to run is the following:

START n=node:locations('withinDistance:[#[message.inboundProperties.lat],#[message.inboundProperties.lon], 0.5]')
MATCH (t)-[:`START`|END]->n
RETURN n.stationId, n.name, count(*)
ORDER BY count(*) DESC

This query selects the Hubway stations 500 meters or less from the coordinates provided via two HTTP query parameters (lat and lon, extracted with embedded MEL expressions), and sorts them by the number of trips that started or ended there. This way, “hot” stations come first in the list.
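To make the substitution step concrete, here is a minimal Python sketch of what the two embedded expressions achieve: the lat and lon values are placed into the query text before it is sent to Neo4j. This is only an illustration, not how Mule evaluates MEL internally. The function name build_hotspot_query and the numeric validation are assumptions added for this sketch; the post itself passes the parameters through unchecked.

```python
def build_hotspot_query(lat, lon, radius=0.5):
    """Return the Cypher text with validated coordinates substituted in.

    Hypothetical helper: converting to float first means that only plain
    numbers can ever end up inside the query string.
    """
    lat_f, lon_f, radius_f = float(lat), float(lon), float(radius)
    if not -90.0 <= lat_f <= 90.0:
        raise ValueError(f"latitude out of range: {lat!r}")
    if not -180.0 <= lon_f <= 180.0:
        raise ValueError(f"longitude out of range: {lon!r}")
    if not 0.0 < radius_f < float("inf"):
        raise ValueError(f"radius must be a positive finite number: {radius!r}")
    return "\n".join([
        # Same four lines as the query above, with the MEL placeholders filled in.
        f"START n=node:locations('withinDistance:[{lat_f!r},{lon_f!r}, {radius_f!r}]')",
        "MATCH (t)-[:`START`|END]->n",
        "RETURN n.stationId, n.name, count(*)",
        "ORDER BY count(*) DESC",
    ])


if __name__ == "__main__":
    # Coordinates taken from the test URL used later in this post.
    print(build_hotspot_query("42.353412", "-71.044624"))
```

Rejecting anything that is not a number is a design choice worth considering whenever caller-supplied values are concatenated into a query string.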

To run this query, drop one Neo4j element from the “Cloud Connectors” section of the Studio palette into the flow. Edit its properties as shown below:

We want to return the results as a JSON array of objects, each with id, name and heat members. For this we use an expression transformer to build a list of maps that we will then serialize as JSON. This simple MEL projection performs the transformation:

(['id':$.get(0),'name':$.get(1),'heat':$.get(2)] in payload.data)

So drop an expression transformer in the flow, right after the Neo4j query element, and configure it as shown below:
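Separately from the Studio configuration, the effect of the MEL projection can be sketched in a few lines of Python. The sketch assumes the query result is a map whose data entry is a list of rows, each row holding the station id, name and trip count in that order, which is what the payload.data and $.get(n) accesses in the expression suggest. The function name project_rows and the sample rows are placeholders, not real Hubway data.

```python
import json


def project_rows(cypher_result):
    """Turn each positional result row into a map with id, name and heat keys."""
    return [
        {"id": row[0], "name": row[1], "heat": row[2]}
        for row in cypher_result["data"]
    ]


if __name__ == "__main__":
    # Placeholder rows shaped like the three RETURN columns of the query.
    sample_result = {
        "columns": ["n.stationId", "n.name", "count(*)"],
        "data": [[1, "Station A", 10], [2, "Station B", 3]],
    }
    # Serializing the list of maps mirrors the "Object to JSON" step.
    print(json.dumps(project_rows(sample_result)))
```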

Finally, drop an “Object to JSON” transformer in the flow, after the expression transformer. Your flow should look very much like this:

Your full XML configuration should be very much like this one:

Testing the application

Now it’s time to test the application. Run the application in Mule Studio using Run As > Mule Application.

Then browse to http://localhost:8081/hubway/hotspots?lat=42.353412&lon=-71.044624 in your favorite browser. You should see a JSON array of objects, each with id, name and heat members, with the “hottest” stations first.
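If you would rather call the endpoint from a script than from a browser, a small helper along these lines could be used. It is a sketch under the assumption that the application is running locally on the address used in the test URL; the function names build_hotspots_url and fetch_hotspots are hypothetical. Building the query string with urlencode also avoids the mangled “&” separator that copy-pasting a URL from a web page can introduce.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Address taken from the test URL in this post; adjust if your HTTP element differs.
DEFAULT_BASE_URL = "http://localhost:8081/hubway/hotspots"


def build_hotspots_url(lat, lon, base_url=DEFAULT_BASE_URL):
    """Return the test URL with lat and lon encoded as query parameters."""
    return f"{base_url}?{urlencode({'lat': lat, 'lon': lon})}"


def fetch_hotspots(lat, lon, base_url=DEFAULT_BASE_URL, timeout=10):
    """Call the running Mule application and return the decoded JSON body."""
    with urlopen(build_hotspots_url(lat, lon, base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the application running, fetch_hotspots("42.353412", "-71.044624") should return the same list that the browser displays.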

This example covers only one operation of the Neo4j connector, running a Cypher query. There are many more operations available that allow you to create, update and delete nodes and relationships, to traverse these relationships and run graph algorithms. Now that you’ve taken your first steps with this new connector, go on and explore further the world of graph databases!

 

 




7 Responses to “Explore the graph with Mule and Neo4j”

  1. […] Read the full Article […]

  2. When I try to run this example, I get the following stack trace:
    Root Exception stack trace:
    org.mule.api.DefaultMuleException: Received status code: 400 but was expecting: [200]
    at org.mule.modules.neo4j.Neo4jConnector.sendHttpRequest(Neo4jConnector.java:595)
    at org.mule.modules.neo4j.Neo4jConnector.sendRequestWithEntity(Neo4jConnector.java:529)
    at org.mule.modules.neo4j.Neo4jConnector.postEntity(Neo4jConnector.java:463)
    Is there anything I am missing here?

  3. What happens when you run the cypher query directly in the web console? Use 42.353412 and -71.044624 for the lat and lon parameters.

  4. Also be aware that in the above text, WordPress replaces & with &amp; in the test URL. This is obviously wrong. The valid test URL is: http://localhost:8081/hubway/hotspots?lat=42.353412&lon=-71.044624

  5. I tried running the Cypher query and got the following stack trace:

    No index provider ‘spatial’ found. Maybe the intended provider (or one more of its dependencies) aren’t on the classpath or it failed to load.

    StackTrace:
    org.neo4j.kernel.IndexManagerImpl.getIndexProvider(IndexManagerImpl.java:90)
    org.neo4j.kernel.IndexManagerImpl.getOrCreateNodeIndex(IndexManagerImpl.java:314)
    org.neo4j.kernel.IndexManagerImpl.forNodes(IndexManagerImpl.java:300)
    org.neo4j.kernel.IndexManagerImpl.forNodes(IndexManagerImpl.java:294)
    org.neo4j.cypher.internal.spi.gdsimpl.GDSBackedQueryContext$$anon$1.indexQuery(GDSBackedQueryContext.scala:87)
    org.neo4j.cypher.internal.executionplan.builders.IndexQueryBuilder$$anonfun$getNodeGetter$2.apply(IndexQueryBuilder.scala:83)
    org.neo4j.cypher.internal.executionplan.builders.IndexQueryBuilder$$anonfun$getNodeGetter$2.apply(IndexQueryBuilder.scala:81)
    org.neo4j.cypher.internal.pipes.matching.MonoDirectionalTraversalMatcher.findMatchingPaths(MonodirectionalT

  6. Got your example working by installing the neo4j-spatial plugin

  7. Good catch: the prerequisites section was missing from the blog post. Thank you.