Super simple data integration with RESTx: An example


Most people who have ever worked on real-world data integration projects agree that at some point custom code becomes necessary. Prefabricated connectors, filters and pipeline logic can only go so far. To top it off, using those prefabricated integration components often becomes cumbersome for anything but the most trivial data integration and processing tasks.

With RESTx, a platform for the rapid creation of RESTful web services, we recognize that custom code will always remain part of serious data integration tasks. As developers, we already know a concise, standardized and very well-defined way to express what we want: the programming languages we use every day! Why should we have to deal with complex, unfamiliar configuration files or UI tools that still restrict what we can do, when it is often so much more concise and simple to just write down in code what we want done?

Therefore, RESTx embraces custom code: writing it and expressing your data integration logic with it is made as simple as possible.

Let me illustrate how straightforward it is to integrate data resources using just a few lines of clear, easy-to-read code.

Background

RESTx uses RESTful resources as the most basic building block for data integration. The data you want is accessible via URIs. If your data lives in a database, in the cloud or behind some proprietary API, you can easily write a specialized component that retrieves it. Once that is done, this data is also available through a URI. Therefore, when we discuss data integration, we assume that the resources you need are already available through URIs provided by RESTx. As we will see, each of those URIs is an easy-to-use building block for further data integration tasks.

RESTx takes care of data formatting between resources. If you access the data from a RESTx resource in your component code – via the accessResource() method – the data you receive is already transformed into an object of your programming language, for example a Map/dict or list.

Example scenario

Assume you are working for a Fortune 500 company that wishes to monitor what is being written about it on the Internet. It’s easy enough to do a Google search on the company’s name. However, the company would like to focus on those results that appear on websites for which it already knows the PR contact.

Assume further that the company maintains a list of those PR contacts in-house. It knows which PR agency takes care of which website, so it can find out who to talk to if an article appears on marketwire.com, for example. This list is maintained in a database somewhere, or maybe in a flat text file; it doesn’t really matter.

Overview of the solution architecture

The task

Create a data resource that displays fresh Google search results, but only those for which the page a result refers to has a known PR contact. Annotate each of these search results with the PR agency’s contact information.

Assumptions

We assume here that we have two resources available, which are provided by RESTx. On one hand, we have a search resource, which returns Google search results for our company name. Such a resource is trivially simple to create with an HTML form provided by RESTx. Then, when we click on this resource’s URI in a browser, RESTx returns a page like this:

Results from the Google search resource

We can see that Google search results are returned as a list of dictionaries/maps. Each dictionary contains a few elements that describe a single search result.

On the other hand, we also assume we already have a resource that returns the PR contacts for specific web sites. If we click on that resource URI, we get the following result:

Data from the PR contact resource

This, we can see, is a dictionary/map, which is keyed on the URI of the web site. Each entry is a further dictionary/map containing information about the PR contact.
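
Since the two resource pages are described here only in words, the following sketch shows roughly what the two data shapes look like as Python objects. The field names follow the integration code and the JSON output shown later in this post; the second search result, the variable names and the shortened text are assumptions made purely for illustration, not actual RESTx output.

```python
# Illustrative shapes only: field names follow the integration code and
# JSON output in this post, the values are sample data.

# The search resource returns a list of dictionaries, one per result.
search_results = [
    {
        "visibleUrl": "en.wikipedia.org",
        "url": "http://en.wikipedia.org/wiki/MuleSoft",
        "content": "MuleSoft, formerly MuleSource, is a provider of software ...",
    },
    {
        "visibleUrl": "www.example.com",  # hypothetical site without a PR contact
        "url": "http://www.example.com/some-article",
        "content": "An article on a site we have no contact for ...",
    },
]

# The PR contact resource returns a dictionary keyed on the web site.
pr_contacts = {
    "en.wikipedia.org": {
        "email": "fsample@sampleprinc.com",
        "lastname": "Frank Sample",
        "organization": "Sample PR, Inc.",
        "phone": "(555) 555-1212",
    },
}

# Checking for a known PR contact is a plain dictionary membership test.
known = [res["visibleUrl"] for res in search_results
         if res["visibleUrl"] in pr_contacts]
print(known)
```

Because the contact dictionary is keyed on the site, no searching or joining logic is needed: a single `in` test decides whether a search result is of interest.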

The integration code

To perform the requested task, we need to accomplish these things:

  1. Retrieve data from the Google search resource.
  2. Retrieve data from the PR contact resource.
  3. Process the search results, filtering out those for which we don’t have a PR contact and combining the PR contacts with the search result for the remaining ones.
  4. Return this information to the client.

Components can be written in Java or Python, with more languages to follow soon. Here, we show a Python example. You can easily create a new component template for yourself with the command:

# restxctl component create ResultPRCombiner python

Once you open the component file, you can add or edit service methods as you like. Assume we create a service method called combined_results(). In good RESTful fashion, a URI should be a noun. Therefore, the service methods in RESTx components are used to implement a sub-resource, rather than an action. Thus, their names should be nouns.

Data from other resources is accessed via the accessResource() method. So, we can now write the following snippet of code in the body of our combined_results() service method:

 1: code, pr_contacts    = accessResource("/resource/PR_contacts/for_websites")
 2: code, search_results = accessResource("/resource/AboutUs/search", params={"num":"50"})
 3:
 4: result = list()
 5: for res in search_results:
 6:     if res['visibleUrl'] in pr_contacts:
 7:         result.append({
 8:             "url"         : res['url'],
 9:             "content"     : res['content'],
10:             "pr_contacts" : {
11:                 "site"    : res['visibleUrl'],
12:                 "contact" : pr_contacts[res['visibleUrl']]
13:             }
14:         })
15: return Result.ok(result)

What’s going on here?

In lines 1 and 2 we retrieve the data from our two pre-defined resources: first the PR contacts and then the Google search results. Note that we pass an additional parameter num=50 to the Google search resource, indicating the number of results we want to receive.

In RESTx those two lines are all we need to retrieve this data. The results pr_contacts and search_results are a dictionary and a list, respectively, which we can work with naturally. RESTx ensures that data is represented in a generic manner that the component code can easily handle.

Using a Python list comprehension, it is possible to write all the remaining code in just a single line. However, for clarity, it is written here in a slightly more verbose manner. result is created as an empty list in line 4. We then iterate over all results in search_results (line 5). We test whether we have a PR contact for that site by checking whether the visibleUrl element of that result appears in our pr_contacts dictionary (line 6). If that is the case, we append a new entry to the result list: a dictionary containing a few elements of the search result, plus the PR contact information. In the end (line 15), the whole result list is returned. Note that this list will automatically be rendered by RESTx as HTML, JSON or another format, depending on the client request.
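
The single-expression variant mentioned above could look roughly like the sketch below. It is wrapped in a plain helper function (combine, a name chosen here for illustration) and fed hand-written sample data so that it runs outside RESTx. Inside a component, the two arguments would come from accessResource() and the resulting list would be passed to Result.ok().

```python
def combine(search_results, pr_contacts):
    """Keep only results with a known PR contact and annotate them."""
    return [
        {
            "url": res["url"],
            "content": res["content"],
            "pr_contacts": {
                "site": res["visibleUrl"],
                "contact": pr_contacts[res["visibleUrl"]],
            },
        }
        for res in search_results
        if res["visibleUrl"] in pr_contacts
    ]


# Made-up sample data standing in for the two resources.
sample_results = [
    {"visibleUrl": "www.marketwire.com", "url": "http://www.marketwire.com/a", "content": "..."},
    {"visibleUrl": "www.example.com", "url": "http://www.example.com/b", "content": "..."},
]
sample_contacts = {"www.marketwire.com": {"organization": "Swish & Sons Super PR, Ltd."}}

combined = combine(sample_results, sample_contacts)
print(len(combined), combined[0]["pr_contacts"]["site"])
```

The comprehension filters and reshapes in one pass; the explicit loop in the component code does exactly the same thing and is arguably easier to read for newcomers.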

If we now create a resource for this new component and visit the URI of the service method, we get the following output:

Output of the combined_results service method

Exactly what was requested! All done in just a few lines of clear and concise code.

Of course, as always with RESTx, these results are also available in other formats. For example, when accessing this resource URI with a client application setting the “Accept: application/json” request header, the resulting list of dictionaries is returned in plain JSON, easy to parse and process:

[
    {
        "content": "MuleSoft, formerly MuleSource, is a provider of software, support, and services ...", 
        "pr_contacts": {
            "contact": {
                "email": "fsample@sampleprinc.com", 
                "lastname": "Frank Sample", 
                "organization": "Sample PR, Inc.", 
                "phone": "(555) 555-1212"
            }, 
            "site": "en.wikipedia.org"
        }, 
        "url": "http://en.wikipedia.org/wiki/MuleSoft"
    }, 
    {
        "content": "Jun 22, 2010 ... MuleSoft, the Web Middleware Company, today announced a partnership with Chariot Solutions...", 
        "pr_contacts": {
            "contact": {
                "email": "loretta@swish-super-pr.com", 
                "lastname": "Loretta Pressrelease", 
                "organization": "Swish & Sons Super PR, Ltd.", 
                "phone": "(444) 555-2121"
            }, 
            "site": "www.marketwire.com"
        }, 
        "url": "http://www.marketwire.com/press-release/MuleSoft-Partners-With-Chariot-Solutions-to-Offer-Mule-ESB-Services-1279797.htm"
    },
    ......
]
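
A client could request the JSON representation roughly as sketched below, using only the Python 3 standard library. The host, port and resource name in RESOURCE_URI are placeholders, and the helper names are chosen for illustration; substitute the URI of the resource you actually created.

```python
import json
import urllib.request

# Hypothetical address: host, port and resource name depend on your setup.
RESOURCE_URI = "http://localhost:8001/resource/MyCombiner/combined_results"


def build_json_request(uri):
    """Create a GET request that asks the server for a JSON representation."""
    return urllib.request.Request(uri, headers={"Accept": "application/json"})


def fetch_combined_results(uri=RESOURCE_URI):
    """Send the request and decode the JSON body into Python objects."""
    with urllib.request.urlopen(build_json_request(uri)) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, assuming a server is running at RESOURCE_URI:
#     for entry in fetch_combined_results():
#         print(entry["pr_contacts"]["site"], entry["url"])
```

The only difference from a browser request is the Accept header; the resource URI stays the same for every output format.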

Integration with modular resources: Resilient to change

Did you notice that the integration code retrieved Google search results by referring to the /resource/AboutUs resource? It did not access the Google search API directly. Instead, it merely took advantage of a previously created RESTful resource, which might also be used in other contexts.

So, what happens when the company changes its name? Or when someone figures out a better Google query, one that excludes false hits, for example? In that case, only the AboutUs resource needs to be updated. The filter and annotation component we created here, and any resources derived from it, remain completely unaffected by this change: we just continue to use the AboutUs resource, which now happens to be new and improved.

With RESTx each created RESTful resource can act as a building block for further integrations and resources. But change does not have to percolate all the way up. The first line of resources can already deal with it, effectively isolating and protecting higher integration levels, user bookmarks and client applications.
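
The isolation effect can be imitated in a few lines of plain Python. In this sketch the resources dictionary and access_resource() are hypothetical stand-ins for a RESTx server and its accessResource() call, not RESTx API, and the two about_us functions represent the same resource before and after an improved query.

```python
# Stand-in for a RESTx server: maps resource URIs to functions.
# This registry and access_resource() are hypothetical helpers, not RESTx API.
resources = {}


def access_resource(uri):
    """Mimic the (status code, data) pair returned by a resource access."""
    return 200, resources[uri]()


def about_us_v1():
    return [{"visibleUrl": "www.marketwire.com", "url": "http://www.marketwire.com/old"}]


def about_us_v2():
    # Improved query behind the same URI; the shape of the result is unchanged.
    return [{"visibleUrl": "www.marketwire.com", "url": "http://www.marketwire.com/new"}]


def result_urls():
    """Higher-level code that only knows the URI, not how results are produced."""
    code, results = access_resource("/resource/AboutUs/search")
    return [res["url"] for res in results]


resources["/resource/AboutUs/search"] = about_us_v1
before = result_urls()

resources["/resource/AboutUs/search"] = about_us_v2  # only the lower resource changes
after = result_urls()

print(before, after)
```

The higher-level function is never edited, yet it picks up the improved results, because its only dependency is the URI and the shape of the data behind it.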

Conclusion

Writing integration code with RESTx is very straightforward. Other resources are easy to access with a single function call, and whatever data a RESTx resource returns is always automatically converted to an object you can directly process in the component’s programming language.

Likewise, output of a service method is just a native language object, which is converted as requested by the client, or passed as-is if the resource is accessed from other component code.

We believe that writing this kind of integration code in most cases is quicker, simpler, more compact, easier to maintain and much more flexible than trying to accomplish something similar merely with configuration files or even a UI-based integration designer.

We invite you to download RESTx and give it a try. Our quick start guide will have you up and running in 5 minutes.


