How to untie multilevel structures with DataWeave recursive calls

August 10 2020

0 comments

It’s rare for developers to work with flat data structures — instead we often work with multilevel data structures. Normally, XML uses multiple layers of hierarchy. We need to perform changes on all the levels, without knowing how deep into the hierarchy we need to process the entire structure.

Developers typically use recursive calls to solve these types of problems. In this blog post, we will see how to implement simple recursive calls using DataWeave. If you’re interested in learning more about DataWeave, attend our instructor-led training course Anypoint Platform Development: DataWeave 2.0

Recursion in DataWeave

To demonstrate the importance of the recursive call let’s solve a simple task.

We have a multi-layer (six levels deep) numerical array that we want to increment every number on every level by two:

Let’s consider for a moment how we might approach solving this non-trivial task, that’s assuming we don’t know how deep this structure might be.

The structural solution we might consider could be as follows: 

  1. Start iterating over each element in the array.
  2. Check if it is a number.
    1. If so, increment it by two.
    2. Otherwise, pass the element (which is an array in our case) and send it to the beginning of the second step.

The repetition of step two, on each element of the arrays hierarchy, is an implementation of a recursive call.

Let’s now look at how to implement this structural solution using DataWeave code.

DataWeave recursion implementation

To implement recursion in DataWeave we need to create a function that defines the code to apply at each recursive level. Let’s call this function recursiveChange

Within the function, we will define a condition to check if the element is a number. If so, we will increment it by two, otherwise we call the same function again and pass it the element: 

This function is a recursive function that once called with the initial array will recursively iterate over all elements in the array and apply the addition of two to all numerical elements. Next, execute the recursiveChange function:

Figure 1: Execution of the recursiveCharge function

Note that we are assuming the only possible elements in the array are either a number or another array. For more complex data structures, further type checking may be desired.

Recursion depth limit

At this point, you might be asking yourself “How deep can we go and are there any hard limits?” Let’s write a function that demonstrates the limit of DataWeave recursion.

The  function will calculate the sum of a consecutive number series: 

In this calculation, we are looking to find the value of n. The simplest way to implement this is to count down from an input value — and through trial and error — find the limit: 

Let’s try this with a limit of 10.

 Fig. 2 Recursive sum of 10 succeeds

The function executes without issue, so we can conclude that DataWeave can handle a depth of 10 levels. 

Are there any boundaries here? Can we reach a limit of recursive depth? Let’s try 254. Again the function executed without issue:

Fig. 3: Recursive sum of 254 succeeds

Let’s try 255.

 Fig. 4. Stack Overflow exception is thrown with a depth of 256

As we can see from the exception thrown, there is a maximum recursive depth of 256 iterations.

Configure the recursion depth limit

Some questions remain, “Is 256 the absolute limit of the recursion?” and “Can we configure a higher limit?” The answer to both these questions is yes, there are two configuration options available:

  1. Modify the Mule runtime parameter com.mulesoft.dw.stacksize to a higher number. This will require the involvement of operations engineers and may affect other running services. This option is only available where the customer hosts the runtime using a deployment strategy that permits access to the Mule runtime.
  1. Use computer science concepts called “tail recursion.”

DataWeave tail recursion implementation

The idea is to do a “tail call.” This is a subroutine call performed as the final action of a procedure. This is a common concept used by multiple programming languages (Lisp uses it as a compilation technique) to resolve issues like the one we discovered earlier. Tail recursion also exists in DataWeave and can be used instead of the classical recursive call, but it will require some modification to our function’s code.

To modify our example to use a tail recursion, we have to make a subroutine call be in invoke at the very end of the function:

If we call the tailRecursiveSum function with the value 5 it will process as follow:

Fig. 5: Iteration steps of the tail recursion function

When the DataWeave interpreter analyzes this structure, it interprets it as a simple iteration and therefore it will not use a stack to maintain the results of each recursive call, therefore meaning we will not hit any stack boundaries limits.

To provide this, here is an example of a recursive execution with a depth of 10 million:

Fig 6. Tail recursion processing to 10 million records

Summary

In this blog post, we’ve seen just one of many optimization examples that can be performed with DataWeave.

You can learn more DataWeave best practices in the instructor-led Anypoint Platform Development: DataWeave 2.0 course available online or in the classroom. Please visit our training homepage for more information about class availability and to discover other classes we have on offer. 



We'd love to hear your opinion on this post