Data transformation is an inevitable component of connectivity, as most systems don’t speak the same language. Even when the format is similar, as when two RESTful Web APIs exchange JSON payloads, their message structure typically differs, making translation a necessity.
This is the first post in a four-part series on our new data transformation engine, DataWeave, in which we will help you learn the basics of this elegant and lightweight expression language. We will first focus our attention on the canonical format, which is the immediate result of every expression you execute in DataWeave. In the following posts, we will explore the power of DataWeave expressions and solve some real-world use cases. Our DataWeave intro webinar is also available to view on demand.
The other posts in the series are:
- Getting started with DataWeave Part 2
- Getting started with DataWeave Part 3
- Getting started with DataWeave Part 4
From DataMapper to DataWeave
Those of you familiar with Anypoint Platform will know DataMapper, our first solution for providing a graphical drag-and-drop approach to data transformation. We learned from customer feedback that users were looking for a more powerful transformation engine. With DataWeave, we coupled the transformation engine with the Mule runtime engine to ensure high performance, adopted a language-first approach, and designed the language with a simple JSON-like syntax. As a result, it takes a fraction of the time to write a DataWeave expression compared to implementing a similar use case in DataMapper.
Development Environment
As part of the design-time experience for DataWeave, you can now indicate to DataSense the structure of every message by setting it explicitly in the Metadata tab, which is present in the other Message Processors.
You should also be mindful of the mime-type of the Message data you have to transform. When this is not set explicitly at design time, and when you use any of our Connectors, it defaults to application/java. For HTTP requests and responses, Mule looks at the Content-Type header and sets the mime-type accordingly. You should set the mime-type explicitly when you use the set-variable and set-payload processors. If you fail to do so for XML content, the engine will interpret it as a mere Java string.
Anypoint Studio presents a DataWeave editor with three panes. In the left pane, DataSense displays the structure of the incoming message, together with any example data you have provided. In the right pane, the expected outgoing structure is displayed as per DataSense, as well as the design-time result of your transformation, which refreshes constantly as you write the expression. The middle pane is the transformation pane.
The transformation pane is divided into two sections. The header is where you declare the mime-type of the output of your transformation. You can also declare reusable functions and global variables there, as well as namespaces for XML use cases. At a minimum, you must declare the %dw 1.0 header and the output mime-type. Below the separator line (---) is where you author your expression. You only need one expression. There are a number of expression types, but for the majority of your transformations you will use a semi-literal expression of a DataWeave Object.
The same DataWeave transformer can produce multiple outputs from the same incoming Message. You do this by clicking on the plus circle at the bottom right of the transformation pane. Doing so will present you with another transformation pane, and you are free to specify a different target (flowVars, etc.) by choosing a different Output from the drop-down at the top of the transformation pane.
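As a rough illustration of the two sections of the transformation pane, here is a minimal sketch of a complete script, assuming DataWeave 1.0 syntax. The variable defaultCity, the function label and the output fields are hypothetical names chosen for this example only.

```dataweave
%dw 1.0
%output application/json
%var defaultCity = "London"
%function label(city, day) city ++ " " ++ day
---
// Everything above the separator is the header: the version directive,
// the output mime-type, one global variable and one reusable function.
// Everything below it is the single expression: here, a literal object.
{
  city: defaultCity,
  title: label(defaultCity, "Monday")
}
```

Only the %dw 1.0 and %output lines are mandatory; the %var and %function directives are there to show where such declarations would go.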
Literal Expressions
There are only three data types that can result from any DataWeave expression: simple types (such as strings and numbers), arrays, and objects. Each of these can be expressed literally.
You will typically use semi-literal object expressions to define your transformation. To understand this, first consider an object literal. A DataWeave object is a sequence of key:value pairs. The key is a string without the quotes, and the value can be a simple type, an array, or an object. However, you are not obliged to express the keys and values literally. You can use any expression that returns a string for the key and any expression at all for the value.
There are a number of different expression types available to you. We will explore more of these in the next parts of this series. For now, we concentrate on the combination of literal expressions and variable references.
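The three data types and the idea of a semi-literal object can be sketched as follows. This is an illustrative example assuming DataWeave 1.0 syntax; the keys and values are made up, and the final entry assumes some payload is present on the incoming message.

```dataweave
%dw 1.0
%output application/dw
---
// An object literal whose values cover the three data types
{
  name: "forecast",              // simple type: string
  high: 12,                      // simple type: number
  days: ["Mon", "Tue"],          // array
  location: { city: "London" },  // object
  source: payload                // variable reference instead of a literal
}
```

The last key:value pair is what makes the expression semi-literal: the key is written literally, while the value comes from evaluating a variable reference.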
Variable Reference Expressions
Key to learning DataWeave is an understanding of the normalization of data that occurs whenever variable references are made to the payload or to flowVars, sessionVars, recordVars, inboundProperties or outboundProperties. These expressions convert the incoming data into the canonical DataWeave format. Hence, it makes no difference whether your incoming data is JSON, XML, CSV or Java, as it will always be resolved to DataWeave simple types, arrays or objects. You would do well to think only in terms of DataWeave data types, as these set the context for your expressions and are the result of each expression.
An incoming XML document, for example, is resolved to an object. Declaring the output to be application/dw is a convenient way to learn, because it renders the canonical format directly. Each element in the XML is normalized as a key:value pair in the object. When the content type of an element is complex, its normalized value is itself an object. Repeating elements are normalized as repeating key:value pairs.
An incoming CSV is normalized as an array of objects, with key:value pairs that correspond to the header name and the column value respectively. A semi-literal expression can then wrap that same payload in an object which declares both the forecasts and the city for which they were made.
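To see the normalization for yourself, a minimal sketch (assuming DataWeave 1.0 syntax and hypothetical input data) is to reference the payload on its own and render it as application/dw:

```dataweave
%dw 1.0
%output application/dw
---
// Hypothetical XML input:
//   <forecasts>
//     <forecast><day>Mon</day><high>12</high></forecast>
//     <forecast><day>Tue</day><high>14</high></forecast>
//   </forecasts>
// The result should be an object with one forecasts key whose value is an
// object containing two repeating forecast keys, each holding an object.
payload
```

The semi-literal wrapping of a CSV payload could then look like the sketch below. The column names and the city are again assumptions made for illustration.

```dataweave
%dw 1.0
%output application/dw
---
// Hypothetical CSV input (mime-type application/csv):
//   day,high,low
//   Mon,12,5
//   Tue,14,7
// The payload is normalized as an array with one object per row.
{
  city: "London",     // literal key and literal value
  forecasts: payload  // literal key, value taken from the normalized payload
}
```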
Output Rendering
The DataWeave engine separates the actual transformation process within the canonical format from the final rendering of the result in the output mime-type you defined in the header. It does not matter how complex your expression is. If it is an object with a deeply nested structure, containing expressions and operators which combine expressions, all of these must be executed first and return their values before the outer expression returns its value. It is this final value which is rendered in the output mime-type.
You need to be mindful of the constraints imposed by your choice of mime-type. Arrays will not be rendered in XML; you should repeat keys instead. Likewise, repeating keys cannot be rendered in JSON; you should generate arrays instead. To render an object to XML, in line with the rule that XML documents may only contain one root element, the object may only contain one key:value pair. Its value can, as we have seen, itself be an object of any complexity. A transformation that wraps an array payload in an object with several keys must therefore be modified to render a valid XML document: it needs a single root key, and the array must become repeating keys.
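One way to satisfy both XML constraints is sketched below, assuming DataWeave 1.0 syntax and a payload that is an array of objects (such as a normalized CSV). The key names weather, forecasts and forecast are hypothetical, and the sketch relies on the map operator for iteration, which this post has not introduced.

```dataweave
%dw 1.0
%output application/xml
---
{
  // A single root key, so the document has a single XML root element
  weather: {
    city: "London",
    forecasts: {
      // The parentheses splice the mapped objects into the enclosing object,
      // giving one repeating forecast key per item instead of an array
      (payload map {
        forecast: $
      })
    }
  }
}
```

Switching the output directive to application/json would call for the opposite choice: keep the array as the value of forecasts rather than expanding it into repeating keys.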
In the Next Post…
With these basics in mind, our next post, Getting Started with DataWeave: Part 2, will delve into the powerful Selector expression for object and array navigation and we will look into iteration and conditional logic for dynamic data generation.