Improvements to the Batch Module in Mule 4

batch module mule 4

The Mule 4 Release Candidate is here and, with it, the updated version of the batch module!

The batch module was first introduced in Mule 3.5 and it aims to simplify integration use cases where basic ETL functionality is needed. If you’re unfamiliar with it, I recommend you look at this post before continuing. If you are already old friends with batch, then keep reading for a breakdown of all the changes!

HowTo – Extract, Transform, Load (ETL) and Change Data Capture

Data storage flat icon

We recently introduced our HowTo blog series, which is designed to present simple use-case tutorials to help you as you evaluate Anypoint Platform. The goal of this blog post is to give you a short introduction on how to implement a simple ETL (Extract, Transform, and Load) scenario using Mulesoft’s batch processing module.

Anypoint Platform brings together leading application integration technology with powerful data integration capabilities for implementing such a use case.

Batch Module: Obtaining the job instance id in Mule 3.7

A popular request among users of the Batch module is the ability to grab the job instance id in any of a Batch job’s phases. Why is that useful? Well, there could be a number of useful scenarios:

Mr. Batch and the Quest for the right Threading Profile

Sometimes (more often than we think), less concurrency is actually more. Not too long ago, I found myself in a conversation in which we were discussing non-blocking architectures, tuning, and performance. We were discussing that tuning for those models often starts with “2 threads per core” (2TPC). The discussion made me curious about how Mule’s batch module would perform if tested by 2TPC. I knew beforehand that 2TPC wouldn’t be so impressive on batch, mainly because it doesn’t use a non-blocking threading model.

Handle Errors in your Batch Job… Like a Champ!

Fact: Batch Jobs are tricky to handle when exceptions raise. The problem is the huge amounts of data that these jobs are designed to take. If you’re processing 1 million records you simply can’t log everything. Logs would become huge and unreadable. Not to mention the performance toll it would take. On the other hand, if you log too little then it’s impossible to know what went wrong, and if 30 thousand records failed, not knowing what’s wrong with them can be a royal pain.

Near Real Time Sync with Batch

motif

The idea of this blog post is to give you a short introduction on how to do Real time sync with Mule ESB. We’ll use several of the newest features that Mule has to offer – like the improved Poll component with watermarking and the Batch Module. Finally we’ll use one of our Anypoint Templates as an example application to illustrate the concepts.

What is it?

Near Real time sync is the term we’ll use along this blog post to refer to the following scenario:

“When you want to keep data flowing constantly from one system to another”

Batch Module Reloaded

With Mule’s December 2013 release we introduced the new batch module. We received great feedback about it and we even have some CloudHub users happily using it in production! However, we know that the journey of Batch has just begun and for the Early Access release of Mule 3.5 we added a bunch of improvements. Let’s have a look!

Support for not Serializable Objects

A limitation in the first release of batch was that all records needed to have a Serializable payload.