Don’t let your DataMapper streaming get out of control


As Nicolas pointed out in “7 things you didn’t know about DataMapper”, it’s not a trivial task to map a big file to some other data structure without eating up a lot of memory.

If the input is large enough, you might run out of memory: either while mapping, or while processing the data in your flow-ref.

Enabling the “Streaming” option in DataMapper makes this a lot easier (and more efficient!).

just enable "streaming"

But just doing this doesn’t let you decide how many records at a time get passed on to the next processor: in the worst-case scenario, you might end up processing just one line at a time. If your next processor is a database, you will end up with as many queries as there are lines in your file.

There is, however, a little trick to gain fine-grained control over how many lines are processed at once: setting the batchSize property of the foreach:
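Something along these lines, building on the sketch above (the batch size of 100 is an arbitrary value, and processRecords is a hypothetical sub-flow):

<!-- Wrapping the downstream step in a foreach with a batchSize means
     each iteration receives a list of up to 100 mapped records
     instead of a single one -->
<foreach batchSize="100" doc:name="For Each">
    <flow-ref name="processRecords" doc:name="Process records"/>
</foreach>

With a database as the next step, this lets you issue one bulk operation per batch of 100 records instead of one query per line.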

If you want to see this in action, go grab the example app, import it into Studio, and start playing around 😉

