Using Mule for ETL

June 18 2009

The Mule framework provides all the extract/transform/load (ETL) tools you need for connecting to data sources, extracting and transforming data, and passing it along on any number of channels.

For example, you can poll JDBC data sources and the file system, use a transformer to convert the data between formats, and then send it to a destination such as a database, file, queue, email, or FTP server. Mule ships with several predefined transformers, but it’s also simple to create your own. You can use filters to discard invalid messages or to consume only a certain type of file in the input folder. You can even split one message into several, such as splitting a file into individual rows.
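
The filtering and splitting steps described above can be pictured with a minimal plain-Java sketch. It deliberately uses no Mule classes: the class and method names (`FilterAndSplitSketch`, `matchesWildcard`, `splitIntoRows`) are hypothetical and only illustrate the idea of accepting files by a wildcard pattern such as `*.xml` and breaking one file's contents into separate rows. In a real Mule application the built-in filters and splitters would normally do this work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustration only: hypothetical helpers, not part of the Mule API.
class FilterAndSplitSketch {

    // Returns true when fileName matches a simple wildcard pattern such as "*.xml".
    // Only '*' is treated as a wildcard; all other characters match literally.
    static boolean matchesWildcard(String pattern, String fileName) {
        String[] parts = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*"); // each '*' becomes "any run of characters"
            }
            regex.append(Pattern.quote(parts[i]));
        }
        return Pattern.matches(regex.toString(), fileName);
    }

    // Splits the contents of one file into rows, skipping blank lines,
    // so that each row can travel on as its own message.
    static List<String> splitIntoRows(String fileContents) {
        List<String> rows = new ArrayList<>();
        for (String line : fileContents.split("\\r?\\n")) {
            if (!line.trim().isEmpty()) {
                rows.add(line);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(matchesWildcard("*.xml", "orders.xml"));
        System.out.println(splitIntoRows("a,1\nb,2\n"));
    }
}
```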

A recent webinar by MuleSource core developers Dirk Olmes and Travis Carlson walks you through the example of extracting data from a file, transforming it using a custom transformer, splitting it into separate lines, and then inserting each line into a database. It then shows how to do the same thing in reverse, pulling data from a database, transforming it, and saving it to a file. The webinar includes details on parsing character delimited fields, writing a transformer to generate a Map, and sending Maps to endpoints.
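
To give a feel for the "character-delimited fields to Map" step, here is a rough plain-Java sketch. It is not the code from the webinar and does not use the Mule transformer API; the class name `DelimitedLineToMapSketch`, the method `toMap`, and the column names in the example are all assumptions made for illustration. In Mule, logic like this would sit inside a custom transformer, and the resulting Map would be passed on to an endpoint.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Illustration only: a hypothetical helper, not the webinar's transformer.
class DelimitedLineToMapSketch {

    // Converts one delimited line into a Map keyed by column name.
    // The delimiter is matched literally, and empty trailing fields are kept.
    static Map<String, String> toMap(String line, String delimiter, String[] columnNames) {
        String[] fields = line.split(Pattern.quote(delimiter), -1);
        if (fields.length != columnNames.length) {
            throw new IllegalArgumentException(
                "Expected " + columnNames.length + " fields but found " + fields.length);
        }
        // LinkedHashMap keeps the columns in their original order.
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < columnNames.length; i++) {
            row.put(columnNames[i], fields[i].trim());
        }
        return row;
    }

    public static void main(String[] args) {
        String[] columns = {"id", "name", "price"}; // assumed column names
        System.out.println(toMap("1|Widget|9.99", "|", columns));
    }
}
```

Rejecting lines with the wrong number of fields is one possible design choice; a filter placed earlier in the flow could instead drop such lines before they reach the transformer.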

You can view the 38-minute webinar and download the complete example files here.

Note: the documentation links above require login, but registration is free and only takes a few moments.



6 Responses to “Using Mule for ETL”

  1. Is it possible to use wild cards in the file address path if you are trying to monitor multiple sub-directories? For example instead of having to use multiple endpoint definitions:

    <endpoint
    address="file:///parentdirectory/subdirectory1"/>

    <endpoint
    address="file:///parentdirectory/subdirectory2"/>

    use something like:

    <endpoint
    address="file:///parentdirectory/*"/>

  2. You have to specify each directory separately. However, wildcards are supported in filenames when using the wildcard filter to filter out incoming files and when specifying the filename pattern to use when writing files. See the file transport page (https://docs.mulesoft.com) for details.

  3. Thank you, Jackie, for your quick response, even though it was not the answer I was hoping for – we are using the wild card feature for monitoring file names in a single directory such as *.xml and that works great!

    Would it be possible to request support for wild cards within the file address path in a future release? If you are monitoring 99+ subdirectories, that would be much easier than defining each one, and no code changes would be required when new sub-directories are added.
    regards,
    Bill

  4. Good suggestion! I’ve created request MULE-4415 in JIRA, our public issue-tracking system (http://www.mulesource.org/jira/browse/MULE).

    Thanks,
    Jackie

  5. Hi, I wanted to download the example code for the demo. When I click on the link, I’m being redirected to a new page. Can you please post a new link for the example code?

    this is the link that doesn’t work for me:
    https://www.mulesoft.com/lp/whitepaper/api/etl-elt-tool-data-integration

    • Apologies, Kevin. You may view the webinar at and download the example code at We’ve updated this post with the correct URL.