Using in-memory database to help with flat file integration

Even today, when integrating with critical backend systems (e.g. order fulfillment system and shipping management system), it is quite common to encounter data being transferred in flat file format. Often the records to be transferred will need to be sorted and de-duped before they can be inserted into another system. For example, there could be multiple update entries for the same item and the sequence of the update calls is important. Here we would like to show you an example of how to leverage Mule’s built-in support for in-memory databases to make it really easy to perform such tasks when working with flat files.

Out of the box, Mule supports HSQL, H2, and Apache Derby. The example here will use Apache Derby but the configuration is essentially the same for all three.

First thing you will need to do, is to configure your JDBC connection. Note that the use of Apache Derby JDBC driver, also, where you put the name of the database (e.g. “blogdemo” in this case.):

<spring:bean id="datasource" name="datasource" class="org.enhydra.jdbc.standard.StandardDataSource" destroy-method="shutdown">
     <spring:property name="driverName" value="org.apache.derby.jdbc.EmbeddedDriver"/>
     <spring:property name="url" value="jdbc:derby:memory:blogdemo"/>
</spring:bean>

In addition, you will need to initialize the database by implementing the Spring InitializingBean interface. When Mule starts, the initialization process will create the tables you plan to use. For our example, we will just create couple tables to illustrate the process:

package org.mule.blogdemo.init;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import org.springframework.beans.factory.InitializingBean;

public class DBInitialization implements InitializingBean {
	
	public void afterPropertiesSet() throws Exception {
		
		String dbURL = "jdbc:derby:memory:blogdemo;create=true";
		Connection conn = null;
		try {
			Class.forName("org.apache.derby.jdbc.EmbeddedDriver").newInstance();
			// Get a connection
			conn = DriverManager.getConnection(dbURL);
			Statement stmt = conn.createStatement();
			stmt.executeUpdate("CREATE TABLE table1 (Name VARCHAR(255), External_ID__c VARCHAR(255), Value__c INTEGER, Last_Modified TIMESTAMP)");
			stmt.executeUpdate("CREATE TABLE table2 (somecolumn VARCHAR(20), somecode VARCHAR(80))");
		} 
		catch (java.sql.SQLException sqle) {
			sqle.printStackTrace();
			throw sqle;
		}
	}
}

With the corresponding element in your flow:

<spring:bean id="dbinitialization" name="dbinit" class="org.mule.blogdemo.init.DBInitialization"/>

These should be global elements of your application:

From here on, you can just use your in-memory database like any regular relational database in your application:

<jdbc-ee:connector name="apache-derby" dataSource-ref="datasource" validateConnections="false" queryTimeout="-1" pollingFrequency="0" doc:name="Database">
     <jdbc-ee:query key="insertRecord" value="INSERT INTO table1 (Name, External_ID__c, Value__c, Last_Modified) VALUES (#[map-payload:name], #[map-payload:externalid], #[map-payload:value], #[map-payload:lastmodified])"/>
     <jdbc-ee:query key="selectRecord" value="SELECT Name, External_ID__c, Value__c FROM table1 ORDER BY Last_Modified"/>
</jdbc-ee:connector>

For example, if your requirement calls for upserting records from a csv input file into Salesforce.com in sequence based on last modified time:

name,externalid,value,lastmodified
order1,1,10,2012-09-22 13:40:00.00
order1,1,1,2012-09-22 10:00:00.00
order1,1,5,2012-09-22 11:00:00.00

With in-memory database, the flow can be simplified down to just a handful of steps:

<flow name="mainFlow" doc:name="mainFlow" processingStrategy="synchronous">
     <file:inbound-endpoint path="src/test/resources" moveToDirectory="src/test/archives" responseTimeout="10000" doc:name="Input File"/>
     <jdbc-ee:csv-to-maps-transformer ignoreFirstRecord="true" doc:name="CSV to Maps" mappingFile="src/main/resources/mules-csv-format.xml"/>
     <foreach doc:name="Foreach">
      	<logger message="#[payload]" level="INFO" doc:name="csv record" />
        <jdbc-ee:outbound-endpoint exchange-pattern="one-way" queryKey="insertRecord" queryTimeout="-1" connector-ref="apache-derby" doc:name="insert-record"/>
     </foreach>
     <jdbc-ee:outbound-endpoint exchange-pattern="request-response" queryKey="selectRecord" queryTimeout="-1" connector-ref="apache-derby" doc:name="select records"/>
     <sfdc:upsert config-ref="salesforce" externalIdFieldName="External_ID__c" type="Demo_Object__c" doc:name="upsert.record.in.salesforce">
         <sfdc:objects ref="#[payload]"/>
     </sfdc:upsert>
</flow>

With the records now stored in a relational database table, all the SQL query functionality can be utilized to process the data to meet your requirements.

Finally, please don’t forget to include Apache Derby library in your project. Support for Derby is out of the box but the driver still needs to be added to the project manually: