Mule 4 Beta is out and we have a completely new File Connector, which replaces the old transport. This connector greatly enhances Mule’s capability to handle files and folders on a locally mounted File System.
Its main features include:
– The ability to read files or fully list directories’ contents on demand, unlike the old transport (which only provided a polling inbound endpoint)
– Top-level support for common File System operations such as copying, moving, renaming, deleting, creating directories, and more
– Support for locking files on the file system level
– Advanced file matching functionality
This connector was designed to be fully consistent with the FTP Connector. The same set of operations is available on both connectors in Anypoint Exchange, and they behave almost the same (the new FTP connector will be covered in an upcoming post!).
The File Connector does not strictly require a configuration, although it’s good practice to define one. The most important config parameter is the working directory, or working dir. It is the path to a directory that is treated as the root of every relative path used with this connector. If the path is not provided, it defaults to the value of the user.home system property. If that system property is not set, the connector fails to initialize.
The config also allows users to specify the default encoding to use when writing files. This defaults to the Mule runtime’s default encoding. If you use any of the connector’s operations without referencing a configuration, the operation falls back to these default values.
Here’s what an example config looks like:
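A minimal sketch of such a config, written as a fragment for a Mule application file that already declares the file namespace. The element and attribute names (workingDir, defaultWriteEncoding) are given as I recall them and should be checked against the connector’s technical reference; the config name and the ${workingDir} property are placeholders.

```xml
<!-- Hypothetical config name "file"; ${workingDir} is a placeholder property -->
<file:config name="file" defaultWriteEncoding="UTF-8">
    <!-- Every relative path used with this config resolves against workingDir -->
    <file:connection workingDir="${workingDir}" />
</file:config>
```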
One of the most requested features for the new connector is the ability to read a file at any point in the flow, unlike the old transport, which could only read files as a result of inbound endpoint polling.
The syntax to read a file is:
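A hedged sketch of the read operation, assuming the path attribute name used by the connector; the file path itself is purely illustrative. No config is referenced, so the relative path resolves against the default working directory.

```xml
<!-- Illustrative path; relative paths resolve against the working directory -->
<file:read path="orders/incoming.xml" />
```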
The processor above reads the file in the given path. It returns a MuleMessage made up of:
- An InputStream as the payload
- A FileAttributes instance as the message attributes
If the file does not exist, you will get a FILE:ILLEGAL_PATH error.
The supplied path MUST point to an actual file. It’s not possible to read a directory; in that case, you will also get a FILE:ILLEGAL_PATH error.
Mime Types and encoding
The File Connector does its best to auto-determine a file’s mimeType based on its extension. However, there are cases in which that best guess is not enough and the user has first-hand knowledge of the file’s content. In such cases, you can force the mimeType to a particular value by using the outputMimeType parameter.
The same applies to encoding. By default, the connector assumes that the file’s encoding matches the runtime’s default encoding. You can override this by using the outputEncoding parameter.
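As a sketch, forcing both values on a read could look like the fragment below. The outputMimeType and outputEncoding parameters are the ones named above; the file name and the chosen values are made up for illustration.

```xml
<!-- The .dat extension gives no hint, so the type and encoding are stated explicitly -->
<file:read path="exports/customers.dat"
           outputMimeType="application/json"
           outputEncoding="ISO-8859-1" />
```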
Why MimeTypes matter
With DataWeave as the default expression language in Mule 4 Beta, DataWeave expressions are embeddable inside operations that generate payloads and other values. Having the correct mimeType set helps DataWeave auto-assign types and, in turn, also generate the correct outputs, while improving the user experience by maximizing the use of DataSense’s functionality.
The File Connector now supports placing a file system lock on a file while it is being read. To do so, set the lock parameter to true (it defaults to false). When enabled, this feature asks the operating system to lock the file, which prevents any other process (or Mule flow) from accessing that file while the lock is held. The lock is automatically released when one of the following happens:
- The Mule flow that locked the file ends
- The file content has been fully read
Also, take into account that the file might already be locked by somebody else, in which case the connector won’t be able to lock it and you will get a FILE:FILE_LOCK error.
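One possible way to combine locking with handling of the FILE:FILE_LOCK error is sketched below. The flow name, path, and log message are hypothetical; the try scope and on-error-continue handler are standard Mule 4 elements, used here only as one option among several.

```xml
<flow name="readWithLock">
    <try>
        <!-- Ask the operating system to lock the file while it is read -->
        <file:read path="orders/incoming.xml" lock="true" />
        <error-handler>
            <!-- Reached when another process already holds the lock -->
            <on-error-continue type="FILE:FILE_LOCK">
                <logger level="WARN" message="File is locked by another process, skipping it" />
            </on-error-continue>
        </error-handler>
    </try>
</flow>
```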
In this example, we will list the contents of a folder and handle regular files and subdirectories differently. We do so by using the list operation, which lists all the files and folders in a given Directory Path. This path can be absolute or relative. If the path is relative, it is resolved against the Config’s Working Directory. The list operation returns a List of messages, where each message represents an item in the directory.
By default, the list operation only lists the items directly under the given Directory Path; it does not descend into subdirectories or read any file inside them. To enable recursive listing, set the Recursive parameter to true. When a subdirectory is found and recursive is true, the files contained in that subdirectory are listed immediately after the subdirectory itself.
In combination with the file matcher, this capability makes it possible to use this connector in tandem with other Mule elements, such as the <scheduler>, to implement “watermark-like” use cases.
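A sketch of listing a folder and branching on the item type might look as follows. It assumes the directoryPath and recursive attribute names and a boolean directory field on FileAttributes; the flow names and the folder are hypothetical, and processDirectory stands for a flow you would define yourself.

```xml
<flow name="listDropFolder">
    <!-- recursive defaults to false; it is spelled out here for clarity -->
    <file:list directoryPath="dropFolder" recursive="false" />
    <foreach>
        <choice>
            <!-- Assumes FileAttributes exposes a boolean "directory" field -->
            <when expression="#[attributes.directory]">
                <flow-ref name="processDirectory" />
            </when>
            <otherwise>
                <logger message="#['Found file ' ++ attributes.path]" />
            </otherwise>
        </choice>
    </foreach>
</flow>
```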
Another common scenario, when listing files, is the need to filter files that match certain criteria.
For that, a <file:matcher> element exists. This element defines the possible criteria that can be used to either accept or reject a file. This is how you can define the matcher:
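The fragment below sketches a matcher with a handful of criteria. The attribute names are given as I recall them from the connector’s documentation and the list is not exhaustive; the patterns, timestamps, and sizes are made-up values.

```xml
<!-- Illustrative values only; check the technical reference for the full attribute list -->
<file:matcher
    filenamePattern="a?*.{htm,html,pdf}"
    createdSince="2015-06-03T13:21:58+00:00"
    updatedUntil="2015-07-03T13:21:58+00:00"
    minSize="0"
    maxSize="1024" />
```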
All of the attributes above are optional and are ignored if not provided. The ones you do provide are combined with an AND operator, so a file must satisfy every one of them to match.
A good thing about the file matcher is that it can be defined either as a named, reusable top-level element or as an inner element private to a particular component.
Here are the two ways of using it:
1. Example of top level, reusable matcher
2. Example of inner, not reusable, matcher
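Both styles can be sketched as follows, assuming the list operation accepts a matcher attribute that references a named matcher and also accepts a nested file:matcher element. The matcher name, flow names, folder, and criteria are hypothetical.

```xml
<!-- 1. Top-level matcher, reusable by name from any operation -->
<file:matcher name="smallTextFiles" filenamePattern="*.txt" maxSize="100" />

<flow name="listWithSharedMatcher">
    <file:list directoryPath="smallfiles" matcher="smallTextFiles" />
</flow>

<!-- 2. Inner matcher, visible only to this list operation -->
<flow name="listWithInnerMatcher">
    <file:list directoryPath="smallfiles">
        <file:matcher filenamePattern="*.txt" maxSize="100" />
    </file:list>
</flow>
```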
The list operation leverages the new repeatable streams functionality introduced in Mule 4 Beta. It returns a list of messages, each one representing one of the found files. Each of those messages holds a stream to the found file, and that stream is repeatable by default.
For more information, read about how automatic streaming works in Mule 4 Beta.
The write operation writes a given Content into the given Path on demand. In principle, this seems pretty straightforward, but it includes a number of usability features to help with the most common use cases.
Embedded DW transformations
When used in its default form, the connector will write whatever is in the message payload:
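In its simplest form this might look like the fragment below, assuming the path attribute name; the target file name is illustrative.

```xml
<!-- Writes the current message payload as-is -->
<file:write path="output.csv" />
```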
But what happens if the payload is not in CSV format and you actually need to make a transformation? Previously, you had to place a DW transformation before the write operation, which changed the message payload and therefore affected every operation placed after the write.
To avoid this undesired impact, you can now place the transformation inside the write operation:
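A sketch of an embedded transformation, assuming the write operation takes a nested file:content element holding a DataWeave expression. The customers and email fields are hypothetical and stand in for whatever your payload contains.

```xml
<file:write path="output.csv">
    <!-- The expression only produces the content to write; the message payload is left untouched -->
    <file:content>#[output application/csv --- payload.customers map { email: $.email }]</file:content>
</file:write>
```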
Now, the transformation can be used for generating the content that will be written and has no side effect on the message in transit.
Writing into directories
Consider the following:
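A write into nested directories could be sketched like this. The a/b/c directories are the ones discussed in this section, while the file name is made up; createParentDirectories is shown set to true so that missing directories are created.

```xml
<!-- Without createParentDirectories="true", a missing a, b, or c directory makes the write fail -->
<file:write path="a/b/c/myFile.txt" createParentDirectories="true" />
```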
If any of the a, b, or c directories does not exist, this operation fails by default. However, if you set createParentDirectories to true, the connector automatically creates any missing directories.
Writing to existing files
There are three file write modes, which matter most when writing to a file that already exists:
- OVERWRITE: If the file exists, then overwrite it completely
- APPEND: If the file exists, then write at the end of it
- CREATE_NEW: The operation must result in a new file being created. If the file is already there, then you will get an exception
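Selecting a mode might look like the sketch below, assuming the parameter is exposed as a mode attribute; the paths are illustrative.

```xml
<!-- Add to the end of an existing log file instead of replacing it -->
<file:write path="logs/audit.log" mode="APPEND" />

<!-- Fail if the report already exists rather than overwrite it -->
<file:write path="reports/today.csv" mode="CREATE_NEW" />
```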
The write operation also supports locking, in a similar fashion to the read operation. The main difference is that the lock is automatically released as soon as the write operation finishes.
Copying & Moving
The connector also provides the ability to copy and move files or directories on demand.
Take a special look at the targetPath and renameTo parameters. The targetPath is the path to the directory into which the file will be copied or moved. This path MUST point to a directory.
In some cases, you also want to rename the target file as part of the operation. You can do so by providing the optional renameTo parameter. This parameter must be a file name, not a path. If it is not provided, the original file name is kept.
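The two operations can be sketched as follows. The targetPath and renameTo parameters are the ones described above; the sourcePath attribute name is an assumption on my part, and all paths and file names are invented for the example.

```xml
<!-- Copy into the archive directory under a new file name -->
<file:copy sourcePath="reports/daily.csv"
           targetPath="archive"
           renameTo="daily-backup.csv" />

<!-- Move into the processed directory, keeping the original file name -->
<file:move sourcePath="inbox/order.xml" targetPath="processed" />
```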
The delete operation is straightforward; it simply deletes a file:
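A one-line sketch, assuming the path attribute name and using a made-up file name:

```xml
<!-- Removes the file at the given path -->
<file:delete path="tmp/obsolete.txt" />
```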
The rename operation supports renaming both files and directories.
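A rename might be written as in the fragment below. The path and to attribute names are given as I recall them and should be verified against the technical reference; the file names are hypothetical.

```xml
<!-- "to" is the new name of the file or directory, not a full path -->
<file:rename path="a/b/c/myFile.txt" to="yourFile.txt" />
```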
The create directory operation simply creates a directory with a given name. If the reason for creating the directory is to immediately write, copy, or move contents into it, then use the write, copy, or move operations with createParentDirectories=true instead.
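As a sketch, assuming the operation is exposed as file:create-directory with a directoryPath attribute (the directory name is illustrative):

```xml
<!-- A relative path is resolved against the working directory -->
<file:create-directory directoryPath="a/b/c" />
```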
Try Mule 4 Beta Now!
The great news is that all of these new features are ready for you to try: Mule 4 Beta is already out! Download Mule 4 Beta today. For more detail on the connector, please check out the technical reference.
This connector is also available in the new Flow designer product, part of Anypoint Platform’s Design Center.
On a Personal Note: Going Full Circle
The new File and FTP connectors are two connectors that I hold pretty close to my heart. When I first became involved with MuleSoft, it was not as an engineer, but as a mere user.
Over the past six years, I’ve had the privilege of being a full-time engineer on MuleSoft’s Core Runtime team, a team that allowed me to fix all the bugs and quirks that used to really annoy me (plus a lot of other things). The File, FTP, and SFTP connectors are the last three things I used to complain about that I hadn’t had the chance to fix, until now!
In a way, these connectors mean going full circle for me, so I hope you enjoy using them as much as I do.