Reading Time: 16 minutes

“Logs are like car insurance. Nobody wants to pay for it, but when something goes wrong everyone wants the best available” – Pablo Kraan

The phrase above fully explains why logs are important and why we need to be able to log as much information as possible without impacting performance. Because logging usually implies an I/O operation, it's a naturally slow operation.

The Problem

Before 3.6, logging operations were done using synchronous I/O. This means that the thread processing your message has to wait for the log message to be fully handled before it can continue.

latest report
Learn why we are the Leaders in API management and

Because Mule is an application container, logging is not just about doing I/O fast. It also has additional requirements:

  • Log separation: each application/domain needs to keep its own log file
  • Each application/domain needs to be able to specify its own logging config.
  • If an app/domain doesn't provide a config, a default one needs to be applied
  • The performance cost of logging should be minimum

The Solution

The solution revolves around the concept of asynchronous logging. So, if the logging I/O operation happens in a separate thread, the actual processing of your message doesn't have to wait for it to complete.

To meet all of these requirements we migrated our log infrastructure to log4j 2.0 since this framework provides:

  • Fastest asynchronous logging capabilities we could find. Because it uses the LMAX Disruptor library, it greatly outperforms other traditional asynchronous loggers such as the ones provided by Logback.
  • Ability to specify synchronous loggers
  • Extensibility to implement Mule's logging logic
  • Highly active development effort
  • Responsive community

Performance Impact

So, why is this so cool? We did performance tests in which we used an application which logs around 1 million messages, using an increasingly higher amount of threads on each run (these threads were responsible for generating messages; the more threads, the more concurrent messages). We repeated this test on version of Mule using 4 different logging setups:

  • Using the log implementation of Mule 3.5
  • Using Logback
  • Using log4j2
  • Doing no logging at all

These are the results:

This chart shows how much latency was added as we added more concurrent messages. The blue line are the results of using Mule 3.5. The green line represent the latency of doing no logging at all. The read line are the results using the new logging infrastructure. Hey wait? Where's the red line?! Look closely, it's almost on top of the green line, which means that now logging doesn't add any significant latency.

Now let's consider this:

In this other chart, we see how many TPS (transactions per second) were delivered as the concurrency increased. Again, we see our new asynchronous logging infrastructure performing very closely to not logging at all and showing a huge improvement from our prior solution.

As you can see, the performance improvement is really notorious.

Configuration

Log4j2 supports configuration through XML, Json and YAML files. We're only going to be supporting XML files. This is because Json and YAML support are built using a version of the Jackson library which is not compatible with the one that is already bundled into Mule. Since upgrading would represent a backwards compatibility issue, for now we're only sticking to the XML version of it which is by far the most popular as far as our user base tells us.

When an application is deployed, Mule will look for a config following a child first pattern like this:

  • Look for a file called log4j2-test.xml in the application classpath
  • Look for a file called log4j2.xml in the application classpath
  • Look for a file called log4j2-test.xml in the domain classpath
  • Look for a file called log4j2.xml in the domain classpath
  • Look for a file called log4j2-test.xml in MULE_HOME/conf
  • Look for a file called log4j2.xml in MULE_HOME/conf
  • Apply default configuration.

The default config resides at MULE_HOME/conf/log4j2.xml. That default config defines the standard loggers in exactly the same way as Mule has until the 3.6 release. Only difference is that according to this config, all loggers (including the root one) are asynchronous. You can override this config at a domain or application.

As a sample of how the new configuration file looks like, here's the default configuration for Mule 3.6:

Migrating the log config from older versions

As you probably noticed, the .properties format is no longer supported. This is not an arbitrary decision. It was taken based on the fact that:

  • Log4j2 no longer supports it in the first place. In Log4j2, loggers and categories have a more hierarchical approach which doesn't map well to the flat properties structure.
  • We considered building support for it ourselves but the result was a config which wasn't compatible with the actual properties in the first place, because we had to add the concept of asynchronous loggers. On top of that, we soon hit all the roadblocks as the log4j2 team and finally concluded that it was not worth it.

This is not a problem for every Mule user. Some users don't configure logging at all. Others did it with the XML variant of Log4j 1.2, which is also not compatible with the current one but way closed to be. Users which used the traditional properties file approach got the worst part of it.

If you're affected by this then our sincere apologies. We usually don't break backwards compatibility on minor versions. But in this case, as you can see on the performance section of this post, the advantages were so great in comparison to the cons that we decided to go for it. If you have issues updating your configuration please find more information on the Log4j2 configuration guide or contact us and we'll do our best to help you.

Handling exceptions on Asynchronous loggers

The disadvantage of asynchronous logging is error handling. There're two situations to account for:

  1. In a synchronous logging schema, if the logger fails to write to disk, the exception would raise on the same thread that's currently processing the Mule . If logging is critical for you, then you can rollback the transaction. In asynchronous logging, the thread doing the processing won't be aware of any issues writing to the disk, so you won't be able to rollback anything.
  2. Because the actual writing of the log gets differed, there's a chance that log messages might never make it to disk if Mule were to crash before the buffers are flushed.

To mitigate this issues, we're providing the following mechanisms.

  1. ExceptionHandler. By default, Mule will register an LMAX ExceptionHandler that will log to the console and to logs/mule_ee.log any issues dumping log events to disk. This is the simplest approach possible that would at least let you know that something is wrong. You can provide your own exception handler by setting the system property ‘AsyncLoggerConfig.ExceptionHandler' to the canonical name of a class implementing the interface.
  2. Unfortunately this is simply part of the performance-reliability trade-off. If this is an issue for you, then you have no choice but to configure your loggers to be synchronous. Notice that you're not forced to choose between all synchronous or all asynchronous. You can have a mix.

Logs as audit trails

There're some users that use the application logs as audit trails. I'm personally not convinced about that being a great idea, however, we know they're running applications behaving like that and that's totally valid. If that's your case, then you can't afford to loose any log message and thus you should configure your application to always use synchronous loggers.

Configuration reloading

By default, Mule will poll modified config files every 60 seconds to check for changes. If any of those files indeed changed, the logger config will be modified on the fly. You can customize this interval by setting the monitorInterval attribute in the root element (check log4j2 manual for further reference).

Compatibility with older Logging

Theslf4j 1.7.7as a logging mechanism for every new feature. However, we still have a lot of legacy code using a mix of jul, apache commons logging and log4j1.2. Additionally, applications might be using cloud connectors or custom made components which also show this mixture. Although we highly encourage any user to migrate to slf4j as a the facto logging mechanism for Mule applications, we're still going to support that case. Slf4j bridges are included in the Mule distribution to make sure that regardless of the framework of your choice, log4j2 ends up handling every log event with a centralised configuration. However, we do leave it up to the user to make sure not to package any logging library on their applications/extensions to avoid classpath issues between such libraries and our bridges.

Wrapping up

This improvement enabled the Anypoint Platform with bleeding edge, state of the art logging capabilities with greatly improve performance in comparison to previous versions of the product. If you find yourself wanting to know more about log4j2 and what you can do with it, you can visit the log4j2 home page following this link. I hope you really benefit from this improvement! Remember that logging is like car insurance, don't worry about the rate anymore, let the Mule take the wheel!