You wouldn't necessarily be very excited about reliable, graceful app server restarts, unless you go to restart your server and it doesn't restart, or unless the restart script corrupts your webapp data. There are times when a reasonably fast, fully reliable restart is a very important feature. Some examples:

  • You found that your webapp has a new memory leak, and you just fixed it in development, just finished testing it, and you're about to deploy the fixed version. But, first, you want to undeploy and restart the server to be completely sure the leaking code is gone. While you're doing this, your server is offline, and you want to get it serving again as soon as possible, so you run the restart command... but the server doesn't stop. It stays running, and while you spend time trying to figure out why, your webapp is undeployed.
  • You have more traffic on your site, and now your memory utilization is climbing, and you've decided you should increase your Tomcat's heap memory allocation. You make the configuration change, and you run the restart command, which runs and happily completes, but Tomcat doesn't budge — it's still running. You spend the next hour or two trying to figure out why.
  • You wrote a shell or batch script that changes your web site in a way that requires a Tomcat restart for the changes to take effect. Your script runs Tomcat's stop command, and then Tomcat's start command. But after using it a few times, you find that the script doesn't always restart Tomcat successfully, because of an error somewhere along the way. You spend lots of time looking for the cause of the problem…

If stock Tomcat restarts could both integrate well with the operating system and be fully reliable, it would save you time in cases like these, and it would allow you to automate more. We've seen cases like these with stock Tomcat, and we have improved server restarts as part of Tcat Server.

Let's also define what we mean by reliable server restarts: When the administrator tells the server to restart, the server gracefully shuts down as quickly as reasonably possible, and a new Tomcat JVM is started immediately, but only after the original Tomcat JVM process is completely gone. When the administrator tells the server to stop, the restart script stops the JVM reliably. If for any reason the original Tomcat JVM hangs during a stop operation and isn't going to exit, the restart script handles this case and (after waiting for an appropriate, configurable timeout) kills the JVM process to make sure that it does not stay hung and broken. When the restart script exits, it returns an exit code that reliably denotes whether the start, stop, or restart operation was successful. This exit code is a 100% reliable indicator of what the Tomcat JVM was doing at the time the restart script exited, so the exit code can be used by another script. The restart script also outputs textual information so that the person running the script can reliably see what is going on with the server.
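
To make that definition concrete, here is a minimal shell sketch of the restart ordering: stop, confirm the old process is really gone, and only then start, passing the result back as an exit code. This is not Tcat's implementation. The function names, their arguments, and the one-second polling interval are assumptions made for illustration, and the sketch assumes it runs as the same user as the server process (or as root), since it uses kill -0 to test whether the process still exists.

```shell
#!/bin/sh
# Sketch only: wait_for_exit and restart_service are hypothetical helpers.

# wait_for_exit PID TIMEOUT
# Returns 0 once PID no longer exists, 1 if it is still alive after TIMEOUT seconds.
wait_for_exit() {
    we_pid=$1
    we_timeout=$2
    we_waited=0
    while kill -0 "$we_pid" 2>/dev/null; do
        [ "$we_waited" -ge "$we_timeout" ] && return 1
        sleep 1
        we_waited=$((we_waited + 1))
    done
    return 0
}

# restart_service PID TIMEOUT STOP_CMD START_CMD
# Runs STOP_CMD, refuses to start anything while PID is still alive,
# and otherwise returns the exit status of START_CMD.
restart_service() {
    rs_pid=$1
    rs_timeout=$2
    rs_stop=$3
    rs_start=$4
    sh -c "$rs_stop" || echo "stop command failed; still checking the process" >&2
    if ! wait_for_exit "$rs_pid" "$rs_timeout"; then
        echo "process $rs_pid is still running; not starting a new one" >&2
        return 1
    fi
    sh -c "$rs_start"
}
```

A calling script can then branch on the result, for example: restart_service "$pid" 30 "bin/shutdown.sh" "bin/startup.sh" || exit 1.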

Stock Tomcat Restart Shell Scripts

When you download Apache Tomcat by itself, you get some scripts in the bin directory that can start Tomcat and stop (shut down) Tomcat:

~/ $ cd apache-tomcat-6.0.29/bin
bin $ ls
bootstrap.jar                 digest.bat        startup.sh
catalina.bat                  digest.sh         tomcat-juli.jar
catalina.sh                   setclasspath.bat  tomcat-native.tar.gz
catalina-tasks.xml            setclasspath.sh   tool-wrapper.bat
commons-daemon.jar            shutdown.bat      tool-wrapper.sh
commons-daemon-native.tar.gz  shutdown.sh       version.bat
cpappend.bat                  startup.bat       version.sh

These scripts allow you to run Tomcat in debug mode if you'd like, or run Tomcat with the Java security manager sandbox enabled, but that's about it. It's kept very simple. On Windows, the catalina.bat file has just about all of the script code, while on all non-Windows OSs (such as Linux, MacOS, and Solaris), it's in catalina.sh. The startup and shutdown scripts (.sh or .bat) are just wrappers around the catalina script.

These scripts do not implement any sort of server restart, only a stop and a start. There is almost no reliability built into these scripts, probably to keep them very simple. This makes the scripts easier for the developers to maintain: they're smaller, and they attempt to do very little, namely to start the server as simply as reasonably possible and to be able to stop it. But, as we know from experience, starting and stopping isn't all that simple to do reliably. There is also no support for starting Tomcat when your operating system boots.

Stock Linux Tomcat Package Init Scripts

On Linux, the way to integrate a software server package into the operating system's service controls is to write an “init script” — a shell script that supports common service operations such as start, stop, restart, and status. Every Linux distribution seems to have its own set of init scripts for just about every software server package. This is partially because the init system is full of shell scripts that have been heavily rewritten for each Linux distribution, and thus they may be (and often are) incompatible with each other. So, there is a Red Hat Enterprise Linux init script for Tomcat, and there is an Ubuntu init script for Tomcat, and they are very different implementations of Linux init scripts. In fact, they support different sets of service features, from one Linux distribution to another.
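
As a rough illustration of the shape these scripts share, here is a skeleton of the action dispatch at the heart of an init script. It is a sketch, not any distribution's actual script: the do_start, do_stop, and do_status bodies are placeholders, and the return codes (2 for an unknown action, 3 for "not running" from status) follow the LSB init script convention.

```shell
#!/bin/sh
# Skeleton only: the three do_* functions are placeholders for real logic.
do_start()  { echo "starting"; }
do_stop()   { echo "stopping"; }
do_status() { echo "not running"; return 3; }   # LSB: 3 means "not running"

# dispatch ACTION: map a service action onto the functions above.
dispatch() {
    case "$1" in
        start)   do_start ;;
        stop)    do_stop ;;
        restart) do_stop && do_start ;;   # only start if the stop succeeded
        status)  do_status ;;
        *)
            echo "Usage: $0 {start|stop|restart|status}" >&2
            return 2                      # LSB: 2 means invalid argument
            ;;
    esac
}
```

A real init script would end by calling dispatch "$@" and exiting with its return value.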

The RHEL, CentOS, and Fedora Tomcat 6 Init Script

The newest RHEL, CentOS, and Fedora Tomcat 6 init script (these Linux distributions are all mainly the same codebase) implements start, stop, restart, status, and version. It also supports starting Tomcat at server boot time. This init script does not call catalina.sh at all; it simply replaces it. Since it replaces catalina.sh, it doesn't necessarily implement or offer all of the features catalina.sh does. For every upstream functional change made to catalina.sh by the Tomcat developers, this init script must also be changed accordingly, or else it becomes incompatible. But this init script was written with more attention to Linux integration than catalina.sh was, it is more reliable on stops, and it implements a restart feature.

This Red Hat / Fedora script also implements some reliability in its stop method:

$SU - $TOMCAT_USER -c "${TOMCAT_SCRIPT} stop" >> $TOMCAT_LOG 2>&1
RETVAL="$?"
if [ "$RETVAL" -eq "0" ]; then
    count="0"
    if [ -f "/var/run/${NAME}.pid" ]; then
        read kpid < /var/run/${NAME}.pid
        until [ "$(ps --pid $kpid | grep -c $kpid)" -eq "0" ] || 
              [ "$count" -gt "$SHUTDOWN_WAIT" ]; do
            if [ "$SHUTDOWN_VERBOSE" = "true" ]; then
                echo "waiting for processes $kpid to exit"
            fi
            sleep 1
            let count="${count}+1"
        done
        if [ "$count" -gt "$SHUTDOWN_WAIT" ]; then
            if [ "$SHUTDOWN_VERBOSE" = "true" ]; then
                echo "killing processes which didn't stop after $SHUTDOWN_WAIT seconds"
            fi
            kill -9 $kpid
        fi
        log_success_msg

That first line runs an additional Java VM with a very small program that tries to connect to the Tomcat JVM to tell it to shut down. This is actually not necessary on Linux (due to the existence of IPC signals), and it is both wasteful of resources and rather slow. Next, the script loops, waiting for the Tomcat JVM process to shut down after having received the stop request. If $SHUTDOWN_WAIT seconds have elapsed and the JVM process is still running, this init script simply uses kill -9 on your Tomcat JVM, instantly and ungracefully killing it, so your web application is not allowed any more time to save important data and close files in a consistent state. Data loss can happen with this init script. And this init script's ability to even know which JVM process is the right one to kill hinges on a small process ID file on the filesystem, whose numeric contents can, at times, be the wrong process ID. Also, this init script is specific to the Red Hat / CentOS / Fedora line, and cannot be run on anything else. Lastly, this init script does not attempt to support the new Tomcat 7.
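
One way to reduce the stale PID file risk is to check what the recorded PID actually is before signaling it. The sketch below is not part of the Red Hat script. pidfile_matches is a hypothetical helper, and it assumes a Linux /proc filesystem. It only succeeds when the PID file holds a number, that process exists, and its command line contains the text you expect.

```shell
#!/bin/sh
# Sketch only: pidfile_matches is a hypothetical helper, Linux /proc assumed.

# pidfile_matches PIDFILE PATTERN
# Returns 0 only if PIDFILE names a live process whose command line
# contains the fixed string PATTERN; returns 1 in every other case.
pidfile_matches() {
    pm_file=$1
    pm_pattern=$2
    [ -r "$pm_file" ] || return 1
    pm_pid=$(head -n 1 "$pm_file" | tr -d ' \t\r')
    case "$pm_pid" in
        ''|*[!0-9]*) return 1 ;;               # empty or non-numeric contents
    esac
    [ -r "/proc/$pm_pid/cmdline" ] || return 1 # no such process
    # /proc/PID/cmdline separates arguments with NUL bytes; turn them into spaces.
    tr '\0' ' ' < "/proc/$pm_pid/cmdline" | grep -qF -- "$pm_pattern"
}
```

For a Tomcat JVM, a reasonable pattern would be Tomcat's main class, org.apache.catalina.startup.Bootstrap, as in: pidfile_matches "/var/run/${NAME}.pid" org.apache.catalina.startup.Bootstrap. A small window remains between the check and the kill, so this narrows the problem rather than eliminating it.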

The Debian and Ubuntu Tomcat 6 Init Script

The newest Debian and Ubuntu init script implements start, stop, restart, and status. It also supports starting Tomcat at server boot time. It does not try to replace catalina.sh; instead, it calls catalina.sh, and supports the runtime features that Tomcat users expect. It offers running Tomcat as a non-root user while binding to privileged TCP ports, such as port 80. It implements reliable server stops, and therefore reliable restarts as well. To stop Tomcat, the init script sends the JVM a TERM signal, monitors the process after sending the signal, and exits right away when Tomcat's JVM exits.

start-stop-daemon --stop --pidfile "$CATALINA_PID" \
        --user "$TOMCAT6_USER" \
        --retry=TERM/20/KILL/5 >/dev/null
if [ $? -eq 3 ]; then
        PID="`cat $CATALINA_PID`"
        log_failure_msg "Failed to stop $NAME (pid $PID)"
        exit 1
else
        rm -f "$CATALINA_PID"
fi

If Tomcat's JVM does not exit right away, the init script waits up to 20 seconds, monitoring the JVM process. At the end of the 20 seconds, the init script sends a SIGKILL to the JVM and waits up to 5 additional seconds for the JVM to quit. If the JVM process still will not quit (kernel IO hang problem, or similar), the init script prints a failure message saying that Tomcat will not shut down. This is quite a bit more clever, graceful, and fault tolerant. Well, okay, I actually wrote this code... but still! On the downside, this init script is very specific to recent Debian and Ubuntu distribution internals, and cannot be run on anything else. This init script does not support installing and running it as a non-root user. And it does not attempt to support the new Tomcat 7.
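
On systems without start-stop-daemon, the TERM/20/KILL/5 schedule can be approximated in plain shell. The following is a sketch of the same idea, not the Debian code: gone_within and stop_with_escalation are hypothetical names, the waits are passed in as arguments, and it assumes the caller is allowed to signal the process.

```shell
#!/bin/sh
# Sketch only: a plain-shell approximation of a TERM/wait/KILL/wait schedule.

# gone_within PID SECS
# Returns 0 as soon as PID no longer exists, 1 if it is still there after SECS seconds.
gone_within() {
    gw_pid=$1
    gw_left=$2
    while kill -0 "$gw_pid" 2>/dev/null; do
        [ "$gw_left" -le 0 ] && return 1
        sleep 1
        gw_left=$((gw_left - 1))
    done
    return 0
}

# stop_with_escalation PID TERM_WAIT KILL_WAIT
# Returns 0 if the process is gone, 1 if it survived even SIGKILL.
stop_with_escalation() {
    se_pid=$1
    kill -TERM "$se_pid" 2>/dev/null || return 0   # nothing left to stop
    gone_within "$se_pid" "$2" && return 0         # graceful exit in time
    echo "pid $se_pid did not exit after $2 seconds; sending SIGKILL" >&2
    kill -KILL "$se_pid" 2>/dev/null
    gone_within "$se_pid" "$3"
}
```

Calling stop_with_escalation "$pid" 20 5 mirrors the schedule in the excerpt above: the graceful path gets its full 20 seconds, and SIGKILL is used only as a last resort.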

Tcat Server's Linux Init Script

When you install Tcat Server on any Linux distribution, you automatically get Tcat's init script. It implements start, stop, restart, status, and boot-time starts. Like the Debian and Ubuntu Tomcat 6 init script, it calls catalina.sh instead of replacing it, preserving all of the stock Tomcat runtime features users expect. This init script, however, is not specific to any Linux distribution; it is multi-distro. It is thoroughly tested on RHEL, Ubuntu, CentOS, OpenSUSE, and Fedora: the same init script runs happily on all of these Linux distributions, including recent versions and versions as old as RHEL 4.x (many years old now). As such, Tcat's init script can be copied to a completely different Linux distribution and will run there without incident. This comes in handy if your company decides to switch to a different Linux distribution, or decides to run more than one.

Here are a few of the features that Tcat Server's init script implements:

  • Fully reliable restarts, and graceful shutdowns, similar to the Debian and Ubuntu init script — this protects against data corruption and makes scripting restarts deterministic and reliable.
  • Installing and running the init script as a non-root user. For production scenarios where the Tomcat administrator is not granted root privileges on the server, all of the features of the init script still work.
  • Installing multiple copies of Tcat in the same operating system installation, and operating them each independently.

The Tcat init script does not rely on a process ID file being present on the filesystem, or on that file containing the correct process ID of the Tomcat instance being controlled. Instead, Tcat's init script tags each Tomcat JVM process with a unique marker, so the correct JVM process is always found and handled. Any other JVM processes running on the system are left safely untouched.
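
Tcat's actual marking mechanism isn't shown here, so the following is only one plausible way to get a similar effect, not Tcat's implementation. The idea is to put a unique string on the JVM's command line at start time (for example as a system property added through CATALINA_OPTS, which catalina.sh passes to the JVM) and later search the process table for that string. The property name instance.marker and the find_marked_pids helper are made up for this sketch, and a Linux /proc filesystem is assumed.

```shell
#!/bin/sh
# Sketch only. Hypothetical tagging step, done before calling catalina.sh start:
#   CATALINA_OPTS="$CATALINA_OPTS -Dinstance.marker=$MARKER"; export CATALINA_OPTS

# find_marked_pids MARKER
# Prints the PID of every process whose command line contains the fixed
# string MARKER, skipping the shell that is doing the search.
find_marked_pids() {
    fm_marker=$1
    for fm_dir in /proc/[0-9]*; do
        fm_pid=${fm_dir#/proc/}
        [ "$fm_pid" = "$$" ] && continue
        [ -r "$fm_dir/cmdline" ] || continue
        # Arguments in /proc/PID/cmdline are NUL-separated; make them readable.
        if tr '\0' ' ' 2>/dev/null < "$fm_dir/cmdline" | grep -qF -- "$fm_marker"; then
            echo "$fm_pid"
        fi
    done
}
```

Because the search is keyed on a string that only one instance carries, other JVMs on the machine never match, and several instances can be managed side by side as long as each gets its own marker.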

For maximum compatibility with the Tomcat you want to use, Tcat's init script fully supports the stable releases of Tomcat 6 and Tomcat 5.5 with all of the above features, plus Tcat's init script fully supports the new Tomcat 7 betas that are available (as of this writing).

Tcat Server REST API Group Restarts

Because Tcat's restarts are reliable, Tcat's console also offers a REST API that you can use to trigger restarts. That includes both individual servers and groups of servers. For example, if you're writing a shell script that needs to restart one or more servers, you can simply add this to your script:

curl --basic -u admin:admin --data-binary 'http://localhost:8080/console/api/servers/local$76d926b3-2282-4025-b5a7-443ea1c4fc7b/restart'

... where your Tcat console username is 'admin', and your password is 'admin' (this is just an example... make sure to use a different password in real life!) and where 'local$76d926b3-2282-4025-b5a7-443ea1c4fc7b' is the server identifier that you can get by listing your servers like this:

$ curl --basic -u admin:admin http://localhost:8080/console/api/servers

Or, of course, you can trigger a restart of a server by invoking Tcat's init script on the same machine, such as:

$ sudo /etc/init.d/tcat6 restart

Either way, you can be confident that when you invoke a Tcat restart, your server will be gracefully shut down, and started back up again!