Galaxy on EC2 in one hour!


We have been running Galaxy successfully on our in-house servers and laptops for demo purposes for some time now and decided that having a running image of Galaxy on Amazon’s EC2 was the next logical step. Galaxy in the cloud gives us the opportunity to expose a running instance to a much wider audience than might otherwise interact directly with the product.

You are probably wondering just how complicated all this cloud technology really is, and how long it took to get Galaxy running on Amazon’s EC2 service. The truth is that the Amazon tool set has come a long way in a very short time. In fact, it took me less than an hour to provision a lightweight instance of Fedora Core 6, install a few missing utilities, set a couple of environment variables, and launch Galaxy!

Here are the basic steps, followed by the nitty-gritty details for each section:

  1. Install and configure ElasticFox, an EC2 plugin for Firefox (10 minutes).
  2. Pick an existing EC2 AMI to boot. I selected a lightweight install of Fedora Core 6 (10 minutes).
  3. SSH into the machine and install prerequisites via yum and rpm (20 minutes).
  4. Set environment variables such as PATH and JAVA_HOME (10 minutes).
  5. Download and launch the Galaxy Enterprise standalone JAR (10 minutes).

Step 1
ElasticFox is a plugin for Firefox that allows you to easily provision and manage EC2 instances. I provided ElasticFox with all my private Amazon details such as my Account ID, AWS Access Key, and AWS Secret Key. Next, I created a keypair that allows me to SSH into the box without knowing the root password of the instance. Finally, I created a security group that opens the two services I need: SSH (port 22) and HTTP (port 80).

Step 2
The instance that I selected is ami-78b15411 and is known as ‘marcins_cool_public_images/fedora-core-6’. I found this image by sorting the public AMIs by rating and then combing through the list for something lightweight and mainstream. You can find the list here:

Step 3

The next step was to ssh into the box and install some missing utilities and programs that are required for Galaxy:

  • galaxy-ee-web-standalone-1.0.jar – Enterprise version of Galaxy
  • apache-maven-2.0.9-bin.tar.bz2 – Apache Maven, a build tool that fetches artifacts from Maven repositories
  • jdk-6u7-linux-i586.rpm – Java 6 JDK (Update 7, 32-bit Linux)
  • wget – file transfer utility
  • vim – text editor
  • tar, bzip2, unzip, zip – compression utilities
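
As a rough sketch of what this step can look like, the script below installs the utilities with yum, the JDK with rpm, and unpacks Maven with tar. Several things here are assumptions rather than details from the setup above: that you are logged in as root, that the JDK RPM and the Maven tarball have already been downloaded into the current directory, that /opt is the install prefix for Maven, and that vim is installed through Fedora's vim-enhanced package. The run and install_prereqs helpers and the DRY_RUN switch are hypothetical names added so the commands can be previewed before anything is changed.

```shell
# Assumptions: run as root on the instance; the JDK RPM and the Maven tarball
# are already in the current directory; /opt is a hypothetical install prefix.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_prereqs() {
    # Utilities from the Fedora repositories (vim is packaged as vim-enhanced).
    run yum -y install wget vim-enhanced tar bzip2 unzip zip
    # Java 6 Update 7 JDK from the downloaded RPM.
    run rpm -ivh jdk-6u7-linux-i586.rpm
    # Unpack Maven 2.0.9 under /opt.
    run tar -xjf apache-maven-2.0.9-bin.tar.bz2 -C /opt
}

install_prereqs
```

Running the script as-is prints the three commands; running it with DRY_RUN=0 set in the environment executes them.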

Step 4

Now I just have to place a couple of important variables into my environment so that my shell knows where to find the Java and Maven programs.
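
A minimal sketch of those settings follows. Both paths are assumptions: /usr/java/jdk1.6.0_07 is where the JDK RPM would typically install, and /opt/apache-maven-2.0.9 assumes the Maven tarball was unpacked under /opt. M2_HOME is also an assumption, since only PATH and JAVA_HOME are named above; it is the variable Maven 2 conventionally uses.

```shell
# Assumed install locations; adjust to wherever the JDK and Maven actually live.
export JAVA_HOME=/usr/java/jdk1.6.0_07
export M2_HOME=/opt/apache-maven-2.0.9

# Put the Java and Maven binaries ahead of everything else on the PATH.
export PATH="$JAVA_HOME/bin:$M2_HOME/bin:$PATH"
```

Adding the same three lines to ~/.bash_profile keeps the settings across logins.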

Step 5

The final step is to launch the Galaxy Java application, wrapping it in the ‘nohup’ command so that it continues to run after I exit my shell.
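
One way to do this is sketched below, assuming the standalone JAR is started with a plain java -jar and sits in the current directory. The GALAXY_JAR and GALAXY_LOG variables, the galaxy.log file name, and the start_galaxy function are hypothetical names chosen for illustration; any options the JAR itself may accept are left out because they are not described here.

```shell
# Hypothetical names: GALAXY_JAR, GALAXY_LOG and start_galaxy are illustrative.
GALAXY_JAR="${GALAXY_JAR:-galaxy-ee-web-standalone-1.0.jar}"
GALAXY_LOG="${GALAXY_LOG:-galaxy.log}"

start_galaxy() {
    # nohup makes the process ignore the hangup signal sent when the shell exits.
    # Output goes to the log file instead of nohup.out, and & backgrounds the job.
    nohup java -jar "$GALAXY_JAR" > "$GALAXY_LOG" 2>&1 &
    echo "Galaxy starting with PID $!, logging to $GALAXY_LOG"
}

# Only launch when the JAR is actually present.
if [ -f "$GALAXY_JAR" ]; then
    start_galaxy
else
    echo "Cannot find $GALAXY_JAR in $(pwd); download it first." >&2
fi
```

While the application starts, tail -f galaxy.log shows its output.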

After giving the process a minute to start up, I fired up my web browser and attempted to connect to the public DNS name of my instance. Success! Galaxy in the cloud in less than 60 minutes! =)

Log in with the following credentials:

  login: admin
  password: admin

For more information on Galaxy, check out this screencast or the Galaxy Home Page.
