Tuesday, January 11, 2011

Failsafe and manual management of kernels on EC2

On 10.10 and later, Ubuntu images use the Amazon provided pv-grub to load kernels that live inside the image. The selected kernel is controlled by /boot/grub/menu.lst. This makes it possible to install a new kernel via 'dpkg -i' or 'apt-get dist-upgrade' and then reboot into the new kernel.

The file /boot/grub/menu.lst is managed by the grub-legacy-ec2 package. The program 'update-grub-legacy-ec2' is called on installation and removal of Ubuntu kernels through files that are installed in /etc/kernel/postinst.d and /etc/kernel/postrm.d.
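
If you're curious which hook scripts are actually present on your instance, a small sketch like the one below will list them. The function name 'list_kernel_hooks' is my own invention (it is not part of any package), and the hook file names you see will depend on the release and installed packages.

```shell
#!/bin/sh
# List the files found in each kernel hook directory given as an argument.
# 'list_kernel_hooks' is a hypothetical helper, not something Ubuntu ships.
list_kernel_hooks() {
    for d in "$@"; do
        if [ ! -d "$d" ]; then
            echo "$d: not present"
            continue
        fi
        for f in "$d"/*; do
            # Skip the unexpanded glob when the directory is empty.
            if [ -e "$f" ]; then
                echo "$f"
            fi
        done
    done
}

list_kernel_hooks /etc/kernel/postinst.d /etc/kernel/postrm.d
```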

By default, as with other Ubuntu systems, the kernel with the highest revision is automatically selected as the default and booted on the next boot. Because you have no interactive console on EC2 to pick another entry from the grub menu if that kernel fails, you may want to manually manage your selected kernel. This can be done by modifying /boot/grub/menu.lst to use the grub "fallback" code.

I'll launch an instance of the current released maverick image (ami-ccf405a5 in us-east-1, ubuntu-maverick-10.10-i386-server-20101225). Then, on the instance, I create hard links to the default kernel and ramdisk so that they stick around even on apt removal, and change /boot/grub/menu.lst to use those links.


sudo ln /boot/vmlinuz-$(uname -r) /boot/vmlinuz-failsafe
sudo ln /boot/initrd.img-$(uname -r) /boot/initrd.img-failsafe
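
A hard link is just a second name for the same inode, which is why the failsafe files survive when the original names are deleted. Here is a minimal sketch of that behavior in a scratch directory rather than /boot. The file names are made up for the demo, and it assumes GNU coreutils 'stat' (as found on Ubuntu).

```shell
#!/bin/sh
# Show, in a scratch directory, that a hard link survives removal of the
# original name. File names are illustrative only; nothing in /boot is touched.
demo_hardlink() {
    dir=$(mktemp -d) || return 1
    echo "kernel image" > "$dir/vmlinuz-original"
    ln "$dir/vmlinuz-original" "$dir/vmlinuz-failsafe"
    # Both names share one inode, so the link count is now 2 (GNU stat).
    stat -c '%h' "$dir/vmlinuz-failsafe"
    rm "$dir/vmlinuz-original"      # roughly what a package removal does
    cat "$dir/vmlinuz-failsafe"     # the content is still reachable
    rm -r "$dir"
}

demo_hardlink
```

On the instance itself, running 'stat -c "%h %i %n"' on /boot/vmlinuz-failsafe and /boot/vmlinuz-$(uname -r) should show the same inode number for both names.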


Then, copy the existing entry in /boot/grub/menu.lst to a new entry above the automatic section. I've changed/added:


# You can specify 'saved' instead of a number. In this case, the default entry
# is the entry saved with the command 'savedefault'.
# WARNING: If you are using dmraid do not use 'savedefault' or your
# array will desync and will not let you boot your system.
default saved

...<snip>...

# Put static boot stanzas before and/or after AUTOMAGIC KERNEL LIST

# this is the failsafe kernel, it will be '0' as it is the first
# entry in this file
title Failsafe kernel
root (hd0)
kernel /boot/vmlinuz-failsafe root=LABEL=uec-rootfs ro console=hvc0 FAILSAFE
initrd /boot/initrd.img-failsafe
savedefault

title Ubuntu 10.10, kernel 2.6.35-24-virtual
root (hd0)
kernel /boot/vmlinuz-2.6.35-24-virtual root=LABEL=uec-rootfs ro console=hvc0 TEST-KERNEL
initrd /boot/initrd.img-2.6.35-24-virtual
savedefault 0


Then tell grub that the first entry (0) is the 'saved' default, which for grub 1 (or 0.97) modifies /boot/grub/default.


sudo grub-set-default 0
sudo reboot


Now, a reboot will boot into the failsafe kernel, which we can verify by checking that /proc/cmdline contains 'FAILSAFE'. Then, to test our "TEST-KERNEL", run:


sudo grub-set-default 1
sudo reboot


After this reboot, the system comes up into "TEST-KERNEL" (per /proc/cmdline), but /boot/grub/default will contain '0', indicating that on the subsequent boot the FAILSAFE kernel will run. In this way, if your kernel fails to boot all the way up, you can just issue:


euca-reboot-instances i-15b77779


And you'll boot back into the FAILSAFE kernel.
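
To make the bookkeeping explicit, here is a toy model of how 'default saved' interacts with the two stanzas above. It does not touch grub at all; 'simulate_boot' is a name I made up, and the model assumes grub legacy's documented behavior that 'savedefault' stores the booted entry while 'savedefault 0' stores entry 0.

```shell
#!/bin/sh
# Toy model of 'default saved' with the two static stanzas; grub is not involved.
#   entry 0 (failsafe): 'savedefault'   -> saves its own number, 0
#   entry 1 (test):     'savedefault 0' -> saves 0 as well
simulate_boot() {
    booted=$1               # entry number read from the saved default
    case "$booted" in
        0) saved=0 ;;       # plain 'savedefault' stores the booted entry
        1) saved=0 ;;       # 'savedefault 0' always stores entry 0
        *) echo "unknown entry: $booted" >&2; return 1 ;;
    esac
    echo "booted=$booted saved=$saved"
}

saved=1                     # state right after 'grub-set-default 1'
simulate_boot "$saved"      # first reboot: the test kernel runs
simulate_boot "$saved"      # any later reboot: back to the failsafe
```

The point of the model is that the saved value is already back to 0 by the time the test kernel starts, so a failed boot needs nothing more than another reboot.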

The above basically allows you to manually manage your kernels while letting grub-legacy-ec2 still write entries to /boot/grub/menu.lst.
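
If you want a quick status check after each reboot, something along these lines reports which kernel came up and which entry is saved for the next boot. Both function names are hypothetical, and the second one assumes the entry number sits on the first line of /boot/grub/default, which is how grub-set-default from grub 0.97 writes the file as far as I know.

```shell
#!/bin/sh
# Report which marker word (added to the kernel lines in menu.lst) was booted.
# $1: path to a cmdline file, defaulting to /proc/cmdline.
booted_marker() {
    cmdline=$(cat "${1:-/proc/cmdline}") || return 1
    case " $cmdline " in
        *" FAILSAFE "*)    echo "FAILSAFE" ;;
        *" TEST-KERNEL "*) echo "TEST-KERNEL" ;;
        *)                 echo "unknown" ;;
    esac
}

# Print the saved default entry number.
# Assumption: it is the first line of the file, as grub 0.97 writes it.
# $1: path to the default file, defaulting to /boot/grub/default.
saved_default() {
    head -n 1 "${1:-/boot/grub/default}"
}

if [ -r /proc/cmdline ]; then
    echo "booted: $(booted_marker)"
fi
if [ -r /boot/grub/default ]; then
    echo "next boot: entry $(saved_default)"
fi
```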

I chose to use hard links for the 'failsafe' kernels so that even on dpkg removal, the files would still exist. Because the 10.10 Ubuntu kernels have the EC2 network and disk drivers built in, you'll still be able to boot even after a dpkg removal of the kernel package behind the failsafe links or an errant 'rm -Rf /lib/modules/2*'.

Friday, January 7, 2011

Using euca2ools rather than ec2-api-tools with EC2

The UEC images that Ubuntu produces on EC2 are in every way fully supported, "Official Ubuntu". As with other official releases, access to source code for security and maintenance reasons affects our decisions on what is included.

In the UEC images, the most notable packages left out are 'ec2-api-tools' and 'ec2-ami-tools'. I personally use the ec2-api-tools and ec2-ami-tools quite frequently and Amazon has done a great job with them. However, the license and lack of source code prevent them from being in Ubuntu 'main'.

Fortunately:
a.) Packages for them are available in the Ubuntu 'multiverse' component.
b.) The euca2ools package is installed by default and provides an almost drop-in replacement for the ec2-api-tools and ec2-ami-tools.

I think that many users of EC2 aren't aware of the euca2ools, so I'd like to give some information on how to use them here.

The ec2-api-tools use the SOAP interface and thus use the "EC2_CERT" and "EC2_PRIVATE_KEY". The euca2ools sit on top of the excellent boto project. Boto uses the AWS REST API, which means authentication is done with your "Access Key" and "Secret Key". As a result, configuration is a little different. (Note: when bundling images, you still need the EC2_CERT and EC2_PRIVATE_KEY for encryption/signing.)

Configuration for euca2ools can be done via environment variables (EC2_URL, EC2_ACCESS_KEY, EC2_SECRET_KEY, EC2_CERT, EC2_PRIVATE_KEY, S3_URL, EUCALYPTUS_CERT) or via config file. I personally prefer the configuration file approach.

Here is my ~/.eucarc that is configured to operate with the EC2 us-east-1 region.

CRED_D=${HOME}/creds/aws-smoser
EC2_REGION="${EC2_REGION:-us-east-1}"
EC2_CERT=${CRED_D}/cert.pem
EC2_PRIVATE_KEY=${CRED_D}/pk.pem
EC2_ACCESS_KEY=ABCDEFGHIJKLMNOPQRST
EC2_SECRET_KEY=UVWXYZ0123456789abcdefghijklmnopqrstuvwx
EC2_USER_ID=950047163771
EUCALYPTUS_CERT=/etc/ec2/amitools/cert-ec2.pem
EC2_URL=https://ec2.${EC2_REGION}.amazonaws.com
S3_URL=https://s3.amazonaws.com:443


Things to note above:
  • euca2ools sources the ~/.eucarc file with bash, and then reads out the values of EC2_REGION, EC2_CERT, EC2_PRIVATE_KEY, EC2_ACCESS_KEY, EC2_USER_ID, EC2_URL, S3_URL. This means that you can use other bash functionality in the config file, as I've done above with 'EC2_REGION'. This allows me to do something like:

    EC2_REGION=us-west-1 euca-describe-images

  • If there is no configuration file specified with '--config', then those values will be read from environment variables

  • Amazon's public certificate from the ami tools is included with euca2ools in ubuntu, and located in /etc/ec2/amitools/cert-ec2.pem

  • Many of the euca2ools commands will run significantly faster than the ec2-api-tools. The reason for the slowness of the ec2-api-tools is their many Java dependencies (please correct me if I'm wrong).
  • Your ~/.eucarc file contains credentials, and therefore it should be protected with filesystem permissions (i.e. 'chmod go-r ~/.eucarc').
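
The 'EC2_REGION' trick and the permissions advice above can be tried out without touching real credentials. The sketch below writes a throwaway rc file containing only the two region-related lines, locks it down, and sources it with and without an override in the environment. The 'show_endpoint' helper is hypothetical and only mimics the "source the file, then read the values" step. It is not how euca2ools itself is implemented.

```shell
#!/bin/sh
# Write a throwaway rc file with only the region-related lines (no secrets).
rc=$(mktemp) || exit 1
cat > "$rc" <<'EOF'
EC2_REGION="${EC2_REGION:-us-east-1}"
EC2_URL=https://ec2.${EC2_REGION}.amazonaws.com
EOF
chmod go-rwx "$rc"    # a bit stricter than 'go-r': no group/other access

# Source the rc file in a fresh shell and print the endpoint it yields.
# $1: rc file; EC2_REGION, if exported by the caller, acts as an override.
show_endpoint() {
    sh -c '. "$1"; echo "$EC2_URL"' sh "$1"
}

( unset EC2_REGION; show_endpoint "$rc" )                          # default region
( EC2_REGION=us-west-1; export EC2_REGION; show_endpoint "$rc" )   # override
```

The first call should print the us-east-1 endpoint and the second the us-west-1 one, since the ':-' expansion only supplies the default when EC2_REGION is unset or empty.
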
Hopefully this will make it easier for you to use euca2ools with EC2 on Ubuntu.