Tuesday, December 14, 2010

Ubuntu Natty Narwhal Cluster Compute Instances

Some time ago, Amazon announced two new instance types aimed at high performance computing. The new types differ from Amazon's previous offerings in that

  • They use Xen "hvm" (Hardware Virtual Machine) mode rather than paravirtualization
  • Only privileged accounts can create images with the 'hvm' virtualization type.

The result is that there are very few public images for cluster compute nodes, and up until today, there were no Ubuntu images.

I'm happy to announce that you can now run official Ubuntu images on cluster compute instance types. From today forward, we will be publishing daily builds of Natty Narwhal for these instance types.

These images are identical to the other Ubuntu images. For AMI ids, you can browse the list at http://uec-images.ubuntu.com/server/natty/current/, or use the machine-friendly data at http://uec-images.ubuntu.com/query.

There is one known bug (bug 690286) that prevents you from using ephemeral storage on the CC nodes.

If you've got a couple dollars burning a hole in your pocket, you can try one out with:

qurl="http://uec-images.ubuntu.com/query"
ami_id=$(curl --silent "${qurl}/natty/server/daily.current.txt" |
    awk '-F\t' '$11 == "hvm" && $7 == "us-east-1" { print $8 }')
ec2-run-instances --key mykey --instance-type cc1.4xlarge "${ami_id}"

Tuesday, December 7, 2010

lvm resizing is easy

I have a local mirror of the ubuntu archive, using some scripts based on the Ubuntu Wiki. When I set up "/archive" on my local mirror, I used lvm. The reason for that was primarily so that I could use sbuild with lvm.

Since then, two things have happened:
  • sbuild has gained the ability to use aufs rather than LVM snapshots. This solution is much lighter-weight, and doesn't require LVM space sitting around waiting to be used.
  • The Ubuntu archive has grown from roughly 250G to roughly 400G.

So, it was time to grow the filesystem holding my archive to accommodate it. For my own record, and possibly for others, I thought I'd share what I did.


$ sudo pvscan
PV /dev/sdb VG smoser-vol1 lvm2 [931.51 GiB / 315.57 GiB free]
PV /dev/sda1 VG nelson lvm2 [148.77 GiB / 44.00 MiB free]
$ sudo lvscan
ACTIVE '/dev/smoser-vol1/smlv0' [585.94 GiB] inherit
ACTIVE '/dev/smoser-vol1/hardy_chroot-i386' [5.00 GiB] inherit
ACTIVE '/dev/smoser-vol1/lucid_chroot-i386' [5.00 GiB] inherit
ACTIVE '/dev/smoser-vol1/karmic_chroot-i386' [5.00 GiB] inherit
ACTIVE '/dev/smoser-vol1/karmic_chroot-amd64' [5.00 GiB] inherit
ACTIVE '/dev/smoser-vol1/lucid_chroot-amd64' [5.00 GiB] inherit
ACTIVE '/dev/smoser-vol1/hardy_chroot-amd64' [5.00 GiB] inherit
ACTIVE '/dev/nelson/root' [142.65 GiB] inherit
ACTIVE '/dev/nelson/swap_1' [6.07 GiB] inherit


I had two physical volumes, sdb and sda1. 'sdb' held my old sbuild snapshots, and also some free space. So, I deleted the sbuild snapshots with:


$ sudo lvremove /dev/smoser-vol1/hardy_chroot-i386 \
/dev/smoser-vol1/lucid_chroot-i386 /dev/smoser-vol1/karmic_chroot-i386 \
/dev/smoser-vol1/karmic_chroot-amd64 /dev/smoser-vol1/lucid_chroot-amd64 \
/dev/smoser-vol1/hardy_chroot-amd64


Then, I resized the 'smlv0' volume holding '/archive' up to the largest size that physical volume allowed:


$ sudo vgdisplay smoser-vol1
VG Name smoser-vol1
System ID
Format lvm2
<snip>
VG Size 931.51 GiB
...
$ sudo lvresize /dev/smoser-vol1/smlv0 --size 931.51G
Rounding up size to full physical extent 931.51 GiB
Extending logical volume smlv0 to 931.51 GiB
Logical volume smlv0 successfully resized


Then, just resize the ext4 filesystem on that volume:

$ grep archive /proc/mounts
/dev/mapper/smoser--vol1-smlv0 /archive ext4 rw,relatime,barrier=1,data=ordered 0 0
$ sudo resize2fs /dev/mapper/smoser--vol1-smlv0
resize2fs 1.41.11 (14-Mar-2010)
Filesystem at /dev/mapper/smoser--vol1-smlv0 is mounted on /archive; on-line resizing required
old desc_blocks = 37, new_desc_blocks = 59
Performing an on-line resize of /dev/mapper/smoser--vol1-smlv0 to 244190208 (4k) blocks.

The filesystem on /dev/mapper/smoser--vol1-smlv0 is now 244190208 blocks long.


That last operation took about 30 minutes, but in the end, I now have:


$ df -h /archive/
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/smoser--vol1-smlv0
917G 544G 327G 63% /archive

Wednesday, November 3, 2010

Using Ubuntu Images on AWS "Free Tier"

[Update 2011-01-20]


There are now official Ubuntu AMIs that fit into the Free Tier disk requirements. You can get a list of the AMIs for 10.04 or 10.10.

This article is still useful as documentation, but is not necessary if you only want to use Ubuntu on Amazon's Free Tier.

Amazon AWS recently announced an AWS Free Usage Tier. The summary of which is that new AWS customers can run a t1.micro instance 24x7 for the next year and pay nothing (or at least very little).

There are various restrictions on what you get for free, but the restriction most relevant to the Ubuntu images is:
10 GB of Amazon Elastic Block Storage, plus 1 million I/Os, 1 GB of snapshot storage, 10,000 snapshot Get Requests and 1,000 snapshot Put Requests*

The official Ubuntu images have a 15GB root filesystem. That means that if you're using any of our official images (10.04, 10.10), you will be charged for 5GB of provisioned storage per month. In the us-east-1 region, that would be $0.50/month; in other regions, $0.55/month.
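
For the arithmetic-minded, the charge is just the provisioned gigabytes beyond the free allowance times the per-gigabyte EBS price. A back-of-envelope sketch (the $0.10/GB-month us-east-1 price is an assumption inferred from the $0.50 figure):

```shell
# Extra EBS cost of a 15GB root filesystem vs the 10GB free allowance.
extra_gb=5               # 15GB root filesystem minus the 10GB free allowance
price_per_gb_month=0.10  # assumed us-east-1 EBS price, in $/GB-month
awk -v gb="$extra_gb" -v p="$price_per_gb_month" \
    'BEGIN { printf "$%.2f/month\n", gb * p }'
# → $0.50/month
```
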

This issue has been raised on the AWS Discussion Forums, but it seems like Amazon is not willing to budge.

Similarly, bug 670161 was opened requesting "10GB root partition for EBS boot AMIs on EC2". If you're interested in following this discussion, subscribe yourself to that bug. I will make sure that it is kept up to date.

I don't want to comment right now on whether or not we will release future EBS root AMIs of 10.04 and 10.10 with a 10GB filesystem instead of a 15GB one. What I do want to discuss is how you can create your own AMI with a 10GB (or smaller) root filesystem that otherwise performs identically to the official images.

If you want to use Ubuntu on the Amazon Free Tier *right now*, then you can follow these instructions, which assume you have the ec2-api-tools correctly configured on your laptop and a keypair named "mykey" available in the target region.

In the shell snippets below, a '$' prompt indicates a command run on my laptop, a '%' prompt indicates a command run on the EC2 instance, and lines beginning with '#' are comments.

Launch an instance to work with:

# us-east-1 ami-548c783d canonical ebs/ubuntu-maverick-10.10-amd64-server-20101007.1
$ ec2-run-instances --region us-east-1 --instance-type t1.micro \
--key mykey ami-548c783d
$ iid=i-1855ea75
$ zone=$(ec2-describe-instances $iid |
awk '-F\t' '$2 == iid { print $12 }' iid=${iid} )
$ echo ${zone}
us-east-1d
$ host=$(ec2-describe-instances $iid |
awk '-F\t' '$2 == iid { print $4 }' iid=${iid} )
$ echo ${host}
ec2-174-129-61-12.compute-1.amazonaws.com
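
For the curious, the awk above is just pulling columns out of ec2-describe-instances' tab-separated output: on the INSTANCE line, field 2 is the instance id, field 4 the public hostname, and field 12 the availability zone. A sketch with a fabricated, abbreviated record:

```shell
# Fabricated INSTANCE record (field values made up for illustration) showing
# the field positions the awk invocations above rely on.
line=$(printf 'INSTANCE\ti-1855ea75\tami-548c783d\tec2-174-129-61-12.compute-1.amazonaws.com\tip-10-1-2-3.ec2.internal\trunning\tmykey\t0\t\tt1.micro\t2010-11-03T17:00:00+0000\tus-east-1d')
echo "$line" | awk '-F\t' '$2 == iid { print $12 }' iid=i-1855ea75   # → us-east-1d
echo "$line" | awk '-F\t' '$2 == iid { print $4 }' iid=i-1855ea75    # → ec2-174-129-61-12.compute-1.amazonaws.com
```
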


Create a volume of the desired size in the correct zone and attach it to the instance. Change '10' to '5' if you want a 5GB root filesystem.

$ ec2-create-volume --size 10 --availability-zone ${zone}
$ vol=vol-c64d55af
$ ec2-attach-volume --instance ${iid} --device /dev/sdh ${vol}


Then, ssh to ubuntu@${host}, and download the uec reference image and extract it. Below, I've downloaded the i386 image for maverick. You could browse http://uec-images.ubuntu.com/releases/10.10/release/ to find an amd64 image or a 10.04 base image.

% sudo chown ubuntu:ubuntu /mnt
% cd /mnt
% url=http://uec-images.ubuntu.com/releases/10.10/release/ubuntu-10.10-server-uec-i386.tar.gz
% tarball=${url##*/}
% wget ${url} -O ${tarball}
% tar -Sxvzf ${tarball}
maverick-server-uec-i386.img
maverick-server-uec-i386-vmlinuz-virtual
maverick-server-uec-i386-loader
maverick-server-uec-i386-floppy
README.files
% img=maverick-server-uec-i386.img
% mkdir src target


Create the target filesystem, mount the attached volume, and copy the source filesystem contents to the target filesystem using rsync.

% sudo mount -o loop,ro ${img} /mnt/src
% sudo mkfs.ext4 -L uec-rootfs /dev/sdh
% sudo mount /dev/sdh /mnt/target
# the rsync could take quite a while. for me it took 22 seconds.
% sudo rsync -aXHAS /mnt/src/ /mnt/target
% sudo umount /mnt/target
% sudo umount /mnt/src


Now, back on the laptop, snapshot the volume.

$ ec2-create-snapshot ${vol}
$ snap=snap-b97dfdd3
# now you have to wait for snapshot to be 'completed'
$ ec2-describe-snapshots ${snap}
SNAPSHOT snap-b97dfdd3 vol-c64d55af completed 2010-11-03T17:31:52+0000 100% 950047163771 10
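
That wait can be scripted. Here's a small polling helper (a sketch; the ec2-describe-snapshots usage on the last line is illustrative) that retries a command until its output contains a pattern:

```shell
# Poll a command until its output contains a pattern, or give up after a
# number of tries. Sleeps 5 seconds between attempts.
wait_for() {
    local pattern=$1 tries=$2; shift 2
    local i=0
    while [ "$i" -lt "$tries" ]; do
        "$@" | grep -q "$pattern" && return 0
        i=$((i+1))
        sleep 5
    done
    return 1
}
# e.g.: wait_for completed 60 ec2-describe-snapshots ${snap}
```
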


Turn the contents of that volume into an AMI. Note: you must set 'arch', 'rel', and 'region' correctly. We then use that information to get the AKI (kernel id) associated with the most recently released Ubuntu image.


$ rel=maverick; region=us-east-1; arch=i386; # arch=amd64
$ [ $arch = amd64 ] && xarch=x86_64 || xarch=${arch}
$ qurl=http://uec-images.ubuntu.com/query/${rel}/server/released.current.txt
$ aki=$(curl --silent "${qurl}" |
awk '-F\t' '$5 == "ebs" && $6 == arch && $7 == region { print $9 }' \
arch=$arch region=$region )
$ echo ${aki}
aki-407d9529
$ ec2-register --snapshot ${snap} \
--architecture=${xarch} --kernel=${aki} \
--name "my-ubuntu-${rel}" --description "my-ubuntu-${rel}"
IMAGE ami-4483742d
$ ami=ami-4483742d


Clean up your instance and volume

$ ec2-detach-volume ${vol}
$ ec2-terminate-instances ${iid}
$ ec2-delete-volume ${vol}


And now run your instance

$ ec2-run-instances --instance-type t1.micro ${ami}
$ ssh ubuntu@
% sudo apt-get update && sudo apt-get dist-upgrade
# if you got a new kernel (linux-virtual package), then you will
# need to reboot
% sudo reboot


Now, your newly created image has filesystem contents that are identical to those of the official Ubuntu images, but with a 10G filesystem.

Once you've launched your new instance, you can actually deregister the AMI and delete the snapshot it was created from. To do that:

ec2-deregister ${ami}
ec2-delete-snapshot ${snap}


The cost of the above operations will probably be on the order of pennies, and will remove the costs you would have incurred from a 15GB root volume.

create an image with an XFS root filesystem from UEC images

A post was made to the ec2ubuntu Google group asking if Ubuntu had any plans to create images with XFS root filesystems.

The Official Ubuntu Images for 10.04 LTS (lucid) and prior have an ext3 root filesystem. For Ubuntu 10.10 (maverick) and the development builds of 11.04, the filesystem is ext4.

The images use the filesystem selected by default on an Ubuntu install from CD or DVD; that selection is carried over to our Ubuntu images for UEC and EC2. The 10.04 images really should have been ext4, but the change didn't make it in for that release.

Ubuntu fully supports the XFS filesystem; it simply wasn't chosen as the default. The -virtual kernel has XFS support available as a module, and the xfsprogs package is in the main archive.

So, just as you can get full support for the Ubuntu images using ext4, you can get full support from Ubuntu (and paid support from Canonical) by using xfs as your root filesystem. You will simply have to create your own images.

Luckily, primarily because the Ubuntu images are downloadable at http://uec-images.ubuntu.com, the process for creating an XFS-based EBS image is trivial.

In the shell snippets below, a '$' prompt indicates a command run on my laptop, a '%' prompt indicates a command run on the EC2 instance, and lines beginning with '#' are comments.

Launch an instance to work with:

# us-east-1 ami-688c7801 canonical ubuntu-maverick-10.10-amd64-server-20101007.1
$ ec2-run-instances --region us-east-1 --instance-type m1.large \
--key mykey ami-688c7801
$ iid=i-bcc679d1
$ zone=$(ec2-describe-instances $iid |
awk '-F\t' '$2 == iid { print $12 }' iid=${iid} )
$ echo ${zone}
us-east-1d
$ host=$(ec2-describe-instances $iid |
awk '-F\t' '$2 == iid { print $4 }' iid=${iid} )
$ echo ${host}
ec2-174-129-61-12.compute-1.amazonaws.com



Create a volume of the desired size in the correct zone and attach it to the instance.

$ ec2-create-volume --size 10 --availability-zone ${zone}
$ vol=vol-c64d55af
$ ec2-attach-volume --instance ${iid} --device /dev/sdh ${vol}


Then, ssh to ubuntu@${host}; download the uec reference image, extract it, and install the necessary packages:

% sudo chown ubuntu:ubuntu /mnt
% cd /mnt
% url=http://uec-images.ubuntu.com/releases/10.10/release/ubuntu-10.10-server-uec-i386.tar.gz
% tarball=${url##*/}
% wget ${url} -O ${tarball}
% tar -Sxvzf ${tarball}
maverick-server-uec-i386.img
maverick-server-uec-i386-vmlinuz-virtual
maverick-server-uec-i386-loader
maverick-server-uec-i386-floppy
README.files
% img=maverick-server-uec-i386.img
% mkdir src target
% sudo apt-get install xfsprogs


Create the target filesystem, mount the attached volume, and copy the source filesystem contents to the target filesystem using rsync.

% sudo mount -o loop,ro ${img} /mnt/src
% sudo mkfs.xfs -L uec-rootfs /dev/sdh
% sudo mount /dev/sdh /mnt/target
% sudo rsync -aXHAS /mnt/src/ /mnt/target
% sudo umount /mnt/target
% sudo umount /mnt/src


Above, you could have mounted /proc and /sys into /mnt/target, chrooted into it and done a dist-upgrade. I left that out for simplicity.

Now, back on the laptop, snapshot the volume.

$ ec2-create-snapshot ${vol}
$ snap=snap-b97dfdd3
# now you have to wait for snapshot to be 'completed'
$ ec2-describe-snapshots ${snap}
SNAPSHOT snap-b97dfdd3 vol-c64d55af completed 2010-11-03T17:31:52+0000 100% 950047163771 10


Turn the contents of that volume into an AMI. Note: you must set 'arch', 'rel', and 'region' correctly. We then use that information to get the AKI (kernel id) associated with the most recently released Ubuntu image.


$ rel=maverick; region=us-east-1; arch=i386; # arch=amd64
$ [ $arch = amd64 ] && xarch=x86_64 || xarch=${arch}
$ [ $arch = amd64 ] && blkdev=/dev/sdb || blkdev=/dev/sda2
$ qurl=http://uec-images.ubuntu.com/query/${rel}/server/released.current.txt
$ aki=$(curl --silent "${qurl}" |
awk '-F\t' '$5 == "ebs" && $6 == arch && $7 == region { print $9 }' \
arch=$arch region=$region )
$ echo ${aki}
aki-407d9529
$ ec2-register --snapshot ${snap} \
--architecture=${xarch} --kernel=${aki} \
--block-device-mapping ${blkdev}=ephemeral0 \
--name "my-${rel}-xfs-root" --description "my-${rel}-xfs-description"
IMAGE ami-4483742d
$ ami=ami-4483742d


Clean up your instance and volume

$ ec2-detach-volume ${vol}
$ ec2-terminate-instances ${iid}
$ ec2-delete-volume ${vol}


And now run your instance

$ ec2-run-instances --instance-type t1.micro ${ami}


ssh to your instance and verify that the root filesystem is in fact xfs:

% grep uec-rootfs /proc/mounts
/dev/disk/by-label/uec-rootfs / xfs rw,relatime,attr2,nobarrier,noquota 0 0


Now, your newly created image has filesystem contents that are identical to those of the official Ubuntu images.

Some notes on the above:
  • Many people believe that a transition to btrfs as the default filesystem is inevitable, possibly even for the 12.04 LTS release. Doing this on EC2 would require that Amazon release btrfs support in a pv-grub kernel.
  • Outside of creating an 'xfs' filesystem, the steps above are very generic "create a custom EBS root image" instructions. In fact, the process outlined above is used for the actual publishing of EBS images via the ec2-publishing-scripts (see ec2-image2ebs).
  • The process above will work with the maverick-based images. Lucid images are not likely to work out of the box because they do not boot with a ramdisk: where maverick images use pv-grub to load the kernel and ramdisk from inside the image, lucid kernels are loaded by Xen directly, and Canonical did not publish ramdisks for the lucid release.

Friday, October 22, 2010

UDS-N Call for participation

The Ubuntu Developer Summit (UDS) is the event at which the Ubuntu community discusses and plans the upcoming Ubuntu release. UDS Natty begins Monday, October 25th (this Monday) outside of Orlando, FL, USA. If you're in the Orlando area, this event is free and open to anyone.

If you've not yet made plans to attend physically, then it's unlikely that you'll be present in the rooms. However, Canonical IS does an outstanding job of making remote participation possible. For more information on how you can participate remotely, read the Remote Participation document. In short, you join an IRC channel, listen to a live, high-quality audio stream from the room, and can watch edits to a live gobby document.

The comprehensive list of all sessions is available through the summit schedule, or a filtered list of only the 'Cloud track'.

Some of the sessions that I personally am interested (ok, interested *and* leading) in are:
  • cloud-server-n-cloud-images: Here we'll discuss how we can make the Ubuntu Images on EC2 or UEC better. If you've used them, then we're interested in your feedback.
  • cloud-server-n-image-rebundle: Here we'll discuss possible improvements to the rebundling process, i.e. how you can take one of the Ubuntu images, customize it, and turn it into your own AMI. This is a common operation, and unfortunately one that has some sticking points. If you have other ideas for "cloud utilities", this is the place to bring them up.
  • cloud-server-n-desktop-images: Ubuntu has made desktop images available on EC2 in a "tech preview" manner for two releases. We've not fully supported these images, but a version of the cloud images with a remote-desktop interface is a common request. If you want to see what it might look like, try out the free Edubuntu demo using NX.
  • cloud-server-n-cloud-init: Here we'll discuss improvements to cloud-init and cloud-config. If you've customized images via user-data, then you've used cloud-init. How can we make it better?
  • cloud-server-n-ubuntu-trial: We threw together awstrial fairly quickly, and with the 10.10 release, we allowed anyone interested to Try Ubuntu Free. We got a great response, and we see lots of ways this can be used. Come and let us know what you think.

There are loads of other interesting sessions. If 'Server' or 'Cloud' isn't your thing, there are other tracks that might pique your interest.

How to rebundle Ubuntu 10.10 (Maverick Meerkat) EBS root

There was a post on the ec2ubuntu list regarding a problem the poster had re-bundling an Ubuntu 10.10 instance; the poster had followed another blog entry. I responded to the post, but figured that a blog entry might reach a larger audience.

The simple summary of my long-winded post is this:
    To rebundle an EBS root image, use ec2-create-image.

If you've launched one of the Official Ubuntu 10.10 Images and modified it a bit, the best way to create a new AMI with the modifications in it is to:

1.) stop the instance (do not terminate it)
2.) create an image with "ec2-create-image"
3.) wait for your new AMI to become available, then run your new instance

Note that all of the above steps can be done from the EC2 console as well as with the command line tools. I don't think I had previously realised how nice this API call is.
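
Sketched with the command line tools (the instance id and image name below are hypothetical; ec2-create-image prints the new AMI id):

```shell
# The stop/create-image flow from above, wrapped in a function so the steps
# read in order. ec2-stop-instances and ec2-create-image are from the
# ec2-api-tools, assumed configured.
rebundle() {
    local iid=$1 name=$2
    ec2-stop-instances "$iid" &&
    ec2-create-image --name "$name" "$iid"
    # ec2-create-image prints the new AMI id; wait for it to become
    # 'available' (ec2-describe-images) before launching it.
}
# e.g.: rebundle i-1234abcd my-rebundled-maverick
```
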

The reason the title of this post says 'Ubuntu 10.10' rather than just "Ubuntu" is that releases 10.04 and earlier do not utilize pv-grub. If you're re-bundling one of those images, you have to take additional steps to get a kernel upgrade. You really should take those steps: kernel upgrades include important security fixes, and newly created AMIs should always be created with the most recent kernel available.

Friday, October 8, 2010

Try out Ubuntu Server 10.10 on EC2 for FREE!

I'm mostly disappointed that it wasn't my idea. Dustin deserves all the credit. The implementation (awstrial) was a fairly straightforward programming exercise. I don't mean at all to discount the work of the others who contributed to the awstrial project, and I had a blast in my first django experience, but the idea was the brilliance.

What idea? The idea to allow anyone to try out Ubuntu 10.10 on EC2 for free for 55 minutes. No hardware is needed. While you're waiting for your desktop or server ISO to download, you can give the server a test drive. You'll have 170GB of disk, 2GB of memory, and a very high speed internet connection with local LAN access to Ubuntu mirrors.

You do not need an AWS account or a credit card.

Here is how it works:

  • Sometime on Sunday, October 10, 2010 Ubuntu 10.10 (Maverick Meerkat) will be released.
  • At that point, and for a limited time afterwards, you'll be able to go to 10.cloud.ubuntu.com and launch an instance.
  • wait 3 minutes or less
  • ssh to the instance
  • do something
  • 55 minutes after launch, your instance will be terminated

In order to take part in this, you'll need to have a launchpad.net or Ubuntu Single Sign-On account. If you don't have one, then create one. If you're reading this, you probably have ssh keys in a file called ~/.ssh/id_rsa.pub, so go to https://launchpad.net/~YOUR_LOGIN/+editsshkeys and paste those keys in.

Now, back to the 'do something' item above. What should you do?

Here are some suggestions, but I'm certain that you're more creative than I am.
  • Check if a bug you opened (or were annoyed by) in 10.04 LTS is still present. If it was fixed, then make sure the bug in launchpad is marked 'Fix Released'.
  • Take Postgres 9.0 for a spin, thanks to Martin Pitt's PPA builds
  • run 'rm -Rf /' just to see what happens.
  • Find something that is broken. Open a bug, using 'ubuntu-bug'.
  • Hack on awstrial

Whatever you do, I hope you enjoy taking the "Official Ubuntu Image" for a spin. Do something cool, blog about it, tell people how easy it was.

Monday, September 27, 2010

Using Policies in AWS Identity and Access Management

For a project I'm working on here at work, I wanted to create an AWS user that could only launch instances, but could not write to S3, query SDB, and so on. I ended up with the following policy:

{
 "Statement":[ {
    "Effect":"Allow",
    "Action":["ec2:RunInstances","ec2:RebootInstances",
              "ec2:GetConsoleOutput", "ec2:DescribeInstances" ],
    "Resource":"*"
  }, {
    "Effect":"Allow",
    "Action":["ec2:StopInstances","ec2:StartInstances",
              "ec2:TerminateInstances"],
    "Resource":"*"
  }, {
    "Effect":"Allow",
    "Action":["ec2:DescribeImages"],
    "Resource":"*"
  }, {
    "Effect":"Deny",
    "NotAction":["ec2:*"],
    "Resource":"*"
  } ]
}
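
As a sketch, a policy file like the above can be attached to a user with the IAM command line tools (iam-useruploadpolicy is the per-user analogue of the iam-groupuploadpolicy command shown elsewhere on this blog; the policy and file names here are hypothetical):

```shell
# Attach a policy file to an IAM user. The iam-useruploadpolicy command is
# from the AWS IAM CLI toolkit, assumed configured as in the later post.
attach_user_policy() {
    local user=$1 policy_name=$2 policy_file=$3
    iam-useruploadpolicy -u "$user" -p "$policy_name" -f "$policy_file"
}
# e.g.: attach_user_policy foouser launch-only launch-only.json
```
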


I had hoped that I could allow this user to create his own security groups and keypairs (for launching instances with 'ec2-run-instances --key'), and that I could also allow him to modify or delete those items. Unfortunately, I was not able to figure out how to do this. What I had hoped I could do was something like:
{
  "Effect":"Allow",
  "Action":["ec2:*SecurityGroup*"],
  "Condition" :  {
     "StringLike": {
        "ec2:groupName":"foouser*"
     }
  },
  "Resource":"*"
},
{
  "Effect":"Allow",
  "Action":["ec2:*KeyPair*"],
  "Condition" :  {
     "StringLike": {
        "ec2:keyName":"foouser*"
     }
  },
  "Resource":"*"
}

The 'keyName' is an attribute of the [Create,Delete,Describe]KeyPair API calls, and 'groupName' is an attribute of the [Create,Delete]SecurityGroup and AuthorizeSecurityGroupIngress API calls, as described in the EC2 API documentation.

My goal was to limit the user ('foouser') to manipulating SecurityGroups or KeyPairs whose names begin with 'foouser'. This would also make ownership clear to other users of the account when they came across them.

However, the 'Condition' syntax isn't as "open" as that (I couldn't think of a better term than 'open'). I can think of reasons why it would be difficult or undesirable to make Conditions work the way I wanted, but it would have been nice.

The IAM EC2 documentation indicates that EC2 only supports the following condition types: aws:CurrentTime, aws:EpochTime, aws:SecureTransport, aws:SourceIp, and aws:UserAgent.

It seems to me that SecurityGroups and keypairs are an essential piece of using EC2, yet they are stuck at the account level, with no ability to limit them at the user level.

Another thing that I would like to do is give a user the ability to launch / stop / start / terminate her own instances, but not those of other users of the account. If I truly try to use IAM to split up my account usage, say with 'Development' and 'Production' users or groups, this is essential: when I use the 'Development' user, I want to be protected from an accidental reboot or terminate of a 'Production' instance.

For example, I test our official Ubuntu images. The testing scripts launch several instances. While they're running, I'll often be doing development and have an instance of my own running. I would like my 'development' work to not accidentally terminate (or otherwise affect) my test runs. As it is right now, a 'euca-describe-instances' will show me all instances in the account, just waiting for me to copy and paste wrong and terminate one.

It is quite possible that I've missed something; if so, please let me know.

Wednesday, September 22, 2010

Playing with AWS Access Identity Management

Today I finally found some time to play with AWS Identity and Access Management (IAM). If you haven't seen the announcement, or aren't familiar with it, the IAM tools basically allow you to create, manage, and limit multiple users under a single AWS account.


There are two reasons that immediately spring to mind for when you should use this:

  • If you're sharing a single AWS account between multiple people, then using this is almost required.
  • You want to use some AWS facility from inside an EC2 instance. Here, it just seems scary to put the entire keys to your account onto a remote machine.
To get started, I walked through the Getting Started guide. I downloaded the IAM Tools, and set them up as described. On Ubuntu, that consisted of:



$ wget http://awsiammedia.s3.amazonaws.com/public/tools/cli/latest/IAMCli.zip
$ unzip IAMCli.zip
$ vi my-account-creds.txt
$ cat my-account-creds.txt
AWSAccessKeyId=ABCDEFGHIJKLMNOPQRST
AWSSecretKey=zyxwvutsrqponmlkjihgfedcbazyxwvutsrqponm
$ export AWS_CREDENTIAL_FILE=my-account-creds.txt
$ export AWS_IAM_HOME=$PWD/IAMCli
$ export PATH=$AWS_IAM_HOME:$PATH JAVA_HOME=/usr


Then, I created a user and admin group as described in the guide:


$ iam-groupcreate -g admins
$ cat AdminGroupPolicy.txt
{
   "Statement":[{
      "Effect":"Allow",
      "Action":"*",
      "Resource":"*"
      }
   ]
}

$ iam-groupuploadpolicy -g admins -p AdminGroupPolicy -f AdminGroupPolicy.txt
$ iam-usercreate -u smoser -g admins -k -v
TSRQPONMLKJIHGFEDCBA
mnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz
arn:aws:iam::950047163771:user/smoser
AHDAIABBGZ3Q31XMUE4AN


The first line of the iam-usercreate output is the AWSAccessKeyId, and the second is the AWSSecretKey. I quickly added those to a file my-user-creds.txt (as shown above) and set AWS_CREDENTIAL_FILE=my-user-creds.txt.

That's all it took. Now I have a set of credentials that I can use, and if they're lost or stolen, I can revoke them with the (now safely locked up) account credentials.

At this point, I could use the euca2ools with a config file like:

$ cat my-user-eucarc
AWSAccessKeyId=ABCDEFGHIJKLMNOPQRST
AWSSecretKey=mnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz
EC2_SECRET_KEY=${AWSSecretKey}
EC2_ACCESS_KEY=${AWSAccessKeyId}
EC2_USER_ID=950047163771
EC2_URL=https://ec2.amazonaws.com
S3_URL=https://s3.amazonaws.com:443
EC2_CERT=/etc/ec2/cert-ec2.pem
$ euca-describe-instances --config my-user-eucarc


The above will also suffice as an AWS_CREDENTIAL_FILE for the iam-tools.

That's great, but for one reason or another I end up using the ec2-api-tools for much of my work. Those tools require a private key and certificate, so I had to go about creating them for my new user. Thanks to Nate@AWS in an EC2 Forum Post, that was easy also.


$ openssl version
OpenSSL 0.9.8o 01 Jun 2010
$ openssl genrsa 1024 > pk.pem
$ openssl req -new -x509 -nodes -sha1 -days 730 -key pk.pem -out cert.pem
# follow prompts here
$ iam-useraddcert -u smoser -f cert.pem
$ export EC2_PRIVATE_KEY=$PWD/pk.pem EC2_CERT=$PWD/cert.pem
$ ec2-describe-instances ...


Now my ec2-api-tools are functional. I have to admit to not completely understanding the implications of self-signing a certificate here and uploading it. However, as I was authenticated to do the upload (via https and the given credentials) and only my user will use that signing key, I don't know what harm there could be.

Now I have the following TODOs:
  • Post about creating an IAM Policy
  • package the IAM tools for Ubuntu multiverse

Updates:

  • 2010-10-12: update case in AWS_IAM_HOME string ('IamCli' -> 'IAMCli')

Thursday, September 9, 2010

running Ubuntu on an Amazon "micro" instance

Amazon announced today a new instance type. The "micro" instance type (t1.micro) has 613 MB of memory and can be used with AMIs of either x86 or x86_64. The cost for either is only $0.02 per hour.

That means that you can try out Ubuntu on EC2 for 2 measly cents. The official images of Ubuntu 10.04 LTS (Lucid Lynx) are perfect for this. If you're more adventurous and want to try out 10.10, please try out a daily build.

I think this smaller instance size really makes EC2 available to a lot more people. The minimum price for running an instance for a month goes from something like $70 (31 * 24 * 0.095) to less than $15. If you add to that the fact that you can shut down the instance and not be charged for unused CPU time, then start it back up when you need it, it can be really cheap. If you happen to need more CPU power, you can even shut it down, and start it back up with more resources!

So what are you waiting for?

$ MY_LAUNCHPAD_ID="smoser"
$ printf "#cloud-config\n%s\n" "ssh_import_id: [${MY_LAUNCHPAD_ID}]" > user-data.txt
$ ec2-run-instances --region us-east-1 --user-data-file=user-data.txt ami-1234de7b
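
For reference, the printf produces a two-line cloud-config document; this is what cloud-init receives as user-data at boot:

```shell
# What user-data.txt contains (the launchpad id is mine; substitute your own):
MY_LAUNCHPAD_ID="smoser"
printf "#cloud-config\n%s\n" "ssh_import_id: [${MY_LAUNCHPAD_ID}]"
# → #cloud-config
# → ssh_import_id: [smoser]
```
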


A few things to note:
  • Bug 634102 has to be worked around or you will not be able to reboot your instance. There are cloud-init debs available in my personal PPA with fixes. 10.10 won't have the issue, and I'll work on getting the fix backported to 10.04.  You can easily fix that by running:

    arch=$(uname -m)
    [ "$arch" = "x86_64" ] && ephd=/dev/sdb || ephd=/dev/sda2
    sudo sed -i.dist "\,${ephd},s,^,#," /etc/fstab

  • One interesting feature of this instance type is that it will run images that are x86 or x86_64 arch. Previously each instance type only ran a single arch.
  • Above, I launched the instance with cloud-config syntax that pulls in my keys from launchpad rather than using public ssh keys stored in EC2. That's why I didn't need to pass '--key <mykey>'.

edits

  • I fixed the ami id above; I had listed an instance-store image. Also, note that with t1.micro instances you *have* to use EBS images.
  • I fixed the 'sed' command above, and fixed the price for m1.small (0.095, not 0.95)

Wednesday, July 28, 2010

Verify SSH Keys on EC2 Instances

Like every server, every EC2 instance should have a unique ssh host key fingerprint. On "real servers" this key is generated at first installation of the openssh-server package. On EC2, instead, it is generated on first boot of an instance, because each instance starts as a byte-for-byte copy of a registered image.

What this means to you is that when you launch an instance and then connect with ssh, you'll see something like:


$ ssh -F /tmp/smoser/foo ec2-67-202-47-56.compute-1.amazonaws.com
The authenticity of host 'ec2-67-202-47-56.compute-1.amazonaws.com (67.202.47.56)' can't be established.
RSA key fingerprint is f1:40:a7:4e:0f:28:8d:12:21:59:f1:ff:03:5f:63:54.
Are you sure you want to continue connecting (yes/no)?


The ssh client is informing you that you are connecting to a host that you do not have ssh keys stored for. In short, it cannot confirm the identity of 'ec2-67-202-47-56'. There could be a "man in the middle" attempting to trick you. Just as with "real servers", you should verify that remote system via an out-of-band method. Outside of EC2, you might call a hosting provider and ask them to verify the fingerprint that you see. On EC2, the only out-of-band transport is the EC2 console.

In order to provide you with the fingerprint that you need, the ssh host key fingerprints are written to the console when the instance first boots. You can see this with ec2-get-console-output.

As seen in the results of Eric's poll on alestic.com, this is a little known and rarely used piece of information. Over 50% of alestic.com voters have "never verified the fingerprint".


$ euca-get-console-output i-72bf1518 | grep ^ec2:
ec2:
ec2: #############################################################
ec2: -----BEGIN SSH HOST KEY FINGERPRINTS-----
ec2: 2048 f1:40:a7:4e:0f:28:8d:12:21:59:f1:ff:03:5f:63:54 /etc/ssh/ssh_host_rsa_key.pub (RSA)
ec2: 1024 28:f3:ef:a6:86:05:50:33:76:16:24:32:56:14:06:13 /etc/ssh/ssh_host_dsa_key.pub (DSA)
ec2: -----END SSH HOST KEY FINGERPRINTS-----
ec2: #############################################################


Note that the ssh fingerprint reported on the console matches the one that ssh client asked me to confirm above. So, I now know that the host I've connected to is the one that I just started.

Putting this all together, let's say you have booted a new EC2 instance, with instance-id i-72bf1518 and hostname ec2-67-202-47-56.compute-1.amazonaws.com.

First, we will use ssh-keyscan to get the fingerprint reported by the remote host, and store it in a shell variable 'fp':


$ iid=i-72bf1518
$ ihost=ec2-67-202-47-56.compute-1.amazonaws.com

$ ssh-keyscan ${ihost} 2>/dev/null > ${iid}.keys
$ ssh-keygen -lf ${iid}.keys > ${iid}.fprint
$ read length fp hostname id < ${iid}.fprint
$ echo $fp
f1:40:a7:4e:0f:28:8d:12:21:59:f1:ff:03:5f:63:54



This fingerprint should also appear on the console output of the instance. If it doesn't, then something is wrong. So, we'll get the console output, and grep through it looking for the fingerprint:


$ euca-get-console-output ${iid} > ${iid}.console
$ grep "ec2: ${length} ${fp}" ${iid}.console
ec2: 2048 f1:40:a7:4e:0f:28:8d:12:21:59:f1:ff:03:5f:63:54 /etc/ssh/ssh_host_rsa_key.pub (RSA)


We've now verified that the host we're connecting to is the host we just launched, so we can connect safely. Now you can clean out any old occurrences of that host in known_hosts and tell the ssh client that this is a "known host".


# remove existing entries in ~/.ssh/known_hosts for this host
$ ssh-keygen -R "${ihost}"

# hash the hostnames in the keys file. This prevents someone
# from reading known_hosts as a simple list of remote hosts you
# have access to, in the event that one of your keys is compromised.

$ ssh-keygen -H -f ${iid}.keys

# Add the key to your known_hosts
$ cat ${iid}.keys >> ~/.ssh/known_hosts

# remove the temporary files we created
$ shred -u "${iid}."*


There, we've now verified that the remote host is the instance we started and told the ssh client about it.
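The fingerprint comparison above can be bundled into a small helper function. This is just a sketch of mine, not part of any published tool; the function name and file arguments are made up:

```shell
# Check that every fingerprint in a keys file (as written by ssh-keyscan)
# appears in a saved console log. Returns non-zero if any are missing.
console_has_fingerprints() {
    keys="$1" console="$2" status=0
    while read length fp rest; do
        grep -q "ec2: ${length} ${fp}" "${console}" || {
            echo "missing fingerprint: ${length} ${fp}" 1>&2
            status=1
        }
    done <<EOF
$(ssh-keygen -lf "${keys}")
EOF
    return ${status}
}

# usage, with the files gathered earlier:
#   console_has_fingerprints ${iid}.keys ${iid}.console && echo verified
```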

Unfortunately, console output on EC2 is only updated approximately every 4 minutes, so you can't run through this process until console output is available to check.
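Since the console lags like this, a small polling helper saves re-running the command by hand. A sketch of mine; the 15-second interval and 20-try cap are arbitrary choices:

```shell
# Repeatedly run a command until its output matches a pattern, or give up.
# Usage: wait_for_match <pattern> <command> [args...]
wait_for_match() {
    pattern="$1"; shift
    tries=20                      # ~5 minutes at 15 seconds per try
    while [ ${tries} -gt 0 ]; do
        "$@" | grep -q "${pattern}" && return 0
        tries=$((tries - 1))
        sleep 15
    done
    return 1
}

# e.g. wait until the host key fingerprints show up on the console:
#   wait_for_match "BEGIN SSH HOST KEY FINGERPRINTS" \
#       euca-get-console-output ${iid}
```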

Updates
  • [2010-09-22]: fix mismatched use of 'iid' and 'ihost'

Thursday, May 20, 2010

Easily test or demo Ubuntu Enterprise Cloud (UEC) in EC2 instance

Consider the following items:
  • recent efforts by Robert Collins, Dustin Kirkland and others have enabled running all components of UEC on a single system.
  • while performance suffers greatly, Eucalyptus can easily be made to use qemu rather than kvm for virtualization without hardware virtualization extensions
  • Ubuntu server images are quickly launchable on EC2

Now, re-read those, but do so while thinking about wanting to provide people with an easy way to test UEC.

That's right: if you have $0.40 per hour and an EC2 account, you can play with UEC. No real hardware required.

This nested virtualization is obviously not going to provide you with the world's fastest-performing cloud, but it will provide a functional system for test or demo purposes.

I've put some copy-and-paste shell code in commands.txt in a bzr branch uec-on-ec2.

  • check out the bzr branch: lp:~smoser/+junk/uec-on-ec2

    bzr branch lp:~smoser/+junk/uec-on-ec2

  • follow 'commands.txt', copy and pasting its content bit by bit in a root shell (run 'sudo -s').
  • Publish and run an instance (do this as 'ubuntu' user):

    $ uec-publish-tarball fastboot-amd64-0.11.tar.gz fastboot-amd64-0.11 amd64
    # the above creates an emi, see its output
    $ ( umask 066 ; euca-add-keypair mykey > mykey.pem )
    $ euca-run-instances --addressing private --key mykey emi-4D6C12BF
    # soon this will enter 'running' state and have an IP address associated with it.
    $ ssh -i mykey.pem ubuntu@${IPADDR}


Your instance should now be functional.

There are a couple things that could be cleaned up on this, patches are very welcome:
  • ideally all of the setup could be done from a '#!' user-data script; I've just not worked out all the timing yet
  • separate the node out, allowing for multiple nodes
  • fix the requirement of private addressing for functional nodes
  • fix the issue with the NC eventually being discovered 3 times (once for each of its 3 IP addresses)

That said, you should be able to try out UEC on EC2 in an m1.large for less than $0.40 per hour.

Sunday, May 9, 2010

UDS Maverick: Call for Participation

The Ubuntu Developer Summit (UDS) is the event in which the Ubuntu community discusses and plans the upcoming Ubuntu release. UDS Maverick begins Monday, May 10 (tomorrow) in Brussels.

If you've not yet made plans to attend physically, then it's unlikely that you'll be present in the rooms. However, the Canonical IS team does an outstanding job of making remote participation possible. For more information on how you can participate remotely, read the Remote Participation document. In short, you join an IRC channel, listen to a live, high quality audio stream from the room, and can see edits to a live gobby document.

The comprehensive list of all sessions is available through the summit schedule, or a filtered list of only the Ubuntu Server sessions.

Below is a short, self-centered list of sessions that you might find interesting. The Ubuntu community would love to have your participation.

  • Running cloud images outside UEC or EC2 (Tuesday 11:00 UTC+1): The UEC images that we produce to run in EC2 or UEC are ready-to-go filesystem images of Ubuntu Server. It seems that these images might also serve a more general purpose as a live demo of Ubuntu Server, or a very convenient starting point for customizing your own. Here we'll discuss other ways these images could be used, and what would need to be done to make that possible.
  • Improvements for cloud-init (Tuesday, 09:00 UTC+1) : This session will cover ways in which the cloud images can be made more user friendly. If you've ever booted one of the Ubuntu images, or re-bundled one, I'd like to know what we could do to the images to make that easier.
  • Handling kernel upgrades in EC2 and UEC (Wednesday 11:00 UTC+1): When a user starts a UEC/EC2 instance, they have the option of specifying the kernel/ramdisk to use, or of using the default associated with the image. Afterwards, the instance has no ability to modify that initial selection. We'll discuss ways that we could improve the user experience by making that limitation more clear, or possibly providing ways to overcome it.
  • Improve cloud libraries in Ubuntu (Thursday, 10:00 UTC+1): Throughout the 10.04 release, we packaged some popular libraries for interacting with AWS. We'd like to continue that work in Maverick. If you have suggestions for libraries you use that are not present in Ubuntu, please let us know.
  • Utilities for easier interaction with UEC or EC2 (Tuesday, 10:00 UTC+1): If you have ideas on utilities that would make your life using Ubuntu on EC2 or UEC easier, please attend this session and let us know.
  • server-maverick-conffiles-and-puppet: This should be an interesting session discussing how Ubuntu can improve the management of conffiles. Soren Hansen has sent an email with more information.
  • Discussion on plans for vmbuilder (17:10 UTC+1): Many people have used vmbuilder to build virtual machine images. In this session we'll discuss where vmbuilder is and where it is going.

If you have input for any session, but are not able to attend in real time, feel free to send me input at smoser at sign ubuntu dot com, and I'll try to make sure it gets brought up.

Oh, and if you're reading this on Sunday, May 9, don't forget Mother's Day.

Update 2010-05-10: Added times for sessions and used the titles rather than blueprint names so you can find them on the schedule

Wednesday, May 5, 2010

Ubuntu 10.04 LTS - Lucid Lynx available in all EC2 Regions

I'm a bit late to the party. Ubuntu 10.04 LTS (Lucid Lynx) was released almost a week ago at this point. Multitudes of others have blogged about it, reviewed it, and used it.

That said, I wanted to announce it here, and point out a few things. To get a list of AMI ids for Ubuntu 10.04 LTS Server, look at either:

Now, some miscellaneous items I wanted to mention:
  • 10.04 LTS was released on EC2 in all 4 regions (us-east-1, us-west-1, eu-west-1, and the new ap-southeast-1) at the same time as all other Ubuntu releases. The 10.04 LTS images were available on the ap-southeast-1 region less than 24 hours after it was officially announced.
  • The published EC2 images do not have a ramdisk associated with them. This is by design. The kernel has enough smarts built in to find the root filesystem and boot the system. The end result is that we only have to update and manage 2 pieces instead of 3, and the instances should boot faster.
  • In addition to 10.04 images, we also populated the ap-southeast-1 region with the latest released versions of 8.04 and 9.10 images.

Thursday, April 22, 2010

Upgrading an EBS Instance

Update 20110323: If you are reading this article, you almost certainly should be reading my Migrating to pv-grub kernels for kernel upgrades. The process described here will still work for 10.04 images, but the process described there is ultimately much easier. If you are using 10.10, or 11.04 images, you do not need to do anything, simply 'apt-get update && apt-get dist-upgrade && reboot' to get a new kernel.

For the majority of the existence of EC2 there was no way to change the kernel that an instance was using. With the addition of EBS instances, that changed. I wanted to explain how you can take advantage of that EBS feature by upgrading a Ubuntu 10.04 LTS (Lucid Lynx) instance launched from a Beta-2 AMI to the Release Candidate. This same basic process should also allow you to upgrade across a release, perhaps from a 9.04 Alestic instance to Ubuntu 10.04 LTS.

In reality, if you're hoping to upgrade your kernel on an EBS instance, it's because you already have one running and need to upgrade. But for the sake of this exercise, we'll launch a new instance based on the Beta-2 image in the us-east-1 region and then connect to it.

$ ec2-run-instances --key mykey ami-4be50b22
# wait a bit
$ ec2-describe-instances | awk '-F\t' '$1 == "INSTANCE" { print $4 }'
ec2-184-73-101-171.compute-1.amazonaws.com
$ ssh -i mykey.pem ubuntu@ec2-184-73-101-171.compute-1.amazonaws.com

Now, on the instance we'll go ahead and do the upgrade. Whenever I'm working on EC2, I like to use GNU screen to protect against a lost network connection.
% screen -S upgrade

This is the same basic process as upgrading any Ubuntu system: first update, then upgrade. Here, I've used '--dry-run' to point out that there would be kernel upgrades.

Also, the EC2 images suffer from a bug where grub will be installed and prompt you for some information even though it's of no value. Because I know what those prompts will be, I'll go ahead and set the answers here so you're not interrupted during the dist-upgrade.

% sudo apt-get update
% sudo apt-get dist-upgrade --dry-run | grep "linux.*ec2"
  libparted0debian1 linux-image-2.6.32-21-virtual linux-image-2.6.32-305-ec2
  libpolkit-gobject-1-0 libpython2.6 libss2 libudev0 linux-ec2 linux-firmware
  linux-image-ec2 linux-image-virtual linux-virtual locales module-init-tools
Inst linux-image-2.6.32-305-ec2 (2.6.32-305.9 Ubuntu:10.04/lucid)
Inst linux-ec2 [2.6.32.304.5] (2.6.32.305.6 Ubuntu:10.04/lucid) []
Inst linux-image-ec2 [2.6.32.304.5] (2.6.32.305.6 Ubuntu:10.04/lucid)
Conf linux-image-2.6.32-305-ec2 (2.6.32-305.9 Ubuntu:10.04/lucid)
Conf linux-image-ec2 (2.6.32.305.6 Ubuntu:10.04/lucid)
Conf linux-ec2 (2.6.32.305.6 Ubuntu:10.04/lucid)

% echo grub-pc grub2/linux_cmdline string | sudo debconf-set-selections
% echo grub-pc grub-pc/install_devices_empty boolean true | sudo debconf-set-selections

% sudo apt-get dist-upgrade
..
94 upgraded, 3 newly installed, 0 to remove and 0 not upgraded.
Need to get 84.1MB of archives.
..

Above, if you were upgrading from a previous release to 10.04, you would use 'do-release-upgrade' from the 'update-manager-core' package.

At this point we've got 2 kernels installed: the new one and the old one. Unsurprisingly, we're booted into the old one.

% dpkg-query --show | grep "linux.*ec2"
linux-ec2       2.6.32.305.6
linux-image-2.6.32-304-ec2      2.6.32-304.8
linux-image-2.6.32-305-ec2      2.6.32-305.9
linux-image-ec2 2.6.32.305.6
% uname -r
2.6.32-304-ec2

Above, we can see that the 2.6.32-305.9 version of the kernel is the newest one. It's installed locally, but to boot it we have to find the aki id of the matching kernel published by Ubuntu to EC2. The Ubuntu kernels are registered in EC2 such that you can correlate the dpkg version to the registered aki. We're going to query all the images, save that output to a file, and then search for results that are owned by the Canonical user and match our version string and arch.

$ owner=099720109477; # this is the canonical user's id
$ ver=2.6.32-305.9; arch=i386
$ ec2-describe-images --all > /tmp/images.list
$ awk '-F\t' '$4 == o && $3 ~ v && $8 == a { print $2, $3 }' \
   a=${arch} "o=${owner}" "v=${ver}" /tmp/images.list
aki-1f02ec76 099720109477/ubuntu-kernels-milestone/ubuntu-lucid-i386-linux-image-2.6.32-305-ec2-v-2.6.32-305.9-kernel
aki-d324caba 099720109477/ubuntu-kernels-testing/ubuntu-lucid-i386-linux-image-2.6.32-305-ec2-v-2.6.32-305.9-kernel

That shows us that we have 2 registered kernels matching that explicit version: one labeled "testing" and one labeled "milestone". These are actually the same thing; we label the kernels differently so the user easily knows which is testing and which is released. A released kernel version will be labeled 'ubuntu-kernels', while an RC kernel version gets labeled 'ubuntu-kernels-milestone'.
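For repeated use, the awk invocation above can be wrapped in a small function. This is just a sketch of mine; the field positions ($2 id, $3 name, $4 owner, $8 arch) are those of the tab-separated ec2-describe-images output used above:

```shell
# Search a saved 'ec2-describe-images --all' listing (tab-separated) for
# images with a given owner, version substring, and arch.
find_kernels() {
    owner="$1" ver="$2" arch="$3" listing="$4"
    awk '-F\t' '$4 == o && $3 ~ v && $8 == a { print $2, $3 }' \
        "o=${owner}" "v=${ver}" "a=${arch}" "${listing}"
}

# usage:
#   ec2-describe-images --all > /tmp/images.list
#   find_kernels 099720109477 2.6.32-305.9 i386 /tmp/images.list
```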

In order to change the kernel, we have to stop the instance, modify the 'kernel' attribute, and then start it up again.
$ ec2-stop-instances i-23453048
$ ec2-modify-instance-attribute i-23453048 --kernel aki-1f02ec76
kernel   i-23453048  aki-1f02ec76

$ ec2-start-instances i-23453048
# wait a bit
$ ec2-describe-instances | awk '-F\t' '$1 == "INSTANCE" { print $4 }'
ec2-184-73-116-205.compute-1.amazonaws.com

So, in theory, we should have booted into our shiny new kernel. Let's test that theory:
$ ssh ubuntu@ec2-184-73-116-205.compute-1.amazonaws.com 'uname -r'
2.6.32-305-ec2

There you have it! This same process can be applied to upgrading from the RC to the release (which likely won't include a kernel change), or, eventually, to upgrading your 10.04 LTS instance to a Maverick one.

Monday, March 29, 2010

Introducing cloud-init's cloud-config syntax

Since this is my first post on a brand new blog, *and* I'm fairly new to the Ubuntu community, I figure I should introduce myself. My name is Scott Moser, I've been a member of the Ubuntu Server team for the past 9 months or so. The majority of my time has been focused on Ubuntu's cloud efforts, both on ec2 and on the Ubuntu Enterprise Cloud (UEC).

That's enough personal introduction; now on to the content.

The cloud-init package provides "first boot" functionality for the Ubuntu UEC images. It is in charge of taking the generic filesystem image that is booting and customizing it for this particular instance. That includes things like:
  • setting the hostname
  • putting the provided ssh public keys into ~ubuntu/.ssh/authorized_keys
  • running a user provided script or otherwise modifying the image
If you have used the Official Ubuntu Images for Hardy or Karmic, you may be aware that the above functionality was previously provided by ec2-init. The cloud-init package is largely a "cloud agnostic" replacement for ec2-init. The AWS-specific portion of the old name didn't fit with UEC, and seemed limiting for the future. We hope to have it working on other cloud offerings as well.

Setting the hostname and configuring a system so the person who launched it can actually log in are not terribly interesting. The interesting things that can be done with cloud-init are made possible by data provided at launch time, called user-data.

ec2-init, cloud-init, and the Alestic images support customization through user-data in one very simple yet effective manner. If the user-data starts with '#!', then it will be stored and executed as root late in the boot process of the instance's first boot (similar to a traditional 'rc.local' script). Output from the script is directed to the console. For example:

$ cat ud.txt
#!/bin/sh
echo ========== Hello World: $(date) ==========
echo "I have been up for $(cut -d\  -f 1 < /proc/uptime) sec"

$ ec2-run-instances ami-a908e7c0 --key mykey.us-east-1 \
   --user-data-file=ud.txt
# wait now for the system to come up and console to be available

$ ec2-get-console-output i-97fc7afc | grep --after-context=1 Hello
========== Hello World: Mon Mar 29 18:05:05 UTC 2010 ==========
I have been up for 28.26 sec

The simple approach shown above gives a great deal of power. The user-data can contain a script in any language where an interpreter already exists in the image (#!/bin/sh, #!/usr/bin/python, #!/usr/bin/perl, #!/usr/bin/awk ... ).

In many cases, the user may not be interested in writing a program. For those cases, cloud-init provides "cloud-config", a configuration-based approach to customization. To utilize the cloud-config syntax, the supplied user-data must begin with '#cloud-config'. For example:

$ cat cloud-config.txt
#cloud-config
apt_upgrade: true
apt_sources:
- source: "ppa:smoser/ppa"

packages:
- build-essential
- pastebinit

runcmd:
- echo ======= Hello World =====
- echo "I have been up for $(cut -d\  -f 1 < /proc/uptime) sec"

$ ec2-run-instances ami-a908e7c0 --key mykey.us-east-1 \
   --user-data-file=cloud-config.txt

Now, when the above system is booted, it will have:
  • added my personal ppa
  • run an upgrade to get all updates available
  • installed the 'build-essential' and 'pastebinit' packages
  • printed a similar message to the script above

The 'runcmd' commands are run at the same point in boot that the '#!' script would run in the previous example. It is present to allow you to get the full power of a scripting language if you need it without abandoning cloud-config.

Note that in this case, the fairly large amount of console output from 'apt-get upgrade' ended up scrolling our 'Hello World' message off the ec2-console buffer, so it didn't appear there. That is something that will need to be addressed in lucid+1.

For more information on what kinds of things can be done with cloud-config, see doc/examples in the source.

cloud-init supports a couple of other formats of user-data which provide more customization possibilities. I hope to write another blog entry covering those formats soon.