Charles Engelke's Blog

December 19, 2012

Learning about AWS CloudFormation

Filed under: Uncategorized — Charles Engelke @ 5:05 pm

I’ve been using Amazon Web Services as the infrastructure for some products for a while now. A big advantage of running in the cloud is being able to automate creating, updating, and destroying servers. So far, we’ve been doing this by writing scripts. Now it’s time to move up to the next level of sophistication and use their CloudFormation service instead. It not only supports automated launching and provisioning of new servers, it also supports automatically creating a whole bunch of interconnected services all at once. And it’s declarative, specifying where you want to end up, instead of procedural, specifying how to get there.

CloudFormation looks pretty simple at first, but I’ve found out that it really isn’t. You need to handle a lot of details, and the documentation isn’t always clear (to me), nor even always complete (as far as I can tell). And there aren’t enough complete examples. So, as I learn about it I’m going to blog about what I discover.

I’ll start with the simplest case I need handled: launching and provisioning a single server. I have to write a template specifying what I want CloudFormation to do, and then use CloudFormation to create the stack defined by that template.

CloudFormation templates are documented in the Working With Templates section of the AWS CloudFormation User Guide. Each template is a JSON document representing an object (basically, key/value pairs). The general format of that JSON object is:

{
   "key1": "value1",
   "key2": "value2",
   ...
   "keyN": "valueN"
}

The order of the key/value pairs is irrelevant. Another JSON document with the same pairs in a different order would be considered to represent the same object. Note that the keys are quoted strings, separated from the values with a colon. Key/value pairs are separated (not terminated) with commas. And values can be quoted strings, as shown here, or numbers, or JSON objects themselves. They can also be arrays, which are comma-separated lists of values enclosed in square brackets. Don’t worry too much about the details; we’ll see all of this in the examples.
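The JSON rules above can be checked with a short sketch using Python's standard json module. The keys and values here are made up for illustration; they are not CloudFormation keys.

```python
import json

# Two documents with the same key/value pairs in a different order.
first = json.loads('{"key1": "value1", "key2": 2, "key3": ["x", "y"]}')
second = json.loads('{"key3": ["x", "y"], "key2": 2, "key1": "value1"}')

# Key order is irrelevant: both documents represent the same object.
same_object = first == second

# Array order is significant, so this document represents a different object.
third = json.loads('{"key1": "value1", "key2": 2, "key3": ["y", "x"]}')
arrays_differ = first != third


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Commas separate pairs, they do not terminate them: a trailing comma is an error.
trailing_comma_ok = is_valid_json('{"key1": "value1",}')
```

Running a template through a parser like this before uploading it is a cheap way to catch a stray comma.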

CloudFormation templates are JSON objects with some of the following keys:

  • AWSTemplateFormatVersion
  • Description
  • Parameters
  • Mappings
  • Resources
  • Outputs

All of these keys are optional except for Resources. Resources are the things that CloudFormation is going to create for you, so if there are none, there’s not much point to having a template anyway.
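As a rough sketch of that rule, a few lines of Python can flag a template that is missing Resources or that has a misspelled top-level key. The function name check_template_keys is hypothetical, and the set of known keys is only the six listed above; the service itself may accept others.

```python
# The six top-level keys named in this post (an assumption: not an exhaustive list).
KNOWN_KEYS = {
    "AWSTemplateFormatVersion",
    "Description",
    "Parameters",
    "Mappings",
    "Resources",
    "Outputs",
}


def check_template_keys(template):
    """Return a list of problems found in a template's top-level keys."""
    problems = []
    if "Resources" not in template:
        problems.append("Resources is required")
    # Report anything outside the known set, in a stable order.
    for key in sorted(set(template) - KNOWN_KEYS):
        problems.append("unrecognized top-level key: " + key)
    return problems
```

For example, a template that spells the key as Resource would be reported twice: once for the missing Resources key and once for the unrecognized one.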

Although the AWSTemplateFormatVersion is optional, and there’s only ever been one version declared so far, I’m always going to include it. The only legal value for it is “2010-09-09”. The Description is also optional, but again, I’ll always include it to help me keep track of what I’m trying to do. So my template is going to start taking shape:

{
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Create a basic Linux machine",
    "Resources": {
        something needs to go here!
    }
}

I need to fill in the Resources section with at least one key/value pair. The key is going to be my logical name for the resource. It can be just about anything (I haven’t pushed the limits though), because CloudFormation doesn’t care. I’ll just call this NewServer. The value is always an object with Type and Properties keys. The possible types are listed in the Template Reference section of the User Guide. To create an EC2 instance, use a Type of AWS::EC2::Instance.

The Properties object contains different possible keys for different resource types. The possible keys for AWS::EC2::Instance are listed in that section of the Template Reference in the User Guide. Only two keys are required: ImageId, which is the ID of the AMI to use for the new instance, and InstanceType, which tells what kind of instance to launch. Actually, in my experience I’ve found I can omit the InstanceType and it defaults to m1.small, but that may be a bug, not a real feature. The documentation says InstanceType is required, so I always include it.

I want to launch a standard 64-bit, EBS-Backed Amazon Linux instance in the US-East-1 region. According to the Amazon Linux AMI web page, the ImageId is ami-1624987f. I’ll save money by using a t1.micro instance. Putting all this together, I get the following template:

{
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Create a basic Linux machine",
    "Resources": {
        "NewServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-1624987f",
                "InstanceType": "t1.micro"
            }
        }
    }
}
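The same template can also be generated rather than typed by hand, which avoids bracket and comma mistakes. This is only a sketch: build_template is a hypothetical helper, and the AMI ID and instance type are simply the values used in this post.

```python
import json


def build_template(image_id, instance_type, description="Create a basic Linux machine"):
    """Build a single-instance template as a Python dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": description,
        "Resources": {
            # "NewServer" is the logical name chosen in the post.
            "NewServer": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": image_id,
                    "InstanceType": instance_type,
                },
            }
        },
    }


# Serialize to JSON text, ready to be saved to a file and uploaded.
template_json = json.dumps(build_template("ami-1624987f", "t1.micro"), indent=4)
```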

Now to create this stack. Log in to the AWS Management Console and select CloudFormation. (If you’ve never used it before, you’ll be walked through a few sign-up steps to verify your identity. A few minutes later, you’ll be able to use the console.) It currently looks like this:

[Screenshot: the CloudFormation console]

I made sure I was in the right region (N.Virginia showing in the upper right corner), clicked Create New Stack, then filled in the blanks. I put my template in a file called cf.json, and selected it for upload:

[Screenshot: the Create New Stack form with cf.json selected for upload]

Then I clicked Continue. I had the option to enter some tags, which would be applied to the stack and to every resource it created. I just clicked Continue. Finally, I had a confirmation box:

[Screenshot: the confirmation box]

I clicked Continue, and my stack started building. I closed the acknowledgment window and looked at the console. The upper part showed all my stacks. There was only the one I just created. When a stack is selected, the bottom part shows its properties. I selected the Events tab for the screen capture below:

[Screenshot: the console with the stack selected and the Events tab showing]

Eventually, CloudFormation finishes, either successfully or with an error. In the latter case, it will usually roll back all the steps it took automatically. If it doesn’t, you can click Delete Stack to get rid of everything it created.

In this case, everything worked. The Resources tab lists everything that was created. That’s just the NewServer resource, which is an AWS::EC2::Instance. It also shows me the ID of that instance. If I want to log in to that server I’ll have to look up its address in the EC2 section of the console. However, I’m not going to have much luck with that because I did not specify a key pair when creating the machine, so it’s impossible for anyone to connect via ssh.

Oops.

KeyName was an optional property I could have specified, but didn’t. The reason it’s optional is that you very well might want to create an instance nobody could log in to. That’s not true in our case, so I fixed it. First, I cleaned up the stack I had created but couldn’t use. I selected it in the console and clicked Delete Stack. The stack and every resource it created were destroyed. Next, I went back to the template and specified a KeyName value. It had to already exist as a Key Pair in the US-East-1 region. I happened to have one there named cloudformation, so I used it. The updated template:

{
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Create a basic Linux machine",
    "Resources": {
        "NewServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-1624987f",
                "InstanceType": "t1.micro",
                "KeyName": "cloudformation"
            }
        }
    }
}
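Since a missing KeyName is easy to overlook, a small check before creating the stack can catch it. The sketch below uses a hypothetical helper, instances_without_key, which is not part of any AWS tool; it only inspects the template as a Python dict.

```python
def instances_without_key(template):
    """Return the logical names of EC2 instances that have no KeyName property."""
    missing = []
    for name, resource in template.get("Resources", {}).items():
        # Only EC2 instances take a KeyName; skip every other resource type.
        if resource.get("Type") != "AWS::EC2::Instance":
            continue
        if "KeyName" not in resource.get("Properties", {}):
            missing.append(name)
    return sorted(missing)


# The first template from this post: no KeyName, so nobody can ssh in.
first_try = {
    "Resources": {
        "NewServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {"ImageId": "ami-1624987f", "InstanceType": "t1.micro"},
        }
    }
}

unreachable = instances_without_key(first_try)
```

An empty result does not mean the key pair exists in the region; it only means a name was given.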

Repeating the steps above, I got a running Linux machine. This time, that machine was associated with the cloudformation key pair, so I could log in via ssh. Success!

Instead of the console, I could have used the cfn-create-stack command line tool. Or I could have written a program that invoked the REST API for CloudFormation. Each method looks about the same to AWS, and gets the same result.
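For the programmatic route, here is a minimal sketch using boto3, the current AWS SDK for Python. That is an assumption on my part: it is not one of the tools mentioned above. The function names and the default region are hypothetical choices for illustration, and the call needs AWS credentials to be configured.

```python
import json


def create_stack_arguments(stack_name, template):
    """Build the keyword arguments for a create-stack request."""
    return {
        "StackName": stack_name,
        # The template is sent as JSON text in the request body.
        "TemplateBody": json.dumps(template),
    }


def create_stack(stack_name, template, region="us-east-1"):
    """Create the stack and return its ID. Requires boto3 and AWS credentials."""
    import boto3  # imported here so the helper above works without the SDK

    client = boto3.client("cloudformation", region_name=region)
    response = client.create_stack(**create_stack_arguments(stack_name, template))
    return response["StackId"]
```

Creating the stack this way starts the same build that the console does, so the instance is billed until the stack is deleted.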

But what’s the point? I could have created this instance directly with the EC2 console, or command line tools, or REST API. And it would have been at least as easy. Easier, in fact, in my opinion. That’s because I haven’t tapped into the real powers of CloudFormation yet:

  • Provision created servers with specified packages, files, software, etc.
  • Create (and manage) multiple resources that work together

I’ll get started on those more useful, and more interesting, things in my next post. But before I go, I’d better remember to go back to the CloudFormation console and delete the stack I created, so I don’t keep paying for that server.

December 4, 2011

The Bookshelf Project – Using Amazon Web Services from JavaScript

Filed under: Uncategorized — Charles Engelke @ 8:08 pm

Many years ago, I got frustrated with using Amazon’s “save for later” shopping cart function to keep track of books I probably wanted to buy someday. The problem I was trying to solve was that I’d find out about an upcoming book by one of my favorite authors months before publication and I didn’t want to forget about it. I could have just preordered the book, but back then there was no Amazon Prime so I always preferred to bundle my book orders to save on shipping. So I’d add the book to the shopping cart and tell it to save it for later. But (at least back then) Amazon was willing to save things in your cart for only so long, and my books would often disappear from the cart before they were published.

I’m a programmer, and Amazon had an API (application program interface), so I did the obvious thing: wrote a program to solve my problem. It was just for me, so I wrote the simplest thing that could possibly work, figuring I’d improve it some day. It was a simple Perl CGI script that I ran under Apache on my personal PC. It used the (then very primitive) Amazon Web Service to look up the book’s information given an ISBN, and saved its data in a tab delimited text file.

That was a long time ago, probably very soon after Amazon introduced its first web services. And I’m still using it today with almost no changes. But I’m no longer happy with it, for several reasons:

  • It only recognizes the old 10 digit ISBN format, not the newer 13 digit one.
  • It can’t find Kindle books at all.
  • It runs only on a PC running an Apache webserver.
  • The data is available on only that device.
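On the first point, converting an old 10 digit ISBN to the 13 digit form is mechanical: put 978 in front of the first nine digits and compute a new check digit with alternating weights of 1 and 3. The sketch below shows that standard conversion; it is not the code from the Perl script, and the function name is hypothetical.

```python
def isbn10_to_isbn13(isbn10):
    """Convert a 10 digit ISBN (hyphens allowed) to its 13 digit form."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10:
        raise ValueError("expected a 10-character ISBN")
    # Drop the old check digit (which may be 'X') and add the 978 prefix.
    core = "978" + digits[:9]
    if not core.isdigit():
        raise ValueError("ISBN contains non-digit characters")
    # Weights alternate 1, 3, 1, 3, ... across the first twelve digits.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)
```

The old check digit is discarded rather than validated here, so a mistyped ISBN-10 would still produce a well-formed but wrong ISBN-13.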

The cloud has spoiled me. I want this program to run on any of my web-connected devices, and I want them all to share a common data store. Hence this project.

“Run on any of my web-connected devices” pretty much means running in a browser, so I’ll have to write it in HTML and JavaScript. I’ll use HTML5 and related modern technologies to store data in the browser, so I can see my saved book list even when off-line.

I know HTML and JavaScript but I’m no expert, so I’m going to build this incrementally, learning as I go. Step 1 will be to get a web page that just uses Amazon Web Services (AWS) to look up the relevant information given an ISBN. And right away, that’s going to require a detour. As a security measure, web browsers won’t let a web page connect (in a useful enough way) to any address but the one hosting the web page itself. My web page isn’t going to be at the same address as AWS, so it seems this is a hopeless task.

There is a way out, called Cross Origin Resource Sharing (CORS). The target web site can tell the web browser that it’s okay, it’s safe to let a “foreign” web page access it. Modern browsers support CORS, so I should be okay. Unfortunately, AWS doesn’t (yet) support CORS, so that’s out. Foiled again!
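The core of the browser's decision can be sketched in a few lines. This is a simplification that assumes a simple request without credentials and ignores preflight requests; the function name and the example origins are hypothetical. The check is based on the Access-Control-Allow-Origin response header, which is how the target site grants permission.

```python
def cors_allows(page_origin, response_headers):
    """Return True if the response lets a page from page_origin read it."""
    # Header names are case-insensitive, so normalize them before the lookup.
    headers = {name.lower(): value for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    # The server either allows every origin ("*") or names one origin exactly.
    return allowed == "*" or allowed == page_origin


# A service that sends no CORS header at all is off limits to foreign pages.
no_cors = cors_allows("https://bookshelf.example", {"Content-Type": "text/xml"})
```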

But there is a stopgap. I can create a Chrome Web Application. That’s pretty much just a normal web page, except that it can tell the web browser to allow access to foreign services. And that’s just what I will do, starting in my next blog post. That will take a while, but after that’s done, I can explore various directions to take it:

  • Maybe AWS will support CORS soon, in which case I’ll be able to use almost the exact same solution on any modern web browser, even on tablets and phones.
  • I can always write server-side code to “tunnel” the web service requests through my server on the way to AWS. That works, but I think it’s inelegant.
  • I might try creating an HP TouchPad application, which uses the same kinds of technologies as the web, but to create native apps. I find that approach very appealing, even though the TouchPad is more-or-less an orphan device now. I’ve got one, and this would be an excuse to develop for it.
  • Tools like PhoneGap let you wrap a web application in a shell to allow it to run as a native app on various mobile platforms. I think they allow operations that normal browsers block, such as cross-origin requests. I could find out, anyway.

So I’ve got a lot of potential things to learn and try. First up: creating a Chrome web application, in many steps. If it comes out nice, I’ll even try publishing it in the Chrome Web Store.

July 1, 2011

Relative Performance in Amazon’s EC2

Filed under: Uncategorized — Charles Engelke @ 11:58 am

I’ve been using Amazon’s Elastic Compute Cloud for several years now, but a lot more lately. And one thing that has always confused me is the relative benefits of using Elastic Block Store (EBS) versus instance store. I’ve seen some posts on this, but they all set up sophisticated RAID configurations. What about some simpler guidance for a regular developer like me?

Well, I don’t have the answers, but I have a little bit of new data. I’m updating a site that starts by loading a 1.25GB flat file into MySQL, then creating three indexes, then traversing that table to create a second, much smaller table. Dealing with those 10 million rows is pretty slow, so I decided to see what difference it made using EBS or the instance store. While I was at it I tried different size machines. The results, shown in minutes to complete the task, are summarized in the table below:

Size         EBS (min)   Instance store (min)
t1.micro     635         n/a
m1.large     56          66
m2.xlarge    47          49
m2.4xlarge   42          40
c1.xlarge    49          49

The t1.micro machine size is only available in EBS, and it got about 90% of the way through (finished creating all three indexes) then died.

This seems to show that (for this kind of operation) EBS performed noticeably but not enormously better than the instance store, but the difference shrank as available memory increased. Also, “larger” machines didn’t help much once there was enough memory available. Not surprising, since this is a single-threaded operation.
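To put numbers on "noticeably but not enormously", here is a small sketch that turns the timings from the table into a percentage. A positive value means EBS finished sooner than the instance store; the t1.micro row is left out because it has no instance store figure. The function name is a hypothetical choice.

```python
# Minutes to complete the task, from the table above: (EBS, instance store).
timings = {
    "m1.large": (56, 66),
    "m2.xlarge": (47, 49),
    "m2.4xlarge": (42, 40),
    "c1.xlarge": (49, 49),
}


def ebs_advantage_percent(ebs_minutes, instance_minutes):
    """Percent of the instance store time that EBS saved (negative if EBS was slower)."""
    return round((instance_minutes - ebs_minutes) / instance_minutes * 100, 1)


advantage = {size: ebs_advantage_percent(*pair) for size, pair in timings.items()}
```

The gap is about 15% on m1.large, about 4% on m2.xlarge, and it reverses slightly on m2.4xlarge. With a single run per configuration, differences of a few percent may just be noise.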
