I’ve been using Amazon’s Elastic Compute Cloud for several years now, but a lot more lately. And one thing that has always confused me is the relative benefits of using Elastic Block Store (EBS) versus instance store. I’ve seen some posts on this, but they all set up sophisticated RAID configurations. What about some simpler guidance for a regular developer like me?
Well, I don’t have the answers, but I have a little bit of new data. I’m updating a site that starts by loading a 1.25GB flat file into MySql, then creating three indexes, then traversing that table to create a second, much smaller table. Dealing with those 10 million rows is pretty slow, so I decided to see what difference it made using EBS or the instance store. While I was at it I tried different size machines. The results, shown in minutes to complete the task, are summarized in the table below:
The t1.micro machine size is only available in EBS, and it got about 90% of the way through (finished creating all three indexes) then died.
This seems to show that (for this kind of operation) EBS performed noticeably but not enormously better than the instance store, but the difference shrank as available memory increased. Also, “larger” machines didn’t help much once there was enough memory available. Not surprising, since this is a single-threaded operation.