Update: This post is made of fail. I trusted other people's benchmarks instead of doing my own. If you want details, go read Don MacAskill's butthurt response on the SmugMug blog.
The fun part about running a virtualized server environment on a heterogeneous hardware setup is that you can play word games with the specifications. Let's take, oh, I don't know, Amazon EC2 for example. This article is going to be all "science", but the takeaway is this: in Amazon's EC2 environment, you only get half of the CPU performance you would expect.
When EC2 launched, the specifications for the machine were "the equivalent of a 1.7GHz x86 processor". Crappy by the day's standards, but only a dime per hour. Fine.
As EC2 developed, Amazon came up with the idea of the "Compute Unit" to describe the power you get out of the instances. From the documentation:
"One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor. This is also the equivalent to an early-2006 1.7 GHz Xeon processor referenced in our original documentation."
I won't get nitpicky about how they think that 2006 megahertz are slower than 2007 megahertz; I just want to show how nebulous the specification is.
Processes Run Half As Fast As You Think They Should
I figured this little nugget out last week. I had failed a piece of code to the point where an infinite loop would happen in an edge case. It took a while to debug this because looking at top in the EC2 instance only showed 40-50% CPU usage.
At the same time, I noticed another metric, the CPU %st, hovering around 50%. This stands for "CPU time stolen", and you'll notice that as your CPU usage rises, so does steal-time. What exactly does "stolen" mean? Time is stolen when your instance requests CPU time but the Xen virtualizer chooses to give that CPU time to something else, such as somebody else's instance.
I am not the only one to notice this, either. There are threads on the AWS forums by other users who are seeing their code run half as fast as it should, given the specifications for a Compute Unit. These posts are met by dismissive replies from Amazon employees. Great job.
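If you want to watch steal time without squinting at top, here is a rough sketch in Python. It assumes a Linux guest where the aggregate "cpu" line in /proc/stat carries the usual fields (user, nice, system, idle, iowait, irq, softirq, steal), with steal in the eighth position. The function names and the sampling interval are mine, not anything from Amazon or Xen, and the number should be roughly what top reports as %st, not an exact match.

```python
import time


def parse_cpu_line(line):
    """Return the numeric fields of the aggregate 'cpu' line from /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in parts[1:]]


def steal_percent(before, after):
    """Percentage of CPU time stolen between two samples of the 'cpu' line."""
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    # Use only the first eight fields: user, nice, system, idle, iowait,
    # irq, softirq, steal. Guest time is already counted inside user/nice.
    deltas = [y - x for x, y in zip(b[:8], a[:8])]
    if len(deltas) < 8:
        return 0.0  # kernel too old to report steal time
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (the aggregate 'cpu' line)."""
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    # Hypothetical usage: sample twice, one second apart (interval is arbitrary).
    first = read_cpu_line()
    time.sleep(1.0)
    second = read_cpu_line()
    print("steal: %.1f%%" % steal_percent(first, second))
```

Run it while something CPU bound is spinning on the instance. If the figure climbs along with your CPU usage, you are seeing the same thing that top shows as %st.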
How They Can Get Away With This
I think that many EC2 users are more I/O bound than CPU bound. If you make a simple Rails app backed by MySQL, chances are you're not going to consistently burn the CPU, so you won't even notice the slowdown. However, if you have some work that is CPU bound, this restriction becomes painfully obvious.
They also get to mince words with the equivalent-to metric. You don't actually get 1.7 billion clock cycles per second. This comparison is made by considering the machine as a whole: disk controller speed, memory bus speed, and evidently to a much lesser extent, CPU speed.
After doing Uncov, it's hard for me to tell the difference between a swindle and incompetence.