February 2008 Archives

A Magic Elixir

In software, there are no silver bullets.  In internal combustion engine mechanics, however, there are plenty. And I just discovered one.

It's called Sea Foam, and it will cure what ails ya.

My wife's first motorcycle was a Honda Rebel 250.  She upgraded too late in the season and couldn't sell the starter before winter showed up.  Winter time is a dead zone for the used motorcycle market in the San Francisco Bay Area, so the Rebel sat in the parking garage for 5 months.  Being lazy, I didn't properly store it. 

We went to fire it up yesterday to prepare it for its 15 minutes of Craigslist fame, and it wouldn't turn over.  Gasoline, if left long enough, will degrade into a mucky varnish that cakes the inside of your carburetors.

I poured half a can of Sea Foam into the tank and let it sit for a few minutes.  I cranked it again and it made a few pathetic putts.  A few more cranks, a few more putts, but after about 5 tries, the Rebel roared to life.

A six-dollar can of some petroleum distillate has the same end effect as a three-hundred-dollar carburetor job.

I am detecting much win in this sector.


Amazon EC2 Is Half As Fast As It Should Be

Update: This post is made of fail.  I trusted other people's benchmarks instead of doing my own.  If you want details, go read Don MacAskill's butthurt response on the SmugMug blog.

The fun part about running a virtualized server environment on a heterogeneous hardware setup is that you can play word games with the specifications.  Let's take, oh, I don't know, Amazon EC2 for example.  This article is going to be all "science", but the takeaway is this: in Amazon's EC2 environment, you only get half of the CPU performance you would expect.

When EC2 launched, the specifications for the machine were "the equivalent of a 1.7GHz x86 processor".  Crappy by the day's standards, but only a dime per hour.  Fine.

As EC2 developed, Amazon came up with the idea of the "Compute Unit" to describe the power you get out of the instances.  From the documentation:

"One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor. This is also the equivalent to an early-2006 1.7 GHz Xeon processor referenced in our original documentation."

I won't get nitpicky about how they think that 2006 megahertz are slower than 2007 megahertz; I just want to show how nebulous the specification is.

Processes Run Half As Fast As You Think They Should


I figured this little nugget out last week.  I had failed a piece of code to the point where an infinite loop would happen in an edge case.  It took a while to debug this because looking at top in the EC2 instance only showed 40-50% CPU usage.

At the same time, I noticed another metric, the CPU %st, hovering around 50%.  This stands for "CPU time stolen", and you'll notice that as your CPU usage rises, so does steal-time.  What exactly does "stolen" mean?  Time is stolen when your instance requests CPU time but the Xen hypervisor chooses to give that CPU time to something else, such as somebody else's instance.
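If you want to watch steal time without staring at top, the number comes from the aggregate cpu line in /proc/stat.  Here is a minimal sketch, assuming the usual Linux field order where steal is the eighth counter on that line.  The class name, method names, and sample numbers are all made up for illustration; they are not measurements from a real instance.

```java
// Hypothetical helper: computes the share of CPU time reported as "steal"
// between two snapshots of the aggregate "cpu" line in /proc/stat.
// Assumed field order: user nice system idle iowait irq softirq steal ...
class StealTime {

    // Parse the numeric counters (in clock ticks) that follow the "cpu" label.
    static long[] parseCpuLine(String line) {
        String[] parts = line.trim().split("\\s+");
        long[] ticks = new long[parts.length - 1];
        for (int i = 1; i < parts.length; i++) {
            ticks[i - 1] = Long.parseLong(parts[i]);
        }
        return ticks;
    }

    // Percentage of the ticks elapsed between the two snapshots that were stolen.
    // Assumes both lines carry at least eight counters.
    static double stealPercent(String before, String after) {
        long[] a = parseCpuLine(before);
        long[] b = parseCpuLine(after);
        long total = 0;
        for (int i = 0; i < 8; i++) {
            total += b[i] - a[i];   // sum the deltas for user through steal
        }
        long stolen = b[7] - a[7];  // eighth counter is steal
        return total == 0 ? 0.0 : 100.0 * stolen / total;
    }

    public static void main(String[] args) {
        // Invented sample lines, one taken some time after the other.
        String before = "cpu 1000 0 200 5000 50 0 10 1000";
        String after  = "cpu 1400 0 250 5050 50 0 10 1500";
        System.out.println(stealPercent(before, after)); // prints 50.0
    }
}
```

On a real box you would read the first line of /proc/stat twice, a second or so apart, and feed both lines in.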

I am not the only one to notice this, either.  There are threads on the AWS forums by other users who are seeing their code run half as fast as it should, given the specifications for a Compute Unit.  These posts are met by dismissive replies from Amazon employees.  Great job.

How They Can Get Away With This


I think that many EC2 users are more I/O bound than CPU bound.  If you make a simple Rails app backed by MySQL, chances are you're not going to consistently burn the CPU, so you won't even notice the slowdown.  However, if you have some work that is CPU bound, this restriction becomes painfully obvious.

They also get to mince words with the equivalent-to metric.  You don't actually get 1.7 billion clock cycles per second.  This comparison is made by considering the machine as a whole: disk controller speed, memory bus speed, and evidently to a much lesser extent, CPU speed.

After doing Uncov, it's hard for me to tell the difference between a swindle and incompetence.

Good Persai Writeup on Slate

A controlled experiment to test its learning ability:

http://www.slate.com/id/2184810/

Core Dumps Disabled By Default In Ubuntu

Enable them for the current shell session with this command:

ulimit -c unlimited

The Road To Hell Is 64 Bits Wide

Java is an awesome language because you get to ignore hard stuff like memory allocation.  Write once, run anywhere.  Sweet, where do I sign up?

The privilege of not having to manage memory comes at a cost: you aren't allowed to question how the JVM works.  Move along, coder.  Keep making those objects.  Don't ask how much memory things take up. In fact, to keep you from getting curious, we're not even going to have a sizeof function.

How My Complacency Made Me Fail


When you're coding in Java, it's easy to buy into this mentality.  You never really have to worry about how much space anything takes up, and if you get an OutOfMemoryError, just give the JVM more memory.  Problem solved.

But there are times when you need to be conscious of how the JVM actually works.  For example, when you're trying to squeeze every bit of performance out of the crappiest of machines, such as the small Amazon EC2 instances (where Persai is hosted).

Here's a real world example of how I got burned:

Persai does a lot of work with high-dimensionality, sparse vectors.  To save space, we compact the vectors.  Since most of the values in the vectors are zero, we simply do not store them.  What we store amounts to a list of the nonzero element indices and the corresponding values.  This is our basic data structure:

class sparseNode {
    public int index;
    public double value;
}
So a vector is an array of sparseNode objects. Sounds easy enough, and for a while, it was.  That is, until I was tasked with storing as many of these in memory as I could.

Where's the fail here?  How big is an int primitive in Java? 32 bits, right?  Sort of.  The Java specification says that an implementation must provide 32 bits of workable space for the programmer using an int, but makes no mention of how the virtual machine must store this variable.

In Sun's HotSpot JVM, object storage is aligned to the nearest 64-bit boundary.  On top of this, every object has a 2-word header in memory.  The JVM's word size is usually the platform's native pointer size.  Alright, assume 32-bit words: two words for the object header, one word for the int, two words for the double.  That's 5 words: 160 bits.  Because of the alignment, this object will occupy 192 bits of memory.  Effectively, the int value is taking 64 bits!  In an array of these things, I've wasted N times 32 bits.  Figure, a typical vector is about 200 elements long, so that's 800 bytes out the window for each one.

This fun fact would have been good to know when I was doing my initial back of the envelope calculation of how many vectors I can fit in a gigabyte of memory.
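For the record, here is the envelope math written out as a sketch.  The constants are the assumptions from the paragraphs above (32-bit words, a 2-word header, 8-byte alignment), not values queried from a live JVM, and the class and method names are invented for the example.

```java
// Back-of-the-envelope object sizing under the assumptions in the text:
// 32-bit words, a 2-word object header, and 8-byte (64-bit) alignment.
// These constants are assumptions, not values read from a running JVM.
class ObjectSizeEstimate {
    static final int WORD_BYTES = 4;
    static final int HEADER_BYTES = 2 * WORD_BYTES;
    static final int ALIGNMENT_BYTES = 8;

    // Round a size up to the next multiple of the alignment.
    static int alignUp(int bytes) {
        return (bytes + ALIGNMENT_BYTES - 1) / ALIGNMENT_BYTES * ALIGNMENT_BYTES;
    }

    // Unpadded size of an object holding one int and one double: 8 + 4 + 8 = 20.
    static int rawNodeBytes() {
        return HEADER_BYTES + Integer.BYTES + Double.BYTES;
    }

    // Size after alignment padding: 20 rounds up to 24.
    static int sparseNodeBytes() {
        return alignUp(rawNodeBytes());
    }

    // Bytes lost to padding across a vector of n nodes.
    static int paddingBytes(int n) {
        return n * (sparseNodeBytes() - rawNodeBytes());
    }

    public static void main(String[] args) {
        System.out.println(sparseNodeBytes() + " bytes per node");
        System.out.println(paddingBytes(200) + " bytes of padding in a 200-element vector");
    }
}
```

Under those assumptions each node costs 24 bytes for 20 bytes of payload and header, which is where the 800 wasted bytes per 200-element vector come from.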

Yes, I know I should be complaining about the same thing when using C structs.  But you know what?  When you learn C, you are introduced to many harsh realities.  When you learn Java, you are introduced to XML.  They protect you from the hard things.  Live and learn, I guess.

The Fix

Once I knew this, it was easy to rescue myself.  Java primitive arrays are subject to the same alignment issue, but in our case, they can help solve the problem.  Instead of representing a vector as an array of objects, we'll represent a vector as an object of arrays.

class sparseVector {
  public int[] indicies;
  public double[] values;
}
This way, we're going to lose at most 4 bytes per vector with the alignment of the int[] array.  This sure beats the ~800 byte loss with the other solution.
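To make the layout concrete, here is a small sketch of how a vector like this might be built and read.  It is not the Persai code: the class name and the fromDense and get helpers are hypothetical, and it assumes the indices are kept in ascending order so a binary search can find them.

```java
import java.util.Arrays;

// Sketch of the "object of arrays" layout: two parallel primitive arrays
// instead of one small object per nonzero element. The helper methods
// (fromDense, get) are illustrative and not from the original code.
class ParallelArrayVector {
    final int[] indices;   // positions of the nonzero elements, ascending
    final double[] values; // values[i] belongs to position indices[i]

    ParallelArrayVector(int[] indices, double[] values) {
        this.indices = indices;
        this.values = values;
    }

    // Build the compact form from a dense array, skipping zeroes.
    static ParallelArrayVector fromDense(double[] dense) {
        int count = 0;
        for (double d : dense) {
            if (d != 0.0) count++;
        }
        int[] idx = new int[count];
        double[] val = new double[count];
        int next = 0;
        for (int i = 0; i < dense.length; i++) {
            if (dense[i] != 0.0) {
                idx[next] = i;
                val[next] = dense[i];
                next++;
            }
        }
        return new ParallelArrayVector(idx, val);
    }

    // Look up one position; anything not stored is implicitly zero.
    double get(int position) {
        int slot = Arrays.binarySearch(indices, position);
        return slot >= 0 ? values[slot] : 0.0;
    }

    public static void main(String[] args) {
        ParallelArrayVector v = fromDense(new double[] {0, 0, 3.5, 0, 1.25});
        System.out.println(Arrays.toString(v.indices) + " " + Arrays.toString(v.values));
        System.out.println(v.get(2) + " " + v.get(3));
    }
}
```

There are now two array objects per vector instead of one small object per element, so the per-object header and padding are paid twice per vector rather than N times.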

Persai Has Launched An Ad System

We have been working hard on our ad targeting technology these past couple of weeks, and have finally launched it on Persai.com.  It turns out that our content recommendation system can do ads, too.

Check out the announcement on the Persai blog.

How To Write A Takedown


Because of Uncov, there have been a few butthurt responses to Persai.  People want to tear us apart because we spent so much time tearing Web 2.0 down.  That's fine, I applaud the effort, but if you are going to write a takedown, do it right.  Halfassing it is just an embarrassment to the both of us.

With that in mind, I think a firsthand demonstration of writing a proper takedown would help bloggers.

1. Have a Hook

You, the blogger, are supposed to entertain me, the reader.  Invoke a visceral reaction to get my attention.  A funny title or good image macro will do.  One of my favorite (and also most time-consuming) parts about Uncov was finding or making all of the image macros.  Selecting an image for a post was the first step; the rest of the post would just follow from the image.
[Image: what-is-fail.jpg]
I mean look at that thing.  That dog failed hard.  That's some funny shit.  Doesn't it make you want to read this post?

2. Remind Me Again, Why Am I Reading This?

Okay, you've hooked me in with a good image macro.  Now don't lose me.  Whether or not I come back to your blog depends now on your writing style.   You're a brilliant programmer, which probably means you suck ass at writing.  You know how you encourage that guy whose only programming skill comes from a cursory examination of a "PHP in 24 Hours" book, as long as he isn't working on your project?  Yeah, well, writers feel the same way about you.  Don't take it personally, you can't be good at everything.  I, on the other hand, can.

Define your own style.  It's OK to be influenced by others, but don't supplant.  These writers are your teachers.  For me, they were Maddox, Tucker Max, and a few lesser knowns.  The most important thing is that you have a style.  This style has to keep me interested.

3. Go Over the Top

Be mean.  Be meaner than you ever thought you could be.  After all, it's the internet.  Nobody really takes you seriously.

Your goal is a takedown, not a gentle harangue.  Instead of "I'm not sure this product will get any traction", say "the founders of this company should consider suicide".  Pussyfooting around like that will make your takedown lame.

4. Get Your Facts Right

I have been burned by this before.  For the most part, your readers don't give a shit because you're entertaining them, but it's personally embarrassing to screw this up.  I got a lot of my facts wrong in a takedown of Middio (which, by the way, is now dead, thank you very much).  I was ashamed of this, but then I remembered how awesome I was for making fun of a kid who was in high school.

5. Linkbait

Last but not least, you should linkbait your target.  Yeah, you want them to read your takedown, but more importantly, you want them to whine in the comments.  Why?  Because the target's whining is at least as entertaining as your article, if not more so.
 
Protip: if you are the subject of a takedown, the worst thing you can do is try to defend yourself in the comments.  It makes you look like you care what some shithead blogger has to say about you.  If you do care that much about the critics, maybe starting a company isn't for you.


tl;dr

If you're going to write a takedown, do it right.  Make it funny.  Don't suck at writing.  Most importantly, though: don't take yourself too seriously, because your readers don't, either.

I Detect FAIL.

At Uncov, I took a lot of time to write about fail.  For lack of a better word, I described the horrific ignorance of reason and systemic absence of programming ability that runs rampant in Web 2.0 simply as fail.

While developing Persai, we have furthered the study of fail, and developed it into a management tool.

The Failboard

My two cofounders and I started a failboard where we keep tallies of each other's fail.  A whiteboard, a dry erase marker, and unrelenting fits of ball-busting are all you need to implement this strategy at your organization.

When one member of your team fails at something, it is tallied on the board.  The other members of the team then harass the failer about his programming ability, sexual prowess, and general competence.

The Grand Unified Theory of Fail

What is fail?  What does a person have to do to deserve this treatment?  It has no formal definition, but loosely, a fail is some bit of negligence, misunderstanding, unwillingness to read documentation, or just plain unforgivable stupidity that leads to a problem.  Fail is amplified by the magnitude of the problem and the ease with which the error could have been detected.

For example, pushing a CSS change to a production website that doesn't quite work in IE is not a fail because it's just some lame HTML problem.  There's no gap in understanding here, it's just programmer laziness, or aversion to busywork.

Anatomy of a Fail

I will pull an example of something that is a fail from my experience with Persai.  This fail was my own, and my balls were busted within an inch of their lives in retribution.

We keep vectors in a sparse format to conserve memory, because most of the elements are zeroes.  Iteration on these sparse vectors requires some tricky bookkeeping of indices, but if you're paying attention, it's not that hard.

I was writing a simple math library for our sparse vector format.  The first method I wrote was a dot product, and it performed admirably.  If you're not familiar with the dot product, it's the sum of the products of the corresponding elements of the two vectors.  So, if one vector has a zero in some position, it's not worth your time to carry through the multiplication with the other vector's element in that position, even if that element is nonzero, because it adds nothing to the sum.

Because of this, you can play fast and loose with the bookkeeping in a dot product.  I hammered this method down, and had gotten the general idea of how to iterate over two sparse vectors and match up the indices.
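For illustration, a dot product over two sparse vectors can be sketched like this.  This is not the original Persai dotprod(); the signature is an assumption, and it takes for granted that each vector is stored as an ascending index array with a parallel value array.

```java
// Sketch of a dot product over two sparse vectors, each stored as a sorted
// index array plus a parallel value array. The signature is an assumption
// made for this example, not the original library's.
class SparseDot {

    static double dotprod(int[] aIdx, double[] aVal, int[] bIdx, double[] bVal) {
        double sum = 0.0;
        int i = 0;
        int j = 0;
        // Walk both index lists in step; stop as soon as either runs out,
        // because every remaining product would be multiplied by zero.
        while (i < aIdx.length && j < bIdx.length) {
            if (aIdx[i] == bIdx[j]) {
                sum += aVal[i] * bVal[j]; // both vectors are nonzero here
                i++;
                j++;
            } else if (aIdx[i] < bIdx[j]) {
                i++; // b is zero at this position, so the product is zero
            } else {
                j++; // a is zero at this position, so the product is zero
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] aIdx = {0, 3, 7};
        double[] aVal = {1, 2, 3};
        int[] bIdx = {3, 5, 7};
        double[] bVal = {4, 5, 6};
        System.out.println(dotprod(aIdx, aVal, bIdx, bVal)); // 2*4 + 3*6 = 26.0
    }
}
```

Unmatched indices are simply skipped, which is exactly the shortcut that is safe for multiplication.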

The next method I went to write was a vector sum.  Since I had gotten the bookkeeping right in the dot product method, I copied-and-pasted the loop from dotprod() to sum().  Copy-paste is strongly correlated with fail.

Of course, I never unit tested this, just rolled it out.  I know what I'm doing.  I've got a degree in math.  How could I possibly screw this up?  A vector sum is a pretty fundamental operation, and many other higher-level vector math operations we use depend on it.  Namely, there's some complicated math behind how we bootstrap a recommendation system based on relatively little signal.  I had pushed this new library out to our development setup and just assumed it worked.

I created a new Persai interest about Facebook, seeding it with the word "facebook".  It started giving recommendations about pasta recipes.  I tried to pass the blame off to my co-founders, trying to come up with some cock-and-bull story about how the documents are parsed and so on.  I dug into the code a little bit, down through the math, and had that moment of realization: I had failed.  Usually, you can tell when a teammate fails because they will be looking at an Eclipse screen and just out of the blue mutter "Oh good Lord".

Remember that whole bit about being able to skip the multiply when a vector has a zero in that position?  That doesn't work with addition, because even though 0x = 0, it turns out that 0 + x = x, not zero.  FAIL.
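Here is a sketch of what the sum loop has to do instead.  Again, this is not the original sum(); the Result holder and the signature are invented for the example, and it assumes sorted index arrays with parallel value arrays.  Unlike the dot product, an index that appears in only one vector has to be carried into the output, and whatever is left over when one vector runs out has to be copied too.

```java
import java.util.Arrays;

// Sketch of a sparse vector sum over sorted index arrays with parallel
// value arrays. The Result holder and the signature are assumptions made
// for this example.
class SparseSum {

    static class Result {
        final int[] indices;
        final double[] values;

        Result(int[] indices, double[] values) {
            this.indices = indices;
            this.values = values;
        }
    }

    static Result sum(int[] aIdx, double[] aVal, int[] bIdx, double[] bVal) {
        // Worst case: no index is shared, so the output has every entry of both.
        int[] idx = new int[aIdx.length + bIdx.length];
        double[] val = new double[aIdx.length + bIdx.length];
        int i = 0, j = 0, n = 0;
        while (i < aIdx.length && j < bIdx.length) {
            if (aIdx[i] == bIdx[j]) {
                idx[n] = aIdx[i];
                val[n] = aVal[i] + bVal[j];
                i++;
                j++;
            } else if (aIdx[i] < bIdx[j]) {
                idx[n] = aIdx[i];   // 0 + x = x, so keep a's element
                val[n] = aVal[i];
                i++;
            } else {
                idx[n] = bIdx[j];   // likewise keep b's element
                val[n] = bVal[j];
                j++;
            }
            n++;
        }
        // Copy whatever remains of the longer vector.
        while (i < aIdx.length) { idx[n] = aIdx[i]; val[n] = aVal[i]; i++; n++; }
        while (j < bIdx.length) { idx[n] = bIdx[j]; val[n] = bVal[j]; j++; n++; }
        return new Result(Arrays.copyOf(idx, n), Arrays.copyOf(val, n));
    }

    public static void main(String[] args) {
        Result r = sum(new int[] {0, 3, 7}, new double[] {1, 2, 3},
                       new int[] {3, 5, 7}, new double[] {4, 5, 6});
        System.out.println(Arrays.toString(r.indices) + " " + Arrays.toString(r.values));
    }
}
```

This sketch does not drop entries that happen to cancel to zero; a real library might want to.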

This error was inexcusable, and I paid dearly for it, mostly with pride.

The Failboard in Practice


After a while using the failboard, you start to get a sixth sense for fail.   Some code path that makes use of a synchronized() block is slowing down at odd times? A bit of UTF-8 encoded text goes through a couple of programs and comes out garbled? ConcurrentModificationException?  The sources of these problems will shortly end up as tallies on the board.

It gets more powerful when you can preemptively detect fail, not in your own code, but in your teammates'.  Believe me, if you punish teammates severely enough, it will happen less, because everyone develops a nose for fail.


Hermetic RPC Unit Testing With Thrift and jMock

Unit testing is a pain in the ass.  I will admit it, I hate doing it.  More often than not, you just write a few obvious JUnit tests that you know will pass and say you're finished.

Testing code that makes RPC calls is especially discouraging.  You'll say "I can't unit test it, it needs to set up an RPC server and that's too complicated for JUnit", or, if you're like me, you won't even make up an excuse.

Of course, this laziness comes back to bite you when the code goes into production, the RPC server throws a one-in-a-million exception, and your entire service bites the dust because you never tested that execution path.

So, given that you don't like to be woken up at 3AM by sysops when you have been out drinking all night, let's unit test our RPC clients.  Let's do it without having to start up an RPC server when the test runs, and with fine-grained control over what the RPC methods do.

She's Thrifty - She's Just My Type

This is the Thrift RPC definition we will be using for this example:

service MyRPCService {
  i64 getDocidForUrl(1: string url),
}
Simple. We'll be looking up a 64-bit integral document identifier for a given URL. Our client code will make a decision about the state of the document given that identifier.

This is the class we will be testing:

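// Package names below are the Apache Thrift ones; older pre-Apache
// releases used com.facebook.thrift instead.
import org.apache.thrift.TException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransportException;

public class ProgramToTest {

	// class constants
	private static final int RPC_SERVER_PORT = 3141;
	private static final String RPC_SERVER_HOST = "rpcserver.teddziuba.com";
	private static final long DOCID_IS_OLD_IF_LESS_THAN = 1000;
	public enum DocumentStatus { OLD, NEW, UNKNOWN }
	
	// instance variables
	private MyRPCService.Iface myRpc;
	private TSocket socket;
	
	// Connects to the RPC server; callers deal with a failed connection.
	public ProgramToTest() throws TTransportException {
		init();
	}
	
	private void init() throws TTransportException {
		socket = new TSocket(RPC_SERVER_HOST, RPC_SERVER_PORT);
		TProtocol protocol = new TBinaryProtocol(socket, true, true);
		myRpc = new MyRPCService.Client(protocol);
		socket.open();
	}
	
	// Returns the specific enum type rather than the raw Enum.
	public DocumentStatus getDocumentStatus(String documentUrl) {
		try {
			long docId = myRpc.getDocidForUrl(documentUrl);
			if (docId < DOCID_IS_OLD_IF_LESS_THAN) {
				return DocumentStatus.OLD;
			}
			return DocumentStatus.NEW;
		} catch (TException e) {
			return DocumentStatus.UNKNOWN;
		}
	}
	
	public void finished() {
		// socket is null if no connection was ever opened
		if (socket != null) {
			socket.close();
		}
	}
	
}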
If I were still in CS class in college, I would get dinged for having multiple return statements, but the best part about being a grown-up is that when I want a cookie, I can have a cookie.


The getDocumentStatus method is really the only thing we need to test, as clients of this class will be responsible for dealing with a TTransportException if the socket initialization fails. The unfortunate part about testing that method is that it makes an RPC call. Sockets. Exceptions. Icky. Even though it's easier to say screw it and go have a beer, remember: you gotta do what you gotta do.

Making a Mockery

JMock is a clever unit testing library that makes mock objects really easy.  If you're new to mock objects, read more about them here.  The basic idea is that we will make an object that "mocks" the behavior of the RPC server, but without doing any I/O.  That way, we have complete control over the operations of the server, and can actually test how your client code interacts with that one-in-a-million exception.

We'll be mocking out the MyRPCService.Iface interface that is autogenerated by Thrift, and defining our own behavior for it. If you've got some experience with JMock, this should be pretty straightforward, and if not, then you'll catch on quick. JMock's syntax focuses on making the testing conditions human readable.
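If the idea of a mock is still fuzzy, here it is stripped down to plain Java with no libraries at all.  Everything in this sketch is a stand-in: DocIdLookup plays the part of the Thrift-generated interface, statusFor plays the part of the method under test, and a generic Exception plays the part of TException.  JMock does the same job with less typing, and it also checks that the expected calls actually happened.

```java
// Hand-rolled illustration of the mock-object idea, using no libraries.
// All names here are hypothetical stand-ins for the real classes.
class MockIdea {

    // Stand-in for the Thrift-generated MyRPCService.Iface.
    interface DocIdLookup {
        long getDocidForUrl(String url) throws Exception;
    }

    enum Status { OLD, NEW, UNKNOWN }

    // Same decision logic as the class under test, minus the socket code.
    static Status statusFor(DocIdLookup rpc, String url) {
        try {
            return rpc.getDocidForUrl(url) < 1000 ? Status.OLD : Status.NEW;
        } catch (Exception e) {
            return Status.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        // A fake "server" that always answers with a small docid. No I/O involved.
        DocIdLookup returnsOldId = url -> 100L;

        // A fake "server" that always blows up, to exercise the failure path.
        DocIdLookup alwaysFails = url -> {
            throw new Exception("something awful has happened.");
        };

        System.out.println(statusFor(returnsOldId, "http://www.teddziuba.com/")); // OLD
        System.out.println(statusFor(alwaysFails, "http://www.teddziuba.com/"));  // UNKNOWN
    }
}
```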

Prepare The Class For Testing

Since we will be providing the ProgramToTest class with a mocked version of this interface, we need to add a constructor to the class for testing only:

public ProgramToTest(MyRPCService.Iface testOnlyIface) {
	this.myRpc = testOnlyIface;
}

JUnit.  We In It.

We'll test the low-hanging fruit first.  Using our mock to control the return value of the RPC call, we can make sure the logic works:


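@Test
public void testHandlesOldDocid() throws TException {
	// context is the test class's jMock Mockery field (for example, a JUnit4Mockery), declared elsewhere.
	final MyRPCService.Iface mockedRpc = context.mock(MyRPCService.Iface.class);
	ProgramToTest testObject = new ProgramToTest(mockedRpc);
		
	final long rpcCallReturnValue = 100L;
	final String testUrl = "http://www.teddziuba.com/";
		
	context.checking(new Expectations() {
		{
			// Expect exactly one call with this URL and hand back the canned docid.
			// (Newer jMock releases spell one(...) as oneOf(...).)
			one(mockedRpc).getDocidForUrl(with(equal(testUrl)));
			  will(returnValue(rpcCallReturnValue));
		}
	});
				
	// 100 is below the threshold of 1000, so the document should come back as OLD.
	assertEquals(ProgramToTest.DocumentStatus.OLD,
			testObject.getDocumentStatus(testUrl));
}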
That is pretty cool.  Without a whole lot of effort, we've managed to make a unit test for a method that depends on an RPC server.  This test does not require any network I/O and runs very quickly.  It can be run in a self-contained environment, like an automated test server.  I call this kind of test hermetic, because nothing outside of the test code can affect its outcome.

We can also use JMock to test what happens when an exception is thrown.  If a Thrift RPC server throws an exception somewhere in its handler method and that exception is not caught server-side, it will be thrown up to the client as a TException.  To simulate this, we simply change one line of the test expectations:


@Test
public void testHandlesException() throws TException {
	final MyRPCService.Iface mockedRpc = context.mock(MyRPCService.Iface.class);
	ProgramToTest testObject = new ProgramToTest(mockedRpc);
		
	final TException rpcException = new TException("something awful has happened.");
	final String testUrl = "http://www.teddziuba.com/";
	
	context.checking(new Expectations() {
		{
			one(mockedRpc).getDocidForUrl(with(equal(testUrl)));
			  will(throwException(rpcException));
		}
	});
				
	assertEquals(ProgramToTest.DocumentStatus.UNKNOWN,
                                testObject.getDocumentStatus(testUrl));
}

Go And Do Likewise

JMock is an incredibly useful library.  If you're a lazy tester like me, it beats the pants off of subclassing.  Now, you have no excuse for leaving RPC calls untested.

About this Archive

This page is an archive of entries from February 2008 listed from newest to oldest.
