January 2010 Archives

Break My Concentration and I Break Your Kneecaps

I own a good set of headphones that fully enclose my ears. I am not an audiophile; I just don't like to hear other people talk at me. When I am staring at my Emacs windows with headphones on, it generally isn't a physical cue that I am looking for conversation. In fact, when I am that deep into thinking out a problem and I get interrupted, I think about the anti-workplace-violence clause in the employee handbook, and how a poorly lit parking lot probably doesn't qualify as "company property".

Interrupting a thinking programmer is a sucker punch to productivity's kidney. Of course it's still important to keep open communication channels, especially in a small team. I don't mind answering questions and helping out, so long as it's not an immediate context switch for me, i.e. I'll help you if I don't have to speak.

Instant messaging is a decent first attempt, but it's only person-to-person communication. (And no, group IM never fucking works right.) Programming teams need group chat. White-label Twitter clones like Yammer are okay, but I feel icky using a product that is hailed as a technological advance for supporting the ability to identify topics by prefixing a word with a pound sign. That, and I want to keep an eye on the conversation as I work, and my attention isn't on my IM client or browser when I'm coding. It's on Emacs.

The answer, of course, is IRC.

My team recently grew, and four of us need to communicate constantly. I set up an IRC server and brought people in. One non-programmer who needed to be in the loop had never used IRC, but caught on quickly. Productivity is up, as is communication. The developer chat channel is right in front of me as I work, as a window in Emacs:

[Screenshot: the developer chat channel in an Emacs window]

Think of developer communication like I/O. There's blocking and nonblocking. When somebody talks to me as I work, my programming train of thought needs to block. With inline chat like you see above, I can answer questions when I have spare cycles. Since the conversation is integrated into my development environment, I don't need to look around at other applications, and there's no popup notification bouncing around like a Jack Russell terrier who got into my Adderall supply. Also, since it's Emacs, it's not vim. If you use vim, /quit #life.

Collaboration technology doesn't need to be re-invented every six years. The stuff we had in the eighties works just fine.

Options for Parallel Compression

At Milo, I pretty frequently need to pull data down from production to my workstation to test some new code. That's what happens when you raise a Series A round - you can't live-edit production data anymore. I think it's in the term sheet somewhere.

Anyhow, I was pulling down a 14GB MySQL database dump today. Compressing it with plain Jane gzip was pretty slow, so I looked for some parallel options. The server I was pulling from has 16 cores, so I figured I could make use of them. Here's what I found:

  • pbzip2 - Parallel BZIP2: Parallel implementation of BZIP2. BZIP2 is well known for being balls slow, so this speeds it up by using multiple CPUs.
  • pigz - Parallel GZIP: Parallel implementation of GZIP written by Mark Adler (guy who co-authored zlib and gzip, so you can be reasonably confident he has his shit together).

On the 14GB database dump, both are faster than vanilla GZIP. Because Hacker News and Reddit both love this shit, here are the timing stats:

  • Plain gzip, default compression level: 11 minutes, 58 seconds. Resultant file is 2.3GB.
  • pbzip2, default compression level: 8 minutes, 48 seconds. Resultant file is 1.7GB.
  • pigz, default compression level: 1 minute, 33 seconds. Resultant file is 2.3GB.

Again, this was on a 14GB database dump file, on a 16-core machine, with Intel solid state disks.
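For anyone who wants the numbers side by side, here is a quick back-of-the-envelope script. The timings and sizes are the ones listed above; the speedup, compression ratio, and throughput figures are derived from them, and I am assuming decimal units (1GB = 1000MB) for the throughput. The `summarize` helper is just a name I made up for this post.

```python
# Timings and sizes copied from the stats above; everything else is derived.
INPUT_GB = 14.0

runs = {
    # tool: (wall-clock seconds, output size in GB)
    "gzip": (11 * 60 + 58, 2.3),
    "pbzip2": (8 * 60 + 48, 1.7),
    "pigz": (1 * 60 + 33, 2.3),
}

baseline_seconds = runs["gzip"][0]


def summarize(seconds, output_gb):
    """Return (speedup vs. plain gzip, compression ratio, input MB/s)."""
    speedup = baseline_seconds / seconds
    ratio = INPUT_GB / output_gb
    throughput = INPUT_GB * 1000 / seconds  # assumes decimal MB
    return speedup, ratio, throughput


for tool, (seconds, output_gb) in runs.items():
    speedup, ratio, throughput = summarize(seconds, output_gb)
    print(f"{tool:>6}: {speedup:.1f}x vs gzip, {ratio:.1f}:1 ratio, {throughput:.1f} MB/s")
```

By this arithmetic, pigz comes out at roughly 7.7 times faster than plain gzip for the same output size, while pbzip2 is only about 1.4 times faster but produces a noticeably smaller file.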

If any readers know of other parallel compression schemes I can try, e-mail me and let me know. I will post stats here.
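If you are wondering how a format like gzip can be parallelized at all, here is a minimal sketch of one approach. To be clear, this is not how pigz is implemented; it is a simplified illustration that leans on the fact that several gzip members concatenated together still form a valid gzip file. The function names, the 16MB block size, and the worker count are all my own placeholders, and the threading only helps to the extent that CPython's zlib releases the GIL while compressing, which I believe it does.

```python
import gzip
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def read_chunks(stream, size):
    """Yield successive blocks from a binary stream until EOF."""
    while True:
        block = stream.read(size)
        if not block:
            return
        yield block


def parallel_gzip(src, dst, workers=16, level=6, chunk_size=16 * 1024 * 1024):
    """Compress src into dst as a series of concatenated gzip members.

    Each block is compressed independently on a worker thread, and the
    results are written out in the order the blocks were read.
    """
    pending = deque()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for block in read_chunks(src, chunk_size):
            pending.append(pool.submit(gzip.compress, block, level))
            # Cap the number of blocks in flight so memory use stays bounded.
            if len(pending) >= 2 * workers:
                dst.write(pending.popleft().result())
        # Drain whatever is still being compressed, oldest first.
        while pending:
            dst.write(pending.popleft().result())
```

Calling `parallel_gzip(sys.stdin.buffer, sys.stdout.buffer)` would turn this into a shell filter. Compressing blocks independently costs a little compression ratio, since each block starts with an empty dictionary, so expect output slightly larger than a single-stream gzip.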

I Love the GPL (Except When it Applies to Me)

Boy, do I love free software. It is usually pretty high quality, I don't have to pay for it, and I feel completely justified in criticizing the maintainers on public mailing lists for not supporting the exact features I need. Of course I'm not going to send patches back, because it's just way easier to bitch and moan.

Also, since my software product is a web service, I have exactly zero obligation to contribute anything back to the community, ever. Sure, I may use some GPLed software, but shit, actually following the spirit of the copyleft? Don't they know this is a business, not a charity? Fuck that noise.

I came up in the salad days of Slashdot, when the cast of villains and henchmen included Microsoft, SCO, and anyone else who wanted to turn a dime from software. We believed in the GPL, that a viral copyleft clause was good for humanity. That is, until we left academia and had to pay the rent.

Since the world appears to be moving toward software as a service (against my sage advice, mind you), it is blisteringly easy to be a champion of the ideals behind open source and free software, but still pussyfoot around when it comes to execution.  What I'm talking about is the loophole in the GPL that exempts application service providers from having to release their derivative works under the same license as the libraries.

The pedantic reader who is going to talk shit will point out the difference between open source and free software. So, before you write a blog post that nobody's going to read, allow me to demonstrate.

Open Source: I want to let others use my code in whatever manner they please, and not be bound by an anti-commercial license.

Free Software: I found a loophole in my student loan documentation that lets me defer payments for decades, so long as I stay in the Ph.D. program!

If anything good comes out of Web 2.0, it's the malignant tumor on the GPL's kidney, still wrongly diagnosed as a urinary tract infection.

Back in the Slashdot days, we all thought that the fate of free software would be decided by a landmark court decision, that if the ideals of the GPL were to die, they would wind up meeting a ceremonious end like the cabinet members of a government overthrown in a military coup. But no - the free software ideal will die at the hands of a thousand poseurs, all of whom want the notoriety of contributing to open source, but none of whom have enough conviction to release any of their business's core code under a free license.

The copyleft will share the same fate as the hippie movement, now only a shell of its former self, supported by college-age kids who hang out in the Haight-Ashbury and smoke pot all day, and at night, drive their Lexuses over the Golden Gate, back to Marin County. But you will take off that damn Che Guevara shirt before you come back into my house, young man.

Look at all of the open source software in modern use. The vast majority of it is licensed under terms without a copyleft clause. The BSD license, Apache license, MIT license, and a handful of others are the most prevalent. In some places, the GPL still kicks around, but since we are application service providers, we are all free to ignore it.

The Affero General Public License, a version of the GPL that closes the service-provider loophole, is almost nowhere to be found. The only new-hotness software I know of that is licensed under Affero is MongoDB, and even they have a chickenshit implementation - they have structured the code such that the 99% case of a web application using Mongo is effectively bound by the Apache license.

Affero-licensing your project is a fatal defect if you want it to be used. Since the current flow of the software industry has effectively neutered the GPL, the only serious chance the copyleft has is the Affero license, and that sure-as-shit ain't gonna happen.

The toll on the Golden Gate Bridge is now six dollars.