September 2008 Archives

We Read Digg And Reddit So You Don't Have To

We put out some new features in Pressflip yesterday.  We've expanded the service to be a general "persistent search" that takes advantage of metadata.

Collins, can we skip the technicals, please?  What we've come up with is an application platform, so you can, for example, add the Reddit application and get Reddit.com content matched to your saved searches in Pressflip.  You'll see a Reddit widget next to results that come from Reddit, like this, a result to my saved search about cars:

reddit-example.png
We've got more applications: Digg, Twitter, Fark, Topix, and a few others, along with the standard load of news, blogs, and magazines that we've always been indexing.  There are some more in the pipe, but this is the general direction of our product.

Our thinking is that there is a ton of data out there, some of it searchable, some of it not.  If we can connect silos of data to your saved searches, then we can provide a better experience than, say, Google Alerts.  By taking advantage of APIs provided by various services, we can provide richer and more unified results.

To illustrate this new platform better, I have come up with this awesome graphic:

pf-conftent-sources.png

As we add more content sources, we'll have a really unique saved search experience.

So, check out pressflip.com and see for yourself.

OpenSocial, OpenID, and Gears

At El Reg. This one has put more buttache than usual into my inbox.

Java subList Gotcha

yogi.jpg
Don't ever try to do anything sneaky when you're programming.  It will always bite you in the ass.  If you still want to be sneaky, read the documentation.

Last week, we had a problem with one of our processes hanging and burning 100% CPU.  The first time it happened we chalked it up to mysteries of the universe and restarted the process (a time-honored startup tradition), but the second time, I actually got off my ass and investigated.

Through the miracle of jstack, I could look at the stack trace of a currently running Java process.  This is what I found:



"main" prio=10 tid=0x0a030800 nid=0x10a6 runnable [0xb7d7d000..0xb7d82218]
   java.lang.Thread.State: RUNNABLE
     at java.util.SubList$1.nextIndex(AbstractList.java:713)
     at java.util.SubList$1.nextIndex(AbstractList.java:713)

...snip... about 100 lines

     at java.util.SubList$1.hasNext(AbstractList.java:691)
     at java.util.SubList$1.next(AbstractList.java:695)
     at java.util.SubList$1.next(AbstractList.java:696)

...snip... about 100 lines

     at com.pressflip.pipeline.standard.deduper.ShingleDupeDetector.dedupBatch(ShingleDupeDetector.java:139)
     at com.pressflip.pipeline.standard.deduper.DeduperPipelineStep.innerProcess(DeduperPipelineStep.java:115)

... and right down to the main() from here.

The suspect line in all this is ShingleDupeDetector.java:139, which is one of those how-the-hell-are-you-hanging-on-this lines:

for (Integer x : someCollectionOfIntegers) {

So what the shit, right?

I was using this collection as a cache of sorts, where on every run, I chopped some data off the front of it and added some data to the back, keeping the collection size constant.  To accomplish this, I used the subList method on java.util.List, something like this:


someCollectionOfIntegers = someCollectionOfIntegers.subList(fromIndex,
                                                    someCollectionOfIntegers.size());
someCollectionOfIntegers.addAll(incoming);

Well it turns out that subList didn't do what I thought it did.  I assumed that I just got a new List that contained the elements in the given range of the original.  Oh no, subList returns a view of the original list where only elements in the given range are addressable.  A look at AbstractList.java's source reveals this:

 
public List<E> subList(int fromIndex, int toIndex) {
        return new SubList<E>(this, fromIndex, toIndex);
}

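And the SubList object keeps a reference to this, as well as an offset to know where iteration starts.  Every time I updated the "cache", I wrapped the previous SubList in yet another SubList, so each call on the iterator had to recurse down through every earlier layer.  Oh, balls.  That's why it was running slow.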
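
Here's a minimal sketch of the gotcha and one way around it, assuming the cache is an ArrayList of Integers.  The class name SubListCache and the helper trimAndAppend are made up for illustration; this is not the actual Pressflip code.  The first part shows that subList hands back a view (a write through it lands in the backing list).  The second part copies the view into a fresh ArrayList on every run, so no chain of views can build up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SubListCache {

    // Hypothetical helper: drop the first fromIndex elements, append the
    // incoming ones, and return a brand new ArrayList. Copying the view means
    // the result holds no reference back to the old list.
    static List<Integer> trimAndAppend(List<Integer> cache, int fromIndex,
                                       List<Integer> incoming) {
        List<Integer> next = new ArrayList<>(cache.subList(fromIndex, cache.size()));
        next.addAll(incoming);
        return next;
    }

    public static void main(String[] args) {
        // subList is a view: adding through the view changes the backing list.
        List<Integer> backing = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        List<Integer> view = backing.subList(2, backing.size());
        view.add(5);
        System.out.println(backing); // prints [1, 2, 3, 4, 5]

        // Rolling cache without nesting: every run starts from a plain ArrayList.
        List<Integer> cache = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        for (int run = 0; run < 3; run++) {
            cache = trimAndAppend(cache, 2, Arrays.asList(10 + run, 20 + run));
        }
        System.out.println(cache); // prints [11, 21, 12, 22]
    }
}
```

The copy costs one pass over the retained elements per run, which is a lot cheaper than an iterator that recurses through a hundred stacked views.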

More Of This Chrome Business

The latest installment of my column at The Register went live this morning.  It expands a bit more on the Chrome horseshit.

Best reader comment so far: "Chrome is good but it's not Jesus."

A Web OS? Are You Dense?

People are calling Google Chrome a "Web Operating System" and a "Cloud Operating System".  Some are even calling it a Windows killer.

I think it's time to nip this horseshit in the bud, before it gets out of hand.

How Does Arringtons Know What Operating Systems Is?


He doesn't.  It is TechCrunch's official position that Google Chrome will compete full on with Microsoft Windows, and computers will be sold with Chrome only, having the Windows layer "stripped out".  I am not shitting you, he actually said that.  Yeah, I get where the argument is going about web apps being more dominant than desktop apps.  That prediction is a crock of shit.  A 2007 survey found that 73% of Americans have never even heard of Google Docs, and 94% have never tried an online office suite.  Yeah, desktop apps aren't going anywhere.

But I'm not here to talk shit on Web 2.0 today.  I'm going to present a glimpse of the hole that the incompetent programmers are digging for us.

When Times Were Simple

Let's have a look at the application stack that we all know and love: programs compiled to run in an environment with a C library.

normal-cropped.gif

Fuck me, life is good.

Making It Easier On Programmers

I first learned to program in C++ and then later on I learned Java in college.  I thought the whole Java Runtime Environment thing was kind of weak, but if it means I don't have to manage memory, that's cool.  Same goes for Python, Ruby, and whatever else has its own VM or interpreter.

runtime-cropped.gif

This situation is pretty agreeable, and lets us prototype applications rapidly.  Sure, there's a small trade-off with execution speed, but they have multi-gigahertz processors nowadays.  No big deal.

Making It Easier On Idiots

After a while, everybody wanted to be a programmer.  Since programming is actually kind of hard, many of these folk landed in PHP and HTML, hence the explosion of webapps.  As such, the browser became a feeble example of a "runtime".

Now, with Google Chrome being lauded as a Web Operating System, the stack gets way bigger.  This is what it looks like on my computer, considering I run Linux and Google hasn't released their Operating System for the Linux Operating System (that makes sense, doesn't it?).

chrome-os2.gif

Users have pretty basic needs when it comes to computers.  They want word processing, spreadsheets, communications, and games.  These needs have not changed much since the advent of the personal computer.  So, when your aunt asks why her 1.2GHz computer isn't fast enough to run an online word processor that has the same fucking features as the 1987 version of Corel WordPerfect, you don't have an answer for her.  There is no justification.

The "Web Operating System" just highlights how much journalists don't know about computers.




Some Cockbite Is Impersonating Me On Twitter

1217880579840.jpg

http://twitter.com/teddziuba

That's not me.  Anybody know who it is?  tjdziuba@gmail.com.

Arrington, Are You A Fuckin' Idiot Or Something?

"[Google] Chrome is nothing less than a full on desktop operating system that will compete head on with Windows."

-Michael Arrington, TechCrunch
http://www.techcrunch.com/2008/09/01/meet-chrome-googles-windows-killer/

Why do people continue to take news from a person who fundamentally does not understand how computers work?  Because he has new content for people to read, every day.  People will read what you tell them to read, as long as you keep telling them to read it.

What an absolute infant.

OpenID Is Why I Hate The Internet

1219330836370.jpg

I've been farting around with Jeff Atwood's StackOverflow for a few weeks now as a beta tester.

Everything was all well and good until I had to figure out how to use OpenID.  I've been watching the development of this shit from the sidelines for a while (well, if reading something about OpenID blah blah blah on TechCrunch and saying, aw, that's cute, then getting back to work counts).  I understand the problem that OpenID is trying to solve, but the approach is way too, uh, how to put this, San Francisco.

What I mean to say is that the pathologically idealist, pedantic approach to universal authentication makes it too hard for users to understand.

A Problem That Doesn't Need Solving

Alright, here's the ideal scenario.  I have one set of credentials for everything I use.  I can use the same username and password pair for Facebook, my blog, my e-mail, whatever.

We have had a solution to this problem for decades: using the same God damned username and password for every website that needs them.  Users will forever continue to do this no matter what cutesy shit Diffie-Hellman key exchange you come up with.

But for the benefit of the doubt, let's try OpenID as a normal user.  I am visiting a website that uses OpenID for authentication, and I don't have an OpenID account.  OK, how do I get one?

Well, that's easy.  Just pick one of these OpenID providers that you trust and head over there!

OK, I pick VeriSign.  I've seen their name before with stuff that has to do with security.  I went to Verisign's website and entered in all my information.  It gave me the name of some website, http://teddziuba.pip.verisignlabs.com/, am I supposed to go to that website to log in to your website?

No, no, no.  That's the address of the OpenID provider that you're supposed to blah blah a bunch of smart talk that makes me really sound like I know what I'm talking about and makes the user feel small for not realizing how fucking awesome this whole scheme is.

See where this is going?  This shit is too pedantic, too convoluted, and violates too many preconceived notions of how authentication works.

Instead of trying to figure out your bullshit, a user will just use the same username and password that he uses for everything.   Problem solved.

As A Developer

Let's suppose that by way of some miracle, OpenID takes off.  There are millions of users who understand just how brilliant and altruistic Brad Fitzpatrick is, and can figure out how to deal with this identity provider nonsense.

(Side note: but Ted! There are hundreds of millions of OpenID users who have accounts by virtue of having accounts on such-and-such websites!  Yes, but how many of them know this?  How many of them care?  That's what I thought.)

I'm a lazy ass developer.  Making a table with usernames and MD5'ed passwords is pretty damned easy.  Now I've got to figure out something about attaching OpenIDs to my existing user accounts, gotta do some shit with HTTP given that URL the user is going to use to sign in, probably have to redirect them somewhere off my site if they're not logged in.  What a pain in the nuts.

There aren't enough people using OpenID now to make it worth my while.  People will be turned away from your website for hundreds of other reasons before it comes down to you supporting OpenID.

tl;dr

OpenID is too idealistic to be useful.
