Probabilistic Systems

By Shamus
on Jan 13, 2006
Filed under:
Nerd Culture

Kaedrin has a post about probabilistic systems (a new word for me, although I’m familiar with the concept) like recommendations, weblogs, Netflix movie ratings, and any other systems governed by mass input instead of authoritative control. Of all the systems I’ve mentioned, only one – weblogs, and the way they link to one another – is a truly “natural” system free of any tampering.

What weblogs exist, what information they contain, and how they link to one another cannot be controlled. This is different from Wikipedia, Netflix ratings, emusic ratings, and a host of other content feedback systems. The weblog system is spontaneous and naturally occurring. Glenn Reynolds might not be your cup of tea (I don’t read him myself) but he’s the most popular political blogger (or was, last time I checked; cut me some slack here, I’m making a point) and there is absolutely nothing you can do to change that aside from making your own blog and being better than he is. And by better, I mean better in the eyes of everyone else. You can’t fiddle with the system because there is no central point of control. It is the result of millions of people making their own decisions. This is different from the other systems, which are driven by software, and are usually hosted on a particular set of servers. They are, in fact, “owned” by someone. (For you young kids, I’m simply using an archaic spelling of the word “pwn3d”.)

One thing I haven’t seen yet, but keep expecting, is for the systems that are owned to experience a certain degree of cheating. Consider:

You’re a company like Amazon. You buy a million red widgets and a million blue widgets. You make a better margin on the blue ones, but it turns out that the red widgets are just a little better in quality. So the feedback for red is a little better. That leads to red being recommended more often than blue, which leads to better sales, more feedback, and even more recommendations. Now you’re down to your last 100,000 red but you still have 500,000 blue.

Now comes the moment of truth: Do you cheat? You’d rather sell blue. You see that you could “nudge” the numbers in the feedback system. You own the software, pay the programmers who maintain it, and control the servers on which the system is run. You could easily adjust things so that blue recommendations appear more often, even though they are less popular. When Amazon comes up with “You might also enjoy… A blue widget” a customer has no idea of the numbers behind it. You could have the system try to even things out between the more popular red and the more profitable blue.
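
To make the “nudge” concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the Widget class, the ratings, the margins, and the margin_weight knob are my own assumptions, not anything Amazon or Netflix is known to do. It just shows how small a change it would take to flip a recommendation.

```python
from dataclasses import dataclass


@dataclass
class Widget:
    name: str
    rating: float  # average customer feedback, 0 to 5 (invented number)
    margin: float  # profit per unit sold, in dollars (invented number)


def score(widget, margin_weight=0.0):
    # With margin_weight at 0 the score is pure customer feedback.
    # Any positive weight tilts the ranking toward the more profitable item.
    return widget.rating + margin_weight * widget.margin


def recommend(widgets, margin_weight=0.0):
    # Recommend whichever widget has the highest (possibly tilted) score.
    return max(widgets, key=lambda w: score(w, margin_weight)).name


# Red is slightly better liked; blue earns more per sale.
red = Widget("red", rating=4.4, margin=1.0)
blue = Widget("blue", rating=4.1, margin=3.0)

print(recommend([red, blue]))                     # honest ranking
print(recommend([red, blue], margin_weight=0.1))  # a small nudge
print(recommend([red, blue], margin_weight=0.2))  # a slightly bigger nudge
```

With these invented numbers, the honest ranking and the small nudge both pick red, while the slightly bigger nudge picks blue. The customer only ever sees the name that comes out, never the weight that went in.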

Does this really happen? I have no idea. I’m not really accusing Amazon of anything so much as pointing out that, in general, owned systems like this are easy to tamper with and offer a lot of incentives for the owner to do so. When Netflix suggests a movie to me that I don’t like, I never think, “Wow, this system isn’t working very well”. Instead I think, “I wonder how much money Netflix was paid to pimp this lame movie”. I guess I’m just really cynical, but these systems are made to attract customers and generate revenue. If a little hard-to-detect inaccuracy can mean more revenue, I don’t see why the systems wouldn’t be “tweaked”.

But how much can you cheat? How far can you push it before some clever guy proves it and posts his findings on Slashdot? That’s a tough call. (It reminds me a bit of the problem facing Waterhouse and Turing in the book Cryptonomicon. How much can we use this information to our advantage, without revealing that we are using this information?) Over time, when your fiddling goes undetected, you will have an incentive to keep pushing it, and moving the numbers even more in order to unload unwanted inventory or to favor suppliers in exchange for money. Each time you tweak the system, it becomes just a tiny bit less useful to the customer, but also more profitable for you. You always have an incentive to tilt things a little more, and you don’t know where the line of detection is.

It is possible that all of these systems are driven by pure numbers and have never been messed with. However, just knowing what I outline above makes me trust them less.

  2. angel says:

    Only just noticed this post, but I have to say, that’s one of the reasons I tend to ignore lovefilm’s recommendations. (That, and the fact that the first page in their recommendations consists entirely of DVDs I own but which won’t stay removed from the recommendations list)
    This kind of tampering is one of the main reasons behind my ugly bodge of a movie recommendations system … let everyone’s PVRs share your recommendations through a distributed database, and remove the single point of control.

  1. By Kaedrin Weblog on Sun Jan 15, 2006 at 1:24 pm

