Probably right

By Shamus Posted Friday Jan 20, 2006

Filed under: Links 3 comments

Mark normally posts on Sundays, but he seems to be on a roll this week. He has more on probabilistic systems, which I mentioned earlier. This led me to this bit from Nicholas Carr, which is one side of a debate on the merits of probabilistic systems.

Back already? Great.

As others have pointed out, one thing about these systems is that even if nobody is cheating, deciding what is “good enough” is a bit abstract: It depends on what you want to do with the emergent data, and what your standards are for usefulness. Everyone’s big problem seems to be with Wikipedia. It is often used as an example of a probabilistic system that doesn’t really deliver and (occasionally) used as an indictment of probabilistic systems in general. As far as probabilistic systems go, Wiki is really a poor example. I think it’s a stretch to lump it in with systems like Google and Technorati. So what makes Wikipedia so different?

Low fault tolerance

Let’s say I wrote some software that looks at common airplane approach vectors to major airports. Pilots can enter their current position, their destination, a few other variables, and my program will then come back with, “Based on what other pilots have done in similar circumstances, we suggest using the following approach…” Let’s assume I do a good job and my program makes the right choice nearly every time.

Well, we can stop right there. Nearly every time isn’t nearly good enough in this situation. I don’t care how much depth we give the dataset or how many variables we take into account. The whole system is useless.

On the other hand, let’s say you want a picture of Britney Spears for your desktop (humor me here) and Google comes back with a less-than-optimal result. Instead of giving you the “official” page run by some media company, it gives you a website maintained by a fan. Odds are, his site has what you want as well. Even if it doesn’t, he probably has a link that will point you to the goods.

The difference between these two situations is pretty stark. One is a waste of time, even with a 99% success rate, and the other works well enough even when it gets things “wrong”.
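To put rough numbers on that contrast, here is a minimal sketch. Every figure in it is made up for illustration, and the function name `expected_cost_of_errors` is my own invention. The idea is simply that what matters is the failure rate multiplied by what each failure costs you, not the success rate alone.

```python
def expected_cost_of_errors(success_rate, uses, cost_per_failure):
    """Total cost of the failures we should expect over `uses` attempts."""
    failure_rate = 1.0 - success_rate
    return failure_rate * uses * cost_per_failure


# Hypothetical numbers, chosen only to show the shape of the problem.
# Image search: a miss costs about one extra click.
search = expected_cost_of_errors(success_rate=0.90, uses=1000, cost_per_failure=1)

# Approach advice: a miss is catastrophic, so the per-failure cost is huge.
approach = expected_cost_of_errors(success_rate=0.99, uses=1000, cost_per_failure=1_000_000)

print(f"search:   {search:,.0f}")
print(f"approach: {approach:,.0f}")
```

Even with the better success rate, the approach advisor comes out far worse, because each of its rare failures costs so much more than a wasted click.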

And this is Wikipedia’s problem: Most people have a pretty low tolerance for error in an encyclopedia. If the info is wrong (or even suspect) then they have to look it up elsewhere, so why bother with Wiki at all? More to the point, if you have a low error tolerance, should you really be using probabilistic systems? Probably not.

Lack of Darwinism

As I understand Wikipedia, each subject has one entry. If I think the guy who wrote the entry for Article 153 of the Constitution of Malaysia got something wrong, I edit the original article. The next person to visit the page will see my version, not the original. People can review new changes or revert to old versions according to various rules, but at any time there is only one page for Article 153 of the Constitution of Malaysia, and the average visitor isn’t going to want to take part in the courtship between new data and old data.

This isn’t a good way to foster, uh, probabilism. For a healthy probabilistic system, my edit would need to create a new article that exists in parallel with the original. They would be “ranked” according to (perhaps) the number of incoming references that favor one version over the other, and the number of times users clicked on “this item was helpful”. The two versions of the same subject would be allowed to compete for visitors, with better pages slowly knocking less useful pages down in the rankings. Thus, each visitor contributes to the system by helping to rank pages, often by simply using them and then going away. This means the data gets more useful even when nobody is editing the articles themselves.
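A minimal sketch of what that ranking might look like, assuming a simple weighted sum of incoming references and “helpful” clicks. The class `ArticleVersion`, the function `rank_versions`, the weights, and the numbers are all hypothetical; nothing here reflects how Wikipedia or any real site works.

```python
from dataclasses import dataclass


@dataclass
class ArticleVersion:
    """One of several competing versions of the same subject."""
    author: str
    incoming_links: int = 0
    helpful_votes: int = 0

    def score(self, link_weight=1.0, vote_weight=1.0):
        # Assumed scoring rule: a plain weighted sum of the two signals.
        return link_weight * self.incoming_links + vote_weight * self.helpful_votes


def rank_versions(versions):
    """Return the versions ordered from most to least useful."""
    return sorted(versions, key=lambda v: v.score(), reverse=True)


versions = [
    ArticleVersion("original", incoming_links=12, helpful_votes=30),
    ArticleVersion("rewrite", incoming_links=5, helpful_votes=10),
]

# Visitors find the rewrite useful and say so. Nobody edits anything.
versions[1].helpful_votes += 40

best = rank_versions(versions)[0]
print(best.author)
```

The point of the sketch is the last few lines: the ranking changes because of what readers did, not because anyone rewrote an article.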

(Note that I’m not suggesting it should work this way. There are many reasons why this might not be a good idea. I’m just saying this would give the system much stronger probabilistic properties.)

Detecting bad data

As I mentioned before, often Google will give you a less-than-optimal result, but things still work out. Often the “wrong” site will contain a link to the “right” one. Finding a Britney Spears fan site leads me to the official one. The same is not true for poor Wiki. When I get to the wrong site, I don’t know it’s wrong. If I did, I wouldn’t need to look it up. Even worse, finding the wrong birthday for Napoleon doesn’t lead me to the right one. It leads me to propagate bad data.

Help from the user

It is very, very rare that I ever need to check out page 2 of Google search results. Usually what I want is right there on page 1. However, often my goal is not the #1 result. So, Google is great at narrowing a search down to 10 or so likely contenders, but it has a really hard time picking the right one out of those 10. Since it lists all 10, and lets me choose, it doesn’t have to. That last level of value judgments – the most difficult – is left for the user.

By contrast, there is no way the user can really “help” Wiki, unless they jump in and write an article.

I guess my point in all this is that Wiki, regardless of its usefulness, is a bit shabby when it comes to probabilistic properties.

 


 

Star Wars: Done Today

By Shamus Posted Thursday Jan 19, 2006

Filed under: Movies 66 comments

Imagine what it would be like if Star Wars had not been written 30 years ago.

Now picture a young, idealistic George Lucas showing up in Hollywood with the script for Star Wars: A New Hope in 2006. It’s a safe bet the studio executives of today wouldn’t look at the script and see “blockbuster”. Actually, it’s a safe bet they wouldn’t look at it at all. It doesn’t have any toy or comic-book tie-ins, after all. But, assuming George worked hard and was lucky, he might get the thing into the hands of someone who could make it happen. Some Hollywood bigshot. This person would not see the script as the start of a revolution. They probably wouldn’t even green-light it. But if they did, what would happen to the story? How would the movie turn out?


 


 

Chizumatic

By Shamus Posted Wednesday Jan 18, 2006

Filed under: Nerd Culture 1 comments

Chizumatic went down a few days ago when Steven Den Beste packed up and moved. Since the site was hosted on his home computer, it vanished when he unplugged it.

I hadn’t thought about this until recently. Most people have their stuff hosted by professionals. If my house is hit by a meteor, my site will continue to exist. (Although eventually it would go down when I stopped paying the bill.) Nearly everyone has others host their site for them. In fact, most people don’t even administer their own site: They sign up at MySpace or BlogSpot or some other place where hosting and administration are both taken care of, and the only thing they need to do is add content.

But what would the net be like if more people were self-hosters like Steven Den Beste? What if all bloggers just ran their own servers out of their own homes?

Linkrot would happen much faster. Old blogs on Blogspot live on, but a self-hosted site would vanish as soon as the owner stopped taking care of it. Sometimes blogs would vanish forever (or get wiped) when a hard drive failed. Bad storms would put little clusters of blogs out when the power went down. In fact, blogs would be winking on and off everywhere as people rebooted, lost power, or borrowed the hosting machine for a LAN party. When some nasty virus struck, it would take out a bunch of blogs and small-time sites as well.

Things would need to be structured differently: most of us can’t get that much upstream bandwidth from a residential location. At least, not in any affordable manner. This fascinates me, since the net itself is designed to be bottom-up, and this is one case when top-down actually works a little better.

I don’t really have a point. It’s just interesting to me.

 


 

Honey, can I borrow your lightsaber?

By Shamus Posted Wednesday Jan 18, 2006

Filed under: Pictures 4 comments

I present the following: A Photoshop project I did a few years ago, just before Episode II came out. Can you tell what it is? (Besides a lightsaber.)



It is…

My wife’s hair-curling iron, with the lever removed in Photoshop. It’s always looked like a lightsaber to me, and every time I see it I want to turn it on and start carving up droids. So, one day I took a picture of the thing and photoshopped it into the picture above.

Okay, I’m done being a dork now. Back to work.

 


 

Sleep Patterns

By Shamus Posted Monday Jan 16, 2006

Filed under: Personal 19 comments

Back in June there was a fascinating article over in Slate about sleep patterns in teens. The short version of the article is this: Many studies have shown that people who eat breakfast do better in school. This article makes the case that the reason they do better is that they are early risers, and so they are hungry. It has nothing to do with actually eating food, but everything to do with whether or not they are naturally awake and alert in the early morning. Therefore, dragging your not-an-early-riser kid out of bed to make them eat isn’t going to improve their performance.

But the article talks about some things I’ve been trying to explain to people for years. In my personal life, I’m surrounded by early-risers: People who hop out of bed with a smile and are ready to eat a big meal and attack the day! They start out alert, and slowly become tired as the day goes on. My own view of their cycle looks like this:


The average day of an Early Riser.

Note that these charts are entirely subjective. There is no data behind these. I’m just using them to make my point clear. I don’t want to create the impression that I’ve done some sort of formal data-collection.

I’m a slow, slow riser. I don’t get hungry until mid-afternoon, and I’m not ready for complex tasks until noon or so. The late evening hours are when I’m most awake and effective, and I’m all but useless in the morning. Many people are wired this way. My own cycle feels like this:


The life of a Night Owl.

This is bad, since it means I spend my worst hours working, and my time of highest mental activity is spent with family or playing video games. When I work on the weekends, I almost always do so at night, when I’m sharpest.

Note that when I talk about being “awake”, I’m talking about a broad range of physiological effects, not just alertness. For example:

  • I wake up feeling horrible. Classic morning zombie. I’m usually miserable right after waking up. In the evening, just as I fall in bed, I feel quite satisfied. This is in contrast to people who are frayed at the end of the day, but wake up feeling refreshed.
  • When I wake up in the morning (the low end of my cycle), I’m cold. At night (at the peak) I’m hot. I go to sleep with the covers off, and wake up with them wrapped around me, shivering.
  • I’m most creative and talkative at night. In the morning (I’m talking about the first few hours of the day, not just the half-hour or so after waking up) I hardly speak.
  • If I get sick, my symptoms usually hit overnight, so that I wake up sick. Through the day, the symptoms weaken, and I feel relatively better by evening. The next morning, the symptoms return. The cycle repeats until I get better.
  • I have no appetite at all in the morning. I’m not usually hungry until I’ve been up for 5 or 6 hours. At night – just before bed – I always eat.
  • I never laugh, and I smile very little, at the beginning of the day. I joke around a lot in the evening.
  • Music: never in the morning, often in the evening.

My wife is the opposite in almost every way: Wakes up refreshed. Goes to bed cold. Wakes up hot. Alert in the morning. Tired in the evening. Sick in the evening. Feels better in the morning.

I’d love to know the breakdown of how many people operate the way my wife does, versus people who operate the way I do. I predict that I’m in the minority, but it’s not like I’m in a position to back that up with hard numbers. I’d love to see a study on this.

But for people like me, how do you cope with the fact that most of the most important hours of the day are spent in a stupor? You can force yourself to get up earlier, but if you don’t get a solid 8 hours of sleep it will do more harm than good. Instead of moving the alert hours to earlier in the day, it just dilutes them:


A night owl wakes up early.

After years of dealing with this, I’ve adjusted my sleep so that I get up at 5am. It means going to bed around 9pm, but I’ve found it really helps me to function like a normal person. By the time normal people wake up I’ve gotten my brain in gear and I’m more or less ready to cope with them. I’m alert by the time I start work, which has done wonders for my productivity.

Up until about two years ago, I never really drank coffee. However, I eventually experimented with it and found that a good dose of caffeine is very useful in smoothing out the curve and making the morning hours more useful, at the expense of losing a bit of the edge at night. I usually skip the coffee on weekends, sleep in, and enjoy the extra energy in the evenings.


A night owl drinks coffee during the day and is a lot more useful in the morning.

It’s an interesting subject to me, although early-riser types have no patience for any of this. They think that their own pattern is “normal”, and if you are sleepy in the morning then it’s because you are irresponsible or lazy.

 


 

Half-Life 2

By Shamus Posted Saturday Jan 14, 2006

Filed under: Movies 17 comments

As I’ve mentioned before, I am not exactly what you would call a “fan” of Steam, the service used to safeguard Half-Life 2 against nefarious piracy. I am, however, partial to the game itself. So let’s take a look at what might happen if the game were to become a movie. Remember, this is just for fun. I am in no way suggesting that this would be a good idea.

So let’s get started…

Gordon Freeman

Deciding who would “play” Dr. Gordon Freeman is a bit strange. In a first-person game, the main character is the one you never see unless you look in a mirror. Gordon has no dialog, no personality, and is defined solely by his actions.

How does he feel about being regarded as a messiah? How does he feel about his role in opening the door to another dimension, or about his place as the puppet of the G-man? How does he feel about Alyx, or his fellow scientists? The answer, of course, is “however you think he feels”, since you play him in the game.

So, we might as well choose the actor to play this role based on looks alone. Ethan Hawke looks like a good match.

G-Man

The G-man is an enigma in the game. Whose side is he on, anyway?

His two most notable features are his gaunt appearance and his strange, off-beat speaking cadence. For the gaunt, remorseless look, you can’t go wrong with Christopher Walken. For the odd pauses and menacing delivery, I might suggest Alan Rickman.

Alyx Vance

My favorite thing about Alyx’s character is that she seems real. The obvious thing for video games or movies to do is to take the female lead and whore her up. You know: Hot pants, tube tops, and high kicks. (Or leather and guns.) But no, she doesn’t have a gravity-defying bosom and strut through an enemy base in high heels. Alyx is dressed in practical, durable clothing that fits her character and lifestyle as a member of the resistance. She isn’t wearing makeup. This sounds mundane, but video game females wearing freshly applied glossy lipstick into combat is so common that this borders on revolutionary. The rest of her shows the same attention to the authentic. More time was spent animating her eyes and mouth than her boobs and ass. This shows that her creators take her seriously, which lets the rest of us take her seriously.

To play Alyx, we can go with the obvious mainstream choice and use Halle Berry. She’s a great actress and playing the tough yet feminine Alyx is right up her alley.

As an alternate, I might suggest Gina Torres, who played Zoe on the ill-fated TV show Firefly. The way she looked when heading into combat reminds me a lot of Alyx.

Eli Vance

Robert Guillaume did the voice for Dr. Vance in the game, and as an added bonus looks like Dr. Vance, so I can’t think of a reason to use anyone else. The only nitpick is that Eli Vance ought to be about 50 or so, and Guillaume is nearly 80. However, he’s a robust 80, could pass for 60, and looks young enough to be Alyx’s father if he started late.

Barney Calhoun

Gary Sinise is the best choice I can come up with, although he’s not a perfect fit. He’s a bit too old and seems a bit too smart for Barney. He does have the right accent and the honest, down-to-earth delivery the part calls for. I keep thinking there has to be a better choice. Heather suggested that a young Matthew Broderick would look closer, but that would seem to sacrifice attitude and delivery for looks.

I’m betting there is a closer match that I haven’t thought of yet.

Dr. Kleiner

At first I thought I could be funny and suggest Bill Nye for this. He’s got that geek vibe the role needs. Hearing Bill Nye the Science Guy explaining the nonsense teleport mumbo-jumbo as if he was talking about real science should be good for a laugh, which is the whole point of Dr. Kleiner anyway.

But for looks I think a closer match is Bill Nighy. (Strange coincidence that the names are similar.) I think that once you get the glasses and the lab coat on him he should look just fine. However, his voice is too deep, and he’s a Brit. So, the choice: The right voice / vibe with Bill Nye, or the right appearance with Bill Nighy.

Dr. Breen

Like many of the greatest villains, Wallace Breen does not believe himself to be a bad person. He is much like the Frenchmen who thought the best they could do in WWII was help their Nazi conquerors and hope they would be treated well in return. He’s pragmatic, yet foolish and gutless. He’s a bureaucrat who aids the aliens in their conquest of Earth, in the hopes that someday the ends will justify his grotesque means. The typical Hollywood approach to someone like this would be to simplify the character and make him plain, easy-to-understand evil. You could give such a role to Christopher Lee and turn Breen into a cruel and calculating despot.

However, to stay true to the nature of Dr. Breen, you’d need someone who usually plays heroes. Someone who would otherwise be genuine and likeable. I’m open to suggestions.

Father Grigory

Good luck getting someone like Sean Connery to play a bit part like this. He’s a good fit anyway, and could pull off the odd graveyard humor Grigory uses.

 


 

Probabilistic Systems

By Shamus Posted Friday Jan 13, 2006

Filed under: Nerd Culture 2 comments

Kaedrin has a post about probabilistic systems (a new word for me, although I’m familiar with the concept) like Amazon.com recommendations, weblogs, Netflix movie ratings, and any other systems governed by mass input instead of authoritative control. Of all the systems I’ve mentioned, only one – weblogs, and the way they link to one another – is a truly “natural” system free of any tampering.

What weblogs exist, what information they contain, and how they link to one another, cannot be controlled. This is different from Wikipedia, Netflix, Amazon.com ratings, emusic ratings, and a host of other content feedback systems. The weblog system is spontaneous and naturally occurring. Glenn Reynolds might not be your cup of tea (I don’t read him myself) but he’s the most popular political blogger (or was, last time I checked; cut me some slack here, I’m making a point) and there is absolutely nothing you can do to change that aside from making your own blog and being better than he is. And by better, I mean, better in the eyes of everyone else. You can’t fiddle with the system because there is no central point of control. It is the result of millions of people making their own decisions. This is different from the other systems, which are driven by software, and are usually hosted on a particular set of servers. They are, in fact, “owned” by someone. (For you young kids, I’m simply using an archaic spelling of the word “pwn3d”.)

One thing I haven’t seen yet, but keep expecting, is for the systems that are owned to experience a certain degree of cheating. Consider:

You’re a company like Amazon.com. You buy a million red widgets and a million blue widgets. You make a better margin on the blue ones, but it turns out that the red widgets are just a little better in quality. So the feedback for red is a little better. Which leads to red being recommended more often than blue, which leads to better sales, more feedback, and even more recommendations. Now you’re down to your last 100,000 red but you still have 500,000 blue.

Now comes the moment of truth: Do you cheat? You’d rather sell blue. You see that you could “nudge” the numbers in the feedback system. You own the software, pay the programmers who maintain it, and control the servers on which the system is run. You could easily adjust things so that blue recommendations appear more often, even though they are less popular. When Amazon comes up with “You might also enjoy… A blue widget” a customer has no idea of the numbers behind it. You could have the system try to even things out between the more popular red and the more profitable blue.
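Here is a minimal sketch of how small that nudge could be. It is entirely hypothetical: the function names, the 0-to-1 scales, and the widget numbers are all invented for illustration, and I am not claiming any real store works this way.

```python
def recommendation_score(popularity, margin, nudge=0.0):
    """Blend customer feedback with the owner's profit margin.

    nudge=0.0 is the honest system driven purely by customers;
    nudge=1.0 ignores customers and ranks purely by margin.
    """
    return (1.0 - nudge) * popularity + nudge * margin


def pick(widgets, nudge=0.0):
    """Return the name of the widget the system would recommend."""
    return max(
        widgets,
        key=lambda name: recommendation_score(*widgets[name], nudge=nudge),
    )


# (popularity, margin), both on a made-up 0-to-1 scale.
widgets = {
    "red": (0.8, 0.2),   # customers like it a bit more
    "blue": (0.7, 0.5),  # the owner earns more on it
}

honest = pick(widgets)             # pure customer feedback
nudged = pick(widgets, nudge=0.3)  # a modest tilt toward margin

print(honest, nudged)
```

With these invented numbers the honest system recommends red, and a 30% tilt toward margin is enough to flip it to blue. From the outside, the customer sees exactly the same “You might also enjoy…” box either way.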

Does this really happen? I have no idea. I’m not really accusing Amazon of anything so much as pointing out that, in general, owned systems like this are easy to tamper with and offer a lot of incentives for the owner to do so. When Netflix suggests a movie to me that I don’t like, I never think, “Wow. This system isn’t working very well”. Instead I think, “I wonder how much money Netflix was paid to pimp this lame movie”. I guess I’m just really cynical, but these systems are made to attract customers and generate revenue. If a little hard-to-detect inaccuracy can mean more revenue, I don’t see why the systems wouldn’t be “tweaked”.

But how much can you cheat? How far can you push it before some clever guy proves it and posts his findings on Slashdot? That’s a tough call. (It reminds me a bit of the problem facing Waterhouse and Turing in the book Cryptonomicon. How much can we use this information to our advantage, without revealing that we are using this information?) Over time, when your fiddling goes undetected, you will have an incentive to keep pushing it, and moving the numbers even more in order to unload unwanted inventory or to favor suppliers in exchange for money. Each time you tweak the system, it becomes just a tiny bit less useful to the customer, but also more profitable for you. You always have an incentive to tilt things a little more, and you don’t know where the line of detection is.

It is possible that all of these systems are driven by pure numbers and have never been messed with. However, just knowing what I outline above makes me trust them less.