on Apr 7, 2013
This post could also be titled, “Much Ado About Nothing”. It began as an ill-advised Twitter tirade, and turned into a confusing argument in which pretty much everyone was in misunderstood agreement. If the medium is the message, then the message of Twitter is that nobody has the time to communicate clearly.
The dispute is all settled down now, but it’s still nagging at me and so I want to put everything here as an official record of the foolishness. It begins with this preview of the upcoming Thief game, where we learn that:
As I learned straight from the developers that showed us the game, the artificial intelligence in Thief is something that is only possible on next-gen hardware. Whereas in past games, a soldier might react in an “if, then” manner – for example, if a glass breaks, then walk towards the sound – your enemies in Thief won’t be so predictable. Say you snuff out a lantern and attract the attention of a guard. At first, the enemy may simply mumble to himself that it must have been the wind, but should you do it again, the chances of the guard growing suspicious heightens. Do it a third time and he may launch an all-out sweep of the area, and each time is different.
This assertion is so monumentally preposterous I still get annoyed when I read it. Saying we needed next-gen hardware to do if, then, else is like saying that getting groceries is only possible with the new Chevy f-10 pickup. I understand that you want to sell the game and the platform, and I’m willing to entertain a little hyperbole when it comes to marketing, but this isn’t just an exaggeration. This isn’t a simple lie. It’s making the public more ignorant about programming in general and then taking advantage of the confusion to tell them a lie.
Annoyed, I took to Twitter:
There are a LOT of reasons AI isn’t getting better, but it’s NOTHING to do with CPU cycles. We’ve had the CPU cycles for at LEAST a decade.
— Shamus Young (@shamusyoung) April 4, 2013
Then, still annoyed, I wrote another. And this one is what caused the trouble:
Good AI is hard to write, hard to test, and does NOT sell outside of strategy games. Next-gen graphics will make it HARDER to invest in AI.
— Shamus Young (@shamusyoung) April 4, 2013
Now, this was a continuation of the previous thought: The CPU cycles are there and we’ve been capable of much better AI for ages, but AI takes resources to develop and it’s not a selling point in AAA games. It’s not that AI can’t sell, or that better AI wouldn’t make the game better, but it’s not usually part of the marketing and not often demanded by the audience. What we need is a change in attitude, not more processing power.
The last time I got really excited about AI in a shooter was back in 2007 when I played the original FEAR. (I also played the sequel, where they abandoned the smart, unpredictable and dynamic enemies for the bog-standard target dummies we get in most shooters. Not only is AI not getting appreciably better in AAA shooters, but it might even be regressing in some cases.)
This tweet was repeated a few times, people responded to it, and it generally spread around until it reached people who didn’t have the earlier context. Taken alone, you might assume I was saying:
AI is too difficult to write or test, and it doesn’t matter to the consumer. The better graphics of the next generation of machines will make AI itself fundamentally more difficult to write.
Of course, that interpretation didn’t occur to me. Over the next few days three different AI programmers responded to my comment as if I was making crazy talk. I tried to understand, but at only 140 characters it’s really hard to express, support, and clarify a nuanced argument about the intersection of software and business.
The irony here is that we were all basically in agreement on the essentials. From my point of view it looked like I (a former graphics programmer) was saying AI could be better if people were willing to invest in it, and actual AI programmers were disputing me. I got so exasperated I ended up blocking one of them. (It’s rude, but if you’re tempted to express anger in a public place it’s best to just mute the conversation and walk away. Maybe come back later.)
Once we untangled it we parted amicably.
And just another blanket apology for anyone who missed it: Sorry for being crabby.
With that out of the way, let’s loop back and look at the stuff they’re talking about in Thief.
Whereas in past games, a soldier might react in an “if, then” manner – for example, if a glass breaks, then walk towards the sound – your enemies in Thief won’t be so predictable. Say you snuff out a lantern and attract the attention of a guard. At first, the enemy may simply mumble to himself that it must have been the wind, but should you do it again, the chances of the guard growing suspicious heightens. Do it a third time and he may launch an all-out sweep of the area, and each time is different.
Now, I am not an AI programmer. And I’m sure that since this was said in the context of a press demo, everything was greatly simplified for public consumption. But what is described here is pretty simple stuff. We’ve been able to handle this type of thing for literally decades. Even a terrible, stupid, shallow AI from a strategy game a decade ago is going to be orders of magnitude more sophisticated than this. You’ve basically got a “paranoia” value that goes up when suspicious things happen, and the higher it is the greater the chance that the guard will begin searching for the player. That’s really cool and I’m eager to see it in action, but it’s not something that’s going to tax your hardware.
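To show how little machinery that takes, here is a minimal sketch in Python. It is not how Thief does it. The class name, the `severity` parameter, and the numbers are all invented for illustration, and it assumes the only thing driving the decision is a single paranoia value clamped between 0 and 1.

```python
import random


class Guard:
    """Hypothetical guard driven by one 'paranoia' value (illustrative only)."""

    def __init__(self, rng=None):
        self.paranoia = 0.0  # 0.0 = relaxed, 1.0 = certain something is wrong
        self.rng = rng if rng is not None else random.Random()

    def notice(self, severity=0.3):
        """Register a suspicious event and return True if the guard starts a search."""
        # Each suspicious event raises paranoia, capped at 1.0.
        self.paranoia = min(1.0, self.paranoia + severity)
        # The higher the paranoia, the greater the chance of a search.
        return self.rng.random() < self.paranoia


guard = Guard()
for attempt in range(1, 4):
    searching = guard.notice()  # e.g. the player snuffs out a lantern
    print(attempt, round(guard.paranoia, 2), searching)
```

That is one addition, one comparison, and one random number per event. The random roll is also why "each time is different".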
Again, compare this to the complexity of (say) an AI in Civilization V that’s trying to decide how to allocate its resources towards infrastructure, military, research, and exploration. Or even just deciding which city to make a priority. Or which foe poses the greatest threat. Even in Thief, the decision about where to look and when to stop looking is going to be far more complex than the decision to begin the search in the first place. Thief may or may not have good AI – we won’t know until it’s released – but the thing they’re holding up as an example of AI sophistication is probably the simplest part of the entire system. It’s like saying Albert Einstein was a genius because he could remember where his house was.
But one thing I want to point out about this specific example of guards becoming aware of the player: A lot of the expense of a system like this is not in the AI coding, but in the art assets. It’s not enough for the AI coder to make some code that says, “If you hear a sound, path to the source of the noise and investigate.” Imagine how it would look for a guard to walk towards the source of a noise using his regular animations. He walks head up, eyes forward, arms passive at his sides. Even if the programmer has the guard in a super-alert searching state, we can’t see that as players. Instead, this search makes the guard seem even dumber than if he’d just ignored the sound entirely.
What you want is a system with different levels of visible awareness. If the Thief AI developers decided they want four different levels of awareness from oblivious to alert, those levels won’t enhance the user’s experience unless the user can see that these different states exist. Maybe the guard begins slack-jawed, bored, and mumbling to himself. If he’s spooked he starts looking around and paying attention. If he gets suspicious he might begin searching carefully, scanning the room with his hand on the hilt of his sword. And if he’s detected you he’s going to be pissed off (or terrified, with bonus points if the system has both like in Batman: Arkham Asylum) and walking around with his sword at the ready. Each state needs distinct animations and dialog.
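On the code side, those four levels might look something like the sketch below. Again, this is Python with made-up names: the thresholds, the animation names, and the chatter names are hypothetical placeholders, and tying the levels to a single 0-to-1 paranoia number is my assumption, not anything the developers have described.

```python
from enum import IntEnum


class Awareness(IntEnum):
    OBLIVIOUS = 0
    SPOOKED = 1
    SUSPICIOUS = 2
    ALERT = 3


# Hypothetical (animation set, chatter set) per level. These are the assets
# the art and audio teams would have to produce for each state.
PRESENTATION = {
    Awareness.OBLIVIOUS: ("idle_slouch", "bored_mumbling"),
    Awareness.SPOOKED: ("look_around", "whos_there"),
    Awareness.SUSPICIOUS: ("search_hand_on_hilt", "i_know_youre_here"),
    Awareness.ALERT: ("sword_drawn", "angry_shouting"),
}


def awareness_from_paranoia(paranoia):
    """Map a paranoia value in [0, 1] to a visible level (arbitrary thresholds)."""
    if paranoia >= 0.9:
        return Awareness.ALERT
    if paranoia >= 0.6:
        return Awareness.SUSPICIOUS
    if paranoia >= 0.3:
        return Awareness.SPOOKED
    return Awareness.OBLIVIOUS


level = awareness_from_paranoia(0.65)
animation, chatter = PRESENTATION[level]
print(level.name, animation, chatter)
```

The function is a handful of comparisons. The table is where the money goes, because every entry in it is an animation or a recording that somebody has to make.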
This work ends up getting multiplied across all enemy types. So if we’ve got four states of awareness, then we need four sets of chatter for every enemy in the game.
It’s possible for the AI coder to have a guard share his awareness level with other guards. (If guard A sees guard B in an alarmed state, assume the same state.) But it’s much harder to do this in a way that will make sense to the player and really convey to them the depth of what is going on. You don’t want them to see all the guards in the room magically adopt the same posture like they’re a hive-mind. If you want the player to understand the guards are communicating, then the guards need to visibly and audibly communicate with hand gestures and words. A lot of the expense isn’t in making the foes smart, but in showing that they’re merely smart and not omniscient. Again, this leads to a multiplication of animation and voice assets.
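A rough sketch of the difference, in Python with invented names throughout: the state change itself is one assignment, but to avoid the hive-mind look it only happens when the second guard could plausibly perceive the first, and it comes with a list of cues that someone has to animate and record. The `receiver_can_perceive` flag stands in for whatever line-of-sight or hearing check a real game would use.

```python
from dataclasses import dataclass


@dataclass
class Guard:
    name: str
    awareness: int = 0  # hypothetical scale: 0 = oblivious ... 3 = alert


def warn(sender, receiver, receiver_can_perceive):
    """Pass the sender's awareness to the receiver through a visible signal.

    Returns the animation and voice cues to play so the player can follow
    what happened. An empty list means nothing was communicated.
    """
    if not receiver_can_perceive or sender.awareness <= receiver.awareness:
        return []
    receiver.awareness = sender.awareness  # the cheap part
    return [  # the expensive part: each cue is an authored asset
        f"{sender.name}: play 'point_and_shout' gesture",
        f"{sender.name}: say 'alarm' voice line",
        f"{receiver.name}: play 'startled_turn' animation",
    ]


guard_a = Guard("Guard A", awareness=3)
guard_b = Guard("Guard B")
for cue in warn(guard_a, guard_b, receiver_can_perceive=True):
    print(cue)
```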
All of this becomes that much harder to pull off when you move to a new set of hardware. Marketing has been telling people the graphics on this new device will be worth their $400, and the audience is showing up expecting to be amazed. So now all the guards have to be rendered in super-realistic detail. This means the animations are more complex to make, since now we need all those drapey cloaks and locks of long hair to move and sway realistically. Oh, and at this level of detail it won’t do to have the voices coming out of inert heads, so all those lines of dialog will also need to be lip-synced.
With graphics this photorealistic it would be goofy to have all the guards look like an army of clone troopers, so we need to make different models. And if we’re going to have different looking guards then it won’t do to have them all speaking in the same voice. So now we need multiple voices for multiple models which all must be rigged and animated with many states of awareness and have dialog for changing states, sharing states, and while idling in a particular state, and all of that needs to be lip-synced, and oh my gosh this game is costing us how much to produce?
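For a feel of how fast that multiplies, here is some back-of-the-envelope arithmetic in Python. Every number is invented; I have no idea what Thief’s real asset counts are. It assumes each model needs its own animation set per awareness state, each voice needs several variants of each kind of line per state, and each recorded line needs one lip-sync pass.

```python
def asset_counts(models, voices, states, line_kinds, variants_per_line):
    """Rough asset totals under the simplifying assumptions described above."""
    animation_sets = models * states
    dialog_lines = voices * states * line_kinds * variants_per_line
    return {
        "animation_sets": animation_sets,
        "dialog_lines": dialog_lines,
        "lip_sync_passes": dialog_lines,  # assumed: one pass per recorded line
    }


# One guard model, one voice, one state, one kind of chatter.
simple = asset_counts(models=1, voices=1, states=1, line_kinds=1, variants_per_line=5)

# Six models, four voices, four awareness states, and three kinds of line
# (changing state, sharing state, idling in a state).
fancy = asset_counts(models=6, voices=4, states=4, line_kinds=3, variants_per_line=5)

print(simple["dialog_lines"], fancy["dialog_lines"])  # prints: 5 240
```

None of those multipliers touch the AI code. They all land on the art and audio budget.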
Again, Thief could be a great game. It might even have mind-blowing, groundbreaking AI. Eidos did Deus Ex: Human Revolution and I was a fan of that game. But this initial round of press does not inspire confidence, and their claim that the new hardware will open up new frontiers in AI is either foolishness or sophistry.
Shamus Young is an old-school OpenGL programmer, author, and composer. He runs this site and if anything is broken you should probably blame him.