“Premature optimization is the root of all evil.”
That quote by Donald Knuth gets repeated a lot in programming circles. It dates back to 1974, which means that by the standards of computer science it’s more or less the Code of Hammurabi. While I freely admit that Knuth is a better computer scientist when he’s fast asleep than I am at the peak of a caffeine high, I do have some reservations about this particular bit of wisdom. Specifically: Is premature optimization really that dangerous, and what’s the difference between “premature optimizing” and simply building something properly the first time?
In my professional career, I can’t remember a single instance where premature optimization caused a serious disruption. Maybe that’s a side-effect of the domain I work in. Games have very strict performance requirements that would seem completely unreasonable – bordering on fanatical – to someone working on (say) database administration. The performance of your program is continuously expressed to the user through the framerate of your full-screen application. If a thread stalls, if a frame is dropped, if texture data isn’t fetched in time, if you run out of VRAM, if you oversaturate the memory bus, or if one of a dozen or so other systems falls behind, then it will create problems that a human being can see and feel.
Contrast this with something like a Pixar-style render farm that chugs away rendering gigantic images all day long. Both the game and the Pixar-renderer are rendering images. Time is money so there’s a huge financial incentive to get the Pixar program to run as fast as possible, but if there was a bug that wasted 15% of the available processing power, how long would it take someone to notice? Would they? Meanwhile in a game, dropped or late frames give the user a realtime visualization of your sins. It’s hard for the user to not notice performance problems.
I’ve spent a lot of time in this domain that’s obsessed with performance and maybe a little negligent when it comes to maintaining code, and perhaps that time has blinded me to the wisdom of Knuth’s words, but what’s the actual damage of premature optimization? I get that a programmer might blow a bunch of time squeezing an extra 2% performance out of some system that ultimately doesn’t matter, but aside from the squandered programmer hours I don’t see how it hurts a project. And if we’re worried about wasted programmer hours then we have bigger fish to fry than this. Unreadable code, dependency hell, lack of documentation, failure to abide by best coding practices – all of these have devoured far more of my hours than someone’s mis-spent weekend tinkering with trivial concerns. I can believe that premature optimization is a bad thing, I just can’t get behind the notion that it poses a serious threat to our productivity. At least the premature optimizer is wasting their own time instead of mine.
I understand things were different in 1974. I didn’t do much programming back then on account of being only three years old. Processor cycles were literally thousands of of times more precious than they are today, and computer memory was more scarce by a factor of millions. I imagine those sorts of extreme limitations might lead the odd programmer to fritter away several hours of productivity worrying about individual bits, and maybe the practice was common enough for Knuth to name it as an “evil”, but during my career I’ve seen more productivity lost to pointless arguments over brace styles than to misguided efforts at optimization.
But maybe I’ve been lucky. Maybe premature optimization is still a scourge industry-wide, even if it’s never afflicted any of my colleagues. Maybe I’m just oblivious to it. Maybe I’m actually a terrible offender in this regard and none of my colleagues ever had the heart to tell me. Like I said earlier, I’m not totally clear on where you’d draw the line between “optimizing” and simply “building it right the first time”.
Perhaps an Example Would be Useful
Let’s say we’ve got a game where the level is a tidy grid of rooms. Yes, that’s a very boring design. But it’s easy to visualize and that’s what’s important for this discussion. You’ve got an N×N grid, where every room occupies one space on the grid. This means the whole thing can be stored in an easy-peasy 2D array. If you want to know about the room position 3×4, then you can just look it up like so:
//In c sharp... Room r = my_rooms[3,4]; //In C++... Room* r = my_rooms;
(Having used C sharp for a few weeks now, I have to say I prefer the CS approach to expressing array indices. [3,4] is a lot less annoying to type than , and it’s shorter by one character. And even though the C++ way is far more familiar to me, I still find the C# syntax to be more readable.)
Whatever. Simple. We just fill the array when the program starts and then we never have to worry about it again.
The problem with this design is that you have all the rooms in memory at once. That’s fine if the grid is small, but if this is an open-world kinda deal with miles of dungeonMILES of same-sized rooms? Now the game is starting to sound REALLY boring. then this is going to be a huge waste. The game is probably only interested in the rooms directly surrounding the player, and if you’ve got tens of thousands of them in memory then that’s going to incur a massive cost. It will devour memory, and the loading screen will be intolerable.
The other design you could use is to have rooms dynamically allocated as needed. Instead of creating all the rooms at startup and storing them in a tidy grid, you create them as the player gets close to them and you clear them out of memory if the player leaves the area.
The problem is that this is a monumentally more complex designEr, it’s not actually THAT complex. But it’s complex compared to the array., and that complexity will have far-reaching consequences in your code.
Now you need a background thread to load in rooms on demand, and free them up when no longer needed. Are those rooms filled with other resources like enemies, game items, or textures? Are those things used elsewhere? Can we get rid of them? We’ll need a system for tracking all of that stuff.
The pathfinding system will need to be updated. If the pathing routine is trying to work out a route from A to B, what happens if its attempted route takes it from a loaded room to an unloaded one? Should it give up? Should it signal the room-caching thread that it needs this room and then wait for it to become available? It depends on the game.
Physics needs to be able to handle having things vanish from the scene. An object might be on the threshold between rooms, and then one of those rooms gets unloaded as the player moves away. We need to make sure the physics engine isn’t going to freak out when this happens. If physics object A is resting on top of physics object B, but both of them belong to different rooms, then how do we handle if one of them goes missing? If B simply vanishes, then A will fall down to the floor. But then if the player returns and B reappears, A and B will both occupy the same space and the physics system will probably throw a fit over that. If you’re very lucky you won’t lose three hours trying to figure out why 1 in 100 NPCs are randomly killed by props travelling at mach 2.
We’ll need to synchronize these changes to the world state with the changes to the player, which might create strange, game-breaking bugs. Let’s say the player opens a treasure chest, removes the treasure, and walks away so the treasure room de-spawns. Obviously we need to save the state of the now-empty treasure room to disk, or we’ll create an exploit. But let’s say the player loots the chest, runs away, the room despawns, and then they notice their boss is approaching their cubicle so they exit the game without saving via Alt-F4Because not honoring Alt-F4 is inherently a terrible design.. A few minutes later they launch the game. The treasure room was saved to disk, so it remains empty. But the player exited the game without saving the state of their character, so they no longer have that treasure in their inventory. The treasure has vanished forever. This may be annoying or it might be game-breakingYou lost the key that exits the dungeon? Oops!, depending on the design.
There will be similar concerns with many of the other systems: Audio, rendering, triggers, lights, particles, and countless other systems will need to be designed in such a way that they can operate gracefully when objects appear and vanish without warning from one frame to the next. (Or they need to communicate with some sort of management system that’s in charge of loading and unloading. It’s more work either way.)
None of these problems are insurmountable. In fact, these problems should all have straightforward solutions. The point I’m making is that this is a lot of work compared to the naive approach of tossing everything into an ever-present array.
Note for the non-coders: From here on I’m going to refer to this other approach as using a “hashtable”. I’m not going to stop and explain what a hashtable is right now, although if you’re really curious you can read about it on your own. There are a lot of ways to store data and each of them has different strengths and drawbacks, but to keep things simple we’re just talking about two:
1) Putting stuff into an ever-present array, which is simple but can be wasteful at higher scales.
2) Putting stuff into a hashtable, which is much more efficient but a lot more trouble to implement. (The hashtable itself is trivial to set up, but making all the other systems operate in a world without guaranteed data availability is non-trivial.)
The thing is, you really do want to pick one and stick with it. It’s far easier to begin with a hashtable than to get weeks into a project and then try to retrofit your design into a hashtable. Retrofitting later will mean making changes to all those other subsystems: AI, pathfinding, physics, sounds, etc. Going back to old code to make changes is far more time consuming than making new code. Once the pathfinding is working, I want to be able to treat it like a magic box and forget about it so I can spend my finite mental bandwidth on other tasks. I don’t want to have to come back three weeks later and answer the question of, “So what happens if a room vanishes from memory while I’m routing a path through it?”
So early on in a project I’ll come to some sort of crossroads. I’ll have to choose between an array or a hashtable. I don’t want to do the extra work of implementing a hashtable if I’m not going to need it, since it will add a bit of programming overhead to every single thing I add later. On the other hand, I really don’t want to start with an array and then have to switch later.
Gosh, it would be nice to know ahead of time if I’m going to need the hashtable. I should probably look at how many resources these rooms will take. How many can I reasonably fit into memory before I have to worry about them? How large will they be on disk? How much processing power will each of them consume per frame? How difficult is it to pull a room into memory when the game is running? I don’t need exact numbers, and in fact getting exact numbers at this stage is impossible because a lot of systems haven’t been built yet. I just need a ballpark estimate so I have basic awareness of the tradeoffs I’m dealing with.
So I start asking these questions about memory usage and processing cost, trying to figure out what design I want to use. And inevitably another programmer will caution me against “early optimization”. “Whoa there Shamus, it’s way too early for you to be optimizing!”
Is this really “optimization”? And is it really “early”?
Is this how the other coders work these days? Just code under the assumption that you have infinite memory, CPU cycles, and storage space, and then re-engineer everything later once you start running to hard limits?
I dunno. I’m sure most programmers will concede that what I’m doing here is neither early or optimization, but just basic design work. The thing is, I’ve never figured out where you’re supposed to draw the line. When do you cross the magic threshold where it’s supposedly okay to worry about CPU cycles again? When does it stop being “early”?
 MILES of same-sized rooms? Now the game is starting to sound REALLY boring.
 Er, it’s not actually THAT complex. But it’s complex compared to the array.
 Because not honoring Alt-F4 is inherently a terrible design.
 You lost the key that exits the dungeon? Oops!
The Best of 2017
My picks for what was important, awesome, or worth talking about in 2017.
The Gradient of Plot Holes
Most stories have plot holes. The failure isn't that they exist, it's when you notice them while immersed in the story.
A programming project where I set out to make a gigantic and complex world from simple data.
Zenimax vs. Facebook
This series explores the troubled history of VR and the strange lawsuit between Zenimax publishing and Facebook.
How to Forum
Dear people of the internet: Please stop doing these horrible idiotic things when you talk to each other.