{"id":26133,"date":"2015-03-12T06:51:18","date_gmt":"2015-03-12T11:51:18","guid":{"rendered":"http:\/\/www.shamusyoung.com\/twentysidedtale\/?p=26133"},"modified":"2015-07-01T04:49:51","modified_gmt":"2015-07-01T09:49:51","slug":"project-good-robot-31-so-obvious-i-cant-see-it","status":"publish","type":"post","link":"https:\/\/www.shamusyoung.com\/twentysidedtale\/?p=26133","title":{"rendered":"Project Good Robot #31: So Obvious I Can&#8217;t See It"},"content":{"rendered":"<p>It&#8217;s been about a year, but project Good Robot is moving again. I&#8217;m working with <a href=\"http:\/\/pyrodactyl.com\/blog\/\">Pyrodactyl<\/a> to make the game into something that&#8217;s hopefully worth paying money for. The game is <a href=\"http:\/\/steamcommunity.com\/sharedfiles\/filedetails\/?id=402578054\">going on Steam Greenlight<\/a>. Here&#8217;s the trailer:<\/p>\n<p><table class='nomargin' cellspacing='0' width='100%' cellpadding='0' align='center' border='0'><tr><td><iframe loading=\"lazy\" width=\"1024\" height=\"576\" src=\"https:\/\/www.youtube.com\/embed\/sj3Rx-rmFFI\" frameborder=\"0\" allowfullscreen class=\"embed\"><\/iframe><br\/><small><a href='http:\/\/www.youtube.com\/watch?v=sj3Rx-rmFFI'>Link (YouTube)<\/a><\/small><\/td><\/tr><\/table><\/p>\n<p>If you&#8217;d like the game to see the light of day, please <a href=\"http:\/\/steamcommunity.com\/sharedfiles\/filedetails\/?id=402578054\">vote for it<\/a> and spread the word. It would be much appreciated. It will be a lot easier to plan development if we don&#8217;t have to worry about getting stuck in Greenlight for months. Arvind (head honcho of Pyrodactyl) put up his previous title <a href=\"http:\/\/steamcommunity.com\/sharedfiles\/filedetails\/?id=402397726\">A.Typical RPG<\/a>, and it&#8217;s still awaiting approval after a few weeks<span class='snote' title='1'>A vote for that game would be appreciated as well!<\/span>. So apparently this can take a while.<\/p>\n<p>This project nearly feels like my old day job. In a good way. Once a week we get together and talk about the game, and when the meeting is over I&#8217;ve got a list of crap to do. <\/p>\n<p>Here is the most interesting problem I ran into last week:<\/p>\n<p><!--more-->One of the things that&#8217;s been on the to-do list since last year is the fact that the game isn&#8217;t performing the way it should. Once I get a few thousand particles on screen, the framerate drops like a rock. This is pretty mysterious. The last serious particle engine I wrote was almost a decade ago. It was more complex than this one, it worked in 3D, and I&#8217;m pretty sure it could handle 2,000 particles without too much trouble. <\/p>\n<p>Two thousand particles is nothing. Their polygons are ridiculously cheap, and the processing to move them around is pretty lightweight. I&#8217;ve tried tightening up the particle animation code and tried offloading a bit of the math onto the graphics card, but it didn&#8217;t seem to help. This is an annoying head-scratcher of a problem. This system is so simple I just can&#8217;t see where the bottleneck is<span class='snote' title='2'>Most of this analysis took place before I had access to good profiling tools.<\/span>.<\/p>\n<p>It works like this:<\/p>\n<p><table width='800' class=\"\" cellpadding='0' cellspacing='0' border='0' align='center'><tr><td><img src='https:\/\/www.shamusyoung.com\/twentysidedtale\/images\/gr31_particles.jpg' class='insetimage' width='800' alt='SO MANY PARTICLES WHO THOUGHT OF THIS?!?' title='SO MANY PARTICLES WHO THOUGHT OF THIS?!?'\/><\/td><\/tr><\/table><\/p>\n<p>As the game is played, various systems create particles. As lasers fly around, they leave these brief trails of glowing pixels. Every time something hits a wall, it makes this little scatter of rocks. When robots die, they give off a trail of black smoke. An explosion makes a big cluster of dozens (or a hundred) bits of fire and smoke. Half a dozen sparks are given off by lasers when they hit something. You get the idea.<\/p>\n<p>All of these particles are added to the particle system. It has a great big queue of all the active particles, and the new ones get added to the end of the list. Once a frame, the particle manager<span class='snote' title='3'>No, I DIDN&#8217;T call it &#8220;Particle Man&#8221; in the code. I have SOME self-restraint. Also, I didn&#8217;t think of it until now. I wish Visual Studio had better refactoring tools.<\/span> runs through the list of active particles and does a little bit of processing for each one. If its life span is over, it&#8217;s marked for deletion. If not, the PM makes the particle tumble or fall or move around however it&#8217;s supposed to move. Then when render time comes around the rendering system runs through the list and sends the polygons to the GPU.<\/p>\n<p>That&#8217;s it. The entire particle system is less than 300 lines of dead-simple code. Not being able to find the performance slowdown here feels like losing track of a couch in an empty room. There just aren&#8217;t many places for something that big to hide.<\/p>\n<p><table width='800' class=\"\" cellpadding='0' cellspacing='0' border='0' align='center'><tr><td><img src='https:\/\/www.shamusyoung.com\/twentysidedtale\/images\/gr31_dark.jpg' class='insetimage' width='800' alt='It&#8217;s like a rave where everyone hates you and wants to kill you.' title='It&#8217;s like a rave where everyone hates you and wants to kill you.'\/><\/td><\/tr><\/table><\/p>\n<p>Eventually I give up and the project goes dark for a year. When I come back I start poking at the problem again. Previously I had the number of particles limited to 2,000. Just for giggles I up the limit to 10,000 and have the player character spew a shower of hundreds of sparks every few milliseconds. Sure enough, we get a massive slowdown again. But the odd thing is that as the number of particles passes 3k, the framerate is still good. At 5k, the framerate is still good. Same thing at 8k. Then we hit the 10k limit and the framerate drops like a rock.<\/p>\n<p>So it&#8217;s not that the system slows down when we have more than 2,000 polygons, it&#8217;s that the framerate tanks whenever we hit the limit, <em>whatever that limit might be<\/em>. <\/p>\n<p>I spend several seconds looking dumbfounded at the screen before I begin nodding my head. Of course. <em>OF COURSE.<\/em> I am a fool. <\/p>\n<p>Last week we talked about how <a href=\"?p=24516\" title=\"Ideas about a new programming language for games, Annotated: Part 4\">there are a lot of ways to store crap in memory<\/a>. The popular choice is an array, You just get a big ol&#8217; wad of memory and you pack your things next to each other. This is called an <strong>array<\/strong>. Another way to store things is with a <strong>linked list<\/strong>. A linked list is a great way to store an ever-changing list of stuff and an even better way of annoying everyone who isn&#8217;t an old-school C coder.<\/p>\n<p>A linked list in action:<\/p>\n<pre lang=\"c\" line=\"1\">\r\n\/\/We are doing things C-style. \r\n\/\/Let's program like it's 1989!\r\nstruct Foo\r\n{\r\n  int   awesomeness;\r\n  int   color;\r\n  char* favorite_dance;\r\n  float psi;\r\n  Foo*  next_foo; \/\/This one is the tricky bit. A pointer to another Foo.\r\n};\r\n<\/pre>\n<p>You create a Foo object. It&#8217;s got some properties that you care about. One of those properties is that it has a pointer to a Foo object. So I make one of these objects. Let&#8217;s call this one FooAlice. It&#8217;s the only one I&#8217;ve got. I allocate a tiny little bit of memory for it and store my values there. <\/p>\n<p>Then I need another Foo. So I make FooBob. I get the memory, fill in the values, and then I set its pointer to the location of FooAlice. When FooCarl shows up, I do the same thing, but its pointer is set to the location of FooBob.<\/p>\n<p><tt>FooCarl &raquo; FooBob &raquo; FooAlice<\/tt><\/p>\n<p>So at any given time I only have direct access to one Foo, which is the most recently created. So if I ever want to see FooAlice again, I have to ask FooCarl where to find FooBob, and then ask FooBob where to find FooAlice. It&#8217;s like one of those scavenger hunts where you see a stickynote telling you to go to the kitchen, and in the kitchen you find a note telling you to go to the bedroom, which has a note to go to the bathroom, where you find a note telling you to stop wasting all the stickynotes and clean up this mess. When you&#8217;re looking at the first note, you have no idea where the trail will end, so you have no way of getting there except to follow the chain.<\/p>\n<p>You may ask, quite reasonably, &#8220;Shamus dear boy, this sounds like a lot of faffing about about to get back to our FooAlice. Why don&#8217;t we store all these locations so we can recall them later?&#8221;<\/p>\n<p>Yes, you CAN do that. At which point you have an array again. And arrays have their own drawbacks, as we&#8217;ll see in a minute.<\/p>\n<p><table width='800' class=\"\" cellpadding='0' cellspacing='0' border='0' align='center'><tr><td><img src='https:\/\/www.shamusyoung.com\/twentysidedtale\/images\/gr31_city.jpg' class='insetimage' width='800' alt='This art is being re-done. Next next time I show this part of the game, it ought to look very different.' title='This art is being re-done. Next next time I show this part of the game, it ought to look very different.'\/><\/td><\/tr><\/table><\/p>\n<p>The advantage of linked lists is that you never have to shuffle around any big blocks of memory. You don&#8217;t reserve, free, or move groups of objects, so changing the list has a fixed (and very tiny) cost. If I&#8217;m done with FooBob, I just point FooCarl at FooAlice, and then I free up the memory used by FooBob. <\/p>\n<p><tt>FooCarl &raquo; FooAlice<\/tt><\/p>\n<p>The disadvantage of linked lists is that they&#8217;re a bit hard for some people to wrap their heads around. They require a lot of dumb boilerplate code. And (most importantly) accessing specific items in the list is slow. There&#8217;s no way to quickly jump to item #10 in the list. <\/p>\n<p>Most modern coders would turn their nose up at linked lists, but I think they&#8217;re a fine way to store stuff like these particle lists. I never need to do something to ONE particle. I only need to do things to ALL of them. I&#8217;m either moving them or rendering them. This is a great place to use a linked list, as long as you&#8217;re comfortable with the ugly retro way of doing things.<\/p>\n<p>Too bad that&#8217;s not what I used here.<\/p>\n<p>I used one of the sexy new C++ style vector arrays:<\/p>\n<pre lang=\"c\" line=\"1\">\r\nvector<Foo>   my_foo_list;\r\n<\/pre>\n<p>Using a vector is a lot less code than setting up a linked list. It has a bunch of features that do things for you. Delete stuff! Add new stuff. Insert stuff! There&#8217;s a lot of code packed inside these tools, and that code is generally hidden from people like me. You just create this array of crap and let the vector code handle it all for you.<\/p>\n<p>And that&#8217;s fine. But I sort of forgot that I&#8217;d taken that shortcut. <\/p>\n<p>Remember that new particles get added to the END of the list. So the particles at the start of the list tend to be the oldest and the ones at the end tend to be the youngest<span class='snote' title='4'>Particles have different lifespans, so sometimes ones near the middle of the list &#8220;die&#8221; before ones at the start.<\/span>. So then when I would delete particle #10:<\/p>\n<pre lang=\"c\" line=\"1\">\r\nmy_foo_list.erase (my_foo_list.begin()+10);\r\n<\/pre>\n<p>I hate that syntax, but whatever. The point is, this isn&#8217;t a linked list. This is an array. You can&#8217;t have gaps in an array. Everything has to be packed together. So when I delete #10 in a list of 2,000, it has to move the last 1,990 items over one space. (#11 becomes #10, #12 becomes #11, and so on.) I have no idea if it does this as a single move operation, or if it does 1,990 memory copies. It doesn&#8217;t matter. It&#8217;s ruinously slow. Deleting the first item in the list means moving every single  following item in the list. <\/p>\n<p>So let&#8217;s say this frame I create 200 particles. Then another 200 particles next frame. Then another 200 the frame after that. For the purpose of simplicity, let say I&#8217;ve got the particle limit set to 1,000. After six frames, the particle manager realizes it has 1,200 particles. Rather than throwing away the new ones, it throws away the oldest ones. So it immediately kills the first 200 particles in the list. One at a time.<\/p>\n<p>Remove the first particle in the list. Move the remaining 1,199 particles over to fill the gap.<br \/>\nRemove the first particle in the list. Move the remaining 1,198 particles over to fill the gap.<br \/>\nRemove the first particle in the list. Move the remaining 1,197 particles over to fill the gap.<\/p>\n<p>&#8230;and so on.<\/p>\n<p>It needs to copy the entire list 200 times. <\/p>\n<p>I have no idea how the vector works under the hood. Maybe when it moves them over it does so in one bulk copy operation. But MAYBE it does them one at a time. In an ideal world, these two things would be the same. But when you&#8217;re talking about C++ objects (structs or classes) there&#8217;s all sorts of hoodoo that might be going on under the hood. It&#8217;s the difference between moving a box across the room, or taking all the items out of a box and placing them in an identical box on the other side of the room. The result is identical but the steps are vastly different. <\/p>\n<p>The point is, deleting items out of the list like this is preposterously slow, and a single line of code generated a huge number of operations  that completely overshadowed the hundreds of lines of obvious processing and rendering code.<\/p>\n<p>The problem was right there in front of me. I scrolled past it a dozen times. But I didn&#8217;t look at it because I&#8217;m not used to worrying about list management. It&#8217;s usually a performance non-issue<span class='snote' title='5'>Again, assuming you don&#8217;t need to fiddle around with individual members.<\/span> and so I went looking for the usual suspects: processing data and rendering polygons. It was this strange professional blind spot that haunted me way longer than it should have, and a less experienced coder would probably have found it sooner.<\/p>\n<p>I could fix this by moving to a linked list, but in the interest of self-education I try a different approach. I change it so particles are only cleaned up once a second. When that happens, instead of removing items from the start of the list, it runs through all the particles and sticks all of the still-active particles in a new list. Then the new list replaces the old. <\/p>\n<p>This fixes the speed problem and now the particle code runs at least an order of magnitude faster.<\/p>\n<p>There&#8217;s probably more speed gains to be had, but I have no idea if it&#8217;s worth the time. The game is smooth again and it&#8217;s not time to optimize things yet. (This wasn&#8217;t optimizing, this was basic sanity management. This mystery was driving me crazy.) <\/p>\n<p>Onward.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It&#8217;s been about a year, but project Good Robot is moving again. I&#8217;m working with Pyrodactyl to make the game into something that&#8217;s hopefully worth paying money for. The game is going on Steam Greenlight. Here&#8217;s the trailer: Link (YouTube) If you&#8217;d like the game to see the light of day, please vote for it [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[498],"tags":[311],"class_list":["post-26133","post","type-post","status-publish","format-standard","hentry","category-good-robot","tag-good-robot"],"_links":{"self":[{"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/posts\/26133","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=26133"}],"version-history":[{"count":0,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/posts\/26133\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=26133"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=26133"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=26133"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}