{"id":38002,"date":"2017-04-18T05:00:31","date_gmt":"2017-04-18T09:00:31","guid":{"rendered":"http:\/\/www.shamusyoung.com\/twentysidedtale\/?p=38002"},"modified":"2017-04-18T05:16:13","modified_gmt":"2017-04-18T09:16:13","slug":"pseudoku-texture-atlas","status":"publish","type":"post","link":"https:\/\/www.shamusyoung.com\/twentysidedtale\/?p=38002","title":{"rendered":"Pseudoku: Texture Atlas"},"content":{"rendered":"<p>Pseudoku is not going well. Part of the problem is that I&#8217;m busy with other stuff and I&#8217;m only working on this for a few hours a week. The more serious problem is that I&#8217;m still having strange compatibility problems that have no business cropping up in a project so simple. I&#8217;d be upset about this, but it&#8217;s not really hurting me right now. The project is stalled on the other end &#8211; the business end. I couldn&#8217;t possibly explain the whole stupid story here, but the short version is:<\/p>\n<div class=\"dmnotes\">I need a business bank account before I can sell a game on Steam. I&#8217;ve formed a company and we&#8217;ve got an EIN<span class='snote' title='1'>It&#8217;s like a tax number for your business.<\/span> to use. Technically, I SHOULD be able to open a business bank account with this. But because of bureaucratic shenanigans I can&#8217;t begin to explain, we can&#8217;t actually find a bank willing to do this. They all insist that we need to form an S-Corp (expensive, time-consuming, and more complex than the problem we&#8217;re trying to solve, with the additional problem that it will make our taxes an even bigger job come tax day) or file for a fictitious name (costs hundreds of dollars in this state, takes weeks, and offers me nothing of value in return). (No, it doesn&#8217;t actually copyright or trademark the name to protect it from misuse. That&#8217;s ANOTHER set of paperwork. This just attaches a name (that we can&#8217;t change) to the EIN.)<\/p>\n<p>The problem isn&#8217;t that you can&#8217;t solve this. 
The problem is that there are a hundred apparent solutions. (Different banks. Different forms. Pay someone to do this for us.) Finding the one solution that will waste the least time and money is the problem. <\/p>\n<p>Part of the confusion is that it&#8217;s really hard to parse some of these forms. Usually you can tell when you&#8217;ve got the wrong form. If you&#8217;re a single person and you&#8217;re filling out a form that asks about your spouse and children, then you can be reasonably confident you&#8217;re barking up the wrong tree and you need a different form. But in our case ALL of the forms feel wrong. Everything related to doing business was designed in the middle of the last century, where opening a business means you plan to do business locally. Everyone assumes I want to sell pancakes on main street. There&#8217;s no concept of a &#8220;one-man international business&#8221;. Sometimes it&#8217;s not possible to truthfully and accurately fill out a form because it asks questions that don&#8217;t make sense. <\/p>\n<p>Imagine you&#8217;re opening a furniture store, but the business form has a REQUIRED field that wants to know the license plate numbers of the cars you&#8217;ll be using to deliver the pizzas. That&#8217;s the level of incoherent stupidity we&#8217;re dealing with right now.<\/p>\n<p>In any event, the technology problems in Pseudoku don&#8217;t matter because the project is stalled by bureaucracy.<\/p><\/div>\n<p>I got a cheap ($90) minimalist Win7 box for testing Pseudoku. It&#8217;s an HP with integrated graphics. It&#8217;s got a virgin install of Win7 and no additional <a href=\"?p=252\">funny business<\/a>. Let&#8217;s put Pseudoku on it and see how it runs&#8230;<\/p>\n<p><!--more--><\/p>\n<h3>Here we Go Again<\/h3>\n<p>Like <a href=\"?p=37495\">last time<\/a>, the program instantly crashes when it tries to use any OpenGL calls invented after 1994. <\/p>\n<p>This makes no sense. 
I know integrated graphics systems are wonky, but this stuff really should be available, even on a barebones system like this. I fiddle around with different ways of accessing those OpenGL extensions, and I always get the same result. <\/p>\n<p>Is this really a thing? Are there systems out there that support DirectX as it existed in 2010, but are stuck with the 1994 version of OpenGL? If not, what could I be doing wrong here? If so, why haven&#8217;t I heard about it and written a long rant about it yet? That&#8217;s kind of my thing. <\/p>\n<p>Like I said in a previous entry: I&#8217;ve never done deployment so I never had to worry about goofy edge cases and obscure hardware setups. In any case, I am tired of slamming my head into this problem. It&#8217;s a dumb waste of time. <\/p>\n<p>Looking at my code, do I really <strong>need<\/strong> the OpenGL extensions? This isn&#8217;t like Good Robot, where I needed to push potentially tens of thousands of polygons at 60FPS<span class='snote' title='2'>Which isn&#8217;t actually a big deal on modern hardware, but it&#8217;s still a couple of orders of magnitude more challenging than what I&#8217;m trying to do with this puzzle game.<\/span>. Let&#8217;s see if I can strip the program down to the base OpenGL calls.<\/p>\n<p>As it turns out, I&#8217;m using exactly <strong>one<\/strong> OpenGL extension. Everything else can fall back to vanilla OpenGL. The only thing that requires the modern stuff is the&#8230;<\/p>\n<h3>Texture Atlas<\/h3>\n<p>The problem an atlas is meant to solve is that texture-switching takes time, and excess texture-switching can kill performance. Imagine the graphics card is a painter. I tell him to paint a blue line. Then I tell him to paint a green line. But since his brush is loaded with blue paint, he has to lower the brush, clean off the old paint, and load it up with the new color. 
Then I ask him to paint another blue line and he has to go through all of that again.<\/p>\n<p>You can mitigate this by sorting all the brush strokes ahead of time. Draw all of the blue lines, then all of the green ones, etc. The problem is that now you&#8217;re doing sorting on the CPU. If you&#8217;re eating up cycles on your processor to save cycles on your graphics card, then that&#8217;s a red flag that you might be approaching the problem the wrong way around. Worse, this creates a moving bottleneck. A slow computer with a great graphics card will exhibit problems you don&#8217;t see on a fast computer with a middling graphics card. <\/p>\n<p>The better solution is to make a texture atlas. A texture atlas is when you take all the different textures you&#8217;re going to need in a scene and stick them into a single image. It&#8217;s like having a paintbrush already loaded with every color you&#8217;ll need, so you never have to clean off the brush and get a new color. <\/p>\n<p>The downside is that a texture atlas is huge and the card might not support anything that large. But this is actually a good thing! It gives you a clear pass \/ fail. You know exactly how much graphics memory the user will need and you can state so in the system requirements for the game. 
This is far preferable to those weird-ass situations where a dozen computers will all have different performance problems and the bottlenecks aren&#8217;t always obvious.<\/p>\n<p>This requirement:<\/p>\n<p>&#8220;Your graphics card must have 3.2 megaboozles.&#8221;<\/p>\n<p>Is far easier for Joe Consumer to understand than this one:<\/p>\n<p>&#8220;Your graphics card needs 3 kilowappers, UNLESS your computer has less than 100 fizzlers, in which case you need 4.2 kilowappers, UNLESS you&#8217;re using the new Smeg class chips that support the next-gen shaders, in which case you can go all the way down to 2.5 kilowappers.&#8221;<\/p>\n<p>Making a texture atlas takes all these complex variables with regard to throughput and boils them down to the simple question of &#8220;Can this image fit in video memory?&#8221; It makes your engine simpler. All you have to do is place all of your textures into a single image.<\/p>\n<p>In Good Robot, I did this manually:<\/p>\n<p><div class='imagefull'><img src='https:\/\/www.shamusyoung.com\/twentysidedtale\/images\/pseudoku_atlas1.jpg' width=100% alt='The Good Robot atlas. If you own the game, you can see the original in GoodRobot\/Textures\/sprites.png' title='The Good Robot atlas. If you own the game, you can see the original in GoodRobot\/Textures\/sprites.png'\/><\/div><div class='mouseover-alt'>The Good Robot atlas. If you own the game, you can see the original in GoodRobot\/Textures\/sprites.png<\/div><\/p>\n<p>That&#8217;s a portion of the Good Robot atlas. It was annoying to maintain. You had to manually arrange items on a grid, and if you were a pixel off in any direction then the resulting sprite would be clipped or have strange edges. Once you had the items placed, you had to tell the program how to find them using a system that was very convenient for the programmer but not convenient for the artist. 
That was fine when I was working on the game all by myself, but I felt bad for dumping that obtuse and inconvenient system on the artists. <\/p>\n<p>So after Good Robot I added some code that would build the atlas dynamically at launch. The artist puts in their textures, and the game will arrange them into an atlas and work out how to find them. It&#8217;s a lot less work. It looks at the sizes of the textures and figures out how to pack them efficiently. It also works out a map so it can find the individual images later, because otherwise what&#8217;s the point? <\/p>\n<p>This atlas building is currently the only part of Pseudoku that uses OpenGL extensions. I&#8217;m creating a blank atlas texture, then using a GL framebuffer object to render all the little sprites directly into the atlas. <\/p>\n<p>I don&#8217;t <strong>have<\/strong> to do it that way. Rather than handing the job off to the graphics card, I can manually build the atlas by arranging the images in main memory. Instead of creating a blank texture, I create a blank expanse of memory and copy the texture data into it a block at a time. When I&#8217;m done, I hand the memory off to GL and tell it to make a texture out of it. This new way is not as compact and it&#8217;s probably slower by some trivial amount that doesn&#8217;t matter to humans. But it gets the job done. Here&#8217;s the auto-generated atlas:<\/p>\n<p><div class='imagefull'><img src='https:\/\/www.shamusyoung.com\/twentysidedtale\/images\/pseudoku_atlas2.jpg' width=100% alt='The auto-generated Pseudoku atlas.' title='The auto-generated Pseudoku atlas.'\/><\/div><div class='mouseover-alt'>The auto-generated Pseudoku atlas.<\/div><\/p>\n<p>Getting rid of the framebuffer stuff FINALLY gets me past the crashes. Pseudoku now runs on the virgin machine.  <\/p>\n<p>And so at last Pseudoku runs! 
I mean, it was already working on 90% of the machines out there, but now it&#8217;s running on a virgin machine with no redistributable packages, updates, drivers, or anything else. If it runs on this thing, it will run almost anywhere.<\/p>\n<h3>Except&#8230;<\/h3>\n<p>It&#8217;s slow. I mean really, mindbogglingly slow. It is so slow it would be hilarious if it weren&#8217;t so annoying. On the virgin machine it gets a frame every other second. That&#8217;s <em>half a frame a second<\/em>.<\/p>\n<p>It takes the game two entire seconds to draw&#8230; how many polygons? <\/p>\n<p><div class='imagefull'><img src='https:\/\/www.shamusyoung.com\/twentysidedtale\/images\/pseudoku_level1-1.jpg' width=100% alt='Rounding down, this is basically zero polygons by modern standards.' title='Rounding down, this is basically zero polygons by modern standards.'\/><\/div><div class='mouseover-alt'>Rounding down, this is basically zero polygons by modern standards.<\/div><\/p>\n<p>On the very first level there are six tiles, each of which is two quads. The slots where you place the tiles make another 6 quads. In the word &#8220;Pseudoku&#8221;, each letter is another quad. The mouse pointer and the glowing aura around it each count as another quad. The entire gradient background is one more. Then the six menu buttons at the bottom add another 12 quads. That means the entire scene is 41 quads.<\/p>\n<p>&#8220;Well Shamus, everyone knows integrated graphics are terrible. You should expect poor performance.&#8221;<\/p>\n<p>Let me see if I can put this into perspective. This is Unreal:<\/p>\n<p><div class='imagefull'><img src='https:\/\/www.shamusyoung.com\/twentysidedtale\/images\/unreal_intro.jpg' width=100% alt='In 1998, this was one of the most amazing things I&apos;d ever seen on a computer.' 
title='In 1998, this was one of the most amazing things I&apos;d ever seen on a computer.'\/><\/div><div class='mouseover-alt'>In 1998, this was one of the most amazing things I&apos;d ever seen on a computer.<\/div><\/p>\n<p>Unreal came out in 1998. At the time, <a href=\"https:\/\/wiki.beyondunreal.com\/Legacy:Unreal\">it required a 166MHz<\/a> computer. My machine was right in that ballpark when I played it. This was before ubiquitous graphics acceleration, so the game could do all of its rendering on your humble little CPU. At the time, the <a href=\"https:\/\/www.youtube.com\/watch?v=26I-Pw-yPJ4\">flyby intro<\/a> would dip down to about 10FPS when the camera pulled back to reveal the entire castle. For the purposes of comparison, let&#8217;s make the fairly reasonable assumption that the game was rendering about 400 polygons when you include the castle, the canyon, the little guys milling around in the distance, the particle effects, the sky, and the reflections.<\/p>\n<p>Pseudoku is drawing 1\/10th the polygons, on a CPU that&#8217;s 20&times; faster, and yet it&#8217;s running twenty times slower. I know integrated graphics are garbage, but they&#8217;re not <strong>that<\/strong> bad. They&#8217;re not &#8220;three orders of magnitude slower than 1998 processors&#8221; bad. If that were the case, there would literally be no point in having integrated graphics at all.<\/p>\n<p>So it&#8217;s probably not the rendering itself that&#8217;s slowing it down. On the other hand, I have no idea what&#8217;s causing this. I can have the game skip most of the rendering and the framerate stays about the same. <\/p>\n<p>I guess the next step is to go through the program and disable systems one at a time until I find the culprit. <\/p>\n<p>OR!<\/p>\n<p>I could fix this by simply requiring the end-user to have a graphics card made in the last 5 years. But damn it, it SHOULD work on this machine and it shouldn&#8217;t be this slow. 
<\/p>\n<p>The truth is, I don&#8217;t <strong>need<\/strong> to solve this problem. This machine represents such a vanishingly small portion of the market that I don&#8217;t need to spend this much time trying to get stuff to work. I guess I&#8217;m being stubborn because this is really bugging me, not because this is a good use of my time.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Pseudoku is not going well. Part of the problem is that I&#8217;m busy with other stuff and I&#8217;m only working on this for a few hours a week. The more serious problem is that I&#8217;m still having strange compatibility problems that have no business cropping up in a project so simple. I&#8217;d be upset about [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[66],"tags":[],"class_list":["post-38002","post","type-post","status-publish","format-standard","hentry","category-programming"],"_links":{"self":[{"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/posts\/38002","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=38002"}],"version-history":[{"count":0,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/posts\/38002\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=38002"}],"wp:term":[{"taxonomy":"category","
embeddable":true,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=38002"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=38002"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}