{"id":9579,"date":"2010-10-06T07:20:49","date_gmt":"2010-10-06T12:20:49","guid":{"rendered":"http:\/\/www.shamusyoung.com\/twentysidedtale\/?p=9579"},"modified":"2010-10-06T21:33:39","modified_gmt":"2010-10-07T02:33:39","slug":"ask-me-a-questionwhat-is-trashing-the-heap","status":"publish","type":"post","link":"https:\/\/www.shamusyoung.com\/twentysidedtale\/?p=9579","title":{"rendered":"Ask Me a Question:<br\/>What is &#8220;Trashing the heap&#8221;?"},"content":{"rendered":"<p>In an <a href=\"?p=9557\">earlier post<\/a>, I talked about making programs trash the heap, and someone wanted to know what that was. Trashing the heap is something you&#8217;ve seen before.  It often looks like this:<\/p>\n<p><table   class=\"\" cellpadding='0' cellspacing='0' border='0' align='center'><tr><td><img src='https:\/\/www.shamusyoung.com\/twentysidedtale\/images\/popup_crash.jpg' class='insetimage'   alt='popup_crash.jpg' title='popup_crash.jpg'\/><\/td><\/tr><\/table><\/p>\n<p>Here is how it works:<\/p>\n<p><!--more--><\/p>\n<h3>The Heap&#8230;<\/h3>\n<p>You&#8217;ve probably noticed that programs take up computer memory.  As they do stuff, they need to store information. The program says, &#8220;Hey, I need 8 bytes of memory.&#8221;  The system finds a free spot in memory big enough to hold something 8 bytes in size, and tells the program where it is. This happens thousands of times a second for a busy program. Get four bytes. Get eight more bytes. Then release the four because you&#8217;re done with them. Now get 100 bytes. Now get a megabyte. Now drop the eight bytes. <\/p>\n<p>This sea of data is called &#8220;the heap&#8221;.  It&#8217;s also sometimes called the &#8220;free store&#8221;, but I&#8217;ve only ever heard old beardy types use that terminology. I think the last time I saw the words &#8220;free store&#8221; was in 1991-ish.  And the book I was reading was already old.<\/p>\n<p>Anyway, usually this activity is abstracted for a programmer. You just create variables when you need them and throw them away when you&#8217;re done. <\/p>\n<p>But sometimes, in some languages, you do need to worry about memory. And this is where things get messy.<\/p>\n<h3>&#8230;and the trashing thereof.<\/h3>\n<p>Let&#8217;s say you&#8217;re programming in C or C++, and you have the program grab enough memory to store 20 bytes of data. And now you make a perfectly innocent mistake and accidentally copy 4,096 bytes into that 20-byte slot.  (This is actually easy to do for a lot of reasons. More on that in a minute.) Your data will fill up those 20 bytes, and then overwrite the next 4,076 of data.  Any variables that happened to be occupying that space have now had their values replaced with something different.  <\/p>\n<p>Congratulations, you&#8217;ve just trashed the heap.  If you are very very lucky, the program will crash right away. <\/p>\n<p>If you are <em>not<\/em> lucky, it will continue to run but begin behaving oddly. See, those 4,076 bytes of memory might have been filled with crucial bits of data needed to keep this program operational. If it crashes instantly you can look at what it was doing just before death and you&#8217;ll find the trouble spot. But those bytes of memory could have been empty, and thus spewing a bunch of random garbage into that space is &#8220;harmless&#8221;.  (This time.) What you&#8217;ve got in this case is a random crash bug. The program may run fine, act oddly, or insta-crash, all based on what things happened to be in that spot of memory at the time. <\/p>\n<h3>WHYYYYY!?!?<\/h3>\n<p>This is the subject of holy wars. Some say that C is a crap language because you can (and must) interact with memory directly. Other people say the people in the first group are just crap programmers.<\/p>\n<p>I am not an expert on languages. Other than doing a lot of BASIC in my teenage years I&#8217;ve spent limited time dabbling outside of C, so I can&#8217;t make any really good comparisons.  But 90% of the problems I have in C are because it doesn&#8217;t have a simple way of handling strings of text.<\/p>\n<p>Here is a bit of old-school BASIC code that adds two strings of data together:<\/p>\n<pre lang=\"basic\" line=\"100\">a$ = \"Twilight is a great book\"\r\nb$ = \"- for starting arguments!\"\r\nc$ = a$ + b$\r\nPRINT c$\r\n<\/pre>\n<p>Run this, and it will take two phrases:<\/p>\n<p><code>Twilight is a great book<\/code><\/p>\n<p>and: <\/p>\n<p><code>- for starting arguments!<\/code><\/p>\n<p>And merge them together to print the entire sentence. <\/p>\n<p><code>Twilight is a great book - for starting arguments!<\/code><\/p>\n<p>In many languages you can define bits of text, cut them up, join them, or whatever you like and you don&#8217;t have to worry about memory. In fact, it&#8217;s actually impossible to worry about memory in old-school BASIC &#8211; it has no tools for doing so.  It&#8217;s simple. It&#8217;s readable. It&#8217;s impossible to make it crash. (Although it&#8217;s still possible to create all sorts of other bugs. But trashing the heap is not a risk.)<\/p>\n<p>In C*, the language does not do all of this legwork for you.  If you wanted to add those strings together you&#8217;d need to measure the length of the first string.  Then measure the length of the second. Then allocate a block of memory large enough for both strings plus 1 extra byte. Then copy the first string into that spot of memory.  Then copy the second string in the spot 24 bytes after that (just after the first string).  <\/p>\n<pre lang=\"c\" line=\"0\">\r\nchar* TellMeAboutTwilight ()\r\n{\r\n\r\n  char a[] = \"Twilight is a great book\";\r\n  char b[] = \"- for starting arguments!\";\r\n  \/\/See how big a and b are\r\n  int len = strlen (a) + strlen(b);\r\n  \/\/Now allocate enough memory for both, plus 1 byte\r\n  char* c = (char*)malloc (len + 1);\r\n  \/\/Now copy a into c\r\n  strcpy (c, a);\r\n  \/\/Now copy b just after that\r\n  strcpy (c + strlen (a), b);\r\n  return c;\r\n\r\n}\r\n<\/pre>\n<p>(Note to would-be nitpickers: I&#8217;m aware you&#8217;d use sprintf to save yourself a few lines, and I know you wouldn&#8217;t really do things just this way. This post is for non-coders. Don&#8217;t Be That Guy.)<\/p>\n<p>The advantage of the C way is that it is crazy fast and memory efficient. This was important back when machines ran at sub-megahertz speeds and had 64k of memory. Which is not even enough memory to store this one image:<\/p>\n<p><table width='256'  cellpadding='0' cellspacing='0' border='0' align='center'><tr><td><a href='http:\/\/icanhascheezburger.com\/2008\/02\/16\/funny-pictures-i-upgraded-your-ram\/'><img src='https:\/\/www.shamusyoung.com\/twentysidedtale\/images\/ram.jpg' class='insetimage' width='256' alt='ram.jpg' title='ram.jpg'\/><\/a><\/td><\/tr><\/table><\/p>\n<p>But today we&#8217;ve got computers with lots of power and spending three minutes trying to save ten bytes of memory is a horrifying waste of programmer resources. It&#8217;s like pushing your car through an intersection to save on gas. The effort is much greater than the savings and you&#8217;re a lot more likely to cause a crash somewhere. <\/p>\n<p>90% of my memory mishaps are the result of juggling string data like this. <\/p>\n<p>1) Measuring and allocating memory is annoying and adds a lot of extra lines of code and is prone to mishaps.  <\/p>\n<p>2) You have to remember to explicitly free the memory later, &#8220;Okay, I&#8217;m done with this spot now.  Something else can use that memory.&#8221;  If you forget, then each time this part of the program is run it will grab more memory. (This is called a memory &#8220;leak&#8221;.  The program eats up more and more memory the longer it runs.)  <\/p>\n<p>3) You have to remember to not use that spot after you&#8217;ve freed it. The variable might still be around, but after you&#8217;ve freed the memory it pointed to it&#8217;s just a crash waiting to happen. <\/p>\n<p>4) Some programmers &#8211; myself included &#8211; save themselves the headache of measuring &#038; allocating by just grabbing a space that&#8217;s &#8220;always going to be big enough&#8221;.  Instead of measuring a and b, I&#8217;ll just grab&#8230; hmmmm&#8230; 100 bytes? That sounds good. In effect, I&#8217;m routing around the features of the language that were intended to make C fast and memory efficient by deliberately wasting memory.  And of course, maybe in some unusual circumstances 100 might not be enough. Did I remember to add a bunch of code to catch and handle that case? <\/p>\n<p>5) Strings must end in an invisible terminating character. When you print or copy strings, it looks for this terminator to let it know when to stop copying. If that terminator isn&#8217;t there for some reason it will keep printing or copying until it runs right out of the space you&#8217;ve allocated and will sail off into the heap looking for it. You&#8217;ll end up printing a bunch of garbage or (worse) copying a lot more stuff than you intended. This also means that the length of all strings (in memory) is the number of characters it contains, plus one.  It&#8217;s just really easy to make one-off mistakes like this. <\/p>\n<p>Interfacing directly with memory is really fast but also dangerous. It&#8217;s a powerful tool, like a flamethrower. Critics of C and C++ say that languages shouldn&#8217;t have flamethrowers. Supporters say that flamethrowers are fine, you just need to not make any mistakes.  I&#8217;m of the opinion that having a flamethrower  is a good thing, but I shouldn&#8217;t need to use it every time I want to light a cigarette. <\/p>\n<p>I don&#8217;t mind this hassle when I&#8217;m dealing with something big. When I&#8217;m loading 10MB texture maps and complex 3d models into memory I don&#8217;t mind the overhead involved to take care of them efficiently. They&#8217;re big and you&#8217;re usually in an unbelievable <em>hurry<\/em> when you&#8217;re dealing with those. But juggling crappy little 10-bytes strings like they were live hand grenades is tedious. Partly because they&#8217;re so trivial, and partly because it&#8217;s something that needs to be done often. Even after all these years I still get annoyed at how cluttered and inelegant it is when I want to deal with a couple of short strings. What would be a single line of code in any other language ends up being half a dozen. (If done properly. If done improperly it&#8217;s just two lines of code now and half an hour of pulling your hair out six months from now when you have to sort out why it&#8217;s crashing.) <\/p>\n<p>There are add-ons for C++ out there that will help with this, but they&#8217;re not <em>standard<\/em>.  If you use one, you may find your code is no longer portable. Or it will be a headache for other programmers to read and maintain. Or those add-ons might conflict with something else you&#8217;re trying to use. <\/p>\n<p>And now you know.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In an earlier post, I talked about making programs trash the heap, and someone wanted to know what that was. Trashing the heap is something you&#8217;ve seen before. It often looks like this: Here is how it works:<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[66],"tags":[],"class_list":["post-9579","post","type-post","status-publish","format-standard","hentry","category-programming"],"_links":{"self":[{"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/posts\/9579","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=9579"}],"version-history":[{"count":0,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=\/wp\/v2\/posts\/9579\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=9579"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=9579"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.shamusyoung.com\/twentysidedtale\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=9579"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}