Coding Style Part 1

 By Shamus Jan 17, 2013 154 comments

splash_keyboard.jpg

I think most software companies have a set of rules for writing and formatting code, often referred to as a “style guide”. It tells the programmers on staff how to write code that will fit together into a coherent system. For example, in C++ this code:

#define IMPORTANT_NUMBER          10
 
void DoSomething (const  int iFirstVariable, char* szSecondVariable )
{
 
        int   iThirdVariable ;
        float iFourthVariable ;
 
        iThirdVariable = iFirstVariable / IMPORTANT_NUMBER ;
        iFourthVariable = 0.0f;
        if ( iFirstVariable > IMPORTANT_NUMBER ) 
        {
                DoSomethingElse ( iFirstVariable + 1 ) ;
        }
        if ( strlen ( szSecondVariable ) > IMPORTANT_NUMBER ) 
        {
                return;
        }
        strcat ( szSecondVariable , "foo" ) ;
        iFourthVariable = (float)iFirstVariable / IMPORTANT_NUMBER ;
        if ( iFourthVariable < IMPORTANT_NUMBER )  
        {
                if ( strlen ( szSecondVariable ) > IMPORTANT_NUMBER ) 
                {
                        return;
                }
        else if ( CalculateSomething ( iFirstVariable *
                  iFirstVariable ) > IMPORTANT_NUMBER )
                {
                        strcat ( szSecondVariable , "bar" ) ;
                }
        }       
        DoFinalThing (  iThirdVariable, iFourthVariable, szSecondVariable ) ;
}

… and this code…

void do_something (int var1, char* var2)
{
  int var3=var1/10;
  float var4=0;
  if (var1 > 10) 
    do_something_else (var1+1);
  if (strlen (var2) > 10)
    return;
  strcat (var2, "foo");
  var4=(float)var1/10;
  if (var4 < 10) {
    if (strlen(var2) > 10) 
      return;
    else if (calc_something (var1*var1) > 10) {
      strcat (var2, "bar");
    }
  }
  do_final_thing (var3, var4, var2);
}

…are basically the exact same code. (Assuming I didn’t botch the conversion.) They will do the same thing in the same way and will have the same bugs. (They might produce very slightly different executable code, but only because on line 3 of the second example, I assign a value to var3 before declaring var4. Other than this discrepancy, the resulting program should be identical. I think.) The second is written in a very compact style that was popular among traditional C programmers fifteen years ago. The first is more modern and is more common among younger C++ focused coders. Both of them have drawbacks and shortcomings.

big_brain.jpg

The thing is, the cost of code isn’t in writing it. It’s in debugging, maintaining, updating, and re-using it. Contrary to popular belief, we coders are not super-geniuses that spend all day solving equations in our massive pulsating distended brains. More often than not, we’re regular people with a very peculiar idea of fun. In fact, I’ve found that it’s the really smart programmers who understand just how limited their mental faculties are and will make allowances for their own limitations. It’s the dum-dums (or more often, just the inexperienced) who assume they will always remember everything in perfect detail later.

It’s pretty common to write code and then forget what it does or how it works. So when you’re writing code you should always be thinking of the other programmer. The dumb programmer. The programmer who doesn’t understand this code and has no idea what’s going on. The odds are very good that this other programmer is going to be a future version of you. That other programmer is going to want to look at this code and understand it as quickly as possible. This will happen many times, every time the code is revisited or even read. (Sometimes the coder will be looking for something else, but will get turned around and find themselves at your code. The sooner they realize this isn’t what they need, the sooner they can move on.)

If it takes five minutes to untangle your code, then it will cost five minutes every time someone needs to look at it. If you can reduce that re-learning time to one minute, then the other programmer (who still might be you!) will be able to work five times faster. Multiply this effect across the entire codebase and you suddenly realize that you have a lot of power over that other programmer. You can make their life easy or you can torment them with obscurity and confusion. (The awesome thing is that if the other programmer is you, then they get what they deserve either way.)

Back when I was paid to write software, our company had a coding practices guide. We were a small company, so the system was largely informal and only covered the basics. The original guide was devised by one guy, who basically wrote most of the original software himself. When the rest of us joined him on the coding team, his own personal style became the company style by default. The guide remained after he departed. His successor didn’t see any reason to change it because consistency is more important than personal preference and nobody wanted to re-write or re-format all that old code. When the job fell to me, I retained the style for the same reason. After several years I got comfortable with it and it influenced the style I use in my personal projects. If you’ve ever downloaded the source to any of the projects I’ve done over the years, you’ve probably noticed bits of anachronistic C formatting in my code. This is why.

In a larger company these guides are more likely to be created by a group of people, and the question of what is “best” is probably shaped by what sort of software you make. If your software is some sort of complex mathematical system (perhaps a simulation) then your team might be well served by a really sparse style with lots of empty space, so it’s easy to follow the flow of all those densely packed variables. Meanwhile, if you’re writing something with lots of branching, looping code broken into small methods (like code that drives a GUI) then a really wide style will make even simple code overflow onto many excess pages. Your coders will end up spending all day looking at whitespace and smacking the Page Down key.

doom3.jpg

I bring all this up because the Doom 3 source code has been released to the public and someone asked about this analysis of the source. In the process of reading that, I stumbled on the internal coding conventions of id Software (which are publicly available as an Office document) and got to read about how they write code in the house of Carmack.

I’m going to follow up with another post where I talk about these standards, what I think about them, and what makes these fussy little details so important.



  1. Dan says:

    If it takes five minutes to untangle your code, then it will cost five minutes every time someone needs to look at it.

    When I was just starting in my career a senior developer I really respected explained this point by saying: “I’m going to write this code once; I’m going to have to read it a hundred times.”

    So when you’re writing code you should always be thinking of the other programmer…. The odds are very good that this other programmer is going to be a future version of you.

    I explicitly use this concept in my workflow. Before I commit anything to source control I ask myself how ‘future me’ is going to feel about coming back in to that code, and then cleaning it up if necessary. And sometimes I curse ‘past me’ for not being careful enough with that criterion.

    • Wedge says:

      “I’m going to write this code once; I’m going to have to read it a hundred times.”
      This is explicitly the design philosophy behind Python (expressed by its creator as “code is read more often than it is written”), which is why it’s such a joy to work with: it’s expressive without being inscrutable.

      • Asimech says:

        Doesn’t “expressive” imply non-inscrutability?

        Anyway, there’s something about whitespace, even when it’s made visible, that I can’t quite handle so even the tiny pieces of code I’ve done in Python felt harder to read than they should have been. So there’s a bit of Your Mileage May Vary with Python’s readability, even if it’s just weirdos like me.

        I thought that the Python philosophy was “there should be only one (obvious) way of doing it”, what you’re saying is a lot better.

        • nmichaels says:

          Expressive does not imply scrutable (it’s a word now!). See, for example, Perl, where people pride themselves on being able to write an entire program in a single line. Or APL, which is like Perl but replace line with character.

          • Asimech says:

            I check words like “expressive” online, but Oxford Online Dictionary unfortunately only has the common meaning* and I didn’t remember it’s also a programming term.

            I didn’t use “scrutable” because the spell-checker didn’t like it, so I figured I would be safer just using what is essentially a double-negative.

            * Which is “effectively conveying thought or feeling”.

          • Scrutable has been a word since 1590ish, so don’t worry about using it.

            It’s a British English word, which is probably why the spellchecker hates it.

            Mind you, this spellchecker also hates the word ‘spellchecker’ which I think tells us all we need to know about how useful it is.

        • Wedge says:

          Yeah, here by “expressive” I roughly mean “how much you can do in a single line of code”. This is often the *opposite* of readable: think the kinds of difficult-to-understand things you can do with pointer arithmetic in C, or any Perl program ever. Python tries to be expressive in a way that still makes it clear exactly what’s going on; most of the time it succeeds.

        • Feriority says:

          The python philosophy is a lot of things, and both of those ideas are part of it. PEP 20, the Zen of Python, is a good summary (and readily accessible in any interactive python session by running `import this`).

      • bloodsquirrel says:

        That’s…. not really true. Python takes discipline on the part of the coder to be readable- otherwise, it can get full of lists containing things and variables of indeterminate type that pop out of nowhere.

      • Peter H. Coffin says:

        Let’s just say that you have a very different opinion of “joy” than I do… *mumbling about significant whitespace being arbitrary, hidebound convention thrown off by even COBOL and RPG programmers a decade ago*

    • Feriority says:

      Closely related to that thought is this one:

      “Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.” – Brian W. Kernighan

      If you’re not familiar with Kernighan, take a look at http://en.wikipedia.org/wiki/Brian_Kernighan – he’s the K in K&R C, and a founding father of Unix, too. He knows what he’s talking about.

  2. Weimer says:

    Yeah, my preferred style is ultra compact with loads of magic numbers. I do remember (usually) what the numbers mean if I ever review my code, even if they seem to drive everyone else insane.

    • Kdansky says:

      Don’t do that. Code that everyone else hates gets you fired, because it’s the programmer equivalent of “That guy is very difficult to work with.”

      • Brandon says:

        Alternatively, Code that no one else can deal with is good job insurance. ;p

        People will still hate you, but at least you will get paid for it!

        • Dasick says:

          I hope you aren’t serious.

          At any rate, if they’re smart, you’ll get fired and they’ll pay someone to write it properly. It will cost them more in the short run, but it will save them money in the long run and serve as an example for all other coders that want to go rogue.

          • Aristabulus says:

            You seem to be assuming that management can actually divine the function of the code without help. This is rarely the case.

            I agree with the sentiment that it’s a shitty way to go about coding, but it would be foolish to believe that such things don’t occur; crazy things get justified for perceived job security. Byzantine code also feeds into the BOFH ego trip that some programmers have.

      • Weimer says:

        Yeah I know. They tolerate me because I comment the crap out of the code, so it isn’t at all incomprehensible. I’ll admit my “style” makes it look ugly as sin though. ;)

        • Felblood says:

          So long as the end user never sees your code, ugly-but-practical trumps pretty-but-useless.

          • aldowyn says:

            that’s the attitude of a lot of programmers, I think, especially game programmers who have a tendency of wearing a lot of hats (or they USED to back when even the biggest games had relatively small teams)

            That’s part of why the Doom 3 code is so exceptional I think…

          • Dasick says:

            Writing is re-writing.
            Game design is playtesting.
            Painting is re-drawing.
            Coding is debugging.

            Like many disciplines, iteration is king, unless you can get it working *perfectly* on your first try. If you can, then you’re probably not challenging yourself enough and bored out of your mind.

    • Volfram says:

      You should at very least strip your magic numbers off into separate variables. This has the advantage of letting you change every instance of that magic number at once, while making your code much friendlier to other people and pretty much eliminating the points when you yourself don’t remember.

      I’m currently reviewing the game engine I’ve been building for a couple of years now, and I’m pretty happy with most of my old formatting. The stuff I’m not happy about is currently getting torn out and rewritten.

    • nmichaels says:

      You are my nemesis. We have a special token at my work for people like you.

  3. Ingvar says:

    Noooooo! Nooooooooooooooo! Hungarian should only encode semantic types. That you have a zero-terminated string is clear from the type, but is it a username, a resource identifier, a blarghinator, a fundibulum or a skvorzer?

    • Kdansky says:

      Good use:

      int xCellSize;
      int yCellSize;

      This really makes a difference when you write “yHeight – xMaxDistance” by accident, and it’s really clear that this calculation is probably wrong.

      CString usName; //unsafe means it can contain unsafe chars such as ‘\’
      CString sName; //The same string, just safe for DB-use
      CString csName; // crypto-version

      This prevents you from writing “SELECT * FROM table WHERE name = ?” with the usName as parameter, resulting in people entering “‘asd’ OR TRUE; DROP *;” and destroying your database.

      And generally speaking: Your variable names are some of the most important bits of information about your code. If at all possible, use descriptive names, and don’t repeat just what the type information already tells you. “Geometry* pGeometry” is really not a useful name when you are writing a software with many geometries in it, and totally redundant with its type on top of that! Name it for its use, such as “fileContents” or “previewObject”, or whatever it is you are doing. It is really annoying that people write names like “int iValue”, which not only tells you nothing about the use of this variable, but also pretends to be informative, when in reality I already know it’s an int because it says it’s an int!

      As for short variable names: If you only need your variable for five lines, you can go with “i”. If you need it for ten lines or more, choose a proper name (“height”). If you need it for the whole class, it should be at least two or three words strung together (“HeightOfBiggestBuilding”).

      • X2-Eliah says:

        usName is for unsafe name? And not, say, username? And the sName could mean surname.
        Just because you think usName/sName are obvious doesn’t really mean that they are. If you truly want crystal-clear variable names, force them to be crystal clear at the expense of length: unsafeName / safeName. Basically, like you yourself showed with “fileContents”.

        • Zukhramm says:

          It is crystal-clear for those in on the convention. As for username, it does not seem to me to be the kind of information you’d put in a prefix but rather in the variable name proper, i.e. “usUsername” or “sUsername” in the case of marking safety with a prefix.

          • Felblood says:

            If it’s made explicit in the style guide for the project then you don’t have to make it explicit in the code,

            BUT it pays to be aware of the fact you’re doing that, because you may someday work on a project with its own style guide.

        • Kdansky says:

          I’m not saying usName is the best variable name ever (especially because “name” is so unclear to begin with). But when you have fifty variables read from a webform in front of you, and some of them are safe and some are unsafe, this makes a lot of sense. I would not use these shorthands in generic code (for the reasons you describe).

          My point is: If you really want to use hungarian notation, use it correctly! I’m technically required to write crap like “uiIndex” (because it’s an unsigned integer), but that is completely wrong for more than one reason:

          1. Using an unsigned when you could use an int results in problems such as any subtraction becoming a potential error.
          2. Unsigned are harder to convert to and from floats. That may sound irrelevant, but when you want to access single bits in floats (for comparison, for example), converting them to and from int is actually really easy.
          3. uiIndex tells me nothing that “i” (as an acceptable shorthand for any indexing int) would not have told me, but it clutters my screen with useless data.

          Just found this gem in the code in front of me:

          if(rCI.pT1L) rCI.pT1L->SetAdjacency(rCI.dwT1LEdgeVA, rCI.pT1R);

          Guess what that means? The hungarian “r” and “p” only make this even harder to read, but don’t give me meaningful information. I suppose this code does something with adjacencies and edges, and therefore polygons. I still believe that our code has very high standards in regards to style, but there is still so much needless complexity.

      • silver Harloe says:

        I don’t … understand camel case. I really don’t. I’ve been reading English all my life, and I’m used to spaces between words. Variable names can’t have spaces (in most languages), fine, but that’s what ‘_’ is for – it provides the same visual spacing as the space bar so the words read more naturally.

        I also am not a big fan of Hungarian for the same reason X2-Eliah mentions. I’m going understand the difference between “unsafe_name” and “safe_name” a lot faster than “usName” and “sName”. I also sometimes put the ‘type’ at the end because, as a long time reader of English, I’m used to the form “adjective adjective noun.”

        So, yes, I end up with long identifiers, but I can read them very quickly, and it turns out I can type them very quickly because my editor has a completion function. So, sure, I end up using a loop variable like “current_model_id” instead of “i”, but darned if my future self doesn’t love me for that later.

        Were I to see “confirm_no_image_dialog = confirm_no_image_accept_callback” in my code, I’m pretty sure I’d spot the bug just as fast (the dialog uses callbacks, it doesn’t equal them). I dunno, maybe it is slower to read my code – and I’m sure I invite a comment like “why use == instead of is_equals then?” and the answer is vague and hand-wavy, though I’m sure I’d have more bugs caught at compile time instead of run time if I DID have to write
        “set animal to cat” and “is animal equal to cat?” out like that (as opposed to them being “animal = cat” and “animal == cat” and dang that missing = is sometimes really hard to spot). I think my ideal world language would be able to translate between those views with a button click.

        • Wedge says:

          Agreed on camelCase, I find it EXTREMELY ugly. I usually use lowercase_with_underscores for variables and TitleCase for classes, which is the usual style in Python-land, but I tend to carry it with me to other languages because I find it’s the most readable.

        • ngthagg says:

          The reason I hate underscores is BECAUSE it looks like whitespace, which is confusing when reading code quickly. Reading a line such as “Type variable_name = value”, I don’t want to see white space in the middle of the variable name. I want to see it as one chunk, variableName.

          Of course, I suspect if I had used underscores when I first learned to code, I would be agreeing with you. It’s tough to differentiate between “this is better” and “this is how I always did it”.

          • silver Harloe says:

            Ahhhh, I’m so used to editors with syntax highlighting that I don’t need whitespace to differentiate token breaks, but I can see why that would be useful in monochrome. Thanks, at least now I understand the case for camelCase.

          • Felblood says:

            I can adapt to either system, but I prefer no underscores for this reason.

            When I write my own guide for a project, I use a system I learned from one of my first CS profs, which allows no more than one underscore per name, to delineate variable type data from unique variable identifiers.

            IE: privateString_characterName is preferred; characterName is acceptable; and Character_Name is an abomination.

            –but once again that’s just when it’s me deciding the rules of the game. As Shamus wisely points out, consistency is more important than personal preference.

            Everyone on a project has to be on the same page as to what is implied and what must be explicit.

      • cadrys says:

        If your DBA is allowing you to send straight dynamic SQL like this for any important table, they deserve the job of restoring the backup & explaining where yesterday’s sales orders went. IMAO, of course.

      • Neko says:

        The warts on the variable name suggest you want it to be an unsafe/safe string, but nothing’s stopping them from getting mixed up somewhere along the line.

        I’d use typedefs and let the “escape unsafe characters” function do the conversion from UnsafeStringType to SafeStringType.

        • Alan says:

          As in this?

          typedef std::string unsafestring;
          typedef std::string safestring;
          
          unsafestring a = "Danger!";
          safestring b = a; /* Hoping compiler will reject */
          

          I don’t think the compiler is required (or even allowed?) to block that. Indeed, GCC 4.6.1 thinks the above is just dandy. I think you need to go as far as creating new classes:

          class unsafestring : public std::string { };
          class safestring : public std::string { };
          

          That works, although you actually need to provide wrappers for all of the constructors; otherwise you lose access to them. That said, I like this strategy; I’m generally in favor of moving error detection into the compiler whenever possible.

          • Peter H. Coffin says:

            All of this kind of thing is going to be HIGHLY environment-dependent. The Casing Police will have a lot of fun as soon as they start dealing with a mix of languages that may or may not have token case-sensitivity mixed with an OS that doesn’t have name sensitivity at all, combined with writing to a database system that can be configured to behave either way, inconsistently between scopes, such as the connection specifying insensitive, but with tables that are, but the comparison functions being so or not depending on the defined attributes of the column, which, oh by the way, are user-supplied tables so you can’t assume ANYTHING about them.

    • David says:

      Correction: Hungarian should only be taken out behind the barn and shot.

      • Dasick says:

        Describe what is wrong with the Hungarian notation please, and how it’s a fundamental problem, not just people abusing it due to lack of understanding.

        • Alan says:

          You can probably safely assume that when someone is angry at Hungarian, they’re angry at Systems Hungarian, which is a war crime. And unfortunately Systems Hungarian is the one that’s widespread.

          • Dasick says:

            Systems Hungarian is a tool, like all others. People don’t understand the tool and misuse it, but is that really the reason to condemn it and destroy it? Wouldn’t it be better to try and instill a culture of respect and understanding of the tool?

            • Alan says:

              Some tools are so awful, some completely free of utility that they should be destroyed so people don’t injure themselves on them. Systems Hungarian was created by accident, spread by fluke, and only persisted out of habit and bureaucracy. It’s useless and irritating. Teaching people to respect and understand it is like teaching people to respect and understand bloodletting to release bad humours as a medicine. What’s better is pointing people at better tools, in this case Apps Hungarian.

              • Asimech says:

                Can’t really comment on this specific topic as I haven’t done coding in well over a year and I’ve never used Hungarian notation, but:

                “Some tools are so awful, some completely free of utility that they should be destroyed so people don’t injure themselves on them.”

                This is true. It’s not even all that important if it’s an inherent problem in the tool or just people using it wrong, if it causes enough trouble it is safe to consider it a “bad tool” and discourage people from using it.

              • MrPyro says:

                I find Systems Hungarian useful in weakly typed languages, where you cannot rely on the compiler/interpreter to pick up on accidental type conversions, and variables are not declared anywhere as being of a specific type.

                • Alan says:

                  Fair enough; I was coming at the discussion from a C/C++ (and I suppose Java/C#) direction, where you have reasonably strong typing.

                  • Deoxy says:

                    Reasonably strong typing… in C? Where you can put an alpha in a variable and then perform addition on it? (for just one example among many)

                    Or has C finally joined the modern world since I ran screaming from its useless stupidity a few years ago?

                    But yes, Systems Hungarian is EXTREMELY useful in weakly typed languages.

      • Ingvar says:

        I am not fond of Hungarian Notation, but when used to denote semantic types, it at least has a space where it may not require purging with fire. If all you do is to encode the same type as your variable declaration, you’re making more work for yourself.

  4. aldowyn says:

    Ah, I saw that tweet. I figured this was brought on by that post.. Looking forward to part 2.

  5. Sam says:

    I’m a fan of the approach “Always code as if the person who ends up maintaining your code is a violent psychopath who knows where you live.”

  6. Stranger says:

    Can’t you use comment expressions to label what the tricky parts of your code do? I know when I was doing my two coding classes in college, me and a couple other students would keep two copies of the code. One was bare, and the other used comments to label stuff.

    . . . I know, it’s amateur level stuff but if you’re the only one who has to keep score (“this is a hobby” as opposed to “this is work”) then I don’t know if it should matter.

    • kmc says:

      We absolutely comment our code, which we might not have to do so much if we were better about documentation, but there’s nothing like having a little line just above a block of code that says, “This section applies an OTA filter and outputs an array of peaks per frequency range at each time slice.” That way, if that’s not the part of the code you’re looking for, you can skip it entirely without having to translate anything.

      • Felblood says:

        Another good practice is to Cut/Paste this same comment next to the function declaration, if you’re using a language that requires headers.

        That way, if someone just needs a quick summary of the function (because they came across it in a different function) and not the long version, they can find it in a hurry. Just be sure to update them both whenever a function undergoes a major overhaul.

        If your functions are too complex for that to be feasible, consider taking Alan’s advice further down this thread. This will save a lot of time updating comments if you change your method of doing things, and is also handy for benchmarking the practical impact of an alternative technique.

        I.E. if you are doing a certain thing one way and it works, but you have an idea on how it could be done better, make a new function that does things the new way, and call it instead. Make a comment on the original function, noting the date of the last time it was part of the “active” program, and what kind of performance issues it was phased out for.

    • Kdansky says:

      You should always comment your code (and for Raptor Jesus’ sake, don’t ever have two valid copies of the same piece of code!), but the idea is that you don’t need to comment something that is obvious already. It saves text, which saves brain-processing for important bits. Comment the Why and the How, but never the What.

      Really bad:
      // Increase the index
      index = index + 1;

      Really good:
      // The name field is stored behind the street field
      index = index + 1;

    • Jimmy Bennett says:

      Using comments to explain tricky bits of code is probably a good idea. Having said that, you shouldn’t rely on comments to explain your entire code base. If you choose good variable/class/function names, and your project is well designed, a lot of the code should be self-explanatory. On the other hand, if your code is a complete mess, then comments probably aren’t going to make it that much easier to follow.

    • Alan says:

      Generally speaking, if your code needs comments to be understood, you might want to redesign your code. There are exceptions, but far fewer than most people think. A particularly useful trick is to move the tricky section of code into its own function whose name describes what it does.

      • silver Harloe says:

        Also of importance:

        a comment is essentially another copy of your code, albeit in another language (usually your native natural tongue)… which means the same caveat that applies to cut-and-paste programming applies: you’re very likely to end up with the two copies of the code being out of sync after future changes.

        At that time, it’s possible, even LIKELY, that the comment will actively frustrate attempts to understand the code rather than help it.

        So the rules of commenting are:
        *) explain WHY but try to make HOW obvious by good use of variable and function names, thus forcing the “how” part to be up-to-date with the latest changes
        *) if you change commented code, then you update the comment under penalty of pain. If you can’t be arsed to update the comment, it’s better to delete it entirely than let it be wrong.

        That said, I generally have “section heading” comments in my code: “parse the inputs” “validate the inputs” “load old db records” “update and save the records” “print the new form”. Those should probably be in functions, but when writing a page to handle a 5 input form, functions can be overkill (and require passing basically Every Variable In The Script to them, which is nearly as impenetrable as global variables).

        • Alan says:

          I’m going to try and remember the “a comment is another copy of your code”; that’s a good turn of phrase.

        • kmc says:

          You know, this is a good point–when I said, earlier, that we always comment our code, I’m thinking of when we work in scripting languages (specifically, MATLAB). Our C# code typically doesn’t have embedded comments. I do a fair amount of commenting in my Python code, but that’s partly because I’m still learning it so my thought processes aren’t always easy to follow. I’m still doing a lot of things in a non-Python way and that makes the code progression a little obfuscated. Once I’m better at it, I should be able to leave out all but the biggest section headers.

      • Wedge says:

        My approach to commenting code is to comment at a “higher level” than individual lines of code, that is, sections of code that have a coherent purpose. For example, say I have a function that takes a string, parses it into some kind of data structure, uses the data structure in some equation and returns the result; each of those steps will be several lines of code, and I’ll usually comment each section with the basics of what’s going on so that someone reading the code can get a sense for what each of those steps are without having to understand each line of code.

You may also be thinking that what I outlined above should really be three separate functions; you might be right! Sometimes something like that is a one-off, and simple enough that it doesn’t need to be split into separate functions, but you’ll end up discovering LATER that you’re doing one of those steps over and over again; commenting code this way also means it’s easy to go back, find that block, and cut+paste it into its own function for easy reuse.

        Another way I (and most programmers) “comment” their code is to document higher-level constructs like functions, classes and modules. It’s usually called “documentation” rather than “commenting” but it serves the same purpose.

    • H.R. says:

If you have tricky bits in need of a comment for explanation, you are probably doing something wrong (I know I do that a lot; getting a tool to compute your cyclomatic complexity helps a lot, as long as you follow its rules).
      split the function into subfunctions, rethink the way you do what the function is doing and try to find a neater, cleaner, better way etc.

      yes, it’s only a hobby (for you, at least). but it helps a lot in the long run. it’s the difference between keeping your tool kit organized so you can find the stuff you need fast or throwing it into a heap and having to search for ages to find a hammer.
      imagine writing a convoluted cronjob and having to rework it a year later. when I started I had to rewrite it all, cause I had no clue what the first one actually did and even my comments didn’t help…

      • Wedge says:

Funnily enough, the Kotaku article praises Carmack for judicious use of comments… while the article he references praises Doom 3’s code for being almost 30% comments.

And sometimes, the code is tricky because you have to work around some proprietary third-party library where access to the source code isn’t possible (because your company doesn’t want to pay for it) and said library has … quirks. And comments are perfect for documenting that.

        • H.R. says:

true, but then I’d rather have a “voodoo box” (put in a, b, c, get d) than 20 inline comments in 40 lines of code, with the box itself excessively commented instead.

with such a box I may not know how it does stuff, but I know what it does, which increases reusability and the chance that someone better/smarter/more versed in such voodoo can improve the box without touching (and possibly breaking) everything around it

    • Blake says:

      I see comments as a last resort.
      Rather than write
// This iterates over objects in a list and moves them 20 metres to the right
for (int i = 0; i < m_numObjects; i++)
{
    float oldXPos = m_objects[i].getXPosition();
    float newXPos = oldXPos + 20.0f;
    m_objects[i].setXPosition(newXPos);
}

      You should write
      MoveObjectsToTheRight(20.0f);

      And have your MoveObjectsToTheRight function contain just that loop.
      Breaking complex code into smaller functions with good function names will make it far more readable than comments (as well as being more maintainable).

      Comments should be used when you have to do something non-obvious because of special demands on your system. They should generally explain why you did something, not what you did.

    • Feriority says:

      There’s one major problem with relying on commenting – comments lie. You make a comment that means something very clear and useful, and then tomorrow you change the code to fix a bug and forget the comment, and in a month when you look at the code again you skim through the comments and have the completely wrong idea of what the code does.

      I think comments explaining *why* you do something that’s unclear are great, because they add context to the code. Comments that explain *what* the code is doing are usually a bad sign.

      • Stranger says:

Again, not a coder/programmer here. I’m an amateur writer, who happens to do a lot of work around tabletop RPGs :P As such “commenting the code” is something I do a lot.

        A lot, a lot, a lot. (And roughly 50% of that is in the margins of the notes I take during a game I run of what gets ad-libbed.)

All the replies on this thread have told me quite a bit, and honestly? I now know I will *never* be able to program :) My brain does not function in the proper method to do it well enough to be paid for it. And if you’re not getting paid for something, it’s not helpful to you, and it’s not fun, why are you doing it?

  7. MJG says:

    I used to write in the second style ten years ago, when I was a student who never had to work on anything more complex than a class assignment or small personal project. I did not at all get why some people seemed so stressed out over things like variable names, spacing, comments, etc. If the code works, who cares what it looks like?

    Then, I finished school and got a job, and it didn’t take long for those things to click. That first style became king, and any coworkers who continued to use the second style might as well have been the devil himself to me. When you’re working 40+ hours a week, with dozens of other developers, rolling on and off projects along the way, the value of a consistent and clear style to your code feels almost immeasurable.

    It may take a teensy bit longer to write it that way, but you’re going to make up for that by the very first time you have to revisit that code and are able to skip out on the 5, 20, 60 minutes you’d have to spend stepping through it to figure out exactly what in the world is going on.

  8. Jimmy Bennett says:

    I saw that article about the Doom 3 source code on reddit. I was immediately curious about what you thought about it. I’m glad that you decided to comment on it. I’m curious what you have to say about Carmack’s C++ style. It seems good to me, but I’m a Java guy, so my understanding of C++ coding practices is pretty limited.

    • aldowyn says:

      I’m 90% sure the typical response to “I’m a Java guy” is ‘you shouldn’t be a one-language guy, you should be a programmer’.

      Programming is largely similar no matter what language it’s in, at least at the same rough level – the language is just the specifics. If it’s well-written and documented I would THINK you should have a fair chance of telling what each part is doing even if you don’t know the language at all.

      That said, I’m basically a ‘java guy’, and I didn’t even bother to LOOK at the code, so… yeah, perhaps I’m being a bit hypocritical here.

      • Dasick says:

There are no different programming languages. They’re all dialects.

        • Jimmy Bennett says:

          You should try programming in Prolog sometime.

        • Feriority says:

          Well, they all reduce to a Turing machine or lambda calculus, but that doesn’t mean they’re all the same. Good procedural code, good OOP code, good functional code, and good Prolog are all very different, and some languages let you mix and match paradigms to taste, which is a different experience as well. It’s definitely worth at least trying a few of those different styles.

        • Kdansky says:

          I’d say we have four major languages, and then a ton of dialects and variants:

          1. Imperative languages, like C or FORTRAN
          2. object-oriented languages, like C++ or Java. These have replaced the first category pretty much completely.
          3. functional languages, like Haskell or Prolog
          4. structured query languages, like SQL or XQuery

          Of course, we have hybrids, such as Scala (omg Scala is awesome!) which offers all that 2 and 3 do. You need to learn one language per category (though you can skip 1 if you know a 2). But going from SQL to XQuery to XSLT is easy, and going from C++ to Java to C# is all but trivial.

          • silver Harloe says:

            close, so close :)

            1. Imperative languages, like C or FORTRAN
            2. object-oriented languages, like C++ or Java. These have replaced the first category pretty much completely (except in embedded apps or driver development)
            3. functional languages, like Haskell or Lisp
            4. structured query languages, like SQL or XQuery
            5. logic programming languages, like Prolog (you don’t define or use functions in Prolog, you make a database of assertions, then make queries and it figures out the assertions it needs to resolve your query. It’s… weird, but if you’ve used a makefile, you’ve experienced something close to it)

            Wikipedia lists #s 1-3 and 5 as the “four main programming paradigms” (here http://en.wikipedia.org/wiki/Programming_paradigm ), but also has a list of about a billion paradigms on the same page. Also, #s 1-4 are the ones actually used in Real World App Writing, so those are the ones to learn (though I’m surprised we don’t see more of #5 in game AIs)

      • Jimmy Bennett says:

        When I said, “I’m a Java guy,” I didn’t mean to imply that I only know one language. I say I’m a Java guy because that’s the language I’m the most familiar with and it’s the language I spend the most time working in. I do know other programming languages (I’m learning Clojure at the moment) but I don’t get the chance to work with them as much as I’d like.

        Anyway, the conventions for writing clean code can vary quite a bit from one language to the next, even if the languages themselves are pretty similar. So I’m curious what the Doom 3 code base looks like from the perspective of an experienced C++ developer instead of, you know, a Java guy.

  9. Alan says:

Both examples are awful, but probably necessarily so, as 1) they’re illustrating fairly extreme ends of the spectrum, 2) they’re highly contrived for the sake of example (iFirstVariable is no more useful than var1) and 3) they’re mingling not-necessarily-related style decisions. Readability lies somewhere between, with the ideal location varying from person to person. Anyone promoting a single style as optimal for everyone is an arrogant jackass who shouldn’t be creating style guidelines.

(Also: Shamus, there is a stray { on line 14 of the second example. It’s lacking a closing } that would have been on line 15.5.)

    “…consistency is more important than personal preference…” is a really important point. Conform to the nearby source code and you solve most of your style problems.

    “The Exceptional Beauty of Doom 3’s Source Code” irritated me because a lot of it boils down to “This is the style I like, therefore it’s beautiful.” Arguing that somehow

    if(velocity.getLength() > 1.0)

is beautiful where

    if(velocity.length() > 1.0)

    isn’t is silly.

    I’m sympathetic to cutting down on useless comments, but “Comments are bad unless they’re totally necessary and they’re rarely necessary.” is overkill. They’re a good place to note assumptions the implementation makes, to make promises and give warnings about what you’ll get out, to note performance characteristics, to direct readers to the research paper the algorithm is based on, and generally answer questions more quickly than reading the implementation.

    • [...]
      Arguing that somehow
      if(velocity.getLength() > 1.0)

is beautiful where
      if(velocity.length() > 1.0)

      isn’t is silly.
      [...]

While I agree with the rest of what you said, this is false. From the () we know that the code calls functions/methods, which should always describe what they DO in their name. getLength() tells me it gets me the length. length() tells me it… lengths?

      Does it set or get the length? In this case you can infer what it does by context, but that’s not a given for the whole of the code.

      If you omitted the (), thus directly accessing an attribute, the velocity.length would be fine, since attribute names should always describe what they ARE or contain.

      • Canthros says:

        I agree with the first guy; the distinction is fairly immaterial. Might just be that the example is very contrived, but the best solution to this problem is actually something like C#’s properties, which aren’t available in C++. Within C++, I think it’s pretty subjective, and making an exception to the general rule for mutators and accessors is probably OK, just as long as it’s handled consistently.

      • silver Harloe says:

        if your language has implicit get/set methods on attributes (C#), then foo.length is Objectively Right.

        I agree 99% on the “function names should always be verbs” rule, though a case can be made for, “foo.length() is get, and foo.length(x) is set,” in languages which lack implicit get/set methods.

        • Alan says:

I’m actually not a fan of velocity.length(1), because you are taking an action and the meaning isn’t entirely clear. The context is a vector, and the length is almost certainly a calculated value (sqrt(x*x+y*y+z*z)). Willing the vector to be a new length could mean scaling it, setting it to [length,0,0], or maybe something else. In practice, it probably means scaling the vector. So I’d go with

          velocity.scale_to(1); // Modifies in place
          new_velocity = velocity.scaled_to(1); // Returns new

        • Zukhramm says:

I do not know what the common arguments for or against are, but just on instinct, using the same method name with different parameters to do wildly different things seems like something I’d want to avoid.

        • bubba0077 says:

          But that’s not what the example is. The () designates a function call, which should have some sort of action word associated. You’re correct, if the language allows you to access it as a static property, foo.length is fine. But if you need to call a function, use foo.getLength().

          • Kdansky says:

            Technically you should not need to know how that call returns the value you ask for. It might be a function call, it might even be recursive, it might be a variable in memory, it might ask the database on the server, or it might be a cached variable in memory that was calculated by a function call.

            What I’m saying: The caller of “length” should not need to know these specifics. For him, “velocity.length” is enough.

      • Alan says:

The key, in my mind, and the reason my preference is for velocity.length(), is how the code reads.

        // "If the velocity's length is greater than 1.0"
        
        // Captures my intent relatively directly
        if(velocity.length() > 1.0)
        
        // Adds in a verb (get) not present in the English description
        if(velocity.get_length() > 1.0)
        

        A more general rule is: if something is more property-like, it should look more like a noun. If it’s more active, it should look more like a verb. In light of that, I actually don’t like velocity.length(1) to mean “scale to a length of 1″; we’re actually changing the vector by scaling it and the verb disappeared. I’d probably go with something like velocity.scale_to(1).

        Of course, your mileage may vary, which gets to my core complaint about the article; there is an implication that the author’s preferences are objectively correct, when in fact it tells us more about how he thinks than about what is objectively beautiful.

        • aldowyn says:

          Here in Java-land we have ‘objects’, which are obviously nouns, each of which has ‘methods’, which are obviously verbs. ‘variables’ are nouns of a certain type – you know, numbers, strings, whatever.

          Actually, I thought that was how OOP in general worked? Having not worked on another language yet it’s a little hard to know the differences… >.>

          • Blake says:

            Yes that’s how it should work, here the question is over a method called either Length or GetLength, which just returns a length.
Do you think in terms of doing an action when saying if (velocity.GetLength() > 2), or in terms of a noun when saying if (velocity.length() > 2)?

            I don’t think there’s much difference, generally I’ll say GetWhatever, but most of my container classes just go with a Count() function.

            • silver Harloe says:

The argument for the method length() isn’t really based on “I think methods should be nouns,” it’s based on “grrrrarrrgh, this language doesn’t have proper properties, so I’m going to write a .length() that ACTS JUST LIKE a property,” which is why .length(1) is inarguably “set the length to 1” and not some scaling function or whatever. So when you read .length(1) you’re reading “if this language had proper property systems, this would say .length = 1”

              (by “proper properties” I mean that .length = 1 can be changed to do something more complex if length is a derived property – obviously if you just “public” an “int length”, you can do .length = 1 in almost any language, but if length is a derived rather than explicit property, there might not be an attribute to make public)

          • Canthros says:

            OOP is a way of modeling. While naming your types using nouns and your methods using verbs is a good practice, it’s a convention and not a requirement.

            Java certainly doesn’t adhere to this convention with unwavering devotion. See String.length() … (Which I point out merely to say that reasonable people can have differing opinions on such things, as they apparently did in the early days of Java!)

      • Deadfast says:

        Allow me to present std::vector::empty(). Whoever came up with that deserves to be slapped with a large fish.

  10. SteveDJ says:

    If you are going to demonstrate Hungarian (even a style people don’t like) — then shouldn’t line 7:

    float iFourthVariable ;

    …actually be…

    float fFourthVariable ;

    …or somesuch (haven’t worked with floats at work enough to recall the correct notation, but it certainly ISN’T “i”)

    • Canthros says:

I dunno. I think there’s something very apropos about a demonstration of Hungarian notation indicating physical types, where the declared type and the type encoded in the variable name disagree in at least one case. Very close to the sort of Hungarian one is likely to find in ‘real’ code.

      • Kevin says:

Yep. I don’t know why anyone uses type-based Hungarian. If I get information-based Hungarian wrong (xPosition vs yPosition), bad things happen. If I mess up on types, the compiler comes in and slaps me upside the head, I fix it, and the only thing wasted is the 30 seconds I spent compiling and fixing the bug. No subtle bugs, no nothing. All type-based Hungarian does is make the code harder to write AND read.

        • Canthros says:

          Oh, the why is actually pretty easy. They picked it up from Microsoft a decade or two ago (possibly at some level of remove). Joel Spolsky wrote a good article about “Systems Hungarian” vs. “App Hungarian” a number of years ago.

          In practice, it’s just easier to say “Don’t do that!” than try to explain why bad Hungarian notation is worse than none at all, and why what the developer in question probably thinks is good Hungarian is almost certainly not.

        • MrPyro says:

As we are talking about C++, relying on the compiler for type-checking is not too bad; however, in weakly typed languages, relying on the compiler/interpreter for type checking can land you in just as much trouble.

    • nmichaels says:

      I think this is a great example of why Hungarian is bad. (Yes, Joel Spolsky explained that apps hungarian is fluffy and rainbows, but I was never forced to use that.) It’s really just a comment, and it’s a comment on something particularly inane. It’s like writing this:

      int foo; // integer
      float bar; // floating point

      And as everyone knows, when the comments and the code say different things, both are broken:

      float foo; // integer

  11. Kevin says:

    As a young programmer who just started his first job (so the younger than young programmer), I’d like to comment on the first one.

1) Don’t use #define. Preprocessor replacement is bad, and you should feel bad. Make it a const int instead, so the compiler can grab it and not the preprocessor.

    2) It’s camelCase(), not PascalCase() for function and variable names. PascalCase is for Class names and class names only.
    2b) Though I have seen lower_case(), usually as a sign that this is C-based code, not C++-based code.
    2c) As long as you’re not that idiot who did Pascal_Case() for his entire library, and you’re consistent, I’m happy and can work with it. If you are that idiot, I can work with it, but I will be casting horrible curses on you the entire time because my fingers don’t work that way.

    3) Type-based Hungarian is bad. No one uses it (except when forced to by evil senior programmers and their coding standards).
    3b) Everyone uses semantics-based Hungarian, but they couldn’t tell you what semantics-based Hungarian is.

    4) You are allowed to define variables when they are declared.

5) This is C++. We have std::string. char*’s are bad, and error-prone. Let’s use the STL for what it was intended for.
5b) If you really need to do significant work with char*’s (where the odds of me screwing up the code (and the extra benefit of std::string doing in 2 lines what I would need to write in 20) are greater than the cost of the 2 extra copies), copy it over to a string, do the work on the string, and then copy it back.
5c) Minor work can be done with char*’s, but ONLY for interop with old C libraries (or super-extreme optimization). If you’re writing new code, just use std::string.

    6) Const is definitely your buddy. Agreed on that. Const to the extent that the compiler will let you, because it will keep you from making errors.

    7) Brackets are odd, and usually depend on what I’m doing. You should ALWAYS use brackets around if’s, fors, and whiles to prevent errors, but using 3 lines for 1 line of actual code is pointless.

    if (condition) {
    [TAB]single line of code
    }else{
    [TAB]single line of code
    }

    if(condition)
    {
    [HUGE GIANT BLOCK OF CODE HERE]
    }
    else
    {…

    /Also, you can tell how close I am to the due date by the extent that I follow these. If I’m going to be working in this code for 2 months, let’s make it nice and pretty. If I have 3 hours left, let’s make it work.

    • Shamus says:

      I’ve heard people say that preprocessor #defines are “bad”, but I’ve never seen anyone make a compelling case as to why. And I’ve NEVER heard anything that would make me willing to define application-wide variables, even if they are const.

      1) A #define is easy to spot in code. That ALL_UPPERCASE name is an eye-grabber and makes it clear that we’re NOT dealing with a variable.

      2) #defines can be type-less, which I realize some people see as a drawback. Still, if I’ve got FOOTSTEPS_PER_MILE I can use it with ints or floats without casting and the compiler converts to the proper type.

      3) I use #defines all the time, and never have trouble with them. They’re not a source of bugs or confusion.

      I really don’t get the advantage of const, aside from the aforementioned type-casting.

      • Alan says:

ALL_CAPS is just style; it works just fine for const global variables. For simple constant literals, I’m not seeing cases where “const int FOOTSTEPS_PER_MILE=5200;” would do the wrong thing compared to “#define FOOTSTEPS_PER_MILE 5200”.

        Why #define is bad:

        - Your debugger can actually tell you what the value is as opposed to unhelpfully complaining it doesn’t know that symbol. (Some IDEs will Do The Right Thing, so maybe moot for you)

        - Your compiler is unlikely to be able to name the macro it’s angry about in error messages; you’ll see complaints about things not actually present on the line in question. (I think some compilers are starting to Do The Right Thing, but it’s more cutting edge.)

- Things fail far less cryptically if you accidentally reuse a name; the compiler will tell you in as many words that it’s multiply defined.

- Writing macros that behave like functions and work correctly in all cases is extremely hard. If you get it wrong, the failures will likely be cryptic. The big two are being called in an implicit block and multiple evaluation of an argument with side effects.

        Why you can pry #define from my cold, dead hands:

        - Everything you care about is in one place, the header file, not declared in a header file and defined in a non-header file.

- The compiler can better optimize; if the constant is defined in another source file, it can’t see the actual value and insert it directly.

        - There are things you can do with #define (string pasting, clever tricks with __FILE__ and __LINE__) that go from brief and easy to use to long and cumbersome.

        As for const, it’s a promise (admittedly, one you can break) that you won’t monkey around with something you’ve been handed. This makes it easier to identify code that may have side effects. If I give you “mystrtok(const char*);” you know I won’t be changing the string you pass in. It’s particularly useful if you’re writing in a more functional way, as espoused by John Carmack.

        • nmichaels says:

          On your third point, GCC’s preprocessor will complain if you try to redefine something. I find it hard to believe that there’s a preprocessor that won’t complain about

          #define foo 7
          #define foo 8

          I do like const, but C and C++ are horribly broken languages for letting you cast it away. Horribly broken.

          • Alan says:

            Fair enough. The problem still exists for clobbering a function, type, or variable name with a macro. Or, in the case I actually ran into…

            Back in the late 90s I was working on the AD&D Core Rules CD-ROM Volume 2. For reasons that baffle me now (and baffled me then) there was something like

            typedef int INT;
            

            I think I can blame the Microsoft Foundation Classes (MFC) for this, but it’s been a long time. Whatever the reason, it was widely used throughout the code. This went well for a while, until someone was unifying some constant strings and added something that made perfect sense in the context of software for Dungeons & Dragons:

            #define STR "Strength"
            #define DEX "Dexterity"
            #define CON "Constitution"
            #define WIS "Wisdom"
            #define INT "Intelligence"
            #define CHA "Charisma"
            

            All of a sudden about half of the project refuses to compile, with errors that I think were something like “String not expected here.” And you’d go look at the code and it would innocently say

            INT counter;
            

            There is no string there, why is it angry? And INT worked just fine in other files (one that didn’t include the header with the #define), and had been working for months at that point. Debugging that was an angry few hours.

One can limit the danger with a style rule requiring that macros be ALL_CAPS, but it’s better to move problems to a level where the compiler can help you if a human makes a mistake.

            • silver Harloe says:

              “I think I can blame the Microsoft Foundation Classes (MFC) for this”

I think this is older, probably dating to the first transition from 8-bit to 16-bit processors, though perhaps older still. The basic point of “INT for int” is that you can swap headers to compile the same code on a machine that disagrees with the bit size (or endianness) you designed for. I could be way wrong, though. I just know that in *n*x things, there are a lot of pseudo-types like INT32 and INT64.

            • Steve C says:

Hey, I bought that CD! I still have it. The only thing I remember about it now is that it made me realize the encumbrance system was nuts. I had a 16 STR character that was encumbered because he was wearing clothing. Not armor, just clothes.

              Random nostalgia moment is over.

            • nmichaels says:

              That’s a great example of how coding styles matter. Using all caps for things that are not macros is very bad juju. Of course, my first instinct on seeing that error for the line

              INT counter;

              would be to M-. (etags) the INT, then check the preprocessor output because INT must be a macro, and what else could be wrong with that line?

              Modern C (C99) has stdint.h to solve that problem, but I’ve seen many approaches for codebases that couldn’t use modern compilers.

              For what it’s worth, I do agree that even C’s weak type system should be used whenever possible. I just strongly object to the notion that one should never use macros. Static inline functions and code generation can only get you so far.

        • Shamus says:

          That’s a much more useful argument than I’ve read in the past, thanks.

          And yes, over-use of #define macros is a wonderful way to blend obscure puzzle-solving with cryptography and sadism.

          • Blake says:

Outside of the debugger-related issues, my main argument against using #define’s for constants is that if someone uses them in a header, anything else with the same name could potentially conflict.

            Having
class ObjectHolder
{
    static const int NUM_OBJECTS = 4;
    Object m_objects[NUM_OBJECTS];
};

gives you all the nice scoping, so that if you include another header with another class with NUM_OBJECTS = 8, you don’t run into issues.

            As far as using them in the .cpp, it’s just to maintain consistency with the header.

        • On a nearly unrelated note: It’s amazing that I can hear John Carmack’s voice as soon as I start reading his prose. His writing style is so close to his speaking style that the two map almost seamlessly. Un-affected may be the word I’m looking for.

      • Kevin says:

        #define’s:
        Never doesn’t always mean never. It just means that unless you have a fantastic reason for using it, don’t.

The fundamental problem with #define’s is that the code you write is not the same as what the compiler gets as input, because the preprocessor does find-replace magic.

For trivial stuff, #define VARIABLE value is fine, and the typelessness is an advantage. (Though unless you need the typelessness for something, there’s no fundamental difference between that and const int VARIABLE = value; which preserves the obviousness in code scanning.)

        For non-trivial stuff, you’ll eventually hit a point where the macro causes a bug that a function wouldn’t. (See Alan’s square(get_next_number()) example down below.) You can work around it with some simple non-compiler-enforced code style, but I write the majority of my code at 3 AM, and won’t always remember. And then I’ll have a non-trivial bug.

        Const:

        1) Your compiler LOVES const. Loves it, loves it, loves it. Properly-consted code will always be equal to or faster than non-consted code on that same compiler because of all the crazy tricks you can play knowing that this value will always be this value and never, ever change (Or that something going into foo() will be the exact same state coming out of foo()). And no, I don’t know what they are, I just remember my friends in the compiler course laughing about it.

        2) The compiler will catch lots and lots of bugs with const.

        const int n = 3;
        [BIG BLOCK OF CODE]
        n += 2; //This is a bug.

        will cause a compiler error. “int n = 3; //THIS SHOULD BE CONST” won’t. Naming conventions (“n => constN”) won’t (though they’ll make it easier to find). Having the compiler catch as many bugs as possible is a positive event.

        Especially since if(x = y){ [code] } is valid C++. Every C++ coder has made that typo at one point or another. And it's nearly impossible to find. But if x was const, the compiler would scream, you'd fix it to if (x == y) and you'd never have the bug at all.

        3) If you're writing an underlying library, you had better be properly consting, so that the people above you can properly const their code. You can't half-const: it won't compile without significant work (and lots of const_casts, which is my guess as to why const_cast exists).

        Edit: And the whole promise thing that Alan said, which makes it easier for other people to use your code. Forgot that one. Thanks Alan.

        /I also vaguely remember having fun times with non-const class getter definitions and the STL, since the STL very quietly consts a bunch of stuff without telling you. But don't quote me on that at all.
        //I also remember having even more fun with C++98 and double-consting in the STL (const T where T = const int becomes const const int, which is a compiler error). But that's been fixed in C++11.

        • nmichaels says:

          On the subject of compilers loving const:

          a.c:
          const int foo = 7;

          a.h:
          extern int foo;

          b.c:
          #include "a.h"

          blargle = foo * 2; // This is the special line.

          Your compiler does not know how to optimize out that multiplication. Had a.h used #define FOO 7, no code at all would be generated for the special line. It would just substitute the number 14 the next time blargle was used. Instead it has to make your poor users read from memory and multiply by 2 every single time they run the program. They might even need to do it again if the special line is duplicated or the function it’s in is called repeatedly.

          • Veylon says:

            My compiler did! I had it return blargle from main, and all it gave me was:
            B8 0E 00 00 00 // mov eax, 14
            C3 // ret
            No moving from variables or multiplying or anything. Just a straight-up “return 14”. So apparently, compilers really do like consts. Or at least VS2010 does, anyway.

            • nmichaels says:

              That’s nutty, but probably because your compiler has link-time optimizations turned on. For situations where you can do that, it’s great. But for people who have to use object files generated by other people, it’s not. And many compilers (like older versions of GCC and several really wonky ones that I have to use on embedded systems where saving a few bytes of memory actually matters) don’t support that feature. Instead, you get from this code:

              #include "a.h"
              
              int main()
              {
                  int bar;
                  bar = foo * 7;
                  return bar;
              }
              

              to this:

              00000000004004ac :
              main():
              /home/nathan/c/b.c:4
                4004ac:       55                      push   %rbp
                4004ad:       48 89 e5                mov    %rsp,%rbp
              /home/nathan/c/b.c:6
                4004b0:       8b 15 c6 00 00 00       mov    0xc6(%rip),%edx        # 40057c 
                4004b6:       89 d0                   mov    %edx,%eax
                4004b8:       c1 e0 03                shl    $0x3,%eax
                4004bb:       29 d0                   sub    %edx,%eax
                4004bd:       89 45 fc                mov    %eax,-0x4(%rbp)
              /home/nathan/c/b.c:7
                4004c0:       8b 45 fc                mov    -0x4(%rbp),%eax
              /home/nathan/c/b.c:8
                4004c3:       5d                      pop    %rbp
                4004c4:       c3                      retq   
              
        • Rick C says:

          BTW, if (x = y) … is something that most modern compilers are capable of catching. For some reason, with MSVC you have to use /W4 (warning level 4) to get the warning, but you should compile with that anyway.

          I believe gcc will report it without any extra switches, too.

          Aside: Shamus, it’d be nice if you slapped a <label> tag on the “confirm you are not a spammer” text so that we could click on that instead of having to hit the checkbox itself.

      • bloodsquirrel says:

        Maybe I’ve had a different coder’s upbringing than you, but my preference is always to assign a constant as a public/private class variable.

        One of the big advantages is scoping: SomeClass.SOME_VARIABLE has a very clearly defined scope. Other classes can use SOME_VARIABLE without it cluttering their global namespace. Subclasses will inherit it. Subclasses can override it. SomeOtherClass.SOME_VARIABLE will not cause any conflicts. If I decide to put three classes into one source file, I don’t have to worry about which #define is for which class and whether they conflict.

        In other words, #define has the scope of the source file it’s in. SomeClass.SOME_VARIABLE has the scope of SomeClass, which is more meaningful from an architectural standpoint.

        Other advantages:
        -It’s immediately apparent what SomeClass.SOME_VARIABLE is- even if I’m not familiar with the coding style, I know where I should go to find it.

        -Your IDE’s autocomplete will point it out for you. If you’re using a class you haven’t used in a long time, just looking at the list of options after SomeClass. in the autocomplete is a good way to remember what’s what.

        Yes, #defines do allow for some optimizations that constants don’t, but even embedded systems nowadays are powerful to the point where it isn’t going to matter. There are much more important ways to optimize your code, and the less time you have to spend debugging and trying to re-figure out your code the more time you have to optimize what’s important.

    • Alan says:

      It’s true that a lot of C++ code is CamelCase and that underscore_separated is more common in C code. I find it a bit strange, given that everything that C++ gives you out of the box, including notably the STL, uses underscore_separated. Personally, I blame Java.

    • About #defines vs. const int. In C (NOT C++), a "const int foo = 42;" will reserve memory to store the 42 (and, at least using GCC, will place it in a read-only section); "#define FOO 42" will replace instances of FOO with 42, meaning FOO is treated as a literal, not a read from memory.

      “const” in a function declaration like “int foo(const int bar)” informs the programmer that bar isn’t modified, and may help with some optimizations.

      These days, I use enum instead of #define for numeric values (still with upper case names). The compiler is aware of them, and most debuggers will have access to that information as well (at least, the compilers I’ve used—mostly under Unix).

  12. I’m pretty sure the first one won’t execute since the first line is commented out.
    #define IMPORTANT_NUMBER 10
    Or, is that a global variable definition? Hash mark is comment in Python, but in C++ it’s the double slash?

    Curse you divergent code conventions!

    Very interested in your thoughts on the id style et al.
    Also, I like to think of that last screenshot as a visualization of bad coding practice and byzantine code catching up with the developer.

    • Kevin says:

      That’s a preprocessor definition. Think of it as a giant find-replace.

      #define X 10 means “Before you hand this off to the compiler, head over and replace all instances of X with 10. In text. As a string.”

      Now that’s not a problem. The problem is when you do things like

      #define Square(X) X*X

      and Square(x + x) gets replaced with “x + x*x + x”, making the answer x^2 + 2x instead of the expected (2x)^2 = 4x^2.

      Which is why you NEVER EVER use Preprocessor definitions

        • Kdansky says:

          #define true false
          // Funny and evil to leave behind in some obscure piece of code

          #define TRUE 2
          // More amusing and evil version, because it is correctly following the standard, but 99% of all implementations use TRUE = 1 instead. You should never rely on TRUE = 1, but a lot of code implicitly does, because it’s easy to miss.

          #define true (math.random() > 0.01)
          // Exceptionally evil. Works correctly in 99% of all cases.

      • Michael says:

        No, that’s why you shouldn’t ever use badly written preprocessor definitions/macros.

        #define square(x) (x)*(x)

        is a perfectly valid way of implementing a square operation without the overhead of making an actual function call.

        It also avoids the square(a+a) = a + a*a + a problem by making it (a+a)*(a+a), as it should be.

        In fact, it should really be
        #define square(x) ((x)*(x))
        to avoid any ambiguity at all.

        As a matter of fact, many of the “functions” defined in core system libraries are actually implemented as macros to avoid unnecessary overhead. It may not matter for something that gets called ten times in a program, but when your square() (or abs() or whatever) method gets called in a ten billion iteration for loop the micro- and milli-seconds add up and the performance difference can be striking.

        (I realize that some of the optimization flags available on most modern compilers can recognize these situations and inline the code for you. Unfortunately these optimization flags frequently make complex code much more difficult to debug, and may even change the functionality of your program in unanticipated ways…. I know, because I’ve run into situations where that is the case)

        • Alan says:

          It gets worse: what if the thing you pass into your macro has side effects? To take a simple example:

          // Using ␣␣ for indenting, 'cuz the blog eats spaces.
          #define square(x) ((x)*(x))
          
          int get_next_number() {
          ␣␣static int i = 1;
          ␣␣return i++;
          }
          
          int next_square = square(get_next_number());
          

          That turns into

          int next_square = ((get_next_number())*(get_next_number()));

          Which on the first call returns 1*2=2 instead of the expected 1*1=1.

          There are a variety of techniques to get it right, but they’re really non-obvious, easy to screw up, and when you screw up the failures will frequently be extremely cryptic. Non-trivial macros are dangerous tools. But, they’re tools you sometimes really, really want and I miss them when they’re not available. (I’m fond of various techniques involving __FILE__ and __LINE__ to aid in log messages.)

        • Kdansky says:

          Put square(x) in a header file, and write “inline” in front of it. Trust the compiler. Or rather, trust those guys who implemented your compiler. They do nothing except work on making it more reliable and faster. I’m pretty sure they are better at optimising than all of us.

      • nmichaels says:

        Never ever is a terrible rule. I use preprocessor defines all the time in C code. The important thing is to make sure the preprocessor has its very own namespace. Anything in ALL_CAPS is by convention something the preprocessor will replace. Anything that isn’t is not. If the preprocessor is given something that’s not, it’s a horrible bug and someone will probably get a beating. The solution to your problem with Square(X), by the way, is to always put arguments in parentheses. This is (an important) part of our coding standard:

        #define SQUARE(x) ((x)*(x))

        That will yield correct results as long as evaluating x doesn’t have side effects (or, in some cases, take lots of time). And as a bonus, you don’t have to write a different implementation for each type x could ever have (int, float, long, short, char, double, don’t even get me started on all the types in stdint.h…). So it’s important to make sure SQUARE is in all caps so that users know that x may be re-evaluated.

        But then I write in C. If this were C++, you could use templates to solve that problem, but you’d be forced to write C++, which is pretty torturous.

        • Kevin says:

          Yep. C is why the preprocessor exists. It’s C’s templating.

          And the templated version isn’t too terrible (and will be inlined by even the most idiotic compiler):

          template <typename T>
          T square(T val){
            return val * val;
          }

          or even

          template <int val>
          struct Square{
            enum { value = val * val };
          };

          and then use Square<N>::value. (Though N must be known at compile time.)

  13. Wedge says:

    McGrath’s Kotaku article is amusing to me because so much of what he focuses on is not how Carmack wrote beautiful code, but the ways that C++ is ugly as hell, and how id managed that ugliness.

  14. nmichaels says:

    We recently went through an update of our coding standards where I work because the person who wrote the original left and the engineering department tripled in size and we’re starting a new project with lots of new code. I was gratified to find that most people didn’t care about most of the silly things like Function() vs. Function () vs. function().

    I used to be crazy about putting the opening { on its own line, but then I learned to write Go and love the bomb. Hungarian is still the devil though. I don’t care what Joel Spolsky says.

    • aldowyn says:

      In reading this set of comments, I’ve realized that I’m pretty sure the only REAL reason I put { on the opening line is so that I can easily match it up with the end bracket. That’s helpful on medium-long blocks, yes, but in short blocks it wastes a TON of space.

      Of course I don’t have much experience, but that’s a style-dilemma I’ve run into myself…

  15. Chris says:

    I think people underestimate how much time will be spent investigating old code to make a small change: the change takes 15 minutes to implement and test, while the investigation occupies a day. Code should always focus on being easy to understand, or be documented where it has to be difficult. I have spent about half my career supporting code, only getting to write one new program a year. I can tell you that clever is sometimes needed for a unique problem, but most of the time adding additional steps hurts nothing. This is the opinion of someone who spent the majority of a 20-year career in a support role, with so much time wasted on investigation.

    As always Shamus, I enjoy your programming articles and insights.

  16. Canthros says:

    I like the guy’s choice of brace style, but, hey, it’s what I use. (And more-or-less what I have always used.) Placing the brace on the same line as the keyword, declaration, etc to which the block belongs makes it clear the block isn’t a standalone thing. And elses should be cuddled for the same reason: an else exists only as part of an if block, not as a thing of its own. (I notice nobody sane ever intentionally drops a newline and an extra level of indent between the else and the if in an else-if statement in most anything C-derived.)

    But, I loathe his predilection for tabs (a space is a space the world around, but a tab’s who-the-heck-knows, depending on your editor, the other guy’s personal preferences, the font employed …), and I expect the way I format complex if expressions would drive him up a wall.

    I also suspect that a few of his preferences are predicated on the assumption that his current project only needs to build against the interface of the version of his engine that he’s got right there, and that none of it will really be maintained over, say, the next decade.

  17. krellen says:

    I gave up on programming years ago (my degree is actually in programming) and all I can do these days is clean up other people’s sloppy code, and not even always then, so while the post body itself is wholly comprehensible to me, the comment section is a bunch of gobbledy-gook I cannot process.

    Thus, my only comment is this:

    NERDS!

  18. Sabrdance (MatthewH) says:

    It isn’t just programming. Any set of notes or instructions is like this.

    Earlier today I had to update a project with a new Census dataset (if I could find the guy who decided the Bureau needed 3 different ID codes rather than just sticking to Federal Information Processing Standards, I’d chew him out good). Anyway, it was new data, but I’d done 30-some years of prior data last year. So I pulled out the old datasets and the notes for variable selection et cetera. Following my old instructions, I came out with not quite the same results.

    While I’m mostly going to blame the Bureau for changing data standards between 2002 and 2007, I also had some difficulty interpreting my notes. It turns out “E+F+H+L+M+N” is not a clear enough note (I had to refer to another document which explained all those variables).

  19. Urs says:

    I haven’t done anywhere enough textual programming to be able to even catch a glimpse of what all these comments are talking about…

    “My, uh, variables are camelCase and my filenames use underscores!”

    However, from the angle of Visual Programming:

    Whenever I start out to do something, I usually just drop nodes in the loose manner of a top-down structure (those nodes have their inputs on top, outputs on bottom) as long as I can follow what I’m doing. But whenever one functionality is implemented or a problem – no matter how small – is solved, I take some time to clean up and arrange everything neatly into (visual) blocks or clusters of ThisDoesThat.

    This topic in vvvv, illustrated:

    Nice, I get it
    Work in progress?
    Spaghetti!!
