Programming Vexations Part 7: Where Does the Time Go?

By Shamus Posted Thursday Oct 10, 2019

Filed under: Programming 66 comments

Last week I talked about the compiler and what it takes to turn source code into a program. At the end I talked about the apparent problem where the compiler seems to be slowing down in proportion to Moore’s Law, creating a stalemate where faster processors don’t do much to reduce compile times. This naturally leads to the question:

So what is the compiler doing with all those CPU cycles?

There doesn’t seem to be any agreement on this. I can’t even find any proper research on the topic. All of the discussions take the form of repeated platitudes and received wisdom on Stack Overflow. As far as I can tell, nobody has done the homework to compare the performance of C compilers in the mid-90s with the compilers we’re using today[1].

It’s Not That Simple

Oh hey! I found Waldo!

Part of the problem is that you can’t just blame the compiler and be done with it. Obviously we’re writing more complex programs, but those programs are also built using more complex libraries. Even beyond that, not all code is created equal in terms of the time it takes for the compiler to digest it. It’s less about the performance of the compiler itself and more about what you’re asking it to do.

As features are added to the language, people change their design to incorporate those new ideas. Perhaps these newer ideas take longer for the compiler to sort out, so it doesn’t really hurt compile times until you’re using a lot of them.

I want to stress that I’m not suggesting that the folks who maintain the C++ compiler have been lax in their duties. I’m willing to bet that there is a good reason why things work the way they do. C++ is a big, complex language and its use-cases stretch across the entire spectrum of software projects, from operating systems to embedded systems to productivity software to hobbyist projects. The language is incredibly flexible and it ends up being used in a lot of different ways.

I’m sure there are people out there who can account for the processor cycles C++ uses. The problem for us garden-variety programmers is that there’s no good way to sort their knowledge from the people reflexively chanting “You have too many templates”. Objective truth exists, but unless you want to become a compiler expert yourself there’s no good way to sort truth from platitudes. And even if you take the time to sort that out, it’s not always clear how to use that knowledge on your own project.

There are lots of things that can slow down compile times, but the chief suspects are:

  • Templates: I’ll talk more about this below.
  • Codebase size: This is probably obvious, but our programs are larger and more complicated these days. Back in the 90s, I worked on a codebase that was a million lines of code. That was pretty big for the time period, but it’s modest by today’s standards.
  • Frameworks: Not only are our programs getting larger, but the external packages we’re using are also growing. The Windows SDK – which you need to use if you’re developing anything for Windows – is pretty gargantuan by now. It only takes one line of code to include the windows.h header file[2] in your project, but that one line of code pulls in a total of 40 different files! The compiler needs to process those tens of thousands of lines of code. Even if you’re only using 1% of the features in Windows, the compiler still has to read the whole damn thing. This applies to all the libraries you might be using: graphics, sound, networking, Steamworks, etc.
  • Redundant compilation: Because the C and C++ compiler only runs on one file at a time, it ends up doing a lot of redundant work. Sure, you can use multi-threading to compile multiple files at once, but those isolated instances of the compiler can’t cooperate. The same few lines of included code will get compiled again and again, and then the linker has to come through and throw away all of the needless extra versions.

Templates

Let’s say you want a function to add one to a number. So you write the following code:

float AddOne (float num) {
  float result;
  result = num + 1;
  return result;
}

That works for floating-point numbers[3], but what about integer[4] values? If you want to do the same thing to integers, then you need another function:

int AddOne (int num) {
  int result;
  result = num + 1;
  return result;
}

It’s pretty easy to see that these two bits of code are nearly identical. In fact, all you need to do is replace all instances of the word “float” with “int”. Programmers do not like duplicating code like this. If the spec changes later and we realize we need to add 1 if the number is positive but subtract 1 if the input number is negative[5], then it’s very easy to overlook one of the various versions of the AddOne function. You’ll change the floating-point version but leave the other version alone, and now the AddOne function will behave differently based on the type of input variable.

What if we need yet another version for unsigned[6] values!? What about doubles[7]?! Single-byte values? Are we going to need to write and maintain five different versions of this function??

This is where template programming comes in. The idea is that you write a function like this:

template <typename T>
T AddOne (T num) {
  T result;
  result = num + 1;
  return result;
}

Now you can use AddOne on any variable[8] and the compiler will automagically build a version of AddOne based on the variable you’re trying to use. You only need one version of the function, and the compiler uses it as a template to build the rest.

That’s the ideal, anyway. In practice, people have a lot of gripes with templates. Additionally, they rapidly inflate compile times.

At this point I’d give a quick overview of how I use templates and whine about any perceived shortcomings, but to be honest… I’ve barely used them. I’m sure they’re useful to people working in other domains (people must use these things for a reason) but they never seemed to help me out in my game-type work. In practice, I rarely have functions that need to apply to ALL types. Even when I do have a function that needs to work on more than one type, it always seems like the variants all need slightly different behavior[9] so that using templates would actually be more work. Templates don’t solve the problems I typically run into.

So don’t use templates, right? The problem is that C++ is really barebones in terms of game development. You might be one of those unicorn developers that writes everything from scratch, but the overwhelming majority of developers are going to be using external libraries like boost, which makes heavy use of templates. The compile time of boost itself borders on surreal. I think a full compile of boost[10] is something on the order of 10 minutes.

TEN MINUTES!

The point is that if you’re doing game development, then you’re probably going to be using external code that makes use of a lot of templates, and thus your project will compile very slowly. This is on top of all the other stuff that makes C++ so slow to compile.

Why Compile Times Matter

You thought I was going to put that one XKCD comic here, didn't you? But no! Instead here's a stock photo of an office worker THINKING about the XKCD comic. While playing air guitar.

I’m afraid it’s time for another Terrible Car Analogy™. Let’s say you’re working on a car that won’t start. So you make an adjustment to the combobulator manifold or the humding injection filter. But you can’t test the car in this condition. You have to re-assemble the part you’re working on, put the cover back on, reconnect it to the combustion flange, and close the hood. The whole process takes a full three minutes.

There are a dozen different things you can try in order to diagnose this problem, but every attempt costs you three minutes, even though the change itself takes two seconds. If you try them all, then you’re going to spend 36 minutes re-assembling everything again and again, and in all that time you’re only going to do 24 seconds of real work.

In a tricky problem, you might need to make a lot of very small changes. Maybe those changes are only a single line of code. But if every test eats up a minute of compile time then you’re going to spend most of your time waiting for the compiler.

Worse, this particular break in flow is incredibly disruptive to concentration. I’ve mentioned before that programming involves having a lot of different ideas in your head at once. You need to keep thinking about the problem, yet there’s nothing to think about during the compile. You have to somehow maintain concentration while also being idle and bored. Have you ever seen a runner jog in place while waiting for the light to change? They’re trying to keep their heart rate up and their body moving because they don’t want to fall out of their groove. Imagine if they were locked in place and unable to move until the light changed. That’s what waiting for a compile feels like.

Personally, I think around fifteen seconds is where the compile process becomes disruptive to the coding process. As soon as compile times get above that threshold, I find it really hard to keep my head fixed on the problem. In the old days you’d fire up Minesweeper or (my favorite) Freecell. These days I guess everyone just picks up their phone.

As the compile drags on, I get bored and my mind starts to wander. If the compile is above half a minute, then I’ll jump over to another window to check social media / email / play a round of some casual mobile game / check my website comments / etc. It’ll take me five minutes to get back from that voyage of distraction, and then it’ll take me half a minute to get back into the groove of the problem and remember where I left off.

I realize it’s not fair to blame the language for human frailty. I’m not saying we should blame Bjarne Stroustrup for my social-media distractions. But it’s also true that programmers are people and the problem of losing concentration during compiler downtime is not a rare one. If we can make a language that keeps compile time low, then we can stay “in the zone” for longer and thus enjoy higher productivity.

Jai Does it Differently

Jai is apparently ridiculously fast to compile. Jon Blow shows his game compiling in under a second on live streams. For reference, a second is about how long I usually have to wait before the compile process begins in Visual Studio. The difference in speed is almost comical. He’ll occasionally compile the wrong program, or with the wrong settings, and it takes him longer to realize his mistake and say “oops” than the entire compile process.

To be fair, some of that improved compile time is because Jai has less to do. It isn’t chewing through decades of language features that aren’t strictly needed for your project. But I’m guessing the biggest speed boost is that the compiler is able to look at all the files in question, rather than looking at each file in isolation.

I’m sure Jai compile times will go up as the language matures, but currently it’s compiling a nearly-complete game an order of magnitude faster than C++ would[11]. I’m not saying Jai is going to save us all, but I do think it’s reasonable to conclude that a language devised for games could really help us out when it comes to compile times.

I’ve dabbled in other languages, but I’ve never done anything above trivial complexity outside of the C family and Java. I’d love to hear from Go, Rust, and Dlang developers: How are compile times in your neck of the woods?

 

Footnotes:

[1] If I’m wrong and there is indeed research on the topic, then please drop a link in the comments!

[2] I’ll talk about header files in a later entry.

[3] Numbers that can have an arbitrary number of decimal places, like 10.78881 or 1284.4.

[4] Variables that can only store whole numbers with no decimal places.

[5] I’ve emailed the project lead and insisted that we should rename the function from AddOne to IncreaseMagnitudeByOne, but she hasn’t gotten back to me.

[6] Variables that can only hold values that are positive or zero.

[7] Floating-point values with twice the precision, so it can hold values like 221.453739006345.

[8] More precisely, any variable that can have 1 added to it.

[9] Like maybe floating-point numbers need to be rounded off or unsigned values need to be clamped to 0 during subtraction rather than allowing them to wrap around to four billion.

[10] You only need to do this once, after downloading the source.

[11] There’s no way you can fully compile a mid-tier 3D title in under 10 seconds.




66 thoughts on “Programming Vexations Part 7: Where Does the Time Go?”

  1. Zerfix says:

    Rust uses llvm so everything after the pre-processor is the same speed as c++ with clang. It does however use some clever tricks to reduce compile times like only compiling libraries once but linking them multiple times if there are no changes, and every library gets its own compile thread. Best practice is to divide the program into smaller libraries.

    1. tmtvl says:

      Yeah, it isn’t astonishingly fast, but not having to recompile everything when you make a change is a godsend.

  2. The Puzzler says:

    Recent AAA project I worked on:

    Before checking in any code, you were supposed to update to the latest code, and then check your changes on debug and release for Playstation & Xbox, which amounts to four different builds. (Sometimes our policy was to check more or fewer build combinations.)

    Updating also updated the art, which required an art rebuild (or a download of rebuilt art; there were lots of experimental systems for this). Art rebuilds would also touch .h files, which meant that when you did a code build after that, it would have to be a full rebuild.

    Having done your (first) rebuild, you’d then have to upload the build to the test kit, and run it. We hadn’t optimized load times yet, so testing to see if a single level could load without crashing might take a few minutes on top of that.

    So, in order to check in tested code, you’d first have to update and then do all this rebuilding, and then upload and run it on multiple devkits, and hope that no-one changed anything that might be incompatible with your changes during this time.

    There were over a hundred people working on this project, internationally. The central server was (depending on where your team was) in another country, meaning updates had to go over the regular internet, which can be pretty slow. Doing an update and full rebuild in the morning could take an hour or three.

    The chances of someone changing something during testing was about 99.9%. Basically, the only people who could ever get any code in were those who skimped on testing compatibility with the latest updates. (“Sure, this code could break the game for a hundred other people, but my best guess is that it won’t.”)

    Code compile times were part of the problem, but not the whole problem.

    1. Decius says:

      Did “the latest code” that you were supposed to check your changes with update every time someone checked in code, or was checking in code O(N^3) with programmers in total testing time?

      1. The Puzzler says:

        I don’t think I understand “O(N^3)”.
        It would change every time, although some programmers would work on specific sub-branches of the code to reduce the amount of integration required.

        1. Decius says:

          Each programmer needs to compile and test their changes against the most recently accepted change.

          Then a race condition exists where only one change gets accepted and all other programmers have to merge, recompile, and test against the new code.

          That happens a number of times equal to the number of changes.

          Thus, the compile and test time scales with the cube of the number of people making changes.

    2. Asdasd says:

      Did the game get finished? I’m reading these articles and comments, as a non-programmer, wondering how these large-scale software projects ever get done.

      1. Sleeping Dragon says:

        The deadline looms and the release date can’t be pushed further or it won’t make the holiday/trade show/fiscal quarter/Jupiter being ascendant in the house of Mars/release date that has been hyped by marketing until the very last moment window so everyone needs to crunch to put out something that doesn’t outright explode 2/3rds of the time. If the game sells they might later patch it so it doesn’t explode 4/5ths of the time.

        I’ve noticed I’m becoming a bit jaded about AAA gaming industry.

  3. Decius says:

    I disbelieve that Jai can look at more than one file at once. Or even that it looks at files at all.

    Maybe it can load code from more than one file into memory and then work on it all there, but that’s the kind of thing that an adequate C++ compiler would be able to do.

    There are some more hard limits around paging to disk and paging to processor cache that limit maximum compile speed, given that everything is in “memory” (when dealing with virtual memory and access times, the distinction between RAM and disk is important, but OS features have generally made the two hard to distinguish, particularly when the OS decides to write RAM to disk so that it can cache disk reads to RAM, and then decides to do that more and more aggressively as disk queue times get longer because I’m actually using all of that stuff in memory and actually making lots of small nonsequential disk reads).

    Back on topic, I don’t see how putting everything into memory at once removes the need to compile every line of code that appears in any of the files. Sure, in theory the duplicate code repeated in each file could be linked, but in practice identifying the common substrings over a given length is computationally hard and for nontrivial cases I’m pretty sure that is NP-complete;

    So if the faster compile times are real, it’s because the Jai compiler is ignoring things that the C++ compiler is attending to- for example, if an include is bringing in a quarter gigabyte of code but the program uses only one function of it, it seems like a good compiler would find the function that is being used and what that function needs and pull the code for that function from the cache, rather than parse all of it.

    Maybe it’s easier to write a minimally adequate compiler for a new language rather than for an existing one? Because all of the compiler time losses you mentioned are problems with the compiler, not the language.

    1. methermeneus says:

      I think a lot of the improvements are related to starting fresh rather than building on decades of prior work you can’t break. I don’t know how much is the language and how much is the compiler, but the same basic concept behind the problems exist in both. I wouldn’t be surprised if you could get a lot of improvement from writing a new C++ compiler without reusing code from GCC or VCC or Clang, it’s true. On the other hand, Jai is also doing stuff with the language that (Jon hopes) will make the actual coding process easier. Granted, other languages do this as well, but most of them include the protective features that Shamus was talking about earlier in the series (the “language for good programmers” thing).

    2. Timothy Coish says:

      “Back on topic, I don’t see how putting everything into memory at once removes the need to compile every line of code that appears in any of the files. Sure, in theory the duplicate code repeated in each file could be linked, but in practice identifying the common substrings over a given length is computationally hard and for nontrivial cases I’m pretty sure that is NP-complete;

      So if the faster compile times are real, it’s because the Jai compiler is ignoring things that the C++ compiler is attending to- for example, if an include is bringing in a quarter gigabyte of code but the program uses only one function of it, it seems like a good compiler would find the function that is being used and what that function needs and pull the code for that function from the cache, rather than parse all of it.”

      First of all, the compile times are real. Secondly, did you read the article? The Jai compiler does not have to ignore anything that a C++ compiler attends to. Thirdly, dear god when are you going to be loading in 250+MB’s of code? And with a single include?! For a single function?! What projects have 250MB’s of code in the first place? We’re talking about text files here. Fourthly, modern SSD’s have random read speeds of well over 50 MB’s, so if you somehow had 250 MB’s of text files, your source code, and this was all randomly arranged on your hard drive, you would load all that in 5 seconds. You are never going to need to reload this, so that’s a non-issue.

      The compiler time losses he talks about are absolutely 100% problems with the language, not the compiler. Recompilation is the biggest offender, and that’s inherent to how C/C++ compile. If you don’t recompile headers, your compiler is broken. Templates are also a massive offender, which is multiplied in time by the previous “feature” of C++.

      “I disbelieve that Jai can look at more than one file at once. Or even that it looks at files at all.”

      I don’t even know how to respond to this. Obviously the Jai compiler looks at files. It does not get code into memory magically.

      “Maybe it can load code from more than one file into memory and then work on it all there, but that’s the kind of thing that an adequate C++ compiler would be able to do.”

      What? It’s not a maybe. What you’re talking about is trivial. A C++ compiler can also load in multiple files. Any program can load in multiple files. Also, you have to “work on it all there”, referring to memory. What is even meant by the above statement?

      1. Decius says:

        A line of code is on the order of about 30 characters, each of which is two bytes. 60 bytes to the line, 20 lines to the kilobyte, 20k lines to the megabyte.

        If one include brings in 40 files, each of which is 100k loc, that’s a quarter gig, more or less. Maybe some of my assumptions are wrong.

        And it was never the load-from-disk-the-first-time time, it’s the cost of doing substring matching within that space to find redundant code. With perfect coding, there would be no advantage to doing so, but I’m uncomfortable assuming that.

        Processors can’t address bits on disk. Processors can ONLY address memory and hardware, and some hardware can also address memory. The processor should generally avoid waiting on disk access, because disk access is slow compared to processor speed. Even more specifically, the processor can only *directly* address cache.

        If the current code+executable+compiler+OS doesn’t fit in cache (which it never will), there will be a slight reduction in maximum speed as cache is swapped out with RAM. If the above content doesn’t fit in RAM, there will be a larger delay as it is swapped to disk. (if it doesn’t fit on disk, the compile fails). SSDs won’t help much, because it’s more a latency than a throughput thing.

  4. Ermel says:

    I realize you’re simplifying things here, and I also realize I’ve been out of this topic for decades, but I still have to ask: why does your system need to recompile all these libraries all of the time? Isn’t there a mechanism that can distinguish the one or the few files that changed since last time, and only recompiles those (and maybe the ones dependent on it)? I mean to remember something that was called an object file from a dim past.

    Care to elaborate please?

    1. Gabriel says:

      At the very least, “make” has facilitated what you describe for ages – for the library itself. I think the problem more is if you changed a line in foo.c, and foo.c includes windows.h, steam.h, and kitchensink.h.

      1. Decius says:

        Why would it need to recompile windows.h, steam.h or kitchensink.h the first time foo.c is compiled?

    2. Phil says:

      Visual Studio, at least, does that: it will only recompile files that have changed since the last build. But that does still involve going through all the headers that file includes, and potentially duplicating work that will be thrown away by the linker.

      Precompiled headers can help with that a lot. You add a header to the list of precompiled headers (in Visual Studio, this is typically done through “stdafx.h”), and once that is compiled, it doesn’t need to be again; any time that header is seen in a translation unit, it knows the work’s been done and can skip that step (or, something akin to that). Best used for header files that are somewhat stable (either your own code, or the standard library), as you don’t want to have to recompile the PCH often if you can avoid it.

    3. Olivier FAURE says:

      Every build system on earth has some way to cache the results of previous compilations and only recompile the code that is potentially affected by your changes.

      The problem that C++ more than any other language suffers from is that changes to some parts of the code (header files) have the potential to affect a lot of other code, even though most of the time you only want to make minor changes.

      Rust is especially good at this (I’m told, I don’t work with it for my day job), because it can separate compilation into multiple stages, and cache the intermediary results. So for instance, if you replace “x = 7” with “x = NUMBER_OF_DAYS_IN_WEEK”, after a few passes the compiler can detect “Hey, this produces the same logic you already had!” and skip further calculations, whereas other compilers might go “This file was written to, therefore everything that used this file must be recompiled”.

      1. Ermel says:

        Thanks to everyone who responded to my stupid question. Much appreciated.

  5. EwgB says:

    Java can be fairly slow in my experience (though not as slow as C++). I currently work on a large productivity software package, and a full rebuild takes about 6 minutes on a fairly beefy machine. Luckily I don’t have to do this often, and a single file takes maybe five seconds.
    I remember working on a large-ish C++ project in college (it was a computer graphics class, so it was a bit related to gaming). A whole rebuild on my old laptop took a whopping 40 minutes! I once reported a bug to the maintainer of the project (it was a tool developed at my university), I saw him fix the bug and do a rebuild in “just” seven minutes on his Core i7 6-core workstation.
    As far as I know a large problem for C++ is the fact that every time you #include a file, it has to be compiled, and all files it includes and so on and so forth. In a large project you end up building some files dozens, maybe hundreds of times. More modern languages don’t do that. Java doesn’t even bother with linking as such, it just compiles every necessary *.java file into its bytecode version (*.class), and the linking is done at runtime by the JVM. The single *.jar file you often see is basically just a zip-archive with those *.class-files and additional meta-data, property-files etc.
    Other languages can be faster as well. Pascal/Delphi was always known for being pretty fast. When I worked on a Delphi desktop application, it would build the 2-3 million LOC project in a couple of minutes, for a full re-build. Pascal was originally designed as a single-pass compiler, so it would never look at a file more than once. In a more modern example, Go was also designed with a focus on compile times, according to legend by some Google engineers who were bored waiting for a 30 minute compile of their C++ project.

    1. John says:

      One thing I’ve noticed about Java is that the first compile of the day takes significantly longer than all subsequent compiles. I’m not entirely sure why. The same is true for Java applications. The first application of the day takes significantly longer to get started than all subsequent applications. For applications, I expect it has something to do with getting the JRE going. I suppose something similar must be going on with the compiler. I count myself lucky that all of my Java projects and applications (with the exception of Netbeans) are so small that “significantly longer” usually amounts no more than five or so extra seconds.

    2. Cubic says:

      “I currently work on a large productivity software package, and a full rebuild takes about 6 minutes on a fairly beefy machine. Luckily I don’t have to do this often, and a single file takes maybe five seconds.”

      Ironic.

      1. EwgB says:

        Well, it’s working with the software that’s supposed to be productive, not working on it. The compile time is not even a significant part of the difficulty here. It is a monster of an application, both in terms of raw amount of code, and in the complexities of the architecture and underlying data model. Just as an example, the data for a customer is stored across at least a dozen database tables. There are several hundred tables overall. In addition to all that, the app was started with Java version 1.0, all the way back in 1998. So digging through the code is like archeology, going through younger and older layers, and sometimes finding dinosaur poop on the way.
        And that’s not even considered an old legacy system in my company. We have an even older system, written in a language called MUMPS (look it up, it’s horrible), that is still used by the largest of our customers. Think ASCII based interface, white text on black, keyboard only. There is no longer active feature development on it, but there is still a developer team fixing bugs for the old system.

        1. Cubic says:

          Lol, wasn’t Zork written in MUMPS or something like that?

          You have my sympathy btw.

          1. EwgB says:

            No, Zork was MDL. About the same age (MUMPS is 5 years older), but quite different in principle. MDL is apparently descended from LISP, so it’s more of a functional language. MUMPS is similar to COBOL, a procedural language with a focus on business data processing, with its main distinguishing feature being that it sits directly atop a database. What this means is that the database is part of the runtime environment. To read from it, you don’t need to talk to the database driver, write SQL statements, or use some ORM (object relational mapping) framework. Instead, all the tables are present as objects at runtime in your environment, you can just access them as if they are variables in your code. Which makes data access very fast and convenient, especially for the time this thing was developed. Also means that you can break stuff pretty easily.
            Also, the language has this quirk where you can abbreviate every command to a single letter (QUIT to Q, WRITE to W). This was done to save on storage, which was tight in the sixties; it also makes programs very hard to read.
            Luckily, I don’t have to work on that stuff, or I wouldn’t have taken this job in the first place; I only do Java. I’ve never even seen the MUMPS code from my company. The only examples I’ve seen are actually from Wikipedia and a funny site called thedailywtf.com, which hosts stories from IT and software development. They have some choice horror stories on MUMPS and its (mis-)uses.

            1. Cubic says:

              Oh right, MDL might have been pronounced ‘Muddle’ which is close enough for my confusion. Makes sense that it was Lisp-like too since I seem to recall the Zork guys came from MIT.

  6. Joshua says:

    As a non-programmer, I do find this kind of article interesting. My only real experience with coding was using NWScript in Neverwinter Nights, and I remember the loop well: make a change, click Build Module (or whatever the command in the toolset was), wait for all of the scripts to compile, hope for no errors, launch the module on a test server, and then go see if the code actually worked. This could take 5-10 minutes, and it was a very annoying process if it didn’t work, since that meant further troubleshooting. So easy to lose hours and hours on this process.

  7. tsholden says:

    One of my favorite things about prototyping games in Love2d (a framework that runs games written in Lua) is that there is no compile time; you simply run the `love` command in your project directory and you’re off to the races. Even better, if you’ve decided to use something like the lovebird library, you can open a browser window that lets you inspect and manipulate values in real time or run arbitrary code, much like how you can mess with JavaScript in your browser console. Super useful when you want to quickly experiment with different physics settings, etc.

    Disclaimer: while people have managed to make 3d games in Love2d, it’s intended for 2d development, so I’m well aware that this is an apples-and-oranges comparison with what Shamus is talking about. But for the sake of prototyping it’s an immense difference.

  8. ElementalAlchemist says:

    As a non-coder that occasionally dabbles in 3D rendering, when I finally got to the part where you revealed you were grumbling over a whole 10 minutes, I admit to laughing out loud.

    1. Paul Spooner says:

      The real-time raytrace render preview in Blender is really great for avoiding this kind of problem. I suspect other packages have similar features.
      But yes, render times can be killer. I try to keep mine down to 1 minute though, especially on animations. 10 minutes per frame would be a real long process.

  9. Milo Christiansen says:

    Go was designed from the ground up for fast compile times, so, as a programmer who does a lot of work with Go, the compile times Jai has are nothing special.

    1. pseudonym says:

      This made me go to Go’s wiki, which got me to this interesting read by one of the Go developers about the differences between Go and C++: https://commandcenter.blogspot.com/2012/06/less-is-exponentially-more.html?m=1

      Some people think less is more, others think more is more. I have not worked with C++, but I have worked with Scala, a language that also follows the more-is-more philosophy. In Scala, compilation is very slow. The build tool (sbt) is slow. Testing is a chore.

      Less is more, in my opinion. It keeps things simple and fast, and makes testing rewarding and fun!

      The ultimate less-is-more language seems to be Scheme, but I have not worked with that. Is there anyone here who has some experience with Scheme?

      1. Chad Miller says:

        Is there anyone here who has some experience with scheme?

        I actually use Racket for personal stuff occasionally (generally when I feel like doing math and really want exact answers without having to worry about practical compromises like floating-point numbers or limited-size integers). That said, this is a Scheme variant that accumulated so many features it stopped calling itself Scheme so I don’t know that it’s a good example of the “less is more” principle you’re talking about.

      2. Retsam says:

        I’ve dabbled just a bit in Scheme (and more broadly in the LISP family), and the compile times for scheme are unbeatable.

        … because it’s a dynamically typed language that doesn’t use a compiler.

        With dynamically typed languages, in general, the tradeoff is quicker cycles – I can test my code very quickly – but without the compiler catching errors you have to test your code a lot more often. Some languages (e.g. Python) also have compilers which can give some of the performance benefits of compilation, but it seems they can only do so much.

        But yeah, in terms of overall design, Scheme (and the LISP family) are very elegant languages – very few concepts and a lot of flexibility – but it’s an extremely apples-to-oranges comparison to most of the languages discussed here, as the language leaves a lot on the table in terms of performance.

        It seems like Clojure is a decent compromise on the “less vs. more” spectrum in the LISP family – it’s still very LISP-y but introduces a number of concepts for the sake of pragmatism – different types of ‘lists’ for different performance characteristics and such, compared to the classic “cons” linked list of LISP families.

        I found the dynamic typing aspect of Clojure a bit frustrating, though, as it’s a language that encourages fairly complex types, but doesn’t have a compiler to help you get them right, so I spent a lot of time having to rerun my program due to those sort of “trivial” errors.

      3. tmtvl says:

        Less is more in my opinion.

        “Less isn’t more, less is less.” -Lordi, “Bringing Back the Balls to Rock”

        I will say, as a Schemer, working more barebones helps you become a better software architect.
        I will add to that, as a Perler, having a more developed base helps you become a better software developer.

  10. Kyle Haight says:

    It’s amazing to me that people think a 10 minute full build is slow. I had to do a full from-scratch source pull and build on a project at work yesterday and it took over 2 hours. At least 90 minutes of that was compiling. And I work at one of the FAANG companies, so theoretically we should have our shit together.

    Rebuilds are much faster, thank goodness.

    1. Cubic says:

      I remember recompiling OpenOffice while having a Thing for Gentoo. I had to leave it overnight, and that’s when I realized Gentoo and I had to break up.

      On the other end of the spectrum, Naughty Dog had their Lisp dialect, GOAL, where you could load patches into the game while it was running and being debugged. Apparently quite convenient. (PlayStation 2)

    2. Echo Tango says:

      Where I work, if any system takes longer than 5 minutes to build and deploy, people are already complaining from the lost productivity. Some of our legacy stuff takes a full 45 minutes, which is viewed as very tiresome indeed. ^^;

  11. Timothy Coish says:

    “The compile times of boost itself borders on surreal. I think a full compile of boost[10] is something on the order of 10 minutes.

    TEN MINUTES!”

    I know one of the guys who worked on Gears 5. BTW, don’t blame him; he’s just a programmer, and a very good one at that. Anyway, they’re using the Unreal 4 engine, and a full re-compile of the game takes over 45 minutes. Of course, the codebase is over 17 million LOC, but still. That’s 2,700+ seconds, so we’re looking at compilation speeds of roughly 6,300 LOC/s. A comparison with Jai is even more unflattering, since Jai does not have header files, which gives it a smaller codebase. When Jon Blow recompiles his 100,000+ LOC project in less than a second, the equivalent C++ compilation speed would be even higher than 100,000 LOC/s, due to the addition of header files.

    It takes the guy longer to alt-tab, hit the up arrow key to get the last terminal command, and hit enter than it does to actually fully compile his 3D game, engine included.
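
    Timothy’s throughput figures are easy to sanity-check. Here’s a quick back-of-the-envelope sketch; every number is a rough estimate taken from the comment above, not a measurement:

```python
# Back-of-the-envelope check of the compile-throughput figures quoted above.
gears_loc = 17_000_000      # reported Gears 5 codebase size (LOC)
gears_seconds = 45 * 60     # reported full rebuild time: 45 minutes

jai_loc = 100_000           # reported size of Jon Blow's Jai project (LOC)
jai_seconds = 1.0           # reported rebuild time: under a second

gears_rate = gears_loc / gears_seconds   # ~6,300 LOC/s
jai_rate = jai_loc / jai_seconds         # 100,000 LOC/s

print(f"UE4/C++ throughput: {gears_rate:,.0f} LOC/s")
print(f"Jai throughput:     {jai_rate:,.0f} LOC/s")
print(f"Jai is ~{jai_rate / gears_rate:.0f}x faster per line")
```

    Note that this compares lines per second, not total time, so the missing header files in Jai only make the comparison more lopsided, as the comment says.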

    1. Decius says:

      100k LOC in 1 second, on (assuming for example) a six-core 3 GHz processor (roughly 18 billion cycles per second), means that a line of code can be compiled in about 180,000 processor cycles or fewer, including I/O.

      That’s not slow enough to do global optimization across the entire codebase, for example by noticing that a given subroutine is never called and therefore can be omitted from the binary, or noticing that a given variable is set only once at startup and never changed, or several other compiler-level optimizations.
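
      The cycle budget Decius describes works out like this (the core count and clock speed are his assumed example hardware, not measurements):

```python
# Cycles available per line of code, under the assumed hardware above.
lines = 100_000              # LOC compiled
seconds = 1.0                # observed compile time
cores = 6                    # assumed core count
clock_hz = 3_000_000_000     # assumed 3 GHz clock

total_cycles = cores * clock_hz * seconds
cycles_per_line = total_cycles / lines
print(f"{cycles_per_line:,.0f} cycles per line")  # 180,000
```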

  12. Cubic says:

    As far as I can tell, nobody has done the homework to compare the performance of C compilers in the mid-90s with the compilers we’re using today.

    If you compare mid-90s gcc with current day llvm, you will see that llvm has an insane number of optimizations and passes. But I get the impression that the low-hanging fruit had been picked by the mid-90s so it doesn’t matter all that much.

    There is something called Proebsting’s Law, which indicates that compiler performance improves by 4% per year, doubling in 18 years (compared to Moore’s Law’s 18 months). After formulating this law, Proebsting apparently changed course, left compilers and went into management. (He’s a professor now, from what it looks like.)

  13. baud says:

    There are a dozen different things you can try in order to diagnose this problem, but every attempt costs you three minutes, even though the change itself takes two seconds. If you try them all, then you’re going to spend 36 minutes re-assembling everything again and again, and in all that time you’re only going to do 24 seconds of real work.

    One very good feature in Java is live code reloading: when debugging, execution stops at a breakpoint, I can edit the code, the recompiled Java code is automatically reloaded, and I can continue debugging step-by-step with the new code. In my case it avoids having to copy-paste the new jar onto the server, since I can’t run the project locally.

  14. Sven says:

    Most languages that can reason about more than one file at once are significantly faster to compile than C++. Actually, I think I can simplify that to saying most languages are significantly faster to compile than C++. C++ is so insanely complex to compile that nothing else even comes close.

    And if you think 10 minutes is bad: I work on Windows. Compiling Windows from scratch takes between 10 and 15 hours (and this does not include a bunch of post-build processes that are needed to get it ready to install). And it takes that long for every architecture (x86, amd64, arm64, etc.), and for every branch (there are probably over a hundred that are built every night). So yeah, the Windows engineering system infrastructure is HUGE.

    Fortunately, developers don’t ever need to run this kind of build on their own machine. We just build the components we’re working on. :)

    1. baud says:

      Yeah, for us a full build from scratch can take most of the day (with a dev babysitting the process, since the home-made build tools are quite finicky), but that also includes MSI generation, antivirus scanning, delivery signing, building the ISO, and uploading the artifacts. But usually a dev just rebuilds one of ~100 modules, and those take anywhere from 5 seconds (a small Java module) to 2 minutes for our C++ modules to 5 minutes for the MSI.

  15. The Rocketeer says:

    Yo, just want to chip in that I’m reading and really enjoying all these articles, as I do all your coding articles. I just don’t have anything to add on the substance and as a non-coder the comments are basically gibberish, but I’m still enjoying following along week by week, like a dog watching golf on the TV, as I’m sure many are.

    1. Baron Tanks says:

      I do enjoy it too under the same circumstances and have posted this before, you just gave me an excuse to reiterate it.

      But mostly I’m now curious about the many dogs watching golf. I had no idea that was such a mainstay hobby for dogs :)

    2. Kathryn says:

      Yeah, I’m a hardware engineer leading a software team (long story), so the insight is helpful. I’m not sure what languages we’re using, though. I know across the larger team (mine is networks, while others cover overall integration and certain technical functions), we employ different OSs on different computers (I refer here to the computers on the final product, not the ones at our desks). I haven’t inquired into why, but I assume it’s a combination of different tools being best for the wildly different jobs said computers are doing and the need for dissimilar solutions for redundancy.

      As a side note, part of the challenge of leading a software team is that I don’t speak software. For months I thought regression testing meant, you know, regression testing – where you vary an input over its range and observe what happens to the output. (This idea makes perfect sense in our context, where the behavior of effectors depends on the inputs, and people die if our effectors don’t do the right thing at the right time. Which is why we need redundant, dissimilar solutions.) Turns out that in the software world, regression testing is just re-verification testing.

      1. Leeward says:

        Some etymology: regression testing is testing for regressions. Software regressions are where something that used to work no longer works. Thus, re-verification.

  16. OldOak says:

    Simple compiling speed exercise between C and C++.
    Two open source libraries were “competing” to replace Motif in the ’90s: Gtk+ and Qt. Gtk+ is written in plain C, Qt in C++. Try to compile both using gcc (and its g++ counterpart). Disclaimer: I haven’t done this in a long time (maybe 15+ years), but I still sometimes compile from source some apps written for each of these libraries.
    Bottom line: you’ll still be compiling Qt a good while after you’ve finished with Gtk+, including all of its dependent libraries (Glib, Atk, Pango, Gdk-pixbuf, and Gtk+ itself).
    C++ takes way longer to get that .o file out of .cpp code than C does, even at the same optimization level. But it’s also true that googling a reference for a game programming hint in C++ is way easier, and faster, than for its C counterpart.

  17. Frank says:

    I worked on a codebase at work that was several hundred thousand lines of heavy C++ template code. It took about 8 CPU hours to compile each of the debug/regular builds, several minutes to link, and created a 2.1GB object file. We had a special LSF cluster for builds on our server farm that would reduce it to about 40 min. of elapsed time on 16-24 cores. The longest file took about 20 min. to compile so there was no way to get compile time under that value. Even a small change to a simple file took 3-5 min. to compile and link. It drove me crazy. I’m not sure how anyone can get any work done in that environment. The project I normally work on only takes 2 min. to compile with 16 cores and 10 min. on a single core.

    1. Cubic says:

      I’ll go out on a limb and claim your app didn’t have great performance either, because 2.1 GB.

      1. Frank says:

        It took some time to start up, maybe 2 minutes. But then, this is the type of compute heavy tool that typically runs for many hours on hundreds of CPU cores.

  18. kdansky says:

    Note that C++’s compile times are a big outlier in the world of modern programming languages. Everything (!) else compiles a few orders of magnitude faster; most languages are as good as instant. Many are so fast that they can compile while you type, immediately highlighting mistakes that would take a 10-minute roundtrip in C++.

    10 minutes is not actually all that much for a C++ project. If you have a million lines of code (e.g. a modern fat application, like photoshop or other expensive business software), then you’re more likely looking at an hour instead.

    This is also a reason why nearly all modern games use at least two languages: C++ for the engine, and a scripting language (often Lua, because it has a great interface to C and C++) for the gameplay code. That way they can change stuff like AI behaviour or triggers on the fly without compiling at all, without losing the performance advantage that low-level C++ can grant.

    1. tmtvl says:

      That’s what SWIG is for. Be it Lua, Perl, Tcl, Scheme, Ruby, or a bunch of other higher-level languages, they can integrate C/C++ code with few heartaches.

  19. parkenf says:

    Speaking of games can anyone say what’s the best in class for a sub £100 (insert your Brexit currency exchange joke here) PC game controller, as I’d like to buy one for my son who’s moved out, who was using an old PS3 controller to play GTA5 (on PC, not the PS3). Is it just a PS4 controller, or is there one that all the smart gamers are playing with?

    1. The one with the easiest compatibility is typically an Xbox controller, because a lot of games from the last era were ports designed that way.

      Personally, I use a PS4 controller because that’s the console I have (and I prefer its layout), and since he was using a PS3 controller, the software to use a PS4 controller is already on there and SHOULD be as simple as connecting it.

    2. tmtvl says:

      Steam Controller is the best around. It’s easily worth thrice what it actually costs.

      1. parkenf says:

        Thanks. Ordered that. Had to order from Steam directly as Amazon are scalping.

        1. Warlockofoz says:

          Mild support for the steam controller from me as well.
          Pro: can do anything a standard controller can and a bunch more.
          Con: needs configuration, feels weird (especially if used to a standard controller).

          1. Richard says:

            Does the Steam Controller still need Steam to be running for it to appear as a ‘controller’?

            I’d quite like to use one on hardware that can’t run the Steam client for various reasons, but last time I tried it wouldn’t work.

            1. tmtvl says:

              There is a tool called SC controller, if you can run GTK apps. If not, maybe you can have a look-see if you can extract useful code.

              1. Richard says:

                Thank you – sadly, that seems to require Python itself rather than just having Python bindings.
                The platform doesn’t have the spare resources to run Python – it’s very much a barebones Linux.

                However, that pointed me down a very productive GitHub search, and I found a couple of projects that should be adaptable. :)

          2. parkenf says:

            Steam say on their website that there’s a good crowd support community providing downloadable mappings for available games. I’ll see how true that is.

  20. Zak McKracken says:

    I mostly code in Python, so compile times are nonexistent :)
    …on the other hand, compute times are very much existent, and so are other things I’m doing. Some of it involves opening huge files of flow simulations, 14GiB or so, which then expand to 200+ GiB in RAM and take several minutes to open (let alone do anything with them)… so the issue of waiting ages during troubleshooting is very familiar, and between one and about five minutes is just the worst. Stuff that runs overnight you can deal with, but if it’s urgent and there’s nothing useful you can do in the meantime without losing the thread, that’s pretty awful.

    What I try to do when troubleshooting something that takes minutes to test: Keep lots of notes of the approaches I’ve tried, and the next several approaches I’d like to try if the others fail, and what to do depending on the outcome of those attempts. Then, while one attempt is being tested, I can implement the next one and start the test on that. While that’s running, check what happened to the previous one. Or go over the notes again, update what worked and what failed, and also update the decision tree because otherwise I’ll totally forget what I wanted to do in case xy fails again.
    …I’d like to think I’ve got a decent system there, but it’s not a mode of working I can sustain for long, particularly not under stress.

    1. pseudonym says:

      I work with 20GB+ files regularly. What we do in our field is use a very small subset of the dataset for testing purposes. Of course, this only works when the data is easily subsettable (is that a word?).

      This allows you to debug the program much faster before you try it out on the real data.
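
      The subsetting workflow can be as simple as slicing off the first chunk of a line-oriented file (the function and file names below are hypothetical, just to illustrate the idea):

```python
# Sketch of the "debug against a small subset" workflow.
# Copies the first n_records lines of a huge line-oriented file so the
# edit/run/debug loop takes seconds instead of minutes.

def make_subset(src_path: str, dst_path: str, n_records: int = 10_000) -> None:
    with open(src_path) as src, open(dst_path, "w") as dst:
        for i, line in enumerate(src):
            if i >= n_records:
                break
            dst.write(line)

# Usage (hypothetical file names):
# make_subset("full_simulation.csv", "subset.csv")
# ...then point the analysis script at subset.csv while debugging.
```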

      1. Richard says:

        Of course.
        However there are often issues that don’t happen until you are using the entire dataset.

        While in most cases those are due to a small number of them ‘colliding’ in some way, it’s often impracticable to determine which ones are causing the problems.

        And sometimes the problem is in fact the sheer volume of data.

        I’ve had programs that should theoretically fit in memory just fine, but in practice allocator fragmentation means that sometimes it doesn’t – the memory tools all claim there’s sufficient “free” memory, it’s just that there wasn’t a big enough block at that moment – worse, a millisecond later, there would have been as another thread would have completed and cleaned up.
        Thankfully moving to 64bit ‘fixed’ it. I’m not doing ‘big data’, so I’ll never run out of 64bit address space.

        I didn’t really fancy writing my own “damnit-try-again” allocator, even though C++ makes that pretty trivial.

  21. Leeward says:

    I think I’ve made this point before, but C++ has exceptionally slow compile times. If I were designing a language to be slow to compile, C++ would be a good starting point. It’s got a phenomenally complex grammar, plus code generation (which makes the grammar more complex). On top of that, the most straightforward way to write C++ programs involves putting information about implementation in headers. That leads to the fact that if I have a class for representing player state, and I change the implementation of the hit point calculation to cache something new, I have to recompile EVERYTHING that uses that class. So on top of incredibly difficult compilation, I have to recompile stuff unrelated to my changes all the time.

    Meanwhile, the other languages you mentioned (Go, Rust, D) don’t have either of these problems. D (2) and Rust both have generics which take more horsepower to compile than something like C, but they all have sensible grammars and module systems.

    Go, in particular, was designed to be fast to compile. In my experience, it’s been pretty successful. I haven’t made any large Rust programs, but the decent-sized D programs I’ve written have had negligible compile times.

    Incidentally, the fact that a language uses LLVM does not imply that its compile times will be on par with C++’s. Code generation may be comparable once the source is parsed and lowered to LLVM IR, but the uniquely expensive part of compiling C++ comes before the optimizers.
