Experienced Points: What Does the End of Moore’s Law Mean for Gaming?

By Shamus Posted Monday Aug 31, 2015

Filed under: Column | 122 comments

My column this week is described perfectly by its title. I always get nervous writing about hardware. I’m not a hardware guy, so I’m more likely to make factual blunders in that area.

I didn’t get into it in the column, but it’s sort of unfortunate the consoles launched when they did. They’re just barely (in Moore’s terms) short of the power needed to handle 60fps games and VR. Another eighteen months might have fixed that problem. Then again, nobody realized 60fps was going to be a big(ish) deal, and it would be suicide to show up to the market 18 months after the competition. You don’t want to launch a next-gen console into a market where everyone already has a next-gen console and several games. You either want to launch at about the same time and at roughly the same power level, or you want to launch several years later when you can have a nice technical advantage.

Or you can do what Nintendo does and put out an “under-powered” console and focus on gameplay instead of technology. But that’s crazy talk.

 


122 thoughts on “Experienced Points: What Does the End of Moore’s Law Mean for Gaming?”

  1. Tektotherriggen says:

    Are we likely to get a “generation-and-a-half” this time? The only difference being a faster graphics card, so that existing XB1 / PS4 games can run at full 1080p/60Hz.

    I’m vaguely aware that previous consoles shrank during the generation, as new versions were brought out, so it isn’t unprecedented.

    1. Shamus says:

      A mid-gen upgrade / revision strikes me as being very likely. More so for XB than Sony.

      1. Dreadjaws says:

        The only console I remember doing such a thing is the 3DS, which was re-launched with a more powerful processor, a couple more buttons and such. Other consoles have had upgrades, such as slimmer designs and larger hard drives, but I don’t remember any other console for which new games would suffer a drop in quality if you tried them on the older version.

          1. Dreadjaws says:

            It’s not the same thing. Those are not modified consoles, those are accessories or cartridges with extra hardware.

            1. Wide And Nerdy says:

              Same difference. They could have put that extra hardware in the console. The only advantage of the accessories is that they look good on existing consoles.

              1. Joe Informatico says:

                If you’re counting those, then Nintendo has been doing that since the NES era.

          2. Soylent Dave says:

            The Sega Megadrive / Genesis (from the same generation as the SNES) did something similar, beginning with expansion cartridges (again, just like SNES coprocessors).

            They then moved onto the Mega-CD / Sega-CD – a bolt-on peripheral that added a CD-drive, plus an additional (faster) CPU and more RAM.

            And finally the 32X, which was sort of a new graphics card (upgrading the console from 16-bit to 32-bit graphics).

            If you bought all the peripherals, your Megadrive / Genesis ended up as this hideous Frankenstein’s monster, complete with hardly any decent games that really took advantage of either one of the bolt-ons and only six that took advantage of both.

            That might explain why those sorts of upgrades didn’t catch on, in the end… as a developer, when you release a game on a console, you do it because you know that every owner of that console is a potential customer.

            Balkanising the player-base of a single console is risky business. It’ll be hard to pull off well (which isn’t to say that no-one will try).

          3. Tektotherriggen says:

            It’s fun to see how game distribution costs have plummeted. From cartridges that carried their own dedicated graphics processors, to digital downloads that probably only cost a few cents (if that).

        1. Volfram says:

          The N64 had a RAM pack that some games benefited from and others required. Each Game Boy generation from Color up to Advance had intermediary games which were playable on the older generation but had features to take advantage of the new one (like the Color dungeon in Link’s Awakening DX and the Advance room in the Oracle games. The former was only enabled if you were on a GBC, the latter could only be entered on a GBA).

          1. Dreadjaws says:

            But I’m not sure if you can call the Game Boy generations “upgrades” as much as “different consoles with backwards compatibility”. In any case, it seems that this is the kind of thing that only comes up with Nintendo’s portable consoles.

    2. Chefsbrian says:

      Perhaps, but the very first Xbox 360s are still capable of running the very end-of-generation 360 games, because the hardware is still practically identical, just reconfigured. If they made an actual, notable upgrade to this generation of machines, they’d have an actual performance difference between the early machines and the later ones, or god forbid stuff that only runs on the new machines. Neither of those will go very well. You’d also end up with troubles of trying to optimize the game for two different graphics chipsets, which starts to get into the PC trouble area of fighting multiple hardware configurations.

    3. Moridin says:

      I don’t think their CPUs can keep up with modern games at 60 FPS. They use a weak (relatively, anyway) Jaguar core design meant for netbooks AND they’re fairly limited, clockspeed-wise (only about 1.5 GHz). We already have plenty of games that struggle to hit 60 FPS with much more powerful CPU cores running at 3.5 GHz or higher.

      1. Tektotherriggen says:

        I’m assuming that resolution and frame rate are purely graphics card limited – it’s far more about pushing pixels to the screen, than about pushing more polygons to the graphics card. I could be wrong.

        If the CPU is the limit, I’m sure that the clock speed could be increased without changing anything about the architecture or instruction set. Perfect backwards and forwards compatibility (except for graphics speed) should be relatively easy.

        Is it obvious that I’ve never programmed a game? :-P

  2. Tektotherriggen says:

    Also (maybe this would be a Diecast question), why do we have console generations anyway? I can see that being “first” to release a console is an advantage, but how and why (other than industrial espionage) do different manufacturers release at about the same time, after many-year development times?

    As it is, gamers are forced to choose their “loyalty” when the consoles are released; but if different manufacturers released new consoles every two years or so (but only, say, every six years for each company), they might persuade people to buy “the new one” each and every release.

    1. Chefsbrian says:

      Basically market impact. If Sony came out with the PS4 two years before the One came out, you’d have a two year period where next generation gaming simply IS Sony. By the time Microsoft had gotten to market, the PS4 would be everywhere, people would already have larger libraries for it, adjusted to the controls, etc. You could make a gamble by spending those extra years making your hardware better, but that’s not going to make a huge difference, really, because for the initial years the other console is still going to be competitive, and it will hurt you greatly.

      In the end, I’d not be persuaded to buy a console every two years from the different players, because my Playbox whatever is still gonna be able to do what the Xgames thirty they just released can do for years, and by the time it is starting to fall behind, there’ll be a Playbox one-eighty on the market. Only the people who are incredibly deep into the hobby will feel the need for every console to own every small exclusive, the Mass market will buy one to play after work and be done with it.

      1. Tektotherriggen says:

        Except that 2 years before the PS4, then-gen gaming was the XBox 360, and 2 years before that, the Wii was everywhere, and so on. Your argument works for why the “generations” system is maintained, but not why it started in the first place.

        Data from Wikipedia; release dates are a range to account for different countries:

        Nintendo NES/Famicom released 1983-1987; Sega Master System 1985-1987

        SNES 1990-1992; Mega Drive/Genesis 1988-1990 (where the SNES wasn’t completely blocked despite being late)

        N64 1996-1997; Saturn 1994-1995; PlayStation 1994-1995 (with the Saturn probably doing worst of these, despite coming out just before the PS1)

        GameCube 2001-2002; Dreamcast 1998-1999; PS2 2000; XBox 2001-2002 (being early didn’t help the Dreamcast much)

        Wii 2006; PS3 2006-2007; XBox 360 2005-2006

        Wii U 2012; PS4 2013-2014; XBone 2013-2014

        I’ve kind of forgotten the point I was going to make with all that… The synchronisation is improving over time (except for the Wii U). But presumably, at some point in the late ’90s, someone had to rush their console development to catch up with the others, with all the risk and expense that involved. There must have been a very good, deliberate reason for that, given that in the 16-bit era, the delay between international releases was almost a full console generation itself.

        1. Thomas says:

          I don’t think graphics and hardware actually matter that much to people. They don’t want to be short-changed, they want what everyone else is having, but I don’t think they care enough to buy a PS4.5 if they already owned an Xbox One and had games for it.

          After a while of having old hardware though, an interest builds for games that do more and look better. As it becomes more and more obvious that there’s better hardware possible the demand to have new stuff grows.

          It’s not in the console manufacturers’ interests to release regularly: more expensive hardware means less profit, less optimised manufacturing and less optimised games (so getting less quality per pound of hardware). The developers and publishers don’t really care, but if they have to choose they’d prefer the hardware stayed standardised and old.

          So the game theory is, you want to wait as long as possible before releasing an upgraded console, but you want to release a ‘latest hardware’ console before your competitors do to take advantage of all that pent up desire.

          If you release a console two years after your competitor, there’s no desire for better hardware and the better price and games library of your competitor will drive you out of business. If you release one too early you make no money from it.

          So the optimal solution you eventually arrive at, is you and your competitor releasing consoles at roughly the same time. You get to share marketing and hype for the ‘new generation’, the developers like it, the publishers like it and you maximise the audience desire for a new console. If you try and rock the boat by releasing too early, you suffer decreased profits due to manufacturing problems and the audience caring less.

          1. Tektotherriggen says:

            Yeah, sharing the marketing hype does sound like a good strategy.

        2. Joe Informatico says:

          I suspect it’s largely a retroactive classification that’s now become the norm. My first console was an Atari 2600 over 30 years ago, I then owned an NES and a Genesis, and I can’t remember hearing the term “console generation” until the early or mid-2000s. In the past, usually a manufacturer with an inferior market position would release their new-tech console first, to try and gain a headstart on the market leader, e.g. in your list above, the “winner” of the last console generation never released their console first in the next generation*–and why wouldn’t they ride their market dominance as long as possible? It’s only in the past 10-15 years I can remember grouping these releases into generations.

          I’d be interested if anyone knows the actual history of the term “console generation”, or even “generation” in terms of any manufactured product (e.g. iPods, Volkswagens), but otherwise, my best guess is that as people who grew up with games, especially from the NES era onwards, became writers, journalists, academics, or tastemakers, some of them started chronicling the history of the medium. “Console Wars” have been a popular narrative of the medium since the NES/Master System days, so at some point, some writers must have grouped the competing consoles into “generations” for ease of reference.

    2. AncientSpark says:

      I imagine it’s because you need a substantial advantage to lure people away from the console that they already bought, such as very noticeable hardware upgrades or killer apps, and 2 years or a “half generation” isn’t long enough for companies to feel confident that they’ve generated that advantage. It has worked before (the PS2 successfully shutting out the Dreamcast, for example), but with so many big console makers being risk-averse, I imagine the prospect has become unfavorable for such companies.

  3. Dreadjaws says:

    “Or you can do what Nintendo does and put out an “under-powered” console and focus on gameplay instead of technology. But that's crazy talk.”

    Sarcasm aside, it might be. As much as I love Nintendo’s “fun-first” approach to gaming, there’s no denying their console is the least popular of this generation. Granted, this is in part due to their absolutely bonkers approach to marketing (there’s still a ridiculously high number of people who believe the Wii U is just a tablet accessory for the Wii), and also due to the fact that they keep alienating third-party developers, but the competition has managed to convince people that graphics quality is the important thing even though the previous gen proved the opposite.

    Also, I can’t get myself to care about 60 fps. I see the difference, I just don’t consider it valuable enough. I didn’t care for it in the previous gen, and I sure as hell don’t care about it in this one. I see some gamers claiming that 30 fps games are “unplayable”, and as much as I bash my head against the wall I really can’t figure out what’s wrong with those people.

    1. guy says:

      Nintendo has sort of gotten into a rhythm where everyone makes fun of their new hardware and it sells poorly and then they start putting out their exclusives and the hardware starts flying off the shelves. I expect that by a few months after they put out the new main Zelda game (not the three-player one) they’ll be selling quite well.

      1. Dreadjaws says:

        Yes, they do receive a boost in console sales every time a new first-party game is released. But, again, it’s not just “exclusives”, it’s “games made by Nintendo itself”. You don’t see the consoles flying off the shelves with third-party exclusives.

        The thing is that while Sony and Microsoft’s consoles tend to sell in a steady line, with only price drops giving them a boost, Nintendo’s home consoles (the portables are an entirely different story) tend to have many boosts at certain game launches and then large periods where they sell poorly.

        Following Shamus’s assertion that all hardware discussions need some silly car analogy, it’s like Sony and Microsoft ride faster cars, while Nintendo rides a slower one, but it’s able to use a nitro boost more often.

        Again, none of this has anything to do with the quality of their console and everything to do with the marketing and third-party support. Plus, they do have one major negative, and it’s that they still push for regional restriction (i.e. you can’t play Japanese games on American consoles and such), which is absolutely ridiculous in this day and age. That means that, effectively, you’d be paying for a console on which only about a third of the available games can be played.

        1. Wide And Nerdy says:

          It’s more like Nintendo made the car and trained their in-house driver on the exact performance capabilities of the car. Then they designed race tracks that the car can handle and look good running on. Meanwhile Microsoft and Sony bought a bunch of high-horsepower cars and let anybody take a turn driving them on any track, but with a brick on the gas pedal.

          Even that analogy doesn’t feel quite adequate.

          The thing is, Nintendo limits their graphics so that the games can run at a rock-solid 60fps. I think the only place they miscalculated was the gamepad. Rumors indicate they wanted to look into multiple gamepads, but the problem is the system didn’t have enough horsepower for that many screens.

          I’m really hoping the NX plays with something like that, a souped-up Wii U where people can connect more gamepads for amazing asymmetric play. Maybe the gamepad could function as a portable running a more 3DS-like version of the game when away from the base, then act as a streaming device when near the console. Or something.

      2. Thomas says:

        Except for the ‘hardware flying off the shelves’ part. In terms of consoles, the GameCube’s final sales were abysmal, the N64’s total sales were abysmal and even right now, with all the Wii U hype you feel around you, the actual sales of the Wii U are still abysmal. It’s competing with the GameCube.

        The Wii sold like hotcakes from release, and Nintendo handhelds always sell well, although I believe the 3DS was actually a real disappointment compared to their previous handhelds. The 3DS sold a third of what the DS sold, and half of what the Game Boy sold.

        Even the Wii wasn’t a world-beating success; it ended up being the 5th best-selling console behind the PS2 and the PS1, and it was only 15 million ahead of the PS3 and Xbox 360, compared to the Wii being 50 million behind the PS2.

        The PS4 has _already_ outsold the GameCube’s lifetime sales, and it’s sold two and a half times more units than the Wii U despite being on the market for a year less. The Xbox One has managed to outsell the Wii U’s lifetime sales.
        https://en.wikipedia.org/wiki/List_of_million-selling_game_consoles

        1. Wide And Nerdy says:

          They don’t have to though. They’ve already announced they’re running a profit on the console cycle. Just didn’t happen as fast as they would have liked.

    2. Retsam says:

      See, for me the Nintendo console’s wildly different specs and peripherals sort of remind me of Shamus’s criticism of the PS3: the idea that it might be an attempt at trying to get developers to create exclusive content for the Wii U by making it hard to develop cross-platform titles. Regardless of whether it’s their intention or not, it’s certainly the effect, and I really don’t think it’s a good strategy.

      They’ll do alright (unlike the PS3) on account of their fanbase and quality of their first-party titles… but that doesn’t really mean that it’s a good strategy.

    3. Jonn says:

      “I see some gamers claiming that 30 fps games are ‘unplayable’, and as much as I bash my head against the wall I really can’t figure out what’s wrong with those people.”

      If you want to use aggressive phrasing, then the reason is simple: it isn’t anything wrong with them, but with you.

      Or in terms appropriate for discussion, it’s down to what games you play and how important response time is. A visually fast-paced game like a first person shooter needs to be responsive, or you can’t aim accurately – not just because you have less information to work from, but because your input is delayed as well. When the game only updates 30 times per second, you are waiting far longer for what you ‘do’ (like move forward or aim slightly left) to happen than if you have a reasonable framerate.

      Simple example, imagine your mouse only sends information about where it moves and what buttons are pressed 30 times per second. Your cursor will judder, stutter and skip instead of moving smoothly – or will have some sort of smoothing or acceleration applied, meaning the movement is not accurate. Sure, you can make do with it, and find ways to reduce the issue, but you can also get a mouse that isn’t crippled by not updating properly.
      Obviously how important that will be varies with what you are doing.

      The objective fact is that a higher framerate means more responsive gameplay, so mesh that with the style of game you play compared to what others play.
      Should give a fair answer to why they aren’t happy.
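
      For concreteness, here is a rough back-of-the-envelope sketch of the latency arithmetic being described. The pipeline depth here is an assumption for illustration (real engines and displays vary), not a figure taken from the comment:

      ```cpp
      #include <cstdio>

      int main() {
          // Illustrative numbers only: assume input is sampled once per frame and
          // the result reaches the screen a couple of frames later (render plus
          // display buffering). The exact depth varies by engine and monitor.
          const int pipeline_frames = 2;      // assumed, for illustration
          const double rates[] = {30.0, 60.0};

          for (double fps : rates) {
              double frame_ms = 1000.0 / fps;
              // Worst case: the input lands just after a sample point, so it waits
              // almost a full frame before it is even read.
              double worst_ms = frame_ms * (1 + pipeline_frames);
              std::printf("%2.0f fps: frame %.1f ms, rough worst-case input lag %.1f ms\n",
                          fps, frame_ms, worst_ms);
          }
      }
      ```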

      1. Jonn says:

        There’s also the purely visual side, and despite the claims of people who never actually try it, there is a very real visual advantage to higher refresh rates. Different argument, different reasons.

        Hopefully no one reading this believes the dishonest “people only see about ten images per second” guff.

        1. Bryan says:

          There’s also what happens when, every so often for whatever reason, the game can’t quite get a frame rendered in the time required to get it out onto the display.

          When the normal rate is 30fps, one missed deadline means the game drops down to 15fps once, then jumps back to 30. When the normal rate is 60, it means the game drops down to 30 once, then jumps back to 60.

          Whether or not you notice the difference depends on how far off from normal it is and how this gets assigned a number, how tuned you are to inter-frame latency, and maybe a couple other things I’m not thinking of. If the observer has some kind of floor on noticeable framerates (as in, they can notice the difference between frames if they’re over some fixed time from the previous frame) then more people are going to notice a 30->15 drop than a 60->30 drop.

          (In particular, ~none of the people that play at 30fps normally would notice the 60->30 drop. Large chunks of them would notice the drop to 15, though. Large chunks of the people that play at 60 would probably also notice the 30->15 drop. So by the metric of “more people would notice it”, at least, 30->15 seems worse than 60->30…)

          Basically, being able to get more frames onto the screen in a given time ends up turning into a safety buffer for engine stuttering. The bigger that buffer is, without any added cost of making it bigger, the better off you are — except that added cost is always nonzero. :-/

          1. Jonn says:

            Thought it best to avoid that much complexity. Might be worth pointing out to others that you’re referring to some form of v-sync, which is either simple – lock game/program framerate to screen refresh rate, halve for sudden drops – or one of the complex versions not worth trying to cover in a comment.

            There’s progress on a cure for that, with several approaches to adaptive sync, but they mostly require a monitor that specifically supports it. Better off searching for a proper review if you want to know more.

            On the plus side, monitors with 120+ Hz refresh rate are becoming affordable, so it may not be too unreasonable to upgrade in a few years time.

      2. Peter H. Coffin says:

        I admit to having a tough time following the justification that 60 FPS makes so much difference when 95% of people have reaction time (measured, analyzed statistically, and plotted on a curve) of no better than 0.2 seconds (or 6 frames at 30 FPS) WHEN they’re ready and prepared for the testing. (Curve shown here: http://www.humanbenchmark.com/tests/reactiontime ). It just takes that long to react to an event and push a button that your finger’s already on, so I’m not at all clear how that gets improved by losing 12 frames to nerve speed and muscle twitch instead of only 6 frames when it’s the same amount of time.

        1. Jonn says:

          There are some problems with that assessment, especially using that linked site as strong evidence. It is pointed out that there will be input delay on most systems, and then a rough guess stands in for an answer to how much. The main response delay issue for people is not a singular ‘see bullseye then click’, rather it is things like general moving around or waving the cross-hairs over a moving target; things that need repeated adjustment, where speed is important.

          Slowing one click may not be so bad, however slowing every minor adjustment really adds up.

          Aside from which, among all/most people, 5% is not that small a number (consider how many people 5% of the global population is). Narrow it to enthusiast gamers who focus on fast-paced games, and that % will be larger still.

          The following is not intended as an accusation, but stems from frustration at people who can’t be bothered trying things, instead ignorantly claiming them worthless:
          It’s fine to have no interest in such games, but entirely unreasonable to claim the delay is irrelevant.

          No offence directed to anyone respectfully noting a different experience. If you can’t accept people having a different range of abilities than yours, though, perhaps you should stay off the discussion pages.

  4. Volfram says:

    I admit I’m not really consistent with reading the Experience Points columns, but honestly, this is the first one that I think has more clickbait than fact in it.

    On console delays: I actually suggested back in 2007 that Sony would have been wise to delay the PS3 by a year instead of trying to beat the Wii to market. It was clearly the weakest of the 3 consoles at launch, and all of the “killer apps” didn’t show up until a year later. Being late to market? Fairly simple to deal with: “You’ve seen the rest, now it’s time for the best.” Or something similar that doesn’t sound like it’s straight out of the 90s.

    Now as for threading, the main problems with threading are that most programmers don’t think concurrently, and most languages are horrific for writing parallel processes in. Most circuit designers, however, do think concurrently (an integrated circuit is in effect “multi-threaded” by nature. Each trace on the board and each transistor behaves as a separate thread.), and there are languages that make concurrent coding easier (OpenCL, CUDA, GLSL, …D), and they’re going to improve, and there will be more of them.

    The example you gave for why parallel processing is a terrible idea for most games is actually a counterexample. You know what *really* benefits from parallel processing? Physics engines. Because a physics engine, at its core, is calculating the interactions of several (often thousands of) individual particles at a time, and each of those particles depends, classically, on the state of the particles around it *now*, not later. In an idealized world, the behaviors of the particles in the present are independent of each other, and that’s a problem which is very easy to parallelize.

    You know what else benefits greatly from massive parallelization? AI. Once again, AI is the actions of an individual based on the state of the game now and what it has been in the past. Again, in an idealized world, the order in which actions are calculated is irrelevant, and consequently, mass AI can be easily parallelized.

    A 32-core CPU could potentially produce massive boosts in system performance. Your programmers just need to design your software to take advantage of as many separate processes as the hardware can handle, without making any assumptions as to how many processes that will be.
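
    As a concrete illustration of the pattern being described (read the previous state, write the next one, and discover the core count at runtime instead of assuming it), here is a minimal C++ sketch. It only integrates velocities, with no inter-particle forces, so the per-particle work really is independent; the names and structure are invented for the example:

    ```cpp
    #include <algorithm>
    #include <cstddef>
    #include <thread>
    #include <vector>

    struct Particle { float x, y, vx, vy; };

    // Every worker reads only from `prev` and writes only to its own slice of
    // `next`, so the per-particle updates stay independent, and the number of
    // workers is discovered at runtime rather than assumed.
    void step(const std::vector<Particle>& prev, std::vector<Particle>& next, float dt) {
        unsigned workers = std::max(1u, std::thread::hardware_concurrency());
        std::size_t chunk = (prev.size() + workers - 1) / workers;
        std::vector<std::thread> pool;

        for (unsigned w = 0; w < workers; ++w) {
            std::size_t begin = w * chunk;
            std::size_t end = std::min(prev.size(), begin + chunk);
            pool.emplace_back([&, begin, end] {
                for (std::size_t i = begin; i < end; ++i) {
                    const Particle& p = prev[i];
                    next[i] = {p.x + p.vx * dt, p.y + p.vy * dt, p.vx, p.vy};
                }
            });
        }
        for (auto& t : pool) t.join();  // everyone finishes before the new state is used
    }
    ```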

    1. guy says:

      Unlike circuit design, parallel software operations may complete at random intervals in random orders. If not explicitly synchronized at the cost of some parallelism, it is entirely possible for a process to start and run halfway, then pause while another process starts and runs to completion. While physics calculations can run on the prior state, it would end very badly if they run on halfway between the prior state and the new state some but not all of the time. Preventing this is not a trivial exercise.

      1. nm says:

        I don’t know if it’s still done this way, but when I was in school we designed operations to complete before the next synchronization point (clock tick) and called it good. While it’s possible to design some operations to complete in time for others in hardware, it’s way easier to calculate a worst case from the switching times on your transistors and set the clock speed appropriately.

        Software’s easier in this respect, because you can guarantee that the gewgaw calculator doesn’t get any input until the input’s ready by having it read from a queue of completed pre-gewgaws. Of course, that doesn’t imply that you’ll have your gewgaws on time.
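
        A minimal sketch of that “read from a queue of completed work” idea in C++ (generic names, nothing gewgaw-specific): producers push finished items, and the consumer blocks until something is actually ready.

        ```cpp
        #include <condition_variable>
        #include <mutex>
        #include <queue>

        template <typename T>
        class WorkQueue {
        public:
            void push(T item) {                 // called by producers with completed work
                {
                    std::lock_guard<std::mutex> lock(m_);
                    q_.push(std::move(item));
                }
                ready_.notify_one();
            }

            T pop() {                           // blocks until an item is available
                std::unique_lock<std::mutex> lock(m_);
                ready_.wait(lock, [this] { return !q_.empty(); });
                T item = std::move(q_.front());
                q_.pop();
                return item;
            }

        private:
            std::mutex m_;
            std::condition_variable ready_;
            std::queue<T> q_;
        };
        ```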

        1. guy says:

          Hardware is generally synchronized by a unified clock tick, yeah. Software can block, but if you block on everything you have a very high-overhead serial process, so picking what to block on and how is a very important design decision.

          1. nm says:

            Sure, but if you block at your fan-in point with something like select, you’re essentially blocking on “ready for the next frame” or its domain-appropriate analog.

            1. Volfram says:

              I don’t know why you would do it any other way, honestly. That’d be looking for a bad time.

              1. mhoff12358 says:

                Because that’s the way you get the best performance gains? Imagine a game broken up simply into a world update thread and a rendering thread. One cycle for either of these threads can take a notably varying amount of time. When there are lots of objects on screen, the render thread takes significantly longer than it otherwise might, and when the player causes an explosion and a bunch of physics objects need to move, the update thread takes longer than it otherwise would.

                You can’t have (and don’t really want) either of the threads block waiting for the other to be ready. The better (but more complicated) solution involves both threads working independently but sharing some sort of state resource. But then that needs to have smart synchronization to make sure that it doesn’t have any race conditions.
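
                One common way to get that sharing without either thread stalling for a whole frame is to publish complete snapshots: the simulation hands over a finished copy of whatever the renderer needs, and the renderer always draws the newest one it can see. A minimal sketch, with the Snapshot contents as placeholders:

                ```cpp
                #include <memory>
                #include <mutex>
                #include <vector>

                struct Snapshot { std::vector<float> positions; /* whatever the renderer needs */ };

                // The simulation publishes a complete snapshot when a tick finishes; the
                // renderer grabs whichever snapshot is newest. Neither side waits for the
                // other to finish a frame; the lock is only held for a pointer copy.
                class SharedState {
                public:
                    void publish(std::shared_ptr<const Snapshot> s) {
                        std::lock_guard<std::mutex> lock(m_);
                        latest_ = std::move(s);
                    }
                    std::shared_ptr<const Snapshot> latest() const {
                        std::lock_guard<std::mutex> lock(m_);
                        return latest_;            // renderer draws from this copy
                    }
                private:
                    mutable std::mutex m_;
                    std::shared_ptr<const Snapshot> latest_;
                };
                ```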

                1. Volfram says:

                  performance gains, perhaps, but at the cost of inconsistent program behavior, which is the “bad time” I was talking about earlier.

                  I mean, I understand. I tried writing my game so that it would simply record the amount of time spent between frame updates and calculate for that, but as it turns out, you don’t want to do that, even if your game is single-threaded, and especially not if you want to be able to do network multiplayer or replay saving.

                  Desynchronize the draw and process threads? Sure, that’s a great way to get your game to run at whatever speed you want(and you’ll mostly be doing asynchronous reads, which are perfectly safe). Desynchronize the process threads from each other? DON’T DO IT!

                  1. mhoff12358 says:

                    Here’s a pretty straightforward case where having rendering and processing wait for each other might not be the wisest:

                    If you’ve got a particle simulation doing lots of physics calculations, then as you talked about, it’s got many threads working to do the physics updates for each particle. One solution for when to do the rendering would be at the end of the simulation, but there’s a possibly faster solution: as each particle finishes its update, pass it off to the rendering code and begin making calls to the GPU as parts of the state get finalized. That way the CPU and GPU are more fully utilized throughout the entire process. Due to pipelining, doing one then the other might have a similar throughput in the good cases, but having less scheduled blocking makes the system more robust against unexpected performance issues in one part of the code.

                    And things get more complicated (with more opportunities for optimization) as the process gets more complex. If there’s a wall being rendered with bullet holes, you need to wait on both the bullet hole placement logic and the positioning of the wall itself. Do you hold off rendering until you’ve got the player’s shot logic resolved in case they put a new bullet hole in a wall? Or do you start by rendering everything that is ready to go and that cannot get bullet holes placed in it?

                    When doing hardware, computation is done in relatively small chunks so doing something like synchronizing to a clock is feasible, but when designing something like a game with much larger (in terms of total time spent computing) chunks there’s a lot more opportunities for improvement lost by just throwing up your hands and blocking to make the timing easier.

                    1. Volfram says:

                      Bro, do you even read? I JUST SAID “Desynchronize the draw and process threads? Sure, that's a great way to get your game to run at whatever speed you want(and you'll mostly be doing asynchronous reads, which are perfectly safe).” OF COURSE you’re going to desynchronize your draw and logic threads, and keeping them from interfering is as easy as double-buffering your draw objects.

                      Don’t desynchronize your logic threads from each other. Just like with hardware, make sure all the processing chunks are small enough to be calculated within your selected tick interval.

                      Either read what I’m actually saying or stop being deliberately obtuse. You’re being more sensationalist than Shamus’s article right now.

                2. nm says:

                  Sure, you might have separate rendering and simulating threads, but you don’t need to do loads of intricate locking magic to have synchronization points and consistent state. You just need to have defined sequence points where ownership of state from the simulation passes to the renderer. It can be copied or locked, whichever is better for the use case.

                  My point is that this problem has already been solved. Massively multicore is probably where things are headed because we’ve run out of ways to make hardware just run faster. GPUs are already there, and while Larrabee failed, I suspect we’ll be seeing more like it in the near future. Maybe I’m just used to hardware people telling software people to suck it up and use what we’re given.

                  1. mhoff12358 says:

                    Yes, when you’ve got just those two threads you can do synchronization to manage access to a shared resource. And although doing that correctly in a way that doesn’t impact performance has been solved already, that’s only really with two threads. Expanding this idea to take advantage of 4 cores, much less 32, is harder, as you have to break the logic down into smaller parts that have more complex dependencies.

                    Writing good parallelized game code means ensuring that each component is able to keep working even when another component has a hiccup, and the synchronization for that gets more and more complex as more and more components get introduced. Even if it’s possible to get away with just blocking and waiting, that’s not the best solution. And when you’re discussing ways to squeeze every last bit of processing power out of a computer, dismissing a faster but more complex solution as “looking for a bad time” is silly; solving those sorts of murky problems is the point of computer science.

                    1. nm says:

                      I disagree. Scaling to 32 or 64 cores for computationally intensive tasks doesn’t add as much extra complexity as you might think. There are a few different ways of going about it, and which one is the right one depends on the problem, but hardware techniques like pipelining and modern CS techniques like having no shared state or using transactional memory can get you pretty far. Games in particular usually have obvious sequence points where the simulation clock ticks.

                    2. Volfram says:

                      There aren’t any murky problems to solve; they’ve been solved already.

                      I honestly don’t know what system you’re trying to promote (it doesn’t make sense), but the one you’re trying to criticise is a straw man. The one I’m talking about is the standard model for game development.

        2. Volfram says:

          It was when I was still in school. For my last project, I remember that I spent some time coming up with (finding) a divide circuit that could work without a clock, because I wanted it to run fast enough that it would work between frame updates on my LCD panel.

          This turned out to be a really good thing when the circuit I found took up most of the space on my FPGA, and I ended up having to put it on a multiplexer to get digits to display properly. I believe the design went from taking up 160% of my board before the MUX to 40% after.

      2. Volfram says:

        Logic levels in circuits often complete in random intervals and at random times. About half of circuit design is making sure your clock tick is long enough for all the levels to normalize before they’re read, and the other half is making sure the levels will normalize fast enough that they’re ready in time for the next clock tick.

        This is, by the way, the real reason why overclocking is a bad idea. The heat may damage your system, yes, but it’s the computational errors due to reading values that are still in flux that cause glitches.

        1. guy says:

          There is a degree of variance, but we’re not talking +/- several hundred cycles. That is fairly typical in software parallelization, as four cores are shared between up to a hundred processes. Heck, I’m typing this with Firefox, Skype, and Task Manager as my only open programs and I’ve got 81 processes running, split into 1227 threads. Hardware gets to set a clock cycle and guarantee that every segment takes a clearly defined number of clock cycles; that is the entire point of the clock.

          1. Volfram says:

            How many cycles does your CPU run in 1/60 of a second? Because that’s the tick time we have to work with here, and it turns out that in computational terms, it’s an eternity.

            1. guy says:

              Plenty to run a single thread several times and another thread never. The only time guarantee you’re likely to get is “finite time”.

              1. Volfram says:

                So what you’re saying is that people shouldn’t be writing multi-threaded software? That is, to be extremely blunt, moronic. Software should be written to take advantage of the resources given to it by the hardware. To do otherwise is to deliberately cripple yourself.

                The clock ticks in hardware are the same thing as the update ticks in software. The un-clocked signals can be considered the same as the un-synchronized threads running at-speed. In hardware, you set your clock tick long enough that all of the un-clocked signals will normalize before they’re read, and you ensure your un-clocked signals will run fast enough to be normalized by the next clock tick.* In software, you set your update tick long enough for all the threads to conclude, and you ensure that your threads will conclude fast enough to be ready for the next clock tick. It’s not an approximation or a faulty example. It’s an exact, 1:1 correspondence.

                This both maximizes the system resources your program is able to use and eliminates the problems of thread collisions. Your arguments thus far have assumed that it is either impossible or impractical to ensure a thread has concluded before trying to interact with the data it delivers. Having written programs where one thread free-runs while another is forced onto a fixed clock**, I can say definitively that this is, quite simply, extremely untrue.

                *this is why you don’t use ripple-counters, by the way. They take ages to normalize and it’s hard, if not impossible to tell when that is.

                **fluid simulation. The graphics updates were run between 15 and 60 times per second. The simulation updates were run in excess of 1000 times per second, and never suffered a threading-based delay or collision. I was also able to offload the simulation to the GPU and run it over 10,000 times per second, with the same level of safety and stability.
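
                A minimal sketch of the fixed update tick being described, assuming the work reliably finishes inside one tick (the software analogue of signals settling before the clock edge); the names are invented for the example:

                ```cpp
                #include <atomic>
                #include <chrono>
                #include <thread>

                // Do the work for one tick, then sleep until the next tick boundary.
                // If update() always completes within one period, every tick starts
                // from a fully settled state.
                void run_fixed_tick(const std::atomic<bool>& running,
                                    void (*update)(),
                                    int ticks_per_second) {
                    using clock = std::chrono::steady_clock;
                    const auto period = std::chrono::duration_cast<clock::duration>(
                        std::chrono::duration<double>(1.0 / ticks_per_second));

                    auto next = clock::now() + period;
                    while (running.load()) {
                        update();                          // must finish within one period
                        std::this_thread::sleep_until(next);
                        next += period;
                    }
                }
                ```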

                1. guy says:

                  The key difference is that a software designer does not get to ensure that all their threads complete in an update tick. The operating system decides which threads run when and for how long, and is selecting from a large number of threads written by unrelated people. If you run thirty fluid simulations on that same hardware simultaneously, do you still get 1000 updates per second in each? If you do not, it does not have a guaranteed-length update tick.

                  1. Volfram says:

                    On a 32-core system, yes, I would get 1000 updates per second. On an 8-core system like the one I have, I would expect to see a reduction proportionate to how many threads have to be shuffled to each individual core.

                    To avoid arguing over hypotheticals, I actually hacked up a test case. I don’t have an image server, so you’ll just have to trust that I’m copying the output accurately.

                    Keep in mind that each instance has a main loop running at 60 updates per second, and the thread sleeps between when it finishes a tick’s processing and when the next tick needs to start. Currently about 95-97% of the time.

                    1 thread running a wavepool simulation: 1020 updates per second
                    4 threads running wavepool simulations: 980 updates per second (each thread, so a total of almost 4k updates per second)
                    8 threads running wavepool simulations: about 810-820 updates per second each, average (varying between 760 and 850), host program responsiveness is still nominal.
                    16 threads running wavepool simulations: 910 updates per second each, average, host program responsiveness is still nominal (though the duty cycle is significantly increased from before). Note that I expected 400 updates per second on this test, since each logical core now has to run 2 wavepools in addition to the rest of the system.
                    32 threads running wavepool simulations: 500-700 updates per second each, and the host thread performance has degraded slightly but noticeably, and my CPU fan has sped up from idle levels. The wavepools themselves are still running more than twice as fast as I expected.

                    I could kick it up to 64 threads, if you’d like, though since the host thread would then have to share a core with at least 8 other threads, I predict performance would be notably decreased.

                    Your concerns over thread starvation are valid, in theory, but in theory I also have unlimited computing power to throw at the problem. In practice, it doesn’t seem to be an issue. This is why we test.

                    I’d like to note: the performance output was a GUI text object which asynchronously asked each thread how many iterations had run since the last update about once per second.

                    [edit]So I just did a test where 1 thread ran un-throttled, and apparently the wave simulation can push upwards of 10,000 updates per second running on the CPU. To be fair, I wouldn’t try to run several threads like this, because then I’d definitely encounter starvation, but I also remember from earlier testing that the GPU is still 10x as fast.
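
                    For reference, the kind of asynchronous counter read described here can be done with per-thread atomics, roughly like this (a sketch with invented names, not the actual test code):

                    ```cpp
                    #include <atomic>
                    #include <cstddef>
                    #include <cstdint>
                    #include <vector>

                    // Each worker bumps its own counter once per iteration; a monitor
                    // thread reads them about once a second without stopping anyone.
                    // Relaxed ordering is fine because only a rough rate is wanted.
                    struct Counters {
                        explicit Counters(std::size_t n) : iters(n) {}

                        std::vector<std::atomic<std::uint64_t>> iters;

                        void bump(std::size_t worker) {
                            iters[worker].fetch_add(1, std::memory_order_relaxed);
                        }
                        std::uint64_t read(std::size_t worker) const {
                            return iters[worker].load(std::memory_order_relaxed);
                        }
                    };
                    ```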

                    1. guy says:

                      Okay, now your tick rate is half of what it was designed to be. If you are dependent on the expected tick rate of 1000 per second for synchronization, this will result in errors, much like if you took a piece of hardware and doubled the clock rate. If it has not resulted in errors, you are not dependent on the tick rate for synchronization and are instead using a software synchronization construct. I suppose we can’t properly verify it without pushing beyond the 60-per-second limit, but if you’re using a standard library that’ll probably be fine too, just slow.

                      In practice, thread starvation tends to happen on end user machines running processes from multiple creators. For instance, when I’m playing a game and also watching a Flash video, the video has a nasty habit of crashing and dropping the game to a frame every several seconds or less for a period of several minutes. If a program is dependent on time-based synchronization and thread starvation occurs, it will have unpredictable failures some of the time, which are very difficult to root out in testing.

                    2. Volfram says:

                      “Okay, now your tick rate is half of what it was designed to be.”

                      Sure, let’s just keep moving the goalposts to support our own position.

                      Yes, my tick rate is now half of what it started as. It’s also twice what I predicted(and therefore what I would have designed it to be). Run the same test on a single-threaded instance and tell me if you’re able to keep the same tick rate.

                      No. You aren’t. For any metrics you use to try and fail my test case, your system will fail worse. But unlike you, I’m at least competent enough to write the code and prove it, instead of belittling work I could never duplicate in a field that I only pretend to understand.

                      Note, also, that my tick rate doesn’t have to have anything to do with my frame rate, which itself was a solid 60 frames per second throughout the test.

                    3. guy says:

                      No, a single-threaded system would not keep the same tick rate. That is not the point. A single-threaded system does not have to synchronize with anything and will not have errors at any tick rate. If you were correct in believing that your program is dependent on the tick rate for synchronization, dropping the tick rate would have resulted in synchronization errors. If you wish to demonstrate that you are correct, drop the tick rate until it begins producing inaccurate outputs. You will also have proven that a random end-user’s uncontrollable demands can break your program, which is precisely why software uses synchronization constructs.

                    4. guy says:

                      In practice, I do have my browser randomly and arbitrarily starve every other process, so this does come up when people aren’t deliberately trying to break something. It’s rare, which translates to being very hard to diagnose, and the effects depend on the exact order, making it worse.

                      I suspect in your case the worst you can do is make the iteration counter clearly off by one, because the count is updated either before or after the work is done and the watcher can check between the work being done and the counter incrementing. That’s technically a race condition but not an important one.

                    5. Volfram says:

                      “If you were correct in believing that your program is dependent on the tick rate for synchronization”

                      When did I EVER say that my program was dependent on tick rate for synchronization? I never said that. This is you making assumptions again. Why would I rely on the tick rate for synchronization? That’s stupid. It’s asking for errors.

                      You use sync locks to ensure that every synch-critical task returns before you do anything with that information. You set the tick rate to a value such that every process is guaranteed (or at least reasonably expected) to return within that amount of time, and then you design your threads so that they all will. Just like on a hardware circuit.

                      Ensuring each thread has returned safely before executing the information within can be done either via flags or message passing, in which the host thread pauses until it’s received updates from all subthreads. In these cases, the threaded environment has no worse stability than a single-threaded environment, and still runs up to N times as fast, where N is the number of cores the system makes available.

                      For example, in this case, I now know that on this system, the wave simulator will run single-threaded at up to 10 kHz. I want to target a system with at least 4 cores and at least 1/4 the clock speed that this system has, so I set the tick rate to 100 Hz, and the thread count to 64. Then I use message passing to keep all of the logic threads synchronized*, and now I know the system will be stable and responsive on anything with a clock speed of at least 1.2 GHz and at least 4 cores to run on.

                      *You don’t actually have to synchronize the draw thread with anything. All it has to do is go in 60 times per second, check what the state of all drawable objects in the system is, and draw whatever it sees.
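
                      A sketch of that “host pauses until every subthread has reported back” pattern, using C++20’s std::latch purely as an illustration (flags or a message queue would do the same job); the names are invented for the example:

                      ```cpp
                      #include <latch>
                      #include <thread>
                      #include <vector>

                      // One latch per tick: each worker counts down when its slice is
                      // done, and the host blocks on wait() before touching the results.
                      void run_one_tick(const std::vector<void (*)()>& jobs) {
                          std::latch done(static_cast<std::ptrdiff_t>(jobs.size()));
                          std::vector<std::jthread> workers;

                          for (auto job : jobs) {
                              workers.emplace_back([job, &done] {
                                  job();                 // this worker's share of the tick
                                  done.count_down();     // report completion to the host
                              });
                          }
                          done.wait();                   // host: safe to read the results now
                      }                                  // jthreads join on destruction
                      ```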

                    6. guy says:

                      You said the tick rate is 1:1 analogous to a hardware clock rate. The hardware clock is used for synchronization. Therefore, you have said the tick rate is used for synchronization.

    2. Xeorm says:

      While I think you make good points that both physics and AI benefit plenty from parallelization, your last bit is where a big problem with gaming software design shows up as it relates to the hardware. Gaming cares massively about the worst case scenarios, so programming without a care for how many processes the hardware can handle leads to major problems. If a computer can’t handle the physics and AI calculations in time, that computer can’t play the game.

      This isn’t too bad with graphics usually, because of the wealth of settings that can be turned off to ease the load on the hardware. But physics? That often matters to game state, and needs to be calculated on time. Same to AI.

      Which is part of why there hasn’t been a huge push from gaming towards more cores, the same as with graphics in the past. A gaming company has to aim for a specific core minimum, which will usually be set by the lowest common denominator.

      That and reducing complexity is invaluable for decreasing the time it takes to program the entire thing. Games software development often has pretty set deadlines. Even if something can be done better, it may take too much time to properly make, and the customers tolerate only so many bugs. They’re also noticeable, in ways that a working AI isn’t.

      1. Volfram says:

        I said it can’t make assumptions about how many cores it has, not that it doesn’t care how many cores it has. If you have 15 tasks to do, and they can be done concurrently, but you only have 4 cores to run them on, then you run 4 sequentially on each core, with one core only getting 3 tasks. This really isn’t a difficult thing to do by any measure. Java, I know for sure, and I assume most programming languages, have tools for it built into their threading libraries.

        Massive parallelization is becoming more common now, and I expect it to be ubiquitous 10 years from now, though possibly as few as 5.

        But then, everyone still programs in 32 bits when 64 bit processing is available, so who knows.

        1. Jamie Pate says:

          I suspect now that the consoles are on board with 64-bit (they are, right? they have more than 4 GB of RAM?) that 64-bit will be utilized. Like I mentioned below, momentum is HUGE since these companies sink millions into the engines; they can’t turn on a dime and radically change their engines until way after it’s mainstream.

          1. Volfram says:

            I know part of it for me is that the Digital Mars (the “official” one) compiler for D only recently added 64-bit support, partly because the linker they were using couldn’t do 64-bit. I don’t know if they got a new linker or finally rewrote the one they were using. I can’t really prioritize checking whether my game compiles in 64 bits right now, though, so… 32 bits it is.

            But you’re right, and hopefully 64-bit software will become more common in the coming years.

          2. Blake says:

            Yes the compilers for the current gen consoles are absolutely 64-bit, you cannot compile a 32-bit game on PS4 or Xbox One (heck, on Xbox One you can’t even get a VirtualAlloc in the 32-bit range).

            For the curious, the PS4 compiler is a version of Clang, the Xbox One uses Microsoft Visual C++, pretty much the same as ships with current versions of Visual Studio.

        2. guy says:

          I don’t think you understand software parallelization; you seem to be more of a hardware guy.

          The number of physical cores is irrelevant to the software design except that more cores make parallelization better. If you have fifteen threads (and the OS schedules by thread rather than by process) and four cores, it is actually fairly unlikely that you’ll get an even split. There cannot be any programming language that does that, because core scheduling is the exclusive domain of the kernel. Each thread will get scheduled onto open cores in a random order for some interval or until it blocks for a disk access or something. Since your game probably isn’t the only running program, it likely won’t have all four cores all of the time. In the extreme case, all fifteen jobs will happen on one core. So you could write a program that scales to any number of cores and also runs on a single core. However, for performance reasons you usually don’t want to have vastly more threads than cores, because there is a performance cost to managing them.

          Now, here’s what’s really fun: what if a thread doesn’t arrive at the fan-in before every other thread reaches the fan-out? Specifically, what if it announced that it’s finished writing and then stopped? That’s fairly manageable if the number of threads is known at design time, but very difficult to distinguish from not having that thread at all if the number is unknown.

          1. Volfram says:

            I don’t think you understand software parallelization either, you seem to be more of an armchair researcher. Oh look, I can make ad-hominem accusations in ignorance, too!

            Duh, any threads that can’t be run concurrently will be set to run synchronously. That’s how multi-tasking works on a single-core system.

            Frankly, you sound like me when I was 19 and was parroting what people smarter than me(who also didn’t know any better, as it turns out) were saying, before I started doing my own research.

          2. Richard says:

            Scaling worker threads to the available cores is a long-solved problem. There are a great many off-the-shelf toolkits that handle all of this for you!

            As long as your workload can divide into more pieces than the scheduler thinks is the ‘optimum’ number of threads, the programmer neither knows nor cares.

            Look up “Map-Reduce”.

            I have personal experience of this – about two years ago I wrote a multi-threaded data processing application using the QtConcurrent toolkit.
            It maxed out all 4 CPU cores and remained responsive during the processing, and took X minutes to complete. (Excluding finally saving the result to disk.)

            Last year I got a new laptop that had double the CPU cores at the same nominal clock speed. I ran the application on it, and it again maxed out all 8 CPU cores – and took slightly under X/2 minutes to complete.
            Presumably the cores or memory bus is also slightly faster.

            I’d made no changes whatsoever to the application – it was the exact same binary – yet it took half the time.

            I have no doubt that a 16-core CPU would roughly halve the run time again.
            This particular application processes ~10,000 mostly-independent pieces of work, so will keep going faster until it reaches the memory-bandwidth limit.

            Obviously long before that point the final serialisation to disk will mean it’s not worth adding any more cores. Many programs don’t need to serialise much data, so this limit won’t apply to them.
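
            For anyone curious what that shape looks like outside Qt, the C++17 standard library offers the same map-reduce pattern through std::transform_reduce with a parallel execution policy. The sketch below is only an analogue under that assumption (and depending on the toolchain it may need an extra library such as TBB linked in), not Richard’s actual QtConcurrent code:

            ```cpp
            #include <execution>
            #include <numeric>
            #include <vector>

            // Hypothetical stand-in for the independent per-item work (the "map").
            double process(double item) { return item * item; }

            // The sum is the "reduce"; the library decides how many worker threads
            // to use for whatever hardware it finds itself on.
            double run_all(const std::vector<double>& work) {
                return std::transform_reduce(std::execution::par,
                                             work.begin(), work.end(),
                                             0.0,                 // initial value
                                             std::plus<>(),       // reduce
                                             process);            // map
            }
            ```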

            1. guy says:

              Map-reduce is a much easier problem, because you don’t have to worry about threads writing back into your input. The only thing that needs to be synchronized there is making sure you don’t have two threads try to process the same piece of data, which is trivially resolved with a single semaphore.

              A physics engine changes the game state it is referring to and would need to ensure no thread writes out before every thread has finished reading, or it is possible that a thread which starts late or one that finishes and loops to examine more of the state will calculate based on a nonsensical, half-updated game state. Making sure no thread writes mid-read is also relatively easy, but the mechanisms for doing that are poor at distinguishing between “finished reading” and “has not started reading”. Thus, there are readily-available libraries for multithreaded map-reduce and game engines tend to be single-threaded, except for the parts which are like map-reduce and get sent to the graphics card to be massively parallel.
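
              As a rough illustration of that “nobody writes until everybody has finished reading” rule, here is a hypothetical sketch using a C++20 barrier as the phase separator (the worker count, state layout, and math are made up, and it assumes state has at least as many elements as workers):

              #include <barrier>
              #include <thread>
              #include <vector>

              void physics_step(std::vector<double>& state, int workers) {
                  std::barrier sync(workers);
                  std::vector<std::thread> threads;
                  for (int w = 0; w < workers; ++w) {
                      threads.emplace_back([&, w] {
                          // Phase 1: read shared state, compute into private storage.
                          double result = state[w] * 0.5; // stand-in for the real math

                          sync.arrive_and_wait();         // every thread has finished reading

                          // Phase 2: only now is it safe to write back.
                          state[w] = result;
                      });
                  }
                  for (auto& t : threads) t.join();
              }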

              1. Volfram says:

                Just have the physics simulation read from a different memory buffer than the one it’s writing to. Which you should be doing anyway if you want stable thread handling (and mandatory if you’re doing your physics processing on the GPU). Double-buffering: been around since 1993.
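
                A minimal sketch of that double-buffering idea, with made-up state and stand-in math (it assumes front and back are the same size):

                #include <cstddef>
                #include <utility>
                #include <vector>

                struct World {
                    std::vector<double> front;  // state everyone reads this step
                    std::vector<double> back;   // state being written for the next step
                };

                void step(World& w) {
                    for (std::size_t i = 0; i < w.front.size(); ++i) {
                        // back is computed purely from front, so these iterations
                        // could go to different threads with no locking at all.
                        w.back[i] = w.front[i] * 0.5;   // stand-in for real integration
                    }
                    std::swap(w.front, w.back);         // publish the new state in one go
                }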

      2. Daemian Lucifer says:

        Actually physics usually has a few settings as well, because most of it is cosmetic. Does the wall crumble into 5 pieces or 25? Doesn’t really matter, as long as the result is a gaping hole you can use.

        AI also has a few settings, though those are usually hidden. But you can see them in some older RTS games where you can set how much power will be used for pathfinding. Again, plenty of it is just cosmetic, so that the AI-run characters don’t move around jankily.

    3. Atle says:

      I used a physics engine in a project a long time ago. What that engine did was not as simple as just calculating the next step for each element based on the current one. Depending on the interactions between the elements, it sometimes had to find the point in time between steps at which something “important” happened.

      What this means is that the different elements cannot always be calculated independently if the physics has to be correct. I imagine this can get exponentially complex in large systems with a lot of interacting parts.

      This might not be relevant for game physics, where close probably is good enough.

      1. guy says:

        That can depend; if the game doesn’t interpolate between simulation steps you can sometimes walk through walls. What makes it particularly difficult, at least in Unity, is that there isn’t a fixed step length. Instead, each time it recalculates the game state, it checks the system clock for the elapsed time since the last step started. So you can’t simply do collision detection by checking whether objects overlap at each step: if the game is running slowly, they might pass entirely through each other between steps, even though they can’t possibly move fast enough for that at the expected frame rate.
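
        One common mitigation (not necessarily what Unity does internally) is the classic fixed-timestep accumulator: the wall clock can advance by any amount, but physics always advances in equal sub-steps, so per-step movement stays bounded. A rough sketch, with hypothetical step_physics/render hooks:

        #include <chrono>

        void game_loop() {
            using clock = std::chrono::steady_clock;
            constexpr double dt = 1.0 / 120.0;  // fixed physics step, in seconds
            double accumulator = 0.0;
            auto previous = clock::now();

            while (true) {                      // runs until the game quits
                auto now = clock::now();
                accumulator += std::chrono::duration<double>(now - previous).count();
                previous = now;

                while (accumulator >= dt) {     // catch up in fixed slices
                    // step_physics(dt);        // hypothetical: move objects, test overlaps
                    accumulator -= dt;
                }
                // render(accumulator / dt);    // hypothetical: interpolate for display
            }
        }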

  5. nm says:

    Nitpicky pedantry: Moore’s law is (now) that the density of transistors doubles every 18 months. Transistors aren’t circuits, they’re components used to build circuits. 18 months is a year and a half, not 2 years.

    1. wheals says:

      Well, if you REALLY want to be nitpicky Moore’s Law, as originally stated, is that the number of components on a chip (not just transistors) that minimises the cost per component doubles every year. Yes, the original figure he gave was a year, which turned out to be overly optimistic. Around ten years later, he postulated 2 years, which was correct for processors but actually pessimistic for memory, which follows the 18 months rule. The IEEE magazine has a good article on the subject if anyone’s interested in further pedantry.

  6. Jamie Pate says:

    Two points:

    A) Crosspoint will hopefully vastly improve memory access speed (one of the biggest bottlenecks for a CPU is waiting for data to work on). Having near-instant access to a vastly larger array of data has potential for incredibly deep simulation that was previously infeasible. I think the pressure to improve performance will spill over into areas other than more compact, faster CPUs.

    B) I think massively parallel programming hasn’t caught on yet because the number of cores is currently very low (2-4, plus hyperthreading?) and the momentum behind engines that do things serially is so great (25 years-ish?). You have to change how you do /everything/ to effectively utilize, say, 32+ cores, but I don’t think it’s a dead end; it just needs a paradigm shift to take advantage. It’s kind of a cart/horse situation, since that type of programming generally doesn’t have much of an advantage on the 2-4 cores that consumers currently have.

  7. nm says:

    Fun tangentially related fact: I used to program 64-core general-purpose computers for work. In a situation where nontrivial but fixed latency and high throughput are the order of the day (like network packet processing), you can pipeline heavily for the stuff that can’t be parallelized and fan out for the stuff that can.
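
    Roughly, that shape looks like the hypothetical sketch below: one serial “pipeline” stage feeds a queue in order, and a small pool of workers “fans out” on the per-packet part. Packet and the per-packet work are placeholders, not a real networking API.

    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    struct Packet { int id; };

    std::queue<Packet> q;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    void pipeline_stage() {                    // serial: must run in order
        for (int i = 0; i < 1000; ++i) {
            Packet p{i};                       // e.g. parse headers in sequence
            { std::lock_guard<std::mutex> lk(m); q.push(p); }
            cv.notify_one();
        }
        { std::lock_guard<std::mutex> lk(m); done = true; }
        cv.notify_all();
    }

    void fan_out_worker() {                    // parallel: any order is fine
        for (;;) {
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [] { return !q.empty() || done; });
            if (q.empty()) return;
            Packet p = q.front(); q.pop();
            lk.unlock();
            (void)p;                           // per-packet work, independent of the rest
        }
    }

    int main() {
        std::thread producer(pipeline_stage);
        std::vector<std::thread> pool;
        for (int i = 0; i < 4; ++i) pool.emplace_back(fan_out_worker);
        producer.join();
        for (auto& t : pool) t.join();
    }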

    Where games have serious computation requirements (like KSP doing its physics modeling or DF doing its pathfinding) there are lots of gains to be made by adding support for multiple cores. Even collision detection in a game with multiple players could be sped up. Of course, there are other issues like cache coherency that can both cause horrible hard to find bugs and make your game even slower than it was on a single core.

    Unfortunately, multicore development is not one of the things they really teach in college (or wasn’t 10 years ago) so most of the techniques for doing this stuff without excessive locking (and bugs) are still dark arts. There are papers and books on the subject, but unless you really need it, it’s a distraction from the part where the fun gets put into the game.

    1. Volfram says:

      I was frankly upset at how many critical things weren’t ever taught in class while getting my degree. My approach to external libraries at the moment is basically “follow the tutorial and cross your fingers.”

      1. Veylon says:

        Even Shamus complains about the things and he did this stuff professionally for a decade. I have an AAS in Programming and I wasn’t taught about libraries either. C# seems to have a decent handle on them, but for C/C++, it’s full-on dark ages.

      2. nm says:

        I tend to pontificate at work, and recently had to explain to a bunch of people with (recently granted) advanced degrees how the C preprocessor works at a really basic level. The basics are so simple, but nobody bothers to teach it in school so people end up with weird half-understood rituals that make things “work.”

        CS students (and EE if they take any software classes) should really be required to build a simple compiler and linker for a C-like language.
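
        For anyone following along, this is about the smallest concrete example of what that phase does, written as plain C with standard gcc flags (the file and macro names are made up):

        /* main.c -- what happens before the compiler proper ever sees the code */
        #include <stdio.h>            /* pasted in verbatim by the preprocessor */

        #define GREETING "hello"      /* plain textual substitution */
        #define TWICE(x) ((x) + (x))  /* macros substitute text too; they are not functions */

        int main(void) {
            /* After preprocessing, GREETING and TWICE no longer exist; the compiler
               only ever sees the literal "hello" and ((2) + (2)). */
            printf("%s %d\n", GREETING, TWICE(2));
            return 0;
        }

        /* The usually-hidden build steps, made explicit:
             gcc -E main.c -o main.i   # preprocess only: pure text substitution
             gcc -c main.i -o main.o   # compile the result into an object file
             gcc main.o -o main        # link the object (plus libc) into a program */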

        1. mhoff12358 says:

          Agreed. My compilers class was built around functional languages, so it was far more informative about high-level concepts like grammars than about practically understanding what something like GCC is doing when it makes an object file. I think a bit more application would have been in order.

          1. nm says:

            That’s the problem with computer science degrees: They teach you computer science, which isn’t what you need to be an engineer. I work on embedded stuff, so things may be different for people making web sites, but new grads with computer engineering degrees seem to be a little more prepared to do actual work than those from CS programs.

            1. Volfram says:

              Yep, everything I learned about threading I learned in my hardware classes. My software classes were more about algorithms and several dozen ways to sort things, and why Quicksort is a terrible algorithm but you should use it anyway. (I prefer Mergesort personally, but there are other hybrids that integrate Quicksort and are too fast for words.)

              Everything I learned about practical programming application (how to actually do useful things, vs. just learning a dozen variants of the Traveling Salesman), I learned in my embedded classes.

              IMO at this point, most high-level CS courses could be replaced by a couple of hours on Wikipedia, and most low-level CS courses are remedial for the kids who didn’t figure it out on their own in high school (like me, admittedly).

  8. Decius says:

    Moore’s law was based on the hardware having had few generations of improvement, and I think that the exponent will start to fall in the next 50 years.

    But right now software design has had few iterations of improvement, and we can get 2x performance increases fairly easily by developing better software, better tools to write software, and better tools to make the tools that we use to make the tools that we use to write software.

    That last is why we need philosophy/CS dual majors, and the former is why we need more App Academies.

    1. AileTheAlien says:

      No philosophy required; recursion is already taught in computer science.

  9. Florian the Mediocre says:

    The current consoles are kind of an interesting case as far as parallel processing for videogames goes – both the PS4 and the XBone have a pretty high number of cores (eight) with fairly low clock speeds (1.6 and 1.75 GHz, respectively, I believe) and an architecture that isn’t exceptionally efficient clock-for-clock.

    I’d imagine that will force developers of future console games to focus a lot more on getting their games parallelised, and I wouldn’t be surprised if we made major strides in this area over the next half decade or so. Afaik, DirectX 12 and Vulkan will also help use more CPU cores more effectively in the PC space.

    1. Richard says:

      Absolutely!

      It’s very hard to feed a GPU from a multi-threaded CPU at present, mostly due to the fundamental state-machine concept that is DirectX and OpenGL.

      You’re more or less pushed into creating a single ‘feeder’ thread that squirts the necessary into the GPU. I’m very interested to see what Vulkan does to solve that.

      1. Blake says:

        Yeah, and on top of that it’s sometimes faster to wait for more stuff to be ready to go and then fire it off in one chunk than to have different things coming in one at a time.
        Like if you have 1000 polygons rendering texture A and 1000 rendering texture B, and the GPU ends up rendering ABABABAB, it’s going to spend so much of its time waiting on the VRAM that it would be better off waiting for the CPU to finish preparing all 2000 polygons, then rendering all 1000 A’s followed by all 1000 B’s.

        I too would be curious to see if Vulkan is useful in that way.
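
        A hypothetical sketch of that batching idea: buffer the frame’s draw calls, sort them by texture, and submit them in groups so the bound texture only changes once per group. The DrawCall fields and the bind/draw calls are placeholders, not a real graphics API.

        #include <algorithm>
        #include <cstdint>
        #include <vector>

        struct DrawCall {
            std::uint32_t texture_id;    // sort key: which texture this draw needs
            std::uint32_t first_vertex;
            std::uint32_t vertex_count;
        };

        void submit_frame(std::vector<DrawCall>& calls) {
            std::sort(calls.begin(), calls.end(),
                      [](const DrawCall& a, const DrawCall& b) {
                          return a.texture_id < b.texture_id;
                      });
            for (const DrawCall& c : calls) {
                // bind_texture(c.texture_id);            // placeholder
                // draw(c.first_vertex, c.vertex_count);  // placeholder
                (void)c;
            }
        }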

    2. Blake says:

      Yeah, I kind of wish Sony had pushed harder with their Cell architecture. While the SPUs were a bit fiddly to work with, they were also FAST.
      At work we try to break up as much of our logic as we can into little tasks that we can farm off onto the different cores while we’re rendering, and a lot of that code still runs faster on PS3 than on PS4 or Xbox One.

      Of course the new consoles are much more powerful overall, but I feel that if Sony had refined the Cell and ironed out a few of the annoyances, they could have had a much more powerful machine than Microsoft’s.

  10. Thomas says:

    Btw, as a small nitpick, Sony said the PS3 would be a ten-year console in the same way the PS2 was an 8-10 year console. Whilst the new generation came in the 6th year of the PS2’s life, people were still buying PS2s for a few years afterwards. Persona 4 came out for the PS2 in 2008, two years after the PS3 was released and 8 years after the PS2 was released.

    As they’re still making mainstream AAA titles for the PS3 I think Sony were pretty close. It’s been 9 years and publishers are now only just cancelling the PS3 versions.

    The actual quote was “I think that we are offering a very good value for the consumers. We look at our products having a 10-year life cycle, which we’ve proven with the PlayStation. ”

    As in the PS1.

    1. Trix2000 says:

      Oddly enough, Persona 5 is getting a PS3 release this year (assuming no delays, and they haven’t set an exact date that I know of). It’s also getting a PS4 release so it’s not QUITE like P4, but it’s a little funny how that works out in retrospect.

  11. ehlijen says:

    I’m looking forward to a time where computers can be built and designed for longevity and robustness, rather than needing to keep up with the tech curve and being discarded as soon as they fall behind.

    1. Mephane says:

      For general purpose non-gaming use, this is basically already the case. Well, almost*. When you just need a computer to communicate (chat, Skype, email, social media) and retrieve information (read text, show images, play video), then we passed that point, I think, around the time the first or second iPhone came out.

      *What has not happened, however, is said long-endurance consumer computers actually materializing. It is simply more profitable not to build them: even if you can’t get normal customers to buy a more powerful computer every 2 years, they will still need a new one once the warranty has expired and some component has succumbed to wear.

      1. AileTheAlien says:

        Well, that’s the problem, ain’t it? The phrase shouldn’t be “I’m looking forward to a time where computers can be built and designed for longevity and robustness.” It should be “I’m looking forward to a time when companies have an incentive to build computers for longevity and robustness.”

  12. Don Alsafi says:

    If you had two cars (with drivers, obviously) you could do both tasks at the same time. One car gets the food, the other gets the kids. But if your errands were instead “Get kids at school” and “take kids to soccer practice” then they can’t be done simultaneously, and the extra car would be useless. In computing, tasks that must be done in order like this are referred to as “serial”.

    But really, you should be moving those kids around in a bus. ;)

  13. MadTinkerer says:

    What will happen? Something finally worthy of the power of the hardware will happen.

    Since Quake, there hasn’t been a major breakthrough in computer graphics (Oh? You object? Do you really want to argue that a breakthrough as significant as true 3D texture-mapped polygonal graphics* has happened since the introduction of true 3D polygonal graphics? Yeah, I thought so.), and game design has gone backwards. Marketing has conquered and driven most of the industry since the mid 90s.

    But if we’re stuck with the same hardware (more or less), then the software is going to need to be more efficient. Whoever is clever enough to realize the opportunity and has the skill will create an equivalent jump to the one we saw in the early 90s, but instead of 2D to pseudo-3D, it will be from the current kind of 3D to whatever the next kind of rendering will be.

    (Personally, I’m hoping for efficient volumetric rendering and/or a breakthrough in realtime raytracing.)

    Additionally, we’re going to see something from the Indie space that will have a similar impact as Minecraft, but in AI design. Without the need to ship before the engine becomes obsolete, someone with the time to do so will finally be able to concentrate on some sort of previously-unseen content creation with graphics that are good enough. Maybe Dwarf Fortress with better graphics and a good interface? Maybe something with a handful of characters in a small but detailed setting (like if Broken Age wasn’t trapped in a three-decade-old design paradigm but instead actually pushed the medium forward)? Something like that.

    In any case, without all the distraction from NEEDING TO SHIP THE LATEST ITERATION OF AN OLD DESIGN THIS YEAR, someone is going to do something we don’t yet realize is possible. Again. Remember when that used to happen all the time?

    *EDIT: I mean efficient true 3D texture-mapped polygonal graphics. Of course I know of the many true 3D games that came before Quake and not a one had a decent framerate.

    1. AileTheAlien says:

      1. Indie games are pushing out new concepts. So the hope would be that AAA budgets start funding long-term innovations, i.e. better engines.

      2. Actually, I would like to argue that we’ve had “breakthroughs” in graphics besides just moving from 2D to 3D. In roughly chronological order:
      – 2D -> 3D
      – mipmapping
      – transparency
      – bump-mapping
      – normal-mapping
      – recursive views into an infinite scene (i.e. line up the camera to look into a portal in Portal, that’s looking into the next layer of recursion…)
      – fur simulation (many individual short hairs)
      – graphics that aren’t brown

      There’s probably some I’ve missed.

    2. Richard says:

      There absolutely have been breakthroughs in graphics since Quake!

      The programmable pipeline for one. Being able to write (almost) arbitrary code that actually runs on the GPU is (I’d argue) far more significant than what Quake did.

      1. AileTheAlien says:

        That’s a good one – it’s what makes lots of effects in games possible. :)

  14. Daemian Lucifer says:

    Seeing how the chief graphics software only now has the ability to really split the work across multiple cores, the bottleneck seems to be in how we program stuff, and not in what multiple cores are used for.

  15. Daemian Lucifer says:

    Dear god! Why did you put that picture in the middle of the second page? It’s creepy as hell.

    1. Shamus says:

      Heh. I never saw the image until the article went live. (Surprised me too.) Credit foe all of the article images and graphics go to the Escapist team. I just make words.

      1. Daemian Lucifer says:

        Credit foe indeed. That thing should be the new face of uncanny valley.

  16. Mephane says:

    nobody realized 60fps was going to be a big(ish) deal

    You mean except for PC gamers/enthusiasts, who have long known that 60 FPS is a vast improvement over 30 FPS; a similar thing is starting to happen with >60 FPS, but the jump from 30->60 is a much more remarkable improvement than 60->120.

    So 60 FPS may be something new to consoles, but it isn’t actually new to gaming overall, and I would even claim it is the de facto standard on the PC. Normal displays are made for 60 Hz refresh rates and PC games usually aim for 60 FPS, plus there are adaptive software (dynamic FPS, see for example Borderlands 2) and hardware (G-Sync) solutions that try to keep things smooth even when the framerate drops below 60 at times.

    Anyway, one big point is smooth, fluid movement: the eye/brain should be unable to distinguish individual frames, and should not even subconsciously feel that what it is experiencing is not, in fact, fluid movement but a rapid series of still pictures.

    (It helps that while we have a finite temporal resolution, it is not applied in a discrete manner but in an analogue, continuous one. Otherwise we’d get weird problems, like having to sync the precise timing of the screen refresh with the timing of when our brain takes an image. Think of video footage where a camera is pointed at an old CRT and the frame rates of the screen and the camera don’t match up, and you see all kinds of odd flickering, shimmering, or broken images.)

    The second big point, which is starting to arise and which I hope the next console generation will incorporate from the start (alongside 60 FPS), is the whole topic of 4K/5K/retina resolutions. This essentially boils down to beating the spatial resolution of the eye/brain, so that we become unable to distinguish individual pixels. Of course this depends heavily on multiple factors – the distance from the screen, the size of the screen, and the number of pixels – but the goal is to optimize each device so that in normal use you cannot see any pixels at all without moving closer to the screen than usual, or using technical aids like a magnifying glass.
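
    The geometry behind “can you see the pixels” is simple enough to sketch. Assuming the commonly cited ~60 pixels per degree (one pixel per arcminute) as the rough 20/20-vision threshold, and with made-up example numbers:

    #include <cmath>
    #include <cstdio>

    double pixels_per_degree(double screen_width_cm, int horizontal_pixels,
                             double viewing_distance_cm) {
        const double pi = std::acos(-1.0);
        const double half_angle_rad =
            std::atan((screen_width_cm / 2.0) / viewing_distance_cm);
        const double total_angle_deg = 2.0 * half_angle_rad * 180.0 / pi;
        return horizontal_pixels / total_angle_deg;
    }

    int main() {
        // A roughly 60 cm wide 4K screen viewed from one metre away:
        std::printf("%.1f px/deg\n", pixels_per_degree(60.0, 3840, 100.0));
    }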

    My personal estimate for this becoming mainstream in PC gaming is ~5 years; for consoles it may take longer. I am really looking forward to how these high pixel densities will affect certain graphics features we have so far become accustomed to. For example, given sufficient pixel density, features like antialiasing and probably also anisotropic filtering may become completely unnecessary, which would be a boon given the many game engines that implement the former very badly (example: Elite Dangerous – gorgeous graphics, but the antialiasing the engine does is a catastrophe) or not at all (GTA 4 comes to mind – is GTA 5 still as bad in this department?).

    (I seriously still don’t understand how any modern game engine can be designed without antialiasing at all. I get that doing it well is hard, but even done badly it is much better than not at all. What use are all these spectacular lighting effects, reflections, shadows, smoke, particle systems etc. when the screen is full of jagged edges from all the geometry?)

    The good news is that, as Shamus’ article mentions, graphics hardware is very effective at doing things in parallel, so more pixels on the screen can be offset by having more cores to calculate those pixels, whereas higher FPS also needs a CPU able to keep up with it all.

    The third big point I am looking at, though for the longer term, is raytracing. As far as I know, there is a point of scene complexity (polygons as well as number of rendered pixels, etc.) above which raytracing actually becomes more efficient than scanline rendering, and raytracing parallelizes equally well, if not better.

    Plus, there are some very cool experimental graphics engines out there right now that take a very novel approach to rendering, one that apparently is only possible with raytracing: when the GPU can’t keep up, instead of dropping frames, it stops tracing rays earlier. Everything remains smooth and fluid at all times, and the image quality degrades automatically; not by features suddenly turning off, but by the image simply becoming grainier. I will try to find a video that demonstrates this.

    Update: Found one, although not the one I had in mind: https://www.youtube.com/watch?v=BpT6MkCeP7Y

  17. DanBanan says:

    Graphene transistors are around the corner. They will be cheaper and smaller, and orders of magnitude better when it comes to speed and heat.

    1. alfa says:

      As someone who has built a graphene transistor (or rather a small chip with about 300 of them) – no, they aren’t.

      While Graphene is amazingly cool and has some really nice properties, the big problem we currently have is that GFETs just don’t switch off well enough – with the batch I produced (in a small university lab, with a channel length of about 20 micrometers) we got an off-current of about half the on-current, which is about par for the course.

      Graphene as a direct replacement for silicon logic just doesn’t seem workable – the researchers seem to be looking in other areas (like sensors), but that means it probably won’t be used in processors.

  18. Saying that GPUs aren’t the limiting factor is wrong.

    I’ve got a mid-range GPU that is a few years old now. If anti-aliasing is cranked up, a lot of shaders are enabled, and tessellation is on, then a game or program will make the GPU struggle to keep framerates above 20 FPS.

    Meanwhile my CPU is a mid-range 6-core part, also a few years old, and it’s rare for me to ever see any game or program max all of its cores out.

    Now a high-end and recent (a year old?) GPU would probably move the bottleneck to the CPU.
    The interesting thing is that a modern game at 4K resolution with AA and shaders maxed will cause even a year-old high-end GPU to drop to something like 30 FPS.

    Once you get up to 6K resolution you would only need 2x AA at most, or maybe no AA. And at 8K you no longer need AA, as the resolution is high enough that “jaggies” should no longer be visible.
    Once GPUs can run at 60+ FPS at 8K (with no AA), that is when GPUs are no longer the bottleneck.

    Other advances like Adaptive Sync (aka FreeSync) will allow GPUs to render at refresh rates other than multiples of 60 or 30. So a game could be synced to a 57 Hz refresh rate and run at a steady 57 FPS instead of bouncing between 60 FPS and 30 FPS.

    When it comes to multiple CPU cores: unless the language used directly supports it (the C++11 spec?), you have to code support manually.
    Input and GUI on a 1st core, audio on a 2nd, AI on a 3rd, rendering on a 4th.
    In some cases input, GUI, and rendering are on the same core.
    So a game usually only uses 3 or 4 cores.
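
    In C++11 terms, that manual split looks roughly like the sketch below; the subsystem functions are placeholders, not real engine code.

    #include <thread>

    void input_and_gui_loop() { /* poll input, update GUI */ }
    void audio_loop()         { /* mix and submit audio buffers */ }
    void ai_loop()            { /* run AI / pathfinding updates */ }
    void render_loop()        { /* build and submit draw calls */ }

    int main() {
        std::thread audio(audio_loop);   // the OS decides which cores these land on
        std::thread ai(ai_loop);
        std::thread render(render_loop);

        input_and_gui_loop();            // main thread keeps input/GUI

        audio.join();
        ai.join();
        render.join();
    }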

    Another issue is that while AMD was early with consumer 6-core CPUs, Intel remained at 4-core CPUs (although hyperthreading gives those a virtual 8).

    So that held back utilization of multiple cores to just 4.

    Then AMD came out with their 8-core CPUs. And now AMD has CPUs with something similar to hyperthreading, so you get a 6-core appearing as a 12-core (I can’t recall exactly how that worked).

    One benefit of multiple cores (like 6 cores) is that the OS (and any background tasks) can use 1 or 2 cores for their stuff, and a game can then use 4 cores, which improves performance.

    Another issue is bandwidth: moving stuff from the CPU to the GPU, and maybe back to the CPU and then back to the GPU again, saturates the bandwidth of most systems.
    Better game engines and better GPU APIs mitigate this somewhat.

    AMD has some new high-bandwidth memory (HBM) that their most recent cards use, and new cards will use version two of it. Word has it that Nvidia might use this memory too. (It’s much faster than GDDR5 memory.)

    I would not be surprised if GPUs had 16GB of very high bandwidth memory in the near future.
    The next-gen consoles that I’m sure are being researched as I write this will most likely use this new memory too.

    I wonder if AMD GPUs (and CPUs) will end up in Playstation 5 and Xbox Two as well or if Nvidia manages to make a deal with one of the console makers this round.

    Myself I’m hoping AMD gets the deals as consumers do need AMD to be competitive with Nvidia to push prices down and push technology forward.

    But to go back to the start of this comment: CPUs are not holding back GPUs (yet). Unless you have the beefiest high-end GPUs (possibly SLI setups with multiple cards), the GPU will still be the bottleneck, all the way down to the lower mid-range or upper low-end.

    On budget gaming PCs (and current-gen consoles) the GPU is still a bottleneck. AFAIK neither the PS4 nor the XBone uses vsync, since that would drop the framerate further. I’m not sure if any console games let you toggle vsync or not (I don’t own a console so I wouldn’t know).

    My prediction for a PS5 or XBTwo is 30-144 FPS (with Adaptive Sync / FreeSync) support, and 4K video playback support (not 8K; I predict that for the PS6 and XBThree).

    For PCs, the budget Adaptive Sync monitors are now arriving this September/October, with AOC being one of the first. Cheap TV panels with Adaptive Sync are sure to follow soon after (they’d need DisplayPort 1.2a inputs though).

    One thing I’m uncertain about is the number of cores in next-gen consoles; I’m guessing 8 or 12 (physical or virtual), and I think it will remain so even for the console generation after that.

    The benefit of the current-gen consoles was that they matched pretty closely to midrange PCs, which meant that you did not need an insane PC rig to enjoy a game.

    The future of gaming will probably flatten out some (that I do agree with Shamus on), and instead we’ll see hardware iterations with extra features added on rather than major leaps in CPU or GPU.

    The next generation (or maybe the one after) will be the generation where 3D headsets/visors are considered just another peripheral, like motion controllers and cameras/sensors and whatnot.
    By that I mean a PS5 or XBTwo bundle with a 3D visor shipped in the box and launch games that support it.

    1. Cilvre says:

      Well written post, and it iterates on a lot of the items I was hoping someone would bring up before I needed to.

      1. Duoae says:

        Well, this whole conversation really hinges on your visual display and how far away you are from it. I agree with all the points but those two caveats really control everything – experience-wise.

  19. Cybron says:

    Very interesting article. I too know very little about hardware and it’s always interesting to read about how it’s influencing the direction of the industry.

  20. Duoae says:

    I’ve had similar sorts of thoughts in the recent past on this. I had a post on my blog (not widely read) titled “PC was all about the hardware, now it’s all about the Hardware…”

    Actually, looking that up, it was in 2013! Wow! Where did the time go? Well, anyway… I saw the trend that can basically be summed up thusly:

    The PC world is very quickly becoming the next TV world.

    However, this sums up how I’ve felt about PC gaming for the most part over the last ten years!

    For me, this new status quo is pretty great. I’m not going to miss the 90’s where I needed to buy a new PC every two years just to keep up. And I’m also not going to miss the aughts, where I needed a new graphics card every two years. We’re finally entering a time where we can worry less about hardware and more about the games.

    …with the caveat that we will need to start to increasingly worry about the Hardware.

  21. bbot says:

    I tried HTC Vive at PAX.

    It was pretty good, but it could easily use twice the resolution. I saw lots of people complaining about the screen door effect for the Oculus DK1, but it was just as noticeable on the Vive. 1080×1200 per eye is not enough. As it is, the Vive is a novelty.

    Doing 4K @ 60Hz with AAA graphics is a couple hardware cycles away, though. Maybe I’ll buy a VR headset for christmas 2018.

    (Carmack and Abrash are saying that 60Hz is an absolute minimum for VR. Supposedly the Vive does 90Hz. I didn’t notice any smearing or judder, but I was mostly absorbed by the resolution problems and bad controls)

  22. Scerro says:

    In my opinion, right now is also a poor time to build a computer. The first generation of DDR4 mobos are hitting the market, DDR4 RAM is more expensive, and next generation memory technologies are starting to heat up competition between Nvidia and AMD in the GPU market. On top of that, the push for better performance won’t begin in earnest until VR headsets hit the market.

    That said, right now if you get a DDR3 mobo, you’re locked to last generation’s processors. That’s probably not a terrible thing, but for those smoother frames you don’t want the slower memory to become a bottleneck.

    On the other hand, a 500-600 dollar PC can run virtually everything on high to very high settings. Just don’t expect much upgrade potential, or to be able to pull VR stuff off.

    1. Daemian Lucifer says:

      It’s never a good time to build a computer. When memory is cheap, we just hit a new processor innovation. When that is settled, graphics cards are new. When those are cheaper, software has given us something new. When all of those are cheaper, there’s a new display technology.

      1. Scerro says:

        Nah. Buying a set of technology about nine to twelve months after it hits the market is optimal. That way you get better revisions, and competition and production have spread through the market.

        It’s more about the mobo in this case. DDR4 and the new sockets are probably worth it. Then again, Intel changes their socket every generation anyways.

        I only say this because I got a DDR2 mobo and system right at the edge of DDR3 becoming the main thing. It really bottlenecked my processor and system for the six years of its lifetime, a GPU upgrade would only do so much, and it was limited to 8GB of RAM total.

    2. ThaneofFife says:

      On the other hand, Labor Day and Black Friday are the best times to buy a PC at retail in the U.S. So, you could get the new hotness from Intel on sale if you buy right now (though I think it’ll probably be even cheaper come November).

      1. Scerro says:

        Yeah, I upgraded my mobo/processor/RAM last fall. I have plenty of power for the next four to five years. You can get deals, but I find most deals to be irrelevant to me anyways. If I’m not searching for that thing, it doesn’t matter how cheap it is. I’m not gonna buy a laptop if I have no use for one.

        1. ThaneofFife says:

          Good points. In 2011, I bought a 2010-model Alienware laptop off of eBay, and it’s still running most of my games at medium or high settings.

          The only reasons I’m thinking about upgrading now are because (1) the 1st-gen Mobile Core i7 runs at a stupidly-slow 1.73GHz (w/ not much turbo), which is bad if you play Civ 5 and similar games, (2) the thing is a beast for heat build-up, such that the sound in Bioshock Infinite starts cutting out after it’s been running 10-20min, and (3) if I went back to a desktop, I would get a massive performance boost (somewhere between 50% and 200%).

          1. Scerro says:

            Desktops are amazing. A $500-600 build will get you running most games at high settings, and if you throw a bit more cash at it, you can make it upgradable and a good bit more future-proof.

            Honestly, the only option in my opinion for laptops is to get one with a Core i7. Intel is the furthest ahead in terms of TDP reduction, and laptops run horribly hot even when not at load.

            Even without deals, I’ve been tempted by Ars Technica’s deals that they show off for Lenovo laptops. There was a laptop that was exactly what I would look for, at about $850 compared to its normal $1350: 14-inch, Core i7, 16GB RAM, and a decent mobile GPU. Problem is, I simply don’t need a mobile gaming unit. My crappy Vista-era laptop does what I need it to when I’m on the go.

            Eventually I’ll end up with a nice laptop, I think. Just not for at least a year. My desktop that’s essentially a year old does the trick for me.

  23. ThaneofFife says:

    Shamus,
    Loved the post. It strongly reminded me of a speech that Charlie Stross (fascinating SFF author, if you’re not familiar with him) gave a couple of years ago: http://www.antipope.org/charlie/blog-static/2012/08/how-low-power-can-you-go.html (summary/transcript of speech at link).

    In a nutshell, Stross predicts that a decline in Moore’s Law would ultimately lead to ultra-low-power CPUs being sold at commodity prices. As a result, it would become economically feasible to embed sensors and a CPU with roughly the same computing power as a current-generation smartphone into every few square meters of city pavement. This, in turn, could open the way for both innovative government and an unprecedented level of surveillance. The first half of the article, in particular, comes to pretty similar conclusions to your post. Hope you’ll give it a read!
