Reset Button: Playstation 3

By Shamus Posted Tuesday Jan 20, 2015

Filed under: Movies 87 comments

This video is mostly a more organized version of things I said throughout the last console generation. I guess I wanted to give the PS3 a proper send-off by gathering all these points into one video.

Link (YouTube)

I had another whole section in here talking about multi-threading, but it made the video LONG and I don’t think it was needed. (Although at one point I say, “You’ve probably guessed that the big hurdle here is the need for lots of parallel coding.” That doesn’t make total sense to expect non-coders to “guess” anything without the long rambling section on multi-threading.)

Another thing I didn’t get into is the idea of processing latency. As I read it, your program gives tasks to the PPE and it delegates to the SPEs. If your program is really parallel friendly then the PPE ought to be able to keep all the little SPEs busy all the time. However, this introduces a middleman to the process. Even assuming you can keep all the little sub-processors busy, you’ve still got this annoying lag.

Let’s say you’re trying to send a package. One courier uses a bike. He can take one box across the city in a day. The other outfit has a van that can take ten packages across the city, but they spend some time sorting packages at the distribution center or whatever. They deliver in two days.

The van has higher throughput: Ten packages in two days instead of one package in one day. But sometimes you don’t want to wait two freakin’ days. So arguments over which one is “faster” are kind of confused. It sort of depends on what you’re trying to send and when you need it.
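To put rough numbers on the courier analogy, here is a quick sketch in Python. The function names are made up for illustration, and it assumes the van runs one batch at a time with no overlapping trips, which the analogy itself doesn’t specify.

```python
import math

def bike_days(packages: int) -> int:
    """Bike courier: one package per trip, one day per trip."""
    return packages * 1

def van_days(packages: int, capacity: int = 10, days_per_batch: int = 2) -> int:
    """Van: up to `capacity` packages per batch, two days per batch.

    Assumption (not part of the original analogy): batches run back to
    back, so the van never has two batches in flight at once.
    """
    batches = math.ceil(packages / capacity)
    return batches * days_per_batch

if __name__ == "__main__":
    for n in (1, 10, 100):
        print(f"{n:>3} packages: bike {bike_days(n)} days, van {van_days(n)} days")
```

With one package the bike finishes first. With ten or more, the van wins easily. Same two couriers, different answer depending on what you’re sending.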

More importantly, processing latency is really bad for games. It’s fine for rendering farms and crunching primes, but in a videogame latency matters. I don’t care if a game has mega graphics at 200 frames a second, if it takes five seconds between the point where I hit the jump button and the moment when I see my character jump on screen, then this computer is not useful for gaming. Obviously the PS3 wasn’t that bad, but the processing latency is yet another unwanted thing coders might have to worry about. I didn’t bring this up in the video because I have no idea how bad this problem was.

I am really glad these next-gen machines have settled into more or less standard quasi-PC hardware. This ought to make them nice and cheap over time, and should make cross-platform work less of a headache.


So the next console generation is now this console generation, and the Playstation 3 and Xbox 360 are behind us. I’ve been in the awkward position of being a Sony fan but a Playstation 3 critic. Lots of people have criticized the machine as a bad design, but I think it was the result of something far worse than just bad engineering. I think the machine was actually the result of corporate scheming gone horribly wrong.

But to talk about that, we need to talk about programming.

Part 1: Programming

Programming is a strange discipline. We’re very quickly running into the limits of what we can do with our intellect. Sure, lots of things in the universe are hard for us to figure out, but programming is one of the few places where we make MORE things we can’t figure out. Composers don’t accidentally write music that’s too complex to be played and engineers don’t routinely build bridges that are so complex that nobody can build them or drive over them. It’s one thing if you can’t understand quantum mechanics, but it’s another if your coding team can’t understand the code they wrote six months ago.

Countless books, classes, and even entire programming languages have been created to help us cope with the problem of runaway complexity. In movies they always show us programmers furiously typing away at high speed, writing thousands of lines of code an hour. In reality, any project beyond the prototype stage will have coders spending more time reading code than writing it.

Let’s say you want to hang some Christmas lights. But not just, like, one string of lights. Let’s say you want some Griswold-level monstrosity that warms the house and keeps the neighbors awake. It’s a major engineering project. You need to limit the amount of power you draw from each outlet to avoid a fire or a blown fuse. There’s a limit to how long chains can run and you need to hook it all up in a system that makes sense and conforms to whatever the specs are for this project.

One coder solves the problem with brute force. There are cords everywhere and the whole thing is a crazy fire hazard. The floors are thick with extension cords and tripping over a plug on the back porch ends up cutting power to lights in the front yard.

Another coder keeps it all clean and organized. The wires are labeled and color-coded with the banks they belong to on the fusebox. From the outside, both of these jobs look the same. It lights up and it seems to work okay. But you can see the difference when someone wants to change the blue string by the upstairs master bedroom or when someone smells hot wiring in the garage. The clean job is just a matter of reading the labels on the wires and tracing where they go, and the messy job is a ridiculous maze where it takes hours just to figure out where the problem is, and longer still to untangle things enough to solve it.

“So hire good coders!” you say. Well, we’ve been doing that. Or trying to. But there’s a finite supply of super geniuses in the world and not all of them want to spend their intellect on video games. And no matter how carefully you label things and how neatly you tie those cables, eventually projects will spiral out of your control. When your Christmas lights need to cover the neighborhood? Or the city? The county? Sooner or later you’ll hit a threshold where it’s impossible to get anything done because your coders spend all their time reading code and trying to find the one outlet where they should plug their new code in at.

John Carmack, former lead programmer of id Software and one of the super-geniuses we were talking about a minute ago, has even said that computer graphics programming is more complex than rocket science. And to be fair, he might be the only person to have done both professionally. I’ll add to that by saying that the level of complexity increases with each new graphics generation.

And what if your development house can’t afford a John Carmack? What if you have to make do with rank-and-file mortals like me? You probably want to ship your game in the next couple of years and don’t want to lose it in development hell. Which means you don’t want to mess with anything that will make the code more complicated.

Part 2: Hardware

I swear we’re going to get to the Playstation eventually. But before we can do that we need to talk about how normal gaming machines are built.

First you’ve got your main processor. That’s a general-purpose processing unit. It handles the operating system, input, networking, all that good stuff. If we’re talking about a PC, then you can lump all these jobs together under the heading of “running Microsoft Windows”.

Somewhere else in the machine you’ve got your GPU. Your graphics processor. But unlike the general-purpose CPU that can do anything, the GPU is really stupid. It’s been specially designed to do one very simple job. Or at least, simple by the standards of computer processing. The only job it has is to turn polygons into pixels. That’s it. It just crunches numbers and renders graphics. The narrow focus of this kind of processing means the chip can do its one job really really efficiently.

So this has been AAA games programming for almost two decades now: A CPU, a GPU, and a pile of memory. PC, Mac, Xbox 360, Wii… doesn’t matter. This basic design premise permeates our development. Some machines have more GPU, some have more memory, sometimes there are other tradeoffs made during design, but the basic design idea still holds. Our graphics engines are built around this idea, our programmers are all familiar with it, and our games are optimized for it.

And then we have the Playstation 3.

Part 3: The Cell Architecture

The Playstation 3 uses the Cell Architecture. I’m not remotely an expert on this, and explaining this properly would get us deep into all sorts of side-discussions. But let me give a simplified view:

The Cell processor has eight (not really eight, it’s complicated, but let’s say eight) of these sub-processors, called the Synergistic Processing Elements, or SPEs.

The SPEs are not as robust as a CPU but they’re more sophisticated than a GPU. They share what’s called a circular memory bus. You can’t get detailed specs on this unless you’re a Playstation developer, but from what I’ve read it almost sounds like they pass blocks of memory back and forth between them, like peers on a network. As a programmer who finds manual memory management challenging enough, the very idea of a circular memory bus fills me with a sense of panic. I’ve read the description several times now, and every time I come back to it I have more questions about how this is supposed to work.

Presiding over this madness is the Power Processing Element (PPE). It’s in charge of giving tasks to the SPEs, getting the results, and passing them back to your program.

The idea is that if you wanted to manufacture a more powerful machine, engineers could add more SPEs to the mix and the load would be distributed among them. If this sounds like a crazy and unwieldy system… well, it is. This is not a variation on the old computer paradigm. This is something new and strange. Processing clusters like the Cell are more commonly used for large-scale things: Locating and factoring massive prime numbers, running huge simulations, and rendering farms. Nobody had ever tried to use one for videogames before, and at first glance it seems like a bad fit.
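For the coders in the audience, the general shape of “one boss hands work to a pool of workers” can be sketched in ordinary Python. To be clear, this is an analogy built on Python’s standard thread pool. It is not the Cell SDK and doesn’t resemble real PS3 code. The names `shade_tile` and `render_frame` and the worker counts are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def shade_tile(tile_id: int) -> int:
    """Hypothetical stand-in for a small, independent chunk of work."""
    return sum(i * i for i in range(tile_id * 1000, (tile_id + 1) * 1000))

def render_frame(num_tiles: int, num_workers: int) -> list:
    """The 'boss' splits a job into tiles and farms them out to workers.

    num_workers plays the role of the sub-processor count. Raising it
    only helps if the tiles really are independent of each other.
    """
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() hands tiles to whichever worker is free and returns
        # the results in the original tile order.
        return list(pool.map(shade_tile, range(num_tiles)))

if __name__ == "__main__":
    results = render_frame(num_tiles=16, num_workers=8)
    print(len(results), "tiles done")
```

In standard CPython these threads won’t actually crunch the numbers in parallel, so the point here is the structure, not the speed. The dispatch is the easy part. The hard part is carving the work into pieces that don’t depend on each other.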

Wikipedia says that, “The Cell architecture includes a memory coherence architecture that emphasizes efficiency/watt, prioritizes bandwidth over low latency, and favors peak computational throughput over simplicity of program code. For these reasons, Cell is widely regarded as a challenging environment for software development.” Those are not comforting words for an industry already struggling to write stable, comprehensible code on far simpler systems.

So let’s bring this back to the Playstation 3. I’m sure you’ve guessed that the big hurdle here is the need for lots of parallel coding. Games programming is complex enough as it is, and here we have a machine that demands a whole new level of parallel coding in order to use it properly.

Developers who wanted to make games for the upcoming Playstation 3 found themselves in this difficult position. They had to wrap their heads around this alien new system. At the same time Sony was hyping the device to consumers and talking about how much raw processing power it had. The devs were working just to do basic things in this strange new world, while consumers had paid big money for their consoles and they had been told the machine was “Practically a supercomputer!”

The “supercomputer” line is one of those claims that is incredibly deceptive while also being technically true. It’s like me saying, “In my garage I have twice the horsepower of a NASCAR vehicle!” But then it turns out what I’ve really got is a fleet of 14 Nissan LEAF cars. Yeah, it’s twice the horsepower, but you can see how the claim is kinda misleading.

Part 4: What Happened?

So what happened here? Did the Sony engineers go mad? Are they somehow brilliant enough to design something this complex but too dumb to see the inescapable development problems?

Well, no. I’m inclined to believe that the engineers built exactly the machine they were told to build. I don’t think the Cell was an engineering decision. I think it was a strategic one. I think this machine was hobbled and nearly ruined by corporate politics.

Remember that when the PS3 was designed, Sony was riding high on the glorious Playstation 2, which to this day remains the best selling console of all time. The library was massive and still growing. It had the most clout, the most exclusives, the most fans, and made the most money.

Sony’s only worry was that there was an upstart on the field. The launch of the Xbox was weak, but Microsoft is a scary opponent. Usually they win. (Windows.) Once in a while they lose. (Internet Explorer.) But they always play aggressively and they have a seemingly endless supply of money. You don’t want to sit across the table from Microsoft if you can avoid it.

This is conjecture on my part, but here is what I think happened: Sony decided that they wanted to smother the Xbox in its crib, before it could grow into a serious threat. Sony already had a huge market share, they just needed to lock it down and prevent Microsoft from taking anything away from them.

The Cell design would make the Playstation 3 very, very difficult for people looking to port their games. Once you have your game optimized for the Cell, it will be very hard to port TO another platform without going down into the guts of your engine and doing some major re-engineering. It’s like changing a NASCAR into a semi truck. Even if they have about the same horsepower, you can’t make one into the other with a simple tune-up.

This would effectively put a wall around the Playstation 3. In a world where the Playstation 3 was the dominant platform, most developers would start there. And many wouldn’t want to go through the expense, hassle, and opportunity cost of porting to the PC or whatever the second Xbox console would be called. Sony would get lots of de facto exclusives without needing to negotiate and work for them, and the upstart Xbox would die before it got big enough to be a threat.

My thinking is that Sony didn’t choose the Cell architecture because they thought it was good for game development, they chose it because they thought it would be a good way to stonewall Microsoft.

It was a perfect plan, except it was ruined by another, mutually exclusive perfect plan, which was to use the Playstation 3 to cement the Blu-Ray as the next-gen format of choice. Remember that at the time the Blu-Ray was in competition with HD DVD, very similar to the VHS versus Betamax war of the 1980’s. There were two standards, only one would survive, and whoever won would get to print money for the next couple of decades by being the gatekeeper on the dominant format. And a great way for Sony to make sure that there were lots of Blu-Ray players was to make it part of their games console.

But the Blu-Ray player made the Playstation 3 far more expensive than the alternatives, and that expense eroded the userbase that their first plan depended on. Then in order to bring down the price, Sony dropped the backwards compatibility. This had the added effect of cutting the machine off from the largest and most impressive games library in the history of consoles, which means that the PS3 needed to be able to survive on the strength of its launch titles. But because the Cell was such a pain in the ass to figure out, there wouldn’t be that many launch titles.

Early sales were low. The first round of games were late and failed to justify the huge pricetag the device commanded. Yes, there are people out there who will pay $600 for a machine to play the latest Metal Gear. But there are not enough of those guys to support a multi-billion dollar console. This machine needed to go mainstream to be a success, and everything about it – from the huge price tag to the tiny library to the bulky hardware – kept this machine from having mainstream appeal.

Low sales meant fewer adopters. Fewer adopters, combined with the difficult development environment, meant that developers weren’t eager to make games for the Playstation 3.

The console was in a downward spiral. Developers targeted the Xbox 360 first, with plans to port later. And then they didn’t bother porting at all because the PS3 was a headache and the audience was so small. That wall that was supposed to keep developers in ended up keeping them out instead. Sony was undone by their own avarice. They tried to have everything, and nearly wound up with nothing.

This downward spiral could easily have killed the PS3. It’s hard to imagine what that would have done to Sony. A failure of that magnitude could have had repercussions far beyond the gaming world. It might have changed the outcome of the HD DVD vs. Blu-Ray competition. If the machine had flopped hard enough, it could have done to Sony what the Dreamcast did to Sega in the late 90’s, and knocked them out of the console market for good.

Luckily, Sony was rescued from disaster by the most unlikely party.

Part 5: Microsoft saves the day.

Sony was saved from their foolishness by the more extreme foolishness of Microsoft. The Xbox 360 had shockingly high failure rates. Microsoft tried to downplay it, and in the 90s this might have worked, but with the help of the internet consumers were comparing notes and realizing that the machine had a lifespan barely long enough to push it past the warranty period. It was both a financial and PR disaster for Microsoft, and their PR bungling and spin only made them look worse when the truth of the problem became inescapable.

This failure gave the Playstation 3 time to recover and carve out a viable portion of the market. Consumers might think that a $600 console with a tiny library is a terrible deal, but it’s still better than a $400 brick.

Like I said before, I don’t have any proof that Sony based their system on the Cell in an attempt to kill the nascent Xbox, but I find this explanation more plausible than “Sony engineers suddenly went mad”.

Sony did wring a partial victory out of this mess. Their Blu-Ray format became the industry standard. While that doesn’t mean much to games directly, it will provide Sony with a nice steady cashflow for years to come. On the other hand, their overreach probably gave the Xbox the foothold it needed. If the Playstation 3 had been a more conventional device with some backwards compatibility, then Microsoft might have been driven out of the market by the Xbox 360 Red Ring of Death disaster. With an affordable alternative and access to the massive PS2 library, this alternate universe PS3 would have been perfectly positioned to soak up all those disillusioned Xbox users. Microsoft might have been forced to back out of this crazy console dream and re-focus their attention on core products. You can only lose so many billions before the shareholders will tell you it’s time to cash out while you’ve still got some chips left.

But maybe this is the best outcome in the long run. We’ve got the big three competing for our attention and dollars. I have much better hopes for this new generation of devices. The hardware makes sense, the prices feel right, the libraries are off to a good start, and there haven’t been any major failures or disasters.




87 thoughts on “Reset Button: Playstation 3”

  1. Lachlan the Mad says:

    Whoa, second comment.

    Also, blue screen of death joke, anyone? :)

    1. Eric says:

      I nearly spit my drink over my monitor when that flashed up…

      … yes, I’m easy to get a laugh out of.

  2. Eruanno says:

    Hey, Shamus! Speaking of consoles, it would be really cool to hear your opinion on the Playstation 4 now that you have been introduced to the World of Next-(Current?)Gen Consoles.

    Also I’d love to hear your old man grumblings about Destiny. They echoed so well my opinions of the game and there’s nothing nicer than being validated.

  3. Ingvar M says:

    I’m not sure. At the time the PS3 was in its pre-release stage, Cell was the up-and-coming IBM architecture, destined to (ahaha) become the general computing horsepower of the discerning gentleman in need of extreme amounts of computational power.

    So, in theory, there would be this really good compiler support, from IBM (who, as happens, are surprisingly good at compilers) and in theory that, plus some careful writing of useful libraries for things, would clearly Save The Day and make the PS3 easy (or, at least, easier) to write code for. I was looking forward to the release, with the hope of having a chance of noodling on writing compilers that seamlessly utilized the SPEs for map/reduce type stuff, transparently to the coder. But, as happened, the PS3 came with a relatively crippled SPE (I think you’re guaranteed 6, or maybe 8, of the 16 that the IBM original spec said). And I was temporarily short on both cash and time. So, I never got the chance to do that.

    However, the decision to (mostly) leave out a PS2 emulation system probably did not help (the original PS3 had that, but I think by the time it was released in Europe, your PS2 library was useless without a PS2; I believe the same is true for the 3->4 transition). And that is a shame, because if that had been there, I would have been much keener then, and would be much keener now, to buy the next gen.

    1. Daimbert says:

      Yeah, they removed the backwards compatibility right at the time that I was going to buy a PS3.

      So I bought a couple of spare PS2s instead. I didn’t buy the PS3 until after they had won the DVD wars and I got a High-Def TV to play High-Def DVDs on …

    2. MrGuy says:

      I think this is an important point, and related to Shamus’ belief that the decision to use the cell architecture was a management decision (as opposed to a technical one). That said, I’m not sure the “walled garden to keep developers in” idea was really the correct one.

      The cell architecture concept was dreamed up at IBM, and developed as part of an alliance IBM formed with Sony and Toshiba. The alliance formed in 2001, just after the PS2 had launched, when Sony was looking for what the “next big thing” would be.

      The idea has a ton of appeal for people dreaming about “the next big thing” for graphics programming. The idea of having multiple co-processors was to make it easy to “hand off” tasks like shading, vector processing, or other multi-media-intensive tasks. It sounds good on paper – you can do a lot more with it than you could with a dedicated GPU (for example, one idea I heard at the time was alpha surface handling could be a lot more robust).

      Sony committed to the new architecture on promise before it was even designed, and before they could really project just HOW complicated the architecture would be to use. I don’t think the idea was just to be “different,” it was to be better and more capable.

      But that’s not how it turned out. The programming problems were bigger than they intended. The tools weren’t as easy to build as they’d hoped. The “magic multimedia engine” they’d wanted to build was hard to use (Toshiba, for example, never brought a commercial product to market, despite clearly anticipating that they’d be able to when they started).

      But Sony was already committed to riding this train wherever it was going – they were heavily invested in the design and development of this cell processor thing. So when they finally got to the point where they could really see what using it would be like, they either a.) had to hold their noses and pray for the best, or b.) abandon millions and millions of development dollars, admit defeat, and start completely over on designing their next gen console to be a modest incremental improvement on the PS2 and not the revolution they’d been promising the industry loudly for years.

  4. arron says:

    I’m as old as Shamus, and I started programming in the 8-bit era when machines came with BASIC, but the way to get maximum speed out of a machine was to write in assembly language which in turn meant learning to express your creativity in either 6502 or Z80 machine code. This stopped when I got a PC and I realized that PC memory architecture was a massive pain in the rear with segmented memory and the like, so I then discovered that C and C++ were a much better option for programming.

    Programming down to the bare metal for application programs these days is pretty much now a silly thing to do. One of the hardest machines to develop for was the original Atari 2600 – a machine that didn’t have screen memory, and you effectively had to write your own OS kernel to handle the screen display with the electron beam scanning down the screen. This is a magnitude of difficulty harder than the sort of stuff I was doing on the Commodore 64 and BBC Micro where you were doing stuff using interrupts to sync the vertical blank so you got nice consistent frame rates and the music/input happened on time. The reason for this is that the Atari 2600 was never designed for ease of programming, it was designed down to a price point for the customer. And hardware design that makes a programmer’s life harder frequently doesn’t feature in the spec sheet when they’re designing the thing.

    Sony’s mistakes with the PS3 hardware merely followed on from the dumb things they did with the PS2. I never got to develop for that machine, but friends of mine (who are cleverer than me, and developed around the brain-dead hardware mistakes on other consoles that happened before) said it had massive limitations. The PS2 had an issue with having virtually no texture memory but incredibly fast graphics pipelines. So rather than doing what you do with a PC game where you load the textures you need at the start and then reference them as you walk around the game level – the PS2 had to constantly squirt the data you needed down the pipeline and hope that it all got processed before it hit the screen. Again this is what happens when you build hardware by accountant. Rather than adding a few dollars to bump up the memory to make the developer’s life easier (and reduce dev time/effort/costs) you have a system that just about works and then expect the programmer to jump through a lot of hoops to make their program work on your hardware. Sony merely did this again with the PS3, and then found developers couldn’t cope writing across platforms when building for their consoles.

    The way around this, I feel, is to write smarter tools that take away the job of building an effective solution to fit the architecture, and let you focus instead on specifying what the program has to do. Then the compiler has the job of finding the most optimal solution for whatever machine you are writing for, and you merely have to focus on specifying the program in a language that can cope with all the possible things that a programmer will need to write.

    For example – there is a thing in C called pre-increment where you add 1 to a variable before you use it (++a) in that statement, vs adding one to a variable after you use it (post-increment or a++). If you are using this in a loop, there are few cases where pre-increment will make sense, but post-increment should be the norm. But “Wait!” (said my university lecturer) – “pre-increment addressing is faster! You should use that always!”

    The stupid thing is that I don’t really need to know this. I just want to add one to my variable and as that occurs before the statement I need to read it again, the addressing mode shouldn’t matter. The compiler should be able to look at my code and say “Ah, this variable isn’t used here, we’ll assume pre-increment when building the object code!”

    It’s like this for all compilation. I can imagine that there will be cases where the code may have several possible builds to fit the hardware, some of which are better than others. The compiler should be smart enough to build them all, see which is the fastest/most efficient through some kind of Monte Carlo evaluation process and then give me the one that fits my designed performance profile. Or if they’re all bad, tell me what’s wrong and tell me I need to fix it – like reduce the number of textures to fit in the texture memory available.

    I’ve recently been experimenting with something called HaXe and games programming libraries called OpenFL and Flixel. The whole system is designed to allow someone to write games that compile to a multitude of platforms – write once, run anywhere. I happen to think that if you’re going to face up that you’re not going to be hand tuning instructions to ensure your program runs according to a finite amount of CRT raster time, you should turn it over to the compiler and let it interpret your code to run on all the target platforms. One advantage is that as new machines come out, your code will still work on them. Only the compiler changes, the description language should stay the same.

    You tell the compiler which bits of the program are running simultaneously, and the compiler builds an optimal solution, i.e. if you were to use a similar system for PS3 development, it uses every bit of the PS3 Cell architecture it can without you having to worry about the underlying hardware complexity.

    The advantage of this is that the complexity of the code is massively reduced. You’re interested in describing your program, but not the implementation. You’re describing what you want in precise but abstract mathematical statements to get what you want in terms of solution but then you lose control over exactly what your code is doing.

    This might be an abhorrence to the natural code control freak, but given most of us can’t read what object code is doing anyway, is this such a bad thing? The only real problem is if the compiler doesn’t do a good job writing fast code or if it can’t understand what you want. But that’s a tools issue, not yours as the creative developer.

    1. The Rocketeer says:

      You know, I’ve heard plenty of stuff about how the PS3/Cell worked (because everyone is so interested in why it fell down), but I’d never heard anything about the PS2’s abnormalities, or even that it had any. Now I’m really interested.

      The rest of your post is sorcery to me, and I rebuke you as a witch and as a tutor of witches.

      1. Zidy says:

        If you want to learn more from people who actually worked with it and have a good chunk of time, there’s a series of Let’s Plays on YouTube for the Ratchet & Clank games from PS2 done by two of the developers who worked on the games. They occasionally talk about the difficulties they had working with the PS2 – including IIRC that Sony basically banned or limited their use of loading screens as that was one of the biggest complaints people had with the games on the PS1. It’s a great series and gives a lot of insight on what goes on behind the scenes, so well worth listening to. Probably my favorite Let’s Play after Spoiler Warning.

        1. Steve C says:

          No loading screens was probably my favorite thing about Ratchet & Clank. It was the first time I’d seen a game that managed to avoid them. Obviously I could tell they were loading things in the elevators etc but I still thought it was great. If that was a management decision from on high then +1 for Sony.

        2. arron says:

          Watched the first few of these last night. Very interesting :) Many thanks for the recommendation :)

    2. Zak McKracken says:

      small nitpick (but oh, does that spot itch!):

      Back in the day, everybody (I knew) said “Assembler” (and some even used it; I only ever copied some lines from a listing in a magazine into an editor). Since I was learning English in parallel with learning Computers (before learning it in school, because back then they didn’t translate stuff), I would probably have noticed the difference.
      Years later I learned the word “assembly” in an entirely different context, and only a few years ago I noticed people using the latter word to refer to what I only know as the former …

      … so, were my books wrong, did I live under a rock in between, or is this some error that has crept in in the mean time and become the rule?

      1. 4th Dimension says:

        Are you referring to the compiler vs assembler difference? From a high-level point of view they both do similar things: they both translate code into executable files that contain commands for the computer.
        As far as I can understand the difference is this. Assemblers translate actual processor commands (like copy the content of register 8 into register 15) and allow direct interaction with hardware, with almost none of the abstractions necessary for ease of use. Compilers on the other hand translate “high level” programming language code (C#, C, C++, BASIC etc) into those same executable files. Since the programming languages are more or less platform agnostic, compilers have to do quite a bit of thinking on how to translate your high level command into something the processor might understand. In my old textbooks they said the compiler calls the assembler at the end of the compilation to produce the executable.

        Today assembly is also used to refer to .NET executables/DLLs, since they don’t contain the machine code, but CIL commands (low level language) which are compiled/assembled into code suitable to the machine it is run at just before execution by Just in Time Compiler.

        1. Phill says:

          I think he is referring to back in the ’80s when the terms “assembly language” and “assembler” were used interchangeably – at least in my experience. I have books and magazines referring to typing in assembler, and giving examples of assembler code.

          I don’t know if it was just a BBC-micro related thing – the machine’s version of BASIC included an assembly language compiler so you could write hybrid BASIC/assembly programs, which may have encouraged more use of “writing code for the assembler”, becoming assembler code (rather than assembly), and shortened to assembler. Or maybe it is a US/UK difference where different words for new concepts took off in different places, and eventually one won out as everyone converged on the same terminology.

      2. Canthros says:

        Assembly languages are basically wrappers for the underlying machine code (binary!) instructions. An assembler is the tool used to turn assembly code into machine code, which can be run more-or-less directly by the CPU. An assembler is, in effect, a simple compiler. Assemblers and assembly languages are usually specific to the instruction set or processor family.

        A compiler is generally used with a high-level language, which is really any language that isn’t machine code or assembly. (C is sometimes referred to as a ‘mid-level language’, because it is so close to being a wrapper for assembly.) A compiler may convert the source code directly into machine code, into assembly, into a different language entirely (unusual!), or into an intermediate representation (like MSIL or Java bytecode) for later interpretation. It does not need to be written in the language it compiles or compile to the platform on which it runs.

        (For my part, I’m not sure I’ve ever heard anybody refer to .NET’s MSIL code as ‘assembly’, but it does look pretty similar.)

      3. Mike S. says:

        Back in the 80s when I was learning what little I know about programming, “assembly language” was the formal term for that class of languages, and an “assembler” was the program that translated assembly language to machine code. But in everyday speech, people often talked about writing “in assembler” or “assembler code”, rather than “assembly”.

        So they were somewhat interchangeable dialect terms, which I wouldn’t be surprised to discover varied regionally or subculturally.

      4. Ingvar M says:

        I think it’s (1) “assembler” == “the thing that takes your symbolic representation and does a fairly direct sequence of ‘macro expansions’ and ‘generate machine code'” and (2) “assembly” (or “assembly language”) == “the thing you write and feed to the assembler”. But, I’m perfectly happy with seeing “assembler” for both, but would find it odd hearing “assembly” used for thing 1.

    3. David says:

      There are a couple problems with your pre-/post-increment argument — the first is that pre-increment and post-increment actually do different things, and there’s absolutely no way for the compiler to tell which one you meant. One will give the correct behavior, and the other will exit your loop prematurely (or maybe it won’t exit your loop at all and your program will spin forever). The compiler can’t know, so it can’t just optimize away at compile-time.

      The other problem isn’t just with addressing. In C++, at least, post-increment actually makes an extra copy of the object you’re incrementing, which is basically negligible for integers, but really really expensive for your enormous class that you wrote to do whatever. So in C++, the rule is “always use pre-increment unless you specifically need post-increment, and even then be careful what you post-increment”.

      So actually, you might make the argument that it’s stupid that pre- and post-increment do different things, and I actually agree with this, but that’s a complaint against the language, not the compiler. In my experience, actually, trying to use either pre- or post-increment inside another statement leads to really subtle bugs really fast, so I try to avoid ever doing this (even though it often yields really slick-looking code). I just increment my variable on its own line of code (pre-increment to avoid the copy problem), and then do whatever I need to do with it on the next line of code. It’s simpler that way.
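
      A minimal sketch of the two points above, assuming plain `int` variables. The function names are made up for illustration. Inside a larger expression the two forms yield different values; on a line of their own they leave the variable in the same state.

      ```cpp
      #include <cassert>
      #include <iostream>

      // Value that the expression `++a` evaluates to when a starts at `start`.
      int value_of_pre_increment(int start) {
          int a = start;
          int seen = ++a;  // a is incremented first, then the new value is read
          return seen;
      }

      // Value that the expression `a++` evaluates to when a starts at `start`.
      int value_of_post_increment(int start) {
          int a = start;
          int seen = a++;  // the old value is read, then a is incremented
          return seen;
      }

      // Used as standalone statements, the result is discarded,
      // so both forms leave `a` holding the same value.
      int after_standalone_pre(int start)  { int a = start; ++a; return a; }
      int after_standalone_post(int start) { int a = start; a++; return a; }

      int main() {
          std::cout << "++a yields " << value_of_pre_increment(5) << "\n";
          std::cout << "a++ yields " << value_of_post_increment(5) << "\n";
          assert(after_standalone_pre(5) == after_standalone_post(5));
          return 0;
      }
      ```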

      1. Richard says:


        The thing about compiler optimisations is that they can have all kinds of unexpected results.

        Search for “Nasal Demons”.

        Before your compiler can make any optimisations at all, it has to be able to ‘prove’ that the optimisation won’t make your code behave differently.

        It has to do this without actually ‘knowing’ very much about what the code does.

        It doesn’t necessarily know what goes on inside a different translation unit because it might not have compiled it yet, another instance of the compiler might be working on that unit in parallel or it might be a pre-compiled library that it simply can’t have a parse tree for.

        Thus it’s useful to give the compiler hints and tips on what might be a good optimisation.

      2. Alan says:

        Perhaps you overlooked where arron said

        The stupid thing is that I don't really need to know this. I just want to add one to my variable and as that occurs before the statement I need to read it again, the addressing mode shouldn't matter.

        which is to say


        Indeed, in such a case there is no difference because the resulting value is discarded unused. I’d be very disappointed in any non-trivial C compiler that missed that optimization.

        As for C++, what in the world are you creating that is simultaneously large and expensive and that ++ is meaningful for? If you’re doing anything beyond incrementing a number by 1 or going to the next element through an iterator I’m going to be very suspicious. For those cases, making a copy tends to be incredibly fast. In many cases the implementation is simple enough that it can be in the header file, the compiler can inline it and apply the same unused-return-value optimization. Sure, there are cases where it will matter, but they’re just not that common. Fretting about the efficiency of pre-increment versus post-increment smells of premature optimization.

        1. arron says:

          That was the simplest example that I could think of as (1) I know that a++ and ++a have different outcomes in the object code and (2) if I use the increment in such a way as it does not affect anything else in the code, the compiler should be able to optimize the fastest outcome knowing that it’s a simple variable.

          The issue with this is that my professor explicitly told me this in my physics programming class I took at Uni, and in most languages you probably wouldn’t care – the compiler would sort it out for you.

          There’s lots of similar examples in C++ where using one convention over another makes code faster or use less memory through reuse of existing variables rather than creating a copy. I don’t usually optimize my code until I’ve got it working and then find out how bad the performance is in the most processor intensive bits.

          One thing that made me laugh about early Android games development was how hard it was to make it run quickly without GC stalls and the issues with the Java language they’d chosen for development – especially given Java has no unsigned support, but the underlying VM did. And the horrible line of flaming hoops you had to jump through. Chris Pruett (ex-Google and now Robot Invader) did this amazing talk showing how writing games on Android was far from easy. The hardware/OS situation has changed a lot since this, but I find this the funniest talk on how not to write a game on a mobile platform :)

          The point that I’m trying to make is that the programming language, compiler and OS support can make your life relatively easy or extremely difficult. And having to do the programming equivalent of hammering spikes in front of a wooden door to keep out the demonic Garbage Collector (GC) monster threatening to break through is something that programmers shouldn’t have to micromanage either.

          I went through the Replica Island code, and there’s something called “Allocation Guard” that immediately logs if any memory is allocated, so you can keep track of what is being used and whether it might fall out of scope and then be GCed. It’s quite a clever tactic to solve a problem that shouldn’t be an issue. An option to turn the GC on and off should be the quickest option for this sort of application if you’re trying to keep your code running fast. You keep it off whilst you’re running the game, and turn it on again once you change levels or are in another non-critical performance bit of the code.

        2. David says:

          Again, the point here is that operator++() and operator++(int) (that’s the operator overloading syntax for pre- and post-increment) are different function calls. There’s nothing in the language to stop, say, ++a from doing a simple increment and a++ from being this really complex function with multiple different side effects. This is a stupid thing to do, naturally, and I dearly hope nobody would actually do that in production code, but the fact is that the compiler can’t just swap out one function for the other. The fact that the language is designed in a certain way limits what the compiler can optimize.

          As for your second question (“What would you actually design that copies matter for?”) how about a bignum class? It’s certainly conceivable that you might want to increment your 10,000-byte number by 1, and you definitely don’t want to waste time making a copy of it when you do so. I can imagine lots of other possibilities in scientific computing applications which might be large but in which the “increment” operator makes sense to implement. I could also picture some large state machine or something like that, where it would make sense to “update the state” by calling operator++.

          But you’re right. In most cases it doesn’t matter, and the preference for pre- and post-increment is an idiomatic choice. However, there’s a reason that the standard C++ idiom is pre-increment: so that you don’t get tripped up by the uncommon cases where it does matter. If the idiom is “do the right thing all the time”, then you don’t have to remember when it’s the right thing and when it’s a moot point.
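
          A rough sketch of that bignum case, under stated assumptions: `BigCounter` is a hypothetical 10,000-byte counter invented for this example, not code from any real library, and the `copies` tally exists only to make the extra copy visible. It shows the two overloads as separate functions, with the conventional post-increment paying for one full copy.

          ```cpp
          #include <cassert>
          #include <cstddef>
          #include <iostream>
          #include <vector>

          // Hypothetical large number: little-endian base-256 digits.
          struct BigCounter {
              std::vector<unsigned char> digits;
              static int copies;  // counts copy-constructor calls, for illustration only

              explicit BigCounter(std::size_t n) : digits(n, 0) {}
              BigCounter(const BigCounter& other) : digits(other.digits) { ++copies; }
              BigCounter(BigCounter&&) noexcept = default;
              BigCounter& operator=(const BigCounter&) = default;

              // Pre-increment: operator++(). Adds one in place and returns a reference.
              BigCounter& operator++() {
                  for (auto& d : digits) {
                      if (++d != 0) break;  // stop once a digit does not wrap (no carry)
                  }
                  return *this;
              }

              // Post-increment: operator++(int). Must hand back the old value,
              // so the conventional implementation copies the whole object first.
              BigCounter operator++(int) {
                  BigCounter old = *this;  // the expensive copy
                  ++(*this);
                  return old;              // elided or moved, never copied again
              }
          };

          int BigCounter::copies = 0;

          // Returns how many copies a single increment of a 10,000-byte counter made.
          int copies_made_by(bool use_post) {
              BigCounter::copies = 0;
              BigCounter n(10000);
              if (use_post) {
                  n++;
              } else {
                  ++n;
              }
              return BigCounter::copies;
          }

          int main() {
              std::cout << "pre-increment copies:  " << copies_made_by(false) << "\n";
              std::cout << "post-increment copies: " << copies_made_by(true) << "\n";
              return 0;
          }
          ```

          Because the copy constructor here has a visible side effect, the compiler is not free to drop the copy in post-increment even when the returned value is unused.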

      3. MrGuy says:

        So in C++, the rule is “always use pre-increment unless you specifically need post-increment, and even then be careful what you post-increment”.

        Which, of course, makes the name of the language itself ironic…

      4. Ingvar M says:

        Pre- and post- de- and/or in-crement of integers can, on the right machine architecture, be had for free when copying things. Especially back in the day, it was not at all unusual having all of pre/post de/increment addressing modes, for both register and memory sources.

        I think the most common combo today is one of “pre-decrement / post-increment” or the other way around, since that makes it easier to fake a stack pointer.

    4. Jimmy Bennett says:

      I wanted to comment on your discussion about compiler optimizations. One thing to be aware of is that in a lot of cases the compiler does do a lot of optimization work for you. In the case of deciding between pre and post increment, assuming you’re not using the value in that line (in which case the difference matters) then most compilers will replace it with whichever option is fastest. If you’re working in a mature language like C/C++/C# or Java, your compiler is going to do a lot of work to make your code run as fast as it can. People sometimes refer to these aggressive optimizers as opti-manglers because they will absolutely mangle your code to get the fastest result possible (without changing what your code actually does).

      However, despite all the work that has gone into writing better optimizers, there’s a limit to what the compiler can do. The compiler you describe in your post sounds a lot like the mythical Sufficiently Smart Compiler. Maybe some day we’ll have compilers that allow us to describe exactly what we want in high level terms and have the compiler produce some amazingly fast machine code, but that dream is probably a long way off.

    5. silver Harloe says:

      The cell architecture seems to beg for programming in Erlang. But I think Erlang is much better for stateless systems than games (well, for single player games or game clients. it can be great for online-game servers).

    6. Groboclown says:

      Back when the PS3 came out, I was becoming interested in parallel computing, and thought about looking into programming the thing. A Dr. Dobbs article came out (in print, no less) that discussed implementing an efficient breadth-first search that took advantage of the parallel processors in the PS3.

      I quickly went back to my normal programming habits.

      1. guy says:

        You have to manually transfer things in and out of the cache?

        That’s just… I don’t think you have to do that when programming in x86 assembler.

        1. Ingvar M says:

          The data exchange between the PPE and the SPE? Ayup, you have to account for that. The SPEs can only “read” from the main CPU, but I don’t think you have to explicitly manage the cache lines between the PPE and RAM.

          I mean, on the x86 I don’t think the FPU can read directly from RAM, you have to get the data into a register and load the FPU from there, no? Maybe they can, these days, I never actually did floating point in asm.

          1. guy says:

            I’m just going by that article, which says that coders need to manually load 256KB blocks into the local store. x86 requires loading RAM into registers for some things, but it doesn’t matter to the coder whether that’s in the L1 cache or RAM. It is important to think about it and not constantly go back and forth between things that can’t be in the L1 cache simultaneously, but if you access something not in the L1 cache the processor gets the data and stores it in the L1 cache for next time for you.

            That article is pretty old, so it’s quite possible they’ve made their dev tools better since, but they should have released in a better condition than that.

            1. Ingvar M says:

              Yep, that’s “load it onto the CPU, so the SPEs can get at it”.
              The SPEs have no “read from RAM” instruction, only “read from CPU local store”, so you need to start by grabbing all that data you want to operate on from RAM, to somewhere the SPEs can get at it.

              I suspect it’s mostly for predictable timing and making the SPEs simpler to implement (random delay reading from RAM makes the silicon more complex and instruction timing super-complicated to reason about).
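
              To make the contrast in this thread concrete, here is a loose simulation in ordinary C++ of what "manually load blocks into the local store" means for the coder. It is only a sketch: the 256KB figure is taken from the article mentioned above, `std::memcpy` stands in for whatever transfer mechanism the real hardware uses, and every name here is hypothetical.

              ```cpp
              #include <algorithm>
              #include <cassert>
              #include <cstddef>
              #include <cstring>
              #include <iostream>
              #include <vector>

              // Assumed size of the worker's private memory, per the article discussed above.
              constexpr std::size_t kLocalStoreBytes = 256 * 1024;
              constexpr std::size_t kLocalStoreInts = kLocalStoreBytes / sizeof(int);

              // Sums `main_memory`, but the "worker" loop is only allowed to touch
              // `local_store`. Every chunk has to be copied in explicitly first.
              long long sum_via_local_store(const std::vector<int>& main_memory) {
                  std::vector<int> local_store(kLocalStoreInts);
                  long long total = 0;
                  for (std::size_t offset = 0; offset < main_memory.size(); offset += kLocalStoreInts) {
                      const std::size_t n = std::min(kLocalStoreInts, main_memory.size() - offset);
                      // Explicit transfer step: nothing fetches this data on a miss for us.
                      std::memcpy(local_store.data(), main_memory.data() + offset, n * sizeof(int));
                      for (std::size_t i = 0; i < n; ++i) {
                          total += local_store[i];  // the worker only ever reads local data
                      }
                  }
                  return total;
              }

              int main() {
                  std::vector<int> data(200000, 1);  // larger than one local-store chunk
                  std::cout << sum_via_local_store(data) << "\n";
                  return 0;
              }
              ```

              On a cached CPU the same sum is a single loop over `main_memory`; the hardware moves lines into L1 behind the scenes, which is the difference guy is pointing at.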

  5. MechaCrash says:

    There is a video by Bob Chipman about the poor market performance of the PS3, and also the PSP. The original posting date was in August of 2010, the date on the actual video is because it’s a re-upload. Anyway, the super short version is “the PS1 and PS2 dominated not because they were so great, but because the competition was busy screwing up really hard, the PSP and PS3 are what happens when Sony faces a real fight.”

    While Sony seems to have fixed their worst problems, MS and Nintendo going right back to screwing up does seem to have given them another boost this generation.

    1. Joe Informatico says:

      Yeah, my issue with Bob’s argument is I was a good 5 years older than him during the NES’ rise to power, read video game magazines other than Nintendo Power at the time, and remember the real reason Nintendo utterly dominated console gaming in the mid-80s to early 90s: monopolistic corporate tyranny.

      The Nintendo Seal of Quality might have helped prevent alleged Crash of ’83-causers like Custer’s Revenge, but it also allowed Nintendo to block third-party cartridge manufacturing (until the likes of Tengen found a way around that), take a big chunk of third-party publishers’ profits (I remember one allegation saying it was as high as 60%, though admittedly that came from a magazine extremely critical of Nintendo’s business practices), and make it almost impossible for third-party developers to do business with competitors (i.e. Sega) without being denied Nintendo’s massive install base.

      The Nintendo fall from the throne in the mid-90s predicted Sony’s cock-up with the PS3 in some ways. Namely, the arrogance of being the front-runner. If you look back, it looks like every console manufacturer that sticks around long enough manages to bungle their third release (Nintendo’s been around long enough to do it twice):

      Sega held on in Nintendo’s world with the Master System, posed a serious challenge with the Genesis, and flubbed with the Saturn. The Dreamcast almost redeemed them but the damage had been done.

      Nintendo became king of the world with the NES, held on to the contested crown with the SNES, but the N64 was crushed by the PlayStation and Nintendo lost some 3rd-party publisher support. The GameCube held its own and the Wii was the early champion of the 7th generation before some petering out near the end, but the WiiU has been a far less stellar follow-up.

      Sony had a hit with the PlayStation, the best-selling home console ever with the PS2, nearly had their teeth kicked in for the first few years of the PS3, but seem to have a very healthy lead with the PS4.

      Microsoft found a place for the XBox, really made themselves a contender with the 360–and messed up the One’s release royally.

  6. gkscotty says:

    Great video, though I think the comparison to Dreamcast would be better off as a comparison to the Saturn – the Saturn having a dual-processor architecture that developers found difficult to deal with compared to the simpler and yet more powerful PS1 architecture. Oh and also the far too expensive launch price with limited software compared to a cheaper machine with better games.

    Those willing to put the effort in could do wonders with the Saturn architecture (Saturn Quake and Duke 3D for example) but a great many just did lazy ports if they did ports at all. (Saturn Doom was a shambolic port of the PS1 version that could be worse than the 32X version)

    Mad that Sony fell into the exact same traps that let them sail to victory against the Saturn, if for different reasons.

    1. Psy says:

      Well the problem with the Saturn was lack of development tools, dual CPU systems have existed since the 1980’s and were very common in workstations and servers when Sega made the Saturn.

      Yet Sega didn’t want to have their good programmers spending time coding tools rather than working on game engines so it took a long time before the Saturn got proper development tools. Once the Saturn did get decent development tools the Saturn only really had a niche market in Japan so all it did was give the Saturn a few more years in Japan as small 3rd parties jumped in at the tail end of its life there.

      Same could be said for the PS3, what it lacked was development tools so the developer doesn’t have to worry about the specifics of the hardware.

      1. Merlin says:

        I’m not familiar with the Saturn’s actual architecture, but the summary I’ll always remember is that it basically got shanghai’d into being a 3D box in the first place.

        As the story goes, it was designed as a logical upgrade to the Genesis and SNES: a system that was awesome at working with really pretty sprites. But when Sony started showing off the Playstation’s CUTTING EDGE 3D GRAPHICS TECHNOLOGY!!! (retrospective note: lol) Sega panicked about getting left behind by being perceived as “last-gen” right out of the gate. That left them with a short window of time and money to revamp the architecture into something that wouldn’t fart and explode if you told it to render a 3D model.

        Assuming that’s accurate-ish, I can see why it took them a while to get half-decent dev tools in order.

        1. Psy says:

          The Gigadrive (Saturn’s working title) was originally supposed to do 3D but Sega was aiming for a watered down Sega Model 1 arcade board yet the push to make it backwards compatible required a redesign (that and politics in regard to suppliers). Also the Saturn does not work in sprites, it has to fake them with flat textured polygons. What got developers confused was that the Saturn works with quadrilaterals rather than triangles since that is what Sega Arcade boards used at the time and Sega thought if it was good enough for Sega then it was good for the industry and everyone should simply learn to use quad polygons.

          What caught Sega off guard was the 3D performance of the Playstation, yet by this time Sega already had signed contracts with suppliers and they had already designed the VDP. To get the 3D performance Sega hardware engineers went with distributing the load, thus the two CPUs and the VDP getting split into two halves, each in its own separate processor.

          Yet if you look at today we have multicore CPUs with graphics cards hooked together through SLI which is no different than how Sega hardware engineers were trying to get more 3D performance out of the Saturn.

  7. Mephane says:

    Thanks for the transcript, Shamus, I really appreciate that. :)

  8. Zeta Kai says:

    But if that courier has a way of storing his kinetic energy on the downhill, he can release that energy on the highway on-ramp & get to the TriOp building in less than 30 minutes, deliver the package, & get a contract from Actio’s…

    1. ZzzzSleep says:

      Assuming he doesn’t get busted by the cops for riding on the highway…

  9. Peter H. Coffin says:

    Seconding endorsement for the transcript — I can read a lot faster than I can listen.

    Touching on the supercomputer aspect of the thing, there’s a lot of examples, even different ones, where the PS3 made impressive contributions to places where supercomputers typically play. The most famous of these is the US Air Force building a cluster of nearly 1800 of the things, and ending up with the 33rd most powerful cluster in the world for about a tenth the cost of the 34th most powerful cluster in the world. Other places where the PS3 helped out is the Folding@Home project, which was first an add-on application and then OS-installed option/feature that could be turned on. A significant portion of the F@H results came from PS3s. And finally, some fellow in 2007 turned a single PS3 into an MD5-cracker (well, hash-duplicator anyway) that got results in hours while then current PCs were still in the “It takes weeks or years” stage.

  10. Darren says:

    “The big three competing for our attention”

    Is one of those Nintendo? Because, as much as I love my WiiU, it’s difficult to say that they’re really a part of the conversation. Nintendo has gone from a cultural touchstone to more of a boutique company (which is fine, but let’s be realistic about the cultural impact). They’ve got the mobile market pretty well cornered, but I honestly don’t know what kind of impact that has.

    1. Patrick the sleep deprived says:

      Not so much in the US but they still hold a considerable market share in other markets. The North American market is still the brass ring, but if I was running MS I wouldn’t ignore the Asian market completely.

      Beyond that the Wii has had a major impact on Sony and Xbox. It’s been almost 8 years since the original Wii was released and Sony and MS are still trying to design some kind of user-friendly and functional motion sensor into their products. If Nintendo dies tomorrow their legacy and impact will be the next great battle ground between Sony and MS: the fight for motion design.

      1. Except I can’t think of a game that’s enhanced by the motion controllers apart from maybe Wii Sports. Every other Wii game I played that had any motion control used it as a gimmick that could’ve been replaced by a button press most of the time.

        The Kinect was neat for its hackability and some of the non-game uses people put it to, but other than rhythm and dance games, there hasn’t really been a killer game for motion control on any platform, has there?

        IMHO, it’s the fact that you can’t really have force-feedback with motion controls that stymie a lot of the games that have been tried with motion control. That is, you can’t swing your light saber and feel it hit your opponent’s, so your “swing” feels like it cut through nothing, even though the game is telling you that your weapon has been blocked and you’d better get ready to dodge or parry.

    2. Wide And Nerdy says:

      Their sales have picked up substantially lately owing to the launch of their tent pole titles and the lackluster PS4 and XBox One launches.

      And Totalbiscuit and his crew on the Cooptional Podcast were discussing (and I agree) that right now if you’re a PC gamer and you want to buy a console to expand your options for games, Wii U has the best console exclusives. Many of the console exclusives for XBox One and PS4 are really just “the other big console can’t have it” exclusives that still get a PC Port (helps that it’s easier to port to PC now thanks to the architecture.)

      I’m a PC Gamer and this is exactly what I’m thinking. Wii U is the smart console buy for me because of the console exclusives, the lower console price, and the lighthearted fun titles that are more popular on that console than they are on the others. (Remember when games were lighthearted and fun?)

      I also think the Amiibo is really going to catch on with kids.

      Add to that, Nintendo can afford this one failure (if a failure it ends up being), as Super Bunny Hop pointed out. They’re sitting on a ton of cash because Wii was successful and sold at a profit.

      1. Daimbert says:

        The issue with the Wii U for many people — or at least for people who are like me — is that I bought one for Wii Fit Plus, bought a couple of games for it that I never played, and when I look for interesting games for it in the stores I don’t find any. Sure, there are some lighthearted games, but often they’re all a bit TOO lighthearted to really play … and the ones that aren’t are likely on other systems.

        So the Wii U has to overcome that, but with the other systems out there it’s hard for me, at least, to muster enough interest in seeing if it has overcome it.

        1. Wide And Nerdy says:

          The Wii U is as of now profitable for Nintendo. They’ve recovered. And they had 15 billion squirreled away. Nintendo can stay the course now with its strategies. There’s no need for it to try to muscle into the stubbly hardcore space where the two big boys are coughing out 30 fps games.

    3. guy says:

      The thing with Nintendo consoles is that their big selling point is having Nintendo games. The Wii U did not have a very good initial lineup and thus sold poorly. Now that Super Smash Brothers and Mario Kart 8 are out, sales have been spiking.

      Pretty much exactly the same thing happened with the 3DS. You’d think they’d learn.

  11. Patrick the sleep deprived says:

    1) Competition, in any industry, is a good thing for consumers. So the console wars might appear problematic to consumers right now, but it’s far better than the alternative. EA has no competition for many of its titles, and their work speaks for itself. Be glad that Microsoft has Sony challenging them, and vice versa.
    2) This scenario isn’t really unique. Not even a little. This war has been played out in the Auto industry, lightbulbs, electric motors, the aforementioned VHS/Beta max and even AM/FM radio. This kind of ‘format war’ happens all the time.
    3) From my limited understanding of all things electronic, Cell architecture was/is an integral part of TV design. Probably something to do with streaming and/or On-demand type stuff. I imagine part of Sony’s logic in console design is somehow related to this as well? Maybe?
    4) I have played 5 titles on the PS4: MGS Ground zeroes, NHL15, AC Unity, AC Black Flag and MLB 14. Every single one except MGSGZ crashed at random times. No rhyme or reason, they would just lock up. So the PS4 seems to me just as unstable and unwieldy as its predecessor.

  12. Vegedus says:

    I think the good, ol’ Japanese market also had a good hand to play in the PS3’s survival. Microsoft spared no expense buying high profile JRPG exclusives and to pretty much no effect. They seem to have pretty much given up with the XBone. Meanwhile, some of the middle-big Japanese developers, like Atlus and Nippon Ichi stayed on the Sony team, while only the biggest ended up porting what was otherwise looking to be PS3 exclusives (MGS5:Ground Zeroes and Final Fantasy XIII). I dunno if something like the JRPG genre had any effect on the international market, given that they can barely be called AAA. Titles like Disgaea are relatively obscure, but make consistent profits. But at least domestically, the PS3 had a consumer base that was pretty much uncontested, what with the Wii having a different focus and market altogether, and the Xbox being shunned, perhaps out of simple xenophobia, cultural differences, or a lack of faith that the console would output games the Japanese market wanted.

    1. Joe Informatico says:

      Probably. Had Japanese developers maintained a high level of quality and innovation (or had their distributors released the good and innovative games, if they existed to Western markets), the draw of Japanese exclusives might have boosted the PS3’s numbers in the West. Alas, the former Chrono Trigger and Final Fantasy fans either outgrew JRPGs or weren’t impressed by more recent offerings and that market has become niche.

  13. Vermander says:

    I feel like the last generation of consoles had an amazingly long life. I bought my X-box 360 before I was married and I’m currently playing Dragon Age Inquisition on it while my oldest kid attends kindergarten.

  14. swenson says:

    It’s funny you should mention just how little programmers actually write code… seeing as in the past two days, I spent about five hours adding or modifying sixteen lines of code in total. I just checked. (okay, okay, so that five hours includes emails back and forth with the requestor and testing, but still)

    The point is that writing the lines of code isn’t hard, it’s figuring out what lines to write and where to put them that’s hard. And you can have all your fancy standards about formatting and naming stuff and organization, but the best tool for understanding code is still sitting down with the person who wrote it and praying they remember enough about it to tell you where to start looking.

    In related news, I do not envy anybody who has to write something for a PS3, because it sounds nightmarish. I am not good with the parallel processing. X(

  15. SlothfulCobra says:

    In retrospect, there was something weird going on with every console in the last generation. The PS3 tried something weird, the 360 kept exploding, and the Wii redefined everything.

    Sony’s learned their lesson, I think. There’s supposed to be a weird kind of backwards compatibility on the PS4, whereas Microsoft tried to lead with Big Brother watching you as a sales pitch.

  16. Dev Null says:

    Let's say you're trying to send a package. One courier uses a bike. He can take one box across the city in an hour.

    I’m not sure this is quite the analogy you were looking for? If the bike can do 1 an hour, and he works an 8-hour day, he does 16 in two days while the van does 10? So the bike is faster no matter how you measure it?

    I guess if we assume that he gets it there in an hour but still has to spend another hour getting back to base to get the next one then he’s slightly slower.

    1. Shamus says:

      Pffft. Yeah. It was supposed to be “a day”.

      Thanks. Fixed.

  17. Robyrt says:

    Ironically, the first 2 pieces of music I ever arranged were too complex to be played. It’s just that in this case, instead of shipping an unsingable song, I canned it halfway through development for something more sane. :)

    1. Neil D says:

      To sing the unsingable song,
      To strum the unstrummable chord,
      To tap the untappable rhythm,
      To play what nobody has scored

      This is my quest,
      My unquestable quest…

  18. The Rocketeer says:

    Shamus, do you still have that section on multi-threading saved somewhere? I’d be interested in reading it.

    I only have a layman’s understanding of hardware and coding, and while I find the machine really interesting I kind of have to re-learn everything about the PS3’s functions every time it’s brought back up.

  19. If you’d upgrade your Christmas Light Architecture to LEDs, you won’t burn the house down and probably won’t have to worry about tripping breakers. Keep up with Next-Gen, man! :)

    1. Exasperation says:

      The funny thing about this is that I’ve seen a couple of discussions about LED vs. fluorescent lighting and input latency recently.

      1. I don’t know about that, but the LED Christmas Light chains I’ve bought either don’t have limits on the number of strings you can daisy-chain, or they’re 10 or more (which is what people were doing with the old ones, fires or no).

        As for light bulbs themselves, I do notice that fluorescent bulbs do tend to need a bit of a warm-up before they’re at full brightness. The LED bulbs in my office ceiling fan seem to have no such difficulty.

  20. tmtvl says:

    It’s all about microkernels versus monolithic kernels.

  21. James says:

    Yeah, the 3rd time the 360 broke and I couldn’t get it fixed for free, I gave the middle finger to MS. By the time of my 3rd red ring, the PS3 was much cheaper and Sony had won the format war, so it seemed like the better choice. The PS3 has never once given me trouble, and my old PS2 from like 2002 still works fine, so no backwards compatibility wasn’t a deal breaker (would still like to have it). As for the PS4, I get why they can’t do PS3 games (the PS4’s architecture, unlike the Cell, was made by sane people), but how hard would it be to get the PS4 software to emulate PS2/PS1 games?

    1. CJ Kerr says:

      PS1 games would be trivial – you can get a perfectly good PS1 emulator running on a smartphone.

      PS2 games are harder, but Sony could do it easier than anyone else. The PCSX2 emulator works pretty well on modern PCs, and the PS4 is basically just a slightly mediocre PC with some interesting memory architecture.

      Since Sony know the exact specifications of the PS2 hardware, a team experienced in writing emulators could produce a PS2 emulator for PS4 pretty easily, and optimising for a very specific set of hardware would probably bring performance up to acceptable levels.

  22. Joe Informatico says:

    So with the last console generation, the home electronics manufacturer made a product that was a nightmare for coders, and the software company released hardware with design flaws. Meanwhile, the dedicated gaming company made a decent console with some innovative ideas and some good games for it, but had burned most of their bridges with 3rd-party devs and couldn’t build a decent library. Sounds about right.

    We’re almost at the point where home consoles will have an industry standard. Arguably, we’re already there: they’re all (except the WiiU?) boxes with similar specs made with off-the-shelf PC components, using the basic DualShock/XBox controller layout. The Steam boxes are going in a similar direction. If it wasn’t for these walled-garden OS ecosystems around each console it practically wouldn’t matter which one you bought. I once envisioned a future where a console standard had been agreed upon, and the hardware company (Sony) stuck to building consoles, the software company (MS) made OSes and non-gaming apps for them, and the games company (Nintendo) stuck to publishing games. Unlikely, but interesting to speculate.

    1. Robyrt says:

      That sounds like a great plan, but unfortunately the justification for spending billions on new hardware or new games is to get a slice of that middle “non-gaming apps” piece: the monthly subscriptions, the app store / console licensing fees, the targeted ad revenue. A single yearly subscription to Xbox Live brings in more profit for Microsoft than the Xbox and a dozen games. So the competing walled gardens are, from the stockholders’ perspective, a feature, not a bug.

  23. RCN says:

    Ah… I remember when the PS3 launched here in Brazil. It was R$ 8 000,00. That was roughly US$ 4 500,00. That’s not a typo. There’s no extra zeros in that figure. And yes, the import tariffs are ridiculously high here, but the 360 was “only” US$ 1 500,00 (the tariffs are high thanks to laws passed in the mid-90s in an attempt to nurture an internal gaming market… but that gaming market was smothered in the womb by our own culture, aping the US game-phobia, but multiplied. And yet the laws are kept in place thanks to our Senate even two decades later).

    These prices did fall off, but the PS3 continued to stay very unpopular until this new generation... where it stayed unpopular because Microsoft was smart enough to build factories here in order to dodge the most murderous tariffs while Sony just kept wishing the tariffs would go away. And now Nintendo just decided to cross us off their map, so the obvious winner of this generation, at least as far as my country is concerned, is the X-BONE. Actually, it is the PC, but if we’re going to only count the consoles…

  24. Volfram says:

    “This is not a variation on the old computer paradigm. This is something new and strange. Processing clusters like the Cell are more commonly used for large-scale things: Locating and factoring massive prime numbers, running huge simulations, and rendering farms.”

    Well tough luck, because that’s where video game development is going these days. Clock speeds are becoming increasingly limited (six years ago, when I was learning about them, it generally took multiple clock ticks for a signal to propagate from one end of a CPU to the other, let alone now when the nominal clock speed is nearly double what it was then and the traces are even narrower), so we can’t keep jacking up the processor speed. That statement has been true for nearly a decade.

    What we CAN do is keep throwing more cores on. It’s complicated, sure, but you yourself wrote an entire blog post about how much more powerful a GPU is than a CPU when handling the right kind of calculations, even though the average GPU core is often as much as 1/20th as fast as the average CPU core, and doesn’t have things that CPU programmers take for granted, like branch prediction to make if/else statements perform at a reasonable speed. And if you architect your program correctly, it should all fall into place.

    I credit my EE classes as being critical to my understanding of object-oriented programming. I don’t care what goes on inside a logic module, all I care is that I feed it inputs and it gives me correct outputs. Whether it’s creating those outputs via a rat’s nest of nanotransistors, a jar of magic pixie dust, or advanced quantum superposition calculations doesn’t really matter. Object-oriented paradigm is the same way.

    And threading is also the same way. You know how you make a 32-bit adder? Slap a pair of 16-bit adders together. Which themselves were made by slapping a pair of 8-bit adders together, each of which is a pair of 4-bit adders, each of which is a pair of 2-bit adders, each of which is a pair of 1-bit adders, each of which is just a binary accumulator that translates the number of inputs that are turned on into a 2-bit binary value.
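
    The adder hierarchy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `full_adder`, `add_bits`, `to_bits` and `from_bits` are invented here, bits are stored least-significant first, and the two halves are chained ripple-carry style (real hardware often uses faster carry schemes).

    ```python
    def full_adder(a, b, carry_in):
        """1-bit adder: count how many of the three inputs are on (0..3)
        and return that count as a 2-bit value (sum bit, carry-out bit)."""
        total = a + b + carry_in
        return total & 1, total >> 1


    def add_bits(x, y, carry_in=0):
        """Add two equal-length bit lists (least-significant bit first).

        An n-bit adder is built from two n/2-bit adders: the low half runs
        first and its carry-out feeds the high half.
        """
        n = len(x)
        if n == 1:
            s, c = full_adder(x[0], y[0], carry_in)
            return [s], c
        half = n // 2
        low, carry = add_bits(x[:half], y[:half], carry_in)
        high, carry = add_bits(x[half:], y[half:], carry)
        return low + high, carry


    def to_bits(value, width):
        # Hypothetical helper: integer -> list of bits, LSB first.
        return [(value >> i) & 1 for i in range(width)]


    def from_bits(bits):
        # Hypothetical helper: list of bits, LSB first -> integer.
        return sum(bit << i for i, bit in enumerate(bits))
    ```

    Calling `add_bits(to_bits(a, 32), to_bits(b, 32))` goes through the 16, 8, 4, 2 and 1-bit levels without the caller ever seeing them. Note that in this sketch the high half still waits for the low half’s carry, so the composition is modular but not automatically parallel.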

    1. Shamus says:

      I agree 100% that parallel processing is the future. But you can’t make that future happen instantly by giving game developers multiple cores and telling them to figure it out. These kinds of changes need to percolate through the infrastructure. New tools. Language extensions. Coding practices. Books. Just getting our pool of coders up to speed will naturally take years.

      1. Volfram says:

        Hmm, the post did come out harsher than I had intended, but it was also marked for moderation, so I decided to leave the decision on whether or not it should appear to the blog owner. Who is generally better at deciding what he does and doesn’t want on his blog than I am.

        The point of my post was that, given proper coding, the CELL architecture actually makes sense. It was intended to be near-infinitely expandable, so that if you want to add more processing power, you just add another CELL chip to the stack. I actually have an idea bouncing around in the back of my head similar to this for multiplayer gaming. Say a Minecraft-like game where every player who connects to the server alleviates the added server load by contributing their own processing power to the cluster and increasing the aggregate power of the server itself.

        I don’t think it was intended as a wall to keep developers in or out, though Sony execs (being the idiots they are) may have seen that as an added bonus. I think the CELL architecture was really just ahead of its time. It’s a vicious technological cycle in that the techniques for massively parallelized programming required to properly take advantage of CELL-like processors won’t be widespread until the hardware itself is also widespread, but the hardware won’t be desirable until the techniques to take advantage of it are also widespread.

        1. guy says:

          It sounds like someone needs to take many millions of dollars and write a threading compiler. Even a crude compiler that takes straight-line code and parallelizes it in ways that provably won’t break would be massively useful.

          1. Volfram says:

            Unfortunately, there are some algorithms which are simply not parallelizable, and automatically parallelizing arbitrary programs is tricky to say the least. I namedrop it an awful lot, but once again, D comes to the rescue.

            It’s a bit of a read at 17 pages, but this article by one of D’s co-creators outlines several reasons why threaded programming in D is easier than in most languages. (And now that I’ve dropped a link to it on this forum, I shouldn’t have as hard a time finding it again next time I want to brush up.)

            Shamus once wrote an article about a programming language designed for game development. I am of the opinion that D is, if not that language, awfully close.

          2. 4th Dimension says:

            That is a VERY VERY hard problem to solve, since one of the key ideas when we talk about single-threaded code is that at any one time only one command is being executed, and it is executed only after the commands above it have completed their tasks.
            But to parallelize that you would need to break up this ordered list of commands into multiple lists, each of which doesn’t need any of the results produced by the other ones. That is usually quite impossible, since you usually need the results of the previous code for the next part.
            The compiler might be able to split the single thread into multiple ones if it cuts up the program into single commands, and then it might be able to see which commands can be run now since they don’t need future data. But that could backfire if the programmer tries to do something fancy with manual memory access. And most importantly, since you will have millions of different tasks, scheduling their execution and synchronizing the results would incur a massive overhead.
            The only other location where you should be able to cut up your program into threads would be in the loops, but even then, if the current iteration requires the result of the previous one, the problem becomes almost impossible.
            Simply put, converting code written for single-threaded execution into code that uses parallelization is a MASSIVE pain.
            Finally, our brains don’t work that way. When you solve a problem you usually solve A, then B, both of which give you a result you need for C; you don’t solve A and B at the same time and then, after you finish, solve C.
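
            The difference between a loop that can be cut up and one that cannot can be sketched as follows. This is a minimal Python illustration, not anything from an actual compiler: the function names are invented, and because of CPython’s global interpreter lock the thread pool here shows the dependency structure rather than a real speedup.

            ```python
            from concurrent.futures import ThreadPoolExecutor


            def scale(value):
                # Each call depends only on its own input.
                return value * 2


            def scaled_serial(values):
                return [scale(v) for v in values]


            def scaled_parallel(values, workers=4):
                # Safe to split up: no iteration reads another iteration's result.
                with ThreadPoolExecutor(max_workers=workers) as pool:
                    return list(pool.map(scale, values))


            def running_total(values):
                # Loop-carried dependency: step i needs the total from step i - 1,
                # so the iterations cannot simply be handed to separate threads.
                totals, total = [], 0
                for v in values:
                    total += v
                    totals.append(total)
                return totals
            ```

            `scaled_parallel` gives the same answer as `scaled_serial` no matter how the work is divided, while `running_total` has to be restructured by hand before it can be spread across cores.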

            1. guy says:

              Yeah, I am aware of how ridiculously hard of a problem it is, but the thing is that it’s hard to do parallelization by hand too. We’ve experimentally verified that if you tell most game designers that they need to solve that problem by hand to get maximum performance, they just aren’t going to do it. We’ve already got some systems for identifying dependencies to use in optimizing compilers and chip-level pipelining that could be applied to multithreading, and manual multithreading has interthread synchronization tools.

              I don’t expect that it’s even possible to write one that threads as well as doing it by hand properly, but one that does even some of it would be a huge help, and using more cores allows the game to soak a pretty high overhead before actually being slower. Or it could be solved at a higher level where people make something like Hammer or Unity that hides the threading from the game designer.

      2. Steve C says:

        If movies have taught me anything about programming it is that all a programmer really needs is motivation. So all we really need to do is put all the programmers together and threaten to kill them if they don’t solve parallel processing. They’ll naturally work together in parallel and solve it in no time. Maybe the bad programmers will be casualties but it will be done in 2hrs tops.

        I volunteer you all as tribute.

        1. silver Harloe says:

          2 hours?

          make it a montage.
          5 minutes. tops.

    2. Daemian Lucifer says:

      Sure, it is the way of the future. But even now, when we have had multi-core processors for over a decade, only a fraction of games utilize them, and only a fraction of those utilize them well. Making a multi-core machine back when the PS3 came out, and one that only plays games, is akin to commercially releasing the first Oculus (the one that didn’t have motion tracking) and expecting it to compete with the rest, only without the gimmick of virtual reality. Basically, it was not a smart decision made by those that know the technology.

  25. Gilfareth says:

    Only partly related to the video, but… Oh man, I hadn’t listened to 512k yet. That is a damn good song, Shamus! I really hope you keep making stuff like this.

  26. Dork Angel says:

    A very interesting and informative video. I bought my PS3 after I heard the next version wasn’t going to have backwards compatibility, so I got one of the last of the original fat ones. Ironically, since I got it, I’ve never had a PS2 game in it, even though I have a couple I’ve never played and several I’ve never finished. PS2 sits in my room unplayed as well. The PS3 served me well (though it was a little short on memory, meaning deleting a game and its DLC once I had finished it) until late last year when it refused to switch on. Its last year, however, let me play some amazing games that I’d missed out on when they were released (most of what I was playing was COD, Fallout, Skyrim and Tekken) and I experienced Last Of Us, Bioshock 1 & Infinite, Dishonored, ME2 & 3, Metro Last Light and Spec Ops – The Line. Got it re-flowed (which got it working again) but the guy said the graphics processor was on its way out and kindly didn’t charge me. Still works for now and I’ve only a couple of games left to play on it: Dead Island Riptide to finish, Deus Ex – Human Revolution and Resident Evil 6. I now have a PS4 though, and with the new COD, Alien Isolation and Shadow of Mordor to play, they may have to wait…

  27. Don’t focus too much on that ring bus. That’s purely an implementation detail; it doesn’t really affect software developers. The PPE sits on that same ring bus. Lots of other chips – including modern Intel CPUs – use ring busses internally (because they have all sorts of advantages in silicon)

    One of the interesting things about the last gen of consoles is that not many people cottoned on to the fact that the processors in them – even for the time – were pretty crap. Yeah, 3.2GHz looks pretty good on the surface – but they’re in-order multithreaded cores (the PPE in the Cell bears quite a close resemblance to the core that the 360’s Xenon uses). Couple this with relatively high latency GDDR3 (360) or XDR (PS3) memory, and you have a processor with pretty “meh” performance.

    To compensate for this, of course, you end up with multiple cores. In that light, the decision in leaning towards the architecture either Microsoft or Sony chose is less obvious – do you have three cores with six mediocre performance threads to run everything, or do you have one core with two mediocre performance threads to run control logic, and six little cores with blistering compute performance (but slightly weird-ass architecture) to crunch data?

    I think Sony’s designers jumped the gun on concurrency a bit. It was clear at the time, at least to those deeply involved in semiconductor design, that the continual increases in single threaded performance were not sustainable at anywhere near the rate they had been previously, and that multiple cores and concurrency were going to become incredibly important. The kind of design that the SPEs require is very much commonplace today – GPUs have grown up to be nearly as general purpose as the Cell was in practical use, and developers make greater and greater use of this in the quest for better graphics and new effects.

    Before ~2005, you could rely on increases in CPU performance to let processors juggle more and more objects and push them down to the GPU. The end of single thread scaling was pretty much the end of that – yes, you get some extra performance, but memory latency is basically stalled (therefore increasing in relative terms) and processor speed isn’t going up that quickly (while GPUs are inherently parallel and their performance does continue to scale with silicon area). Your early 2000s engine contains a hierarchical tree of objects pointing to each other, forming a scene graph (and often uses octrees – cubes in cubes in cubes… – to speed up culling). This worked well enough at the time, but with increasing memory latency, if you do that then your processor spends all of its time stalled, and your GPU gets bored.

    Those octrees look really good on paper – you just culled that big block and discarded half of the objects in one operation? Great! But they don’t work so well in practice – because there are a lot of objects which they don’t discard, and each time you descend to a lower tree node you miss in caches and your CPU spends ~150 cycles waiting for your RAM to come back to it with the data. It actually turns out to be quicker to just dump every object into a big, linear, cache-friendly array and just iterate through it in order – ~5 cycles/object is a lot faster than 150 cycles/object.
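
    To put rough numbers on that trade-off, here is a toy cost model using only the two figures quoted above (about 5 cycles per object for the linear scan, about 150 cycles per tree node visited). The function names are invented, and the model ignores everything else a real engine does.

    ```python
    CYCLES_PER_LINEAR_OBJECT = 5    # figure quoted above
    CYCLES_PER_TREE_NODE = 150      # figure quoted above (one cache miss per node)


    def linear_scan_cycles(num_objects):
        return CYCLES_PER_LINEAR_OBJECT * num_objects


    def tree_walk_cycles(nodes_visited):
        return CYCLES_PER_TREE_NODE * nodes_visited


    def break_even_nodes(num_objects):
        # The tree only wins if it touches fewer nodes than this.
        return linear_scan_cycles(num_objects) / CYCLES_PER_TREE_NODE
    ```

    Under these assumed costs, a scene of 3000 objects costs 15,000 cycles to scan linearly, so the octree has to get away with visiting fewer than 100 nodes (one in thirty) before it comes out ahead.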

    And you do this, and then you discover new ways to exploit parallelism – that big list? Hand 1/Nth of it to each of your N cores. And then you realise that, actually, this problem is embarrassingly parallel – so you can hand it to the GPU, and now you’re also generating the command list that the GPU needs to read in GPU memory, and the structure that points the GPU at that command list and tells it how many commands are in it, and so now the bit of your render loop from frustum culling on looks like “GPU, run the culling program; wait for it to finish; then render from this buffer in your local memory”, and you’ve just freed up 10% of your CPU time to do actually interesting things like, say, simulate a game world.
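
    The “hand 1/Nth of the list to each of N cores” step can be sketched like this. It is a simplified illustration, not engine code: the scene layout (a flat list of bounding spheres), the single culling plane standing in for a full six-plane frustum, and the names `cull_chunk` and `cull` are all assumptions made for the example. Python threads are used only to show the partitioning; CPython’s global interpreter lock means this will not actually run faster.

    ```python
    from concurrent.futures import ThreadPoolExecutor


    def cull_chunk(spheres, plane, start, end):
        """Return indices in [start, end) whose sphere is not fully behind the plane.

        Each sphere is (x, y, z, radius). plane is (nx, ny, nz, d) with a unit
        normal; points with nx*x + ny*y + nz*z + d >= 0 are on the visible side.
        """
        nx, ny, nz, d = plane
        visible = []
        for i in range(start, end):
            x, y, z, r = spheres[i]
            if nx * x + ny * y + nz * z + d >= -r:
                visible.append(i)
        return visible


    def cull(spheres, plane, cores=4):
        """Split the flat list into contiguous chunks and cull each one separately."""
        n = len(spheres)
        step = max(1, (n + cores - 1) // cores)  # ceiling division, never zero
        bounds = [(s, min(s + step, n)) for s in range(0, n, step)]
        with ThreadPoolExecutor(max_workers=cores) as pool:
            parts = pool.map(lambda b: cull_chunk(spheres, plane, *b), bounds)
            return [i for part in parts for i in part]
    ```

    Because no chunk reads another chunk’s results, the chunks can run in any order on any core, which is what makes the problem embarrassingly parallel.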

    That last bit’s a bit beyond most engines yet, and it’s hard and not possible to do for everything (but certainly it’s an approach which can be applied to most objects), but it’s kind of the eventual endpoint of this slow evolution we’ve been having into exploiting parallelism in more places, and wasting CPU time less where we can’t, and realizing that often these two things go hand in hand, and also realizing that the O(N^2) algorithm is faster than the O(N) algorithm when there’s a big hidden constant in front of the N in the latter that isn’t in the former and your value of N is on the order of ~500-1000.
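
    The point about hidden constants can be made concrete with a toy cost model. The constant of 2000 below is an invented number standing in for per-element overhead such as cache misses; it is not measured from anything, and the function names are hypothetical.

    ```python
    def linear_cost(n, constant=2000):
        # "O(N)" algorithm with a large hidden per-element constant.
        return constant * n


    def quadratic_cost(n):
        # "O(N^2)" algorithm with a per-step cost of 1.
        return n * n


    def crossover(constant=2000):
        # Smallest n at which the quadratic algorithm stops being cheaper.
        n = 1
        while quadratic_cost(n) < linear_cost(n, constant):
            n += 1
        return n
    ```

    With that assumed constant the quadratic algorithm is cheaper for every N below 2000, which comfortably covers the 500 to 1000 range mentioned above; the asymptotically better algorithm only pays off past the crossover.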

    It’s all about the data, stupid (or rather how you think about that data), but developers had only started realizing that in 2005 and it took them another 5 or so years to really get it into their heads and fix their code. That’s why the PS3 started properly coming into its own in 2010 or so – especially for exclusives – because developers were really starting to get to grips with that on PC and therefore on consoles. (It didn’t help cross platform titles as much because developers getting to grips with the SPEs just revealed the weakness of the non-unified-shader architecture GPU: You pull load off the vertex shader side and then the fragment shader side bottlenecks, and there’s nothing you can do to relieve it there, and the vertex shader cores are just sitting idle.)

    Microsoft decided to focus on the comfortable then rather than the uncomfortable but more capable future. They made the right decision – they’d be ahead for the first many years of their console’s life.

    I guess Sony’s hardware designers – not being software designers – underestimated the time that transition would take.

  28. Shamus, there is something odd about the voice in the video.
    Almost like the left and right channel are out of phase, did you apply a faux stereo effect to the vocal?

    It may not sound so bad on speakers but with headphones it doesn’t sound so good (voice inside head effect).
    Having the voice mono (identical in left and right) would sound better.

    It’s also possible that YouTube’s audio codec is kind of messing things up (if there is a wide stereo effect applied to the voice).

    It’s not horrible, it’s just that I get a feeling of “gah, darnit Shamus, stand in-front of me when talking not inside my skull” *laughs*.
    The voice is also slightly shifted to the right so it’s not really centered.

    Just something to keep in mind for the next video.

    PS! While editing your video/recording dialog make sure to turn off any “Surround Sound” processing/effect your soundcard or headphones might have.
    The voice in the video might sound better using Dolby Headphone processing or similar but will sound “phasy” with normal stereo/headphones.

    Anyway, I’m just nitpicking (I’m far more tolerant to visual glitches/issues than audio ones in media).

    1. Shamus says:

      The video editor I have is annoying in that giving it mono audio ends up with all the sound in one ear. So I had to make it stereo, and since I was doing that I figured I’d make it more “interesting”, since some people claim mono is awful, etc.

      That’s the problem with being an audioslob. I don’t know what audiophiles will like or hate.

      We’ll see what I come up with next time around.

      1. *nod* I see.

        If you are able to move it a little more to the center and forward.

        Basically just duplicate the mono audio in left and right channel, then apply a stereo reverb if you have that.
        That way the voice will be front and center but there will be a sort of “room” feel to it as there will be a reverb that is slightly different in the left and right.

        That should sound fine regardless if people are listening using speakers/headphones/virtual surround/whatever.

      2. It sounds like you put the two channels out of phase. I generally have Dolby ProLogic II enabled on my system, and DPLII interprets “180 degrees out of phase” as rear.

        So I had to turn that off, because your voice was coming from behind me :-)

        1. Shamus says:

          At one point I saw a YouTube tutorial saying to turn mono into stereo, you should move the channels a tiny bit out of sync and invert one of them. Not knowing any better (and not caring one way or the other myself) it sounded like the thing to do.

          Spoiler: The next video will have the same problem. Just finished 15 hours of video editing on it, so re-doing the audio is not practical at this point.

          But for the video after that, I’ll do something different.
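
          For anyone curious why the inverted channel causes trouble, here is a minimal sketch. It assumes a made-up test tone in place of the real voice track, uses invented function names, and ignores the small time offset mentioned above, so it shows the extreme case rather than exactly what is in the video.

          ```python
          import math


          def mono_to_stereo(mono):
              # Plain dual mono: identical left and right, so the voice sits dead centre.
              return list(mono), list(mono)


          def invert(channel):
              # Flip the polarity (180 degrees out of phase).
              return [-s for s in channel]


          def downmix(left, right):
              # What a mono speaker (or a mono downmix) would play.
              return [(l + r) / 2 for l, r in zip(left, right)]


          # Hypothetical test signal: 440 Hz sine, 48 kHz sample rate, 480 samples.
          voice = [math.sin(2 * math.pi * 440 * n / 48000) for n in range(480)]

          left, right = mono_to_stereo(voice)
          same = downmix(left, right)                # identical to the original voice
          cancelled = downmix(left, invert(right))   # every sample is zero
          ```

          With one channel inverted, anything that sums the two channels loses the voice entirely, and surround decoders that look for out-of-phase content are likely to treat it as a rear signal. Duplicating the mono track into both channels avoids both problems.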

  29. arron says:

    As someone who does a lot of Monte Carlo scientific computing, I’m keeping a close eye on using graphics cards for parallel computing (GPGPU).

    I’m also interested in micro-boards like this that have a parallel processing setup that one could cluster together to give a lot of processing power for the same power requirements as your desktop PC.

    It is an interesting time in having access to such hardware for doing computing on. Once upon a time, this would fill a room, and now it can be a stack of circuits in a box on your desk.
