Let’s Code Part 3

By Shamus Posted Thursday Dec 9, 2010

Filed under: Programming 42 comments

Part 3 is up. Check it out, then my comments below:

Ugh. The example Microsoft code Michael shows us for DirectWrite is abominable. That’s about what I remember for DirectX programming as well. HUGE variable types, all over the place, mixed case, and cumbersome syntax. The end result is that even simple lines of code end up being very long, which is bad for readability. This is probably the #1 reason I work in OpenGL. I evaluated both in the 90’s, and found OpenGL to be far, far nicer to work with. Since then OpenGL has (reportedly) struggled a bit to keep up with the cutting-edge stuff. I suppose this would be bad if I rode the cutting edge, but I’d rather have it be slightly more cumbersome to load a pixel shader in one part of my program than have the ENTIRE program look like a wall of incomprehensible gibberish.

It’s really interesting how much my recent work is matching up with what Michael is doing this week. I said last week that I wrote my own program to produce Drawn to Knowledge. If you remember, that was this video:


Link (YouTube)

My program isn’t nearly as clean or as polished as what Michael is doing. It took me over three days to create, and is still rough to the point of being unfit for public viewing. But I had to deal with both the file-format question and the font-drawing stuff.

For file writing, I took the easy and cheap way out. As Michael said, you have to remember to mark your files somehow so that they will be readable in the future. Like, if you are making a save game system for your game, and later in the project you decide to add a bit of data for keeping track of how long the player has played this particular character. If you don’t have a way of keeping track of versions, then the newer version of the game won’t be able to read old save game files. It will look for that “time played” stat, and (because it’s an old savegame) find some other bit of data in that spot. This will throw off reading of the file and generally lead to some sort of hilarious failure.

For my own saves, I need to save all of these little doodles. They’re actually pretty small. (The entire presentation you see above is just ~317k.) So I took the lazy way out and gave the file a needlessly huge header. How that works is this: The project settings record stuff like what background I’m using and what line color I’m using. (The next video will show this off a bit more. It’s done in purple “crayon” on white paper.) That bit of information is actually just a few bytes. But when I save the file I write a great big 2k block of (mostly empty) data at the top of the file. If I add some feature in the future then I can just throw that data in that big empty space and not worry about the header size changing. It’s wasteful to do things this way. But hey: two thousand bytes. Screw it.
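To make the padded-header idea concrete, here is a minimal sketch in Python. The field layout (one byte for a background id, three bytes for the line color) is purely an assumption for illustration, as are the function names; the real program's layout isn't shown here.

```python
import struct

HEADER_SIZE = 2048  # the "great big 2k block" at the top of the file

# Hypothetical settings layout: background id (1 byte) + line color as RGB (3 bytes).
SETTINGS = struct.Struct("<B3B")


def write_project(background, line_color, doodles):
    """Return file bytes: settings, zero padding up to 2k, then the doodle data."""
    settings = SETTINGS.pack(background, *line_color)
    header = settings.ljust(HEADER_SIZE, b"\x00")  # unused space stays zeroed
    return header + doodles


def read_project(blob):
    """Read the known settings from the front of the header and skip the rest."""
    background, r, g, b = SETTINGS.unpack_from(blob, 0)
    return background, (r, g, b), blob[HEADER_SIZE:]
```

Because the padding is all zeros, a setting added later would read back as 0 from older files, so this only works if 0 is a sensible default for every new field.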

Here is a shot of the program used to produce the video:

chalkboard1.jpg

The font business was more annoying. I don’t actually need fonts for my program to do any drawing. That’s really my handwriting, although messier than my usual handwriting because I’m still getting used to the idea of a tablet. But for debugging I DO need to draw some stuff on screen. When something malfunctions, it’s helpful to have the program able to print text to the screen so it can tell me about what’s going on. Luckily, I didn’t need to worry about it looking pretty, since it’s just for my own use.

And because I know people will ask: The next Drawn To Knowledge will probably come out next week. I’m still thinking about when I should put it out, how often, what topics I should cover, how it should look, and how much time I’m willing to put into it.

 



42 thoughts on “Let’s Code Part 3”

  1. Ooh, net neutrality! Exciting stuff.

      1. MichaelG says:

        I hope you say something about “regulatory capture”. You know, where they *say* they’ve passed net neutrality, but in fact have gotten cozy with industry and written something with huge holes in it. Over at http://www.techdirt.com/ they’ve been pretty skeptical of the FCC.

      2. Amarsir says:

        Don’t do net neutrality. You won’t like the conversation that arises unless you’re harboring a desire to turn these discussions into Shamus’s Conspiracy Corner. (In which case you might as well be timely and cover WikiLeaks first…)

        1. Shamus says:

          You’re right, it is a hot-button issue and filled with rhetoric. My video is more of a gentle “here is what people are arguing about” and less “here is what I think about it.”

          We’ll see how well it goes.

          1. Amarsir says:

            Shamus I have 100% confidence in your ability to present an impartial and fact-based summary. And the people who read you are about as thoughtful and reasonable as you can get on the open web. So please don’t think I was impugning. I’m just saying I intend to duck and cover. :)

    1. Andrew B says:

      OK, I’m impressed you got that from the screen shot. I was staring at it and thinking: “huh, I don’t recognise that from the previous video”.

  2. Specter says:

    frequency: once every 2 weeks
    coverage: anything hard to understand for laypeople you can explain in a simple way :D

  3. Irridium says:

    I have an idea for a topic.

    Error messages, and how much they suck due to how little they explain the actual problem.

    1. Daemian Lucifer says:

      I agree.

  4. Van Tuber says:

    It’s alright Shamus; you don’t have to start up yet another project. You can have some free time. We’ll forgive you. (mostly)

  5. Scott says:

    I want this program; I think it could be an effective teaching tool for any subject (not just the ones you are using it for). Are there any plans to release this?

    1. Rustybadger says:

      What he said. But for Mac, please.

  6. Jamfalcon says:

    Have you considered pitching Drawn to Knowledge to the Escapist? It seems like the sort of thing they’d be interested in running, and something that a large chunk of their audience would watch.

    1. AGrey says:

      well, this is what i get for not reading all the comments first… (see below)

      anyway, I am completely in favor of this

  7. SteveDJ says:

    Commenting on save files – wouldn’t using XML also solve the problem of expansion? When you have new data to save, it just goes in a new tag/attribute/whatever — something that is basically ‘ignored’ by the old program.
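A minimal sketch of that idea, using Python's standard `xml.etree.ElementTree`. The tag names (`save`, `name`, `level`, `time_played`) and the function names are made up for illustration:

```python
import xml.etree.ElementTree as ET

# A save written by a newer version of the program, with an extra tag.
NEW_SAVE = (
    "<save><name>Hero</name><level>7</level>"
    "<time_played>3600</time_played></save>"
)


def load_save_old(xml_text):
    """An 'old' reader: it only asks for the tags it knows, so extras are ignored."""
    root = ET.fromstring(xml_text)
    return {"name": root.findtext("name"), "level": int(root.findtext("level"))}


def load_save_new(xml_text):
    """A 'new' reader: falls back to a default when an old file lacks the tag."""
    root = ET.fromstring(xml_text)
    data = load_save_old(xml_text)
    data["time_played"] = int(root.findtext("time_played", default="0"))
    return data
```

The old reader never looks for `time_played`, so the new file loads fine; the new reader supplies a default, so old files load fine too.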

    1. Zock says:

      You don’t even have to go XML, but can use binary TLVs instead. This is a common concept in protocol design (and basically what Roger describes below with different terminology).

      PS. Please don’t advocate reserving space in front of a package (in your case a save file) for future purposes. This has proven to be a non-future-proof method time and time again, and has caused many a headache for people designing and extending network software.
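For anyone who hasn't met TLVs (type, length, value), here is a rough sketch in Python. The field widths (16-bit type, 32-bit length, little-endian) and the type numbers are assumptions picked for the example, not any particular protocol:

```python
import struct

RECORD_HEADER = struct.Struct("<HI")  # type: uint16, length: uint32 (assumed widths)


def pack_records(records):
    """Serialise (type, value_bytes) pairs back to back."""
    return b"".join(RECORD_HEADER.pack(t, len(v)) + v for t, v in records)


def read_records(blob, known_types):
    """Walk the records, keeping known types and skipping unknown ones by length."""
    found = {}
    offset = 0
    while offset < len(blob):
        rtype, length = RECORD_HEADER.unpack_from(blob, offset)
        offset += RECORD_HEADER.size
        if rtype in known_types:
            found[rtype] = blob[offset:offset + length]
        offset += length  # the length field lets us hop over anything we don't understand
    return found
```

An old reader that only knows types 1 and 2 will step straight over a type 99 record written by a newer version, which is the whole point.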

      1. decius says:

        I don’t understand why you need to reserve space for later, or even why it would be advantageous. Surely the header will include at least
        A) The version information of the file.
        and
        B) Some indication that the end of the header has been reached. (xFFFF…)?

        The software needs to know
        A) Do I know how to parse this version of file?

        If the file is too different, whether it’s a later version or an earlier, no-longer-supported one, then refuse to load it.

        Loadfile(file)
        if ver=0.95 then parsefile(file, 0.95)
        else if ver=1.0 then parsefile(file, 1.0)
        else if ver=1.1 then parsefile(file, 1.1)
        else barf("File version not supported or file corrupt")

        Again, I’m not a coder, but a math geek.
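The pseudocode above can be sketched as runnable Python along these lines. The version numbers are integers here rather than 0.95/1.0/1.1 (comparing floats for equality is fragile), and the two-byte version field and the per-version layouts are assumptions made up for the example:

```python
import struct


def parse_v1(body):
    """Version 1 stored only the level."""
    (level,) = struct.unpack_from("<I", body, 0)
    return {"level": level, "time_played": 0}


def parse_v2(body):
    """Version 2 added time played after the level."""
    level, time_played = struct.unpack_from("<II", body, 0)
    return {"level": level, "time_played": time_played}


PARSERS = {1: parse_v1, 2: parse_v2}


def load_file(blob):
    """Read the version from the first two bytes and hand off to the matching parser."""
    (version,) = struct.unpack_from("<H", blob, 0)
    parser = PARSERS.get(version)
    if parser is None:
        raise ValueError(f"File version {version} not supported or file corrupt")
    return parser(blob[2:])
```

Adding a version means adding one parser and one dictionary entry; anything unrecognised is refused rather than misread.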

  8. AGrey says:

    I’m a regular visitor to the escapist, and watch almost every video series posted there regularly (I love extra credits particularly)

    Are there any plans to submit this to them as a regular series for them to run?

    movie bob just got a second series, and they already publish your articles. This seems like something right up their alley.

    if there’s anything you need in terms of a write-in campaign to make it happen, just let us know!

  9. Rodyle says:

    Oh boy… Me and a friend of mine also had issues with that save stuff, and that was for a relatively easy program (we dubbed it ‘paintshop noob’)…

    This question is most likely going to sound totally stupid to anyone with any kind of real experience in programming, but wouldn’t it be possible to insert a try/catch, and just ignore the whole time played aspect if it doesn’t work?

    1. X2-Eliah says:

      That would work with maybe one new tag/parameter.. Once you introduce five or more, the try-catch will get incessantly cluttered, compared to a switch or whatnot.

    2. MichaelG says:

      If new data is only added to the end, you could define zero as the default value for all your new stuff, and it would work OK.

      Or you could do what Microsoft does, and use the length of the data structure to indicate which version it is. The new structures just add fields, and since the size increases, the code knows which version it’s dealing with.

      If you are inserting data into the middle of the structure, or changing the meaning of an old field, then you have problems. If it’s just unformatted data, there’s no way for the loading code to know what has happened. It will misinterpret the new values, or even everything after the inserted value.

      XML is a better solution because you can just add new attributes that weren’t in the old file. It’s more trouble to read though. A version number at the start of the file, so you know which version you are dealing with, works fine too. So does Shamus’s solution, as long as he never changes the meaning of an old field.
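Here is a rough Python sketch of the "length tells you the version, missing fields default to zero" approach. The three fields (level, gold, time played) and the function names are hypothetical; the only rule is that new fields get appended at the end:

```python
import struct

# Hypothetical current layout: level, gold, time_played (newest field last).
FIELDS = struct.Struct("<III")


def save_settings(*values):
    """Write a 4-byte size prefix followed by however many fields this version has."""
    body = struct.pack("<%dI" % len(values), *values)
    return struct.pack("<I", len(body)) + body


def load_settings(blob):
    """Read the size prefix, then zero-fill any fields an older writer didn't have."""
    (size,) = struct.unpack_from("<I", blob, 0)
    body = blob[4:4 + size]
    body = body[:FIELDS.size].ljust(FIELDS.size, b"\x00")
    return FIELDS.unpack(body)
```

A file from an older version with only two fields loads with `time_played` as 0. Insert a field in the middle, though, and everything after it is misread, exactly as described above.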

  10. Pyroka says:

    ‘Microsoft, are you really forcing all of your programmers to write “static_cast<int>(A)” instead of “(int) A” now?’

    This I have an issue with: this is not a Microsoft thing, it’s a C++ thing, and it’s a good thing. C++-style casts (static_cast, dynamic_cast, const_cast and reinterpret_cast) have many benefits over the C-style cast ((int) etc.). Mainly they’re safer: a static_cast has defined behaviour that is what most people want when they cast, as in, ‘Right now this is an int, but I want it as a float’, and it is also very useful for polymorphism. static_cast will fail to compile for pretty much all invalid casting that will get you into trouble.

    dynamic_cast, likewise, is useful, as it will check that the cast is valid at run-time and return a null pointer if not (or throw std::bad_cast when casting references).

    const_cast is useful when you’re dealing with a poorly coded library, but should be avoided elsewhere.

    reinterpret_cast is evil, it says ‘This thing here is now this type, I don’t care what the compiler thinks’ which you should only do if you really, really know what you’re doing, and even then you probably shouldn’t.

    Now the thing with ye-olde C-style cast is it will start off trying a const_cast, and if that doesn’t work it tries static_cast, and so on, all the way down to reinterpret_cast (it never tries dynamic_cast). This allows you to do evil you probably don’t want to do.

    Also, it really must be a while since you’ve done games coding if you even considered GDI/GDI+ etc, which are horrendously slow and nasty.

    And I couldn’t agree with you more about Microsoft’s odd function/variable names, terrible to type and one of the reasons I prefer OpenGL. Although DirectX does have more cutting edge features (it took the Khronos group forever to decide on a specification for the next OpenGL) and it also has the advantage of you not having to check if the graphics card you’re running on supports any of the un-official extensions you may want to make use of.

    Apologies for the casting rant, I know it looks ugly and cumbersome but it’s good-practice.

    1. MichaelG says:

      On casts, I’m sympathetic to what you say if we’re talking about complex types. There, it really does matter if you corrupt pointers with a bad cast from one type to another.

      But for casting an int to a float? Please… this is just ugly visual noise added to your program for no reason. C++ code will never be beautiful, but you don’t have to whack it in the face with a crowbar.

      Oh, and what do people use these days for Windows graphics if not GDI, GDI+?

      1. Pyroka says:

        I get that it’s pretty pointless for intrinsic types, but better to use the same all over the place? I dunno (I’m one of the strange people that think C++ is beautiful, in a strange way, and find C# / Java kinda ugly)

        And for windows graphics? Possibly still GDI/GDI+ (I assume that’s what .NET uses under the hood) but for games it’s pretty much OpenGL/DirectX/Both, pick your poison on that one though, essays can (and most likely have) been written on the merits of both.

        1. Svick says:

          Actually, the old .Net GUI library (called Winforms) does use GDI+, but the new library called WPF (and related to Silverlight) does its own rendering.

          1. Simon Buchan says:

            Direct3D 9 actual output

      2. Kdansky says:

        Well, if you want to use the less safe c-cast instead, you still can! Nobody will stop you! So please guys, don’t complain about stuff that is optional (and more often than not a huge improvement). Also doing a for-loop and then casting the int to float isn’t what I would call clever to begin with, so it’s pretty much down to “that example is just bad style”, and has nothing to do with DirectX itself. If you look at the calls themselves, they are quite sensible, much more so than the cryptic OpenGL (or even worse: MFC) stuff of old.

        Complaining about static_cast being cumbersome only shows how little the complainer knows about C++. I’d rather spend two seconds to type “static_cast” than I spend half an hour searching for that mysterious segmentation fault. And as for readability: Number of letters doesn’t matter as much as how easy they are to understand. Sure,
        “if (a > 0) and (x = a) then a := a + 5 + x else a := a + n;”
        (Pascal) is long, but

        a+=(a>0&&x==a)?5+x:n;
        is way harder to read, despite being significantly shorter. And everything that has more than one meaning is hard to understand. (int) has FOUR possible meanings (ignoring possible operator overload), whereas static_cast has exactly one.

        I would also point the more experienced programmers to http://blogs.msdn.com/b/oldnewthing/ for more “ARRGHHH?” ;)

        1. Pyroka says:

          Actually, since I’m more used to it, I read the ‘a+=(a>0&&x==a)?5+x:n;’ much faster, but then I’m pretty sure my brain parses C++ better than English

  11. Knight of Fools says:

    I think any subject that makes Computers, Games (Even table-top ones!) less obscure and easier to explain to the uninitiated would be great. Generally, if you can take any subject that’s nearly incomprehensible and fairly boring, but interesting to know, and make it fun (Chalk, Crayons, and Humor) and easy to understand, you’ve got an audience.

    True intelligence isn’t being able to understand something, it’s being able to take a subject that is vast and complex and explain it so a child could understand.

    1. WJS says:

      Except for the most arcane subjects, if you can’t explain it to a layman, you probably don’t fully understand it yourself. This goes for just about everything.

  12. General Karthos says:

    Not sure if this is where it belongs, but how long has that winter theme been up at the top? I like it.

    1. Jamfalcon says:

      Only a few hours. And I agree, it looks suitably Christmas-y :)

  13. Shamus, as some have pointed out, XML is very practical, and it’s even editable with an editor like Notepad++, so very convenient for development testing.
    Alternatively just a variant on the generic .ini could be used.

    If you wanna go binary then you should do the following instead:

    HEADER ID/FORMAT ID,VERSION,LENGTH,CHECKSUM
    ID;LENGTH,DATA

    ID;LENGTH,DATA
    ID is a 32bit signed id, negative id’s are for testing or future use, this still leaves 2 billion possible id’s for normal use.
    LENGTH is a 32bit unsigned length of the data that follows.
    DATA is the binary data, the type of data and if it’s text or not and if it’s terminated or not is determined by the ID.

    The ID range $80000000 to $FFFFFFFF is reserved for testing and possible future extension. That leaves the range $00000000 to $7FFFFFFF.

    $00000000 is the id for empty or unused space; LENGTH will also be 0 and DATA will be empty/not there at all. This also makes it practical and safe to null in memory, as any saved array will be safely ignored if included by accident.
    $?xxxxxxx mask indicate the data category, and can be the following:
    $0xxxxxxx is for informative
    $1xxxxxxx is for user data
    $2xxxxxxx is for auto generated data (like a PRNG seed or similar)
    $3xxxxxxx is system data, useful to re-check against to see if an important system setting was changed and if you need to auto-reconfigure or ask the user about it. (not many programs do this)
    $4xxxxxxx to $7xxxxxxx not defined currently.

    $x?xxxxxx mask indicate the data type, and can be the following:
    $x0xxxxxx is for 32bit binary signed
    $x1xxxxxx is for 64bit binary signed
    $x2xxxxxx is for 32bit binary unsigned
    $x3xxxxxx is for 64bit binary unsigned
    $x4xxxxxx is for 32bit float
    $x5xxxxxx is for 64bit float
    $x6xxxxxx is for a null terminated UTF-8 string
    $x7xxxxxx is for a 32bit signed followed by a 32bit signed
    $x8xxxxxx is for a 32bit signed followed by a 32bit unsigned
    $x9xxxxxx is for a 64bit signed followed by a 64bit signed
    $xAxxxxxx is for a 64bit signed followed by a 64bit unsigned
    $xBxxxxxx is for a 32bit signed followed by a 32bit float
    $xCxxxxxx is for a 64bit signed followed by a 64bit float
    $xDxxxxxx is for a 32bit signed followed by a null terminated UTF-8 string
    $xExxxxxx is for a 64bit signed followed by a null terminated UTF-8 string
    $xFxxxxxx is for a null terminated UTF-8 string followed by whatever the string part says it is. (so string+1 byte then just subtract from the length to know the size of the data, this is for arbitrary long textual id’s with text or binary var:data sets)

    $x0xxxxxx to $x6xxxxxx is mainly intended for direct value storage.
    $x7xxxxxx to $xFxxxxxx is mainly intended for [variable,data] pairs.

    $xx?????? mask indicate the id for the data.
    This gives a range of $000000 to $FFFFFF for each data category and type.
    So you could have 16777216 different ids of 32bit signed values stored.
    This addresses what you were worried about, Shamus: I seriously doubt you will exhaust over 16 million id’s by changing/storing millions of different font ids.
    Also if you really should ever need to do that, then that is what the $x7xxxxxx to $xFxxxxxx types are for, when you know the id’s may change or you do not know what the id might be etc.

    And in case anybody is wondering, I just came up with this storage format right now (I do crazy stuff like this all the time), and if you wish to use it or something then consider it by Roger Hågensen / Creative Commons Attribution

    And if you are unsure on how to do the header part then try this one:
    http://www.emsai.net/projects/binid/details/
    So that the header of your savefile is:
    BINID
    LENGTH OF DATA FOLLOWING,CRC32 OF DATA FOLLOWING
    ID;LENGTH,DATA

    ID;LENGTH,DATA

    And in Shamus’s case the BINID could be:
    Drawn to Knowledge Project

    And all settings and actual project data could easily be stored in that single file,
    it would be extensible and expandable, future versions can read previous versions,
    redundant/deprecated data is simply ignored (alternatively converted to new version under loading)

    Heck, by simply changing the BINID you could re-use this for almost any kind of project/program/game etc.
    In a way it even works as a crude archive format of sorts.

    And if too outdated/old many years from now it’s possible to parse and rescue what you can from it since you know thanks to the format, what is supposed to be exactly what, even if using a completely different programming language.

    Basically what I just created here was a sort of binary concept of what XML is for text.
    Although that being said I guess you could also just use XML and do it in a similar way to how I outlined it all here, it’s a bit more messy to store binary inside an XML file though.
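As a rough illustration of the ID layout described above (top hex digit = category, next digit = data type, low 24 bits = item id), here is a Python sketch. The function names and the little-endian byte order are assumptions; the comment doesn't specify either:

```python
import struct


def make_id(category, data_type, item):
    """Combine category (0-7), data type (0-15) and a 24-bit item id into one ID."""
    return (category << 28) | (data_type << 24) | item


def split_id(record_id):
    """Pull a non-negative ID apart again: (category, data type, item id)."""
    category = (record_id >> 28) & 0xF
    data_type = (record_id >> 24) & 0xF
    item = record_id & 0xFFFFFF
    return category, data_type, item


def pack_record(record_id, data):
    """ID (32-bit signed), LENGTH (32-bit unsigned), then DATA."""
    return struct.pack("<iI", record_id, len(data)) + data
```

For example, `make_id(1, 4, 0x2A)` gives $1400002A: user data, 32bit float, item $00002A. Categories 0 to 7 keep the signed ID non-negative, which matches the reserved negative range.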

    1. Kdansky says:

      I’d rather have human-readable XML. Might be slower and fatter, but way easier to debug. And debugging is where I spend 90% of my programming time, so optimisations count. I want to point people (again) to pugiXML. I rewrote our import/export this week, from TinyXML to PugiXML. It took me about 30 minutes per 100 lines of code (which is LUDICROUS SPEED) and spent another 10 minutes to find all errors I made. It’s a great piece of code.

      1. Simon Buchan says:

        I’ve always been VERY disappointed with XML parser libraries – I’ll have to check out PugiXML then. I’ve actually found text formats – even my preferred JSON – to be much harder to parse than binary formats, there is *way* more book-keeping to do. Also, round-tripping floating point is very dangerous, inexact parsers are *very* common (David Gay’s dtoa.c is pretty much *the* implementation of strtod() and dtoa()).

        EDIT: re the proposed format – no need to be so complicated – you can get away with just { uint16 kind; uint16 length; byte data[length]; } for almost all formats, SWF for example is pretty close to running DEFLATE over that (they optimize for short tags). You do have to think whether you want to be able to reference other tags (add “id”), or have forward compatible tags so older readers can read newer files (I don’t recommend this, but if you want, add “version”). The point is that normally, if you don’t know how to read it, it’s because you can’t use it – so there’s no point in a self-describing file. If you do want something like that, check out .blender files (yes, the 3d modeller’s format). I also think OLE structured storage (old style binary word documents) are cool, they essentially implement a filesystem in a file, with a page table, sectors, folders, file types, and a bunch of other stuff. Unfortunately, the partial updating it gives you also means it’s easier to get a corrupt file :(.
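On the floating-point round-tripping point above, a tiny Python sketch shows how a text format can quietly change a value. The two helper names are made up for the example; the six-digit format stands in for any writer that prints too few digits:

```python
def to_text_lossy(x):
    """Format with a fixed six decimal places, as a careless writer might."""
    return f"{x:.6f}"


def to_text_exact(x):
    """Python's repr() emits enough digits to read back the identical float."""
    return repr(x)


value = 0.1 + 0.2  # not exactly 0.3 in binary floating point
```

Here `float(to_text_exact(value)) == value` holds, while the lossy version reads back as a slightly different number. A binary format sidesteps the question by storing the bits directly.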

  14. Oh and since we’re talking about programming and etc.
    Mark Russinovich’s videos are a must:
    http://www.msteched.com/2010/Europe/WCL401
    http://www.msteched.com/2010/Europe/WCL402
    Those two links cover Windows memory use and management, learn what the Task Manager numbers actually mean.

    1. Jansolo says:

      “Silverlight is required to view TechEd videos online”. What a pity!

      By the way, it’s the first time I’ve seen a site with this technology, which even Microsoft considers futureless.

  15. Avilan says:

    Just wanted to point out something that has nothing to do with this post:
    I like your Christmas decorations!

  16. Brendan says:

    Can I make a suggestion?

    Remember RedLetterMedia’s wonderfully psychopathic review of Star Wars: the Phantom Menace? It had a subtle musical back beat behind it at all times (much like your intro song.) It was below notice, but made the whole thing infinitely more watchable, because it gave the video a pace, and some flow.

    What you have is wonderful, but educational videos get boring to non-nerds real quick. If you included a beat, the wider audience’s interest might be retained better.

    Just a thought, anyway.

    1. Amarsir says:

      What you have is wonderful, but educational videos get boring to non-nerds real quick. If you included a beat, the wider audience's interest might be retained better.
      You are a wise man. I’ve made tutorial videos of my own and I never realized that glaring omission until just now. I thank you and so does anyone who will have to watch my “wish-I-could-do-better-than-powerpoint-capture” presentations.

      1. MichaelG says:

        It’s educational, but can I dance to it?
