Notes on the autoblography

By Shamus Posted Sunday Oct 23, 2011

Filed under: Notices 49 comments

Last week I claimed that the autoblography would end at #34. Lies. I’ve been editing, inserting, and expanding. It will not end in the coming week, although I really, really am trying to end it the week after. Right now I’m looking at #37 or #38. (If you haven’t noticed, I post these Tues, Wed, Thu, and Fri. That’s four posts of 1,000+ words every week. This may be the most content I’ve ever put out on a sustained basis.)

Also, I’ve restored the long-dead tagging system for posts. You’ll notice at the end of this post, it’s filed under “boring announcements”. If you want to link to or browse the autoblography posts, you can use a link like this:

http://www.shamusyoung.com/twentysidedtale/?tag=autoblography

See?

The advantage of the tag system is that I can group posts without proliferating the number of categories. (Those little icons at the top.)

The downside of these tags is that I didn’t start using the tag system until the site was several years old. (It wasn’t even available in those early versions of WordPress.) I didn’t use it in the last year or so because I forgot about it when I rolled out my new theme that didn’t show tags. So most stuff on the site isn’t tagged, and combing through the archives of all 2,983 posts on the site to tag them would be prohibitively time consuming.

I might be able to auto-generate some tags for common topics (like Project Frontier, Hex, and such) with a bit of creative MySQL, but that’s also a good way to break something. (SQL commands don’t have an undo button. Yes, you can restore the dB, but that’s a massive pain for a bunch of stupid reasons. (There’s a cap on how much you can upload through PHP, there’s a cap on how big SQL commands can be, and there’s a cap on how long a PHP page can take to do a thing. It’s very, very likely that my 300Mb (!!!!) database will hit all three limits, which would require me to do things the very, very hard way. Basically, botching a MySQL command in a way that requires a dB restore could result in hours of frustration, fighting to get the site back up.))

TL;DR: There are tags on some posts, but not on others, and the tag system sucks and is not likely to improve soon.

 


From The Archives:
 

49 thoughts on “Notes on the autoblography

  1. Joshua says:

    Can you use the WP export function to get a flat file, then massage that to produce a load data file to import just for the tags table? It’s still a bunch of work, but you should be able to get it working incrementally without needing to worry about not being able to undo or hitting the limits (or two out of three of the limits)…

    …actually, I see WP gives you an XML file, which is even easier to work with. Not that I’m really saying you ought to go spend a lot of time doing this, just that there’s more than one way to skin the proverbial cat.

    1. sab says:

      And my guess is that an xml export of the site database would be even larger than an sql export. Not to mention what that would do to potential duplicate entries. With a backend like phpmyadmin (though I recall Shamus mentioning he doesn’t have access to such a tool) you can at least choose to backup/restore only the tables that you botched up.

  2. Chuck Henebry says:

    I’ve started using WordPress recently to help revamp the websites for groups that my son’s involved in. I have a fair amount of experience with web design, but this is my first excursion with WordPress. I’ve really fallen in love with the platform, even though the blog metaphor isn’t perfect for event-driven sites””sites where the important date is generally not the date of posting but rather the date on which the event is scheduled to occur.

    So I’m excited when you turn your attention, even briefly, to wordpress and MySQL. I’d love to know what’s involved in running your blog on a local machine, for testing (something I recall you describing yourself as doing a few years back). It’s not clear to me, for example, why you can’t do the SQL commands for auto-tagging old posts on a downloaded version of the real DB, and then upload the corrected DB via ftp.

    Do you know of any good blogs that focus on wordpress design? There’s a ton of people out there doing stuff with wordpress, just as there are a ton of people talking about gaming and programming. But finding a blogger who’s smart and insightful, that’s tough to do. Thoughts?

    1. Shamus says:

      Yeah. I just set up my local copy again after going some months without it. I run a wamp server locally. (The latest version of wamp was completely turnkey. It was awesome.)

      “It's not clear to me, for example, why you can't do the SQL commands for auto-tagging old posts on a downloaded version of the real DB, and then upload the corrected DB via ftp.”

      That upload is the tough step, as I described in my post. Although, I’ve never updated a dB via FTP. That sounds like it would be the way to go. I wonder if there are any risks to doing a brute-force clobber of the dB files like this?

      This is why I’m always a little gunshy of MySQL. It’s actually a really solid piece of software, but I’m always afraid of what I don’t know.

      It occurs to me now that this is why old-timers learn more slowly. If I was a dumb kid, I’d probably just say, “Bah. It can’t be that hard, I’ll figure it out.” Then make a huge mess that would take a couple of days to sort out. I’d learn very fast like that, to the obvious expense of uptime.

      Interesting. I hadn’t considered that.

      1. Rick C says:

        Don’t do it that way–if you’ve got a local database, text your query on that until you’re happy with the way it works, and then apply the debugged query to your live database.

        1. 4th Dimension says:

          And then watch the site implode because you didn’t take into account some small difference like different MySQL versions. Yeah I’m a pessimist.

          1. silver Harloe says:

            So keep yourself to basic sql that hasn’t changed in dozens of years. I’m not even sure what the issue is. Auto-tagging your “project hex” posts, for example, has probably got to be something like this:

            select id,name from posts where name like ‘Project Hex%’;

            adjust the where clause until you have just the set of posts you want, then something like

            insert into tags post_id, ‘project_hex’ where post_id in (repeat the select you worked out before);

            even if you get it all wrong, you won’t explode the site – before the insert you weren’t doing anything, just looking. the insert itself you can either do as a transaction if you set up your tables to be InnoDB, or test on a copy of the DB first.

            Of course, it’s likely that WP’s table structure isn’t so friendly as the one I have in my head, but in all cases, select without joins is unharmful to your db.

          2. Ian says:

            SQL is a relatively static beast, really. Unless you start diving heavily into the logic side of things you’re generally pretty safe.

            If you’re paranoid about mis-keying, you can always begin a transaction and quadruple-check everything before applying it. Doing that while manupulating production data is always a good idea.

            1. Tizzy says:

              Don’t most dB software have a “I’m just kidding, don’t really do this but tell me what would happen” option to their commands to avoid doing something irreversible?

      2. Adam says:

        Does your website run on a shared host? If so you might be able to grab a clone of the virtual machine and test on that and then you’ll have debugged the query on exactly the same system so you won’t have the problem 4th dimension suggests. Also you can take a snapshot of the real VM before you apply the DB changes and restore to the snapshot if anything goes wrong.

  3. Chris says:

    I would like to point out that before group blogging on Twenty Sided, your content general content was one short post or link dump in the morning followed by substantial content in the afternoon, before the comic strip my disk up would not download and the videos I could not watch

  4. skeeto says:

    This sort of thing would be great for crowdsourcing, where you let your visitors tag for you. You wouldn’t want to accept just any tag, just the ones that have been suggested by several different people.

    1. Shamus says:

      I wonder if there’s a plugin that does this. Hm…. (goes off to the plugin repository.)

    2. Amstrad says:

      This was my thought completely. There’s all sorts of people who like to archive crawl when they find a blog they like and I’ve seen crowdsourcing work for things like transcribing webcomics with archives several years deep.

      1. SteveDJ says:

        Just as long as someone doesn’t introduce a tag with a typo… :-) (Or, can others correct such?)

        Meanwhile, I felt obligated to post something. The counter said 9.5 (the .5 being the comment being typed ‘right now’). Since it wasn’t changing, I figured it meant ME… :D

        1. Zukhramm says:

          The point is, of many people do it, it’s easier to ignore the ones that are wrong, trolling and typoes.

    3. Irridium says:

      I was thinking the same thing.

      I regularly browse the archives, and I doubt I’m the only one. I could add a few tags to things.

    4. Jethro says:

      Dammit, I had the same idea.

      As it is written, so let it be done.

    5. Kaeltik says:

      My first thought as well. I’d be willing to put in half an hour or so. Too bad there’s no ‘Random Post’ button.

    6. Blake says:

      This is where my mind went also, it’s a good community here, just so long as you verified everything or only made sure you picked tags that were suggested by multiple people.

      Maybe while you’re getting people to do it, change your archive script to put an asterisk next to each post that doesn’t have any tags, that way the community could go through each like that and add a couple.

  5. Kelhim says:

    I wouldn’t have anything against your autoblography exceeding the 40ies mark …

    1. It’d be kind of fitting to have it work out at exactly 40…

  6. X2-Eliah says:

    Outsource that stuff to Rutskarn. He’s got like a whole week’s worth of nothing to do if his blog is right, and he’s younger than anybody so he can’t even object.

    1. 4th Dimension says:

      It’s time that whipersnaper started respecting his betters/elders.

    2. Destrustor says:

      Nah, it’d just inevitably vanish and be lost forever, or forgotten about, or just abandonned without a word…

      Yeah, I’m looking at you, lord of the scraps.

      bah, nevermind, I don’t want to live in a world where Ruts has time to finish things. Brrr… :)

  7. Aelyn says:

    I confess… I read the first two paragraphs and dropped to the bottom for a summary. Lo and behold, I found a TL;DR feed. Magical.

  8. JPH says:

    See, this is why I hate words. You know how much easier the world would be if we didn’t have words?

    Whoever invented words sucks.

    1. X2-Eliah says:

      Word, yo.

  9. Rowan says:

    It’s weird that I’ve got you pegged as a super-coder that does all these super-hard things (3D graphics, procedural content, etc.) on his spare time, that I’ve never even dared to dabble in (got introduced to the math side in uni, no thanks after that ;) ).

    And then you go and reveal that there’s something I deal with every day (mucking around in databases), that you shy away from. Heh.

    There have been few good tips already. One I haven’t seen yet: make a backup copy of a table you are about to modify (insert,update,delete):
    create table [tablename]_snapshot20111023 as (select * from [tablename]);
    Then you can do your stuff in and if it doesn’t work just do:
    delete from [tablename];
    insert into [tablename] (select * from [tablename]_snapshot20111023);

    And then you run into referential integrity constraints and transactions ;P

  10. Usually_Insane says:

    Why not outsource it, have someone you know or think you know do it for you for a modest fee or recognition of some kind.

    1. Ramsus says:

      Being forever known as “That sucker I conned into doing busywork for my convenience.” =P

  11. Ian says:

    Do you have shell access to the server? If so, you should be able to do a mysqldump on the database, make your changes, and, should anything go wrong, drop the database, jump into the mysql console, and “source my_db_backup.sql”. That will get you around the tight PHP and web server limitations and should prove to be very fast as well. I’ve seen MySQL pull in 1.2GB plaintext dumps very quickly.

    Also, I have to say: I haven’t been commenting too much lately, but I’m really enjoying the autoblography series thus far. I’m looking forward to reading the rest. :)

  12. Anachronist says:

    “It will not end in the coming week…”

    Um, it’s an autobiography. Theoretically it doesn’t end until something stops you from writing it. If you keep at it you’ll eventually catch up to present-day.

    1. Shamus says:

      Obviously I have a fixed point where I plan to stop writing.

      1. Amarsir says:

        But at this pace, if you don’t stop, you’ll have posts detailing years into the future. That would be really handy information to know!

        1. MichaelG says:

          Not if they are as depressing as the earlier entries. Imagine *knowing* that is your future!

        2. Tizzy says:

          Trying to confirm the Time Lord rumor, are we?

  13. Grant says:

    On a completely random note – there were a LOT of parentheses in that post.

  14. Jack V says:

    FWIW, I don’t have a popular blog, but I started tagging stuff on my LJ/DW in an ad hoc way. “I can go back and tag the old entries later”, I told myself.

    It turned out that didn’t matter: the longer I procrastinated, the higher the proportion of my posts were tagged, and that was plenty useful in itself just for helping me find stuff since I remembered roughly what I’d written but not exactly the date or title. So my advice to people is to start tagging if they’d like to, and not feel bad if they can’t fix the backlog immediately: if you _can_ tag a lot in one fell swoop, that’s obviously really great for keeping things consistent, but even if you just tag stuff piecemeal as it turns up, that’s probably ok :)

  15. TightByte says:

    Shamus, I might be off here, but at least one of the limitations you listed may be less “firm” than you seem to think.

    http://www.php.net/manual/en/info.configuration.php#ini.max-execution-time

    I’m not sure how (or whether) that might be helpful, but I thought I’d mention it anyway.

  16. Someone says:

    I think the leftmost category icon doesn’t work, at least on the main page.

  17. Zombie Pete says:

    My question about your website is: Why am I sitting in California looking at an ad for Dish Network in Polish?

  18. drlemaster says:

    Not picky about what number you end at, just please don’t stop now. I need to find out if you and Heather get together. She seems like such a nice girl.

  19. Jon Ericson says:

    I hate working with MySQL because it doesn’t automatically put updates, deletes and inserts inside transactions like every other RDBMS on the planet. (And therefore could be considered a non-database.)

    When I’ve inherited a MySQL database, I’ve come to treat it as a sort of data warehouse system where another (transactional) database writes and the web framework reads. (MySQL has almost no presence outside of web development.)

    Better stop before this become a true rant.

    1. Ian says:

      Oh man, I agree with a passion.

      I used to sort of like MySQL until I started a job that required me to use MSSQL. Better tools, better performance, better software.

      Have you ever used to the .NET connector for MySQL? What a bloody nightmare. I’ve literally had to rewrite simple SELECTs, like “SELECT * FROM MyTable” to “SELECT * FROM MyTable WHERE 1=1” because it would just die in the middle of a read. It wouldn’t throw an exception, it wouldn’t even make my program stop responding to events (and, for that matter, it wouldn’t continue program execution). It would just stop dead in its tracks. I’m sure there’s a logical reason for it to fail like that (I hope so!) but from the perspective of someone that’s trying to use the component in an application, that’s pretty lousy and makes for code that’s essentially impossible to debug, as the lock happens outside of MSIL.

      Just…lovely.

      1. jamie pate says:

        Have you tried devart’s dotconnect? it seems to work ok, and there is a free version
        http://www.devart.com/dotconnect/

        1. Ian says:

          I haven’t tried that one yet, but it looks better in every conceivable way just based on the feature blurbs.

          I still have one C# project that I maintain that ties with a MySQL database, so many thanks for the link. I’ll definitely give it a shot.

  20. jamie pate says:

    This might work as a safety net…
    InnoDB supports the SQL statements SAVEPOINT and ROLLBACK TO SAVEPOINT. Starting from MySQL 5.0.3, RELEASE SAVEPOINT and the optional WORK keyword for ROLLBACK are supported as well.
    http://dev.mysql.com/doc/refman/5.0/en/savepoint.html

Thanks for joining the discussion. Be nice, don't post angry, and enjoy yourself. This is supposed to be fun. Your email address will not be published. Required fields are marked*

You can enclose spoilers in <strike> tags like so:
<strike>Darth Vader is Luke's father!</strike>

You can make things italics like this:
Can you imagine having Darth Vader as your <i>father</i>?

You can make things bold like this:
I'm <b>very</b> glad Darth Vader isn't my father.

You can make links like this:
I'm reading about <a href="http://en.wikipedia.org/wiki/Darth_Vader">Darth Vader</a> on Wikipedia!

You can quote someone like this:
Darth Vader said <blockquote>Luke, I am your father.</blockquote>

Leave a Reply to Shamus Cancel reply

Your email address will not be published.