Why My Website Goes Down

By Shamus Posted Sunday Jan 31, 2010

Filed under: Programming 67 comments

It’s time to solve a mystery. (A boring technical coding mystery.)

It’s been bothering me for years: Why does my site go down when I get linked on Reddit or Stumbleupon? It shouldn’t. I use Hosting Matters, and they host blogs that are far larger than mine without difficulty. Usually my site will just vanish for a few hours, and after the fact I’ll be able to sort out what happened by looking at the incoming traffic.

Eventually I noticed that not all traffic surges were created equal. It was really only the ones that linked to DM of the Rings that crushed my site. Links to other stuff wouldn’t even cause a hiccup, even if the overall traffic spike was larger. The natural assumption is that it must be due to the fact that the comic is image-heavy. Except, this problem persisted even when the images were moved off-site. Hmmm. What about the scripts that set up all the comic navigation stuff? That seems like a stretch. Perhaps the script I wrote that embeds images so I don’t have to clutter up my prose with a lot of HTML? Meh. It might be inefficient, but I can’t imagine it’s anywhere near bad enough to bring down the site where the average comic was one or two images. But what else could it be? What else makes the comics different from the other posts on my site?

I’d ponder this problem for a day or so and then forget all about it until the next time my site went down. This cycle has been repeating itself for a couple of years now.

I’ve finally figured it out. And I am really embarrassed at how long it took me.

It’s the number of comments. Also: Duh.

I was always looking at the early comics, where the comment count is a couple dozen or so per entry. Later in the series the comments regularly top a hundred, and the finale clocks in at just under seven hundred comments.

I run some tests on my local machine, where I have a functional mirror of this blog. It takes my machine an agonizing three and a half seconds to generate the page for the finale. Now, I’m sure the server at Hosting Matters is many times faster than my humble computer, but 3.5 seconds is still insane. Get a few thousand people in there chewing through those high-comment posts, and it’s easy to see how they could render the machine helpless. I try disabling the display of comments and it is able to generate the page in just 0.14 seconds. So 96% of the CPU time is spent churning out the HTML for the comments, and the other 4% is spent on everything else.
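To spell out where the 96% figure comes from, here is the arithmetic as a tiny Python sketch. The two timings are the ones measured above; the variable names are made up for illustration.

```python
full_page = 3.5          # seconds to build the finale page with comments
without_comments = 0.14  # seconds with comment display disabled

# Share of the total time that goes to rendering comments.
comment_share = (full_page - without_comments) / full_page
print(f"{comment_share:.0%} comments, {1 - comment_share:.0%} everything else")
```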

But why? Why is it taking so long to list the comments? My first thought is that this is caused by my dice roller. The code to display the dice for a comment is a recursive operation that subtracts from a number and then calls itself until the number is zero.

As of this writing, the current comment count on the DM of the Rings finale is 678. This requires ten levels of recursion. (Six 100-sided dice and four 20-sided dice.) Keep in mind it has to roll a handful of dice for each and every comment in the list, with the number of dice needed gradually going up.

Lots of dice times lots of comments times lots of comment-heavy posts = a crap-ton of HTML.
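The recursion described above can be sketched roughly like this. This is a Python model, not the site's actual PHP, and the rules are an assumption: it only knows about 100-sided and 20-sided dice, and each die shows as much of the remaining number as it can. Under those assumed rules, comment 678 comes out as six d100s and four d20s, which matches the ten levels mentioned above.

```python
def roll_dice(remaining):
    """Return the (sides, face) dice needed to display `remaining`.

    Subtracts the face value of one die, then calls itself on what is left
    until the number reaches zero.
    """
    if remaining <= 0:
        return []
    if remaining >= 100:
        die = (100, 100)                 # a d100 showing 100
    else:
        die = (20, min(20, remaining))   # a d20 showing up to 20
    return [die] + roll_dice(remaining - die[1])


dice = roll_dice(678)
d100s = sum(1 for sides, _ in dice if sides == 100)
d20s = sum(1 for sides, _ in dice if sides == 20)
print(len(dice), d100s, d20s)  # 10 6 4
```

Each comment in the list repeats this for its own number, so the total dice count grows steadily as the thread gets longer.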

Every comment has a bunch of HTML around it. Nested DIV tags and SPAN tags and TABLE tags and… wow. This is really messy. There is a lot of extra stuff in here. I remember how this came about: Most of it was for cross-platform reasons. The differing behaviors of IE 6, IE 7, and Firefox were crazy to sort through, and often making very slight changes would blow out the formatting on one browser or another. Once I had it working, I stopped messing with it because I was pissed off and sick of fighting with it. So I left this crufty mess in place, even though I most likely don’t need the majority of it.

Just to make sure I’m getting a clean test, I remove ALL of this extraneous HTML. I just have a single line-break between comments, but otherwise they are a big jumble. That’s fine. I’ll sort that out later. Finally, I disable the comment editing plugin and SuperCache so that I can be sure they aren’t part of the measurement.

The code that is left is as streamlined as it gets: WordPress yanks comments out of the database and spews them onto the page without doing anything else. It now takes almost exactly 2 seconds to build the page. Nice, but I have a feeling there is still a really big bottleneck in here someplace.

Time to get systematic. Each comment has several distinct parts:

1) The comment number
2) The dice
3) The avatar
4) The name / link of the author
5) The permalink of the comment itself, which is wrapped around…
6) The date on which the comment was made and…
7) The time of day of the comment
8) An admin edit link (just for me)
9) The actual comment text

Hm. So which of these steps would be the most time consuming? My primary suspects are #2, #3, and #9. Time to run some tests. I’m going to disable an item, reload the page a few times to see how much time it saves, then re-enable it and run the same test for the next one. This test isn’t super-accurate. Values jump around in 0.05 increments, but considering how huge our bottleneck is this shouldn’t be a problem.
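The test procedure above (disable one item, reload a few times, compare against the full page) could be automated along these lines. This is a hedged Python sketch: `time_render`, `measure_savings`, and `fake_render_page` are hypothetical names, and the stand-in renderer does no real work, so its numbers mean nothing. Only the shape of the experiment is the point.

```python
import time


def time_render(render, repeats=5):
    """Return the best wall-clock time over several calls to render()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        render()
        best = min(best, time.perf_counter() - start)
    return best


def measure_savings(parts, render_page):
    """For each part, time the page with only that one part disabled."""
    baseline = time_render(lambda: render_page(parts))
    savings = {}
    for part in parts:
        enabled = [p for p in parts if p != part]
        savings[part] = baseline - time_render(lambda: render_page(enabled))
    return savings


def fake_render_page(enabled_parts):
    # Stand-in for building the page; a real run would hit the local mirror.
    return "".join(f"<{part}>" for part in enabled_parts)


savings = measure_savings(["number", "dice", "avatar"], fake_render_page)
for part, seconds in savings.items():
    print(f"{part}: {seconds:.4f}s saved")
```

Taking the best of several reloads is one way to damp the jitter mentioned above; averaging would work too.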

But let’s see how it goes…

[Image: chart_comments.jpg, a chart of the time saved by disabling each part of a comment]

Uhbuhwhut?

One of my prime suspects – my dice roller – is so fast that it’s basically “free”. (I give myself a pat on the back.) #9, the comment text, is indeed a high priced thing to process. But the absurd surprise is that #6 and #7 are really expensive. Together, it takes longer to show the date and time of when you left a comment than it does to display the actual text! What!?!

I am at a loss, here. Suddenly, everything I know is wrong.

While I can’t explain why it’s taking so long to do #6 and #7, I can clearly see that I don’t need to do them both. In both cases, I’m retrieving the timestamp, I’m just printing it differently. It’s like if your mom asked you what the date is, so you decide to walk five miles into town and look at the marquee at the bank. After you walk all the way back home and tell her, she sends you back to the bank to find out what time it is. I don’t know why it’s taking so long, but at least we can get both nuggets of data on the same overlong trip.

That shaves a nice chunk off of the loading time.
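The "one trip to the bank" idea looks roughly like this. It is a Python sketch rather than the theme's PHP; the function name is hypothetical, and the sample value uses the stored format quoted later in this post.

```python
from datetime import datetime


def comment_date_and_time(stored):
    """Parse the stored timestamp once, then print it two different ways."""
    stamp = datetime.strptime(stored, "%Y-%m-%d %H:%M:%S")  # the single trip
    return stamp.strftime("%Y-%m-%d"), stamp.strftime("%H:%M")


date_text, time_text = comment_date_and_time("2009-03-29 15:44:10")
print(date_text, time_text)  # 2009-03-29 15:44
```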

The next thing I notice is that there are two ways I can get the comment text. I can do it the way I’ve been doing it:

comment_text ();

Or I can replace it with a very similar line of code:

echo get_comment_text ();

The first is what takes over 0.6 seconds. The second happens so fast I can’t even measure it with the benchmarking tools I’m using. It is, in effect, “free”.

This is one of those changes that seems a little too good to be true. I must be giving up something by switching to the new way. But what? How do comment_text () and get_comment_text () differ?

I have looked in the WordPress docs, and found nothing. As in: Neither one is even listed. I looked in the support / discussion forums, and the WordPress community is its usual helpful self. In fact, someone actually asked my question.

Alan: Hey, what’s the difference between comment_text () and get_comment_text ()?

Betty: Have you tried looking them up in the docs?

Alan: Uh. Yeah? Nothing there?

Betty: Did you try reading the source?

I cannot stress enough how useless it is to post RTFM in a forum like this. If you don’t know, just keep your yap shut and maybe someone else will come along and help. Of course you can read the source. It’s time consuming and you’re most likely consulting the docs and forums in an attempt to learn from someone who has already done so. This is why the forum exists.

Perhaps I just have bad luck, but in all my trips to WordPress.com, this has been my experience every time. Nobody knows, and everyone tells you to read docs that don’t exist.

Sigh. Anyway.

Item #8 is also inexplicably expensive. Its only job is to print nothing to every single visitor to the site, unless the viewer is me, in which case it needs to print a little link. I never click on that link, so this element can be removed. I have no idea why it eats so much time, but at least I can cut it.

We’re now down to the page taking 0.85 seconds. That’s a huge improvement, but it really bothers me that just printing the stupid date takes half of that.

I check into it some more, and it looks like all the time is being eaten when converting date formats. Apparently the date of each comment is stored thusly:

2009-03-29 15:44:10

But of course it’s much nicer to display that as:

March 29th, 2009 03:44pm

But the latter costs me 0.4 seconds. I can double the speed of this code by sacrificing usability. It seems like the thing to do is to keep the nice dates on posts with fewer than a hundred comments. Once a post passes this, it will start cutting corners and display the all-numeric date.

(Remember, when I say 0.4 seconds, I’m talking about processing the 600+ comment page, not just a single comment. The differences would be too small for me to measure if I was only working with one comment at a time.)
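Here is a minimal sketch of that cut-corners rule, written in Python rather than the theme's PHP. The function names and the exact cutoff behavior at precisely one hundred comments are assumptions; the two output formats are the ones shown above.

```python
from datetime import datetime

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def ordinal(day):
    """Turn 1 into '1st', 22 into '22nd', 29 into '29th', and so on."""
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def format_comment_date(stored, comment_count, threshold=100):
    """Pretty date on small threads, raw stored string on big ones."""
    if comment_count >= threshold:
        return stored  # cheap path: no parsing, no formatting
    stamp = datetime.strptime(stored, "%Y-%m-%d %H:%M:%S")
    hour12 = stamp.hour % 12 or 12
    meridiem = "am" if stamp.hour < 12 else "pm"
    return (f"{MONTHS[stamp.month - 1]} {ordinal(stamp.day)}, {stamp.year} "
            f"{hour12:02d}:{stamp.minute:02d}{meridiem}")


print(format_comment_date("2009-03-29 15:44:10", 24))   # March 29th, 2009 03:44pm
print(format_comment_date("2009-03-29 15:44:10", 678))  # 2009-03-29 15:44:10
```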

I make some other minor changes, and in the end the page loading happens in about 0.31 seconds. It’s more than ten times faster than it was when I started.

I haven’t made the changes live yet. While mucking about with this mess I discovered that WordPress now supports threaded discussions. We need this here. We have large, diverse, and multi-threaded conversations on this site.

Still, it’s clear this place needs an update before the next traffic surge.

 



67 thoughts on “Why My Website Goes Down”

  1. Simulated Knave says:

    I’d believe that it’s the text, since that’s the first thing that chokes and dies when the comments get up around a few hundred (after a while, all the comments are blank).

    I agree that it doesn’t make sense, though.

  2. Denubis says:

    Not boring at all actually. A good account of how to debug code. When I teach DB again, I’ll certainly point my students at this post.

  3. Laura says:

    Can you convert the date before storing it? Then you only take the hit when a comment is left (once), not when it’s displayed (many times). This may be beyond your control – I haven’t used WordPress to know how it works.

    1. WJS says:

      That’s actually a good example of the trade-off between memory and speed there. You can store the date in text form, but it will take six to seven times as much space as a 32-bit timestamp. However, it is many, many times faster to just pull text from a database and dump it to output than it is to create a date object and format it. Given that the size of comments is typically going to be much bigger than the size of the date, I’d say it’s worth it in this case – a small increase in storage versus a heavy hit to performance.

  4. BarGamer says:

    You might be a coder if your mom sends you to the bank, TWICE, to check the date and then the time, and you think nothing of it. :)

  5. lebkin says:

    In a fun twist of irony, when I tried to get to your page from my RSS reader, it wouldn’t load. My first thought: Shamus crashed his website with a post about crashing his website.

    Sadly, this ironic amusement only went so far. Apparently my internet had simply gone down for a bit, making this just a strange coincidence.

  6. Galenor says:

    I actually found this really interesting. Being an amateur coder, my debugging sessions are less about spending several hours figuring out a solution by in-depth analysis of the code, followed by intelligent optimisation. It’s more about spending several hours banging my head against a wall out of stress until I realise I wrote a > rather than a <. Then, I’m banging my head against the wall for the next several hours out of shame.

    However, this was pretty educational in how certain features require certain amounts of bandwidth. I thought that text was relatively lightweight ’til I saw it burst its seams on the chart!

    But then I guess the avatars are quite small, and comments can get to silly lengths sometimes. Hrm.

    1. WJS says:

      This isn’t anything to do with bandwidth. This is how long it takes the server to make the page, not how long it takes to send it to you.

  7. DrMcCoy says:

    Actually, I for one really prefer the ISO date/time format to that stupid textualized version…

  8. toasty says:

    “It's more about spending several hours banging my head against a wall out of stress until I realize I wrote a > rather than a <"

    Ya, that's why I don't enjoy programming. That and I'm just no good at it. However, there is something really satisfactory about finishing a program, testing it out, and saying, "I did that! Awesome!"

  9. Shandrunn says:

    Boring? Are you kidding me? This sort of thing is what I come here for!

  10. Fenix says:

    These are my favorite kind of posts. Seeing a problem get identified and then absolutely slaughtered, makes me happy.

  11. scragar says:

    I just realised why your comment text function was consuming so much time Shamus.


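    function get_comment_text() {
        global $comment;
        // Hook 'get_comment_text': nothing is attached, so this returns right away.
        return apply_filters('get_comment_text', $comment->comment_content);
    }
    function comment_text() {
        // Hook 'comment_text': every attached filter runs on the text before it is printed.
        echo apply_filters('comment_text', get_comment_text() );
    }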

    The first filter call doesn’t exist, so it returns right away. The second filter call, though, is huge (at least in terms of performance hit): first it sorts a list of arrays of unknown size, then it iterates over the list and the contents of each array, executing functions on the contents. No wonder it takes so long.
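To make that explanation concrete, here is a simplified Python model of the hook mechanism. It is not WordPress's actual code: the real `apply_filters` also handles priorities and extra arguments, the real `comment_text` echoes instead of returning, and the two callbacks registered here are made-up stand-ins for whatever formatting filters a given install attaches.

```python
filters = {}  # hook name -> list of callbacks, in registration order


def add_filter(hook, callback):
    filters.setdefault(hook, []).append(callback)


def apply_filters(hook, value):
    """Run every callback attached to `hook` over `value`, in order."""
    for callback in filters.get(hook, []):
        value = callback(value)
    return value


def get_comment_text(comment):
    # Nothing is attached to this hook below, so the raw text comes straight back.
    return apply_filters("get_comment_text", comment["comment_content"])


def comment_text(comment):
    # This hook carries the formatting work, which is where the time goes.
    return apply_filters("comment_text", get_comment_text(comment))


# Hypothetical formatting filters, for illustration only.
add_filter("comment_text", lambda text: text.replace("\n", "<br>"))
add_filter("comment_text", lambda text: f"<p>{text}</p>")

comment = {"comment_content": "First line\nSecond line"}
print(repr(get_comment_text(comment)))  # 'First line\nSecond line'
print(comment_text(comment))            # <p>First line<br>Second line</p>
```

If this model is right, the catch with switching to the raw version is that the comment loses whatever formatting those filters were applying.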

  12. midget0nstilts says:

    Shamus,

    I did some looking on the Internet. I’ve never programmed in PHP before, but it looks like comment_text() is a wrapper for get_comment_text(). The comments for the comment_text() snippet say it uses the apply_filters() and then passes it through the ‘comment_text’ hook.

    EDIT: Damn! scragar beat me to it! :)

  13. lazlo says:

    So what you’re saying here is that there’s a big update coming, and until it happens you’re going to focus your blogging on things that are so boring that no one would want to comment or link to it? Good luck! :)

    Seriously, I don’t think you could do it.

  14. Peter H. Coffin says:

    Just use the ISO date. It’s much easier to remember that 2009-03 is earlier than 2009-05 and by how much, than remember that May isn’t really that close to March, and that there’s something starting with “A” in between.

  15. scragar says:

    Erm, can’t edit my reply, so I’m going to post again, but would it not be possible to store the date/time instead as a timestamp? If I remember correctly all you need to do is modify the relevant tables and a few sections of code, timestamps are what the input time would be converted to anyway, and it doesn’t change any sorting methods or anything.

    1. WJS says:

      The smart way to do it would indeed be to use integer timestamps rather than ISO date strings, because parsing those is going to be a hell of a lot slower than just passing the timestamp to the date constructor. However, this sounds like it was WordPress code, not anything Shamus had written, so it wasn’t really in his hands.

  16. Jazmeister says:

    “It's time to solve a mystery. (A boring technical coding mystery.)”

    That is what makes my ears prick up right there.

  17. Chris says:

    WordPress seems like a real hog. Ideally, it should be spending almost all of its time waiting on the database, not processing markup. That’s why we write these things in slower languages like PHP and Perl, since the job is supposed to be I/O bound.

    1. Shamus says:

      Chris: To be fair, the bottleneck could easily be in my theme, which is hacked to hell and back.

      This is not to say WordPress isn’t guilty as charged, but that I should not be taken off of the “person of interest” list.

      1. Jabor says:

        As an aside that I hope you don’t mind too much, it’s good to see you going back and taking advantage of the threaded comments on older posts ;)

  18. onosson says:

    “Perhaps the script I wrote that embeds images so I don't have to clutter up my prose with a lot of HTML? Eh.”

    That usage of “eh” totally threw me – you must not be a Canadian.

    1. Shamus says:

      onosson: I think I intended “meh”. Fixed.

  19. Nentuaby says:

    The problem with the ISO time is that it’s ambiguous to human readers who aren’t familiar with ISO time as such. Due to the differing order in which various cultures write their dates, a string of “03-05” could be read as March 5 or May 3.

    Better than letting the site crash, though. Reverting when the load’s high was probably the best choice.

    1. Don J says:

      The whole point of the ISO format is that you never have “03-05” by itself. It will always be “2010-03-05”, so someone who is used to seeing the year last will always be clued in that something is different here.

      One of my biggest pet peeves is the occasionally seen “half-assed ISO” format — “10-03-05”. I can’t understand why anyone would ever think that this would be easy to understand, but I have hundreds of receipts that suggest that someone in the world thinks this is a good way to go. As far as I can tell, most North Americans will read this as October 3, 2005, and most Europeans will see March 10, 2005, and only rare freaks like me will actually see March 5, 2010. Bleh.

  20. Zak McKracken says:

    Don’t worry about the time. I actually understand “15:34” much easier and quicker than “03:34pm”. Actually, around here (Germany, that is), times are almost always displayed with 24 hours. Having the month spelled out is an advantage, though, since in Germany we use a different date format (dd.mm.yyyy), so I always need to think which number is the day and which the month. Anyway, that’s not a grave problem either since I barely ever read the date/time at all.

  21. Jeffry Degrande says:

    Shamus,

    you would be surprised how much code and queries get executed before your comment loop gets hit. Shaving off a few microseconds of your comment loop will get you only that far.

    Take a look into paging your comments. Not sure if core supports it out of the box but there are definitely plugins for that. You could also look into disqus.com. I find it works very, very well.

    I personally would ditch wordpress and go with statically generated html. I ran into far too many problems like you’re having right now.

  22. Pickly says:

    Another “That was actually quite interesting” post from me here. (And I’m also one of those people whose coding experience consists of struggling for half an hour to figure out which comma I forgot to type.)

    I have some plans for Elemental modding that are probably far overly ambitious for my programming skill, so it is nice to see a tiny bit of how someone might sort through programming difficulties.

  23. Lex Icon says:

    There was something terribly amusing about reading a blog post on comment posting code, and then seeing the little “Feeling Chatty? There are XX posts already” link.

    Like seeing a dying camel in the desert, and then coming across a camel seller. So of course, I had to stop by and add my straw to the pile.

  24. Axle says:

    Thanks.
    Definitely not boring stuff, especially not for someone who codes for a living (and loves it).

    Looking forward for the upgrade.

  25. SatansBestBuddy says:

    *reads post*

    Boring, boring, boring, interesting, boring, boring, THREADED COMMENTS AWESOME!

    Actually the whole of the post is pretty interesting, kinda like looking inside a detective’s mind and seeing how they connect all the pieces of a puzzle together, only with it being about website killing instead of actual killing.

  26. Maddy says:

    I like these kinds of posts too – the process of solving a problem.

    I’ve got no problem with the ISO date. On the rare occasions when I find the month/day order ambiguous, I can usually work it out by comparing the date of the blog post and the dates of the other comments.

    Honestly, when I’m looking at date stamps, it’s usually to find out whether or not the material is outdated – in which case the year is more important than the month or day. Sadly, some blogs display only the time but not the date – that’s a much bigger problem.

    So I’ll take a date stamp in any format, even Julian, over none at all!

  27. Samuel Erikson says:

    “It's time to solve a mystery. (A boring technical coding mystery.)”

    I know enough about coding to know that I’ll never be any good at it, but I still find your posts of this nature absolutely fascinating. There are two primary reasons behind this:

    The first, of course, is your writing style. It comes across almost as if it’s been dictated, transcribed, and then cleaned up. This sort of writing makes most non-fiction infinitely more readable.

    The second is the problem-solving & logic on display. As you said, this is a mystery. You set out to solve the mystery and it makes for great reading.

    Edit prior to posting: It’s a good thing I refreshed the page before posting this, as it looks like I’m not alone in why I enjoyed this.

  28. silver Harloe says:

    this is all written in pseudo-code, mind you, but:
    replace the main php file with:

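    $path = determine_cache_path($which_post);       // placeholder helper
    if (file_exists($path)) {
        $contents = file_get_contents($path);        // read from $path
    } else {
        $contents = make_wordpress_go($which_post);  // placeholder: build the page the usual way
        file_put_contents($path, $contents);         // write $contents to $path
    }
    echo $contents;                                  // write $contents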

    (and add a “write $contents to $path” to the end of the routine which responds to comments)

    then you can leave your date/dice/avatar/admin-link functions all alone, since the pages will be generated Once Per Comment instead of Once Per Read.

    this assumes one webserver OR a nfs-mounted cache directory, of course.

    Also it is not in any way multi-process safe (i.e. if two people reply at the same moment) UNLESS wordpress does some locking and you put the “write $contents to $path after comment” section inside the “locked for updating comments” part that exists.

    But, honestly, I’m fraking surprised wordpress doesn’t already have a “cache generated pages” option you just need to set to true/1.

    (Though the whole “if you’re the same person, let you edit your last comment for 15 minutes” section might need revisiting with a cache system)

  29. scob says:

    Another vote for this being the type of post I come here for. Your thought processes as you debug this or your city-generating program give us insights that we needs, precious. We likes it.

  30. SteveDJ says:

    Adding another “I found that very interesting” comment here.

    Also, I figure we should all add lots of comments to this particular post – get it well into the 100s even – so the improvements you talked about can be experienced right here on this page (and not have to go looking for the DMotR post…) :)

    Re: suggestions about changing how you store the date with comments — wouldn’t that break all existing comments?

  31. Mr. Son says:

    I actually found this quite interesting, and that date change looks alright to me. I’m not used to 24hour time, but I already date things that way because frankly, it makes much more sense to write dates in order of YYYY/MM/DD or DD/MM/YYYY than the MM/DD/YYYY mixed nonsense my country thinks is hot stuff. And frankly, how many people actually READ the datestamp on these comments?

    (Also, if you date files with YMD then they autosort from oldest to newest on your computer when you sort alphabetically. It’s why I switched over in the first place, for sorting my art when I scanned it. Great time saver.)

  32. neolith says:

    This post is far from boring and one of the reasons I like to read your blog!

  33. Rutskarn says:

    There’s a lot of satisfaction to be gained from reading this post. This is because I have absolutely no idea how to begin to fix a problem of this kind, and it’s interesting watching someone else do it.

  34. TSHolden says:

    I’m a big fan of posts like this: I would really encourage you write your own threaded comments code, and make a series of posts about it. I think a lot of us would enjoy seeing how you would approach a project like that, and compare it to our own methods.

  35. Mari says:

    I like these posts, too. I’m also in the “really crappy coder” category. I understand HOW it works and all but getting MY logic to match computer logic is too time consuming for me to stay interested in coding myself. So I live vicariously through people like you. This way I get to enjoy the fun and interesting part of coding without enduring the frustrating part.

  36. Jabor says:

    @silver Harloe:

    IIRC, the wordpress core does not support caching. Though there are plugins (like SuperCache, which this blog uses) that handle it for you.

    The downside is that if you’ve ever commented before, the system has to give you an uncached version (so that you could edit your comments), so it probably wouldn’t be too much help on the 600-comment monsters.

    Also, the caching was hilarious with the old theme-switcher. “Hey Shamus, the theme switcher is broken. EDIT: Working now.”

    1. WJS says:

      Actually, I would think that that caching behaviour would indeed help with the comment-heavy posts. Shamus mentioned that the site was getting slammed by external links, and it’s pretty safe to say that of ten thousand reddit users, very few would have previously commented on the page.

  37. Eldiran says:

    ++ for this being a very interesting post. It’s a good reminder that sometimes a little debugging can go a long way.

  38. Von Krieger says:

    Let me toss another log of “I thought it was interesting” onto the growing bonfire.

  39. mookers says:

    Add me to the “just use the YYYY/MM/DD HH:MM:SS” date and time formatting cheer squad.

    I think that most of us here are geeks and will understand it anyway.

  40. Bryan says:

    Galenor (#6): No.

    The time here is *not* the time that it takes to send a given page to a browser.

    The time here is the time that it takes to *generate* the page’s HTML on the server, based on the incoming GET and the database contents.

    (Yeah, yeah, it was measured from a browser. But it was measured from a browser running on localhost, which has effectively no bandwidth limit.)

    The issue is *not* the size of the comments, it’s (or, it seems to be) the number of them. I strongly suspect that having a single 6000 character comment is *MUCH* faster than having six hundred 10-character comments, or six thousand 1-character comments.

    (And you can scale the number of comments there down by the extra-info/HTML/HTTP overhead if you want. I strongly suspect you’d still see a noticeable speedup (before this stuff was fixed) by serving some given N bytes in a single comment, versus spread across hundreds. And that suspicion is just based on the testing method. :-) )

  41. Lanthanide says:

    The ISO date stamp following YYYY-MM-DD should be unambiguous, because it is the only common format in which the year comes first, so you know the second number will be months.

    The DD-MM-YYYY that most sensible countries use is easy to confuse with the MM-DD-YYYY that less sensible countries use precisely because the only obviously unique part, the year, is in the same position in both formats.

  42. chaz says:

    Hmm, not so sure about threaded comments myself. Brilliant when they work, but on (say) reddit, it is completely annoying when the first comment gets hi-jacked by all the doofuses who want to appear on top of the page.

  43. Veloxyll says:

    @chaz: Well if that happens Shamus will just have to do his own first post hijinks again.

    As for the post itself, I found it interesting to read and see all the things he (he, you? I did do an @chaz before which could cause confusion. Would I be able to use You if it came before my @chaz? hmm!) experimented with to come to a solution. Even though I can’t code to save my life, it’s still fun to read how OTHER people solve code-problems.

  44. Narida says:

    As has been mentioned before, what I would have done here is implement some kind of caching system, seeing as most of the content on the page is static and I assume that the pages are generally read a lot more often than commented on…
    Seeing as the comment editing function seems to be the only thing that is not static, you could cache everything else: The whole page with the edit links replaced by something like: if (checkIfDisplayLink()) echo $link;
    I don’t know how or if this is supported by wordpress though…

  45. Luke Maciak says:

    Shamus, I’m not sure if someone suggested this yet, but have you tried telling WordPress to break the comments into pages? I think that there is an option under Settings and Discussion to enable paging. So for example you can say that only 100 first comments are displayed and if someone wants to see the next 100 they have to click a “More Comments” type link.

    I’m not sure if that would help though – I’m not sure how wordpress implements this. My hope would be that it will simply add a LIMIT bound to the SQL query and fetch 100 comments at a time – thus possibly fixing your issue altogether. But for all I know it might even make it worse adding another performance hit somewhere.
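The LIMIT idea can be illustrated with a small self-contained sketch. This uses Python and an in-memory SQLite table with a made-up schema, not WordPress's real tables or query code, so it only shows the shape of a paged fetch. The 678 rows mirror the finale's comment count.

```python
import sqlite3

PAGE_SIZE = 100  # comments per page, as suggested above


def fetch_comment_page(db, post_id, page):
    """Fetch one page of comments for a post (page numbers start at 1)."""
    offset = (page - 1) * PAGE_SIZE
    return db.execute(
        "SELECT id, body FROM comments WHERE post_id = ? "
        "ORDER BY id LIMIT ? OFFSET ?",
        (post_id, PAGE_SIZE, offset),
    ).fetchall()


# Hypothetical table for the demo only.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT)"
)
db.executemany(
    "INSERT INTO comments (post_id, body) VALUES (?, ?)",
    [(1, f"comment {n}") for n in range(1, 679)],
)

first_page = fetch_comment_page(db, 1, 1)
last_page = fetch_comment_page(db, 1, 7)
print(len(first_page), len(last_page))  # 100 78
```

With paging like this, the server formats at most a hundred comments per request, however long the thread gets.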

    1. Shamus says:

      WordPress does indeed support paging comments.

      But my theme doesn’t, which means you’d have no way to nav between pages.

      And it completely defeats the functionality of the dice roller which, while not crucial to the operation of the site, is part of the look of the site.

      So no matter how I do this, I have my work cut out for me.

      1. Dragonbane says:

        Threaded comments combined with paging appears to break the die-rolling functionality. The first page, as I’m posting this, ends in 53 due to the two replies made by Shamus (I expect it to end in 54 in a moment), while the second page starts with 51.

        Cheers!

  46. RPharazon says:

    This post interests me because I dabble a bit in programming, and I like to see how the pros figure out such things.

    Which is to say, I like to see that they fail in even bigger and simpler ways than us n00bs and dabblers. Oh well, at least I finally figured out how to make a dice-rolling algorithm from this post.

    Also, I don’t think that threading comments will help much. Every place I’ve gone to (other than forums) that has changed from a blog-style comment thing to a threaded comment system has invariably ended up as miserably closed and petty as it is possible to get on the internet.

    What I usually see is 60% of the discussion taking place on the first post, 30% of the discussion taking place on the second post, 5% taking place on the third, and the rest taking place in isolated comments later on, simply because people like having their comments read.

    But that also tightens down the posting material, since people also want to evade that off-topic, clearly-just-posting-in-this-thread-for-the-visibility banhammer.

    I’d steer away from it, if at all possible.

  47. Mari says:

    I don’t think it’s fair to discourage Shamus from using threaded comments based on what we’ve seen elsewhere on the ‘net. Frankly, what I’ve seen elsewhere on the ‘net is people who are incapable of holding coherent discussion in proper English. Aside from a brief period of griefers I’ve never seen “the people on the web” here at Shamus’s blog. If he can manage to insulate himself from such a widespread phenomenon as “the people on the web” I think he can probably manage threaded comments ;-)

  48. Steve C says:

    The dice roller is far too cool to give up for threaded comments.

  49. Kdansky says:

    I prefer “15:03:51” to “3:03:51 pm”. I also prefer 2000-08-29 to whatever insanity the Americans use.

    Lastly, can we finally have wider comments? The 50 comments on this page mean we have to scroll down about 25 screens, with most comments requiring a half to a full screen height. On the other hand, I’ve got 2/3 of my screen unused to the left and right. Of all the websites I visit, twentysided is the one with the worst wastage of screen real estate. And if you want threaded comments, it will really get out of hand.

  50. Simon Buchan says:

    Caching, paging. Fixes everything – even PHP.

  51. Veloxyll says:

    @Kdansky: I think the page is still set out for people using 800×600. Possibly because the comics on which this site is founded are designed for 800×600 and everything lines up nicely in the theme this way. A lot of the comment size comes from Gravatars, names and dates, especially with the shorter comments.

    I can see how things might get out of hand with threaded comments though, since if it indents like livejournal does you’ll quickly get posts where each word has its own line. Something to watch for anyhow.

    Also just noticed while writing this post – If I fill the comment box, the only way to scroll down is with page down or up/down arrows, there’s no scroll bar and mouse-wheeling has no effect. It doesn’t auto-scroll either. Using Firefox 3.5.7.

  52. I don’t think anyone mentioned it previously, but can you store a converted date once and then re-apply that to the following comments should they match that timestamp date? Since most comments happen on the same day (at least when you initially post) you could do single date conversions to the pretty format for a bunch of comments all at once (time would have to be done separately using a substring of the original timestamp and adding that in).

    Obviously this would be close to useless with comments that aren’t posted on the same day but you could save a bit of time doing it like that.

    I’m gonna be folding my paper and putting it into the ballot box with “I dig these types of articles” written on it.
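
The idea in comment 52 (convert each distinct date once and reuse the result for every comment posted that day) can be sketched roughly as follows. This is a Python illustration rather than WordPress code, and it assumes timestamps arrive as "YYYY-MM-DD HH:MM:SS" text; the function names `pretty_date` and `pretty_timestamp` are hypothetical.

```python
from datetime import datetime
from functools import lru_cache


@lru_cache(maxsize=None)
def pretty_date(date_part: str) -> str:
    """Convert 'YYYY-MM-DD' to display form; cached, so each day is parsed once."""
    parsed = datetime.strptime(date_part, "%Y-%m-%d")
    # Day and month names follow the process locale (English by default).
    return parsed.strftime("%A %b %d, %Y")


def pretty_timestamp(stamp: str) -> str:
    """Format an assumed 'YYYY-MM-DD HH:MM:SS' string for display."""
    # Split off the time with plain string handling; only the date is parsed.
    date_part, time_part = stamp.split(" ")
    return f"{pretty_date(date_part)} at {time_part}"
```

With a few hundred comments spread over a handful of days, the expensive parse would run once per day rather than once per comment. As the commenter notes, the saving shrinks when comments are spread across many different dates.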

  53. Nathon says:

    Does WordPress not have a profiler? I’ve never done any PHP stuff, but everything I have worked with (C, Python, wonky things nobody cares about) has profiling support. It seems like you could make useful improvements to WordPress as a whole by finding out what lines are taking such a huge chunk of the processing time.

    Also, my first suspicion is that comment_text() would do some nice protecting against XSS vulnerabilities, but I guess you would have to read the source to find out.

  54. drow says:

    i haven’t delved into wordpress code much myself, but if it’s possible, select unix_timestamp(date_col); then have PHP format from there. may be faster than having PHP parse the formatted date/time first.
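
A minimal sketch of what comment 54 describes, written in Python rather than PHP: if the database hands back an integer number of seconds (for example from MySQL's `UNIX_TIMESTAMP()`), formatting needs no text parsing at all. The function name `format_epoch` and the choice of UTC are assumptions made for this illustration.

```python
from datetime import datetime, timezone


def format_epoch(seconds: int) -> str:
    """Format seconds-since-1970 directly, with no string parsing step."""
    # Assumption: the value is treated as UTC for display.
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# 1264950231 is 2010-01-31 15:03:51 UTC.
formatted = format_epoch(1264950231)
```

Whether this is actually faster in WordPress would need measuring; the sketch only shows that the parse step disappears when the stored value is already numeric.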

  55. Lou Erickson says:

    Dates can be surprisingly slow, depending on how they’re implemented. I don’t know how WordPress or PHP specifically does it, but here’s a common thing which I’ve seen before:

    Date is stored as some collection of numbers, perhaps YYYYMMDDHHMMSS. This would be trivial to break out as characters and format up. You call a library function to do so.

    The library function is written in C. C wants dates stored as the number of seconds since 00:00 1 Jan 1970 GMT. It does a whole lot of multiplication and logic to get that number of seconds.

    Once it has seconds, it can then do it all backwards to reformat the date into something sensible.

    Since you didn’t keep that intermediate C-style time around, when you ask it for the time… it gets to do all that math again.

    Long ago I sped up a program by more than 100x by being more careful with time/dates. Two weeks ago, a co-worker edited a Perl script that was taking 17 seconds to parse data and develop a graph into a script that took 3 seconds to parse the same data and develop the same graph, by simplifying the date library used.

    Admittedly, dates are hard to get right, but it’s also easy to spend way more time on them than you’d ever expect.

    1. WJS says:

      You’ve got it pretty much completely backwards. If the dates were stored in a sane manner – Unix timestamp – then it wouldn’t need to do the expensive parsing to convert a text string – your “YYYYMMDD” – into a date object. WordPress storing the dates as text is the problem, not the solution. (Sure, you could make it faster still by writing your own function to convert dates, but we use libraries to avoid writing our own code)
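
The round trip discussed in comment 55 and its reply can be sketched like this: text is parsed into seconds since 1970 (the costly step), and those seconds are then formatted back into text. Keeping the intermediate value around means the parse happens once, however many output formats are needed. This is a Python illustration assuming a "YYYYMMDDHHMMSS" stored form and UTC; the helper names are hypothetical.

```python
import calendar
import time


def text_to_epoch(stamp: str) -> int:
    """Parse 'YYYYMMDDHHMMSS' text into seconds since 1970-01-01 00:00 UTC."""
    return calendar.timegm(time.strptime(stamp, "%Y%m%d%H%M%S"))


def epoch_to_text(seconds: int, fmt: str) -> str:
    """Format epoch seconds as UTC text using a strftime pattern."""
    return time.strftime(fmt, time.gmtime(seconds))


stamp = "20100131150351"
seconds = text_to_epoch(stamp)  # the expensive parse, done once and kept

# Both display strings reuse the stored seconds instead of re-parsing the text.
date_line = epoch_to_text(seconds, "%Y-%m-%d")
time_line = epoch_to_text(seconds, "%H:%M:%S")
```

If the value were stored as an integer in the first place, as the reply suggests, `text_to_epoch` would not be needed at all.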
