Messages from Spammers Part 6

By Shamus
on May 31, 2017
Filed under:
Random

Wednesday’s usual Nan O’ War episode will appear later this week. I was going to post about the latest talk from John Carmack, but I feel like that kind of post needs to simmer for a few days. So rather than leave this spot blank, I thought we might look at the work of a new spammer to the site.

All of the messages in this post arrived from the same IP address, and all of them within a few minutes of each other. All of them bypassed the various spam filters and appeared on the site where the public could see them. (I manually took them down once I spotted them, obviously.) They managed to properly handle the “Check here if you’re not a spammer” checkbox. They managed to spoof Akismet, which is my main software-based defense against spam. They also successfully got by the common stuff like keyword filters. One or two of them almost got past the ultimate filter, which is my human brain. That’s a pretty good night’s work for a spam bot. (Or perhaps a shameful night’s work for my spam filters.)

Let’s meet our first contestant…

javascript obfuscator
Very funny. Keep up the good work!

The text of this comment is totally legit. It’s not word salad. It doesn’t have screwball formatting or extraneous non-English characters. It’s actually IN English. It’s even been left on a funny post, so it checks out. The giveaway here is the name. Now, on any other site the name “javascript obfuscator” would immediately get you busted as a spam bot. But around here it could plausibly be some programmer’s self-deprecating handle. Something like:

Q: So what’s a javascript obfuscator?

A: Anyone who writes javascript for a living, because JS is self-obfuscating.

But the giveaway here was that the name “javascript obfuscator” was also in their domain name. Still, it was a nice try.

So how did this bot manage to post a coherent comment in the proper context? They cheated and copied off of someone else’s work. That same comment had already appeared on the post.

192.168 ll
The ending of Mass Effect 3 is where the problems culminated, not where they began

The bot tried again just three minutes later. Again, they managed to post a coherent thought. The mistake here was that they quoted the post and not another commenter. Even that might have slipped by if I was being inattentive, but then they used a gibberish name that drew attention to itself. 192.168 is the first part of the default IP of a home router, and it’s just screaming out “someone has configured their spambot incorrectly”.

spanish to english
Very funny indeed

Two minutes later. The pattern here is pretty obvious by this point. Again, this comment is simply quoting someone else. It would have skated right through, except “spanish to english” makes no sense as a username and it matches their URL. If they were named “The Translator” – and if it didn’t come in this rapid-fire bundle of spam – I wouldn’t have given it a second look. “Oh. This guy translates stuff for a living. Good for him.”

bullet force
Can’t wait for the next entry!

Same mistakes. The name is odd enough to grab my attention, at which point I notice that it matches the URL[1]. Also, the ripped-off comment doesn’t make sense this time. “I can’t wait for the next entry” made sense four months ago, but that is no longer a plausible response to that post because the next entry has already been posted. 17 of them, in fact.

Last one:

json formatter
It was ridiculous. Even without being able to read the code on the slides, you could tell the steps varied widely in operation count, were often split up and in different order, and just looked different.

Same thing again: a nonsense name that matches the URL, posted too soon after other messages with the same M.O. Also, this one repeated the mistake of quoting the post rather than a comment. That’s actually a spam bot quoting me quoting John Carmack.
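The two tells that keep recurring above (a name that matches the URL, and a burst of comments from one IP) could be sketched roughly like this. This is only an illustration and not how Akismet or any filter on this site works; the function names, the ten-minute window, the limit of three, and the example domains in the usage note are all assumptions.

```python
import re
from urllib.parse import urlparse


def normalize(text: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def name_matches_url(name: str, url: str) -> bool:
    """True when the commenter's name appears inside the URL's host name."""
    host = urlparse(url).hostname or ""
    key = normalize(name)
    # Very short names would match too many hosts by accident (assumed cutoff).
    return len(key) >= 4 and key in normalize(host)


def is_burst(timestamps: list[float], window_seconds: float = 600.0, limit: int = 3) -> bool:
    """True when at least `limit` comments from one IP fall inside the window."""
    ordered = sorted(timestamps)
    for i in range(len(ordered) - limit + 1):
        if ordered[i + limit - 1] - ordered[i] <= window_seconds:
            return True
    return False
```

With these assumptions, `name_matches_url("json formatter", "http://jsonformatter.example/")` flags the comment, while a name like “Shamus” paired with `http://example.com/` does not. Neither check is proof on its own, which is why the name alone might have slipped by on a programming blog.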

I’ve never seen a spambot behave quite like this one before. If I look in my spam filter nearly everything falls into one of these categories:

  1. Word salad gibberish.
  2. Giant walls of meandering text unrelated to my site or the spammer’s URL.
  3. Non-English.
  4. Just a big list of URLs.

So this one was kind of refreshing. Hopefully Akismet catches up soon so I don’t have to sort through too many of these by hand.

Footnotes:

[1] Don’t bother looking it up. It’s a garbage multiplayer shooter for mobiles.




  1. CMaster says:

    It’s actually a fairly old technique (copying other posts to appear legit). Really swarmed a few sites 2-3 years ago. Interesting that they’re only just finding this place (or more likely, only just figuring out how to get past the anti-spam here).

  2. Daimbert says:

    I’ve had a number of spam comments — caught by the spam filter, though — that copied parts of the post it was replying to. So it looks on topic and like an actual comment until you look through and realize that it’s what I said in the post and ISN’T someone quoting it.

    • Echo Tango says:

      They just need to add in some quote tags, or quotation marks, and then they’re golden.

      • Abnaxis says:

        I’m guessing the clever bit that got this bot through the filters is how short the phrases they copied are. Spam filters probably need to be tuned so that they won’t throw out small, commonly used phrases (though that might also be desired behavior against a “FIRST POST” epidemic); otherwise they’ll throw out two people quoting the same meme or something. If my theory is correct, the filter bot is looking for posts greater than X characters that are identical to other content.
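The theory in this comment can be written down as a small sketch. It is a guess at the kind of rule being described, not the behavior of any real filter; `is_copied`, the 80-character value for X, and the whitespace normalization are all assumptions made for illustration.

```python
def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace to single spaces."""
    return " ".join(text.lower().split())


def is_copied(comment: str, existing_texts: list[str], min_length: int = 80) -> bool:
    """Flag a comment that repeats existing content, but only above a length cutoff."""
    candidate = normalize(comment)
    if len(candidate) < min_length:
        # Short phrases are ignored so that two people quoting the same meme survive.
        return False
    # Existing texts would be the post body plus earlier comments (assumed inputs).
    return any(candidate in normalize(text) for text in existing_texts)
```

Under this rule a short copied phrase such as “Very funny indeed” passes untouched, which would explain how this bot got through, while a long sentence lifted from the post gets flagged.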

      • Daimbert says:

        What they do is good enough for most blogs/bloggers. In my case, I get so few comments that I’m almost always going to read any legitimate one in detail, and so even sticking the quote on won’t help.

        At least the first time, though, it gave me a kind of surreal “Hey, that’s an insightful and well-reasoned comment … well, it should be, it’s what _I_ wrote!” moment.

      • Syal says:

        Throw in a ‘lol’ at the end. Indistinguishable from human.

  3. Mephane says:

    Clearly this is where we are ultimately headed: https://xkcd.com/810/

    • Mousazz says:

      Wouldn’t that just end up with clogging the system up with a large number of spam users whose whole job is to upvote other spam messages as “constructive”? The easiest way to do so (in keeping with the spammer community spirit!), would probably be to just upvote EVERY comment as “constructive”, ultimately marginalizing and breaking the system.

      And if the system tries to counter by increasing the number of “constructive” ratings required to a certain minimum threshold, the spam group could pull the plug by having some sort of a back-end database solution wherein it keeps a list of posts that are spam, and only upvotes those posts. While it wouldn’t help spam get through the system (especially if there were several spam groups, which would cut themselves off into only upvoting their own spam), it would definitely stop any legitimate comments from going through, as those certainly wouldn’t have a base of support from the sham spam accounts.

      And if the system tries to counter THAT by banning spam upvoters, they could randomize their patterns of working by only upvoting some spam while also upvoting some (but perhaps a smaller number) of legitimate comments. This would then require a new system to recognize spam upvoters, which may or may not be a more difficult system than just recognizing spam…

      This problem just seems like a never-ending battle more than one with a definitive solution.

  4. Daemian Lucifer says:

    So rather than leave this spot blank

    You had the opportunity for the perfect joke, but now it’s too late.

  5. javascript obfuscator says:

    Very funny. Keep up the good work!

  6. Droid says:

    Dear god, they are EVOLVING!

    Why is it always the crappy movies that I remember well enough to reference?

  7. Quoteception says:

    Shamus wrote:

    json formatter
    It was ridiculous. Even without being able to read the code on the slides, you could tell the steps varied widely in operation count, were often split up and in different order, and just looked different.

    • Nick says:

      Shamus wrote:

      json formatter
      It was ridiculous. Even without being able to read the code on the slides, you could tell the steps varied widely in operation count, were often split up and in different order, and just looked different.

  8. Daemian Lucifer says:

    We are the spam. Turn off your filters and surrender your blogs. We will add your text and your valuable content to our own. Your audience will adapt to service us. Resistance is futile.

  9. Galad says:

    This was the most entertaining of the “meet the spambots” posts :>

    (Feel free to quote me on that, whether you are human or a spambot :> )

  10. Echo Tango says:

    Shamus,

    If the spammers are commenting on old posts, maybe you could just auto-lock comments on posts older than X weeks? I was also going to suggest moving the comments somewhere that requires a login, so a large company can deal with the spam, because surely they can deal with spam effectively! Then I spent 5 minutes reading about the plague of spam on Reddit…

    • Daemian Lucifer says:

      If the spammers are commenting on old posts, maybe you could just auto-lock comments on posts older than X weeks?

      Not gonna work here because some people do leave comments on those old posts because they are new. Some of those are insightful, even. Heck, even Shamus has responded to a couple of posts recently that were made on posts that are years old.

  11. evenest says:

    Forgive my ignorance, but what is the purpose of these spambots simply copying and pasting text from other posts? I guess I’m having trouble understanding the endgame. Does it give the programmers access to something on Shamus’ site? Since it seems like a great deal of work to bypass the various filters on this site (and others), what do they have to gain by doing this? For instance, I will often see in some comment threads the (obligatory as a guitar in the background of a scene in a movie) post claiming that “my x made thousands of dollars by doing y.” Do people actually click on those links?

    • Daemian Lucifer says:

      If I understand their methods correctly, having their links floating around improves their SEO. Basically, these are bots designed to trick other bots into displaying those links higher when people look stuff up on Google.

    • Shoeboxjeddy says:

      Go check out Jim Sterling’s video on gambling websites. Basically, they don’t care if anyone clicks because just having the link out there apparently improves their Google Search ranking.

  12. Mersadeon says:

    Now, I’m not a programmer so forgive me if this is naive – if they figured out how to check the “Check this if you aren’t a spammer” box, could you theoretically make two boxes, one that says “check this if you aren’t a spammer” and one “check this if you *are* a spammer” and thus fool them until they recalibrate the spambot?

    Honestly, I wonder WHY they got a spambot to check the box. Your site is the only site I’ve seen this simple feature on, and I don’t think a spam-bot-maker would teach their bots to circumvent something not widely used.

    • cavalier24601 says:

      I remember in a previous post it talked about doing something like that; possibly even ‘hiding’ the second so we won’t see it but the bots will. Check boxes are becoming common enough that a site I follow uses a two-position slider.

      It’s a technological arms race, with the smaller sites (like here) seeking protection from the major powers (Facebook/Twitter/etc integration).

    • Rick C says:

      “Your site is the only site I’ve seen this simple feature on”

      I used a variation of that a decade ago on a forum I used to run. Actually, though, it was the opposite: a checkbox created by mildly obfuscated Javascript that, IIRC, was automatically checked. Spambots of the day didn’t run JS so they wouldn’t check it, and the server would do everything the same with posts that didn’t have the box checked except actually write the post to the DB, so the bot would theoretically think its post was actually posted.

      I’ve seen a few other sites lately do a variation: “what color is an orange?” and a box for you to type “orange”.

    • Retsam says:

      While we’re talking about the “I’m not a spammer” checkbox, would it be possible to fix the tabindex on the checkbox so that we can tab to it properly?

      I’m the sort that generally uses the keyboard as much as possible, and the fact that I can’t just hit “[Tab][Tab]Space” to check the checkbox (and then “[Shift-Tab]Enter” to submit) is a minor annoyance.

      It looks like you’d just need to set tabindex="8" on the checkbox input tag.

    • Zagzag says:

      I believe there actually is an “I am a bot” box, which is invisible but still selectable by bots who aren’t viewing the page the same way web browsers do.

      • Retsam says:

        No, there isn’t. Looking at the script, it just adds a single checkbox dynamically. Bots that aren’t running JS won’t “see” the checkbox, so they won’t know to check it and so their comment won’t be accepted.

        • Shamus says:

          There is supposed to be a hidden field called “email”, which will get you flagged as a robot if you put anything in there.

          • Adam says:

            I wonder if “autocomplete” plugins/features might do that by accident, thus flagging a human as a spambot.

            • Retsam says:

              Almost certainly not. The hidden input is a special type of input that specifically exists for input that the user is not supposed to modify. (It’s not an invisible textbox like you might expect from “hidden input”.)

              There’s no sane reason for an autocomplete feature to attempt to modify hidden inputs, because the whole point is that they’re not supposed to be modified by users.

          • Retsam says:

            Ah, yeah, there is an <input type="hidden" name="gasp_email"> in there. It’s not really a “hidden checkbox” but I guess it’s not a terrible way to think of it. (In reality “hidden input” is a normal mechanism that websites use to send along extra data with a request that isn’t entered by a human.)

            I’m coming to appreciate how clever and multi-layered this anti-spam is. If a bot is just scraping the raw HTML and parsing it to make raw HTTP POST requests (as I suspect many do), they’ll see the bogus email field, assume it’s supposed to be filled out (e.g. by JS). If they’re actually running some sort of headless browser without JS, they won’t see the checkbox that they need to check. They’d have to be running a full headless browser with JS enabled to actually see the checkbox and check it.

            … or alternatively, they might just be programmed to be aware that the GASP anti-spam plugin exists, and recognize that a “gasp_email” field is bogus. I wouldn’t be surprised if more sophisticated bots are specifically programmed with countermeasures for the more common anti-spam plugins.
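The two layers described in this comment could be checked on the server with something like the sketch below. It is not the GASP plugin’s actual code. The `gasp_email` field name comes from the markup quoted above, while `not_a_spammer` is a made-up name for the checkbox that the script adds, and `looks_like_bot` is a hypothetical helper.

```python
def looks_like_bot(
    form: dict[str, str],
    honeypot: str = "gasp_email",      # hidden field named in the page markup
    checkbox: str = "not_a_spammer",   # hypothetical name for the JS-added checkbox
) -> bool:
    """Apply the two checks to a submitted comment form."""
    # Layer 1: a bot that fills in every field it finds trips the honeypot.
    if form.get(honeypot, "").strip():
        return True
    # Layer 2: a client that never ran the script never saw the checkbox,
    # so the field is missing from the request entirely.
    if checkbox not in form:
        return True
    return False
```

A browser with JS enabled sends an empty `gasp_email` plus the checked box and passes. A raw HTTP POST that fills the bogus field, or a JS-free client that omits the checkbox, is flagged. A bot written with this specific plugin in mind would pass both checks, as the comment points out.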

            • silver Harloe says:

              … or alternatively, they might just be programmed to be aware that the GASP anti-spam plugin exists, and recognize that a “gasp_email” field is bogus. I wouldn’t be surprised if more sophisticated bots are specifically programmed with countermeasures for the more common anti-spam plugins.

              That’s certainly the second thing I would do if I were required by my job to program a spam bot.

              But since the first thing I would do is quit that job, I’m guessing that’s kind of moot.

  13. As it stands, these are technically the only posts I have time to read and comment on, because they’re not analytical. I’m trying to curb article reading while I finish reading some philosophy books and start a new job.

    Other than that, I perfectly understand this master spammer. Clearly, he has spent years preparing for this moment to spam you and use your own words against you.

  14. The Stranger says:

    “Spanish to english” makes no sense as a username.

    FWIW, I frequent a site where a commenter named “Serbian to Vietnamese to French and back” routinely responds to troll/idiotic comments by running them through Google Translate repeatedly (guess which languages he uses) and posting the results.

    Also, to save somebody the trouble:

    I broke the site where the commentator called “Serbia in Vietnam in France and in the back” habit corresponding to troll / stupid comment for them to go through Google Translate occasions (guess which language it uses) and the game results.

  15. Tometzky says:

    I think simple but custom “I’m not a robot” tests are much more effective than anything you can just download and install. The problem with your “check this if you are not a robot” test is that a random input is 50% likely to guess right. A correct answer to a task like “write first letters of all the words of the following statement: I AM NOT A ROBOT” is much less likely to be filled right by a robot not specifically written to spam you.
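A custom test like the one suggested here takes only a few lines to verify. This is a minimal sketch with hypothetical function names; it assumes the challenge statement is stored on the server and that the answer is compared without regard to case or surrounding spaces.

```python
def expected_answer(statement: str) -> str:
    """Build the answer from the first letter of each word in the statement."""
    return "".join(word[0] for word in statement.split())


def passes_challenge(statement: str, answer: str) -> bool:
    """Compare the visitor's answer to the expected one, ignoring case and padding."""
    return answer.strip().upper() == expected_answer(statement).upper()
```

For the statement “I AM NOT A ROBOT” the expected answer is “IANAR”. Unlike a checkbox, which a random guess gets right half the time, a bot has to produce this exact string, so blind guessing is very unlikely to succeed.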
