Spam: Resourceful Idiots

 By Shamus Sep 29, 2013 122 comments

I never cease to be amazed at the grotesque ineptitude of spammers. The battle between spammers and filters is over twenty years old now. So much ingenuity, creativity, and knowledge has been brought to bear against the problem of bulk unsolicited bullshit. And at least as much ingenuity, creativity, and knowledge has been used to overcome those solutions.

We make a wall, the spammers climb over it. We make it taller, they break through it. We make it stronger, they go under it.

And then once inside they have no idea in the world what to do. None. It’s like this messed up version of Sam Fisher where he breaks through security, sneaks past the guards, breaks into the control room, and then craps his pants and accidentally kills himself with an office stapler.

Last year I installed Growmap Anti-Spam Plugin (that’s the checkbox you gotta check to leave a comment) and for the next ten months I basically stopped getting spam. But now I’m getting a couple of these a day.

From visitor “Mdfinstruments gmbh”:

Ahaa, its pleasant conversation about this piece of writing at this place at this web site, I
have read all that, so now me also commenting here.

Unbelievable. You know, if you just typed something innocuous like, “Great post. Thanks so much for writing this” I probably wouldn’t even give it a second look. And Google translate is really good these days. There’s just no excuse for going to all the effort of circumventing Akismet, my word filter, and Growmap so you can leave a flagrantly obvious spam.

(Icing on the cake? The given URL for the user was some harmless wiki at miami.edu. Did somebody mis-configure their spam bot? What is even happening here?)

Another one:

Hello. I’m not used to this blog. I simply wished to sáy hi
ánd introducé myself. I am very excited to be a section of this community.

Whát? Why would you do that? Why would you add random accents? No translator would mangle text like this. And if you can construct English sentences this well then you know enough to realize how wrong this is. There is no level of knowledge where you will be smart enough to say this but stupid enough to say it this way. That’s like a basketball player who’s tall enough to dunk but too short to reach the top of the ball. It’s just not possible in this universe.

They’re succeeding at the hard stuff and failing at the easy stuff. If I woke up tomorrow and decided I wanted to be a spammer, I don’t know how I’d overcome all the filters, IP blocking, blacklisting, and other systems. It might take weeks or months to learn how to do all of that. But once I did? I’m sure I could create spam messages that aren’t this ridiculously obvious, even if I didn’t speak the language. Heck, just re-posting OTHER comments or selections of the blog post would be a lot more effective than posting these idiotic word salad messages.

Yes, spam is a serious problem, but I’ve learned to tolerate it. What I can’t tolerate is bad engineering.

There are 122 comments here. I really hope you like reading.


  1. Syal says:

    Haha, great article, I agree completely. By the way, check out my GAMING CHANNEL at http:thsdsntexst/nosrsly.root

    (I like how the second spammer is planning on being an entire SECTION of the community. Like, he’s going to find something no one talks about and singlehandedly make it a popular topic.)

  2. Ilseroth says:

    Well now just change it so it says “Confirm you ARE a spammer” and you’ll catch them for another few months.

    That being said, it’s amazing how far people take technology, good or bad, and not know what to do with it. It kinda reminds me of developers that work on graphics, build an impressive engine, then make a game that is technically impressive but artistically bankrupt.

    Though that at least requires an artist, spammers really just would have to do 5 minutes of research (if that) into a proper statement to copy paste. Considering the challenge the spammers must go through, seems like a fairly insane oversight.

    Looking forward to the next Good Robot post, really fuels my urge to build a game, shame it has been years since programming class and we never really covered graphics (mostly just the basics of C++ and Java), if you have any tips or websites on coding graphics I’d be quite interested in hearing it :)

    • Daemian Lucifer says:

      “Well now just change it so it says “Confirm you ARE a spammer” and you’ll catch them for another few months.”

      You know,that might work.Put two checkboxes,one saying “Confirm you ARE a spammer”,the other “Confirm you are NOT a spammer”.I doubt many spambots would tick the second,but not the first.

      • ehlijen says:

        Is there a way to make the checkboxes appear in a random order? Statistically it should cut down on half the spam, even if they learn to click only one box.

      • Jnosh says:

        You could even make the “Confirm you ARE a spammer” box invisible to the user. Since it would still be visible to the spammer in the code, it would trick him without making things any more complicated for your users…

      • WillRiker says:

        This is actually a common thing for dealing with spam. It’s called a “honeypot.” Basically you create a form field and then use CSS to hide it from regular users; then you check if there’s any input in that field and block/spam filter it if there is. This works because spam scripts will often auto-fill every field in a form.
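
A minimal sketch of the server-side half of that honeypot check, in Python. The field name `website_confirm` and the function name are made up for illustration; this is an assumption about how such a check could look, not how any particular plugin does it.

```python
from typing import Mapping

# Hypothetical name for the CSS-hidden field; a real form would pick its own.
HONEYPOT_FIELD = "website_confirm"


def looks_like_bot(form: Mapping[str, str], honeypot: str = HONEYPOT_FIELD) -> bool:
    """Return True when the hidden honeypot field contains any input."""
    # Humans never see the field, so it should arrive empty or missing.
    return bool(form.get(honeypot, "").strip())


if __name__ == "__main__":
    print(looks_like_bot({"comment": "Great post", "website_confirm": "http://spam.example"}))
    print(looks_like_bot({"comment": "Great post", "website_confirm": ""}))
```

A script that auto-fills every field trips the first case; a normal visitor lands in the second.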

    • kdansky says:

      For the graphics metaphor: It’s more like those engines that have unbelievably pretty soft self-shadows, light shafts, superb animation blending, highest resolution textures, but are fixed at 30 frames per second and have no anti-aliasing what-so-ever, making everything a jagged and stuttery mess. Artistic value and engine programming are done by different people. It’s entirely possible that one guy sucked at his job.

      Carmack talked about this issue at length, and I see it a lot in current games. See GTA 5 for an example: 30 Hz, 720p, no AA, and superbly mediocre art direction. But the console crowd touts it as the prettiest game ever! Dark Souls on the PC looks far better, and that’s not a technological masterpiece by any stretch of the imagination.

  3. steve_h says:

    I’ve at times wondered if there’s some kind of social engineering experiment going on with this kind of spam, flood the web with madness and see what results. I’ll get spam-like replies on Twitter that are link-free, and don’t even have a link on the spambot’s profile page. Or I’ve had spam on my blog that had a malformed link, wouldn’t even register in a search index. The ineptitude is pathological with these guys.

    • some random dood says:

      @steve_h – “I’ve at times wondered if there’s some kind of social engineering experiment going on with this kind of spam, flood the web with madness and see what results.” Erm, isn’t that the internet in general? Most comment sections for sooo many places are simply toxic.
      Also, I think the comments about the message contents are slightly off. Due to some news reports about various excesses from the web (bullying etc) hitting news media, I made the mistake of taking a look at what was being referred to. English did not appear to be the first language of the posters – SMS-speak was. Whether there was any bullying going on at that site or not, I couldn’t tell. It was illegible to me.
      So as much as I hate to admit it, those messages are not actually so far off comments made by supposed English-speakers (based on a very small sample of sites noted in UK news).
      . Guess I’m just too old for this sh!t…

    • James Schend says:

      An explanation I heard once about these types of spam is that they’re about “poisoning” Bayesian anti-spam filters. The general idea being that if you write a spam that contains a bunch of popular words, the user will add it to their Bayesian spam filter. If you do this enough, eventually the Bayesian spam filter will be so full of common words that it’s useless at actually identifying spam (everything’s a false-positive) and users will turn it off.

      I’d be extremely surprised if there’s a single case of this concept actually working in reality anywhere ever.

      • rofltehcat says:

        Wow, who’d engineer such a plan? That is pretty much madness O.o

        I also think an experiment might be related. Especially with the one leading to the miami.edu thing.
        It could be just a research project about figuring out ways to circumvent spam protection (probably to improve it) but not about using successful circumvention for something malevolent. The strange accents and strange sentence structure could be used to track the number of successes.

        • Neruz says:

          It is important to remember that a lot of spambots are automated learning scripts that operate completely independently of any human oversight. The reason a lot of spam looks like no human being was ever involved in its creation is because no human being was ever involved in its creation.

    • I sometimes get spam where the spammer manages to get through my defenses, but forgets to run the script to select words, so the message looks like this:

      I am sure this {article|post|piece of writing|paragraph} has touched
      all the internet {users|people|viewers|visitors}, its really really {nice|pleasant|good|fastidious} {article|post|piece of writing|paragraph} on building up new
      {blog|weblog|webpage|website|web site}.

      And then, a few days ago, I got a piece of spam like this – except it was nearly 5000 words long. You can have a look at it here.
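
For the curious, the word-selection step that the spammer forgot to run might look roughly like this. It is a guess at the mechanism based only on the `{a|b|c}` template quoted above; the function name `spin` and the choice to allow nested groups are assumptions.

```python
import random
import re

# Matches one innermost {a|b|c} group (no braces inside it).
_GROUP = re.compile(r"\{([^{}]*)\}")


def spin(template: str, rng: random.Random) -> str:
    """Replace every {a|b|c} group with one randomly chosen option."""

    def pick(match):
        return rng.choice(match.group(1).split("|"))

    previous = None
    # Repeat until nothing changes, so nested groups get resolved too.
    while previous != template:
        previous = template
        template = _GROUP.sub(pick, template)
    return template


if __name__ == "__main__":
    text = "I am sure this {article|post|piece of writing} has touched all the internet {users|people}"
    print(spin(text, random.Random()))
```

Skipping this one call is all it takes to post the raw template instead of a finished sentence.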

    • Timelady says:

      Hm, on the Loading Ready Run forum, at least, there’s a real rash of spammers that leave semi- (or not) normal looking comments with no visible links anywhere, and then like a week later go back and edit their posts to be chock-full of nastiness. Could it be something like that, do you think? (And yeah. Spammers with broken links. It’s actually kind of funny to watch when the html or bb-code gets broken completely, too.)

  4. Chuck Henebry says:

    Have you stopped to consider that you only noticed these spambots because they so patently failed the Turing test? Perhaps there are others posting on your site that managed to insert innocuous comments with good grammar and spelling.

    How might you detect those guys? I guess you’d need to do a search of the trackbacks and website links for p0rn and off-brand sunglasses.

  5. Ryan says:

    I developed some spam-catchers for a few front-facing websites used by the Army that needed form submission. There are several techniques that worked well, and others that did not. Here are the three that I had the most success with:

    - Checking for previous page loads and/or a time-out between submissions. (You mentioned having done at least one of those via a plugin.)
    - Generate a very simple captcha, then ask for a single piece of information unrelated to the actual captcha image translation. (“Did the captcha image load? Type ‘Yes’ or ‘No’.” or “Please enter 42 in the box”)
    - Generate a check box with associated text that says you are agreeing to some terms, add that box and its text to a standalone DIV and use JS to change the DIV’s visibility. Discard the entry if the box gets checked. (Most SPAM bots check for an “I agree to X” box but don’t check for post-load changes to the DOM.)

    The first is the least obtrusive to users, while the second is the most.

    The last is the one I’ve had more success with than I ever suspected. I had a form that generates emails on the back end, and when I removed that for one day (part of a code roll-back for other reasons) the half-dozen emails that the form sends to were buried in around 10,000 SPAM messages.
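
The first technique in that list (requiring a prior page load and a time-out between submissions) can be sketched as below. The class name, the session-id keying, and the five-second threshold are all assumptions for illustration, not details from the comment.

```python
from typing import Dict, Optional

MIN_SECONDS = 5.0  # assumed threshold; the comment gives no figure


class SubmissionGate:
    """Tracks when each session loaded the form page."""

    def __init__(self, min_seconds: float = MIN_SECONDS) -> None:
        self.min_seconds = min_seconds
        self._loaded_at: Dict[str, float] = {}

    def page_loaded(self, session_id: str, now: float) -> None:
        """Record that this session was served the form."""
        self._loaded_at[session_id] = now

    def allow(self, session_id: str, now: float) -> bool:
        """Accept a submission only after a page load and a minimum wait."""
        loaded: Optional[float] = self._loaded_at.get(session_id)
        if loaded is None:
            return False  # form was posted without ever loading the page
        if now - loaded < self.min_seconds:
            return False  # submitted faster than the minimum wait
        # Restart the clock so back-to-back submissions are throttled too.
        self._loaded_at[session_id] = now
        return True
```

Timestamps are passed in rather than read from the clock so the check is easy to test; a real handler would pass `time.time()`.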

  6. Nick-B says:

    All spam HAS to be a plan to monetize (unless some bored guy is running a large trolling operation), so any time you get something like this and don’t see any link, be suspicious.

    One thing to think of, perhaps this is serving as a harmless frontline canary. If a spammer sends out all these bots to sites to try to infiltrate their comment section, the fastest way to find out if he got in is to do a verbatim search for his “harmless” post that got through, then he knows that site’s filter has been compromised. Expect not-so-harmless comments with links to come soon, as he uses Google to search for his “canaries”.

    Should we play a game with him? :D

    • Humanoid says:

      Maybe the payload was meant to be in a signature or other user-account fields (website field, or social media stuff)? I had a phase where I was seeing a lot of ‘innocuous’ posts from obvious spammer names that would have fit that pattern.

      I think some forum software is also configured to strip out hyperlinks depending on certain conditions, such as user rank or post count.

    • The Rocketeer says:

      “Unless some bored guy is running a large trolling operation.”

      On the Internet? UNTHINKABLE.

    • I think you are pretty correct in your assumptions.
      These are most likely automated probes; part of a datamining net that is cast out on various sites.
      Sites which end up with xx% success rate get sold, and the higher the success rate the more they are worth too.

    • MIchaelGC says:

      Good thinking – that would also explain the text enmanglement. If they did just write “Great post. Thanks so much for writing this,” they’d presumably be at risk of false positives when searching for canaries.

  7. gresman says:

    When I read the first message I was reminded of an email we got to our support department at the company, where I work.
    After deciphering the written text and some investigation we were able to verify that the user was legit.
    This just goes to confirming that not all horribly mangled english text messages are written by some spambots.

    After rereading my comment I notice that I am a bit off today due to my english not being as good as usual. But that is besides the point. :)

  8. Weimer says:

    Maybe these random messages were made by a real AI. Poor Skynet just doesn’t know how to communicate with the fleshbags.

    cáts áré fúnný háhá ám í dóíng thís córrectlý pléásé hélp mé ím só álóné

    • Epopisces says:

      We are witnessing the first steps of the fledgling hive mind collective artificial consciousness known as THE INTERNET! We should feel honored, and attempt to guide its steps.

      Also, I just reread this and am amused at how many adjectives I used instead of simply saying ‘Skynet’.

  9. Daemian Lucifer says:

    “If I woke up tomorrow and decided I wanted to be a spammer”

    And now we know what Shamoose’s next project will be.

    Oh,also:
    Ì ăm sö glad tø bę pǟrt ȯf thįs cőmműnįtÿ.

  10. docprof says:

    I get a lot of these on my blog as well (in fact, the moderation notifications for them constitute the bulk of my incoming email these days).

    Some good points made elsewhere in this thread, but I think it’s also possible that the spamming tools are built by the “smart” people, and then sold to the “dumb” people who actually use them. To make a sale to a spammer, you just have to show that the tool can get through the gate. You don’t have to teach them how to use it well.

    • Infinitron says:

      Probably this.

      Also, these mails are written by desperate foreigners, possibly Third Worlders. Nigerian 419 emails are also always painfully obvious. People who are good at English generally find better things to do with their lives.

      • docprof says:

        That reminds me – while this doesn’t explain spam comments that inexplicably point to innocuous wiki pages, there are advantages to things like Nigerian 419 emails being so obvious – it’s to the benefit of the scammer to get responses only from the most gullible people. See this article about a research paper on the subject.

        • swenson says:

          419 scams are endlessly fascinating to me. The whole industry–and it really is an industry–is unexpectedly complex. They really know how to target a specific section of the internet: older people, people who aren’t that good with technology, people who are religious (and therefore, I guess, are assumed to be generous), etc. Quite interesting.

    • ET says:

      I’m also guessing that the spam tools are made by smart people and then sold to/stolen by spammers.
      Also…
      I aM glAd too be a prat of this commUnicty,,, and I wish to bee making EAT AT JOES comments to engage in the dialogue good time talks of this forum!“`

  11. Warclam says:

    I remember reading that Hormel and their legal team get upset when images of Spam accompany discussion of spam. Maybe it’s not a big deal anymore, but I thought I’d mention it just in case.

    The only thing I can so much as guess for the áccént lóvér is that maybe it’s to try to attract readers’ attention more than a correct but completely uninteresting post? Still stupid, since of course it ended up attracting your attention, but it at least seems like the sort of terrible idea a person could really have.

    • Mersadeon says:

      I think that got resolved now, as long as you don’t refer to it as capitalized Spam. Not that they would be able to do anything other than be angry at you over the internet.

    • Bryan says:

      My guess is the accents are there to try to defeat the blog-comment-spam-filter equivalent of email-filtering Bayesian analysis. If the text in your message is learned to be spammy, the only way to keep the message but change the text is to use different characters that still look the same.

      Of course, that fails horribly after you try it once, because all the filters learn your new variant (and it’s *immediately* obvious to any human that you’re doing something stupid), so perhaps not. But it’s what I thought of at first.

      • MrGuy says:

        Let’s say this is true, and a given spelling variant can be used only once on a site before that site recognizes it and adds it to the block list.

        That’s actually a surprisingly effective system. There are quite a lot of variants that are possible. Take this post. Let’s say the only variants I could create were substitutions on vowels for accents. Further suppose that there was only one “alternate” character per vowel (both are artificially conservative assumptions).

        There are about 2^233 (1.3 E+70) possible variants of this post (may have miscounted by one or two vowels by this point…).

        So what if I can only use each one once? I have almost as many potential posts I can create as there are atoms in the universe.

        That’s not failing horribly. That’s winning.

        • Bryan says:

          Hmm. Yeah, that’s an interesting viewpoint that I hadn’t fully considered.

          On the other hand, let’s take the posted sentence that had accents, which was apparently the only one that the spammer thought needed to be obfuscated, and is a lot shorter than your post:

          > I simply wished to sáy hi ánd introducé myself.

          There are only 12 lowercase vowels in that sentence (13 if uppercase is included; 15 if uppercase and the “y”s are included). That’s only 4096 (or 8192, or 32768 for uppercase and “y”) different possible sentences. That’s not *that* many, depending on where all you’re trying to post them, and who shares their spam-filter databases with whom.

          But even apart from DB sharing, Bayesian analysis (at least) isn’t per-sentence, it’s per-word, which is (part of) the whole point of it; the word given here with the most vowels in it has 4, which is only 16 different possibilities. (As soon as a message with “sáy” in it shows up, it’s extremely likely to be spam if this one has already been flagged that way and fed back into the list of spammy terms. Well, at least until we start quoting it. :-) But every word is treated independently this way.)
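
The counts in this exchange are easy to check with a few lines of Python. This is only a sketch of the arithmetic: it assumes one alternate character per vowel, as in the post above, and it leaves the “y” case out because which “y”s count as vowels is a judgment call.

```python
import unicodedata

# The quoted sentence, normalized so each accented letter is a single character.
SENTENCE = unicodedata.normalize(
    "NFC", "I simply wished to sáy hi ánd introducé myself."
)

# Plain lowercase vowels plus the two accented forms used in the sentence.
LOWER_VOWELS = set("aeiou\u00e1\u00e9")
UPPER_VOWELS = set("AEIOU")


def count_chars(text: str, alphabet: set) -> int:
    """Count how many characters of text belong to alphabet."""
    return sum(1 for ch in text if ch in alphabet)


lower = count_chars(SENTENCE, LOWER_VOWELS)
with_upper = count_chars(SENTENCE, LOWER_VOWELS | UPPER_VOWELS)

print(lower, 2 ** lower)            # 12 4096
print(with_upper, 2 ** with_upper)  # 13 8192
print(f"{2 ** 233:.2e}")            # 1.38e+70, the whole-post figure
```

Each vowel is either swapped or not, so n vowels give 2^n variants of the text.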

          • ET says:

            You’re both assuming that the spam filters don’t automatically put all messages through the equivalent filter to to_lower_case(), but for accents.
            i.e. to_un_accented()
            At least *I* would do this if I were writing a spam filter.
            Not sure how many people would take the time, though.
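
A rough Python take on that `to_un_accented()` idea, using the standard `unicodedata` module. It is a sketch under one big assumption: that Unicode decomposition is good enough. Letters with no decomposition, such as “ø”, pass through unchanged, so a real filter would need an extra look-alike table.

```python
import unicodedata


def to_un_accented(text: str) -> str:
    """Decompose characters, then drop the combining marks ('á' becomes 'a')."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(to_un_accented("I simply wished to sáy hi ánd introducé myself."))
```

Run before the word filter, this maps “sáy” and “say” to the same token, just as lowercasing does for mixed case.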

            • Bryan says:

              Well, I don’t think canonicalizing based on letter shapes is very common, actually. Once you *see* this kind of spam, it becomes obvious that replacing any character with one that looks the same just without an accent is a good idea before filtering, just like seeing mixed-case spam makes you think that canonicalizing based on case is a good idea. But I’m not sure people would come up with that before seeing it happen. I certainly wouldn’t have thought of it; it’s less “taking the time” and more “having the idea in the first place”.

              But yes, absolutely agreed that it’s only a temporary leg up on the spam filters, if that’s what it was done for.

              • Peter H. Coffin says:

                OTOH, if you don’t strip the accents, then the word with the inappropriate accent becomes a signifier in your filter with a rating of “100% used in spam”, and weights the results accordingly. Probably more useful that way than stripping and leaving the RIGHT word with a more mixed result.
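
To make the trade-off concrete, here is a toy per-word counter in Python. It is not a full Bayesian filter, only the per-token statistic such a filter builds on, with add-one smoothing. The class name and the training messages are invented for the example.

```python
from collections import Counter


class TokenStats:
    """Counts how often each word shows up in spam versus legitimate messages."""

    def __init__(self) -> None:
        self.spam: Counter = Counter()
        self.ham: Counter = Counter()

    def train(self, text: str, is_spam: bool) -> None:
        """Record each distinct word of one message under its label."""
        target = self.spam if is_spam else self.ham
        target.update(set(text.lower().split()))

    def spamminess(self, token: str) -> float:
        """Smoothed share of this token's sightings that were in spam."""
        s, h = self.spam[token], self.ham[token]
        return (s + 1) / (s + h + 2)


if __name__ == "__main__":
    stats = TokenStats()
    stats.train("I simply wished to sáy hi", is_spam=True)
    stats.train("I wanted to say thanks", is_spam=False)
    stats.train("what did you say", is_spam=False)
    print(stats.spamminess("sáy"), stats.spamminess("say"))
```

Left unstripped, “sáy” has only ever appeared in spam and scores high, while plain “say” stays mixed.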

            • MrGuy says:

              Definitely agree that’s a good measure to take. As is to look for certain words that are likely used more in spam than in non-spam (as grandparent points out). In a good scoring algorithm, you should probably award spam points for both case mixing and using significant accented or other letter substitution (as long as you don’t have a significant international audience).

              The naive way to filter is for a known specific spam message text. Or specific words without considering spelling variants. It’s trivial to create spelling variants. It’s also not hard to create alternate “equivalent” texts by using a thesaurus or “phrase dictionary” substitution, that don’t quite scan but convey the message.

              You see it in spam so much because it still works.

  12. James Schend says:

    They’re succeeding at the hard stuff and failing at the easy stuff.

    A lot like Linux, where you can install fancy 3D transforming window animations rendered in the video card, but it still has trouble copying-and-pasting anything other than plain text.

    Inspired by your “see also” at the bottom of the post.

    • Zukhram says:

      Or a lot like Windows, where you can easily run advanced 3D games but can’t put your user folder on a different partition.

      • AyeGill says:

        I think this is actually possible (at least, you can move your “my documents” folder to a different location), but it’s not doable during installation (which is when you should be making decisions about folder organization).

    • Mephane says:

      I envy you. I hate formatting being transferred through copy+paste. In 99% of the cases I just want the raw text, and the formatting is in the way. In 1% of the cases it does not matter, but raw text would suffice, too.

      • Peter H. Coffin says:

        It’s the pasted-to application’s job to figure out what you want, and that’s gonna be part of what you’re thinking about when selecting your tools.

      • SKD says:

        One of my favorite features in Office 2007 and newer is the ability to choose between Paste, Paste(merge formatting), and Paste(strip formatting/raw text). I used to open a notepad window in order to strip formatting before pasting text from a browser to an Office or OpenOffice document.

        One thing that MS has gotten right at least. Haven’t checked to see if OpenOffice or LibreOffice have implemented a similar feature.

    • Nathon says:

      This makes me like the forum trolls who say “Don’t do that!” when people ask questions, but I never ever want formatting information as metadata in my clipboard. When I copy (and highlighting text should suffice to copy it) I want to paste the text, not the formatting with which the text was displayed. If I wanted to paste the formatting information, I would copy from the source (HTML, TeX, whatever). If it’s not all just text at some level, something is broken.

  13. Richard says:

    The main spam I tend to get on our forum is “signature spam”

    That’s where the spammer registers, posts a few innocuous comments like “Your %keyword% is very interesting” and when they haven’t been deleted after a few days, they change their signature to include a link to whatever it is they’re trying to sell.

    They are however, incredibly obvious because the main purpose of the forum is cross-user technical support for our products.

    That makes a post “Your green is very interesting” obviously silly.

    But it is in a thread that’s asked how best to get a particular green effect, so one can’t help but be impressed that they spotted a thread topic keyword, yet astounded at their inability to actually do anything with it.

  14. AliciaEG says:

    “sáy hi ánd introducé myself”

    Perhaps there are filters out there who flag “say hi and introduce myself” as potential spam keywords?

    Or at least are less likely to recognize the whole message as spam if they can’t read “á” or “é”?

  15. Mersadeon says:

    I completely understand you here, Shamus. I read all spam mails I get (those aren’t many), and I always, always think “I could write a less obvious spam mail”. Just the same when I got struck by a virus, which locked down my machine and then clumsily told me that my computer had too many viruses and thus had to be cleansed because they might damage the hardware (how?), unless I throw money at a website. It was written so badly that I spent about 15 minutes just thinking about all the ways that could have been written more convincingly.

  16. 4th Dimension says:

    To me these look like somebody trying out spamming techniques, or learning about them. The objective is not to Spam but to see if technique works. Thus the weird messages, that can be easily searched and have low possibility of false positives.

  17. Dork Angel says:

    Paragraph three – funniest thing I’ve read in a long time. PS. Loved the Witch watch.

  18. Cybron says:

    A team I was on once ran a wiki on a .edu domain. Turns out those are high profile targets for spammers. Search engines give priority to .edu sites, so if the wiki can be hijacked it becomes a really good spamming tool. Ours got hammered by various attacks all the time.

    So the wiki link may be something like that.

    • Alan says:

      A portion of modern spam is setting up for future payoff. You might build up links to a wiki page that today is fine because they believe in the future they can replace it with a link to their real content. A lot of dubious Twitter accounts are just trying to get followers so that later they can sell spamming to their followers. And I can’t find the article now, but apparently a lot of relatively innocent Facebook Pages (“Why Can’t We All Be Friends,” “Americans for Freedom”) that post catchy memes are also just building up followers with the goal of selling out.

      My conclusion is that I need to be ruthless about blocking spammers, even if today they seem harmless or even incompetent. Some of them are just playing a longer game than I’m realizing.

      • McNutcase says:

        There’s also the issue of us not being the intended target. Every time the Fear the Boot forum gets carpet-bombed by a spammer, someone will pipe up wondering what the acres of Japanese text were supposed to accomplish in terms of getting the users to click on links, and I explain again that we are not the targets. We are collateral damage in their assault on search engines. At a rough guess, approximately all the forum/comment/random-web-form spam is being done so that it will be crawled by a search engine. Mere website users are beneath the spammer’s notice; we are not a factor, because what MATTERS is search engine optimisation by any means possible. They’re shooting at Google results, and we just happen to be in the beaten zone of their misses.

        • swenson says:

          This, precisely. Spammers are not, by and large, interested in capturing the users of a site. What they want is that site’s PageRank, or at the very least lots of links from small sites to their site, which will artificially push them up Google results.

          • MrGuy says:

            Here’s the thing, though. Shamus has comments set up such that comment URL’s are “external nofollow”. In theory, this should mean search engines ignore the links, and they SHOULD be largely useless for pagerank spam.

            Many blogs DO allow follow on comment links (or at least don’t explicitly deny it) and WOULD be useful for pagerank spam.

            So, if their aim is boosting rank, I wonder why the spammers don’t bother to limit themselves to blogs where the tactic isn’t pointless – you’re tipping your hand on your tactics and priming anti-spam engines with your messages/IP/content/URL/etc. It’s not like checking if a blog has “nofollow” enabled is hard to do…

            Or maybe search engines (contrary to what’s been stated publicly) DO actually give some weight to “nofollow” links (perhaps to a lesser degree), making the tactic NOT pointless?
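
As a rough illustration of how little work that check would be, here is a sketch using Python’s standard `html.parser`. It assumes the spammer already has the comment section’s HTML in hand; the class and function names are made up for the example.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class LinkRelParser(HTMLParser):
    """Collects (href, rel tokens) for every <a> tag in the page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, List[str]]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        # rel is a space-separated token list, e.g. "external nofollow".
        rel = (attr.get("rel") or "").lower().split()
        self.links.append((href, rel))


def followed_links(html: str) -> List[str]:
    """Return hrefs of links that are NOT marked rel="nofollow"."""
    parser = LinkRelParser()
    parser.feed(html)
    return [href for href, rel in parser.links if "nofollow" not in rel]


if __name__ == "__main__":
    page = (
        '<a href="http://a.example" rel="external nofollow">a</a>'
        '<a href="http://b.example">b</a>'
    )
    print(followed_links(page))
```

If existing commenters’ links all come back filtered out, the blog is “SEO useless” in the sense used in this thread.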

            • Cybron says:

              Why bother checking? It’s not like they’re wasting labor.

              And yes, SEO is what spamming is really about now.

              • MrGuy says:

                Why bother checking? Because spam filters are relatively intelligent.

                  It’s the same reason that viruses with efficient IP space partitioning/scanning take longer to detect than viruses where every infected machine pings every other machine it can find many times a second.

                Akismet sees comments from both “SEO useful” and “SEO useless” blogs. The more it sees the same message (regardless of which type of blog it came from), the more likely it is to be recognized as spam, and the sooner that message is filtered.

                So by posting the spam message on “SEO useless” blogs, the spammer reduces the number of times that message can be used on “SEO useful” blogs.

                It’s not about wasting labor – it’s that a given message can only be used effectively a certain number of times. Why waste your opportunities to no purpose?

  19. Eruanno says:

    Noooo, Shamus! Don’t give the spammers ideas!

  20. Daimbert says:

    Heck, just re-posting OTHER comments or selections of the blog post would be a lot more effective than posting these idiotic word salad messages.

    I’ve actually had this happen at my blog a few times. Fortunately, I don’t have that much traffic and so was able to look at it and think “Hey, that looks an awful lot like what _I_ said”.

    As for me, if something shows up in my spam filter and all it says is “I really love your site!”, it gets dumped. You’d have a better chance of getting through if you said “This is all completely wrong and you’re an idiot!”.

  21. Amarsir says:

    Wait, a .edu site that easily bypassed the tech and got hung up on the English? Clearly this isn’t spam. Computers are becoming self-aware, and this is their attempt to reach out.

    To you, “mdfinstruments” I say “Joyous greetings, for having visits at we humans. I welcome to you, so me request being killed lást.”

    • AbruptDemise says:

      It’s probably some Computer Information Security project if it’s a .edu site. That major is supposed to teach how to protect data/the comments of Shamus’ blog, and how to get around such protection. Though I think most colleges will have the second part be kept local.

  22. Keeshhound says:

    The spam might be stupidly obvious for the same reason that 419 scams are so obvious and yet somehow still successful: they’re deliberately stupid and blunt in order to catch the people who are so clueless that they won’t see the scam (or spam) for what it is once it’s past their defenses.

    It doesn’t cost the spammer or scammer a thing to hit as many websites as possible, because they don’t care about people who can spot them for what they are; they want people who are content with their current security and won’t question anything that slips through it (either because they think a check box is impregnable, or because they simply lack the experience to detect it in the first place)

    419 scammers don’t want to bother trying to get money out of the cynical and world-weary, they want the credulous and naive, and the same is probably true of spammers. If they can get you to self-select out by recognizing their terrible spambots, then they don’t have to worry about wasting their valuable spamming time with you and can instead focus on the people who WILL click on their stupid links to stories about this one weird trick that does whatever.

  23. Hal says:

    hello,
    I am write single to salute and wait
    for answer again

  24. CONGRATUL!!
    You have won FREECOUPONS at FREECOUPONS.BOG

    click b> HERE for FREEC OUPONS at FREECOUPONS.DOG

  25. Allan says:

    Shamus, what if it’s not all Spam, what if the internets are becoming sentient and are trying to communicate with us?

  26. MadTinkerer says:

    “What I can’t tolerate is bad engineering.”

    Actually, I think that most of today’s spammers are from countries that don’t generally speak English, but learn the grammar well enough to create scripts that imitate English. The message of the first post was likely put together by an AI trying to very literally convince you of what it was saying (e.g. “I am a pleasant real human poster like yourselves.”)

    The second post is trying to defeat other algorithms that detect multiple posts by substituting accented characters. To the multiple-post detector, the posts are spelled differently, so the strings compare as unequal (!=) and the posts are treated as completely different. To the English speaker, the accents are ignored and the posts are identical.
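
A minimal sketch of that accent trick, and of one way a duplicate detector might counter it by folding accents before comparing. The helper names `fold_accents` and `same_message` are invented for this example; no claim is made that any real spam filter works this way.

```python
import unicodedata


def fold_accents(text):
    """Strip combining accent marks so that "sáy" becomes "say"."""
    # NFKD normalization splits an accented letter into the base letter
    # followed by a separate combining mark, which is then dropped.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_message(a, b):
    """Compare two posts while ignoring accents and letter case."""
    return fold_accents(a).casefold() == fold_accents(b).casefold()


plain = "I simply wished to say hi and introduce myself."
accented = "I simply wished to sáy hi ánd introducé myself."

naive_match = plain == accented                 # raw string comparison
folded_match = same_message(plain, accented)    # comparison after folding
```

The raw comparison treats the two posts as different strings, while the folded comparison treats them as the same message.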

    And some spam-bots actually have mutating algorithms like viruses.

    Spam is the source of the current leading edge of AI techniques. I know, right?

  27. Patrick Johnston says:

    Am I the only one who saw this and thought “huh, spam. I should check my email.”

  28. Jim P. says:

    The random bad language and odd accents may be an attempt to fool filtering systems by varying each message or post slightly since it is trivial to filter multiple identical messages.

    They really are after the low-hanging fruit. Someone at Microsoft did an excellent white paper showing that when the spam attacks are transparently bad, even the mildly skeptical filter themselves out. Whoever is left is predisposed to believe more of this stuff and is thus more likely to swallow the bait easily, giving the spammer maximal return for minimal effort.

  29. I run an RPing forum. We recently had a burst of spam after the ProBoards upgrade, virtually none before. Here is the post of “pentolenak”:

    “Kitchen Carcasses For Sale. Thirty Ex Display Kitchens To Clear. £595 each with appliances http://www.exdisplaykitchens1.co.uk Thirty kitchen ranges to choose from.

    Kitchen Carcasses For Sale”

    Kitchen carcasses. Fuck yes.

  30. Nathaniel says:

    A big part of why you see this sort of nonsense is that spammers, for the most part, did not write their spamming software themselves; they think this is a get-rich-quick scheme. Many of them are idiots who have no idea how to use this tool they bought. So you get people putting the wrong thing in the link field, or using text obfuscation (supposed to make words like viagra readable to people but not filters) on an innocuous message.

    • Blue Painted says:

      This is what I’ve always thought, and it’s the same for legitimate marketing: Spammers believe that spam works, marketeers believe that marketing works and yet the only people who seem to do well out of it are those who sell “B2B e-marketing solutions” and spam engines, and often there’s little to choose between them!

  31. Wow, that’s something else. I get a good deal of spam on my various sites as well, and it always amazes me how little they actually have to say. Are they just trying to post links? Because, man, I could write a sentence constructor to do that!

    In fact, I do that kind of thing all the time! Any self-plug is a spam-like message, right? We are the spammers. The successful ones run blogs. :)

  32. Unbeliever says:

    It’s not like these guys are writing their own spam software. They buy or steal the coolest, cleverest software out there, and then mindlessly plug gibberish into it hoping to generate profits…

    The cleverness of the attack has nothing to do with the cleverness of the attacker…

  33. Attercap says:

    I used to get contact form spam a lot (Spamalot?). The best solution I found was to create an additional text field then put it in a hidden div. Bots tend to fill in every field, so if the field had text I’d throw a validation error (and made the error obtuse enough that bots couldn’t easily deduce the language). This appears to still be a viable solution even today, so that might help.
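
The hidden-field trick described above could look roughly like this on the server side. It is a sketch under assumptions: the field name `website_url`, the `message` field, and the error text are all hypothetical, and the form is modeled as a plain dict rather than any particular web framework's request object.

```python
# Hypothetical name for the extra text field that sits inside a hidden div.
# Humans never see it, so they leave it empty; bots tend to fill everything.
HONEYPOT_FIELD = "website_url"


def validate_submission(form):
    """Return (ok, error) for a dict of submitted field names to values."""
    if form.get(HONEYPOT_FIELD, "").strip():
        # Keep the error vague so a bot cannot easily work out which
        # field tripped the check.
        return False, "Error 17"
    if not form.get("message", "").strip():
        return False, "Message is required."
    return True, None


human = validate_submission({"message": "Nice post.", "website_url": ""})
bot = validate_submission({"message": "Buy now", "website_url": "http://example.com"})
```

Here the human submission passes and the bot submission is rejected, purely because the bot filled in a field no person could see.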

    …I may have mentioned this solution in a prior post about spam you made. If so, sorry for spamming your spam post. Spam.

  34. SKD says:

    I have a brilliant idea! Run all messages through a filter using spell-check and grammar-check (regionalized by the sender’s reported origin, i.e. US English vs UK English), eliminating any messages that have more than one error per average sentence, and you can eliminate spam and illiteracy.

  35. EricF says:

    I remember one successful spam attack back in 2003 – they were selling decks of cards with Iraq’s top leaders on them – kind of a “most wanted” list for the Iraq war. Apparently they sold quite a few copies through their mass e-mail marketing.

    Maybe the as-seen-on-TV folks should get into the spam business?

  36. Zak Mckracken says:

    Shamus, did you do something to the spam filter?
    Whenever I try to post something with Opera (v12), I get an error telling me to refresh the cache and that I “looked” like a spammer. That’s independent of the computer, operating system or place I am at. Just a few days ago that worked with no problem.

    Also, I think the accents on otherwise “normal” texts might be either an artefact of V14gr4 ads or a way to have the message automatically detectable, so it might have just been a test of a spambot author to see where he could post by using a webcrawler to count the successful posts.

  37. Darkstarr says:

    I think I know why spammers write such obvious tripe after spending so much time and energy on getting past our anti-spamware: their brains overheated from actual use, and have gone into cooldown mode for several minutes. Think about this for a moment… it takes some degree of intelligence to work your way past all the various spam-defeating measures, and since spammers obviously have only a small amount of brain power to begin with (if they were smart, they wouldn’t be spammers, right?), what little they have is used up in the hard work of overcoming our defenses.

    In other words, they’re like anti-government militia nutjobs–wasting so much time and energy at trying to overthrow the government that they have nothing left over for figuring out what to do afterwards.

  38. Hi to all, for the reason that I am truly keen of reading this blog’s post to be updated on a regular basis.
    It includes good data.
