Evil has a bendy new ally

By Shamus
on Nov 28, 2006
Filed under:
Pictures

Clippy the spammer’s friend!

Adaptive filtering worked pretty well for a while, but the alternate spelling stuff has been running circles around my filter in Thunderbird. Arg.

UPDATE: Looking more closely, it's not so much the alternate spelling ones that are the problem, but the ones that consist of 10 misspelled words followed by 200 correct but unrelated words. They have their ad on line 1, which is then followed by a couple of paragraphs of English gibberish. Flagging these as spam just teaches the adaptive filter to treat normal, valid words as spam words, which in turn leads to lots of false positives.
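To make the false-positive problem concrete, here is a minimal sketch of a word-counting filter in Python. It is not Thunderbird's actual implementation; the class name `TinyBayes`, the scoring formula, and the sample messages are all made up for illustration. It only shows how flagging padded spam pushes ordinary words toward the "spam" side.

```python
from collections import Counter


class TinyBayes:
    """Toy per-word spam scorer (hypothetical, not Thunderbird's filter)."""

    def __init__(self):
        self.spam = Counter()  # number of spam messages containing each word
        self.ham = Counter()   # number of good messages containing each word
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text, is_spam):
        # Count each distinct word once per message.
        words = set(text.lower().split())
        if is_spam:
            self.spam.update(words)
            self.n_spam += 1
        else:
            self.ham.update(words)
            self.n_ham += 1

    def word_spamminess(self, word):
        # Smoothed share of spam vs. good messages that contain the word.
        p_spam = (self.spam[word] + 1) / (self.n_spam + 2)
        p_ham = (self.ham[word] + 1) / (self.n_ham + 2)
        return p_spam / (p_spam + p_ham)


f = TinyBayes()
f.train("see you at the meeting tomorrow", is_spam=False)
f.train("the weather is nice tomorrow", is_spam=False)
f.train("cheep pilz now", is_spam=True)
before = f.word_spamminess("meeting")

# Flag three spams that pad their ad with ordinary words.
for _ in range(3):
    f.train("cheep pilz meeting weather tomorrow", is_spam=True)
after = f.word_spamminess("meeting")

print(f"'meeting' before: {before:.2f}, after: {after:.2f}")
```

In this toy model, "meeting" starts out leaning toward good mail and ends up leaning toward spam after the padded messages are flagged, which is the mechanism behind the false positives.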

I think we need a more drastic solution.


Eight comments? Nobody's THAT hungry.

  1. wintermute says:

    I recommend Spamato.

    Hrm. There seem to be lots of negative comments there, but it works great for me, and for several other people I know who use it.

    Give it a try.

  2. I’ve always thought that the next filter ought to be for misspelled words. Enough and it goes into the spam box.

  3. Shamus says:

    A spelling filter would be dual-purpose: It would block spammers AND morons. Then I wouldn’t have to read any more emails like:

    HOW COEM U DONT LEIK HALO IT ROX THE BEST GAME EVAR IF U SAY IT SUX THAN U TOTALY GAY LOOSER!!!

    Every once in a while I’ll get something like this from another brave footsoldier in the War On Literacy. Composing a reply would be a waste of time, since my words will be incomprehensible to them, so it’s best for everyone if I could just filter them out.

  4. Aren’t they using a Bayesian filter? If so, the swarm of legitimate words shouldn’t matter.

    Are you using a POP3 host? If so, you can use K9 for filtration. It’s free:

    http://keir.net/k9.html

    I’ve been using it for several years now and I swear by it. Because it’s using a Bayesian filter, it isn’t deceived by the kind of thing that seems to be causing you trouble.

  5. In my opinion, email is just broken. I think we’re heading in a direction where people will have whitelists, and if you’re not on their whitelist, they don’t get your mail. It’s sad but I fear we’re moving in that direction.

  6. Pixy Misa says:

    I’m using a combination of POP3 and Thunderbird’s Bayesian filter, but I have a problem with false positives. I’d rather have to delete ten spams from my inbox each day than scan through a thousand spams looking for one or two good emails.

  7. David V.S. says:

    I use gmail and have almost no problem with spam.

    As far as I can tell, the gmail spam filter is not only rules-based but community-based.

    Apparently, chances are that any spam I receive has previously been flagged by another gmail user and is filtered appropriately. I’ve only had a single “false positive” marked-as-spam in several months. I get about 20-25 spam messages a day, and usually only have to deal with one or two myself.

    I can set rules, but I have not found any need to do so. I only do my share by occasionally flagging a message as spam (those 1-2 messages per day).

    I used to use Thunderbird, and was skeptical about gmail. But I needed an account where I could check _and_ search archived mail from any computer (i.e., work and home). I've since decided gmail is superior to Thunderbird in every way but one: if my network provider goes down then at home I cannot access current archived email. (I can transfer archived e-mail to my computer using a POP server, but do not do so regularly.)

    Let me know if you want to play with gmail.

  8. Rask says:

    I don’t know exactly how the technology works, but the latest version of the email server I use, MDaemon, has cut inbound spam from 80+/day down to 1-5 a day.

    Rather than judge an email by its content, it analyzes the sending patterns of the email, as well as the SMTP conversation fingerprint.
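The misspelling filter suggested in comment 2 can be sketched in a few lines of Python. This is only an illustration: `KNOWN_WORDS` is a tiny stand-in for a real dictionary, and the function names and the 0.4 threshold are arbitrary assumptions, not part of any existing mail client.

```python
import re

# Hypothetical stand-in for a real dictionary word list.
KNOWN_WORDS = {
    "cheap", "pills", "for", "sale", "now",
    "see", "you", "at", "the", "meeting", "tomorrow",
}


def misspelled_ratio(text, known=KNOWN_WORDS):
    """Return the fraction of words in text that are not in the word list."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    unknown = sum(1 for w in words if w not in known)
    return unknown / len(words)


def looks_like_spam(text, threshold=0.4):
    """Flag a message when too many of its words are unrecognized."""
    return misspelled_ratio(text) > threshold


ad_only = "cheep pilz for sael now"
padded = ad_only + " see you at the meeting tomorrow for the sale"

print(looks_like_spam(ad_only))
print(looks_like_spam(padded))
```

The second message shows the weakness described in the post's update: padding the ad with correctly spelled words dilutes the ratio, so a simple threshold like this would let it through.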
