Monday, February 04, 2008

How Accurate is your Spam Filter?

When researching spam filters, you'll frequently see references to accuracy. Is a filter 90% accurate? 95% accurate? 99% accurate?

Here's how you can test the accuracy of spam filtering with whatever email system you're using:


  1. In the morning, when you get into the office or fire up your computer at home, clear your spam folder (e.g. Microsoft’s Junk E-mail folder, Ella's Review Spam folder, etc.).

  2. Jot down the number of messages you have in your inbox. [e.g. 523]

  3. During the course of the day, don't delete any messages (good or spam). Leave them in the inbox or the spam folder.

  4. The next morning, determine the “denominator” – the total number of messages you received since yesterday morning

    a. Take the total number of messages in your inbox (including messages you consider to be spam), [e.g. 584 total messages in your inbox]

    b. Plus the number of messages in your spam folder, [e.g. 87 messages in your spam folder]

    c. then subtract the total number of messages that you wrote down in step 2 above [(584+87)-523 = 148]

  5. Then determine the “numerator”

    a. the number of inbox messages that you consider to be spam. [e.g. 9 inbox messages considered to be spam]

    b. plus the number of any “false positive” messages (if any) that the filter might have misclassified (good messages in your spam folder) [e.g. 1 message]

  6. Now divide the number of messages misclassified by your filter (the numerator) by the total number of messages you received. [10/148 = 6.76%]

  7. Subtract the result from 100%. [100% – 6.76% = 93.24%]


This simple method should give you a true reflection of the accuracy of your spam filter.
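
If you'd rather not do the arithmetic by hand, steps 4 through 7 can be sketched in a few lines of Python. This is only an illustration: the function name and parameter names are made up for this example, and the numbers plugged in are the sample figures from the steps above.

```python
def filter_accuracy(inbox_start, inbox_end, spam_folder_end,
                    missed_spam, false_positives):
    """Return (received, misclassified, accuracy_percent) for one test day.

    All names here are hypothetical; they just mirror the steps above.
    """
    # Step 4: the denominator, i.e. everything received since yesterday
    # morning. Assumes the spam folder was emptied in step 1 and nothing
    # was deleted during the day (step 3).
    received = (inbox_end + spam_folder_end) - inbox_start
    if received <= 0:
        raise ValueError("no messages received during the test window")

    # Step 5: the numerator, i.e. spam left in the inbox plus good mail
    # that landed in the spam folder.
    misclassified = missed_spam + false_positives

    # Steps 6 and 7: error rate, then accuracy as a percentage.
    accuracy = 100.0 * (1 - misclassified / received)
    return received, misclassified, accuracy


received, misclassified, accuracy = filter_accuracy(
    inbox_start=523, inbox_end=584, spam_folder_end=87,
    missed_spam=9, false_positives=1,
)
print(f"{misclassified}/{received} misclassified, accuracy {accuracy:.2f}%")
# prints "10/148 misclassified, accuracy 93.24%"
```

Running the test on several different days and comparing the results will show how much the figure moves around from one day's mail to the next.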

Friday, December 01, 2006

...and it keeps on comin'

The past couple of months have seen a REAL spike in spam - and as such, a spike in installs of the FREE and Trial versions of Ella (oh yeah, sales too). We again consulted with Erik Schmidt, and he pointed out that since August, spam volume has increased roughly 300%. (I sure wish I had kept that link.) Ella is still keeping up - and that's good - in my account today, 301 spam messages came in and Ella missed just 1 - that's over 99.5% accuracy. You go girl.

The only significant content difference in spam is those silly gif/jpg images that are typical of stock come-ons. Of all the spam I have ever seen, those seem to be the ones a user is least likely to actually click on - but hey, maybe I'm a jaded anti-spam guy.

Saturday, September 16, 2006

Weird uptick

Over the past couple of months, spam has been increasing, and as such, there has been an odd uptick in usage of Ella. We have been seeing a steady increase in downloads, installations and sales of Ella. Our SEO ace, Erik Schmidt, is convinced that there is a new strain of spam and that Ella is either the only filter, or just one of the few, that can handle it. It's certainly good for our ego, but in a way it makes a little sense. By using a unique training profile for every single user, Ella is an elusive target for the spammers. Even if he is wrong and it is a phase of some astrological sign - I'm ok with it. I just want to see more people benefiting from a solid solution to what seems to be a problem with no clear end.

Thursday, March 30, 2006

Spam, beginning to drift

Spam seems to be beginning to drift. Not necessarily drifting away, but shifting in both volume and importance. My impression is that the CAN-SPAM law has caused porn spam to virtually disappear (at least in my sphere of influence) and much of what we get is heavily repetitive. In my opinion, this is a good thing - spam is becoming less of a viable revenue source. The less revenue associated with it, the fewer players, and eventually a falloff in the businesses that support it.

Saturday, February 04, 2006

February 2006 Spam Categories

Here is a breakdown of 100 spam messages that I received over the last couple of days. The distinct lack of porn might indicate that CAN-SPAM is working on that category - but certainly not on the Viagra ads. Over the last 500 messages, Ella missed two spam messages - not bad... 99.6%

55 pharma (mostly Viagra, but weight loss, fountain of youth cream, pheromones)
13 cheap replica watches
10 OEM software
9 mortgage, get out of debt for free
4 virus, fraud, 419
2 cheap degrees
2 gambling
1 cheap sex
1 foreign language
1 get rich quick
1 penny stocks
1 sat dish
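
For anyone who wants to keep a similar tally, here is a rough Python sketch of the arithmetic behind the list and the 99.6% figure. The dictionary keys are shortened labels and `catch_rate` is a made-up helper name; the counts are the ones listed above. It assumes the 500 messages were all spam, so it only measures missed spam and says nothing about good mail wrongly flagged.

```python
# Counts from the list above (labels shortened for this example).
categories = {
    "pharma": 55,
    "cheap replica watches": 13,
    "OEM software": 10,
    "mortgage / debt": 9,
    "virus, fraud, 419": 4,
    "cheap degrees": 2,
    "gambling": 2,
    "cheap sex": 1,
    "foreign language": 1,
    "get rich quick": 1,
    "penny stocks": 1,
    "sat dish": 1,
}

total = sum(categories.values())

# Print each category's share of the sample, largest first.
for name, count in sorted(categories.items(), key=lambda kv: -kv[1]):
    print(f"{count / total:6.1%}  {name}")


def catch_rate(spam_received, spam_missed):
    """Percentage of incoming spam that the filter caught (hypothetical helper)."""
    return 100.0 * (spam_received - spam_missed) / spam_received


print(f"caught {catch_rate(500, 2):.1f}% of spam")
```

With 2 misses out of 500, the last line reports 99.6%, matching the figure above.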