Tech Support: E-mail - Spam Filtering;

E-mail: Spam Filtering

This page offers some tips and tricks for the SpamAssassin spam filter, which we sometimes call our content-based filter to distinguish it from the challenge-response filter.

False Negatives and False Positives

The content filter is not 100% accurate. It will sometimes treat spam as a legit message (a false negative), and on rare occasions it will mark a legit message as spam (a false positive).

False positives are usually a sign of a configuration error, such as a mail server which has been blacklisted, or mail which has been forwarded through a 3rd party (thus breaking SPF checks).

False negatives are what spammers strive to achieve, and some types of spam are hard to identify. Graphical spam falls into this category and is increasing.

Tuning the SpamAssassin Content Filter - Basic settings

The basic configuration screen has 3 fields: a threshold, a whitelist, and a blacklist. The threshold is the score that separates legit mail (ham) from spam: a message that scores at or above the threshold is classified as spam, and anything below it is classified as ham. If you increase the threshold, more messages will be considered legit and you reduce the chance of false positives. If you lower the threshold, more messages will be considered spam. The whitelist and blacklist fields are lists of sender addresses that should always be classed as ham (whitelist) or spam (blacklist). The addresses can include 'wildcards' like *@example.com, which would pass or block all messages from senders at the example.com domain.
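To make the threshold and wildcard behaviour concrete, here is a minimal Python sketch. It is only an illustration of the idea, not SpamAssassin's own code: the function names are made up, and Python's fnmatch is used as a stand-in for the filter's wildcard matching.

```python
from fnmatch import fnmatchcase


def is_spam(score, threshold=5.0):
    """A message scoring at or above the threshold is treated as spam."""
    return score >= threshold


def matches_list(sender, patterns):
    """Return True if the sender matches any wildcard pattern, ignoring case."""
    sender = sender.lower()
    return any(fnmatchcase(sender, pattern.lower()) for pattern in patterns)


whitelist = ["*@example.com"]  # example entry taken from the text above

print(is_spam(4.9))                 # below the default threshold: ham
print(is_spam(4.9, threshold=4.0))  # same message, lower threshold: spam
print(matches_list("Sales@Example.com", whitelist))
print(matches_list("sales@example.org", whitelist))
```

Note how the same 4.9 message flips from ham to spam when the threshold drops from 5 to 4. That is the false-positive risk of a low threshold.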

Tuning notes (basic).

The SpamAssassin rules are designed around the premise that the threshold will be 5. Lowering the threshold below 5 _will_ cause false positives. This might be OK for a casual user. However, I would suggest that it is NEVER appropriate to lower the threshold below 5 for a business contact address. (Doing so will cause the loss of some legit mail.)

Whitelisting your own address or domain is generally a bad idea. This is because spammers often forge spam to you from you, or from an address at your domain to lots of addresses at your domain.

I often see client configurations with long whitelists and blacklists. I cringe when I see this because these crude lists override all the complex logic in SpamAssassin. Occasionally I'll see scores in the logs like -75. That means SpamAssassin assigned a score of 25 (blatant spam) to a message, but a whitelist entry (-100) overrode it.

(The bad whitelists also make our job as sys-admins more difficult, as the logs will show ham hitting rules that should _never_ be hit by anything but spam. When we investigate to find out what happened, it often turns out to be a bad whitelist entry.)
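The -75 score mentioned above is simple arithmetic, sketched here in Python. The -100 adjustment and the content score of 25 come from the text. The function name is made up for illustration, and the real filter works out its total internally.

```python
WHITELIST_ADJUSTMENT = -100.0  # what a whitelist match adds to the score, per the text


def final_score(content_score, whitelisted):
    """Combine the content-based score with the whitelist adjustment."""
    return content_score + (WHITELIST_ADJUSTMENT if whitelisted else 0.0)


threshold = 5.0
content_score = 25.0  # blatant spam according to the content rules

score = final_score(content_score, whitelisted=True)
print(score)               # 25 - 100
print(score >= threshold)  # is it still classified as spam?
```

No realistic content score can overcome a -100 adjustment, which is why one bad whitelist entry lets blatant spam straight through.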

I also see resellers that go around whitelisting themselves in their clients' configurations. This works and isn't such a bad idea, but it seems to me that a simpler solution would be to write messages that don't look like spam!

(This is simple enough when you have access to the system. Set up a mailbox with a low threshold and set it to mark spam instead of bouncing it, then send your message to that mailbox. You should end up with a message that includes a report of all the rules your message hit and their respective scores.)
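If you want to pull the results out of a marked message with a script, something like the following Python sketch may help. It assumes the marked message carries a summary header in the stock SpamAssassin "X-Spam-Status" layout (verdict, score=, required=, tests=). The header on your mailbox may be laid out differently, the sample values below are invented, and this summary header lists rule names only, not the per-rule scores found in the full report.

```python
import re


def parse_spam_status(header_value):
    """Extract verdict, score, threshold, and rule names from a status header."""
    # Long headers may be folded across lines; rejoin the comma-separated list.
    value = re.sub(r",\s+", ",", header_value.strip())
    score = re.search(r"score=(-?\d+(?:\.\d+)?)", value)
    required = re.search(r"required=(-?\d+(?:\.\d+)?)", value)
    tests = re.search(r"tests=(\S*)", value)
    rules = tests.group(1).split(",") if tests else []
    return {
        "is_spam": value.lower().startswith("yes"),
        "score": float(score.group(1)) if score else None,
        "required": float(required.group(1)) if required else None,
        "rules": [rule for rule in rules if rule and rule != "none"],
    }


# Invented sample value, folded across two lines as mail headers often are.
sample = "Yes, score=7.5 required=2.0 tests=HTML_MESSAGE,\n\tSPF_SOFTFAIL autolearn=no"
result = parse_spam_status(sample)
print(result["score"], result["required"])
print(result["rules"])
```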

Advanced Tuning Notes.

There are some occasions where you may want to do more advanced tuning than the threshold and white/black lists allow. We built an interface that was supposed to allow adding your own SpamAssassin rules, but because of the way "spamd" works, it cannot be used to add rules. However, it _does_ still work for changing the score of a particular rule.

There are configurations that cause false positives. The most obvious one is forwarding mail from an outside address to a baremetal hosted mailbox. This breaks the SPF checks, because the server hosting the outside address is effectively relaying the mail and is not listed as a permitted sender in the original sender's SPF record. The fix for this is to disable the SPF rules by adding the following two lines to your advanced configuration:

score SPF_HELO_SOFTFAIL 0
score SPF_SOFTFAIL 0

Alternatively, you could increase your score threshold, but disabling these two rules targets the specific problem and doesn't cripple SpamAssassin as badly.
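The difference between the two fixes can be sketched in Python. The rule scores here are invented round numbers rather than SpamAssassin's real defaults, and HYPOTHETICAL_SPAM_RULE is a made-up rule name standing in for whatever content rules a real spam would hit.

```python
def total(hits, overrides=None):
    """Sum the scores of the rules a message hit, applying any score overrides."""
    overrides = overrides or {}
    return sum(overrides.get(rule, score) for rule, score in hits.items())


# Invented scores for illustration only.
forwarded_ham = {"SPF_SOFTFAIL": 3.0, "SPF_HELO_SOFTFAIL": 2.5, "HTML_MESSAGE": 0.5}
real_spam = {"HYPOTHETICAL_SPAM_RULE": 6.0, "HTML_MESSAGE": 0.5}

# Fix A: zero the two SPF rules and keep the threshold at 5.
spf_off = {"SPF_SOFTFAIL": 0.0, "SPF_HELO_SOFTFAIL": 0.0}
print(total(forwarded_ham, spf_off) >= 5.0)  # forwarded mail marked as spam?
print(total(real_spam, spf_off) >= 5.0)      # real spam still caught?

# Fix B: leave the rules alone and raise the threshold to 7.
print(total(forwarded_ham) >= 7.0)  # forwarded mail marked as spam?
print(total(real_spam) >= 7.0)      # real spam still caught?
```

With these numbers both fixes let the forwarded message through, but only the score override still catches the spam.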

The same approach works as a general solution for any rule that is consistently causing you trouble. Keep in mind that there isn't much point in worrying about low-scoring rules like HTML_MESSAGE.

-Tom
