spamassassin learn debug

Date: 2006-10-26  Source: snowtty

Your setup may be standard to you, but this forum is more familiar with the 'standard' Postfix/amavisd-new setup. That should not affect what is going on with Bayes, though, since we are all using SpamAssassin.

Quote:

#1. If one does the XBL-SBL checks before doing the SpamAssassin checks, the bulk of the spam email is killed before it can ever be rated. In the long run, it seems that doing the XBL checks will affect how well Bayes learns spam. Would it be a good idea to turn off the XBL checks during some introductory learning phase?

I would say it does affect it, but it is probably not worth accepting the extra spam just to train Bayes. The benefit would not last that long anyway: tokens expire, and the more tokens Bayes learns, the faster they expire. I would consider quarantining the spam you do get. After verifying there is no good mail in the pile of rubbish, feed that to Bayes as spam, then delete it.
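
The quarantine-then-learn step might look roughly like this. It is only a sketch: the function name, the SA_LEARN override, and the example path are made up for illustration, while `sa-learn --spam` and `sa-learn --dump magic` are the real commands. Run it as whichever user owns the Bayes database your scanner uses.

```shell
#!/bin/sh
# Sketch only: learn_verified_spam, SA_LEARN and the example path are
# illustrative assumptions, not part of SpamAssassin.
SA_LEARN="${SA_LEARN:-sa-learn}"

# Feed a directory of hand-verified spam to Bayes as spam,
# then delete the messages only if learning succeeded.
learn_verified_spam() {
    dir="$1"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    "$SA_LEARN" --spam "$dir" || return 1
    rm -f "$dir"/*
}

# Example (hypothetical path):
#   learn_verified_spam /var/spool/quarantine/verified-spam
#   sa-learn --dump magic    # shows the nspam/nham counters afterwards
```

Deleting only after a successful learn means a failed run leaves the pile in place so you can retry.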

Quote:

Users still report getting about 75% spam.

Bayes is only part of the picture. There are ways to diagnose why you are not getting better results from SpamAssassin, and there are ways to improve SpamAssassin's performance (which have been discussed on the SA talk list - and here). First off, what version of SpamAssassin are you using, and are you seeing the ALL_TRUSTED rule hit on mail originating from outside your network? Sometimes SpamAssassin needs help from outside sources such as SARE and plugins like FuzzyOcr.

Use the latest version of SpamAssassin; 3.0.x is now way behind IMHO.
Use sa-update.
If your SpamAssassin 3.1.x is older than 3.1.5, a patch for amavisd-new may be
needed so it finds the new rules:
http://www200.pair.com/mecham/spam/p3.txt

Who knows if MailScanner needs something like this too?

Consider (somewhat resource intensive) FuzzyOcr:
http://www200.pair.com/mecham/spam/image_spam.html
https://secure.renaissoft.com/maia/wiki/FuzzyOCR23

and ImageInfo plugins:
http://www.rulesemporium.com/plugins.htm

Make sure network tests are enabled in amavisd.conf:
$sa_local_tests_only = 0;
(oops, sorry, forgot you are using MailScanner - never mind)

Use Razor and DCC and Pyzor (with Pyzor I suggest changing the server)
http://marc.theaimsgroup.com/?l=spamassassin-users&m=114956889224487&w=2

I also use:
http://marc.theaimsgroup.com/?l=spamassassin-users&m=115637139728022&w=2

Use a local caching name server - it can help considerably with network
tests.

To see if anything is going on as far as net tests go, you can break out
debugging info and try stuff like:
spamassassin --lint --debug area=1,dns

Here you would want to see:
dbg: dns: is Net::DNS::Resolver available? yes

spamassassin --lint --debug area=1,uri
spamassassin --lint --debug area=1,razor2
spamassassin --lint --debug area=1,dcc
spamassassin --lint --debug area=1,pyzor
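
If you want to script the checks above instead of reading the output by eye, something like this might do. It is a rough sketch: the two helper names are my own invention, and the only thing it looks for is the Net::DNS line quoted earlier. Debug output goes to stderr, so it is redirected before being piped.

```shell
#!/bin/sh
# Sketch only: debug_area and dns_resolver_available are illustrative
# helper names, not part of SpamAssassin.

# Run the lint/debug command from the post for one area.
# Debug messages are written to stderr, hence 2>&1.
debug_area() {
    spamassassin --lint --debug "area=1,$1" 2>&1
}

# Succeeds if the debug output on stdin contains the expected Net::DNS line.
dns_resolver_available() {
    grep -qF 'dns: is Net::DNS::Resolver available? yes'
}

# Example:
#   debug_area dns | dns_resolver_available && echo "Net::DNS found"
#   for a in uri razor2 dcc pyzor; do debug_area "$a"; done
```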

Make sure your trusted/internal networks are set up properly:
http://www.freespamfilter.org/forum/viewtopic.php?t=309

I have done all these things, even lowering the scores in
the plugins, and I have never had less spam in my inbox (and my
kill_level is set to 8.0). It scares me sometimes.