In this recipe, we will install and set up SpamAssassin, a well-known e-mail spam filter.
Getting ready
You will need access to a root account or an account with sudo privileges.
You need to have Postfix installed and working.
How to do it…
Follow these steps to filter mail with SpamAssassin:
Install SpamAssassin with the following commands:
$ sudo apt-get update
$ sudo apt-get install spamassassin spamc
Create a user account and group for SpamAssassin:
$ sudo groupadd spamd
$ sudo useradd -g spamd -s /usr/sbin/nologin \
-d /var/log/spamassassin -m spamd
Change the default settings for the spam daemon. Open /etc/default/spamassassin and update the following lines:
ENABLED=1
SAHOME="/var/log/spamassassin/"
OPTIONS="--create-prefs --max-children 5 --username spamd --helper-home-dir ${SAHOME} -s ${SAHOME}spamd.log"
PIDFILE="${SAHOME}spamd.pid"
CRON=1
Optionally, configure spam rules by changing values in /etc/spamassassin/local.cf:
trusted_networks 10.0.2. # set your trusted network
required_score 3.0 # messages scoring 3.0 or higher are marked as spam
Next, we need to change the Postfix settings to pass e-mails through SpamAssassin. Open /etc/postfix/master.cf and find the following line:
smtp inet n - - - - smtpd
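Add the content filtering option on an indented line directly below it (Postfix treats a line that starts with whitespace as a continuation of the previous line):
  -o content_filter=spamassassin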
Define the content filter block by adding the following lines to the end of the file:
spamassassin unix - n n - - pipe
  user=spamd argv=/usr/bin/spamc -f -e
  /usr/sbin/sendmail -oi -f ${sender} ${recipient}
Finally, start SpamAssassin and reload Postfix:
$ sudo service spamassassin start
$ sudo service postfix reload
You can check the SpamAssassin and mail logs to verify that SpamAssassin is working properly:
$ less /var/log/spamassassin/spamd.log
$ less /var/log/mail.log
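Besides reading the logs, you can ask the running daemon for a verdict on a known test message. The following is a minimal sketch, assuming spamd is listening on its default local port and spamc is on the PATH; it uses GTUBE, the standard test string that SpamAssassin is designed to score as spam. The sender and recipient addresses are placeholders.

```shell
#!/bin/sh
# GTUBE: the standard test string that SpamAssassin treats as spam.
GTUBE='XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X'

# Build a minimal test message in a temporary file.
# The addresses below are placeholders, not real accounts.
msg=$(mktemp)
cat > "$msg" <<EOF
From: sender@example.com
To: user@example.com
Subject: GTUBE test

$GTUBE
EOF

# spamc -c prints "score/threshold" and exits with status 1
# when the message is classified as spam.
if command -v spamc >/dev/null 2>&1; then
    if spamc -c < "$msg"; then
        echo "not flagged: check that spamd is running"
    else
        echo "flagged as spam, as expected"
    fi
else
    echo "spamc not found; install the spamc package first"
fi

rm -f "$msg"
```

If the message is not flagged, spamc could not get a verdict from spamd, so check the service status and spamd.log before looking at the Postfix side.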
How it works…
Spam filtering relies on the content filter mechanism provided by Postfix. We defined a new service in master.cf, spamassassin, which uses the Postfix pipe daemon to hand each message received over SMTP to spamc. SpamAssassin scans the message and computes a spam score, and spamc passes the result to sendmail for delivery. If the message scores below the configured threshold (required_score), it is delivered as normal mail; otherwise, SpamAssassin adds headers that mark it as spam.
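To see the verdict on a delivered message, you can look at its X-Spam-Status header. The following sketch extracts the score and threshold from that header and compares them. The sample message and the rule name EXAMPLE_RULE are made up for illustration; real headers list the tests that actually matched.

```shell
#!/bin/sh
# Hypothetical sample: headers as SpamAssassin might add them to a message.
sample=$(cat <<'EOF'
From: sender@example.com
Subject: Cheap offer
X-Spam-Flag: YES
X-Spam-Status: Yes, score=5.2 required=3.0 tests=EXAMPLE_RULE

Body text
EOF
)

# Extract the score and the threshold from the X-Spam-Status header.
score=$(printf '%s\n' "$sample" |
    sed -n 's/^X-Spam-Status:.*score=\([-0-9.]*\).*/\1/p')
required=$(printf '%s\n' "$sample" |
    sed -n 's/^X-Spam-Status:.*required=\([0-9.]*\).*/\1/p')

# Compare as decimals with awk, since sh arithmetic is integer-only.
verdict=$(awk -v s="$score" -v r="$required" \
    'BEGIN { if (s + 0 >= r + 0) print "spam"; else print "ham" }')

echo "score=$score required=$required verdict=$verdict"
```

On a real system, you would feed a file from the mailbox to the same sed expressions instead of the embedded sample.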
SpamAssassin combines rule-based tests with a Bayesian classifier to classify e-mails as spam or not spam. It examines the headers and content of each e-mail, and the tests that match determine the score.
There's more…
You can train SpamAssassin's Bayesian classifier to get more accurate spam detection.
The following command will train SpamAssassin with spam messages (--spam):
$ sudo sa-learn --spam -u spamd --dir ~/Maildir/.Junk/* -D
To train with non-spam content, use the following command (--ham):
$ sudo sa-learn --ham -u spamd --dir ~/Maildir/.INBOX/* -D
If you are using the mbox format, replace --dir ~/Maildir/.Junk/* with the --mbox option followed by the path to the mbox file.
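Before training, it can help to check how many messages each folder holds, since with default settings the Bayes rules are only applied once at least 200 spam and 200 ham messages have been learned. The following is a rough sketch; count_msgs is a helper name introduced here, and the folder paths simply follow the commands above, so adjust them to your mail layout.

```shell
#!/bin/sh
# count_msgs DIR: count messages in the cur/ and new/ subdirectories
# of a Maildir folder (missing subdirectories are ignored).
count_msgs() {
    find "$1/cur" "$1/new" -type f 2>/dev/null | wc -l | tr -d ' '
}

# Folder paths follow the recipe; adjust them to your mail layout.
spam_dir="$HOME/Maildir/.Junk"
ham_dir="$HOME/Maildir/.INBOX"

echo "spam messages available: $(count_msgs "$spam_dir")"
echo "ham messages available:  $(count_msgs "$ham_dir")"

# Show what the Bayes database has learned so far; the nspam and
# nham lines in the output are the learned message counters.
if command -v sa-learn >/dev/null 2>&1; then
    sudo sa-learn -u spamd --dump magic
fi
```

Run it again after training to confirm that the nspam and nham counters have increased.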
See also
sa-learn, the tool that trains SpamAssassin's Bayesian classifier, at https://spamassassin.apache.org/full/3.2.x/doc/sa-learn.html and https://wiki.apache.org/spamassassin/BayesInSpamAssassin
Learn about Bayesian classification at https://en.wikipedia.org/wiki/Naive_Bayes_classifier