Train SpamAssassin to Block SPAM!

One of the most frustrating parts of having an email account is that complete strangers can send you unsolicited email and you have no control over it.  You can click the little button that says “Unsubscribe” and, in a few instances, it may even work.  Most of the time, though, the unsolicited email (SPAM) will actually increase rather than diminish.  That’s where SpamAssassin comes in. We are about to show you how to train SpamAssassin to take care of SPAM and send it to the depths of no return!

Train SpamAssassin

SpamAssassin is a powerful, highly customizable tool for, as one might imagine, fighting SPAM in your inbox or other folders.  It comes with a command-line tool called “sa-learn” which we’ll use in this post to train SpamAssassin to detect good mail (“ham”) and bad mail (“spam”).  With a little CRON magic, you can make this powerful tool learn exactly what is SPAM and what isn’t SPAM in your mailboxes.  What we’ll need is access to your cPanel account, your email address, and a desire to kill things that suck.

The first thing we’ll do is log into our cPanel account and navigate down to the “Apache SpamAssassin” link:

Select Email > Apache SpamAssassin
Select the Apache SpamAssassin icon

Enable the Spam Box

Click Enable Spam Box
If your Spam Box is disabled, select Enable Spam Box. Otherwise, skip this step.

Once there, you’ll want to enable your “Spam Box”.  The Spam Box is a folder that is created automatically to hold mail which SpamAssassin identifies as spam.  The folder is there in case you want to rescue email which may mistakenly be identified as SPAM.

Now that we’ve got our Spam Box enabled, we’ll need to go into your webmail client to actually subscribe to the folder which has been created.  This will allow you to see the folder from within your mail client or your webmail client.  The first step is to log into your webmail client.  We’ll go over SquirrelMail and Roundcube.

Select spam and click Subscribe
You’ll need to add your new spam folder to your list of subscriptions here.
  1. In the Email Accounts section in cPanel, choose the Access Webmail option next to the email account whose folders you wish to view.
  2. Click on the icon to log into SquirrelMail.
  3. Click on the Folders link at the top of the page.
  4. Under the Unsubscribe/Subscribe heading, select the folders you wish to view in the Subscribe list and click Subscribe.
    • If you wish to include the Spam folder, select the folder for Spam.
Check the box for the spam folder
Here you can add spam to your list of visible folders.
  1. In the Email Accounts section in cPanel, choose the Access Webmail option next to the email account whose folders you wish to view.
  2. Click on the Roundcube icon.
  3. Click the Settings icon in the top right corner of the Roundcube interface.
  4. Choose Folders in the left menu.
  5. Check the box for the folder you wish to view.
  6. If you are viewing archived mail, also check the box for the date of the archived mail you wish to view.
    • If you wish to include the Spam folder, check the box for Spam.

Create Cron Jobs

Select Cron Jobs
Edit your cron jobs here.

Now that the Spam Box is set up, we’ll schedule an automated check with cron jobs that run twice a day.  Each run has sa-learn scan the mail folders you define as good email (ham) and bad email (spam).  This requires two different cron jobs, but they are set up nearly identically.  Navigate to your cPanel account and open “Cron Jobs” under “Advanced”:

Before you start adding the cron job, you can decide whether or not you would like to receive an email each time a cron job runs.  This is handy if you want to check on the status and get a feel for how sa-learn is progressing with its learning.  You can do this by entering your email address in this box:

Add an email address for results
You need to tell Cron where to send the email with the results.

If you do not want an email every time the cron job runs, you can put “>/dev/null 2>&1” after the command in the cron job.  This will stop email from being sent.
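To see what that suffix changes, here is a small sketch you can run in any shell. The `noisy_job` function is a made-up stand-in for a cron command, not part of SpamAssassin. Cron mails whatever a command prints, so the redirect works by leaving nothing to mail.

```shell
#!/bin/sh
# noisy_job is a hypothetical stand-in for any command run from cron.
noisy_job() {
    echo "normal output"
    echo "error output" >&2
}

# Capture everything the job prints, which is what cron would mail to you.
without_redirect=$( { noisy_job; } 2>&1 )

# Same job with the suffix from above: both streams go to /dev/null.
with_redirect=$( { noisy_job >/dev/null 2>&1; } 2>&1 )

# Discard only normal output, so error messages would still be mailed.
errors_only=$( { noisy_job >/dev/null; } 2>&1 )

echo "without redirect: [$without_redirect]"
echo "with redirect:    [$with_redirect]"
echo "errors only:      [$errors_only]"
```

If you would rather still hear about failures, ending the command with only “>/dev/null” is one option, since error messages are then still mailed.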

The sa-learn command is the part of the SpamAssassin suite that lets you explicitly teach SpamAssassin the difference between ham and spam.  To create the commands and enable the cron job, scroll down on this page.  We can do this one of two ways: per domain and per account, or a blanket method which covers all domains and accounts.  Consider the second if you have addon domains.

cPanel has built-in templates for common schedules.  You can simply pull down a box and select how often you’d like your cron job to run.  In this case, we’re choosing once every 12 hours, or twice a day.
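If you prefer to fill in the schedule fields by hand, a twice-a-day schedule in standard cron syntax could look like the sketch below. The five fields are minute, hour, day of month, month, and day of week. Running at midnight and noon is an assumption made for illustration; any two hours that are 12 apart would do.

```text
# minute  hour  day-of-month  month  day-of-week
0         */12  *             *      *

# The same schedule with the hours listed explicitly:
0         0,12  *             *      *
```

Here “*/12” in the hour field means every 12th hour starting from 0, so both forms run at 00:00 and 12:00.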

Add the command for the cron job
Add your command line cron job to this box.

Now that we’ve determined the frequency of the cron job, we’re ready to insert the actual script to execute the command.  As mentioned before, there are a few different ways to do this.

The first method is to scan all domains and all accounts in one command.  This is likely the most popular approach because it covers everything in one sweep.  It looks like this:

HAM scan of the Inbox:
sa-learn -p ~/.spamassassin/user_prefs --ham ~/mail/*/*/{cur,new}

SPAM scan of the “spam” folder:
sa-learn -p ~/.spamassassin/user_prefs --spam ~/mail/*/*/.spam/{cur,new}
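Before scheduling wildcard commands like these, it can help to check which folders the wildcards actually match and how many messages they hold. The sketch below is one way to do that. `count_messages` is a hypothetical helper name, and it assumes the same ~/mail layout used in the commands above.

```shell
#!/bin/sh
# count_messages DIR...
# Hypothetical helper: prints "<number of files> <directory>" for every
# argument that is an existing directory, and silently skips the rest
# (including a wildcard that matched nothing).
count_messages() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        n=$(find "$dir" -type f | wc -l | tr -d ' ')
        printf '%s %s\n' "$n" "$dir"
    done
}

# Example calls, using the same wildcards as the cron commands:
# count_messages ~/mail/*/*/cur ~/mail/*/*/new
# count_messages ~/mail/*/*/.spam/cur ~/mail/*/*/.spam/new
```

If a folder you expected is missing from the output, the wildcard does not reach it and sa-learn will not see it either.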

The other method is to scan a single domain and a single email account.  It looks like this:

HAM scan of a folder called “Archive”:
sa-learn -p ~/.spamassassin/user_prefs --ham ~/mail/demodomain.com/demouser/.Archive/{cur,new}

SPAM scan of the “spam” folder:
sa-learn -p ~/.spamassassin/user_prefs --spam ~/mail/demodomain.com/demouser/.spam/{cur,new}
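If you would rather keep the commands in one small script and point the cron job at that, something along these lines might work. It is only a sketch: `train_folder`, `SA_LEARN`, and `PREFS` are names made up for this example, and the paths are the demo ones from above. It loops over cur and new explicitly because cron runs commands with /bin/sh, and not every /bin/sh expands the {cur,new} shorthand.

```shell
#!/bin/sh
# Sketch only: SA_LEARN, PREFS and train_folder are hypothetical names,
# not part of SpamAssassin itself.
SA_LEARN="${SA_LEARN:-sa-learn}"
PREFS="${PREFS:-$HOME/.spamassassin/user_prefs}"

# train_folder MODE FOLDER
# MODE is "ham" or "spam"; FOLDER is a mail folder containing cur/ and new/.
train_folder() {
    mode="$1"
    folder="$2"
    for sub in cur new; do
        # Skip subdirectories that do not exist instead of passing a bad path.
        if [ -d "$folder/$sub" ]; then
            "$SA_LEARN" -p "$PREFS" "--$mode" "$folder/$sub"
        fi
    done
}

# Example calls, using the demo paths from this post:
# train_folder ham  "$HOME/mail/demodomain.com/demouser/.Archive"
# train_folder spam "$HOME/mail/demodomain.com/demouser/.spam"
```

Uncomment the example calls and adjust the domain, user, and folder names to match your own account before using it.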

The end result of all of this work?  Less spam.  More ham.  We’ve set up cron jobs that train SpamAssassin on the specified mailboxes, learning from good email (ham) and bad email (spam), so that it can send future bad email directly into the spam folder and you never even have to see it.  Keep in mind that everything left in a folder scanned as ham is learned as good mail, so move any spam that slips through into the spam folder before the next run.  Have questions or suggestions on different ways to accomplish this?  Let us know in the comments!