Apache SpamAssassin is a widely used open-source spam filter. It classifies email and blocks spam, i.e., unsolicited bulk email. It uses a robust scoring framework and plug-ins to run a wide range of heuristic and statistical tests on email headers and body text, including DNS checks, Bayesian filtering, fuzzy checksums, blacklists, external programs, and online databases.
Apache SpamAssassin is a project of the Apache Software Foundation (ASF). It was first released on April 20, 2001.
The program can be integrated with a mail server to automatically filter all mail for a site.
Justin Mason created Apache SpamAssassin. While maintaining a set of patches against an earlier program, filter.plx by Mark Jeftovic, Mason rewrote the code from scratch and uploaded the new codebase to SourceForge on April 20, 2001.
In summer 2004, the Apache Software Foundation took over the project and renamed it Apache SpamAssassin.
- Wide-spectrum: Apache SpamAssassin uses a wide variety of local and network tests to identify spam signatures. This makes it harder for spammers to find a single aspect that they can craft their messages to work around.
- Free software: Apache SpamAssassin is distributed under the same terms and conditions as other popular open-source software packages such as the Apache web server.
- Flexible: Apache SpamAssassin encapsulates its logic in a well-designed, abstract API so it can be integrated anywhere in the email stream.
- Easy to extend: Anti-spam tests and configuration are stored in plain text, making it easy to configure and add new rules.
- Easy configuration: Apache SpamAssassin is very easy to configure. Once mail has been classified, site-specific and user-specific policies can be applied against spam. Policies can be applied on the mail server and later in the user’s own mail user agent.
Method of Usage
Apache SpamAssassin is a Perl-based application that is typically used to filter incoming email for one or more users. It can run as a standalone program, as a subprogram of another application, or as a client (spamc) that communicates with a daemon (spamd). The client/server and embedded modes of operation have performance benefits, but under certain circumstances they may introduce additional security risks.
Network-based filtering methods
- DNS-based blacklists and DNS-based whitelists
- Fuzzy-checksum-based spam detection filters such as the Distributed Checksums Clearinghouse, Vipul’s Razor and the Cloudmark Authority plugins (commercial)
- Hashcash email stamps based on proof-of-work
- Sender Policy Framework and DomainKeys Identified Mail
- URI blacklists such as SURBL or URIBL which track spam websites
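To make the first item in this list more concrete, here is a minimal Python sketch of how a DNS-based blacklist query name is conventionally formed: the octets of the IPv4 address are reversed and the list's zone is appended. The function name and the zone `dnsbl.example.org` are placeholders for illustration, not part of SpamAssassin, and the sketch only builds the name without performing a DNS lookup.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name that a DNSBL lookup would query for an IPv4 address.

    Hypothetical helper for illustration; the zone is supplied by the caller.
    """
    # Validate the input; raises ValueError for anything that is not IPv4.
    addr = ipaddress.IPv4Address(ip)
    # Reverse the octets: 192.0.2.99 becomes 99.2.0.192
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# 192.0.2.99 is a documentation address and the zone is a placeholder.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A resolver would then look up an A record for that name; by convention, an answer means the address is listed, and a "no such domain" response means it is not.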
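Bayesian filtering is one of the methods Apache SpamAssassin uses to identify spam. A Bayes filter is a general algorithm for computing belief from observations and control data. The following general remarks on Bayes filters are phrased in terms of the classic robot-localization example: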
1. If at any point one of the buckets of the Bayes filter reaches 0 or 1 after normalization, the filter becomes overconfident in its state and cannot allow for a future belief that may include the actual state of the robot. Only allow buckets to reach 0 or 1 if you are absolutely certain that the robot is not, or is, at the specific state.
2. If too many observations from one state are added to the filter too quickly, the filter will converge exponentially fast to the state(s) that match that measurement. This can be dangerous because of remark 1, or if the measurement is incorrect.
3. Be careful of biases in measurements, such as a person standing in front of a door, which will cause several successive measurements to be reported incorrectly.
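Remarks 1 and 2 can be illustrated with a small discrete Bayes update. This is a generic sketch, not SpamAssassin's actual Bayes implementation: the function name, the `floor` parameter, and the two-state example are assumptions made for illustration. Each bucket is multiplied by the likelihood of the observation, the result is normalized, and every bucket is then clamped away from exactly 0 and 1 so that later evidence can still change the belief.

```python
def bayes_update(belief, likelihood, floor=1e-6):
    """One discrete Bayes filter step with clamping (illustrative sketch).

    belief     -- prior probability per state (bucket), summing to 1
    likelihood -- probability of the current observation under each state
    floor      -- assumed lower bound that stops a bucket from reaching 0
    """
    # Multiply prior by likelihood, bucket by bucket.
    posterior = [b * l for b, l in zip(belief, likelihood)]
    total = sum(posterior)
    if total == 0:
        raise ValueError("observation is impossible under every state")
    posterior = [p / total for p in posterior]

    # Clamp so that no bucket is exactly 0 or 1 (remark 1), then renormalize.
    clamped = [min(max(p, floor), 1.0 - floor) for p in posterior]
    total = sum(clamped)
    return [p / total for p in clamped]


# Two hypothetical states; the same observation is repeated 50 times (remark 2).
belief = [0.5, 0.5]
for _ in range(50):
    belief = bayes_update(belief, [0.9, 0.1])

print(belief)
```

With two or more states, the repeated observation drives the belief strongly toward the first state, but the clamp keeps the second bucket above zero, so a run of contradicting observations could still pull the belief back.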
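For each email it analyzes, SpamAssassin generates a header listing the rules that matched and the points assigned to each (positive, negative, or zero). The points are summed, and a message whose total reaches the configured threshold is flagged as spam. To keep that from happening to legitimate mail, senders need to keep an eye on their SpamAssassin scores.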
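As a rough illustration of reading such a header, the sketch below parses an `X-Spam-Status` line with Python's standard `email` module. The sample message, the rule names in it, and the helper `parse_spam_status` are made up for this example, and the exact header layout can differ between installations, so treat the patterns as assumptions rather than a specification.

```python
import re
from email import message_from_string

# Hypothetical sample message; the header value is folded across two lines.
RAW = """\
From: sender@example.com
To: user@example.org
Subject: Hello
X-Spam-Status: No, score=1.3 required=5.0 tests=HTML_MESSAGE,MISSING_DATE
\tautolearn=no version=3.4.6

Body text.
"""


def parse_spam_status(raw_message: str):
    """Return score, threshold, verdict and matched tests, or None if absent."""
    msg = message_from_string(raw_message)
    header = msg.get("X-Spam-Status")
    if header is None:
        return None
    # Collapse folded-header whitespace into single spaces.
    header = " ".join(header.split())

    score = re.search(r"score=(-?\d+(?:\.\d+)?)", header)
    required = re.search(r"required=(-?\d+(?:\.\d+)?)", header)
    if not (score and required):
        return None
    # The test list runs up to the next key=value field or the end of the header.
    tests = re.search(r"tests=(.*?)(?:\s+autolearn=|\s+version=|$)", header)
    names = tests.group(1).split(",") if tests else []

    return {
        "is_spam": header.startswith("Yes"),
        "score": float(score.group(1)),
        "required": float(required.group(1)),
        "tests": [n.strip() for n in names if n.strip()],
    }


print(parse_spam_status(RAW))
```

A sender testing their own mail could compare `score` against `required` and check which entries in `tests` contribute most to the total.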
Because SpamAssassin is capable of running a wide variety of tests, it is tough for spammers to fool and unlikely that non-spam messages are incorrectly filtered or blocked. Still, email filtering is not an exact science, and SpamAssassin is not a perfect solution. There may be instances when legitimate emails get incorrectly identified as junk mail.
Official Website of Apache SpamAssassin: https://spamassassin.apache.org/