Fundamental Anti-Spam Techniques
In this section, we cover some of the common methods used to defeat spam. These techniques appear many times throughout this book and form the basis of the ongoing fight against spam.

Whitelists and Blacklists

These are lists of senders who are always allowed (whitelisted) or always denied (blacklisted) the ability to deliver messages. Whitelists and blacklists are less useful now than they were when spam first became a problem, but they can be helpful in some circumstances. Depending upon how they are implemented, whitelist/blacklist checks can happen at one or more points along a message's journey: at the edge of the network, at the POP/IMAP server, or at the email client. Most MTAs can enforce whitelists or blacklists at different stages of the message transfer: before the server accepts the connection, after the connection has been accepted but before delivery to the recipient, and at recipient delivery time. Some of the more common fields that can be whitelisted/blacklisted include
All modern MTAs have good support for whitelisting and blacklisting. Many of the anti-spam packages covered in this book have their own whitelist/blacklist support. McAfee SpamKiller for Mail Servers even has hierarchical whitelists/blacklists, allowing the administrator to override certain client-listed items. Whitelists/blacklists are an integral part of many anti-spam solutions and are an added feature of others. For example, POPFile's "magnet" feature is a whitelist/blacklist. In a challenge/response system, after an email address is "known" to the system, it is whitelisted so that the sender needn't go through the authentication process on subsequent messages. SpamAssassin gives you the ability to whitelist senders automatically.

Header Checking

Another common method of determining a message's legitimacy is to perform header checks when the MTA accepts the message. Some of the tests that can be done here include

Valid From Address

One way spammers conceal their true identity is by forging their From address. You can defeat this trick by requiring your mail server to check the validity of the From address's domain. Of course, spammers can counter this technique by using valid From addresses, but many do not bother. Sender Policy Framework (SPF) is an attempt to standardize the process by which a From address is considered legitimate. This is accomplished by publishing special DNS TXT records indicating that mail with a particular From address should be coming from a certain set of email servers. More information on SPF is available in Appendix A and at http://spf.pobox.com.

DNS Checks

In the same spirit as the valid From address check, DNS checks exploit the fact that many spammers use servers with no forward or reverse DNS entries, or servers whose forward entry does not match the reverse. Mail originating from servers with incorrect DNS setups like this can be stopped with the appropriate setup in most MTAs. Be aware, though, that strict DNS checking may stop some legitimate email from getting through to users.

Strict Header Checking

Email standards are defined by Requests for Comments, or RFCs. RFCs are the basis for how the entire Internet interoperates at a low level. Email-specific RFCs specify how sending email servers connect to recipient servers in order to transfer their messages. Older versions of Sendmail are lax in interpreting RFCs. For example, Sendmail versions 8.8 and earlier are very lenient in their default acceptance of parameters to the MAIL FROM: and RCPT TO: commands. Other rigorous header checking techniques include requiring HELO/EHLO, requiring accurate parameters to HELO/EHLO, and so on. Many MTAs can control how strict the server should be when accepting inbound messages. Making these sorts of changes may reduce spam but can also cause problems for legitimate email delivery from misconfigured systems.

Blacklists and Whitelists

Blacklists and whitelists can be defined in all modern MTAs.
These lists can take the form of servers, domains, or IP address ranges and can be static (defined locally on the server in a text file or database) or dynamic (for example, domain name system block lists, or DNSBLs). When used at mail transfer time, DNSBLs can reduce the amount of spam coming into a mail system. However, use these lists with caution because they can end up blocking legitimate messages if the block lists are too strict.

Content Filtering

The ability to scan email for certain spam-identifying characteristics is an excellent method to reduce spam. Content filters often generate a score for a message that helps the end user decide how to handle the message, rather than automatically rejecting or sidelining it. The user can then use the email client's filtering capability to move messages identified by the content filter to a junk email folder. The downside of content filters is the effort required to keep them current. Spammers constantly tweak their messages to get past content (and other) filters.

Bayesian Analysis

Bayesian analysis is a special form of content filtering in which statistical analysis of the message components (including headers) takes place. Bayesian analysis is a particularly accurate way of identifying whether a message is spam. Refer to Chapter 7, "Introduction to Bayesian Filtering," for an introduction to Bayesian analysis and Chapter 8, "Bayesian Filtering," for Bayesian solutions. Chapter 9, "Email Client Filtering," has a section on POPFile, which is usually run alongside email clients such as Microsoft Outlook Express. Bayesian analysis can in fact serve as a more sophisticated filtering mechanism that replaces the filtering capability typically included in the email client itself.
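
To make the idea of statistical analysis of message components concrete, here is a minimal sketch of a naive Bayes classifier in Python. It is an illustration only, not the algorithm used by POPFile or any other product mentioned in this book; the class name `NaiveBayesFilter`, the tokenizer, and the add-one smoothing are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter


class NaiveBayesFilter:
    """Toy two-class (spam/ham) naive Bayes classifier over word tokens."""

    def __init__(self):
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Hypothetical tokenizer: lowercase words, digits, and a few symbols.
        return re.findall(r"[a-z0-9$'-]+", text.lower())

    def train(self, text, label):
        """Record the tokens of a message known to be 'spam' or 'ham'."""
        self.token_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Return the estimated probability (0..1) that text is spam."""
        vocab = set(self.token_counts["spam"]) | set(self.token_counts["ham"])
        total_messages = sum(self.message_counts.values())
        tokens = self.tokenize(text)
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed prior: share of training messages with this label.
            prior = (self.message_counts[label] + 1) / (total_messages + 2)
            score = math.log(prior)
            total_tokens = sum(self.token_counts[label].values())
            for token in tokens:
                # Add-one smoothing so unseen tokens never give probability 0.
                seen = self.token_counts[label][token]
                score += math.log((seen + 1) / (total_tokens + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

After training on a handful of labeled messages with `train()`, `spam_probability()` returns a value above 0.5 for text that resembles the spam examples and below 0.5 for text that resembles the legitimate ones. Real filters train on far larger corpora and also examine headers.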

Email Client Filtering

Filtering in the end user's email client software is used throughout this book to act on the modifications that anti-spam software makes to messages. These modifications are usually either the addition of one or more headers indicating a spam score and other information, or changes to the subject line of the message. Chapter 9 includes coverage of how to configure popular email clients for use with the server-side solutions outlined in this book.

Distributed Collaborative Filtering

These systems calculate a checksum of every message processed and place the result into a database. Each time a particular checksum is encountered, a counter is incremented. If the count for a particular checksum (email) is high, then the message is probably either a legitimate mailing list message to a large number of recipients or a spam message. In the case of a mailing list message, the sender can be whitelisted so that the message does not get misclassified as spam. It is important to understand that Distributed Collaborative Filtering (also called Distributed Checksum Filtering, or DCF) systems do not identify messages as spam or non-spam. They simply count the number of times a message has been seen by a particular set of email systems and report that count. The DCF method is very good at what it does, but the system needs to be deployed as part of a larger anti-spam solution, or else a high rate of false positives will likely be encountered. Mailing list messages will often be treated as spam unless some sort of whitelist is used with a DCF solution. Chapter 6, "Distributed Checksum Filtering," contains coverage of two common distributed collaborative filtering systems: DCC and Vipul's Razor.
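
The counting behavior described above can be sketched in a few lines of Python. This is a simplified, single-process illustration and not how DCC or Vipul's Razor are implemented: the class name `ChecksumCounter`, the whitespace/case normalization, the use of SHA-256, and the `bulk_threshold` parameter are all assumptions made for the example.

```python
import hashlib
from collections import Counter


class ChecksumCounter:
    """Counts how often a message body has been seen; it does not judge spam."""

    def __init__(self, bulk_threshold=3, whitelist=None):
        self.counts = Counter()
        self.bulk_threshold = bulk_threshold  # hypothetical cutoff for "bulk"
        self.whitelist = {addr.lower() for addr in (whitelist or [])}

    @staticmethod
    def checksum(body):
        # Assumed normalization: collapse whitespace and ignore case so that
        # trivially different copies of a message map to the same checksum.
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report(self, body):
        """Record one sighting of the message and return the running count."""
        digest = self.checksum(body)
        self.counts[digest] += 1
        return self.counts[digest]

    def is_bulk(self, sender, body):
        """Report whether the message looks like bulk mail from an unlisted sender."""
        seen = self.report(body)
        if sender.lower() in self.whitelist:
            # Whitelisted mailing lists are never flagged, however often seen.
            return False
        return seen >= self.bulk_threshold
```

Note that `is_bulk()` only says a message has been seen many times. Deciding whether bulk means spam is left to the rest of the anti-spam solution, which is why the whitelist matters for mailing lists.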

Sender Verification

Sender verification is a broad category of techniques that require some action on the part of the sender in order to prove that the sender is not a spammer and that the message is not otherwise undesirable (such as a virus). Numerous types of sender verification systems are available; we cover the following types in Chapter 12, "Sender Verification."

Challenge/Response

This method requires the sender to answer a challenge (some sort of acknowledgement sent back to the recipient's system) before the recipient is able to view the message. Many senders won't respond to challenge/response systems. Another issue is the "chicken and the egg" problem, where two people who use challenge/response systems want to communicate with each other for the first time without any other method of communication. This is a difficult, if not impossible, situation to address with the challenge/response solution. Tagged Message Delivery Agent, Active Spam Killer, and Camram all have support for challenge/response. They are covered in Chapter 12.

Special Use Email Addresses

One way to reduce the amount of spam is to generate special-purpose email addresses. Some MTAs (qmail in particular) make it very easy to generate email addresses on the fly that can effectively be one-time (or special) use. Tagged Message Delivery Agent also has support for special-purpose email addresses.

Sender Compute

In the sender compute model, a recipient requires the sender to perform a computation and send the result back to the recipient, usually in the form of a web page or a special email header in the original email. This method is often called "proof of work" or "Internet postage," although the latter term implies a transfer of money, which doesn't happen in the sender compute model. Camram (covered in Chapter 12) contains support for the sender compute model, as well as challenge/response and a graphical interface to CRM114 (a highly accurate Bayesian classifier).
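
As a rough illustration of the sender compute idea, the following Python sketch has the sender search for a "stamp" whose hash begins with a required number of zero bits. Finding the stamp takes many hash attempts; checking it takes one. This is not Camram's actual stamp format: the function names, the `recipient:nonce` layout, the choice of SHA-256, and the default difficulty are all assumptions for the example.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def mint_stamp(recipient: str, difficulty_bits: int = 12) -> str:
    """Sender side: try nonces until the stamp's hash is 'rare' enough."""
    for nonce in count():
        stamp = f"{recipient}:{nonce}"  # hypothetical stamp layout
        digest = hashlib.sha256(stamp.encode("utf-8")).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return stamp


def verify_stamp(stamp: str, recipient: str, difficulty_bits: int = 12) -> bool:
    """Recipient side: a single hash confirms the sender did the work."""
    if not stamp.startswith(recipient + ":"):
        return False  # stamp was minted for a different recipient
    digest = hashlib.sha256(stamp.encode("utf-8")).digest()
    return leading_zero_bits(digest) >= difficulty_bits
```

A sender would call `mint_stamp("alice@example.com")` and attach the result to the message, for instance in a header; the recipient's system would call `verify_stamp()` on it. Each extra difficulty bit roughly doubles the sender's expected work while leaving verification cost unchanged, which is what makes bulk sending expensive.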
Other Anti-Spam Methods

The following methods are less effective in general and therefore less useful for most organizations. However, they may be useful for some people in some cases. They are not covered elsewhere in this book, except peripherally or in an appendix.

Reporting Spam

For the benefit of all who use email, it is a good idea to report spam. Although this is an after-the-fact method, it can reduce the amount of spam that everyone receives in the future. One of the best-known sites for reporting spam is http://spamcop.net. This and other ways of reporting spam are covered in Appendix B.

Charging per Email

Some people have suggested charging all senders per email message sent. This would require significant changes to the underlying email transfer protocol, SMTP, itself. This idea also brings up all of the usual issues related to handling money: determining who handles transferring funds from one party to another, settlement, escrow, and so on.

Third-Party Anti-Spam Solutions

A number of commercial anti-spam solutions are available on the market. Unfortunately, we can only cover a couple of types and solutions here.

Anti-Spam Services

Symantec Brightmail and Postini are anti-spam services where a subscribing organization's mail streams are "washed" of spam by the vendor's service. The resultant "cleaned" email stream is forwarded to your regular email infrastructure for delivery to the end user. Any messages identified as spam end up in a quarantined area on the vendor's infrastructure. Both Symantec and Postini claim patents on their respective solutions, which makes them unique. The benefits of using an anti-spam service like these include
The negatives of using anti-spam services include
For some organizations, a service-based solution is precisely what is needed. Such services are certainly worth considering when shopping for anti-spam solutions.

Anti-Spam Appliances

These devices are similar in nature to firewalls: they are standalone, single-purpose devices that, instead of protecting your network from security events as firewalls do, protect your network from spam. McAfee, Inc. makes its SpamKiller anti-spam product available as an appliance that can be extended to include its anti-virus products. Some firewalls also have built-in anti-spam capability. Some examples of products in this area include Ciphertrust Ironmail and Mirapoint. Benefits of anti-spam appliances include
Additional benefits of combining anti-spam with other security functions such as anti-virus or firewall include
Of course, the downsides of such devices are potentially lower spam identification accuracy, reduced flexibility, and cost. A big negative of anti-spam firewalls is that it is much more difficult to swap out individual anti-spam components and replace them with higher-accuracy techniques.




