Image spam

Spam containing images, or “image spam”, was a major focus of spammers and anti-spam vendors during 2006. Over the preceding few years, the techniques used to detect text-based spam and the computers sending it had become effective enough to catch almost all of it, and spammers were fighting a losing battle to get their messages delivered to inboxes.

During the second quarter of 2005, spammers began to carry the spam message in an image rather than in text. This type of spam grew in complexity and volume, and by the start of 2006 image spam accounted for up to 30% of all spam. By October that figure had risen to as much as 40%, and by the end of 2006 to as much as 65%. With image spam's share more than doubling over the year, and a typical image spam message being 3-4 times the size of a text-based one, a lot of extra junk must have been clogging up the tubes of the internet last year.
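To put a rough number on that extra junk, here is a back-of-the-envelope sketch. It assumes the total number of spam messages stayed constant over the year and takes 3.5 as the midpoint of the 3-4 times size figure; both are simplifying assumptions made for illustration, not measured values, and the function name is made up for this example.

```python
def relative_spam_bytes(image_share, size_ratio):
    """Total spam bytes relative to an all-text baseline with the same message count.

    image_share: fraction of spam messages that are image spam (0.0 to 1.0)
    size_ratio:  size of an image spam message relative to a text one (assumed)
    """
    text_part = 1.0 - image_share          # text messages count as size 1
    image_part = image_share * size_ratio  # image messages count as size_ratio
    return text_part + image_part


SIZE_RATIO = 3.5  # assumed midpoint of "3-4 times the size"

start_of_2006 = relative_spam_bytes(0.30, SIZE_RATIO)
end_of_2006 = relative_spam_bytes(0.65, SIZE_RATIO)
growth = end_of_2006 / start_of_2006

print(f"start of 2006: {start_of_2006:.3f}x an all-text baseline")
print(f"end of 2006:   {end_of_2006:.3f}x an all-text baseline")
print(f"growth in spam bytes over the year: {growth:.2f}x")
```

Under these assumptions the same number of spam messages would consume about one and a half times as many bytes at the end of 2006 as at the start. Any growth in the number of messages sent would come on top of that.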

Increase in image spam

At the start of the year, image spam consisted primarily of ‘pump and dump’ stock spam. Stock spam suited the image format because it did not require recipients to click on a link. By the end of the year, image spam was advertising ‘pump and dump’ stock, pharmaceuticals, fake degrees, counterfeit software, loans, mortgages and other kinds of junk usually associated with text-based spam.

Image spam, like text-based spam, is continually changing, and although many of the images appear to be the same at first glance, in most cases each image is unique. Even the older image spam used techniques to avoid detection: random background noise in the image file, random image file names, random subject lines and ‘hash buster’ message bodies were all added to disguise the spam. Some image spam used animated GIFs, and some used multi-layer image files to hide the spam message in the image.
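The effect of that randomization on exact-match fingerprinting can be sketched in a few lines. This is a minimal illustration, not a description of any particular filter: it assumes a filter that identifies known spam images by a SHA-256 digest of the file, and it uses placeholder bytes with a few random bytes appended as a stand-in for random background noise in a real image. The helper names are hypothetical.

```python
import hashlib
import random


def fingerprint(data: bytes) -> str:
    """Exact-match fingerprint: the SHA-256 digest of the raw bytes."""
    return hashlib.sha256(data).hexdigest()


def add_noise(image_bytes: bytes, rng: random.Random, n: int = 8) -> bytes:
    """Return a copy with n random bytes appended (stand-in for pixel noise)."""
    noise = bytes(rng.getrandbits(8) for _ in range(n))
    return image_bytes + noise


# Placeholder payload for illustration only; this is not a valid image file.
base_image = b"GIF89a" + b"\x00" * 64

rng = random.Random(2006)  # fixed seed so the run is repeatable
variants = [add_noise(base_image, rng) for _ in range(5)]
digests = {fingerprint(v) for v in variants}

print("known fingerprint:", fingerprint(base_image)[:16])
for v in variants:
    print("variant:          ", fingerprint(v)[:16])
print("distinct variant fingerprints:", len(digests))
```

Each variant would look identical to a recipient, yet none of them matches the fingerprint of the original or of any other variant, so a blocklist of known digests never catches the next copy.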

Over the year McAfee developed a large number of methods to detect image spam accurately. Analyzing the actual content of the image is slow and CPU-intensive, and spammers have already started to obfuscate the text in their images to prevent OCR techniques from classifying them (for example, by using wavy or broken text, as in the examples above). McAfee Anti-Spam therefore does not analyze the actual ‘picture’, which is not currently necessary to detect the spam. Instead it uses a number of techniques: some are based on the (mostly botnet) computers used to send the spam, and some on analyzing the content of the spam message. Current McAfee Anti-Spam detection rates for image spam are 99% or higher.

The trend of image spam seems certain to continue in 2007 as spammers continue to build up their botnets and hone the tools used to distribute this type of spam.

Further blog posts on image spam and some of the techniques used to detect it are planned for the coming weeks and months.