Spam Study

2009-12-03 02:12:21

Last year, I started a study. I wanted to observe comment spammers in their natural environment, see how they function, how they move, and every other bit of data I could possibly find.

So, I set up a spam trap: a fake comment form rigged to collect data. Here are the results.

1) The bots found my comment form through a robots.txt Disallow statement or a rel="nofollow" link, which suggests they deliberately seek out the things web developers don't want scanned.

2) Only one out of the nearly 2,000 spammers had JavaScript enabled.

3) All of the spammers had cookies enabled.

4) A spammer will submit a test post first, containing several different link formats. Once it determines which one works, it returns with spam in that format (HTML, BBCode, etc.).

5) The bot will attempt to locate the posted spam, presumably for future reference.

6) None of the spammers activated an onfocus, onclick, onkeypress, onmousemove, or any other event with the exception of onload.

From this data, I devised a very simple method to stop comment spam without CAPTCHAs. Require JavaScript, and require a focus event. Those two requirements should stop all but the most determined, custom-built comment spammers.
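Here is a minimal sketch of one way that idea could be wired up, in Python. Everything in it is an assumption for illustration rather than the setup used in the study: the field names (comment, proof), the /comment URL, the HMAC-signed timestamp token, and the one-hour lifetime are all hypothetical. The server renders a form whose hidden field starts empty, a small inline script fills it in only when the textarea receives focus, and the server rejects any post where that field is missing, forged, or stale.

```python
import hashlib
import hmac
import secrets
import time

SECRET_KEY = secrets.token_bytes(32)  # hypothetical per-deployment secret
MAX_AGE_SECONDS = 3600                # assumed lifetime of a form token


def _sign(ts):
    """Return the hex HMAC-SHA256 signature of a timestamp string."""
    return hmac.new(SECRET_KEY, ts.encode(), hashlib.sha256).hexdigest()


def issue_token(now=None):
    """Return a 'timestamp:signature' token to embed in the page's script."""
    ts = str(int(time.time() if now is None else now))
    return f"{ts}:{_sign(ts)}"


def render_form(token):
    """Render a comment form whose hidden 'proof' field starts out empty.

    The inline script copies the token into the hidden field only when the
    textarea receives focus, so a client without JavaScript, or one that
    never focuses the field, submits an empty value.
    """
    return f"""<form method="post" action="/comment">
  <textarea name="comment" id="comment"></textarea>
  <input type="hidden" name="proof" id="proof" value="">
  <input type="submit" value="Post">
</form>
<script>
  document.getElementById("comment").addEventListener("focus", function () {{
    document.getElementById("proof").value = "{token}";
  }});
</script>"""


def is_probably_human(form, now=None):
    """Accept a post only if the focus handler filled in a valid, fresh token."""
    proof = form.get("proof", "")
    ts, sep, sig = proof.partition(":")
    if not sep or not (ts.isascii() and ts.isdigit()):
        return False
    # Compare as bytes so odd input from a bot cannot raise an exception.
    if not hmac.compare_digest(sig.encode(), _sign(ts).encode()):
        return False
    current = time.time() if now is None else now
    return 0 <= current - int(ts) <= MAX_AGE_SECONDS
```

A request handler would call issue_token() and render_form() when serving the page, then pass the parsed POST fields to is_probably_human() before storing the comment. Going by the numbers above, a bot that never runs scripts or fires a focus event would send an empty proof field and be dropped, although a spammer who targets the site specifically could still script around it.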

For my next study, I will collect data about email harvesters and try to correlate specific spam with a specific spambot by delivering a different email address to each bot and logging the data. My hope is to link particular businesses to the email harvesters they use, and to deduce which businesses are selling email lists built from random harvesting.
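The per-bot address idea could be sketched roughly like this in Python. This is only an illustration under stated assumptions: the secret, the example.com domain, the "trap-" prefix, and the in-memory log are all hypothetical placeholders, and a real study would need persistent storage and a mail server that accepts every trap address.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"    # hypothetical secret; keeps tags unguessable
TRAP_DOMAIN = "example.com"  # placeholder domain for the trap mailbox


def trap_address(ip, user_agent, timestamp):
    """Derive a unique address for one page view; return (address, tag)."""
    msg = f"{ip}|{user_agent}|{timestamp}".encode()
    tag = hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()[:16]
    return f"trap-{tag}@{TRAP_DOMAIN}", tag


class HarvestLog:
    """In-memory record of which visitor was shown which trap address."""

    def __init__(self):
        self._visits = {}

    def serve(self, ip, user_agent, timestamp):
        """Log the visit and return the address to print on the page."""
        address, tag = trap_address(ip, user_agent, timestamp)
        self._visits[tag] = {
            "ip": ip,
            "user_agent": user_agent,
            "timestamp": timestamp,
        }
        return address

    def trace(self, recipient):
        """Map the recipient of a received spam back to the logged visit."""
        local = recipient.lower().split("@", 1)[0]
        if not local.startswith("trap-"):
            return None
        return self._visits.get(local[len("trap-"):])
```

When spam later arrives at one of these addresses, calling trace() with the recipient returns the IP, user agent, and time of the visit that harvested it, which is the correlation the study is after.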

Until then, ciao.
