Jul 14th, 2006, 3:52 pm
Microsoft's Cybersecurity and Systems Management Research Group has created an automated tool, SpamHunter, as part of the larger Strider Search Defender project, to combat the sources of the comment spam that is the scourge of blogs across the web. Because spam sites can earn high, legitimate-looking search rankings while at the same time serving up spammed ads, it is a problem that has to be dealt with. Microsoft is on course to be the unlikely hero of the hour.
SpamHunter does this by creating a list of doorway sites, which are hosted on legitimate blog or forum sites and feed ads from a central spammer target page. Rather than adopting the usual content-reading approach to spam discovery, Microsoft is using contextual analysis of URL redirection instead. By crawling the web using search engine queries to locate sites within the same network, SpamHunter can pass the information across to the Microsoft Strider URL Tracer, which then puts the pieces together and determines where the central domains fed by those doorways are. Because networks of thousands of doorway pages can serve ads from a single domain, it is possible for Search Defender to take down an entire operation in one hit. Indeed, the system has already had some measure of success during testing, determining, for example, that 97 percent of the 5,500 spam sites at Blog4Ever were the work of a single comment spammer who was using the same AdSense affiliate identifier.
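To make the redirection idea a little more concrete, here is a minimal Python sketch of the grouping step. It is not Microsoft's code: the function names, the example URLs, and the assumption that each doorway page has already been traced to its final target URL are all mine, purely for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_target(redirects):
    """Group doorway URLs by the host they ultimately redirect to.

    `redirects` is a hypothetical mapping of doorway URL -> final target URL,
    i.e. the kind of result a redirection tracer might hand back.
    """
    groups = defaultdict(list)
    for doorway, target in redirects.items():
        host = (urlparse(target).hostname or "").lower()
        groups[host].append(doorway)
    return dict(groups)


def largest_operation(groups):
    """Return (host, doorways) for the target fed by the most doorways."""
    return max(groups.items(), key=lambda item: len(item[1]))


# Made-up example data: three doorways feed one target, one feeds another.
traced = {
    "http://blog-one.example/cheap-pills": "http://ads.spam-target.example/go?id=1",
    "http://blog-two.example/free-ringtones": "http://ads.spam-target.example/go?id=2",
    "http://forum.example/thread/42": "http://ads.spam-target.example/go?id=3",
    "http://blog-three.example/loans": "http://other-target.example/",
}

groups = group_by_target(traced)
host, doorways = largest_operation(groups)
print(host, len(doorways))
```

The point of grouping by target rather than by page content is the one made above: thousands of throwaway doorways are cheap to create, but they all have to lead back to a small number of domains that actually serve the ads.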
The really clever part is that the more comments there are linking back to a spam site, the quicker SpamHunter will find it, and what is more, a spammed forum effectively becomes a honey forum from which other spam URLs can easily be obtained. Of course, there will always be the problem of false positives, and to try to reduce these Microsoft is making use of the whitelist of legitimate advertising and web analytics sites that it compiled during work on the Honey Monkey malicious exploit finder project.
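The whitelist step is simple enough to sketch in a few lines of Python. Again, this is only an illustration under my own assumptions: the whitelist entries and function names are hypothetical, and the real Honey Monkey whitelist and its matching rules are not described here.

```python
from urllib.parse import urlparse

# Hypothetical whitelist of legitimate ad and analytics hosts.
WHITELIST = {"ads.legit-network.example", "analytics.example"}


def is_whitelisted(url, whitelist=WHITELIST):
    """True if the URL's host is a whitelisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in whitelist)


def filter_suspects(target_urls, whitelist=WHITELIST):
    """Drop targets that match the whitelist, keeping the rest as suspects."""
    return [url for url in target_urls if not is_whitelisted(url, whitelist)]


targets = [
    "http://ads.legit-network.example/banner.js",
    "http://eu.analytics.example/track",
    "http://ads.spam-target.example/go?id=1",
]
suspects = filter_suspects(targets)
print(suspects)
```

Matching on the host name and its subdomains, rather than on a plain substring, avoids treating something like `analytics.example.spam-target.example` as legitimate.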
Of course, this is not so much a cure as just another weapon to be used in the fight against spam. It is up to us, bloggers and forum moderators, to do our bit by keeping our eyes open for comment spammers and sending them to the /dev/null hell where they belong. Search engines could do a little more when it comes to blacklisting sites that host comment spam, and MSN is promising just that. Working with the Search Defender team, it will pursue leads and either remove proven spam sites or assign them such a low relevance rating that it amounts to much the same thing. I would like to think that the other big search players will co-operate, but given the current rivalry between Microsoft and Google this seems unlikely.
This blog entry was written by Davey Winder, staff writer aka happygeek. It has received 1,405 views, 0 comments, and 0 linkbacks. 1 voter has rated this entry 5 out of 5 stars. It was promoted to featured status Jul 14th, 2006.