April 3, 2007
The Price of Internet Content SPAM
A recent research report from Microsoft (NY Times article) claimed that approximately 11% of search results, and as much as 30% of the results for some competitive searches, contain Internet Content SPAM. Spammers create these pages for the sole purpose of profiting from advertisements by fooling the search engines (e.g. Google and Yahoo!) into ranking their pages well.
There are many strategies, but typically these SPAM sites are computer generated to produce huge numbers of pages, each targeted to rank well for a specific keyword. I’ve found several SPAM content generator scripts (YACG, RSSGM, WP AutoBlog, & MyGen) on the web that take content snippets from a variety of articles cribbed from across the web. These scripts auto-generate content by jumbling the snippets into search-engine-readable pages using statistical techniques such as Markov chains. The Spammer then adds a custom template to make the site look unique and creates an internal link network between the pages within the site.
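To show why Markov chain output reads as plausible to a machine but as nonsense to a person, here is a minimal Python sketch of a word-level Markov chain. It is not taken from any of the scripts named above; the function names `build_chain` and `generate`, the `order`, `length`, and `seed` parameters, and the sample text are all my own assumptions for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain


def generate(chain, length=12, seed=0):
    """Walk the chain, picking a random observed follower at each step."""
    rng = random.Random(seed)          # seeded so runs are repeatable
    state = rng.choice(sorted(chain))  # start from a random known state
    out = list(state)
    for _ in range(length - len(state)):
        followers = chain.get(state)
        if not followers:              # dead end: no word ever followed this state
            break
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out)


if __name__ == "__main__":
    # Hypothetical sample text standing in for scraped snippets.
    sample = (
        "search engines rank pages by keyword and links "
        "and pages with links rank well in search results"
    )
    print(generate(build_chain(sample), length=15))
```

Every adjacent word pair in the output appeared somewhere in the input, so the local statistics look like real writing even though the passage as a whole says nothing.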
Once the site is built, Spammers create huge numbers of bogus blogs (on Blogger) and web pages (on free hosting sites such as Yahoo!’s GeoCities) to build links directed towards the SPAM site. This is confirmed by the Microsoft study, in which most free web hosting and free blog sites have over 70% Internet Content SPAM.
Internet Content SPAM is a huge problem for Search Engines because it degrades the quality of their search results. As Albert Einstein famously said, “We can’t solve problems by using the same kind of thinking we used when we created them.” It is tough for Search Engines to identify Internet Content SPAM, because the pages are generated specifically to fool their ranking. Typical approaches entail the use of filters and/or blacklists once a SPAM signature or domain has been discovered. This makes Internet Content SPAM one of the most vexing issues for this generation of Search Engines.
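As a rough illustration of the filter and blacklist approach, here is a minimal Python sketch that drops results whose domain is on a blacklist or whose text contains a known signature. It does not describe how any real search engine works; the names `BLACKLISTED_DOMAINS`, `SPAM_SIGNATURES`, `is_spam`, and `filter_results`, and the example domain and phrase, are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical entries, added only after a SPAM domain or signature is discovered.
BLACKLISTED_DOMAINS = {"spam-example.test"}
SPAM_SIGNATURES = ["buy cheap pills"]


def is_spam(url, page_text):
    """Return True if the URL's host or the page text matches a known entry."""
    host = (urlparse(url).hostname or "").lower()
    # Match the blacklisted domain itself or any of its subdomains.
    if any(host == d or host.endswith("." + d) for d in BLACKLISTED_DOMAINS):
        return True
    text = page_text.lower()
    return any(sig in text for sig in SPAM_SIGNATURES)


def filter_results(results):
    """Keep only the (url, page_text) pairs that match no blacklist entry."""
    return [(url, text) for url, text in results if not is_spam(url, text)]
```

The weakness is visible in the sketch: it only catches what is already on the lists, so a freshly generated site on a new domain passes straight through until someone discovers it.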
Filed by SWong at 6:25 am under Interactive Advertising, SEM, SEO, Search



