<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Stanley Wong&#039;s Blog &#187; SEO</title>
	<atom:link href="http://stanleywong.org/category/seo/feed/" rel="self" type="application/rss+xml" />
	<link>http://stanleywong.org</link>
	<description>Musings from the Interactive Ad Man</description>
	<lastBuildDate>Thu, 16 Dec 2010 06:38:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>The Price of Internet Content SPAM</title>
		<link>http://stanleywong.org/2007/04/03/the-price-of-internet-content-spam/</link>
		<comments>http://stanleywong.org/2007/04/03/the-price-of-internet-content-spam/#comments</comments>
		<pubDate>Tue, 03 Apr 2007 06:25:08 +0000</pubDate>
		<dc:creator>swong</dc:creator>
				<category><![CDATA[Advertising]]></category>
		<category><![CDATA[SEM]]></category>
		<category><![CDATA[SEO]]></category>

		<guid isPermaLink="false">http://stanleywong.org/?p=34</guid>
		<description><![CDATA[A recent research report from Microsoft (NY Times article) claimed that approximately 11% of search results, and as much as 30% of some competitive search results, contain Internet Content SPAM. Spammers create these pages for the sole purpose of profiting from advertisements by fooling the search engines (e.g. Google, Yahoo!, etc.) into ranking their pages well. [...]]]></description>
			<content:encoded><![CDATA[<p></p><h3 id="post-15"><span style="font-weight: normal; font-size: 13px;">A <a title="Microsoft Research" href="http://research.microsoft.com/SearchRanger/" target="_blank">recent research report from Microsoft</a> (<a href="http://www.nytimes.com/2007/03/19/technology/19spam.html?ex=1331956800&amp;en=44a8402e53db4153&amp;ei=5090&amp;partner=rssuserland&amp;emc=rss" target="_blank">NY Times article</a>) claimed that approximately 11% of search results, and as much as 30% of some competitive search results, contain Internet Content SPAM. Spammers create these pages for the sole purpose of profiting from advertisements by fooling the search engines (e.g. Google, Yahoo!, etc.) into ranking their pages well.</span></h3>
<div>
<p>There are many strategies, but typically these SPAM sites are computer generated to produce huge numbers of pages, each targeted to rank well for a specific keyword. I’ve found several SPAM content generator scripts (<a title="YACH" href="http://getyacg.com/" target="_blank">YACG</a>, <a title="RSSGM" href="http://www.rssgm.com/" target="_blank">RSSGM</a>, <a title="WP Autoblog" href="http://elliottback.com/wp/archives/2006/06/06/wp-autoblog-a-syndication-plugin/" target="_blank">WP AutoBlog</a>, and <a title="MyGen" href="http://www.boogybonbon.com/2006/08/17/mygen-v102-now-available/" target="_blank">MyGen</a>) on the web that take snippets from a variety of articles cribbed from across the web. These scripts auto-generate content by jumbling the snippets into search engine readable pages using statistical algorithms such as Markov Chains. The Spammer then adds a custom template to make the site look unique and creates an internal link network between pages within the site.</p>
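<p>To see why Markov Chain output passes for real text at a glance, here is a minimal sketch of the general technique in Python. It is not taken from any of the scripts named above; the function names <code>build_chain</code> and <code>generate</code> and the sample sentence are made up for illustration.</p>

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the words that follow it in `text`."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain


def generate(chain, length, seed=0):
    """Walk the chain, picking each next word at random from observed followers."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # random starting state
    output = list(state)
    for _ in range(length - len(state)):
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state was never followed by anything
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


if __name__ == "__main__":
    # Hypothetical sample text, standing in for scraped snippets.
    sample = "the cat sat on the mat and the cat ran"
    print(generate(build_chain(sample), length=8, seed=1))
```

<p>Every adjacent pair of words in the output also appears somewhere in the source text, so the result looks locally fluent even though the page as a whole says nothing.</p>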
<p>Once the site is built, Spammers create huge numbers of bogus blogs (Blogger) and webpages (on free hosting sites such as Yahoo! GeoCities) to build links pointing to the SPAM site. This is confirmed by the Microsoft study, in which most free web hosting and free blog sites were found to have over 70% Internet Content SPAM.</p>
<p>Internet Content SPAM is a huge problem for Search Engines because it degrades the quality of their search results. As Albert Einstein famously said, “We can’t solve problems by using the same kind of thinking we used when we created them.” It is tough for Search Engines to identify Internet Content SPAM. Typical approaches apply filters and/or blacklists once a SPAM signature or domain is discovered. This makes Internet Content SPAM one of the most vexing issues for this generation of Search Engines.</p>
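<p>As a rough illustration of the blacklist approach, the Python sketch below drops any result whose host, or a parent domain of it, is on a list of known SPAM domains. The domains and the function names <code>is_blacklisted</code> and <code>filter_results</code> are hypothetical; this is not how any particular Search Engine implements it.</p>

```python
from urllib.parse import urlparse

# Hypothetical blacklist; a real one would be built up as SPAM domains are discovered.
BLACKLISTED_DOMAINS = {"spam-example.test", "autogen-example.test"}


def is_blacklisted(url, blacklist=BLACKLISTED_DOMAINS):
    """Return True if the URL's host or any of its parent domains is blacklisted."""
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    # Check "a.b.c", then "b.c", then "c", so subdomains of a listed domain match too.
    return any(".".join(parts[i:]) in blacklist for i in range(len(parts)))


def filter_results(urls, blacklist=BLACKLISTED_DOMAINS):
    """Keep only the search results that are not on the blacklist."""
    return [u for u in urls if not is_blacklisted(u, blacklist)]


if __name__ == "__main__":
    results = [
        "http://pages.spam-example.test/cheap-widgets",
        "http://stanleywong.org/",
    ]
    print(filter_results(results))
```

<p>The weakness is visible in the code: a domain is only filtered after someone has added it to the list, and a Spammer can register a new one faster than that.</p>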
</div>
]]></content:encoded>
			<wfw:commentRss>http://stanleywong.org/2007/04/03/the-price-of-internet-content-spam/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

