<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>StanleyWong.org &#187; Search</title>
	<atom:link href="http://stanleywong.org/category/search/feed/" rel="self" type="application/rss+xml" />
	<link>http://stanleywong.org</link>
	<description>Musings from the Interactive Adman</description>
	<lastBuildDate>Fri, 11 Sep 2009 22:06:35 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Real Time Streams becoming more relevant</title>
		<link>http://stanleywong.org/2009/05/18/23/</link>
		<comments>http://stanleywong.org/2009/05/18/23/#comments</comments>
		<pubDate>Mon, 18 May 2009 00:24:46 +0000</pubDate>
		<dc:creator>SWong</dc:creator>
				<category><![CDATA[Interactive Advertising]]></category>
		<category><![CDATA[Marketing]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Startups]]></category>
		<category><![CDATA[Web 2.0]]></category>

		<guid isPermaLink="false">http://stanleywong.org/?p=23</guid>
		<description><![CDATA[Great article on TechCrunch today about &#8220;Jump Into the Stream&#8221; and how we are in an interesting shift to real time streamed information.  The biggest challenge is to make sense of this deluge of information and to figure out what is meaningful and what is meaningless. With Google making some rumbling about real time search [...]]]></description>
			<content:encoded><![CDATA[<p>Great article on TechCrunch today about &#8220;<a title="TechCrunch Jump Into the Stream" href="http://www.techcrunch.com/2009/05/17/jump-into-the-stream/" target="_blank">Jump Into the Stream</a>&#8221; and how we are in an interesting shift to real time streamed information.  The biggest challenge is to make sense of this deluge of information and to figure out what is meaningful and what is meaningless.</p>
<p>With Google making some rumblings about real-time search, this is going to get really interesting.  Much as SEO was used to drive traffic to Web 1.0 websites, social bookmarking and social discovery were used to drive traffic to websites in Web 2.0.  Web 3.0 may be where we see another source of traffic generation through different real-time feeds and streams.  An indication of this is <a title="How Twitter &amp; Facebook Now Compete with Google" href="http://blogmaverick.com/2009/05/15/how-twitter-and-facebook-now-compete-with-google/" target="_blank">Mark Cuban&#8217;s recent blog post</a> about how he is getting more traffic from Twitter and Facebook than from Google.  In his case it is definitely Streams (Facebook + Twitter) &gt; SEO (Google).</p>
<p>I&#8217;m very interested to see where this leads us.</p>
]]></content:encoded>
			<wfw:commentRss>http://stanleywong.org/2009/05/18/23/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Price of Internet Content SPAM</title>
		<link>http://stanleywong.org/2007/04/03/the-price-of-internet-content-spam/</link>
		<comments>http://stanleywong.org/2007/04/03/the-price-of-internet-content-spam/#comments</comments>
		<pubDate>Tue, 03 Apr 2007 06:25:16 +0000</pubDate>
		<dc:creator>SWong</dc:creator>
				<category><![CDATA[Interactive Advertising]]></category>
		<category><![CDATA[SEM]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://stanleywong.org/2007/04/03/the-price-of-internet-content-spam/</guid>
		<description><![CDATA[A recent research report from Microsoft (NY Times article) claimed that approximately 11% of search results and as much as 30% of some competitive search results contain Internet Content SPAM. Spammers create these pages for the sole purpose of profiting from advertisements by fooling the search engines (Google, Yahoo!, etc.) into ranking their pages well. [...]]]></description>
			<content:encoded><![CDATA[<p>A <a target="_blank" title="Microsoft Research" href="http://research.microsoft.com/SearchRanger/">recent research report from Microsoft</a> (<a target="_blank" href="http://www.nytimes.com/2007/03/19/technology/19spam.html?ex=1331956800&#038;en=44a8402e53db4153&#038;ei=5090&#038;partner=rssuserland&#038;emc=rss">NY Times article</a>) claimed that approximately 11% of search results and as much as 30% of some competitive search results contain Internet Content SPAM.  Spammers create these pages for the sole purpose of profiting from advertisements by fooling the search engines (Google, Yahoo!, etc.) into ranking their pages well.</p>
<p>There are many strategies, but typically these SPAM sites are computer generated to produce huge numbers of pages, each targeted to rank well for a specific keyword.  I&#8217;ve found several SPAM content generator scripts (<a target="_blank" title="YACH" href="http://getyacg.com/">YACG</a>, <a target="_blank" title="RSSGM" href="http://www.rssgm.com/">RSSGM</a>, <a target="_blank" title="WP Autoblog" href="http://elliottback.com/wp/archives/2006/06/06/wp-autoblog-a-syndication-plugin/">WP AutoBlog</a>, &#038; <a title="MyGen" target="_blank" href="http://www.boogybonbon.com/2006/08/17/mygen-v102-now-available/">MyGen</a>) on the web that take content snippets from a variety of articles cribbed from across the web.  These scripts auto-generate content by jumbling the snippets into search engine readable pages using statistical algorithms such as Markov Chains.  The Spammer then adds a custom template to make the site look unique and creates an internal link network between the pages within the site.</p>
<p>Once the site is built, Spammers create huge numbers of bogus blogs (Blogger) and webpages (from free hosting sites such as Yahoo!&#8217;s GeoCities) to build links directed towards the SPAM site.  This is confirmed by the Microsoft study, in which most free web hosting and free blog sites have over 70% Internet Content SPAM.</p>
<p>Internet Content SPAM is a huge problem for Search Engines because it degrades the quality of their search results.  As Albert Einstein famously said, &#8220;<span class="body">We can&#8217;t solve problems by using the same kind of thinking we used when we created them.&#8221;  That is why it is tough for Search Engines to identify Internet Content SPAM.  Typical approaches entail the use of filters and/or blacklists once a SPAM signature or domain is discovered.  This makes Internet Content SPAM one of the most vexing issues for this generation of Search Engines.</span></p>
]]></content:encoded>
			<wfw:commentRss>http://stanleywong.org/2007/04/03/the-price-of-internet-content-spam/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>My Yahoo! Buzz Index Patent Has Been Granted</title>
		<link>http://stanleywong.org/2007/01/12/my-yahoo-buzz-index-patent-has-been-granted/</link>
		<comments>http://stanleywong.org/2007/01/12/my-yahoo-buzz-index-patent-has-been-granted/#comments</comments>
		<pubDate>Fri, 12 Jan 2007 19:51:44 +0000</pubDate>
		<dc:creator>SWong</dc:creator>
				<category><![CDATA[Marketing]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Yahoo!]]></category>

		<guid isPermaLink="false">http://stanleywong.org/?p=12</guid>
		<description><![CDATA[Last month, I found out the patent for the Yahoo! Buzz Index (US Patent 7,146,416) was granted. I had a great time working with Janice Yoo, Elliot Yasnovsky, and KT Lim on building this project. From Search Engine Land: Yahoo&#8217;s Buzz This patent had me wondering how Yahoo presently measures trends in topics searched for [...]]]></description>
			<content:encoded><![CDATA[<p>Last month, I found out the patent for the <a title="Yahoo! Buzz Index" target="_blank" href="http://buzz.yahoo.com/">Yahoo! Buzz Index</a> (<a title="Buzz Index Patent" target="_blank" href="http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&#038;Sect2=HITOFF&#038;d=PALL&#038;p=1&#038;u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&#038;r=1&#038;f=G&#038;l=50&#038;s1=7,146,416.PN.&#038;OS=PN/7,146,416&#038;RS=PN/7,146,416">US Patent 7,146,416</a>) was granted.  I had a great time working with Janice Yoo, <a target="_blank" href="http://www.linkedin.com/pub/0/ab/b12">Elliot Yasnovsky</a>, and KT Lim on building this project.  From <a target="_blank" href="http://searchengineland.com/061206-124914.html">Search Engine Land</a>:</p>
<div>
<blockquote>
<div><strong>Yahoo&#8217;s Buzz</strong></div>
<div>This patent had me wondering how Yahoo presently measures trends in topics  searched for on their search engine and portal, selected in their directory, and  from people&#8217;s usage of the many services they offer; and how the company might  be analyzing and using that information.</div>
<div>
<div>
<div><a href="http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&#038;Sect2=HITOFF&#038;d=PALL&#038;p=1&#038;u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&#038;r=1&#038;f=G&#038;l=50&#038;s1=7,146,416.PN.&#038;OS=PN/7,146,416&#038;RS=PN/7,146,416">Web  site activity monitoring system with tracking by categories and terms</a><br />
Invented by Janet Yoo, Kian-Tat Lim, Stanley Ben Wong, and Elliott  Yasnokvsky<br />
Assigned to Yahoo<br />
US Patent 7,146,416<br />
Granted December 5,  2006<br />
Filed September 1, 2000</div>
<div>Abstract
<blockquote>
<div>A traffic monitor provides statistics of traffic using an activity input  for receiving data related to activity on a server system. Events being  monitored are binned by topic or term, where the terms are associated with  categories. The categories can be a hierarchy of categories and subcategories,  with terms being in one or more categories. The categorized events include page  views and search requests and the results might be normalized over a field of  events and a result output for outputting results of the normalizer as the  statistical analyses of traffic.</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
<div>The Yahoo! Buzz Index goes way beyond what you see on the public version of the site. Back when I was at Yahoo!, we conceived of this product as a marketing dashboard that would give you all types of insight into what was arguably one of the largest online panels in the world.<br />
The <a title="Yahoo! Buzz Index Client Version" target="_blank" href="http://buzz.yahoo.com/client/">client version of the Buzz Index</a> empowers marketers to slice and dice data to find aggregate information (without personally identifiable information) about who is engaging with a brand, concept, or search term.  These insights leverage Yahoo!&#8217;s unparalleled number of consumer profiles (over 250 million when I last looked).  <a title="Google Zeigeist" target="_blank" href="http://www.google.com/press/zeitgeist.html">Google&#8217;s Zeitgeist</a> and <a title="Google Trends" target="_blank" href="http://www.google.com/trends">Google Trends</a> are similar but lack the insights that come from behavioral patterns backed by the sheer number of profiles.  For example, a brand manager can find out whether Pepsi&#8217;s brand on the web engages younger female audiences than Coca-Cola&#8217;s does, and how the two compare.<br />
It was a lot of fun dreaming up that product when I was at Yahoo!  It was fun to work there during those times when we were routinely innovating with new concepts and technology.  In fact, the Yahoo! Buzz Index predated Google&#8217;s Zeitgeist and Google Trends by over 4 years.  I&#8217;m glad to finally see that the patent has been granted.</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://stanleywong.org/2007/01/12/my-yahoo-buzz-index-patent-has-been-granted/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Take All?</title>
		<link>http://stanleywong.org/2007/01/03/google-take-all/</link>
		<comments>http://stanleywong.org/2007/01/03/google-take-all/#comments</comments>
		<pubDate>Wed, 03 Jan 2007 20:55:07 +0000</pubDate>
		<dc:creator>SWong</dc:creator>
				<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://stanleywong.org/?p=10</guid>
		<description><![CDATA[I just read Rich Skrenta&#8217;s great blog post, &#8220;Winner-Take-All: Google and the Third Age of Computing&#8221; Rich is right on the money with a lot of his observations, especially that Google has built its huge lead on the back of its Search and Advertising dominance. One thing I&#8217;d like to add is that zero [...]]]></description>
			<content:encoded><![CDATA[<p>I just read Rich Skrenta&#8217;s great blog post,<br />
&#8220;<a target="_blank" title="Skrenta Post" href="http://www.skrenta.com/2007/01/winnertakeall_google_and_the_t.html">Winner-Take-All: Google and the Third Age of Computing</a>&#8221;</p>
<p>Rich is right on the money with a lot of his observations, especially that Google has built its huge lead on the back of its Search and Advertising dominance.</p>
<p>One thing I&#8217;d like to add is that zero switching costs also have a downside for Google if a new entrant is <strong>significantly better</strong> than the incumbent. So far, the efforts of Yahoo!, Ask, and Microsoft in the search space are building on essentially the same technology platform as Google. The best they can do is be incrementally better than Google, and they therefore face a huge challenge in overcoming the brand gap and the technology refinement of Google&#8217;s band of top-notch engineers.</p>
<p>Being significantly better requires a dramatic leap (in technology or business model) beyond what Google is doing today. When Google came into the market, they introduced PageRank (a link-based algorithm), which was significantly better than the prevailing keyword-based ranking technologies employed by the incumbents (AltaVista, Excite, Infoseek, Inktomi, etc.). PageRank delivered, for that time, a search environment relatively free of the SPAM that plagued the keyword-based search engines.  So for users comparing searches performed on Google vs. the incumbents, the results were startling.<br />
The biggest challenge for search continues to be search SPAM. SEO practitioners (white hat and black hat) have continued to arm themselves with tools to SPAM the current generation of search engines.  These tools include keyword-based page generators (using Markov chain models) that link the generated pages together into artificial networks to improve ranking.</p>
<p>Having worked in the Internet space for over 10 years, I personally think that Google is not invincible. I&#8217;ve seen too many dominant players come and go during my career to be bold enough to claim otherwise. However, a potential competitor needs a talented team that can approach this space from a completely different vantage point and offer up a new technology paradigm that advances the state of the art towards the next generation.  This requires a healthy combination of Smarts (Innovation) + Money (to Scale the business) + Flawless Execution to achieve.</p>
]]></content:encoded>
			<wfw:commentRss>http://stanleywong.org/2007/01/03/google-take-all/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
