Real Time Streams becoming more relevant

TechCrunch ran a great article today, “Jump Into the Stream”, about the interesting shift we are in toward real-time streamed information.  The biggest challenge is making sense of this deluge of information and figuring out what is meaningful and what is meaningless.

With Google making some rumblings about real-time search, this is going to get really interesting.  Just as SEO was used to drive traffic to Web 1.0 websites, social bookmarking and social discovery were used to drive traffic to websites in Web 2.0.  Web 3.0 may be where we see another source of traffic generation: real-time feeds and streams.  One indication of this is Mark Cuban’s recent blog post about how he is getting more and more traffic from Twitter and Facebook, and now more from them than from Google.  In his case it is definitely Streams (Facebook + Twitter) > SEO (Google).

I am very interested to see where this leads us.


The Price of Internet Content SPAM

A recent research report from Microsoft (NY Times article) claimed that approximately 11% of search results, and as much as 30% of results for some competitive searches, contain Internet Content SPAM. Spammers create these pages for the sole purpose of profiting from advertisements by fooling the search engines (e.g. Google, Yahoo!) into ranking their pages well.

There are many strategies, but typically these SPAM sites are computer generated to produce huge numbers of pages, each targeted to rank well for a specific keyword. I’ve found several SPAM content generator scripts (YACG, RSSGM, WP AutoBlog, & MyGen) on the web that take content snippets from a variety of articles cribbed from across the web. These scripts auto-generate content by jumbling the snippets into search-engine-readable pages using statistical techniques such as Markov chains. The spammer then adds a custom template to make the site look unique and creates an internal link network between the pages within the site.
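To make the Markov chain idea concrete, here is a minimal sketch of how a word-level chain produces text that looks locally plausible but carries no meaning. It is not taken from any of the scripts named above; the function names `build_chain` and `generate`, the `order` parameter, and the sample sentence are all assumptions made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the list of words that followed it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain


def generate(chain, length=20, seed=None):
    """Walk the chain, picking each next word at random from the observed followers."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted so a fixed seed is reproducible
    output = list(state)
    for _ in range(length - len(state)):
        followers = chain.get(state)
        if not followers:  # dead end: this state was never followed by anything
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


if __name__ == "__main__":
    # Hypothetical source text standing in for scraped article snippets.
    sample = "the quick brown fox jumps over the lazy dog and the quick cat"
    print(generate(build_chain(sample), length=12, seed=1))
```

Every adjacent pair of words in the output occurred somewhere in the source, which is why a page built this way can pass a shallow check for readable text even though the page as a whole says nothing.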

Once the site is built, spammers create huge numbers of bogus blogs (on Blogger) and webpages (on free hosting sites such as Yahoo!’s GeoCities) to build links pointing at the SPAM site. This is confirmed by the Microsoft study, which found that on most free web hosting and free blog sites over 70% of the content is Internet Content SPAM.

Internet Content SPAM is a huge problem for Search Engines because it degrades the quality of their search results. It is also tough for Search Engines to identify. As the saying often attributed to Albert Einstein goes, “We can’t solve problems by using the same kind of thinking we used when we created them.” Typical approaches rely on filters and/or blacklists that are applied once a SPAM signature or domain has been discovered, which means they only react to SPAM that has already been found. This makes Internet Content SPAM one of the most vexing issues for this generation of Search Engines.
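As a rough illustration of the filter and blacklist approach, the sketch below drops search results whose domain is on a blacklist or whose snippet contains a known signature phrase. It is only a toy under stated assumptions: the domain `spam-example.test`, the signature phrases, and the function names are hypothetical, and real search engines use far more elaborate signals.

```python
from urllib.parse import urlparse

# Hypothetical blacklist and signature list, for illustration only.
BLACKLISTED_DOMAINS = {"spam-example.test"}
SPAM_SIGNATURES = ("buy cheap pills", "click here to win")


def is_blacklisted(url, blacklist):
    """True if the URL's host is a blacklisted domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blacklist)


def filter_results(results, blacklist, signatures):
    """Keep only (url, snippet) pairs that match no blacklist entry or signature."""
    kept = []
    for url, snippet in results:
        if is_blacklisted(url, blacklist):
            continue
        text = snippet.lower()
        if any(sig in text for sig in signatures):
            continue
        kept.append((url, snippet))
    return kept


if __name__ == "__main__":
    results = [
        ("http://blog.spam-example.test/page1", "Best keyword page"),
        ("http://news.example.org/story", "Click here to WIN a prize"),
        ("http://news.example.org/real", "A report on search quality"),
    ]
    for url, _ in filter_results(results, BLACKLISTED_DOMAINS, SPAM_SIGNATURES):
        print(url)
```

The weakness is visible in the code itself: a freshly generated page on a new domain with no known signature passes straight through until someone adds it to the lists.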
