Update May 1, 2010: I don’t know if Ubervu saw this post, but they fixed the original issue I mentioned by truncating the original HTML titles. I’ve changed the title of this page, and left the post below just as an example of how duplicate content can kill your rankings in Google. Ubervu looks like a great site with a responsive, considerate team, so it’s worth checking out.
An example of being filtered out of Google by duplicate content from Web scraping
I’ve noticed a website that has been scraping a lot of content recently — ubervu.com.
While their service looks like it might be useful, they are scraping content from other websites for SEO purposes at the expense of those websites. Since Ubervu.com is such a large site with so many backlinks (over 700,000 in Yahoo Explorer), its content outranks the smaller blogs and websites that it scrapes. Here is a search for my blog post title, “Google Privacy: It’s Only Getting Worse”. Ubervu and Businessweek.com have both scraped my content, using my HTML title for their HTML titles.
Here is a screenshot of how those two scrapers have caused Google to completely filter out my original post, even though Ubervu links to my original post.
So much for Google’s claim that there is no such thing as duplicate content, and that Google can determine which copy is the original post.
UPDATE May 1, 2010: It looks like Ubervu truncated the titles to fix the problem — much better: