Duplicate Content

We always hear about how Google doesn’t like duplicate content, and will penalize a page that has the same content as another. There are plenty of articles on optimizing sites to avoid having duplicate content internally, and articles ranting about scrapers.

What I want to know is what Google thinks about duplicate content cases such as Reference.com or the Associated Press.

Head over to Reference.com, the encyclopedia branch of the Ask.com network of reference sites. Enter a search term. Now go over to Wikipedia and enter the same search term. They’re the same! Reference.com is pulling Wikipedia articles onto their site and throwing in a few ads. (How are they doing this? Does Wikipedia have some sort of API?) What does Google think of this?

Or what about Associated Press articles? They’re syndicated by many newspapers, and appear on their websites. That means the same article on multiple sites, no?

Is Google demoting these pages in its results, or giving them a free pass? It’s hard to tell. Reference.com as a whole has a toolbar PageRank of 8, but its iPod article is listed as N/A, whereas the original Wikipedia article has a PageRank of 7. That would lead us to believe that the copies are being demoted. There’s not really any way to tell for sure though, is there?

It seems that the algorithm is working, and filtering out pages such as those, but I’d like to know what the search giant’s opinion is on such pages. Is duplicate content simply duplicate content?
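
For the curious, here is a rough sketch of how one could measure whether two pages are "the same" even when one of them has a few ads thrown in. To be clear, this is not Google's algorithm, which isn't public. It is a common textbook approach (word shingles compared with Jaccard similarity), and the function names, the shingle size, and the sample text are all my own assumptions for illustration.

```python
import re


def shingles(text, k=5):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return set()
    if len(words) < k:
        # Text shorter than one window: treat the whole thing as one shingle.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(text_a, text_b, k=5):
    """Score from 0.0 (no shared shingles) to 1.0 (identical shingle sets)."""
    return jaccard(shingles(text_a, k), shingles(text_b, k))


if __name__ == "__main__":
    # Hypothetical sample text, not taken from either site.
    original = ("The iPod is a line of portable media players "
                "designed and marketed by Apple")
    copy_with_ads = original + " Buy cheap MP3 players here"
    unrelated = "Marathon training plans for beginners"

    print(similarity(original, copy_with_ads, k=3))
    print(similarity(original, unrelated, k=3))
```

A copy with a little extra text bolted on still scores high, while unrelated text scores zero. Where a search engine would draw the line between "duplicate" and "different" is anyone's guess.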

  • http://joeldrapper.com Joel Drapper

    I think the only thing Google does is stop people from seeing the exact same article on several different sites, when they could just see it once and have other results to look through. I believe duplicate content only affects SERPs, but I may be wrong.

  • http://news.runtowin.com Blaine Moore

    I don’t worry about the duplicate content on other sites.  I think that the penalties come from having multiple pages on the same site that are the same thing.  For folks that rip off my content, I just make sure that mine gets indexed first and is the obvious source.  I rarely republish anything verbatim, preferring to offer something unique, but an odd syndicated article here or there won’t hurt things.