When the same content is posted on multiple sites, how does Google determine which is the original?

The same content can appear at multiple URLs on the web for a variety of reasons. If you are afraid that someone else is copying your content and that your traffic will go to the copy, learn how to deal with it. Read this article to find out what happens when someone copies your content and what you can do about it.

Duplicate content, reproduced content and plagiarism have always been challenges for webmasters as well as search engines. Content developers and bloggers spend a lot of time researching various topics to provide the best content to their readers. However, a plagiarist needs only a minute to copy and reproduce the same content on another site and then walk away with the fruits of your hard work.

How can Google find out which content is original and which is copied?

In most cases, search engines like Google can figure out the original content based on a number of signals, such as the date and time of indexing and the authority of the websites involved. If your website is an authority website with good rankings and you posted the content first, then there is a higher chance that Google will consider your article the original post and ignore the reproduced one in the search results. However, if you are new to blogging and your blog post is reproduced by another website that gets indexed by Google before it finds your post, then you will probably lose the game. In all likelihood, Google would think you copied from the other site, and you would lose the benefits of all your hard work.

How Google determines which is the original content is a secret and Google would never disclose it.

Even though Google would never tell us how it determines which content is the original and which to ignore in search results, we can safely draw a few conclusions from what we observe in real search results. Some factors that Google might consider when choosing the original content over the reproduced content are:

Date and time the content was indexed by Google: In most cases, the content that Google indexed first would be treated as the original, and all other copies would be treated as reproduced content. However, this has a problem. New sites and blogs are not crawled often, so if an established blogger reproduces content from a new site, Google could conclude that the original author is the copycat. Fortunately, Google does not depend solely on this factor to determine the original content.

Authority of the website: Google has some internal rating of how authoritative each website is. Content on an authority site would be given more weight in the analysis of original versus reproduced content. For example, if you reproduce an article from Wikipedia (or vice versa), it is almost impossible to make Google believe that your article is the original because of the high ranking and authoritative nature of the Wikipedia website.

Links from other websites: Google gives a lot of weight to incoming links from other websites. If your article has multiple links pointing to it from other highly ranked websites, then your article would be treated as the original, and the article with fewer incoming links would be treated as reproduced content.

Reference links: Sometimes the reproduced article gives a link to the original source. This is a strong signal to Google that the article being linked to is the original source and the one containing the link is reproduced content.
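To make the interplay of these four signals concrete, here is a minimal Python sketch that combines them into a single score. It is only an illustration of the idea, not Google's method, which is not public. The Page fields, the guess_original function, the 0 to 1 authority rating and the weights are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Page:
    url: str
    first_indexed: datetime   # when the search engine first saw the page
    site_authority: float     # hypothetical 0.0 to 1.0 rating of the site
    inbound_links: int        # links from other websites to this page
    links_to: tuple = ()      # URLs this page links out to


def guess_original(pages, weights=(1.0, 1.0, 1.0, 3.0)):
    """Return the page with the highest combined score.

    The weights (first indexed, authority, inbound links, reference
    links) are invented for illustration only.
    """
    w_first, w_auth, w_links, w_ref = weights
    earliest = min(p.first_indexed for p in pages)
    max_links = max(p.inbound_links for p in pages) or 1

    def score(page):
        total = 0.0
        # Signal 1: the copy that was indexed first gets a bonus.
        if page.first_indexed == earliest:
            total += w_first
        # Signal 2: authority of the hosting website.
        total += w_auth * page.site_authority
        # Signal 3: inbound links, scaled against the best-linked copy.
        total += w_links * page.inbound_links / max_links
        # Signal 4: other copies that link to this page as their source.
        refs = sum(
            1 for other in pages
            if other is not page and page.url in other.links_to
        )
        total += w_ref * refs
        return total

    return max(pages, key=score)
```

With these made-up weights, a new blog whose post is copied by an established site loses on indexing date, authority and inbound links, but wins once the copy links back to it. This matches the point above that a reference link is a strong signal.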

If someone asks you whether they can reproduce your content, it is a good idea to ask them to reproduce only a small summary and then link to your original article. This will help you get some valuable incoming links. It is an opportunity to build reputation and authority for your site in Google and other search engines.
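As a rough sketch of what such a summary with a link could look like, the Python helper below cuts an article down to its first few words and appends a link to the source. The function name make_excerpt, the 50-word limit and the wording of the link text are assumptions for illustration; any site reproducing your content would adapt them to its own templates.

```python
from html import escape


def make_excerpt(text, source_url, max_words=50):
    """Build a short HTML excerpt that links back to the original article."""
    words = text.split()
    summary = " ".join(words[:max_words])
    # Mark the excerpt as truncated when the article is longer than the limit.
    if len(words) > max_words:
        summary += "..."
    # Escape both the text and the URL so they are safe to place in HTML.
    return (
        "<p>{}</p>\n"
        '<p>Read the original article: <a href="{}">{}</a></p>'
    ).format(escape(summary), escape(source_url, quote=True), escape(source_url))
```

The link in the second paragraph is the part that matters for search engines, since it points from the reproduced summary to the original source.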

Google is moving towards Semantic Search, and this could bring another major challenge for content developers and SEO specialists. In addition to dealing with plagiarism and reproduced content, they will have to convince Google of the semantic meaning of their content, unless the search engine is smart enough to detect exactly what you mean. If you are not familiar with Semantic Search, take a look at my article on what is semantic search.
