Duplicate Content

Duplicate content occurs when identical, or very similar, content appears in more than one place on a website. Several issues can arise when a website has duplicate content, and search engines handle it in a number of different ways. We cover these within our Hangout Notes, along with best practice recommendations for resolving duplicate content issues.

Technical Issues Can Cause Content to be Indexed on Scraper Sites Before Original Site

January 7, 2020 Source

If content from scraper sites is appearing in the index before the original site, this could be due to technical issues on the original site. For example, Googlebot might be unable to find main hub or category pages, or may be getting stuck in crawl traps by following excessive parameter URLs.
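One common mitigation for parameter-driven crawl traps is to block the problem parameters in robots.txt so Googlebot's crawl budget is spent on the main hub and category pages instead. A minimal sketch, assuming hypothetical parameter names (`sort`, `sessionid`) and paths:

```text
# robots.txt — hypothetical example of limiting parameter crawl traps
User-agent: *
# Block faceted-navigation and session parameters that generate
# endless URL combinations of the same content
Disallow: /*?*sort=
Disallow: /*?*sessionid=
```

Which parameters are safe to block depends entirely on the site; blocking a parameter that produces unique content would hide that content from crawling.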

Google’s Algorithms Should be Able to Detect & Prioritize Original Content From Near Duplicate Versions

October 29, 2019 Source

Google’s algorithms will ideally be able to detect spun content which has been rewritten from another source and see the original content as more valuable.

Having Multiple Pages for Different Product Variations Isn’t a Problem

July 26, 2019 Source

John recommends one of two approaches for products with multiple variations: either ensure each individual variation page is indexable, or consolidate to one main product page with every variation option available on it. The best method depends on the size of the site and the uniqueness of each variation.
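If you take the single-main-page route, each variation URL can point back to the main product page with a canonical tag so signals are consolidated there. A minimal sketch, with hypothetical URLs:

```html
<!-- In the <head> of a hypothetical variation URL
     such as /product/shirt?colour=blue -->
<link rel="canonical" href="https://example.com/product/shirt">
```

Under the other approach, each variation page would instead carry a self-referencing canonical and its own unique content.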

GSC Data Across Duplicate Language Versions Will Only be Shown for Selected Canonical

July 23, 2019 Source

Even if you have hreflang set up correctly, Google can fold together similar language version pages and choose one to index, meaning that data in Google Search Console will only be shown for the one selected canonical page.
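For reference, a correct hreflang setup annotates every language version on each page, including a self-reference. A minimal sketch, with hypothetical URLs and locales:

```html
<!-- Placed in the <head> of each language version,
     with the full set of alternates plus a self-reference -->
<link rel="alternate" hreflang="en-gb" href="https://example.com/en-gb/page">
<link rel="alternate" hreflang="en-us" href="https://example.com/en-us/page">
<link rel="alternate" hreflang="x-default" href="https://example.com/page">
```

Even with this markup in place, near-identical versions (such as two English variants) may be folded together, with Search Console reporting data only against the URL Google selects as canonical.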

Having Sections Of Duplicate Content on A Site Is Fine

July 12, 2019 Source

Google will not demote your site if you have sections of duplicate content across several different pages. Instead, it will recognise that the content appears on several pages and try to filter out the duplicates within search results, showing just one page.

HTML & AMP Pages Containing the Same Content Will Not Be Negatively Seen As Duplicate Content

June 14, 2019 Source

Having the same content on both HTML and AMP pages is not negatively seen as duplicate content by Google. However, it can lead to competition between the pages within search results. To avoid this, John recommends consolidating the value of both pages using the relevant rel alternate link and canonical tag.
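The standard pairing links the two versions in both directions: the HTML page declares its AMP counterpart, and the AMP page canonicalises back to the HTML version. A minimal sketch, with hypothetical URLs:

```html
<!-- On the HTML page: declare the AMP version -->
<link rel="amphtml" href="https://example.com/article.amp.html">

<!-- On the AMP page: canonicalise back to the HTML version -->
<link rel="canonical" href="https://example.com/article.html">
```

This tells Google the two URLs are paired versions of the same content rather than competing duplicates.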

International Websites on Separate Subdomains Will Not Be Penalized for Duplicate Content

May 17, 2019 Source

Google will not penalize international websites that exist on separate subdomains if they have duplicate content. Instead, it will recognise the pages are identical and in most cases index both, but will only pick one URL to show in search results.

Pages with Internally Duplicated Content Are Indexed Separately but Folded Together in Search

April 5, 2019 Source

Google will index pages with duplicate blocks of text separately but will work out which of those pages is most relevant to show for each query and will show just one of them in the search results.

Directory Sites Should Have Unique, Valuable Content to Perform in Search

April 5, 2019 Source

To rank in search, directories should provide unique information that would make users want to visit that site instead of going straight to the website of the business that they want contact details for.

Related Topics

Copyright/DMCA, Thin Content, Embedded Content, Images, User Generated Content, Hidden Content, Interstitials, Expired Content, Keyword Optimization, Header Tags, Page Structure, Web Spam, Videos