Canonicalization

Canonicalization is a method used to help prevent duplicate content issues and manage the indexing of URLs in search engines. The link attribute “rel=canonical” signals to search engines which page is preferred for indexing, and it is respected in most cases when it is correctly implemented and points to an equivalent page. Our Hangout Notes on canonicalization provide best practice advice and insights into how it is handled by search engines.
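As a rough illustration of checking which URL a page declares as canonical, the sketch below reads the canonical link element out of a page's HTML using only the Python standard library. The names `CanonicalLinkParser` and `find_canonical` and the example.com URL are hypothetical, and treating multiple canonical tags as "no usable canonical" is an assumption made for this example, not documented search engine behaviour.

```python
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values, so split before matching.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonical(html):
    """Return the declared canonical URL, or None if absent or ambiguous."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    # Assumption for this sketch: conflicting tags count as no clear signal.
    return parser.canonicals[0] if len(parser.canonicals) == 1 else None


sample_html = """
<html><head>
<link rel="canonical" href="https://example.com/shoes/">
</head><body></body></html>
"""
print(find_canonical(sample_html))
```

A check like this could be run across a crawl to confirm that each page declares exactly one canonical URL.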

Use Pagination to Join Split Pages

April 24, 2015 Source

If you want to split a page across two different URLs, you can’t redirect or canonicalise from the old URL to both new URLs. Instead, choose one of the new pages as the main one and link from it to the secondary page. You can also paginate the pages together.


Canonicalise Product Variants

December 23, 2014 Source

A discussion of when to canonicalise pages to other pages, e.g. for colour variations of product pages.


URL Issues Create Duplicate Pages

December 5, 2014 Source

Duplicate URLs caused by inconsistent ordering, case inconsistency, and session IDs can be fixed with canonical tags if the issue is minor, but they still create crawling issues if there are many instances.
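One way to reduce these duplicates at the source is to normalise URLs before linking to them. The sketch below is a minimal example under several assumptions: that "ordering" refers to query parameter order, that the site serves paths case-insensitively, and that session IDs travel in parameters named as in the hypothetical `SESSION_PARAMS` set. The function name `normalize_url` is also illustrative.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical session parameter names; adjust to whatever the site uses.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}


def normalize_url(url):
    """Return one consistent form of a URL for internal linking."""
    parts = urlsplit(url)
    query_pairs = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS  # drop session IDs
    ]
    query_pairs.sort()  # fixed parameter order
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.lower(),  # assumption: paths are case-insensitive on this site
        urlencode(query_pairs),
        "",  # drop the fragment
    ))


print(normalize_url("https://Example.com/Shoes?size=9&color=red&sid=abc"))
```

Two URLs that differ only in parameter order, letter case, or session ID then map to the same string, which can be used both for internal links and as the canonical tag value.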


Search Console Reports Canonicalised Pages with Duplicate Titles

November 21, 2014 Source

Search Console will report pages as having duplicate titles, even if they have been canonicalised.


Order of Content for Canonicalization Doesn’t Matter

October 10, 2014 Source

When Google is checking whether pages are similar for the purpose of verifying canonicalization, the order of the content on the page doesn’t matter. Google can detect when the same content appears in a different order on a page, e.g. a set of identical search results in a different order.
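To make the idea of order-independent comparison concrete, here is a small sketch that treats each page as a list of content blocks and compares them as multisets. This is only an illustration of the concept; it is not how Google compares pages, and the function name `same_content_any_order` and the sample data are hypothetical.

```python
from collections import Counter


def same_content_any_order(blocks_a, blocks_b):
    """True if both pages hold the same blocks, regardless of order."""
    def normalise(block):
        # Ignore case and whitespace differences between blocks.
        return " ".join(block.lower().split())

    return Counter(map(normalise, blocks_a)) == Counter(map(normalise, blocks_b))


results_by_price = ["Red shoe", "Blue shoe", "Green shoe"]
results_by_name = ["Blue shoe", "Green shoe", "Red shoe"]
print(same_content_any_order(results_by_price, results_by_name))
```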


Use Disallow to Improve Crawling Efficiency

October 10, 2014 Source

John recommends against using a robots.txt disallow, because it prevents Google from consolidating authority signals, but says there are occasions when crawling efficiency is more important.
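The trade-off can be seen by testing a disallow rule with Python's standard `urllib.robotparser` module. The rule and URLs below are made up for illustration. A disallowed URL is never fetched, so any canonical tag on it cannot be read.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks internal search result pages.
rules = """
User-agent: *
Disallow: /search
""".strip().splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Blocked: saves crawl effort, but the page's canonical tag is never seen.
blocked = parser.can_fetch("Googlebot", "https://example.com/search?q=shoes")
# Allowed: the page is crawled and its signals can be consolidated.
allowed = parser.can_fetch("Googlebot", "https://example.com/shoes/")

print(blocked, allowed)
```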


Hreflang URLs Should Always Be Canonical URLs

October 10, 2014 Source

Don’t include any hreflang URLs that redirect, are non-indexable, or are canonicalised to another URL; otherwise they might be ignored.
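A check along these lines can be sketched against crawl data. The example below assumes a hypothetical `pages` dictionary that maps each URL to its HTTP status, an optional `noindex` flag, and its declared canonical; the function name and the reason strings are likewise illustrative.

```python
def invalid_hreflang_targets(hreflang_urls, pages):
    """Return {url: reason} for hreflang targets that may be ignored."""
    problems = {}
    for url in hreflang_urls:
        page = pages.get(url)
        if page is None:
            problems[url] = "not crawled"
        elif 300 <= page["status"] < 400:
            problems[url] = "redirects"
        elif page["status"] != 200 or page.get("noindex", False):
            problems[url] = "non-indexable"
        elif page.get("canonical") not in (None, url):
            problems[url] = "canonicalised to another URL"
    return problems


# Hypothetical crawl data for three language versions.
pages = {
    "https://example.com/en/": {"status": 200, "canonical": "https://example.com/en/"},
    "https://example.com/de/": {"status": 301},
    "https://example.com/fr/": {"status": 200, "canonical": "https://example.com/en/"},
}
print(invalid_hreflang_targets(list(pages), pages))
```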


Hreflang Should Canonicalise to Preferred HTTP/HTTPS Variation

September 22, 2014 Source

When you have multiple language sites with hreflang, and you have http and https versions of the sites, you don’t need to worry about the hreflang for the non-canonical version. So if you canonicalise from http to https, then you don’t need any hreflang on the http.


Canonicalised Pages Stay in Google’s Index

August 29, 2014 Source

Canonicalised pages may continue to show as indexed in site: searches, depending on the ‘site structure’. A canonical tag is not treated as strongly as a redirect, and the canonicalised page can still surface for its unique content. Canonical URLs are not crawled immediately, as a redirect target would be. John suggests that if you have a large number of incorrect canonical tags, such as many pages canonicalising to a single page, Google might ignore all canonical tags across the site. Google clearly recommends cleaning up broken canonical tags.
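A simple way to look for the "many pages canonicalising to a single page" pattern is to count how many pages point at each canonical target. The sketch below assumes a hypothetical mapping of page URL to declared canonical URL. The `max_sources` threshold is arbitrary and chosen only for this example, not a limit stated by Google.

```python
from collections import Counter


def overused_canonical_targets(canonical_by_page, max_sources=3):
    """Return targets that more than max_sources other pages point to."""
    counts = Counter(
        target
        for page, target in canonical_by_page.items()
        if target != page  # ignore self-referencing canonicals
    )
    return {target: n for target, n in counts.items() if n > max_sources}


# Hypothetical crawl data: four unrelated pages all point at the homepage.
canonical_by_page = {
    "/": "/",
    "/shoes/": "/",
    "/hats/": "/",
    "/bags/": "/",
    "/belts/": "/",
    "/shoes/red/": "/shoes/",
}
print(overused_canonical_targets(canonical_by_page))
```

Targets flagged this way are worth reviewing by hand, since some, such as a product page with many colour variants, may be legitimate.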

