Canonicalization

Canonicalization is a method used to help prevent duplicate content issues and manage the indexing of URLs in search engines. Implementing the link attribute “rel=canonical” signals to search engines which page is preferred for indexing, and the signal will be respected in most cases when it is correctly implemented and points to an equivalent page. Our Hangout Notes on canonicalization provide best practice advice and insights into how it is handled by search engines.
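
For readers who have not seen the markup, here is a minimal sketch of what a canonical link element looks like when generated in Python. The helper name `canonical_link_tag` and the example URL are assumptions made for illustration; the element itself belongs in the page's `<head>`.

```python
from html import escape


def canonical_link_tag(preferred_url):
    """Build a <link rel="canonical"> element for the preferred URL.

    The URL is HTML-escaped so characters such as & are safe inside
    the href attribute.
    """
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


# Hypothetical example URL, used only for illustration.
print(canonical_link_tag("https://example.com/piano-lessons/"))
```

Every duplicate or near-duplicate version of the page would carry the same element, pointing at the single preferred URL.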

Unless locations have unique content offerings, separate pages are not recommended

December 6, 2021 Source

When asked whether to canonicalize so-called ‘doorway pages’, John was keen to stress that there’s no one solution that fits every situation. The example given was a site that has separate pages for ‘piano lessons birmingham’ and ‘piano lessons london’. If there’s something unique about the offerings in each city, it’s generally fine to have separate URLs. If the information on both is the same, it’s recommended to consider folding these into one ‘stronger’ page, rather than diluting signals across multiple near-identical ones. You could also consider a mix of the two approaches if there’s a stand-out, unique element in one of those locations.


Best practices for canonicals on paginated pages can depend on your wider internal linking structure

December 6, 2021 Source

John tackled one of the most common questions asked of SEOs: how should canonical attributes be handled on paginated pages? Ultimately, it depends on the site architecture. If internal linking is strong enough across the wider site, it’s feasible to canonicalize all paginated URLs to page 1 without content dropping from the index. However, if you rely on Google crawling pages 2, 3… and so on to find all of the content you want to be crawled, make sure that paginated URLs self-canonicalize.
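
The two options can be sketched as a small helper that decides which URL goes in the canonical tag of a paginated page. This is only an illustration: it assumes pagination is expressed as a `?page=N` query parameter, and the function name `paginated_canonical` and the `self_canonical` switch are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def paginated_canonical(url, self_canonical=True, page_param="page"):
    """Return the canonical URL to declare on a paginated page.

    self_canonical=True  -> each paginated URL points at itself, so
                            pages 2, 3, ... stay eligible for indexing.
    self_canonical=False -> every paginated URL points at page 1 by
                            dropping the (assumed) page parameter.
    """
    if self_canonical:
        return url
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != page_param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical category URL on page 3.
page_3 = "https://example.com/shoes?page=3"
print(paginated_canonical(page_3))                        # self-canonical
print(paginated_canonical(page_3, self_canonical=False))  # points to page 1
```

Which branch is appropriate depends on whether the content linked from deeper pages is reachable through other internal links.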


Make sure important content is not found only on canonicalized pages

November 17, 2021 Source

John answered a question about whether duplicate content that appears in some form on both the canonicalized page and the canonical page needs to match. He replied that they don’t need to have the exact same content. With a canonical tag, Google will try to index the canonical page that was specified. Any content that appears only on the non-canonical pages won’t be indexed, so make sure that any critical content from canonicalized pages is also on the canonical page.
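
One rough way to audit this is to compare the text blocks of a canonicalized page against those of its canonical page and list anything that exists only on the former. The sketch below assumes the page text has already been extracted into lists of strings (for example, one string per paragraph); the function name and the sample data are hypothetical.

```python
def content_only_on_duplicate(canonical_blocks, duplicate_blocks):
    """Return text blocks found on the canonicalized (duplicate) page
    that are missing from the canonical page.

    Blocks are compared case-insensitively with whitespace collapsed,
    so trivial formatting differences are ignored.
    """
    def normalize(text):
        return " ".join(text.lower().split())

    on_canonical = {normalize(block) for block in canonical_blocks}
    return [
        block for block in duplicate_blocks
        if normalize(block) not in on_canonical
    ]


# Hypothetical extracted paragraphs.
canonical_page = ["Piano lessons for all ages.", "Book a free trial."]
duplicate_page = [
    "Piano lessons for  all ages.",
    "Weekend group classes available.",
]

# Anything listed here would be at risk of not being indexed.
print(content_only_on_duplicate(canonical_page, duplicate_page))
```

Exact block matching is a deliberately simple choice; a real audit might need fuzzier comparison.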


The URL parameter tool does not prevent pages from being crawled

October 30, 2021 Source

John explained that any URLs set to be ignored within the URL Parameter tool may still be crawled, albeit at a much slower rate. Parameter rules set in the tool can also help Google to make decisions on which canonical tags should be followed.


Anything Contained on Non-canonical Pages Will Not Be Used for Indexing Purposes

February 7, 2020 Source

When Google picks a canonical for a page, it understands that there is a set of pages, but it focuses only on the content and links of the canonical page. Anything that is only contained on the non-canonical versions will not be used for indexing purposes. If you have content on those pages that you would like to be indexed, John recommends ensuring the pages themselves are different.


Review Canonical Signals if Google Are Continually Picking a Different Canonical to the Ones Set

January 31, 2020 Source

Google may occasionally pick a canonical that is different from the one that has been set for certain pages, but this doesn’t change anything from a ranking point of view. However, if you’re seeing this on a large scale, John recommends reviewing whether you are sending confusing signals to Google.
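
To judge whether this is happening "on a large scale", it can help to measure how often the declared and Google-selected canonicals disagree. The sketch below assumes you already have that data as a list of dictionaries with `url`, `declared` and `selected` keys; that structure, the function names and the sample rows are all hypothetical, and no particular threshold is implied.

```python
def canonical_mismatches(rows):
    """Return rows where Google's selected canonical differs from
    the canonical declared on the page. Rows with a missing value
    on either side are skipped."""
    return [
        row for row in rows
        if row["declared"] and row["selected"]
        and row["declared"] != row["selected"]
    ]


def mismatch_share(rows):
    """Fraction of all rows whose canonicals disagree (0.0 if empty)."""
    return len(canonical_mismatches(rows)) / len(rows) if rows else 0.0


# Hypothetical audit data.
rows = [
    {"url": "https://example.com/a?ref=x",
     "declared": "https://example.com/a",
     "selected": "https://example.com/a"},
    {"url": "https://example.com/b",
     "declared": "https://example.com/b",
     "selected": "https://example.com/b-old"},
    {"url": "https://example.com/c",
     "declared": "https://example.com/c",
     "selected": "https://example.com/c"},
]

for row in canonical_mismatches(rows):
    print(row["url"], "->", row["selected"])
print(f"{mismatch_share(rows):.0%} of URLs have a different canonical selected")
```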


No Need to Remove Internal Links on Non-canonical Pages as Google is Able to Figure Out Connections

January 24, 2020 Source

Google sees links from a canonical page to a canonicalised page, and sometimes there can be multiple internal links that are associated with each. In this case, Google will combine all of the signals and keep them with each page, but is able to understand the connection between the canonical and canonicalised pages.


Avoid Providing Google with Conflicting Canonical Tags When Working on JavaScript Sites

January 10, 2020 Source

If you have a JavaScript site, John recommends making sure that the static HTML page you deliver doesn’t have a canonical tag on it. Instead use JavaScript to add it, in order to avoid providing Google with different information. Google is able to pick the canonical up after rendering the page in order to process and use it.
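
A simple check for this kind of conflict is to compare the canonical tags in the static HTML with those in the rendered HTML. The sketch below uses Python's standard `html.parser`; it assumes you have already captured both versions of the page as strings (how the rendered version is obtained is outside its scope), and the names `find_canonicals` and `canonical_conflict` are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(html):
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


def canonical_conflict(static_html, rendered_html):
    """True when the static HTML declares a canonical that does not
    match what is present after rendering. A static page with no
    canonical at all (the setup John recommends) never conflicts."""
    static = set(find_canonicals(static_html))
    rendered = set(find_canonicals(rendered_html))
    return bool(static) and static != rendered


# Hypothetical snapshots of the same page before and after rendering.
static_page = '<head><link rel="canonical" href="https://example.com/"></head>'
rendered_page = '<head><link rel="canonical" href="https://example.com/product/1"></head>'
print(canonical_conflict(static_page, rendered_page))
```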


Google Will Usually Drop Session IDs from URLs

January 7, 2020 Source

Instead of choosing a representative URL for a set of URLs with session IDs, Google will usually drop the session ID from the URLs completely if it recognises that they don’t return any unique content.
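
The effect can be illustrated with a small URL normalizer that removes session parameters, so that URLs differing only by session ID collapse to one. This is not Google's algorithm, only a sketch of the idea; the parameter names in `SESSION_PARAMS` and the function name are assumptions and would need to match whatever a given site actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; adjust for the site in question.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def strip_session_id(url):
    """Return the URL with any (assumed) session ID query parameters
    removed, leaving all other parameters in their original order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical URLs that differ only by session ID.
urls = [
    "https://example.com/cart?item=42&sessionid=abc123",
    "https://example.com/cart?item=42&sessionid=xyz789",
]
print({strip_session_id(url) for url in urls})
```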


Related Topics

Crawling, Indexing, Crawl Budget, Crawl Errors, Crawl Rate, Disallow, Sitemaps, Last Modified, Nofollow, Noindex, RSS, Fetch and Render