Indexing

For web pages to be included in search results, they must be in Google’s index. Search engine indexing is a complex topic that depends on a number of different factors. Our Hangout Notes on indexing cover a range of best practice advice to ensure your website’s important pages are indexed by search engines.

Noindexed Pages May be Indexed if Google Receives Conflicting Canonicalization Signals

July 9, 2019 Source

If Google is receiving conflicting canonicalization signals about a page, then a noindexed version of the page may be indexed.
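
As a rough illustration, the sketch below flags one kind of mixed signal: a page that carries a noindex robots meta tag while its rel=canonical link points at a different URL. Treating this combination as "conflicting" is an assumption made for the example, not a definition taken from the hangout, and the names `SignalParser` and `has_mixed_signals` are hypothetical.

```python
from html.parser import HTMLParser


class SignalParser(HTMLParser):
    """Collects the robots meta directives and the rel=canonical target of a page."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            directives = [d.strip().lower() for d in attrs.get("content", "").split(",")]
            if "noindex" in directives:
                self.noindex = True
        elif tag == "link" and "canonical" in attrs.get("rel", "").lower().split():
            self.canonical = attrs.get("href") or None


def has_mixed_signals(page_url, html):
    """Return True if the page is noindexed AND canonicalizes to another URL.

    Assumption for this sketch: only this one combination is checked.
    """
    parser = SignalParser()
    parser.feed(html)
    points_elsewhere = parser.canonical is not None and parser.canonical != page_url
    return parser.noindex and points_elsewhere


# Hypothetical page markup used only for illustration.
SAMPLE_HTML = (
    '<html><head>'
    '<meta name="robots" content="noindex, follow">'
    '<link rel="canonical" href="https://example.com/b">'
    '</head><body></body></html>'
)

if __name__ == "__main__":
    print(has_mixed_signals("https://example.com/a", SAMPLE_HTML))
```

A check like this only surfaces candidates for review; which URL Google ultimately selects depends on signals that are not visible from a single page's markup.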


Disallowed Pages With Backlinks Can be Indexed by Google

July 9, 2019 Source

Pages blocked by robots.txt cannot be crawled by Googlebot. However, if a disallowed page has links pointing to it, Google can determine that it is worth indexing despite not being able to crawl the page.
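
To make the crawling half of this concrete, here is a minimal sketch that uses Python's standard `urllib.robotparser` to test whether a URL is disallowed for a given user agent. The robots.txt rules and example.com URLs are hypothetical, and the standard library parser may not match Googlebot's own matching behavior in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(robots_txt, user_agent, url):
    """Return True if the robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/private/report.html",
        "https://example.com/blog/",
    ):
        print(url, is_crawlable(ROBOTS_TXT, "Googlebot", url))
```

A result of `False` here only means the page cannot be crawled. As the note above says, it does not by itself keep the URL out of the index.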


Google May Index Redirected URLs if Served in Sitemap Files

June 28, 2019 Source

Redirects and sitemaps are both signals that Google uses to select preferred URLs. If you redirect to a destination URL but the source URL is listed in a sitemap, you are giving Google conflicting signals about which URL you want shown in search.
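
One way to catch this kind of conflict is to compare the URLs listed in a sitemap against a list of known redirects. The sketch below assumes you already have the redirect mapping (for example, exported from a crawl); the sitemap content, the `REDIRECTS` dictionary, and the function name are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap and redirect data used only for illustration.
SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/old-page</loc></url>
  <url><loc>https://example.com/new-page</loc></url>
</urlset>
"""

REDIRECTS = {
    "https://example.com/old-page": "https://example.com/new-page",
}


def redirected_urls_in_sitemap(sitemap_xml, redirects):
    """Return (source, destination) pairs for sitemap URLs that redirect."""
    root = ET.fromstring(sitemap_xml)
    listed = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [(url, redirects[url]) for url in listed if url in redirects]


if __name__ == "__main__":
    for source, destination in redirected_urls_in_sitemap(SITEMAP_XML, REDIRECTS):
        print(f"{source} is in the sitemap but redirects to {destination}")
```

Any pair this returns is a candidate for replacing the sitemap entry with the redirect destination, so that both signals point at the same URL.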


Internal Search Results Pages Should be Blocked Unless They Provide Unique Value

May 31, 2019 Source

Internal search results pages should be blocked from crawling because crawling them could overload the site’s server and they tend to be low quality. However, there may be instances where it makes sense to have these pages indexed if they provide value.


Ensure all Key Content is Available if You Are Streaming Content

May 28, 2019 Source

If a site is streaming content progressively to a page, John would recommend ensuring all key content is available immediately due to the method used to render content. Any additional content which is useful for users but not critical to be indexed can then be streamed progressively.


Googlebot No Longer Needs to Convert Hashbang URLs into Escaped Fragments

May 28, 2019 Source

Googlebot no longer converts hashbang URLs into escaped fragments as it is able to render and index them directly rather than using the pre-rendered version specified with the escaped fragment. Therefore, John would recommend moving to something that’s URL-based rather than hash-based.
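
For context, the conversion that Googlebot used to perform can be sketched as follows. This is a reconstruction of the now-deprecated AJAX crawling scheme from memory, not code from Google, and the function names are hypothetical: the part after `#!` was moved into an `_escaped_fragment_` query parameter, with a small set of special characters percent-encoded.

```python
def escape_fragment(fragment):
    """Percent-encode the characters the old scheme treated as special.

    Assumption for this sketch: control characters and space, '#', '%',
    '&', '+', and bytes 0x7F and above are escaped; everything else is kept.
    """
    escaped = []
    for byte in fragment.encode("utf-8"):
        if byte <= 0x20 or byte >= 0x7F or byte in (0x23, 0x25, 0x26, 0x2B):
            escaped.append(f"%{byte:02X}")
        else:
            escaped.append(chr(byte))
    return "".join(escaped)


def to_escaped_fragment_url(url):
    """Map a hashbang URL to its historical _escaped_fragment_ form."""
    base, marker, fragment = url.partition("#!")
    if not marker:
        return url  # No hashbang, nothing to convert.
    separator = "&" if "?" in base else "?"
    return f"{base}{separator}_escaped_fragment_={escape_fragment(fragment)}"


if __name__ == "__main__":
    print(to_escaped_fragment_url("https://example.com/#!/products/42"))
    print(to_escaped_fragment_url("https://example.com/page?lang=en#!tab=reviews&sort=new"))
```

Because this mapping is no longer applied, a pre-rendered page served at the `_escaped_fragment_` URL is not what gets indexed for the hashbang URL, which is part of why moving to regular URL paths is the simpler long-term option.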


The Indexing Issue From Last Week Has Been Resolved

May 28, 2019 Source

Last week’s issue, in which Google had trouble indexing new content, has now been resolved. If sites are still experiencing problems with the indexing of their content, these won’t be related to that issue.


Resource Loading Errors Within Mobile Friendly Test and GSC Don’t Reflect Indexing Errors

May 28, 2019 Source

Within the Mobile Friendly Test and the URL Inspection tool, Google fetches all of the page content and embedded URLs, including images, fonts and JavaScript, which can sometimes cause timeout issues. However, this shouldn’t affect the indexing of the content for search results, because indexing can rely on caching and older versions of files.


URL Removal Tool in GSC is Fastest Way to Remove a Test Site From Search Results

May 17, 2019 Source

While there are several ways to remove a staging site from organic search results, including blocking Googlebot from crawling it or returning 404 or 410 error codes, John recommends using the URL Removal tool in GSC to remove it quickly.


Related Topics

Crawling, Crawl Budget, Crawl Errors, Crawl Rate, Disallow, Sitemaps, Last Modified, Nofollow, Noindex, RSS, Canonicalization, Fetch and Render