A noindex directive, set through a robots meta tag or an X-Robots-Tag HTTP header, instructs search engines not to include a page in their index, which prevents it from appearing in search results. Our Hangout Notes explain the use of this directive, along with further advice from Google and real-world examples.
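As a rough illustration of where the directive lives, the sketch below checks a fetched page for a noindex in either a robots meta tag or an `X-Robots-Tag` response header, using only the Python standard library. The function name `has_noindex` and the sample inputs are hypothetical. It is deliberately simplified: it only reads the `robots` and `googlebot` meta names, and it ignores user-agent-prefixed header values and the `none` shorthand.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            content = attr_map.get("content", "")
            # Directives are comma-separated, e.g. "noindex, follow".
            self.directives.extend(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noindex(html, headers=None):
    """Return True if the HTML or the response headers carry a noindex directive."""
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            if "noindex" in [part.strip().lower() for part in value.split(",")]:
                return True
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

For example, `has_noindex('<meta name="robots" content="noindex, follow">')` returns `True`, as does `has_noindex("<html></html>", {"X-Robots-Tag": "noindex"})`.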

Showing less content to search engines than to users isn’t necessarily a cloaking issue

November 17, 2021 Source

John was asked about a website with a lot of noindexed pages that returned HTTP errors. The questioner wanted to know whether it’s considered ‘cloaking’ to show bots an empty HTML page to get those URLs de-indexed, while still showing users the page.

John explained that the part of ‘cloaking’ that is an issue is when search engines get more content than users, or vastly different content. Google wants to avoid promising users something they can’t find when they reach a page from a query. Showing bots an empty page with a noindex simply causes Google to drop those URLs, and because the page will then not appear in search results, Google will not care if users see something different.

Having a high ratio of ‘noindex’ vs indexable URLs could affect website crawlability

November 17, 2021 Source

Having noindex URLs normally does not affect how Google crawls the rest of your website—unless you have a large number of noindexed pages that need to be crawled in order to reach a small number of indexable pages.

John gave the example of a website with millions of pages, 90% of them noindexed. Because Google needs to crawl a page before it can see the noindex, it could get bogged down crawling millions of pages just to find the comparatively few indexable ones. If you have a normal ratio of indexable to noindexed URLs and the indexable ones can be discovered quickly, he doesn’t see an issue for crawlability. The concern is not about quality; it is a technical one, caused by the high number of URLs that need to be crawled to see what is there.
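To make the arithmetic concrete, here is a small sketch of how the crawl overhead grows with the share of noindexed URLs. The function name and the figure of 2,000,000 URLs are hypothetical, and the model assumes, as a simplification, that every URL has to be fetched exactly once before its noindex can be seen.

```python
def crawl_overhead(total_urls, noindex_share):
    """Return (noindexed, indexable, fetches needed per indexable page found)."""
    noindexed = round(total_urls * noindex_share)
    indexable = total_urls - noindexed
    # Every URL must be fetched once, whether or not it ends up indexed.
    fetches_per_indexable = total_urls / indexable if indexable else float("inf")
    return noindexed, indexable, fetches_per_indexable


# Hypothetical site: 2,000,000 URLs, 90% of them noindexed.
noindexed, indexable, ratio = crawl_overhead(2_000_000, 0.9)
print(noindexed, indexable, ratio)  # 1800000 200000 10.0
```

Under these assumptions, Google would fetch ten URLs for every page that can actually be indexed, which is the kind of imbalance described above.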

Speed up re-crawling of previously noindexed pages by temporarily linking to them on important pages

November 17, 2021 Source

Temporarily linking to previously noindexed URLs from important pages (such as the homepage) can speed up recrawling of those URLs if crawling has slowed down due to the earlier presence of a noindex tag. The example given was of previously noindexed product pages, and John’s suggestion was to link to them for a couple of weeks via a special product section on the homepage. Google will see the internal linking changes and then crawl the linked-to URLs, and the links help to show that these pages are important relative to the rest of the website. However, he also stated that significant changes to internal linking can cause other parts of your site which are barely indexed to drop out of the index. This is why he suggests using these links as a temporary measure to get the pages recrawled at the regular rate, before changing the linking back.

If a page is noindexed for a long period of time, crawling will slow down

November 17, 2021 Source

Having a page set to noindex for a long time will cause Google to crawl it less often. Once the page is indexable again, crawling will pick up, but it can take time for that initial recrawl to happen. John also mentioned that Search Console reports can make the situation look worse than it actually is, and that you can use things like sitemaps and internal linking to speed up recrawling of these pages.

To better control page indexing, use ‘noindex’ on pages rather than ‘nofollow’ tags on internal links

November 1, 2021 Source

Adding rel="nofollow" attributes to internal links is not recommended as a way to control indexing. Instead, John suggests adding noindex tags to pages that you don’t want indexed, or removing internal links to them altogether.
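One way to audit a site for this pattern is to list the internal links on a page that carry rel="nofollow", so they can be reviewed and the target pages noindexed or unlinked instead. The following is a minimal sketch using the Python standard library; the class and function names and the example.com URLs are hypothetical, and "internal" is assumed here to mean "same host as the page".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class NofollowLinkFinder(HTMLParser):
    """Collects same-host links that carry rel="nofollow"."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.host = urlparse(page_url).netloc
        self.internal_nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        href = attr_map.get("href")
        if not href:
            return
        # rel holds space-separated tokens, e.g. rel="nofollow noopener".
        rel_tokens = attr_map.get("rel", "").lower().split()
        target = urljoin(self.page_url, href)
        if "nofollow" in rel_tokens and urlparse(target).netloc == self.host:
            self.internal_nofollow.append(target)


def find_internal_nofollow(page_url, html):
    """Return absolute URLs of internal links marked nofollow on the page."""
    finder = NofollowLinkFinder(page_url)
    finder.feed(html)
    return finder.internal_nofollow
```

Called with `"https://www.example.com/"` and HTML containing `<a href="/cart" rel="nofollow">Cart</a>`, it returns `["https://www.example.com/cart"]`; external nofollow links and ordinary internal links are ignored.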

Noindexed pages generally do not count towards content quality algorithms

October 30, 2021 Source

Google focuses on the quality of the content they have indexed. If it’s not shown in search, it’s generally not taken into account.

Allow a Single Variation of Category Pages to be Indexed

March 17, 2020 Source

Google doesn’t currently have guidelines on indexing different versions of category pages, but it is moving towards recommending that a single version, such as one sort order, is allowed to be indexed, while the alternative variations with different filters and sort orders are noindexed. If there are other specific versions of category pages which are important, you can allow the first page in the set to be indexed as well.
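A rule like this could be expressed as a small function that decides whether a category URL should be indexable. The sketch below is only one possible interpretation: the parameter name `page`, the `IMPORTANT_VARIATIONS` allowlist and the shop.example URLs are all hypothetical, and it assumes that variations are expressed as query parameters and that paginated pages of the default version stay indexable.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical allowlist of filter combinations considered important enough to index.
IMPORTANT_VARIATIONS = [{"color": "red"}]


def is_indexable_category_url(url):
    """Return True if this category URL should be left indexable."""
    query = {key: values[0] for key, values in parse_qs(urlparse(url).query).items()}
    page = query.pop("page", "1")
    if not query:
        # Default version of the category (no filters or sort order applied).
        return True
    # Important variations: only the first page in the set is indexable.
    return query in IMPORTANT_VARIATIONS and page == "1"
```

With these assumptions, `https://shop.example/shoes` and `https://shop.example/shoes?color=red` are indexable, while `https://shop.example/shoes?sort=price_asc` and `https://shop.example/shoes?color=red&page=2` would receive a noindex.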

Google Will Ignore Links on Noindexed Pages Over Time

March 17, 2020 Source

If pages are noindexed, Google will ignore the links on them over time. If you have pages which are only linked from noindexed pages, then Google may not see the linked pages as important.
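Given crawl data, the pages at risk can be found by checking which indexable URLs receive all of their internal links from noindexed pages. The following sketch assumes a hypothetical input format: `links` maps each source URL to the URLs it links to, and `noindexed` is the set of URLs carrying a noindex.

```python
def only_linked_from_noindex(links, noindexed):
    """Return indexable URLs whose inbound internal links all come from noindexed pages."""
    noindexed = set(noindexed)
    inbound = {}
    for source, targets in links.items():
        for target in targets:
            inbound.setdefault(target, set()).add(source)
    return sorted(
        target
        for target, sources in inbound.items()
        if target not in noindexed and sources <= noindexed
    )


# Hypothetical crawl data.
links = {
    "/": ["/category", "/archive"],
    "/archive": ["/old-product", "/category"],
    "/category": ["/new-product"],
}
print(only_linked_from_noindex(links, {"/archive"}))  # ['/old-product']
```

Here `/old-product` is flagged because its only inbound link comes from the noindexed `/archive` page, whereas `/category` is also linked from the homepage.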

Google May Treat Noindex Pages as Soft 404

March 6, 2020 Source

Google may treat a noindexed page as a soft 404; the two are treated equivalently in search results. If you want such pages to be re-indexed, you need to let Google know that they have changed, for example by submitting them in a sitemap with a last modified date.
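A sitemap entry signals the last modified date through the `<lastmod>` element of the sitemaps.org protocol. The sketch below builds a minimal sitemap with the Python standard library; the function name, URL and date are hypothetical, and a real sitemap would normally be written to a file and submitted through Search Console or referenced in robots.txt.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, last_modified in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # lastmod uses the W3C date format, e.g. 2020-03-06.
        ET.SubElement(url, "lastmod").text = last_modified.isoformat()
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical page that was recently made indexable again.
print(build_sitemap([("https://www.example.com/product-a", date(2020, 3, 6))]))
```

The date should reflect when the page actually changed, for instance when the noindex was removed.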
