Disallow

The disallow directive is added to a site's robots.txt file to instruct search engines not to crawl a page or section of the site. It blocks crawling, not indexing, so a disallowed page can still appear in search results. In our Hangout Notes, we explain how Google deals with disallow directives, with best practice advice and examples.
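As a rough illustration of what a disallow rule does at crawl time, the sketch below uses Python's standard-library robots.txt parser. The rules and the example.com URLs are made up for the example, and this parser is not Google's, so its matching may differ from Googlebot's in edge cases such as wildcards.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
RULES = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# A crawler that honours these rules would skip the first URL and may fetch the second.
blocked = parser.can_fetch("Googlebot", "https://example.com/private/page.html")
allowed = parser.can_fetch("Googlebot", "https://example.com/public/page.html")
print(blocked, allowed)
```

The check only answers "may this URL be crawled?". It says nothing about whether the URL can end up in the index.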

Disallow Rule Must Start with a Slash

February 26, 2016 Source

If you’re specifying a path in the robots.txt file, you must start with a slash, not a * wildcard. This was always true, but was only recently added to the documentation and Search Console testing tool.
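A minimal sketch of how this rule could be checked before a file is published. The function name `check_disallow_paths` and the sample rules are hypothetical, and the check covers only the leading slash, not the rest of the robots.txt syntax.

```python
def check_disallow_paths(robots_txt: str) -> list[tuple[int, str]]:
    """Return (line_number, path) for Disallow values that do not start with '/'."""
    problems = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if not sep or field.strip().lower() != "disallow":
            continue
        path = value.strip()
        # An empty Disallow value is valid and blocks nothing, so skip it.
        if path and not path.startswith("/"):
            problems.append((number, path))
    return problems


# Hypothetical rules: line 3 starts with a wildcard, line 4 is the corrected form.
SAMPLE = """\
User-agent: *
Disallow: /private/
Disallow: *.pdf$
Disallow: /*.pdf$
"""

print(check_disallow_paths(SAMPLE))
```

Here only the third line is reported, because its path begins with `*` rather than `/`.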


Disallowed URLs Can Be Indexed

January 29, 2016 Source

Even if a URL is disallowed it can still show up in the index.


Submit Updated Robots.txt via Search Console

August 25, 2015 Source

If you submit your robots.txt file via the Search Console robots.txt testing tool, Google will recrawl it immediately instead of waiting for the normal daily check.


Disallow Doesn’t Prevent Indexing

August 25, 2015 Source

A disallowed URL will be indexed and shown in results if Google has sufficient external signals.


Disallow Prevents PageRank from Being Passed

August 25, 2015 Source

PageRank can be inherited by a disallowed URL but can’t be passed on.


You Can Escape URLs in Robots.txt

August 25, 2015 Source

In robots.txt, you can escape URLs if you want; the escaped and unescaped forms are treated as equivalent.
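One way to picture this equivalence is to bring both forms of a path to a single percent-encoded form before comparing them. The helper `normalize_path` below is a hypothetical simplification, not Google's actual matching logic. In particular, it also turns an escaped slash (`%2F`) into a plain `/`, which a real parser may treat differently.

```python
from urllib.parse import quote, unquote


def normalize_path(path: str) -> str:
    """Decode any percent-escapes, then re-encode, so both forms compare equal."""
    # '/', '*' and '$' are left unescaped because they carry meaning in robots.txt rules.
    return quote(unquote(path), safe="/*$")


escaped = normalize_path("/caf%C3%A9/")
unescaped = normalize_path("/café/")
print(escaped, unescaped, escaped == unescaped)
```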


Noindex Pages Can’t Accumulate PageRank

November 7, 2014 Source

Noindex pages can’t accumulate PageRank for the site, even though the pages can be crawled, so noindex has no advantage over disallow in this respect.


Use Disallow to Improve Crawling Efficiency

October 10, 2014 Source

John recommends against blocking URLs in robots.txt, because it prevents Google from consolidating authority signals, but says there are occasions when crawling efficiency is more important.


Disallowed URLs Don’t Pass PageRank

September 12, 2014 Source

If a URL is disallowed in robots.txt, it won’t be crawled, and therefore can’t pass any PageRank.

