Crawling

Before a page can be indexed, and therefore appear in search results, it must be crawled by a search engine crawler such as Googlebot. There are many things to consider to get pages crawled and to ensure they adhere to the correct guidelines. These are covered in our Hangout Notes, along with further research and recommendations.

URLs in JavaScript May Be Crawled

May 19, 2015 Source

JavaScript variables that look like URLs may be crawled, which can generate server errors. You can either ignore these errors or block the URLs with robots.txt.
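As a minimal sketch of the robots.txt option, the snippet below checks a rule set with Python's standard `urllib.robotparser`. The `/api/internal/` path and the example.com URLs are hypothetical, standing in for whatever URL-like string appears in your JavaScript. Note that this parser uses simple prefix matching, so it may not agree with Googlebot on every rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: /api/internal/ stands in for a URL-like string
# found in a JavaScript variable that returns server errors when requested.
ROBOTS_TXT = """\
User-agent: *
Disallow: /api/internal/
"""


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the rules above allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawlable("https://example.com/api/internal/track"))  # False
print(is_crawlable("https://example.com/products/"))           # True
```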


HTML Crawling Faster Than JavaScript for Page Discovery

April 24, 2015 Source

JavaScript processing takes longer than pure HTML crawling, so it isn’t suitable for fast discovery of pages. John says ‘it takes another cycle or two longer to process’.
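To illustrate why, here is a rough sketch of link discovery from raw HTML alone, using Python's standard `html.parser`. It is not how Googlebot works internally; the page markup and paths are made up for the example. A link written directly in the HTML is found immediately, while one that is only created by a script stays invisible until the page is rendered.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags present in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def discover_links(html):
    """Return the links visible without executing any JavaScript."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Hypothetical page: one static link, one link injected by JavaScript.
PAGE = """
<html><body>
<a href="/category/shoes">Shoes</a>
<script>
  var link = document.createElement("a");
  link.href = "/category/hats";
  document.body.appendChild(link);
</script>
</body></html>
"""

print(discover_links(PAGE))  # ['/category/shoes']
```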


Image Re-Crawling Takes Longer After a URL Change

March 27, 2015 Source

Images are not crawled very frequently, so when you migrate them to new URLs or domains, re-crawling will take a lot longer than it does for pages, perhaps months.


Wildcard Subdomain Configuration Causes Crawl Issues

December 23, 2014 Source

Using wildcard subdomains can make a site difficult to crawl.


CSS and JS Crawling Is Important for Mobile Compatibility

December 16, 2014 Source

Allowing your CSS and JavaScript files to be crawled does affect desktop pages, but it is more important for mobile pages, because Google needs those files to test for mobile compatibility.


Noindex Pages Can’t Accumulate PageRank

November 7, 2014 Source

Noindex pages can’t accumulate PageRank for the site, even though the pages can be crawled. So in this respect noindex isn’t an advantage over disallowing.


Disallowed URLs Don’t Pass PageRank

September 12, 2014 Source

If a URL is disallowed in robots.txt, it won’t be crawled, and therefore can’t pass any PageRank.
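The difference between the two states can be sketched in Python. This is an illustrative check only, with a hypothetical robots.txt, URLs, and helper names: a disallowed URL is never fetched, so its markup is never seen, while a noindex page is fetched and its robots meta tag is read.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = (attr.get("content") or "").lower()
            directives = [d.strip() for d in content.split(",")]
            if "noindex" in directives:
                self.noindex = True


def crawl_state(url, html, robots_txt, user_agent="Googlebot"):
    """Classify a URL as 'disallowed', 'noindex', or 'indexable'."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, url):
        return "disallowed"  # never fetched, so any meta tag goes unseen
    meta = RobotsMetaParser()
    meta.feed(html)
    return "noindex" if meta.noindex else "indexable"


# Hypothetical rules and page markup for the example.
ROBOTS_TXT = "User-agent: *\nDisallow: /private/\n"
NOINDEX_PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'

print(crawl_state("https://example.com/private/a", NOINDEX_PAGE, ROBOTS_TXT))  # disallowed
print(crawl_state("https://example.com/tags/a", NOINDEX_PAGE, ROBOTS_TXT))     # noindex
print(crawl_state("https://example.com/blog/a", "<html></html>", ROBOTS_TXT))  # indexable
```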


Google Needs to Be Able to Crawl CSS and JavaScript

September 8, 2014 Source

Don’t block your CSS and JavaScript from being crawled.
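One way to check for this, sketched here under assumed file paths and an assumed robots.txt, is to test each stylesheet and script URL against your rules with Python's standard `urllib.robotparser`. The parser matches rules by simple prefix, so treat the result as a first pass and not as a definitive answer about Googlebot's behaviour.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that accidentally blocks the script directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /assets/js/
"""

# Hypothetical CSS and JavaScript files referenced by a page.
RESOURCES = [
    "https://example.com/assets/css/site.css",
    "https://example.com/assets/js/app.js",
]


def blocked_resources(resource_urls, robots_txt, user_agent="Googlebot"):
    """Return the resource URLs that robots_txt forbids user_agent to fetch."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return [url for url in resource_urls if not rules.can_fetch(user_agent, url)]


print(blocked_resources(RESOURCES, ROBOTS_TXT))
# ['https://example.com/assets/js/app.js']
```

An empty list means none of the listed files are blocked by the rules supplied.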


Related Topics

Indexing, Crawl Budget, Crawl Errors, Crawl Rate, Disallow, Sitemaps, Last Modified, Nofollow, Noindex, RSS, Canonicalization, Fetch and Render