Removing Low-Quality Pages Takes Months to Impact Crawling and Site Quality
Removing low-quality pages from your site may have a positive impact on crawling of the rest of the site, but it could take 3-9 months before you see changes in crawling, which you can measure using log files. Improvements in overall site quality may take even longer to have an impact. It’s unusual for removing cruft content to have any negative impact.
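One way to watch for a change over those months is to count Googlebot requests per month in the access logs. The sketch below assumes logs in the common "combined" format with the user agent as the last quoted field; the function name and sample lines are hypothetical, and matching on the user-agent string alone does not verify that a request really came from Google.

```python
import re
from collections import Counter
from datetime import datetime

# Assumes combined log format: [timestamp] ... "user agent" at the end of the line.
TIME_AND_UA = re.compile(r'\[(?P<time>[^\]]+)\].*"(?P<ua>[^"]*)"\s*$')

def googlebot_hits_per_month(lines):
    """Count requests whose user agent mentions Googlebot, keyed by YYYY-MM."""
    counts = Counter()
    for line in lines:
        m = TIME_AND_UA.search(line)
        if not m or "Googlebot" not in m.group("ua"):
            continue
        ts = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        counts[ts.strftime("%Y-%m")] += 1
    return dict(counts)

# Hypothetical sample lines for illustration only.
monthly_sample = [
    '66.249.66.1 - - [05/Jan/2020:06:00:01 +0000] "GET /a HTTP/1.1" 200 312 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [20/Jan/2020:06:00:01 +0000] "GET /b HTTP/1.1" 200 312 "-" "Googlebot/2.1"',
    '203.0.113.9 - - [21/Jan/2020:06:00:01 +0000] "GET /b HTTP/1.1" 200 312 "-" "Mozilla/5.0"',
    '66.249.66.1 - - [02/Feb/2020:06:00:01 +0000] "GET /a HTTP/1.1" 200 312 "-" "Googlebot/2.1"',
]
print(googlebot_hits_per_month(monthly_sample))
```

Comparing these monthly counts before and after the removal gives a rough picture of whether crawling has shifted towards the remaining pages.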
Average Fetch Time May Be Affected by Groups of Slower Pages
If Google spends more of its time crawling a particular group of slow pages, the average fetch time and crawl data may look worse overall, even if no individual page has become slower.
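The effect is simple weighted-average arithmetic. In the sketch below the request counts and fetch times are made-up figures, and `average_fetch_time` is a hypothetical helper, not something Search Console provides: the per-page speeds stay the same, yet the overall average rises when a larger share of requests goes to the slow group.

```python
def average_fetch_time(groups):
    """groups: list of (request_count, mean_fetch_ms) per page group.
    Returns the overall mean fetch time in milliseconds."""
    total_requests = sum(n for n, _ in groups)
    total_time = sum(n * ms for n, ms in groups)
    return total_time / total_requests

# Hypothetical figures: fast pages at 200 ms, a slow group at 2000 ms.
before = average_fetch_time([(900, 200), (100, 2000)])  # 10% of requests hit slow pages
after = average_fetch_time([(600, 200), (400, 2000)])   # 40% of requests hit slow pages
print(before, after)  # 380.0 920.0
```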
Rendered Page Resources Are Included in Google’s Crawl Rate
The resources that Google fetches when it renders a page count towards Google’s crawl budget and are reported in the Crawl Stats data in Search Console.
Algorithm Changes May Result in Changes to Crawl Rate
The number of pages that Google wants to crawl may change when its algorithms change. This may be because some pages are now considered less important to show in search results, or because of crawling optimization improvements.
Specify Timezone Formats Consistently Across Site & Sitemaps
Google is able to understand different timezone formats, for example, UTC vs GMT. However, it’s important to use one timezone format consistently across a site and its sitemaps to avoid confusing Google.
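A minimal sketch of one way to keep the format consistent, assuming you choose to normalize every date to UTC and write it in the W3C Datetime form used by the sitemap `lastmod` field. The choice of UTC and the helper name `to_lastmod` are assumptions for illustration; any single format used everywhere would serve the same purpose.

```python
from datetime import datetime, timezone, timedelta

def to_lastmod(dt):
    """Format a timezone-aware datetime as a UTC W3C Datetime string."""
    if dt.tzinfo is None:
        # Refuse to guess: a naive datetime carries no timezone information.
        raise ValueError("naive datetime: timezone is unknown")
    return dt.astimezone(timezone.utc).isoformat(timespec="seconds")

# Hypothetical publish time recorded in a UTC+02:00 local zone.
local = datetime(2020, 3, 10, 9, 30, tzinfo=timezone(timedelta(hours=2)))
print(to_lastmod(local))  # 2020-03-10T07:30:00+00:00
```

Routing both page markup and sitemap generation through one helper like this means the two cannot drift apart.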
If a Robots.txt File Returns a Server Error for a Brief Period of Time Google Will Not Crawl Anything From the Site
If a robots.txt file returns a server error for a brief period of time, Google will not crawl anything from the website until it can access the file and crawl normally again. While it is unable to reach the file, Google assumes all URLs are blocked and flags this in GSC. You can use the robots.txt requests in your server logs to identify when this occurred by reviewing the response code and response size returned for each request.
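The log review described above can be sketched as follows, assuming access logs in the common or combined log format. The function name and sample lines are hypothetical, and real log layouts vary by server configuration.

```python
import re

# Assumed layout: host ident user [time] "METHOD path PROTOCOL" status size ...
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def robots_txt_errors(lines):
    """Return (time, status, size) for robots.txt requests answered with a 5xx code."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or m.group("path").split("?")[0] != "/robots.txt":
            continue
        status = int(m.group("status"))
        size = 0 if m.group("size") == "-" else int(m.group("size"))
        if 500 <= status <= 599:
            hits.append((m.group("time"), status, size))
    return hits

# Hypothetical sample lines for illustration only.
sample = [
    '66.249.66.1 - - [10/Mar/2020:06:00:01 +0000] "GET /robots.txt HTTP/1.1" 200 312 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [10/Mar/2020:07:00:02 +0000] "GET /robots.txt HTTP/1.1" 503 0 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [10/Mar/2020:07:05:00 +0000] "GET /page HTTP/1.1" 200 5120 "-" "Googlebot/2.1"',
]
print(robots_txt_errors(sample))
```

The timestamps of the returned entries show when the file was unreachable, which can then be compared against any drop in crawling of other URLs.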
It is Normal for Google to Occasionally Crawl Old URLs
Due to its rendering processes, Google will occasionally re-crawl old URLs in order to check their setup. You may see this within your log files, but it is normal and will not cause any problems.
Having a Reasonable Amount of HTML Comments Has No Effect on SEO
Comments within the HTML of a page do not have any effect on SEO unless there are a very large number of them, in which case they can make it difficult for Google to figure out where the content is and may affect the size and speed of the page. However, John confirmed he has never come across a page where HTML comments have been a problem.
Upper Limit For Recrawling Pages is Six Months
Google tends to recrawl pages at least once every six months, so six months is roughly the longest a page goes between crawls.