How to Debug Blocked Crawls in Deepcrawl

Adam Gent

On 17th September 2019 • 5 min read

The rapid rise in malicious bots crawling the web has caused hosting companies, content delivery networks (CDNs) and server admins to block bots they do not recognize in log file data. Unfortunately, this means that Deepcrawl can be accidentally blocked by a client’s website.

In this guide, we will help you identify indicators that a crawl is being blocked and provide solutions to unblock Deepcrawl and allow you to crawl your site.

How to identify that a crawl is being blocked

The most common indicators of a crawl being blocked are listed below.

Unauthorized and Forbidden crawl errors

If a server is blocking Deepcrawl from accessing a website, then reports will display a large number of URLs with HTTP 401 Unauthorized and 403 Forbidden responses.

[Screenshot: Unauthorized and Forbidden crawl errors]

To find URLs with these errors navigate to the Unauthorized Pages report.
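You can also sanity-check this kind of block outside of Deepcrawl by requesting a URL yourself and inspecting the status code. Below is a minimal sketch using Python's standard library; the URL and user-agent values are placeholders, not Deepcrawl's actual defaults:

```python
from urllib import request, error

# 401 and 403 are the status codes that typically indicate a blocked crawl.
BLOCKED_STATUSES = {401, 403}

def is_blocked_status(code):
    """Return True for status codes that typically indicate a block."""
    return code in BLOCKED_STATUSES

def fetch_status(url, user_agent):
    """Fetch a URL with a given User-Agent header and return the HTTP status code."""
    req = request.Request(url, headers={"User-Agent": user_agent})
    try:
        with request.urlopen(req, timeout=10) as resp:
            return resp.status
    except error.HTTPError as exc:
        return exc.code  # urllib raises on 4xx/5xx; the status code is still available

# Example (placeholder URL): compare a plain browser UA with a crawler UA.
# If the browser succeeds but the crawler UA gets 401/403, the server is
# blocking based on user agent.
# browser_code = fetch_status("https://example.com/", "Mozilla/5.0")
```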

Too Many Requests

If Deepcrawl is sending requests faster than your server is willing to accept, then reports will display a large number of URLs with HTTP 429 Too Many Requests responses.

[Screenshot: Too Many Requests errors]

To find URLs with these errors navigate to the Uncategorized HTTP Response Codes report.
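When a server returns 429, it often includes a Retry-After header telling clients how long to back off. The header may hold either a number of seconds or an HTTP date, so a small helper like the following sketch (assumed, not part of Deepcrawl) can convert it into a delay:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def retry_after_seconds(value, now=None):
    """Convert a Retry-After header value into seconds to wait.

    The header may be a number of seconds ("120") or an
    HTTP-date ("Wed, 21 Oct 2015 07:28:00 GMT").
    """
    value = value.strip()
    if value.isdigit():
        return float(value)
    when = parsedate_to_datetime(value)  # parses the HTTP-date form
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```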

Slow crawl and connection timeout errors

In this scenario, your crawl will be successful initially, but it will progressively slow down and eventually appear to stop running completely.

Any URLs attempted while the crawl is in this state will show connection timeout errors in Deepcrawl’s reports.
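One way to spot this pattern in your own data is to compare the crawl rate at the start of the crawl with the rate at the end; a sharp drop suggests the server began throttling or dropping connections. A hedged sketch (the timestamps are hypothetical epoch seconds, one per crawled URL):

```python
def crawl_rate(timestamps, window):
    """URLs crawled per second over the (start, end) window."""
    start, end = window
    hits = [t for t in timestamps if start <= t < end]
    span = end - start
    return len(hits) / span if span > 0 else 0.0

def is_slowing_down(timestamps, factor=4.0):
    """Flag a crawl whose final-quarter rate is `factor`x slower than its first quarter."""
    if len(timestamps) < 2:
        return False
    t0, t1 = min(timestamps), max(timestamps)
    quarter = (t1 - t0) / 4
    early = crawl_rate(timestamps, (t0, t0 + quarter))
    late = crawl_rate(timestamps, (t1 - quarter, t1 + 1e-9))
    return late < early / factor
```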

Why does Deepcrawl get blocked?

The reason for your crawl being blocked is most likely one of the following:

- The server, CDN or firewall does not recognize Deepcrawl’s user agent and blocks it by default.
- Deepcrawl is sending requests faster than the server is willing to accept.
- The site blocks requests that use a Googlebot user agent but do not originate from a Google IP address.
- The site uses JavaScript-based bot detection, which blocks crawlers that do not render pages.

How can I remove the block and crawl my website?

Our team recommends the following solutions if you suspect that Deepcrawl is being blocked.

Whitelist our IP

Provided you have access to the site’s server administrators, you can ask them to whitelist the default IP address that Deepcrawl uses to crawl:

[Screenshot: whitelisting Deepcrawl’s IP address]

Change user-agent in project settings

Some websites will block requests which come from a Googlebot user-agent (Deepcrawl’s default user-agent) but do not originate from a Google IP address. In this scenario, selecting a different user agent in a crawl’s advanced project settings often makes the crawl succeed.
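The reason this check works for servers is that genuine Googlebot traffic can be verified with a reverse DNS lookup: the IP should reverse-resolve to a googlebot.com or google.com hostname, and that hostname should forward-resolve back to the same IP. A spoofed Googlebot user agent fails this test. A sketch of the logic (resolvers are injectable parameters so it can be tested without live DNS; this is an illustration, not Deepcrawl code):

```python
import socket

def is_genuine_googlebot(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname):
    """Reverse-DNS check that servers commonly use to validate a Googlebot claim.

    1. Reverse-resolve the IP; the hostname must end in googlebot.com or google.com.
    2. Forward-resolve that hostname; it must map back to the same IP.
    """
    try:
        hostname = reverse(ip)[0]
    except OSError:
        return False
    if not hostname.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        return forward(hostname) == ip
    except OSError:
        return False
```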

[Screenshot: changing the user-agent in project settings]

Stealth Mode

Use Deepcrawl’s ‘Stealth Mode’ feature, which can be found in a crawl’s advanced project settings. Stealth mode crawls a website slowly using a large pool of IP addresses and user agents. This typically avoids many types of bot detection.

[Screenshot: Stealth Mode setting]

Enable JavaScript rendering

Certain websites attempt to use JavaScript to block crawlers that do not execute the page. This type of block can normally be circumvented by enabling our JavaScript Renderer.
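A common symptom of this type of block is a 200 response whose body is only a short interstitial page asking the visitor to turn on JavaScript. A rough heuristic for spotting such responses (the marker strings and length threshold are assumptions for illustration, not a definitive detection method):

```python
# Phrases that frequently appear on JavaScript-challenge interstitials (assumed examples).
CHALLENGE_MARKERS = (
    "enable javascript",
    "checking your browser",
    "javascript is required",
)

def looks_like_js_challenge(html, min_length=2000):
    """Heuristic: a very short 200 response carrying a 'turn on JavaScript'
    style message is probably an interstitial, not the real page."""
    body = html.lower()
    return len(body) < min_length and any(m in body for m in CHALLENGE_MARKERS)
```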

[Screenshot: JavaScript rendering setting]

Frequently asked questions

Do I need to implement more than one solution?

Although one solution can be enough to unblock Deepcrawl, sometimes it is necessary to combine methods. For example, as well as whitelisting Deepcrawl’s IP, you might also need to change the user-agent of the project.

Why can I still not crawl my website?

If you still can’t crawl your website (after trying multiple solutions) then we recommend reading the how to fix failed website crawls guide for more tips to debug failed crawls.

How do I find out if my website is using a CDN?

If you are unsure whether a website is using a CDN then read the following guide on how to identify what CDN (if any) a website is using.

How can I identify Deepcrawl in my log files?

Deepcrawl will always identify itself by including ‘Deepcrawl’ within the user agent string.
e.g. Mozilla/5.0 (compatible; Googlebot/2.1; +
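For example, you could filter an access log for Deepcrawl requests by checking the user-agent field. A sketch assuming the common combined log format, where the user agent is the final quoted field (the sample line in the usage comment is fabricated for illustration):

```python
import re

# In combined log format the user agent is the last double-quoted field on the line.
LOG_PATTERN = re.compile(r'"(?P<agent>[^"]*)"\s*$')

def is_deepcrawl_hit(log_line):
    """True if the log line's user-agent string mentions Deepcrawl."""
    match = LOG_PATTERN.search(log_line)
    return bool(match) and "deepcrawl" in match.group("agent").lower()

# Usage (fabricated log line):
# is_deepcrawl_hit('203.0.113.9 - - [17/Sep/2019:10:00:00 +0000] '
#                  '"GET / HTTP/1.1" 200 1234 "-" "... Deepcrawl"')
```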

Any further questions?

If your crawls are still getting blocked, even after implementing the solutions suggested above, then please don’t hesitate to get in touch.


Adam Gent

Search Engine Optimisation (SEO) professional with over 8 years’ experience in the search marketing industry. I have worked with a range of client campaigns over the years, from small and medium-sized enterprises to FTSE 100 global high-street brands.
