Supported Upload Files

DeepCrawl supports uploading a wide range of file types to crawl sources. This page runs through the valid file types for each source and where to find them.


Limiting the Size and Depth of a Crawl

This guide explains how to restrict the overall size and depth of a crawl, either before you start or while the crawl is running. This helps you avoid using a large number of URL credits unintentionally, and lets you run a discovery crawl when you first crawl a website and don’t yet know the optimal settings.


Four Awesome Things You Can Do With Regex

Many of DeepCrawl’s features are centred around identifying and monitoring issues with site architecture. But the tool can also be used creatively to improve user experience, gather data about the structure of your site, and even make non-technical tasks such as seeking out text on your site easier and more reliable. We thought we’d share four examples of how using…


Modifying URLs And Stripping Parameters

You can modify URLs as they are being crawled using the ‘Remove URL Parameters’ and ‘URL Rewriting’ features in Advanced Settings, in step 4 of the crawl setup. These features are useful for tasks such as removing URL components that complicate analysis of your website, or rewriting URLs to an external website, such as…

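
As a rough illustration of what stripping parameters from a URL involves, here is a minimal Python sketch. It is not DeepCrawl's implementation: the function name strip_params, the example URL, and the parameter name sessionid are all hypothetical, and the tool's own behaviour may differ in details such as parameter ordering or encoding.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_params(url, params_to_remove):
    """Return url with the named query parameters removed (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep every query pair whose name is not in the removal set.
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in params_to_remove
    ]
    # Rebuild the URL with the filtered query string.
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Assumed example: a session ID that creates duplicate URLs for the same page.
print(strip_params(
    "https://example.com/shoes?colour=red&sessionid=abc123&page=2",
    {"sessionid"},
))
```

With the session parameter removed, URLs that differ only by session ID collapse to a single address, which is the kind of simplification the feature is described as providing.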

Restricting a Crawl to Certain Pages

If you want to check or analyze a specific section of your website instead of crawling the full site, you can restrict a crawl to any set of pages using a mixture of inclusion and exclusion rules in the Advanced Settings.

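
The idea of combining inclusion and exclusion rules can be sketched in Python as follows. This is only an illustration under stated assumptions: the function should_crawl and the example patterns are hypothetical, and the precedence used here (an exclusion match overrides an inclusion match, and an empty inclusion list allows everything) is an assumption rather than confirmed DeepCrawl behaviour.

```python
import re


def should_crawl(url, include_patterns, exclude_patterns):
    """Decide whether a URL falls inside the restricted crawl (hypothetical helper)."""
    # Assumption: exclusion rules win over inclusion rules.
    if any(re.search(pattern, url) for pattern in exclude_patterns):
        return False
    # Assumption: with no inclusion rules, every non-excluded URL is allowed.
    if not include_patterns:
        return True
    return any(re.search(pattern, url) for pattern in include_patterns)


# Assumed example: crawl only the blog section, but skip its tag pages.
include = [r"/blog/"]
exclude = [r"/blog/tag/"]

for candidate in (
    "https://example.com/blog/first-post",
    "https://example.com/blog/tag/news",
    "https://example.com/shop/item-1",
):
    print(candidate, should_crawl(candidate, include, exclude))
```

Testing a handful of representative URLs against the rules in this way, before starting a crawl, is a cheap check that the patterns select the section you intend.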

How to Fix your Failed Website Crawls

Sometimes, when running a crawl on a site (or a section of a site), you may find that it isn’t progressing past the first level of URLs. This problem has several possible causes, and there are various ways to rectify it.
