Many of DeepCrawl’s features are centred around identifying and monitoring issues with site architecture. But the tool can also be used creatively to improve user experience, gather data about the structure of your site, and even make non-technical tasks such as seeking out text on your site easier and more reliable.
You can modify URLs as they are being crawled using the ‘Remove URL Parameters’ and ‘URL Rewriting’ features in Advanced Settings, in step 4 of the crawl setup.
These features are useful for tasks such as removing URL components that complicate analysis of your website, or rewriting URLs to point at an external website or lookup service, e.g. retrieving information from an API for a set of your page URLs.
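To see what parameter removal does to a URL, here is a minimal local sketch of the same idea using Python's standard library. The parameter names are hypothetical examples; substitute whatever tracking or session parameters complicate your own crawl data.

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

# Hypothetical parameters to strip before analysis -- replace with
# the tracking/session parameters your own site appends to URLs.
REMOVE_PARAMS = {"utm_source", "utm_medium", "sessionid"}

def remove_url_parameters(url):
    """Drop the listed query parameters from a URL, keeping the rest."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in REMOVE_PARAMS]
    return urlunparse(parts._replace(query=urlencode(kept)))

print(remove_url_parameters("https://example.com/page?utm_source=news&id=42"))
# -> https://example.com/page?id=42
```

Stripping these parameters means variants of the same page collapse into one URL, so duplicate-content noise disappears from the crawl report.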
Sometimes, when running a crawl on a site (or a section of a site), you may find that it isn’t progressing past the first level of URLs. When this happens, only the base domain or “start URLs” are actually crawled.
This problem has several possible causes, and there are various ways to rectify it.
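One common cause (an assumption here, not the only possibility) is that the site's robots.txt disallows the crawler's user agent, so the start URL is fetched but nothing beneath it. Python's standard library can check this offline; the robots.txt body below is a hypothetical example.

```python
from urllib import robotparser

# A hypothetical robots.txt that blocks everything for all user agents --
# a crawl against such a site would stall at the start URLs.
robots_txt = """\
User-agent: *
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

# With "Disallow: /", no page below the start URL may be fetched:
print(rp.can_fetch("mybot", "https://example.com/products/"))  # -> False
```

If this check returns False for the pages you expect to be crawled, the fix is on the robots.txt side (or in the crawler's robots-overwrite settings) rather than in the crawl configuration itself.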
You can crawl your site’s staging or test environment and compare it to your live website to see how they differ.
This can help you test a version of your website, or part of it, before you release it to the live environment, and check new site-wide additions such as canonical tags, social tags or pagination implementation.
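A staging-vs-live comparison of one such element, canonical tags, can be sketched as follows. The HTML snippets are hypothetical stand-ins; in practice you would fetch the same path from both environments, and the regex assumes the `rel` attribute appears before `href` in the tag.

```python
import re

def canonical_of(html):
    """Return the href of the first rel="canonical" link tag, if any.
    Assumes rel comes before href in the tag."""
    m = re.search(
        r'<link[^>]*rel=["\']canonical["\'][^>]*href=["\']([^"\']+)["\']',
        html, re.IGNORECASE)
    return m.group(1) if m else None

# Hypothetical responses for the same page from each environment.
live = '<link rel="canonical" href="https://example.com/widgets">'
staging = '<link rel="canonical" href="https://staging.example.com/widgets">'

if canonical_of(live) != canonical_of(staging):
    print("canonical mismatch:", canonical_of(staging), "vs", canonical_of(live))
```

A mismatch like this one is expected (staging canonicals point at the staging host), but the same comparison flags genuinely wrong canonicals before they ship.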
You can extract specific information and data from any web pages by running a custom extraction with DeepCrawl. This can be useful for checking your analytics or social tagging, or for extracting backlinks and product data.
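Custom extraction rules of this kind are typically pattern-based, so they can be prototyped locally before being added to a crawl. A hypothetical example, pulling a Google Analytics property ID out of page HTML with a regex:

```python
import re

# Hypothetical page source containing an analytics snippet.
html = "<script>ga('create', 'UA-12345-6', 'auto');</script>"

# UA property IDs follow the pattern UA-<account>-<property>.
match = re.search(r"UA-\d{4,10}-\d{1,4}", html)
if match:
    print("analytics ID found:", match.group(0))  # -> UA-12345-6
else:
    print("no analytics ID on this page")
```

Running the same pattern across every crawled page surfaces pages where the tag is missing or where a wrong property ID slipped in.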