Getting Started

Here’s a good overview of how to crawl a website with 2.0, and also how to handle multiple domains or subdomains.

Crawl Errors or Issues with Credits?

Are your crawls generating errors, or struggling to reach the required depth or number of URLs? Here’s a guide on troubleshooting failed crawls.

Controlling the size and depth of a crawl can save credits and help you figure out the best way to crawl new sites. Restricting the crawl to certain pages is also great for saving credits or for analyzing specific content or locations.

Don’t forget to use crawl limits so you don’t waste crawl budget, especially on large sites or where some resources are low value. If you’re having issues with URL parameters or want to rewrite URLs on the fly for lookup services using APIs, we’ve got a guide for that too.
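
To make the idea concrete, here’s a minimal Python sketch of the kind of rewriting that URL parameter rules perform: stripping low-value tracking parameters and tidying paths so variants of the same page don’t eat into your crawl budget. The parameter names and rules here are assumptions for the example, not settings from the platform; the guide covers the feature itself.

```python
import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of low-value tracking/session parameters to drop (example values only).
STRIP_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}

def rewrite_url(url: str) -> str:
    """Normalize a URL: drop tracking parameters and collapse duplicate slashes."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    # Keep only parameters that aren't in the strip list, so parameter variants
    # of the same page resolve to one crawlable URL.
    kept = [(k, v) for k, v in parse_qsl(query, keep_blank_values=True)
            if k.lower() not in STRIP_PARAMS]
    # Collapse accidental duplicate slashes in the path (e.g. /blog//post).
    path = re.sub(r"/{2,}", "/", path) or "/"
    return urlunsplit((scheme, netloc, path, urlencode(kept), ""))

print(rewrite_url("https://example.com//blog/post?utm_source=news&id=42"))
# -> https://example.com/blog/post?id=42
```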

Site Changes and a Little Bit of Regex Magic

From full migrations to simple updates, code regressions and plain bugs that impact SEO performance are still very common. By following our guide to crawling your live and test sites with 2.0, you can analyze and avoid any setbacks.

Regex can be a love/hate thing. Once you know it, you will love it, especially with the extra power it gives you in 2.0: from identifying thin content to finding unique data about your products and copy, this post details how.
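
As a taste of what that looks like in practice, here’s a short, illustrative Python sketch that uses regex to flag thin content and pull product data (SKU and price) out of page copy. The patterns, example pages, and word-count threshold are assumptions for the example; the post covers building rules like these in 2.0 itself.

```python
import re

# Hypothetical page body text pulled from a crawl export (example values only).
pages = {
    "/product/blue-widget": "Blue Widget — SKU: BW-1001. Price £24.99. Durable and lightweight.",
    "/product/placeholder": "Coming soon.",
}

SKU_PATTERN = re.compile(r"\bSKU:\s*([A-Z]{2}-\d{4})\b")   # assumed SKU format
PRICE_PATTERN = re.compile(r"£(\d+(?:\.\d{2})?)")           # assumed price format
THIN_CONTENT_WORDS = 10                                      # assumed threshold

for url, body in pages.items():
    words = len(body.split())
    sku = SKU_PATTERN.search(body)
    price = PRICE_PATTERN.search(body)
    flags = []
    if words < THIN_CONTENT_WORDS:
        flags.append("thin content")
    if not sku:
        flags.append("missing SKU")
    print(url,
          f"words={words}",
          f"sku={sku.group(1) if sku else None}",
          f"price={price.group(1) if price else None}",
          flags)
```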

Advanced Features in 2.0

First up is our guide to custom extraction in 2.0, which has wide-ranging uses, from testing marketing tags to checking schema.org markup. We also have a guide to the robots.txt overwrite feature, which is really useful if you’re developing robots.txt for large or complicated sites, and also for crawling a disallowed staging site.
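
To illustrate the kind of pattern a custom extraction rule relies on, here’s a short Python sketch that uses a regex to pull schema.org JSON-LD blocks out of a page and read the properties you might want to check. The HTML snippet and field names are made up for the example; the guide shows how to set up extraction rules in 2.0 directly.

```python
import json
import re

# Example HTML snippet; in practice this would come from a crawled page.
html = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Blue Widget", "sku": "BW-1001"}
</script>
</head><body>...</body></html>
"""

# Regex in the spirit of a custom extraction rule: capture JSON-LD script contents.
JSON_LD = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)

for match in JSON_LD.finditer(html):
    data = json.loads(match.group(1))
    print(data.get("@type"), data.get("name"), data.get("sku"))
# -> Product Blue Widget BW-1001
```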

Visit the Updated FAQs