Guide to RegEx Changes in DeepCrawl

Adam Gordon

On 25th February 2019 • 1 min read

From 27th of March 2018, DeepCrawl changed the primary reporting database from PostgreSQL to Elasticsearch, to significantly improve the speed and reliability of reports.

All projects created after 27th of March 2018 use the Lucene regular expression engine for regex matches. Existing projects will be migrated to the new database during April 2018.

However, because these benefits come from a new system, there are some changes to the RegEx syntax you can use to filter your reports. Note the following:

- Use () for matching OR statements
e.g. (a|b)
or wrap each alternative in .*
e.g. .*a.*|.*b.*

- Use character classes such as [0-9] instead of \d, \s, \S, \D, \w, \W, \b
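
As a rough illustration of the two rules above, the following Python sketch rewrites the shorthand classes \d, \D, \w and \W into explicit character classes and builds an OR pattern wrapped in .* on both sides. The helper names expand_shorthand and contains are hypothetical and are not part of DeepCrawl or Elasticsearch. Python's re.fullmatch is used here only to imitate whole-value matching, which is an assumption about how the report filter applies Lucene patterns; always check the final pattern in the report filter itself.

```python
import re

# Hypothetical mapping from shorthand classes to explicit character classes.
# \s, \S and \b are deliberately left out: whitespace needs literal characters,
# and \b (word boundary) has no character-class equivalent.
SHORTHAND_TO_CLASS = {
    r"\d": "[0-9]",
    r"\D": "[^0-9]",
    r"\w": "[A-Za-z0-9_]",
    r"\W": "[^A-Za-z0-9_]",
}


def expand_shorthand(pattern: str) -> str:
    r"""Replace \d, \D, \w, \W with explicit character classes.

    Limitation: shorthand that already sits inside [...] is not handled.
    """
    out = []
    i = 0
    while i < len(pattern):
        pair = pattern[i:i + 2]
        if pair in SHORTHAND_TO_CLASS:
            out.append(SHORTHAND_TO_CLASS[pair])
            i += 2
        elif pattern[i] == "\\" and i + 1 < len(pattern):
            out.append(pair)  # keep other escapes such as \. untouched
            i += 2
        else:
            out.append(pattern[i])
            i += 1
    return "".join(out)


def contains(alternatives):
    """Build an OR pattern that may match anywhere: .*(a|b).*"""
    return ".*(" + "|".join(alternatives) + ").*"


if __name__ == "__main__":
    print(expand_shorthand(r"/page-\d+"))
    print(contains(["blog", "news"]))
    # fullmatch imitates a pattern that must cover the whole value (assumption).
    print(bool(re.fullmatch(contains(["blog", "news"]), "https://example.com/blog/post")))
```

The .* wrapping matters only if the pattern has to cover the entire value being filtered; under that assumption a bare (blog|news) would match only the exact values "blog" or "news".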

If you need more information, Elasticsearch has its own guide to RegEx syntax:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html#regexp-syntax

Or you can contact support@deepcrawl.com for any assistance.

Author

Adam Gordon

Product Manager at DeepCrawl.
