Regular expressions used in Archivarix CMS

Published: 2020-02-05

This article describes the regular expressions used to search and replace content in websites restored with the Archivarix System. The syntax is not unique to this system: if you already know regular expressions from PHP, Perl, Java or another programming language, you already know how to use our search and replace.

Regular expressions are a formal language for finding and manipulating substrings in text, based on metacharacters. A search uses a pattern made of ordinary characters and metacharacters that together define the search rule. For text manipulation, a replacement string is specified as well, and it may also contain special characters.
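
As a rough illustration of a search pattern paired with a replacement string, here is a minimal Python sketch. It uses Python's re module rather than the CMS itself, and the HTML fragment and the example.com address are made up for the example. In Python the captured group is written \1 in the replacement; PHP's preg_replace also accepts $1, and the exact replacement syntax accepted by Archivarix CMS is not shown here.

```python
import re

# Hypothetical HTML fragment from a restored page (illustrative only).
html = '<a href="http://example.com/page">Home</a>'

# Search pattern: literal characters plus metacharacters.
# \. is an escaped (literal) dot, \S*? is a lazy run of non-whitespace
# characters, and (...) captures the part of the URL after the scheme.
pattern = r'http://(example\.com/\S*?)"'

# Replacement string: \1 is a special sequence that inserts the captured group.
replacement = r'https://\1"'

result = re.sub(pattern, replacement, html)
print(result)
```

The pattern describes what to find, and the replacement reuses the captured part, so only the scheme changes and the rest of the link stays intact.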

RegExr is an online regular expression builder that lets you create and test regular expressions in a simple interface - https://regexr.com/

The list of regular expression constructs used:

[abc] A single character: a, b or c
[^abc] Any single character except a, b or c
[a-z] Any single character in the range a-z
[a-zA-Z] Any single character in the range a-z or A-Z
^ Start of line
$ End of line
\A Start of string
\z End of string
. Any single character (except a newline, by default)
\s Any whitespace character
\S Any non-whitespace character
\d Any digit
\D Any non-digit
\w Any word character (letter, digit, underscore)
\W Any non-word character
\b A word boundary (a zero-width position between a word character and a non-word character)
(...) Capture everything enclosed
(a|b) a or b
a? Zero or one of a
a* Zero or more of a
a+ One or more of a
a{3} Exactly 3 of a
a{3,} 3 or more of a
a{3,6} Between 3 and 6 of a
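
A few of the constructs listed above can be tried out in a short Python sketch. This is only an illustration of the syntax: Python's re module is not the engine used by the CMS, the sample text is made up, and only constructs that behave the same way in both are used.

```python
import re

# Made-up sample text, used only for illustration.
text = "Order 66 shipped on 2020-02-05 to cat, cot and coat."

# Character class: "c", then "a" or "o", then "t", as a whole word.
words = re.findall(r"\bc[ao]t\b", text)

# Quantifiers: exactly 4 digits, then two groups of exactly 2 digits.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# Alternation inside a capturing group: either "cat" or "coat".
pets = re.findall(r"\b(cat|coat)\b", text)

# \A anchors the match to the very start of the string.
starts_with_order = re.search(r"\AOrder", text) is not None

# \s+ collapses each run of whitespace into a single space.
collapsed = re.sub(r"\s+", " ", "a  b\t\nc")

# Negated class: remove everything that is not a lowercase letter.
letters_only = re.sub(r"[^a-z]", "", "a1b2-c3")

print(words, dates, pets, starts_with_order, collapsed, letters_only)
```

Because \b is zero-width, it does not consume the comma or space around a word, which is why "cat" matches even though it is followed by a comma.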

Use of this article's materials is allowed only with a link to the source: https://archivarix.com/en/blog/regex/