This article describes regular expressions used to search and replace content in websites restored using the Archivarix System. They are not unique to this system. If you know the regular expressions of PHP, Perl, Java or other programming languages, then you already know how to use our search and replace.
Regular expressions are a formal language for finding and manipulating substrings in text, based on the use of metacharacters. For searching, a template is used consisting of characters and metacharacters and defining a search rule. For text manipulation, an additional replacement string is also specified, which may also contain special characters.
Here is the online regular expression constructor, which allows you to create and test regular expressions using a simple interface - https://regexr.com/
The list of regular expressions used:
[abc] A single character: a, b or c
[^ abc] Any single character but a, b, or c
[a-z] Any single character in the range a-z
[a-zA-Z] Any single character in the range a-z or A-Z
^ Start of line
$ End of line
\ A Start of string
\ z End of string
. Any single character
\ s Any whitespace character
\ S Any non-whitespace character
\ d Any digit
\ D Any non-digit
\ w Any word character (letter, number, underscore)
\ W Any non-word character
\ b Any word boundary character
(...) Capture everything enclosed
(a | b) a or b
a? Zero or one of a
a * Zero or more of a
a + One or more of a
a {3} Exactly 3 of a
a {3,} 3 or more of a
a {3,6} Between 3 and 6 of a
The use of article materials is allowed only if the link to the source is posted: https://archivarix.com/en/blog/regex/
The Archivarix system is designed to download and restore sites that are no longer accessible from Web Archive, and those that are currently online. This is the main difference from the rest of “downl…
By using the “Extract structured content” option you can easily make a Wordpress blog both from the site found on the Web Archive and from any other website. To do this, firstly find the source websit…
In order to make it convenient for you to edit the websites restored in our system, we have developed a simple Flat File CMS consisting of just one small php file. Despite its size, this CMS is a powe…
This article describes regular expressions used to search and replace content in websites restored using the Archivarix System. They are not unique to this system. If you know the regular expressions …
Our Website downloader system allows you to download up to 200 files from a website for free. If there are more files on the site and you need all of them, then you can pay for this service. Download …