Examples of using regular expressions in Archivarix CMS

Published: 2020-05-29

How do you generate a meta name="description" tag on all pages of a website? How do you make a site work from a subdirectory instead of the root?

Sometimes pages of a restored site are missing the meta name="description" tag. It can be added manually, but if it is missing on hundreds or thousands of pages, that becomes impractical. Instead of spending time writing a description for every page, you can simply fill the tag with the first phrase of the page's text. Usually it will be relevant.

Archivarix CMS supports regular expressions for searching and replacing, which makes this easy. Simply copy the following pattern, replacement, and filter into the appropriate fields of the Search and Replace tool and run the process.

Search:

(</title>)(.*?<p>([^"<]{50,200}\.))

Replace:

$1
<meta name="description" content="$3">
$2

Filter (only pages that do not contain):

meta name="description"

This regular expression inserts the <meta name="description" content="..."> tag immediately after the closing </title> tag, filling it with text from the page: the first phrase following a paragraph tag <p> that is between 50 and 200 characters long and ends with a period. The filter rule makes the replacement only on pages that do not already contain meta name="description".
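As a sketch of what the CMS does with these fields, here is the same search, replacement, and filter applied with Python's re module (the sample page is invented for illustration; inside Archivarix CMS you only fill in the tool's fields):

```python
import re

# Search pattern: capture </title>, then everything up to the first <p>
# whose opening phrase is 50-200 "safe" characters ending with a period.
pattern = r'(</title>)(.*?<p>([^"<]{50,200}\.))'

# Replacement: re-emit </title>, add a meta tag built from the captured
# phrase, then re-emit the rest of the matched text unchanged.
replacement = '\\1\n<meta name="description" content="\\3">\n\\2'

html = (
    "<html><head><title>Demo</title></head>"
    "<body><p>This is the first paragraph of the page and it is long "
    "enough to serve as a description.</p></body></html>"
)

# Mirror the filter rule: only touch pages that have no description yet.
if 'meta name="description"' not in html:
    html = re.sub(pattern, replacement, html, flags=re.S)

print(html)
```

The re.S flag lets .*? cross line breaks, since on real pages the head section spans multiple lines.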

 

Another example: a restored site can be reconfigured so that it works from a subdirectory. This may be necessary if you need to host several restored websites on the same domain.

To begin with, we change all the paths in the site structure using the Search and Replace URL tool.

Using the start-of-string anchor ^ as the search pattern, we prepend the new path /newsite1 to every URL.
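In Python terms, the URL rewrite amounts to the following (the sample URLs are made up for illustration; /newsite1 is the subdirectory from this example):

```python
import re

urls = ["/index.html", "/blog/post-1.html", "/css/style.css"]

# The anchor ^ matches the empty position at the start of each URL,
# so substituting it simply prepends the /newsite1 prefix.
new_urls = [re.sub(r"^", "/newsite1", u) for u in urls]

print(new_urls)
# ['/newsite1/index.html', '/newsite1/blog/post-1.html', '/newsite1/css/style.css']
```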

Next, we replace all the addresses inside the pages using regular expressions; be sure to include all file types (js, css, txt, json, xml) in the search:

\b((?:href|src)=['"]?)(/[^/])

$1/newsite1$2
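A minimal Python sketch of this replacement on a sample snippet of HTML (the snippet itself is invented for illustration):

```python
import re

# \b((?:href|src)=['"]?)(/[^/]) — capture the attribute prefix and the
# first character of a root-relative path. Requiring a non-/ second
# character skips protocol-relative //host/... URLs.
pattern = r'''\b((?:href|src)=['"]?)(/[^/])'''

html = '<a href="/about.html"><img src="/img/logo.png"></a>'
html = re.sub(pattern, r"\1/newsite1\2", html)

print(html)
# <a href="/newsite1/about.html"><img src="/newsite1/img/logo.png"></a>
```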

To fix links to images in CSS files, you can use the following regular expression:

(url\(['"\s])(/[^/])
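The article does not repeat the replacement string for this pattern; assuming it is the same $1/newsite1$2 as in the previous step, a Python sketch looks like this (the sample CSS is invented for illustration):

```python
import re

# (url\(['"\s])(/[^/]) — capture url( plus the opening quote or
# whitespace, then the first character of a root-relative path.
pattern = r'''(url\(['"\s])(/[^/])'''

css = 'body { background: url("/img/bg.png"); }'
css = re.sub(pattern, r"\1/newsite1\2", css)

print(css)
# body { background: url("/newsite1/img/bg.png"); }
```

Note that this pattern expects a quote or a whitespace character right after url(, so unquoted values like url(/img/bg.png) will not be matched.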

Now, in the .htaccess file, replace the line RewriteRule . /index.php [L] with RewriteRule . /newsite1/index.php [L]
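For context, this is roughly how the relevant fragment of .htaccess changes (a sketch based only on the rule quoted above; any other rules in your file stay as they are):

```apache
# Before: every request is routed to the loader in the site root
# RewriteRule . /index.php [L]

# After: requests are routed to the loader inside the subdirectory
RewriteRule . /newsite1/index.php [L]
```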

Your website will now work at domain.com/newsite1

The use of article materials is allowed only if the link to the source is posted: https://archivarix.com/en/blog/regex-add-description-website-on-subfolder/
