What is Archivarix?
Archivarix is a free open-source CMS combined with an online website downloader and a Wayback Machine rebuilder.
With our system you can restore any website from the Wayback Machine (web.archive.org) exactly as it was, or
download an existing website and get it as a zip file. When the scraping process is complete, you get a fully
working copy of the restored or downloaded site with Archivarix CMS, so you can easily modify and operate it.
Restore from Wayback Machine
Restore a website that previously existed and was crawled by the Wayback Machine. We process all restored data
to provide a final, ready-to-upload website with many additional improvements: code fixes, ad removal, image
optimization, etc.
Download a live website
Download or convert any existing website to make it secure, optimized and editable through our CMS. You can also
download websites with an expired domain but working hosting.
Edit and manage websites
We created Archivarix CMS as a convenient way to edit restored or downloaded websites. A single file to
upload, no installation required.
Archivarix Blog
Learn how to work with the Web Archive and get the most out of Archivarix.
2023.10.10 Archivarix loader update: support for blocking bots and for page depth in custom rules.
2022.09.29 Archivarix CMS: removal of broken links and images is 20 times faster, plus a new loader version
- added a lot of new commands for CLI
- unnecessary files cleanup when working through the CLI
- improved cleanup of unused tags in templates when posting new pages
- automatic migration to the new database schema 1.0.2
- a new tool for converting images to WebP format
- file parameter in the template can be passed as a URL in CLI to download it
- a tool for detecting the depth of pages
- orphan pages detection
- page depth can be filtered in Search and Replace
- the ability to block various bots
- custom rules can use the depth of the pages
- removing broken links and images is now 20 times faster
- improved conversion to www and back
- support for page metrics in the database schema for future updates
- release and support of the new loader version 0.1.220929
- many minor optimizations, processing speedups and code refactoring
2022.05.05 Archivarix CMS CLI support
- default memory_limit increase
- CLI timeout turned off
- CLI import skips the download if the import file is already present
- less temporary files/better cleanup during import/install operations
- system info at the initial installation screen
- quick "Find in all" option in Search & Replace for URLs
- bulk hostname set for url replace results
- improved HTTPS detection for websites behind CloudFlare
- structure change to support huge restores (7M+ files)
- IDN support for hostnames in Search & Replace
2022.02.06 Archivarix CMS update
- export to flat-file structure improvement for .htm URLs
- support for PHP with disabled disk_free_space()
- MODX integration support
- speed improvements to handle restores with a lot of subdomains
2020.05.21 An update that web studios and those using outsourcing will appreciate.
- Separate password for safe mode.
- Extended safe mode. Now you can create custom rules and files, but without executable code.
- Reinstalling the site from the CMS without having to manually delete anything from the server.
- Ability to sort custom rules.
- Improved Search & Replace for very large sites.
- Additional settings for the "Viewport meta tag" tool.
- Support for IDN domains on hosting with the old version of ICU.
- Added the ability to log out during an initial installation with a password.
- If .htaccess is detected during integration with WP, the Archivarix rules are added at its beginning.
- When downloading sites by serial number, a CDN is used to increase speed.
- Other minor improvements and fixes.
These and many other cosmetic improvements and speed optimizations.
2020.05.12 Our Archivarix CMS is developing by leaps and bounds.
The new update adds the following:
- New dashboard for viewing statistics, server settings and system updates.
- Ability to create templates and conveniently add new pages to the site.
- Integration with WordPress and Joomla in one click.
- Additional filtering in Search & Replace now works as a rule builder, where you can add any number of rules.
- Now you can filter the results by domain/subdomains, date-time, file size.
- A new tool to reset the cache in Cloudflare or enable/disable Dev Mode.
- A new tool for removing versioning in URLs, for example, "?ver=1.2.3" in css or js. It lets you repair even
those pages that looked broken in the Web Archive because styles with different versions were missing.
- The robots.txt tool can now immediately enable and add a Sitemap.
- Automatic and manual creation of rollback points for changes.
- Import can import templates.
- Saved and imported loader settings now include the created custom files.
- For all actions that can last longer than a timeout, a progress bar is displayed.
- A tool to add a viewport meta tag to all pages of a site.
- Tools for removing broken links and images have the ability to account for files on the server.
- A new tool to fix incorrectly URL-encoded links in HTML code. Rarely needed, but may come in handy.
- Improved missing urls tool. Together with the new loader, now counts calls to non-existent URLs.
- Regex Tips in Search & Replace.
- Improved checking for missing PHP extensions.
- Updated all used JS tools to the latest versions.
These and many other cosmetic improvements and speed optimizations.
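To illustrate what removing versioning from URLs means in the 2020.05.12 entry, here is a minimal Python sketch. It is not the CMS's own code: the function name and the set of parameter names treated as version markers are assumptions made for the example, and only "ver" comes from the changelog itself.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameter names treated as version markers.
# Only "ver" is mentioned in the changelog; the others are illustrative.
VERSION_PARAMS = {"ver", "v", "version"}


def strip_version(url: str) -> str:
    """Return the URL without version-style query parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in VERSION_PARAMS
    ]
    # Rebuild the URL with the remaining parameters (possibly none).
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

With this sketch, "/css/style.css?ver=1.2.3" and "/css/style.css?ver=1.2.4" both map to "/css/style.css", so a page that requests a version missing from the archive can still find the stylesheet that was saved.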
2020.02.14 New Friday, new updates!
Many new and useful features were added to Archivarix CMS:
- In Search and Replace, you can now filter by URL date.
- Now external links from all pages of the site can be deleted with the click of a button. Anchors are
preserved.
- The new ACMS_SAFE_MODE parameter prohibits changing the Loader/CMS settings and uploading custom files;
importing settings and custom files is also prohibited.
- The JSON settings files for the Loader and CMS can now be downloaded to your computer and uploaded to the
CMS from a file on the computer. This makes transferring settings to other sites even easier.
- Creating custom rules has become more convenient: you can choose from frequently used patterns.
- New custom files can be created in the file manager without having to upload a file.
- The url tree for the main domain always comes first.
- If you hide the url tree for the domain / subdomain, then this setting is saved while working with the CMS.
- The two buttons for expanding and collapsing the URL tree are replaced by one that does both.
- Creating a new URL is simpler, and you can specify a file from your computer right away.
- In the mobile layout, the main working part comes first.
- After each manipulation of the file, its size is updated in the database.
- Fixed buttons for selective history rollbacks.
- Fixed creating new urls for subdomains that contain numbers in the domain name.
2020.02.07 A new batch of updates!
There is no need to change anything in the source code of the files.
- You can now deploy a site by uploading just one script from our Archivarix CMS to the server.
- In order to change something in the CMS settings, you no longer need to open its source code. You can set a
password or lower limits directly from the Settings section.
- To connect your counters, trackers, custom scripts, a separate "includes" folder is now used inside the
.content.xxxxxx folder. You can also upload custom files directly through the new file manager in CMS. Adding
counters and analytics to all pages of the site has also become convenient and understandable.
- Imports support a new file structure with settings and the "includes" folder.
- Added keyboard shortcuts for working in the code editor.
These and many other improvements in the new version. The loader has also been updated and works with the
settings that the CMS creates.
2020.01.23 Another mega-update of Archivarix CMS!
Added very useful tools that let you, at the click of a button:
- clean all broken internal links,
- delete missing images,
- set rel="nofollow" for all external links.
Now additional restores can be imported directly from the CMS itself. You can combine different restores into one
working site.
For those who work with large sites or use poor hosting: all actions that could previously stop at your
hosting's timeout are now divided into parts and automatically continue until they are completed. Want to make a
replacement in the code of 500 thousand files? Import a multi-gigabyte restore? All this is now possible on any,
even very cheap, hosting. The timeout (30 seconds by default) can be changed in the ACMS_TIMEOUT parameter.
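The idea of splitting a long action into parts that each fit within a timeout can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the CMS's implementation: the function names, the resume-by-index scheme and the injectable clock are all hypothetical, and only the 30-second default comes from the text above.

```python
import time
from typing import Callable, Sequence


def process_slice(items: Sequence[str],
                  start: int,
                  handle: Callable[[str], None],
                  budget_seconds: float = 30.0,
                  clock: Callable[[], float] = time.monotonic) -> int:
    """Process items from `start` until the time budget is used up.

    Returns the index of the first unprocessed item, which equals
    len(items) once everything is done. At least one item is handled
    per call, so repeated calls always make progress.
    """
    deadline = clock() + budget_seconds
    index = start
    while index < len(items):
        handle(items[index])
        index += 1
        if clock() >= deadline:
            break  # stop here; the caller resumes from `index`
    return index


def run_to_completion(items: Sequence[str],
                      handle: Callable[[str], None],
                      budget_seconds: float = 30.0,
                      clock: Callable[[], float] = time.monotonic) -> int:
    """Call process_slice repeatedly, as successive requests would.

    Returns the number of slices that were needed.
    """
    position, slices = 0, 0
    while position < len(items):
        position = process_slice(items, position, handle, budget_seconds, clock)
        slices += 1
    return slices
```

In a web setting each slice would be a separate request that stores the returned index and continues from it, so no single request runs past the hosting timeout.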
Our loader (index.php) now works on both http and https protocols, regardless of the build parameters. You can
force the protocol by changing the value of the ARCHIVARIX_PROTOCOL parameter.
2020.01.07 The next update of Archivarix CMS with the addition of new features. Now any old site
can be correctly converted to UTF-8 with the click of a button. Search filtering has become even better, because
results can be also filtered by MIME type.
2019.12.20 We have released the long-awaited Archivarix CMS update. In the new version, in addition
to various improvements and optimizations, a very useful feature has been added for additional filtering of search
results and full support for the tree structure of URLs for recoveries with a large number of files. For more
details, see the Archivarix CMS script change log.
2019.11.27 Our WordPress plugin Archivarix External Images Importer has been released. The plugin imports
images from third-party websites, links to which are located in posts and pages, into the WordPress gallery. If the
picture is currently not available or deleted, the plugin downloads a copy of it from the Web Archive.
2019.11.20 We have added a new section of our site - Archivarix Blog. There you can read useful information
about the operation of our system and site restoration.
2019.10.02 Recently our system has been updated and now we have two new options:
- You can download Darknet .onion sites. Just enter the .onion website address in the "Domain" field and our
system will download it from the Tor network just like a regular website.
- Content extractor. Archivarix can not only download existing sites or restore them from the Web Archive but
can also extract content from them. In the "Advanced options" field you need to select "Extract structured
content". After that you will receive a complete archive of the entire site, and an archive of articles in xml,
csv, wxr and json formats. When creating an archive of articles our parser takes into account only meaningful
content, excluding duplicate articles, elements of design, menus, ads and other unwanted elements.
2019.09.18 New features and improvements:
- Create a website with a www default subdomain.
- Set a referer to bypass cloaking with our Live Website Downloader.
- New mode for returning 404 code instead of default 301 for missing urls.
- Improved external iframes removal.
- Improved loader (index.php).
2019.05.20 A new feature - select a User Agent for downloading live websites. Do you need the version
that is shown to Googlebot only? Now you can have it.
2019.05.09 A new feature - preserving 301/302 redirects for websites restored from Wayback Machine
and downloaded from live originals.
2019.04.26 A new version of our Archivarix Website Downloader. Huge speed improvements and
better crawling with support for modern websites.
2019.04.14 Archivarix Affiliate Program is available! Start making money now. Get 15% from your referrals
for life.
2019.03.03 Creating custom modules? Integrating your existing link exchange system with Archivarix
restores? We've released a new loader (index.php) that has additional important variables for developers who create
their own custom include modules. All new restores come with an updated loader. You can also download it manually
to update your existing website. It's 100% compatible with any previous restore made with our system. This update
also brings significant speed improvements and lower memory consumption for sitemap.xml on big websites. Tested on
restores with 3+ million pages.
2019.02.25 You can now Sign In or Register in a single click with Google.
2019.01.21 Improved live website downloader.
2019.01.09 Additional CMS update. Multiline search! Finally!
2019.01.07 CMS update. Default locale detection, external redirects support, Search & Replace
within other text formats, statistics with charts.
2018.12.01 We are switching from the 'Ālep to the Bēt version of Archivarix! Live website download
functionality is publicly available. All restored and downloaded websites are fully compatible with our Archivarix
CMS. Thank you, all our testers!
2018.11.01 An update for our Archivarix CMS. Code improvements, more intuitive password setup and additional
limits that allow working with big (50k+ files) restores with little memory. We are working on switching "lean" mode
on without code editing. Stay tuned!
2018.08.03 More support for very old and rare charsets.
2018.05.24 You can now use custom password protection for the CMS by setting the ACMS_LOGIN_PAGE =
'mypassword'; variable. The password must have at least 6 characters. Improved support for encoding on servers that
don't have mbstring.
2018.05.16 Our new CMS is officially released! Thank you, all alpha testers, for working together on
a script that brings restoring websites to a completely new level. You can edit pages, add new ones, search and
replace matches... and a lot more. The latest version is available on our Archivarix CMS page.
2018.04.02 XML Sitemaps! Just set ARCHIVARIX_SITEMAP_PATH in our index.php loader. If your restore
does not contain a new loader (release 20180403) with sitemap support - you can create a new clone or contact us
and we will reassemble all your restores to the latest version. We also prepare all new restores to work with our
own CMS that we are working on. All old restores will be automatically converted to a new version before we release
our CMS.
2018.04.01 Improved charset detection for text/html mime-type files. Loader: improved handling
for missing .css and .js with query; support for trailing slash in all URLs without queries.
2018.03.24 Improved support for hosting with an old PDO_SQLITE version. Error messages in our
index.php loader are more user-friendly. "Make internal links relative" option is more intelligent now.
2018.03.08 Six new languages in the user interface! Write to us if you see any grammar errors.
2018.02.28 Restored websites will work on a different domain name by default. No need to set
ARCHIVARIX_CUSTOM_DOMAIN. It just works!
2018.02.26 Big important update! Our system can restore websites that were restricted by
robots.txt. We are very proud of this update.
2018.02.22 Fix to support some rare cases where $_SERVER['HTTPS'] is set to 'off' instead of empty
value.
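The 2018.02.22 fix concerns servers that report a non-HTTPS request with the literal string 'off' rather than an empty or missing value. A minimal Python sketch of that check follows; it is an illustration only, with a plain dictionary standing in for PHP's $_SERVER, and the function name is an assumption rather than anything taken from the loader.

```python
from typing import Mapping


def is_https(server: Mapping[str, str]) -> bool:
    """Return True only when the HTTPS entry signals a secure request.

    `server` is a stand-in for PHP's $_SERVER array (an assumption
    for this sketch). A missing key, an empty string and the literal
    'off' (in any letter case) all count as plain HTTP.
    """
    value = server.get("HTTPS", "")
    return value != "" and value.lower() != "off"
```

A check that only tests whether the value is non-empty would treat 'off' as HTTPS, which is the rare case the fix addresses.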
2018.02.20 Fix with encoding detection. "Optimize HTML-code" feature now works as expected even if
the website had a rare non-utf8 charset.
2018.02.12 ¡Hablamos español! And some fixes and improvements for correct restores of websites with
mixed HTTP/HTTPS content.
2018.01.12 An additional CMS mode for WordPress and other systems. A new option
ARCHIVARIX_CUSTOM_DOMAIN for the restored website to run on another domain or localhost.
2017.12.20 Fixes for missing subdomains on some recovered websites.
2017.12.01 We have added MIME types statistics on a download page. Now you can see how many jpegs,
htmls and other file types you will get in the archive.
2017.11.21 Bug with an infinite redirect loop on some sites is fixed.
2017.11.14 We have made a content loader based on PHP and SQLite - we have a version for
Apache+PHP, NGINX+PHP and a legacy version with .htaccess only. Recovered sites will work much faster now. Other
features - integration with WordPress, integration with any TDS or other custom scripts and so on - see the full
description in the "Tutorial and prices" section.
2017.10.21 The system can now find JS trackers and delete them when you select the "Remove trackers
and analytics" option.
2017.10.17 Fix for compatibility with ModPagespeed hosting.
2017.10.11 New features added: "Remove trackers and analytics" - you will get a recovered site clean
of any ads and banners (we have over 60000 code signatures in our database). "Make internal links relative" option
- the system will update all links in the downloaded site to relative ones.
2017.10.10 We have made tutorial videos in Russian and English. You can see them in the "Tutorial and
prices" section.
2017.10.09 Error with CSS styles in some restored sites fixed. Some other minor errors are fixed.
2017.10.03 Big performance improvements. Our system is running faster now.
2017.09.29 We have launched our service. The downloaded archive contains only a non-PHP version of the
website. Our own CMS and integration with other CMSs like WordPress, Drupal and Joomla are planned.