Instructions and Pricing

What is Archivarix?

When a domain name expires, the website becomes unavailable to users. But it may contain a lot of useful, unique content and tens, hundreds, or even thousands of good external links. What happens to them? The content is usually lost: in most cases the hosting is also disabled or deleted. The useful backlinks, however, are still out there, going to waste.

So what happens to the site from the point of view of search engines? The site was indexed, but now it is unavailable, so its pages are gradually removed from the index. What if the site becomes available again? The pages will return quickly, together with their positions, because the page addresses are the same, the content is the same, and all the backlinks are still there.

And that's where you meet a challenge: how can you buy the domain back and restore the website as it was? Moreover, it should be restored properly, so that all previous page addresses, both static and dynamic, become available again with the same content. We created Archivarix for this particular purpose.

Archivarix fully restructures and arranges the content of websites that are publicly available in the Internet Archive. It processes and arranges the data so that all web pages become available at their previous addresses, including the dynamic ones. The page code can be fully processed to bring it into conformity with the applicable standards; all missing or unclosed tags are fixed. All counters, trackers, suspicious third-party frames and advertisements are cleaned out; CSS styles and JavaScript files are compressed if needed. Images are optimized and reduced in size without loss of quality, external links are cleared, and 404 errors are repaired by substituting the necessary files. You get all this and more in a single ZIP file whose contents are adaptable to the most stringent hosting requirements.

The content is systematized and optimized so that the average Google PageSpeed score of the new website will be around 90, against 40-50 for the original. In practice, this means that the website will be nearly perfect by search engine criteria, while the subject of the site, its content and its page addresses remain the same.

For those who strive for complete perfection, there is an option to convert the site to an HTTPS version: all internal links in the code are also converted to HTTPS, and the previous addresses in the search engine index are switched to the new version using HTTP 301 redirects.

The systematized content is arranged in a clear and comprehensible structure: all text files such as HTML, CSS, JS, XML, etc. are kept in one folder, and all binary files such as images, video, archives, PDF and other documents are located in another. The mapping between these files and the site as the user sees it is available in several popular formats, including a JSON file, an SQL database file and .htaccess rules. You can use our tools for displaying and managing the site or create your own tools for this purpose. Everything is designed for maximum convenience of further use by webmasters or former owners of the site.

Why would anyone need it?

There are several use cases. The most popular is building your own PBN (Private Blog Network): a private network of websites or blogs whose purpose is to provide large quantities of backlinks to promote the main site(s). If you buy links on marketplaces like Sape, you know that the purchasing process must first be set up and then maintained, which means constant spending on external links. Moreover, you have no control over the pages where these links are placed. Backlink sellers may abuse their business so much that they end up in spam filters, negatively affecting your website as well.

The market of expired domain names, so-called "drops", is very large. By buying a domain name for just a few dollars, you can get a domain that has already passed through the search engines' filters and has good external links. The former owners of this domain name invested a considerable budget in promoting their website, and over the years of its existence the website collected natural backlinks from social networks, forums, blogs and news sites. And all these links are still out there. We hardly need to explain that the quality of such backlinks cannot be matched by adding the site to web directories, by using XRumer, or even by buying permanent links. This domain has its own history, and so do its backlinks. And search engines recognize this well.

By purchasing such a domain name and simply creating your website on it, you will naturally receive some advantages compared to a new domain without any history. But these advantages will fade pretty quickly, because the search engines will recognize that the subject of the site has changed dramatically, the old content is no longer there and, most importantly, the internal addresses of the backlinked pages no longer exist either. Such a domain name can lose its value and fall under the filter again quite quickly. That is why, when you buy a dropped website, it's very important to let the search engine know: "all right, it's me again, I was just unavailable for technical reasons". Then you can place and/or sell any links on the website as you wish. With time you can gradually replace the pages and turn it into a completely new site.

Trust us, it's almost impossible to scrape together all the content of a defunct website by hand. Even a business website with only a dozen pages will take a few hours of work, because not only the HTML pages are important, but also the JavaScript files, CSS files, images and design elements. Archivarix was created for this particular reason. The structuring is done so that you can recreate the exact structure that is familiar to the search engines, and improve the things that do not affect the content itself but do raise the technical quality of the website.

But let's get back to the use cases: developing your own PBNs for backlink promotion of your websites; earning profit from selling links on marketplaces; getting niche traffic that can be easily converted; reselling existing working websites with good traffic ranking. Your investment in such a website can be as little as a few dollars.

How much does it cost?

There is a free limit of 1 file. A site that Archivarix estimates to contain 1 file or fewer can be recovered for free. There is no limit on the number of sites.

The first thousand files above this limit cost $10 per thousand (1 cent per file). Every following thousand costs only $1. The cost is calculated from the exact number of files to be downloaded.

First example: the site contains 385 files, including all pages, images, scripts and style files. Deduct the 1 file that is free of charge, which leaves 384 files to pay for. Multiply by the file price of $0.01 and you get $3.84. The cost of the site recovery is $3.84!

Second example: a big site contains 25,520 files. Deduct the 1 file that is free of charge, which leaves 25,519 paid files. The first thousand costs $10, and the remaining 24,519 cost only $1 per thousand, that is $24.519. The full price for the big site recovery, rounded to whole cents, is $34.52!
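The two worked examples above can be checked with a short calculation. The sketch below only illustrates the tiered pricing as described in this section and is not Archivarix's billing code; the function names are hypothetical, and rounding the final price to whole cents is an assumption based on the $34.52 figure.

```python
FREE_FILES = 1            # free limit quoted in the text
FIRST_TIER_FILES = 1000   # the first paid thousand
FIRST_TIER_MILLS = 10     # $0.01 per file, in thousandths of a dollar
NEXT_TIER_MILLS = 1       # $0.001 per file, i.e. $1 per thousand


def recovery_cost_mills(total_files: int) -> int:
    """Return the price in thousandths of a dollar (integers avoid float drift)."""
    paid = max(total_files - FREE_FILES, 0)
    first_tier = min(paid, FIRST_TIER_FILES)
    remainder = paid - first_tier
    return first_tier * FIRST_TIER_MILLS + remainder * NEXT_TIER_MILLS


def format_dollars(mills: int) -> str:
    """Round to whole cents (half up, an assumption) and format as dollars."""
    cents = (mills + 5) // 10
    return f"${cents // 100}.{cents % 100:02d}"


print(format_dollars(recovery_cost_mills(385)))     # first example
print(format_dollars(recovery_cost_mills(25520)))   # second example
```

Working in integer thousandths of a dollar keeps the $24.519 intermediate value exact until the final rounding step.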

That's it! No regular subscriptions, no additional charges for the chosen optimization or preparation of a new archive with new updates and features that may be introduced in the future.

To be clear: we are currently at the alpha testing stage and the price may go up. It can't go any lower, as you can see from the numbers. But whatever you restore will remain yours.

At the moment we accept payments in Bitcoin, WebMoney, PayPal and YooMoney. Transfers are immediate, except for Bitcoin, which takes up to half an hour depending on how quickly your transaction is included in a block. Because of the low prices and small payment amounts, the minimum payment is $10. Any unused balance is kept for your future needs.

How do I recover a website?

The recovery form has only two required fields: the domain name and your e-mail address. The others are optional. To help you avoid mistakes, we would like to describe the first three important fields, which affect the initial download. They are important because you will not be able to change them later. All other options can be changed, and you will be able to recover the site again for free with whichever options fit your needs.

So, please pay maximum attention to these fields:

Domain: enter only the domain name. Do not enter http:// or the paths of site pages. We don't recommend putting www in the domain name. If the site was originally on a third-level domain such as domain.co.uk, or you know for sure that you will need only blog.domain.com, you can specify the subdomain. But in the case of the blog we still recommend specifying only domain.com, because it's likely that some important files with scripts or graphics were hosted on the main domain or on another subdomain. If you specify only domain.com, we will recover as much information as possible and you will not have to pay again for a recovery with new options.

To timestamp: sometimes this option is very important. If the domain is vacant, but six months ago the hosting or domain provider placed a parked page there, then that parked page will be recovered as the main page. Check how the site looked in its last state through the Internet Archive at web.archive.org. If there was such a parked page, or a version of the site that you don't want, go back to the time when you see the version of the site that you need. The URL of that page will contain digits in the form YYYYMMDDHHMMSS. Example: web.archive.org/web/20160314052311/… Keep in mind that these numbers are NOT a website version or page version. They are the exact time when the HTML code of the main page (or of the specific URL you are checking) was saved. Every other element of that page, such as images, JavaScript files, CSS files, etc., has its own timestamps and its own calendar. That's why it's important to give a wider time range, so that other elements and internal pages can be included.
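To make the meaning of those 14 digits concrete, here is a minimal sketch that reads the capture time out of a Wayback Machine URL. The function name and the example.com address are hypothetical, and the pattern only assumes the web.archive.org/web/YYYYMMDDHHMMSS/ layout shown in the example above.

```python
import re
from datetime import datetime

# A capture URL carries a 14-digit timestamp right after "/web/".
CAPTURE_RE = re.compile(r"web\.archive\.org/web/(\d{14})")


def capture_time(url: str) -> datetime:
    """Return the moment the given URL was saved, not a 'site version'."""
    match = CAPTURE_RE.search(url)
    if match is None:
        raise ValueError("no 14-digit capture timestamp found in URL")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


# example.com is a placeholder for the site you are checking.
saved_at = capture_time("web.archive.org/web/20160314052311/http://example.com/")
print(saved_at.isoformat())
```

The result describes only the HTML of that one address; images, scripts and styles on the same page usually carry different capture times.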

Sometimes the latest versions of the main page were 301-redirected to another site. We ignore such redirects and download only the page addresses that return content without errors or redirects (response code '200 OK').

Timestamps should be set in short form, for example, 2015 (meaning everything saved until the end of 2015, the same as 20151231235959) or 20040204 (meaning until the end of the day on Feb 4th, 2004).
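The short forms can be thought of as being padded to the end of the period they name. The sketch below illustrates that reading; it is not Archivarix's actual parser, the function name is hypothetical, and only the 4-digit and 8-digit cases are confirmed by the text above (the other lengths are an assumption by analogy).

```python
import calendar


def expand_to_timestamp(short: str) -> str:
    """Pad a short 'To timestamp' to 14 digits, pointing at the END of the period."""
    if not short.isdigit() or len(short) not in (4, 6, 8, 10, 12, 14):
        raise ValueError("expected YYYY, YYYYMM, YYYYMMDD, ... up to 14 digits")
    year = int(short[0:4])
    month = int(short[4:6]) if len(short) >= 6 else 12
    if len(short) >= 8:
        day = int(short[6:8])
    else:
        day = calendar.monthrange(year, month)[1]  # last day of that month
    hour = short[8:10] if len(short) >= 10 else "23"
    minute = short[10:12] if len(short) >= 12 else "59"
    second = short[12:14] if len(short) >= 14 else "59"
    return f"{year:04d}{month:02d}{day:02d}{hour}{minute}{second}"


print(expand_to_timestamp("2015"))      # end of the year 2015
print(expand_to_timestamp("20040204"))  # end of the day on Feb 4th, 2004
```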

From timestamp: in our experience, it's best to leave this field blank, because some of the files that are important for the operation of the site (styles, scripts) may fall outside your limited range, and you may wonder why your website is displayed correctly in the Internet Archive but the recovered version is not. Use it only if you fully understand what you are doing.

Please also check that you entered your e-mail address correctly. Use only an active e-mail address, because further links to the recovery, option changes and access to your account will be sent to this address.

Short summary of the recovery process: you send the data through the form. Archivarix takes a few minutes to check the available data and sends the details to your e-mail address. By clicking the link, you can confirm and start the recovery if the site falls within our free limit, or add the funds necessary to start the process and then begin recovering the site. After that, in order of priority, the Archivarix servers perform intensive extraction, processing and structuring. When it is done, you get an e-mail notification and a link to download the finished zip file.

We don't have "registration", and you do not share your personal data. All links to recovered sites will be sent to your e-mail address. Once a recovery is finished, users who made a paid recovery get access to a personal account.

Test recovery

If you are here for the first time and have not seen how our system works, we have prepared a test recovery for you. It is the page that Archivarix provides when a site is restored from the Wayback Machine.

As an example we have chosen the website fire.com as of 2005. You can configure new recovery options by clicking "More" - "New options", and you can also download a zip file with the site archive. To view the site on localhost or on another server, make sure to change the ARCHIVARIX_CUSTOM_DOMAIN parameter in the index.php file by entering 'ORIGINAL' => 'fire.com' and 'CUSTOM' => 'your domain for testing or localhost'.

Recommended options

We recommend using these options to improve the technical quality of the site. The first four items in this list in particular can make a huge improvement in Google PageSpeed, so the site will be better accepted by search engines than the original site was even at its best.

Optimize HTML code: we bring the HTML code into conformity with HTML standards, repair incorrectly used tags and add the missing ones. The text content of the pages remains unchanged. Search engines readily accept validated code without errors.

Optimize images: this feature removes EXIF, IPTC and other excess information from images, applies 4:2:0 chroma subsampling, converts the encoding to progressive, and reduces whatever can be reduced without loss of quality. In short, this option reduces the size of image files without loss of image quality. The pixel dimensions remain unchanged.

Minify JS: this feature reduces JavaScript file size through code minification. If you intend to edit the website scripts, we recommend disabling this feature and using GZIP compression on the server instead. In rare cases of invalid code the file may become damaged, although usually in such cases the file was already damaged from the very beginning.

Minify CSS: the same as with JS, this feature minifies CSS style sheets. Do not use it if you plan to edit the code and the design of the site to your liking. Compression can be done on the server instead.

Remove trackers and analytics: this feature clears out about 14 thousand known external counters, trackers, analytics scripts, cookies and other stuff that you don't really need. It's a very complex and time-consuming process that is impossible to do manually. Google Analytics? Yes, it can also be cleared out ;) Verification meta tags for various services and search engines? Yes, these too, so that the former owner cannot harm your site through WebmasterTools.

Remove external links: every <a> tag that leads to another domain will be removed from the code, together with the inner contents of the tag. If you plan to sell links through marketplaces, then you will definitely need to clear out the links using this option, because this might be your future income.

Remove clickable contacts: tel:, sms:, skype: and mailto: links.

Remove external iframes: this feature is available but still needs to be improved. Sometimes it leaves some garbage after clearing up Facebook widgets and/or some Google Maps inserted with the use of an <iframe> tag.

Make internal links relative: all URLs within the site's own domain will be converted to relative URLs. Links and URLs between different subdomains will not be converted.

Make a non-www. website: We highly recommend turning this option on. Sites often do not redirect correctly between their versions with and without WWW, which results in duplicate pages in search engine indexes. This option fixes the problem: duplicate pages are properly merged and, most importantly, all internal links are changed to the version of the site without WWW. Of course, all the old addresses with WWW will be 301-redirected, so that search engines correctly recognize and apply the change. Other non-WWW subdomains are outside the scope of this option.

Make a website with www.: It does the same job as the previous setting, but all URLs of the main domain are placed on the www. subdomain. We still recommend using the 'Make a non-www. website' option. Even if some external service shows that backlinks point to www.domain URLs, our script will make a correct 301 redirect to pass on all the backlink juice. We also do not recommend this option because search engines and modern browsers tend to prefer websites without the archaic www subdomain.

Keep redirections: if some pages of the website had no content of their own and only redirected to other internal pages of the same website, this option will keep those redirects. Websites created with Drupal are a good example: links to articles might look like /node/123 and redirect to /articles/Something-something. This option allows you to reduce the number of orphan pages.
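As an illustration of the "Make internal links relative" option from the list above, the sketch below rewrites a URL only when its host matches the site's own domain exactly and leaves every other host, including subdomains, untouched. This is a simplified reading of the option, not Archivarix's implementation; the function name is hypothetical, and producing root-relative paths is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def make_relative(url: str, site_host: str) -> str:
    """Drop scheme and host for URLs on site_host; keep all other URLs as they are."""
    parts = urlsplit(url)
    if parts.netloc.lower() != site_host.lower():
        return url  # another domain, a subdomain, or an already relative URL
    return urlunsplit(("", "", parts.path or "/", parts.query, parts.fragment))


print(make_relative("http://domain.com/about?x=1", "domain.com"))
print(make_relative("http://blog.domain.com/post", "domain.com"))
```

Relative links keep working when the site is opened on localhost or on a test domain, which is one practical reason to enable the option.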

Advanced options

These options require a good understanding of the implications of such optimization, because they will require additional work when setting up the site on hosting.

Make an HTTPS website: Here you need to be even more careful. This option will improve the site quality, because all internal links of the site (not only in HTML, but also in CSS styles, JavaScript files and XML feeds) will use HTTPS instead of HTTP. The site will become more up-to-date, but you will need an SSL certificate for your domain. You can get one for free from Let's Encrypt, which is integrated into some popular hosting panels. But if your site worked on several subdomains, for example, a shop at eshop.domain.com, a blog at blog.domain.com and the site itself on the main domain, you will need a certificate that covers all of them. Let's Encrypt issues wildcard certificates only through DNS validation, which not every hosting panel supports, so you may have to request a certificate for each subdomain or buy a wildcard certificate. If the site used a single domain, we recommend putting in the effort and switching it to HTTPS with a free certificate from Let's Encrypt, because it will take you only a couple of clicks.

Extract structured content: This option is experimental. Along with the main restore, our system will create an additional .zip file containing files in csv, json, ndjson, xml and wxr formats. These files contain clean articles/texts without the website design, menus, etc. All images in the articles keep their original website URLs. We will release a WP plugin that makes it possible to download external images from their original location or directly from Web-Archive.

The extraction process consumes a lot of time and machine resources. We use our own deep learning solution based on neural networks. But don't expect miracles: it's still an automated extraction.

Embedded options

These options are self-explanatory, and you can't turn them off while the website is being prepared. However, they can be changed manually later on by editing parameters in our loader (index.php):

All missing pages will 301 redirect to the main page: the archive may not contain all the files that were on the original site. Because such missing pages might have valuable backlinks, it is safer to redirect those backlinks to the main page of the site than to a 404 error page. This behavior can be changed with the ARCHIVARIX_CMS_MODE parameter in index.php.

Fix 404 errors for missing images: to keep search engines from seeing 404 errors, and to avoid broken images in the design of the site, we show a 1-pixel transparent PNG image instead. This behavior can be changed with the ARCHIVARIX_MISSING_IMAGES parameter in index.php.

Fix 404 errors for missing js and css files: the same as above. It's better to show an empty text file than a 404 error page. This behavior can be changed with the ARCHIVARIX_MISSING_JS and ARCHIVARIX_MISSING_CSS parameters in index.php.

File creation time matches the time the source was saved at Archive.org: it is debatable whether this affects anything, but we decided to implement it anyway. Your web server may return this data in the Last-Modified header.

Generate an additional .json file with a complete list of urls/files/options: this file will only benefit you if you prefer to work with the JSON format and plan to create your own tools for working with Archivarix recoveries. Alongside it there is a .db file, which is an SQLite3 database.
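The fallback rules for missing files described in this list can be summarized as a small decision function. The sketch below only restates those rules in code and is not the logic of the real loader in index.php; the function name, the list of image extensions, the content types and the use of status 200 for substituted files are all assumptions.

```python
import posixpath

# Assumed set of extensions treated as images; the real loader may differ.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp", ".ico"}


def missing_file_response(path: str) -> dict:
    """Describe what to serve when a requested path is not in the restored site."""
    extension = posixpath.splitext(path)[1].lower()
    if extension in IMAGE_EXTENSIONS:
        # Missing image: substitute a 1-pixel transparent PNG.
        return {"status": 200, "content_type": "image/png", "body": "1x1 transparent PNG"}
    if extension == ".js":
        # Missing script: substitute an empty file.
        return {"status": 200, "content_type": "application/javascript", "body": ""}
    if extension == ".css":
        # Missing stylesheet: substitute an empty file.
        return {"status": 200, "content_type": "text/css", "body": ""}
    # Any other missing page: send visitors and backlinks to the main page.
    return {"status": 301, "location": "/"}


print(missing_file_response("/img/logo.PNG"))
print(missing_file_response("/old/article.html"))
```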

Options for Live Download

Besides restores from Web-Archive, we can download live websites. The Download website form has additional settings.

Files limit: You can limit the number of files our system downloads. If you are not signed in or your balance is $0, you can set a limit of up to 200 files, which are free. If you have funds on your balance, you can set a value up to what your available balance covers. Prices are the same as for Wayback Machine restores.

A frequent question: what number should I set to download website X completely? Answer: unfortunately, it's technically impossible to find out without downloading the whole website first. Be guided by how much you are willing to spend on downloading the site and, if necessary, top up your balance so that you can set the right amount. When you start the download, the amount is blocked on your balance; if the whole site is downloaded and the final file count is smaller, the difference is credited back to your balance. If the download stops at the limit and the site is not fully saved, there is no way to continue the download. It is only possible to start a new download with a new limit. Therefore, we recommend initially setting a limit higher than what you expect from the site.

Page depth: The number of clicks from the main page that our crawler will make. If the site was originally built with SEO in mind, then level 6 is enough. The maximum depth is 10. We do not allow setting this parameter higher or removing the restriction, because if we start downloading a website where new links with content are generated on the fly, such a site could be downloaded indefinitely and is unlikely to have any real value for you.

Different server IP: For cases when the domain has expired but the original hosting still serves the website and you know its IP. Our crawler will make all requests to the IPv4 address you set.

Download with all subdomains: We highly recommend turning this option on. Don't forget that www is also a subdomain.

Referrer to bypass cloaking: Some doorways show a website only if a user comes from a search engine. Here you can write a URL, for example, https://google.com/, so that our spider emulates clicks to a site from a search engine or another site.

Download site via Tor network: Allows you to download DarkNet .onion websites. If you enable this option to download a normal site, our spider's visits will come from different Tor IP addresses. This is not always good, because many hosting providers or services such as CloudFlare can restrict access to the site or show a captcha to Tor IP addresses.

Download website as: Allows you to specify the User-agent from the possible options. It may come in handy in case you need a mobile version of the site, or the site shows the content only to search engine spiders.
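How the "Files limit" and "Page depth" settings above interact can be pictured as a breadth-first walk over the site's links that stops at whichever limit is reached first. The sketch below runs on a small in-memory link map instead of a real website; it is not Archivarix's crawler, the function and variable names are hypothetical, and counting the main page as depth 0 is an assumption.

```python
from collections import deque


def crawl_order(links: dict, start: str, max_depth: int, files_limit: int) -> list:
    """Breadth-first walk where depth is the number of clicks from the start page."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue and len(order) < files_limit:
        page, depth = queue.popleft()
        order.append(page)
        if depth == max_depth:
            continue  # do not follow links past the depth limit
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order


# A toy site: each key lists the pages it links to.
site = {"/": ["/a", "/b"], "/a": ["/a/1"], "/a/1": ["/a/1/x"], "/b": ["/"]}
print(crawl_order(site, "/", max_depth=2, files_limit=200))
print(crawl_order(site, "/", max_depth=10, files_limit=2))
```

A site that generates new links on the fly would never empty the queue, which is why both limits are needed.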

After recovery

When recovery is complete, you will receive a notification e-mail with a link to the recovery page, where you can download your zip file. On that page you will see the original file with the options you selected for recovery. The original file cannot be deleted, because it contains the data necessary to create its "clones". Originals and clones can be told apart by their icons.

A clone is a copy of your recovery with its own individual options. The clone is created from the originally downloaded data, but all of the options, except for the domain name and the timestamps, can be set anew. Generating clones is free and usually takes less time than the initial recovery.

If you need to change any of the options, for example to turn off minification of CSS and JavaScript, open the "Additional" menu in the recovery menu and press the "New options" button. When the zip file of the new clone is ready, you will receive a notification e-mail with a link to it.

We recommend deleting clones that are unsuitable for any reason, to avoid confusion and to keep within the limit of three clones per recovery. If you have created three clones and want to create a new one, you first need to remove the clones that you don't use. Please note that clones that are being processed or are in the recovery queue cannot be deleted.

There are only two simple hosting requirements: PHP 5.6 or newer, and the PDO_SQLITE extension enabled. Almost all standard modern hosting services meet them. Just upload everything from the domain name folder inside the .zip file you received to the root folder of your domain.

CMS integration

We have created our own CMS for editing restored websites. You can download the latest Archivarix CMS version here. Just upload one PHP file to the same directory as your restored website and you're ready to edit. For security reasons we recommend restricting access by password (just put your password into the ACMS_PASSWORD value) and/or by IP address (edit ACMS_ALLOWED_IPS) in the CMS source code.

Future updates

When new options appear or old ones are improved or changed, you will not have to pay for a new recovery. Simply create a new clone with the desired options, and all the new changes will apply to it just as if you had created it from scratch. Creating clones is free.

Want to know more?

We will be very pleased to talk with you and to receive your questions and ideas. Your feedback is highly important for the development of Archivarix. You can always contact us via the form in the "Contacts" section in the top menu of the site.

If you want more than just to ask a question, you are welcome to join our Slack channel dedicated to Archivarix. Access to this channel is by invitation: we want to make sure that the people who communicate there know well what a PBN is and are ready to contribute actively to the development and improvement of Archivarix. If you are confident that you want to be a part of the history of Archivarix, please contact us via the contact form and specify the e-mail address the invitation should be sent to.

We have also created an Archivarix channel on Telegram. Time will tell how much we will need this group to keep developers and webmasters in touch. It's easy to join: in the “Ask a Question” section, select the Telegram tab to view the instructions and the address.

And if anything went wrong, you spotted an error, you have questions about your site recovery, or you just want to support or criticize us, you can always reply to any notification e-mail that our system sends you. We read our e-mail every single day and will do our best to help you.