What can be recovered from the Web Archive?

Published: 2020-12-29

Users sometimes ask why their website was not fully restored, or why it does not work the way they expected. There are several answers. The first is that the website is being restored from the Web Archive, so you can only restore what the archive contains and nothing more.

The Wayback Machine saves only the public-facing part of a website. It cannot save the internal structure, admin panel, database, and so on. If the website was dynamic, it will be static after being restored from the archive. Contact forms, comment boxes, and online purchase elements will not work. There are a few exceptions: features implemented entirely in JavaScript that was saved by the Web Archive may still work. Be careful with such scripts, because they often send data to or load data from third-party domains. If a script once loaded, for example, a visitor counter, then now, after the website has been restored, that same address may serve anything, including malware.

After restoring the website, we recommend using our CMS to check all external links in the JavaScript code by searching for http:// and https://, and finding out what each of them does.
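If you prefer to run the same check outside the CMS, it can be sketched in a few lines of Python. This is only a minimal illustration, not part of Archivarix: the function name find_external_urls and the sample domains are hypothetical, and the regular expression is a rough heuristic that will miss URLs assembled at runtime.

```python
import re
from urllib.parse import urlparse

# Rough heuristic: absolute http:// or https:// URLs up to a quote, space or bracket.
URL_PATTERN = re.compile(r"https?://[^\s'\"<>)\\]+", re.IGNORECASE)


def find_external_urls(script_text, site_domain):
    """Return sorted unique URLs in script_text that point outside site_domain.

    Hypothetical helper for illustration; subdomains of site_domain
    are treated as internal.
    """
    site_domain = site_domain.lower()
    external = set()
    for url in URL_PATTERN.findall(script_text):
        try:
            host = (urlparse(url).hostname or "").lower()
        except ValueError:
            # Malformed URL (for example a broken IPv6 literal): report it anyway.
            external.add(url)
            continue
        if host != site_domain and not host.endswith("." + site_domain):
            external.add(url)
    return sorted(external)


# Sample script text with one internal and one third-party URL (made-up domains).
sample_script = (
    'var lib = "https://example.com/js/lib.js";\n'
    "var counter = 'http://counter.example.net/hit.js';\n"
)
for url in find_external_urls(sample_script, "example.com"):
    print(url)
```

Every URL the function reports still needs a manual look: the point is to get a short list of third-party addresses to review, not to decide automatically which ones are safe.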

Another reason a restored website may not work as expected is that its CSS styles do not load. On some websites, styles are hosted on a different domain. Our spider script processes links from a single domain only and does not follow external links. You can easily verify this by looking in the site code for the URL where the styles are located. If it looks like https://another_domain.com/styles/main.css, it is better to download the CSS files from the Web Archive and upload them to the site manually using our CMS.
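The check for styles hosted on another domain can also be scripted. The sketch below uses only the Python standard library. The class and function names are hypothetical, and it assumes styles are attached with ordinary link rel="stylesheet" tags, so it will not see styles pulled in through @import or added by scripts.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class StylesheetCollector(HTMLParser):
    """Collects href values of <link rel="stylesheet"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "stylesheet" in rel_values and attr.get("href"):
            self.hrefs.append(attr["href"])


def external_stylesheets(html, page_url):
    """Return absolute stylesheet URLs whose host differs from page_url's host."""
    collector = StylesheetCollector()
    collector.feed(html)
    site_host = urlparse(page_url).hostname
    found = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolves relative paths like /css/site.css
        if urlparse(absolute).hostname != site_host:
            found.append(absolute)
    return found


# Sample page with one local and one external stylesheet (example.com is made up).
sample_html = (
    '<link rel="stylesheet" href="/css/site.css">'
    '<link rel="stylesheet" href="https://another_domain.com/styles/main.css">'
)
print(external_stylesheets(sample_html, "https://example.com/"))
```

Any URL in the result is a stylesheet the spider would not have followed, so it is a candidate for downloading from the Web Archive and uploading by hand.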

Finally, the third and most common cause of a restored site working incorrectly is a badly chosen recovery time interval. The date on which a site was archived on archive.org does not mean that the entire site, from start to finish, was archived at that time. In fact, its files, styles, scripts, and images were all saved at different times. Setting too narrow a time period in our system usually means that a significant part of the website will not be restored. Choosing the right timestamp is not always easy, so we have an article on how to do it: https://archivarix.com/en/blog/3-how-does-it-works-archiveorg/
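To see why a narrow interval loses files, it helps to count how many URLs have at least one capture inside the chosen window. The sketch below is an illustration under stated assumptions: the capture list is invented sample data, the function name coverage is hypothetical, and timestamps are written as 14-digit YYYYMMDDhhmmss strings, the form the Wayback Machine uses in its snapshot URLs.

```python
def coverage(captures, start, end):
    """Split URLs into those with a capture inside [start, end] and all URLs seen.

    captures: iterable of (url, timestamp) pairs, timestamps as 14-digit
    YYYYMMDDhhmmss strings. Strings of equal length in this format sort
    chronologically, so plain string comparison is enough.
    """
    all_urls = set()
    covered = set()
    for url, timestamp in captures:
        all_urls.add(url)
        if start <= timestamp <= end:
            covered.add(url)
    return covered, all_urls


# Invented sample data: three files of one site, each saved at a different time.
sample_captures = [
    ("/index.html", "20150310120000"),
    ("/css/main.css", "20141102080000"),
    ("/img/logo.png", "20160721093000"),
]

# A window of one month around the page capture versus a window of three years.
for start, end in [
    ("20150301000000", "20150331235959"),
    ("20140101000000", "20161231235959"),
]:
    covered, all_urls = coverage(sample_captures, start, end)
    print(start, end, len(covered), "of", len(all_urls), "URLs covered")
```

With this sample data the one-month window contains only the page itself, while the stylesheet and the logo fall outside it, which is exactly the situation that produces a restored site with missing styles and images.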

The use of article materials is allowed only if the link to the source is posted: https://archivarix.com/en/blog/what-can-be-recovered/
