Web Archiving FAQs

What is “web archiving”?
Web archiving is the process of collecting portions of the World Wide Web, preserving the collections in an archival format, and then serving the archives for access and use.

Why is Archives and Special Collections archiving websites?
The World Wide Web is an important communication medium for news, scholarship, social history, and cultural heritage. However, web pages are ephemeral objects: they are constantly updated, replaced, or lost. According to the Internet Archive, the average lifespan of a web page has been calculated at between 44 and 75 days. Information that is not actively archived can easily be lost forever. By preserving these websites, we ensure that their content remains available to students, researchers, scholars, and the public at large.

How are websites selected for inclusion in the web archiving initiative?
This initiative aims to preserve websites belonging to the Ball State University community; businesses, organizations, clubs, religious groups, government entities, and educational institutions in Muncie and Delaware County that have had a significant impact on the history of the area; and those dedicated to the history of the built environment in Indiana.

Once Archives and Special Collections staff identify a website for inclusion in the initiative based upon the above criteria, we actively seek permission from the owner(s) of the website to collect it.

How frequently are websites collected?
Our goal is to document how websites change over time. For that reason, our first collection of a website is as extensive as possible, so that we capture as complete a picture of the site as we can. Afterwards, we collect each website at one of several frequencies (one time, monthly, quarterly, semi-annually, or annually), depending on the site and on decisions made when it was identified for inclusion in the initiative. Those decisions are continually re-evaluated, and collection frequencies can change over time.

What tools do you use to collect the websites?
We use the Archive-It service from the Internet Archive to collect, catalog, and preserve collections of digital content. This service uses a “web crawler” to harvest each website, and it allows us to offer you 24/7 access to the archived websites along with full-text search (available one week after a crawl).

Who owns the content included in the web archives collections?
Copyright ownership remains with the owner(s) listed on a website and is governed by any applicable laws and regulations. We do not assume responsibility for the accuracy or lawfulness of an archived website or its contents. We encourage users to review the archived website’s terms of use before using any information or material found there.

How does one cite an archived website?
Citations must credit the authors or publishers of works, so standard citation guidelines for websites should be used.

On its website, the Internet Archive provides the following example for citing archived websites in MLA format (additional FAQs are available on the Internet Archive's website):

“… We asked MLA to help us with how to cite an archived URL in correct format. They did say that there is no established format for resources like the Wayback Machine, but it's best to err on the side of more information. You should cite the webpage as you would normally, and then give the Wayback Machine information.

They provided the following example: McDonald, R. C. "Basic Canary Care." _Robirda Online_. 12 Sept. 2004. 18 Dec. 2006 [http://www.robirda.com/cancare.html]. _Internet Archive_. [http://web.archive.org/web/20041009202820/http://www.robirda.com/cancare.html].

They added that if the date that the information was updated is missing, one can use the closest date in the Wayback Machine. Then comes the date when the page is retrieved and the original URL. Neither URL should be underlined in the bibliography itself...”
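
For readers who assemble citations programmatically, the pattern quoted above can be sketched in Python. This is only an illustration, not an official tool of the Internet Archive or MLA: the helper names `split_wayback_url` and `mla_citation` are hypothetical, and the sketch assumes a Wayback Machine capture URL of the form shown in the example (a 14-digit timestamp followed by the original URL).

```python
import re
from datetime import datetime

# Assumed shape of a capture URL, taken from the example above:
# http://web.archive.org/web/<14-digit timestamp>/<original URL>
WAYBACK_PATTERN = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def split_wayback_url(archived_url):
    """Return (capture datetime, original URL) for a capture URL (hypothetical helper)."""
    match = WAYBACK_PATTERN.match(archived_url)
    if match is None:
        raise ValueError("not a Wayback Machine capture URL: " + archived_url)
    captured = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured, match.group(2)


def mla_citation(author, title, site, updated, retrieved, archived_url):
    """Assemble the citation in the same order as the quoted example (hypothetical helper)."""
    _, original_url = split_wayback_url(archived_url)
    return (
        f'{author} "{title}." _{site}_. {updated}. {retrieved} '
        f"[{original_url}]. _Internet Archive_. [{archived_url}]."
    )


if __name__ == "__main__":
    archived = "http://web.archive.org/web/20041009202820/http://www.robirda.com/cancare.html"
    print(mla_citation("McDonald, R. C.", "Basic Canary Care", "Robirda Online",
                       "12 Sept. 2004", "18 Dec. 2006", archived))
```

If the date the page was last updated is missing, the capture date returned by `split_wayback_url` can stand in for it, following the guidance quoted above.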

What if I have more questions?
Please feel free to contact us at libarchives@bsu.edu.