Why Web Sites Are Lost (And How They're Sometimes Found)

Frank McCown, Catherine C. Marshall, Michael L. Nelson

Research output: Contribution to journal, Article, peer-review

Abstract

The authors discuss Warrick, a web-repository crawler they created to restore lost websites from the Internet Archive, Google, Live Search (now known as Bing), and Yahoo, collectively known as the Web Infrastructure (WI). They present the results of an online survey about lost websites and their recovery after loss. Respondents had either personally lost one of their own websites or had recovered someone else's website. The authors found that esoteric sites were being restored, and they suggest that technology to preserve digital materials will become more inclusive and seamless.

Original language: American English
Journal: Communications of the ACM
Volume: 52
State: Published - Nov 1 2009

Keywords

  • Information retrieval
  • Web archiving

Disciplines

  • Computer Sciences
