A framework for describing web repositories

Frank McCown, Michael L. Nelson. A framework for describing web repositories. In Fred Heath, Mary Lynn Rice-Lively, Richard Furuta, editors, Proceedings of the 2009 Joint International Conference on Digital Libraries, JCDL 2009, Austin, TX, USA, June 15-19, 2009. pages 341-344, ACM, 2009.

Abstract

In prior work we have demonstrated that search engine caches and archiving projects like the Internet Archive’s Wayback Machine can be used to “lazily preserve” websites and reconstruct them when they are lost. We use the term “web repositories” for collections of automatically refreshed and migrated content, and collectively we refer to these repositories as the “web infrastructure”. In this paper we present a framework for describing web repositories and the status of web resources in them. This includes an abstract API for web repository interaction, the concepts of deep vs. flat and light/dark/grey repositories, and terminology for describing the recoverability of a web resource. Our API may serve as a foundation for future web repository interfaces.