Author Archives: Marieke Guy

About Marieke Guy

I am a research officer in the Community and Outreach Team at UKOLN. Much of my work involves exploring Web 2.0 technologies and their relevance to the communities we work with.

Whose Responsibility is Web Resource Preservation?

It is possible that one of the reasons why so little is being done about Web resource preservation is that everybody feels it is somebody else’s responsibility. It might be very easy for us all to avoid the issue by standing back and waiting for someone else to tackle what, as we have already explained, is a very complex problem. However, taking this approach may mean that nobody does anything and we all lose out.

So whose responsibility is Web resource preservation then?

There are a number of parties who may have an interest in the preservation of Web resources. These range from international institutions down to the individual.

Web Resource Preservation: No One Ever Said It Would Be Easy….

If it was we’d all be at it!!

Any records manager or archivist will probably be able to give you half a dozen reasons why digital preservation is very important. Some might well give you half a dozen more why the preservation of Web resources in particular, which now play such a huge part in our daily lives, is very, very important.

Unfortunately, this critical activity isn’t easy. In fact, the very nature of the Web means that the preservation and archiving of Web resources is a very complex task. A few of the major issues include:

  • The transient and dynamic nature of the Web – The Web is growing at a rapid rate. The average Web resource’s lifespan is short and pages are often removed. On the Web, publishing is an easy process, and content may be changed often and not necessarily in an orderly way. Metadata is very much an afterthought. Web 2.0 content (comprising data mash-ups, blog entries, comments etc.) is even more dynamic.
  • Selection issues – Of the billions of resources out there, which ones, and which instantiation of each, should we preserve?
  • The technologies involved – The Web is dependent on technology: it uses various file formats and follows many protocols, most of which evolve quickly. The look and feel of a Web page may be determined by a number of different elements such as the code, the HTTP protocol, the user, the browser and the server. Which of these need to be preserved? Web resources are usually held on just one server, so are at greater risk of removal, yet for some resources countless copies are made. Again, which do we preserve? Web sites are held together by hypertext links, so when a site is crawled by archiving software parts of it could be omitted (for example, if they are excluded by a robots.txt file or are not actually linked to). Whole areas of the Web are held in problematic content management systems (CMSs) or behind authentication systems, and Web 2.0 applications use layered APIs, which use data in many different ways.
  • Organisational issues – How is your institution using its Web site? Is it a publication or is it a record? Is the content being managed? Who is responsible and who has ownership?
  • The legal issues – There are many IPR and data protection issues with Web content. Who owns the photos on Flickr, the comments on a blog or the details on a social networking site?

There is no easy answer! However, despite the difficulties of Web preservation, some institutions may be addressing some of these issues already. We are keen to hear examples of any approaches being taken.
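
To make the robots.txt point from the technologies issue a little more concrete, here is a minimal sketch in Python of how a well-behaved archiving crawler might decide which pages it may fetch. It uses the standard library's `urllib.robotparser`; the robots.txt rules, the example.org URLs, the `example-archiver` user agent and the `crawlable` helper are all made up for illustration and are not taken from any real archiving tool. It only covers the robots.txt case: pages that are never linked to would simply not be discovered by a crawler in the first place.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for an illustrative site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def crawlable(urls, robots_txt, user_agent="example-archiver"):
    """Split URLs into those a polite crawler may fetch and those it must skip."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed, skipped = [], []
    for url in urls:
        if parser.can_fetch(user_agent, url):
            allowed.append(url)
        else:
            skipped.append(url)
    return allowed, skipped


# Hypothetical pages on the illustrative site.
urls = [
    "http://www.example.org/index.html",
    "http://www.example.org/private/minutes.html",
]
allowed, skipped = crawlable(urls, ROBOTS_TXT)
print("would be archived:", allowed)
print("would be omitted:", skipped)
```

Under these assumed rules the page under /private/ is left out of the archive even though it is part of the site, which is exactly the kind of gap an institution would need to know about.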

Introduction: Marieke Guy

I’m Marieke Guy and I will be working on the Preservation of Web Resources project (JISC-PoWR). I am a Research Officer at UKOLN, a centre of excellence in digital information management, providing advice and services to the library, information and cultural heritage communities. More information about me is available from my staff page.