Preservation of Web Resources: a JISC-funded project [Archived Blog]

Archive for the 'Legal' Category

Legal scholarship recognises long-term value of blogs

Posted by Kevin Ashley on 16th December 2008

A recent post on the digital-preservation list indicates that at least one scholarly community has recognised the long-term scholarly value of online resources such as blogs, and the potential damage to future scholarship that might result from their loss. It draws attention to a symposium taking place at Georgetown University next year. The email says that the symposium:

…will build upon the fundamental assumption that blogs are an integral part of today’s legal scholarship.

and goes on to say:

This symposium will bring together academic bloggers, librarians, and experts in digital preservation …. Symposium participants will collectively develop innovative practices to ensure that valuable scholarship is not easily lost.

Join the conversation now by tagging items you think are relevant to this symposium with the del.icio.us tag FTLS2009.

It’s interesting to observe that this is an example of a community acting to preserve information of interest that is likely to be scattered over many institutions and none. (I suspect a fair amount of blogging in this area is done by practitioners who aren’t at an academic institution.) One of the concerns we identified in PoWR was that much material of this type was unlikely to be preserved as a result of institutional interests, unless one institution tried to bring materials like this into the remit of its special collections (and some have done this.)

The conference web site goes on to say:

This unique symposium will seek answers to the questions:

1. How can quality academic scholarship reliably be discovered?
2. How can future researchers be assured of perpetual access to the information currently available in blogs?
3. How can any researcher be confident that documents posted to blogs are genuine?

The symposium will include a working group break-out session to create a uniform standard for preservation of blogs, a document to be shared by bloggers and librarians alike.

That last goal of a uniform standard for blog preservation looks like a tall order and it will be interesting to see what emerges from this group, and what its wider relevance might be. But its a clear demonstration of the value of web material to some research communities, and their willingness to do something about it if their institutions can’t, or won’t, help them.

Posted in Digital preservation, Events, Legal | 1 Comment »

Yet more on iPres2008

Posted by Ed Pinsent on 14th October 2008

The web-archiving strand at the iPRES Conference was an opportunity for a number of National Libraries to describe their initiatives, their approaches to web capture and preservation, and the things that were (for them) the biggest hurdles. It’s interesting to me that librarians – as opposed to archivists – seem to have been first off the mark with web-archiving, although in the UK The National Archives have at least two parallel initiatives underway (UKWAC and European Archive) to capture government websites deemed to be records.

From the PoWR point of view, one thing I found useful was the summary statement from Iceland which suggests that the community is now starting to agree on three main approaches to selection – bulk/domain harvesting, thematic, and event-based. Sometimes I wonder if these approaches are too library-influenced (looking at websites primarily as publications, rather than as records, whereas in PoWR we are suggesting that HFE websites contain and function as a mixture of both), but it helped me sharpen up my thinking and I fed this idea back into the PoWR Handbook’s chapter on Selection.

Since they were all representing libraries, most of the presenters tended to focus on legal deposit (and attendant permission problems) as the biggest hurdle to gathering websites – and the one that eats up the most in terms of resources. Denmark have not resolved this completely, and although they have gathered lots of material from the web, they don’t make it widely available – only allowing controlled access for research or statistical purposes. France, on the other hand, have had legal deposit laws in place since 2006. Australia have found this such a problem (despite the strenuous efforts they have made to get the law changed) that it pretty much forced the decision to go the curatorial route. All permissions are negotiated, but the collections are shaped to a certain extent by community input.

With my archivist hat on, I sat up when we were told quite categorically that registration and cataloguing of web resources would not work, and that no-one should ever even attempt it. The BL admitted they were ‘not really collecting metadata’. If this is true, I immediately started to wonder why both PANDAS and Web Curator Tool (which I have used) have conventional Dublincore metadata elements built into their workflow? Don’t we anticipate cataloguing archived web resources in some way? I almost asked a question about this, until I reflected on the use of NutchWax and full-text indexing (which is probably acceptable until we can come up with some form of semantic tagging or automated metadata extraction for web resources).

Then I found myself reaching for the mic to ask a question about what I call ‘endangered resources’. We have already blogged about this on dablog, when my colleague Joanne Anthony raised a question about a web resource owned by a smallish institution which suddenly found itself with its funding removed. Was there anything such an Institution could do, I asked the panel, to preserve its website? And what were the pro-active steps being taken by these National Libraries to rescue or identify resources at risk? I know that UKWAC, for example, offers a public submission service on its website, although it is not very prominent or visible, nor is it quite clear what happens to requests for archiving once the form has been filled in. I received some interesting replies, including the amusing anecdote from France which suggests that their archival collections have been accessed by red-faced politicians who have accidentally deleted their own blogs. However, I still wasn’t quite sure what national initiatives exist to address what I perceive as a significant gap in the preservation of unrecognised (and therefore uncaptured) resources.

Posted in Digital preservation, Events, Legal, Selection, Technologies | 1 Comment »