JISC-PoWR

Preservation of Web Resources: a JISC-sponsored project

Archive for September, 2009

The digital media collection +100 years

Posted by Marieke Guy on 16th September 2009

As part of the JISC ITT Workshops & Seminars: Achievements & Challenges in Digitisation & e-Content strand, JISC Digital Media have hosted two free seminars focusing on key topics for individuals involved with digital media. Today I attended the second of these, entitled The digital media collection +100 years.

Obsolescence, deterioration of physical storage media or withdrawal of institutional support: just what will prove to be the greatest threat to the materials we digitise today? This seminar projects one hundred years into the future and attempts to predict the future ‘preservability’ of what we digitise today. This seminar will examine changing user demands and inevitable developments in technology.

Panel Session

After a brief opening from Dave Kilbey of JISC Digital Media, the scene-setting introduction was given by Dr William Kilbride, Executive Director of the Digital Preservation Coalition.

The Preservation Landscape

As well as the more conventional look at the key issues (the volumes of data available, the complexities and complicated requirements of this data, teamed with rising public expectations), William gave a really interesting talk on the path of literacy. He demonstrated through the Stroop interference test how, once we can read and write, we tend to process this information quicker than image information. The result is that literate cultures tend to be hegemonic through discursive power. His point was that the consequences of our work are not inevitable or neutral: digitisation is a social practice that can be used for good and for ill. After this slight aside William ran us through some of the main challenges, which include obsolescence of technologies, correct configuration of hardware, software and operators, and the need for a constantly managed service. He ended with a few ‘answers’ from a survey of recent JISC digitisation projects. When asked how long their resources were to be available, answers varied from “perpetuity” to “forever or three years”. He concluded that digital preservation is possible, but our legacy will be what we make of it and cannot be taken for granted.

The Camera Raw format and preservation

Nigel Goldsmith, a photographer working for JISC Digital Media, gave a quick run-through of the possibilities of using the Raw camera format. Raw offers the photographer greater control over the processing of their images, but this flexibility comes at a price: Raw is a proprietary format which requires specialist applications to view. Nigel’s suggestion was to archive Raw but to keep it alongside another format, possibly TIFF or JPEG 2000.

Preservation Metadata Initiatives and Standards

After coffee, Getaneh Alemu from the Humanities Computing Department at the University of Portsmouth gave us a whirlwind tour of state-of-the-art metadata standards and how metadata can help ensure the integrity, identity and authenticity of digital documents. His overview included a look at the OAIS, NLA PANDORA, CEDARS, NEDLIB, LMER, PREMIS and METS metadata initiatives and standards. He concluded that at the moment preservation metadata formats tend to have element naming issues that descriptive metadata initiatives don’t tend to have.

The challenges of archiving computer games and other multipart digital interactives

After lunch Tom Woolley from the National Media Museum talked about some of the digital media preservation issues they are tackling on-site at the museum. The museum is involved in a number of initiatives that aim to let visitors ‘have a go’ at old games and old internet environments. The tricky dilemma is giving users a taster of old games in a cost-effective way; actually using original kit (like ZX Spectrums) would have a heavy cost attached. The key is often emulation. The museum also try to capture the context of games by capturing fan information, discussion forums, FAQs etc. Tom was followed by James Newman from Bath Spa University, who works with Tom on the National Video Game Archive.

James talked about one of the biggest challenges of video game archiving: supersession. Within the gaming world there is a tendency to be always looking for the ‘next big game’ which has resulted in an environment where games creators don’t value old games. Although there is a niche market for retro games, gaming is an area where the experience is almost completely associated with the technology, making archiving very difficult.

The importance of collaboration

Simon Tanner, director of King’s Digital Consultancy Services, focused on institutional preservation and the importance of collaboration in sustainability. He started off by saying that one of the biggest challenges is that we may run out of the minerals to make microchips. He later played on the climate issue again by saying that he currently saw digital preservation as sitting in the same space as climate change: people viewed it as potentially a terrible thing (the loss of digital objects), but currently it does not impact on individuals, so it remains low on the priority list. Simon pointed out that sustainability of resources was becoming a mandate but remains an unfunded mandate. The way to deal with this was through the ecology of collaboration, both within your institution and outside.

A Poisoned Chalice? Accepting Responsibility for Sustainable Access

The day concluded with a talk from Neil Grindley, JISC Programme Manager for Digital Preservation. Neil pointed out that ensuring that an organisation’s digital assets are safe, secure and accessible for the long term should (in theory) be an interesting, responsible and useful role for anyone in an organisation to accept. The critical importance of digital assets, the ubiquity of digital methods and the need for people in all walks of life to have effective means to refer to persistent sources of data reinforce this notion. How is it then that long-term asset management, information lifecycle management, data curation, digital preservation (call it what you will) is often regarded as a peripheral specialist activity that is difficult to resource, complex to carry out, and delivers benefits that are, at best, simply an insurance policy rather than an activity that adds value to an organisation? Neil’s presentation examined the importance of defining clear roles for those involved with digital preservation and considered the importance of associating this professional activity with strategic and tactical frameworks. He advocated the need for allocation of responsibility and internal preservation policies. JISC has spent 6 million in the digital preservation arena between 2005 and 2009, yet there is still work to be done. He concluded by pointing out the need for human judgement when deciding what to keep and predicted that in the future digital preservation will be integrated with administration departments, have better tools and will take more terms from the cultural heritage area.

After Neil’s talk there was a panel session and time for questions; unfortunately, I had to leave to make the difficult drive home through rush hour traffic!

The day was an interesting one. Although the talks were a real mixed bag, they all offered constructive steps forward to make today’s digital media collection something that we may be able to access and use 100 years on.

Posted in Events | 1 Comment »

Why you can sometimes leave it to the University

Posted by Ed Pinsent on 8th September 2009

“Does anyone have any positive experiences to share?”, asks Brian in a recent post. Well, I have - except it’s not in the UK. Harvard University Library in the USA have recently put Harvard WAX (the Web Archive Collection Service) live, after a pilot project which began in July 2006.

Harvard WAX includes themed collections on Women’s Voices and Constitutional Revision in Japan, but of particular interest to us in PoWR is their A-Sites collection: the semi-annual captures of selected Harvard websites. “The Harvard University Archives is charged with collecting and preserving the historical records of the University,” state the curators, recognising their formal archival function in this regard. “Much of the information collected for centuries in paper form now resides on University web sites.”

Helen Hockx-Yu of the British Library met with the WAX team in May 2009. “I was impressed with many of the features of the system,” she said, “not just the user and web curator interfaces but also some of the architectural decisions. WAX is a service offered by the Library to all Harvard departments and colleges. In exchange for a fee, the Departments use the system to build their collections. The academics may not be involved with the actual crawling of websites, but spend time QAing and curating the websites, and can to some extent decide how the archive targets appear in the Access Tool. The QAed sites are submitted directly into Harvard’s institutional repository.”

It is very encouraging to read of this participatory dimension to the project, indicating how success depends on the active involvement of the creators of the resources. Already 48 Harvard websites have been put into the collection, representing Departments, Committees, Schools, Libraries, Museums, and educational programmes.

The delivery of the resources has many good features also; there’s an unobtrusive header element which lets the user know they’re looking at an archived instance (instead of the live website). There’s a link explaining why the site was added to the collection, and contextual information about the wider collection. Another useful link allows researchers, scholars and other users to cite the resource; it’s good to see this automated feature integrated directly within the site. The Terms of Use page addresses a lot of current concerns about republishing web resources, and strikes just the right balance between protecting the interests of Harvard and providing a service to its users. Like a good OAIS-compliant repository, they are perfectly clear about who their designated user community are.

Best of all, they provide a working full-text search engine for the entire collection, something that many other web archive collections have been struggling to achieve.

The collection is tightly scoped, and takes account of ongoing developments for born-digital materials: “Collection managers, working in the online environment, must continue to acquire the content that they have always collected physically. With blogs supplanting diaries, e-mail supplanting traditional correspondence, and HTML materials supplanting many forms of print collateral, collection managers have grown increasingly concerned about potential gaps in the documentation of our cultural heritage.” The project has clear ownership (it is supported by the University Library’s central infrastructure), and it built its way up from a pilot project in less than three years. Their success was partially due to having a clear brief from the outset, and through collaboration with three University partners. What Harvard have done chimes in with many of the recommendations and suggestions made in the PoWR Handbook, particularly Chapters 5 (Selection), 16 (Responsibility for preservation of web resources) and 19 (How can you effect change?)

There are many aspects of this project which UK Institutions could observe, and perhaps learn something from. It shows that it is both possible and practical to embed website collection and preservation within an Institution.

Posted in Selection, Policies, Records management, Preservation, Resources | 1 Comment »

Survey: How successful has Records Management been?

Posted by Marieke Guy on 4th September 2009

As part of his dissertation at Aberystwyth University Andrew Brown is undertaking a research project which aims to determine how successful Records Management has been in the UK by asking Records Managers for their perceptions of Records Management in their organisation and the profession as a whole. He is attempting to quantify this ‘success’ and would be very grateful if record managers could take the time to complete the survey, which will take approximately 10-15 minutes.

It is hoped that this study will generate some stimulating debate on this matter and lead to a greater understanding of the current and future state of the Records Management profession in the UK where digital and Web preservation may be key.

Please access the survey at the following link.

The survey closes at midnight on 5th September.

Posted in Records management | No Comments »