JISC-PoWR

Preservation of Web Resources: a JISC-sponsored project

The Fetish of the Digital

Posted by Marieke Guy on January 7th, 2009

Happy New Year to all our readers.

We are lucky enough to start 2009 with a guest blog post from Dr James Currall, Director of Information Strategy, IT Services & HATII Senior Research Fellow, University of Glasgow.

James has been involved with the highly successful Glasgow MPhil (now MSc) course in Information Management and Preservation since its inception, in which he teaches about the transition from storage of information on physical to digital media, information security, the role of numbers as information and a variety of other topics including risk and information management as an investment. In this latter context he was the Project Director of the espida project, which developed a sustainable business-focussed model for digital preservation. He gave a plenary talk on Web preservation at last year’s Institutional Web Management Workshop (IWMW 2008) entitled The Tangled Web is but a Fleeting Dream … but then again …, which was very well received and is available to watch on Google Video.

And I’ll pass you over to James…


A few weeks back, I was involved in a discussion about the skills required by people involved in Digital Curation. Much of that discussion was based around the DigCCurr Project, which has a long list of skills, some of which are specific to Digital Curation, but many of which are of a rather more general nature. And this set me on a dangerous course - thinking … What exactly is this ‘profession’ of digital curator that DigCCurr amongst others are trying to define?

Let us rewind to, say, the second half of the 16th century and let us suppose that you were charged by Mr Shakespeare’s publishers with curating ‘The Scottish Play’. What would you have done? What exactly is this ‘information object’? Is it the fonts, the layout, the pagination, the language, the story, the stage directions or what? In spite of the absence of the profession of ‘paper curator’ we have inherited a rich heritage. Along the way, many items will have been lost - it was always thus and, in spite of the optimistic techno-determinism of some, it always will be EVEN IN THE DIGITAL AGE. I would argue that this is all good and necessary and whilst I would mourn the passing of Algol, Reverse Polish Notation, amplifiers based on thermionic valves or chunky discrete solid state components, vinyl records, reel to reel tape and other really splendid ideas that were IMHO much better than the ‘mass market equivalents’ that replaced them, we have to discard much of our baggage as we move on.

So what is this preservation activity all about? Is it not about the preservation and curation of information, not of digits? During a session with my MSc students, we visited the Wayback Machine and had a look at the University of Glasgow Web site (you wondered when I would get on to the web, didn’t you?). The page that we selected at random was from 18th October 2000. As a web page it is rather uninteresting: when I looked at it today there was no style sheet, the graphics were all missing and it was generally rather uninspiring, but … what is interesting is the headline news story ‘Funeral of the First Minister, Donald Dewar’. For those of you furth of Scotland, Donald was a leading light in the establishment of devolution for Scotland and the first First Minister of the devolved administration in Scotland. He was a graduate of the University of Glasgow and his premature passing at the age of 63 was tragic. The news story is about ‘administrative’ details of his funeral and the passage of his cortege past the University - details of importance in relation to the history of the University and perhaps of Scotland. It is the information contained in the web pages that is of interest and importance, whilst the layout of the pages and such ‘technical’ details are of passing interest only, as the ‘container’ for that information.

So with 2008 now ended let us bury the idea that the digital needs its own ghetto and that we need to prepend ‘digital’ to everything, be it curation, preservation, art, culture, revolution, etc. Digital artifacts are the currently ‘fashionable’ containers for information and, whilst the term continues, the technologies underneath it are radically different at every turn and often require as much conversion from one to another as a paper to magnetic disc conversion. It is not the containers that are important but what they contain. The Eastern concept of ‘Pointing at the Moon’ has something to say here.

If we come to regard preservation/curation as a finger pointing to the moon, we might come to mistake the finger for the moon and never see beyond it to the moon itself.

This short clip of Bruce Lee in ‘Enter the Dragon’ (1973) captures something of this in a different context.

I am also reminded of the auditors in Terry Pratchett’s ‘Thief of Time’ who take a great painting and break it down into flakes of paint which they put in little piles of each colour and then spend time looking to see where the art has gone! These auditors are described in the Wikipedia article for Discworld thus:

The Auditors, cosmic bureaucrats who prefer a universe where electrons spin, rocks float in space and imagination is dead, represent the perils of handing yourself over to a completely materialist and deterministic vision of reality, devoid of the myths and stories that make us human.

From http://en.wikipedia.org/wiki/Discworld#Elves_and_Auditors

In 2009 we need to see digital preservation and curation as ‘last year’s model’. Of course we need to understand the importance of custody, metadata and identifiers, but above all we need to understand the centrality of the information in the artifacts that we are seeking to curate and preserve. This piece is recognisably ‘Currall’ not because of a digital signature, not because it is on his web site and not because the owners of the JISC PoWR blog say it is - it is ‘Currall’ because of its recognisably iconoclastic position, poor grammar and tortured logic - that is what needs to be preserved!

Information is the thing (even if that is hard and technology is relatively easy) - lose sight of that and the game is a bogey.

PS if you are interested in a rather more rigorous treatment of this topic you might like to access “Authenticity: a red herring?” (doi:10.1016/j.jal.2008.09.004).

Posted in Digital preservation | 1 Comment »

History of the First UK Institutional Web Service

Posted by Brian Kelly on January 6th, 2009

It was 15 years ago, the first week back at work after the Christmas break (I think) when I was part of the team which set up the Web service at the University of Leeds. This was, I believe, the UK’s first institutional Web service, with contributions made shortly afterwards from several academic departments, including not only the usual suspects (the Computing Service, Computer Science, Chemistry and Physics) but also the School of Music.

Various people at the University of Leeds were active in Web development activities back then. My role was in promoting its use (and I’ve discovered a copy of a special issue of the University Computing Service newsletter on the theme of online information services - in particular the Web - which is available on the Internet Archive). But in addition the Chemistry Department were, in conjunction with Imperial College, developing services which provided access to molecules on the Web; a colleague in the Computing Service provided access to the University Library catalogue; and Nikos Drakos, a researcher in the Computer Based Learning Unit, wrote the LaTeX2HTML conversion software (which was first announced in May 1993).

Fifteen years later my memories of our early involvement with the Web are beginning to fade. But as I knew this would happen I wrote a history of the various activities of colleagues at the University, which was published on the University’s Web site. Sadly, but perhaps inevitably, over time this resource was deleted, no doubt following a reorganisation of the Web site.

But this does not necessarily mean that the information is no longer available. As well as being an early adopter of the Web, the Computing Service also had a long-standing involvement in digital preservation. And so the file should still be available on the University’s archive service. But although the bits and bytes may still be available, what are the processes needed for this resource to be retrieved? Is this a service which the University offers? And is it a service which can be provided to a former member of staff, who left the University over 13 years ago?

As JISC PoWR project team members have commented previously, digital preservation isn’t just about the technical aspects of preserving bits in a format suitable for processing in the future - it’s also about the policies and the procedures. And I think it’s time I sent an email to my former colleagues to see if this resource can be retrieved. I’ll provide details of my experiences in a future post.

Posted in Case studies | 1 Comment »

CASPAR Training Days

Posted by Brian Kelly on January 5th, 2009

Too late to be of much use, I suspect, but just before Christmas I received an email containing details of two CASPAR (Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval) Training Days. The CASPAR Training Day for the Cultural Domain will be held on 12 January 2009 and the CASPAR Training Day for the Scientific Domain on the following day (13 January 2009).

The seminars will take place in Rome, and are free to attend. If you require further information please email:
<info@casparpreserves.eu>

Posted in Events | No Comments »

Looking Back … and Looking Forward

Posted by Brian Kelly on December 18th, 2008

In a news item entitled Preserving web resources – new advisory handbook, published on 9th December 2008 on the JISC Web site, Neil Grindley, manager of JISC’s Digital Preservation programme, described how “the JISC PoWR handbook helps institutions to identify where material of interest might exist, which elements may require long-term access and how these decisions can link into wider institutional policies”.

Neil went on to add that “The PoWR handbook recognises that preservation is not an end in itself, but that it can complement an institution’s mission, whether that be improving the quality of research, conforming with national policy or avoiding the threat of legal action. It will evolve following the practical experience of its use to ensure it remains at the forefront of best practice advice for web preservation issues”.

The JISC PoWR project has been formally completed - but the interests of the project team (UKOLN and ULCC) in the area of preservation continue. We have agreed that, for a period of time, we will continue to publish posts on this blog which are relevant to the preservation of Web resources; we will seek to publish at least 3 posts per month. Around Easter time we will review the status of this blog. As well as posts from members of the JISC PoWR project team we would also welcome guest blog posts from the community. So if you would like to write something about your interests in the area of Web site preservation please contact Marieke Guy (email M.Guy@ukoln.ac.uk).

But for now on behalf of the JISC PoWR team I’d like to wish everyone a happy and enjoyable Christmas.

Posted in Project news | No Comments »

Legal scholarship recognises long-term value of blogs

Posted by Kevin Ashley on December 16th, 2008

A recent post on the digital-preservation list indicates that at least one scholarly community has recognised the long-term scholarly value of online resources such as blogs, and the potential damage to future scholarship that might result from their loss. It draws attention to a symposium taking place at Georgetown University next year. The email says that the symposium:

…will build upon the fundamental assumption that blogs are an integral part of today’s legal scholarship.

and goes on to say:

This symposium will bring together academic bloggers, librarians, and experts in digital preservation …. Symposium participants will collectively develop innovative practices to ensure that valuable scholarship is not easily lost.

Join the conversation now by tagging items you think are relevant to this symposium with the del.icio.us tag FTLS2009.

It’s interesting to observe that this is an example of a community acting to preserve information of interest that is likely to be scattered over many institutions and none. (I suspect a fair amount of blogging in this area is done by practitioners who aren’t at an academic institution.) One of the concerns we identified in PoWR was that much material of this type was unlikely to be preserved as a result of institutional interests, unless one institution tried to bring materials like this into the remit of its special collections (and some have done this).

The conference web site goes on to say:

This unique symposium will seek answers to the questions:

1. How can quality academic scholarship reliably be discovered?
2. How can future researchers be assured of perpetual access to the information currently available in blogs?
3. How can any researcher be confident that documents posted to blogs are genuine?

The symposium will include a working group break-out session to create a uniform standard for preservation of blogs, a document to be shared by bloggers and librarians alike.

That last goal of a uniform standard for blog preservation looks like a tall order and it will be interesting to see what emerges from this group, and what its wider relevance might be. But it’s a clear demonstration of the value of web material to some research communities, and their willingness to do something about it if their institutions can’t, or won’t, help them.

Posted in Digital preservation, Legal, Events | No Comments »

When Funding Bodies Shut Down

Posted by Brian Kelly on December 15th, 2008

An email sent to the MLANORTHEAST-NEWS JISCMail list provides details of the implications of the closure of the MLA North East regional Agency on the Web services it has set up or commissioned.

The message states:

MLA North East Websites after 12th December, 2008

MLA North East over recent years has set up several websites which we have managed on behalf of the sector. This brief note is to inform you of the arrangements made for each of the sites.

www.mlanortheast.org.uk: a holding page will refer visitors to the MLA council site at www.mla.gov.uk. All other content will be taken down at 4.00pm on Friday 12th December, 2008.

www.thenortheast.com:  currently a portal to our sector’s on-line stores selling local studies material and other ephemera. The content will be taken down at 4.00pm on Friday 12th December, 2008. The domain name is now owned  by One NorthEast.

www.archivesnortheast.com: a portal to North East archives services, providing links to catalogues and paid-for professional support in researching archives. This will continue under the auspices of the North East Regional Archives Council [NERAC]. Contact Liz Rees liz.rees@twas.org.uk

www.wellinever.info: a portal to learning resources for teachers, pupils, parents and carers providing venue guides, information regarding learning visits and links to some of the sector’s regional on-line learning resources. This will continue under the auspices of Tyne & Wear Museums. Contact ian.thilthorpe@twmuseums.org.uk

www.primarysources.org.uk: basic skills resources developed by primary teachers, working alongside learning professionals from six archives in the North East region and designed for use in schools. These resources offer a fresh and engaging approach to teaching basic skills. This will continue under the auspices of Durham University. Contact andrew.preater@durham.ac.uk

www.discs-uk.info: DiSCS provides an online directory of information technology (IT) and digital services suppliers to work with the cultural and heritage sector. This site has transferred and is managed by The Collections Trust www.collectionstrust.org.uk

www.tomorrows-history.com: The regional local studies site for archives and record offices, libraries, museums, archaeology services, the region’s universities and commercial organisations. Additionally, community groups have created one hundred local history projects. This will continue. The domain name is now owned by Newcastle City Council. The site is managed by Newcastle City Library Services. Contact Kath Cassidy kath.cassidy@newcastle.gov.uk

www.oralhistorynortheast.com: The site for oral history in the North East of England. Support for individuals and organisations undertaking oral history projects, to provide focus and support and a forum for the sharing of ideas and experience. This site is closed.

I think this demonstrates good practice for organisations which have set up or commissioned Web sites and are then forced to close, for instance due to changes in Government funding and policies (as is the case with the MLA Regional Agencies).

For each Web site we are given its address and a brief summary of its purpose, details of when the site ceases operation, contact details and, in several cases, details of how the service is being continued by other organisations.

I know that the implications of the demise of our organisation for the Web services we provide aren’t something that we like to think about. But in a personal capacity, once we reach a certain age and become aware of our responsibilities to others, we do tend to make plans for what happens after we die, perhaps by making a will. So shouldn’t our organisations be making similar plans in case the organisation ceases to exist? And at a time of the credit crunch this is even more important than it used to be.

Posted in Policies | No Comments »

Library Partnership Preserves End-of-Term Government Web Sites

Posted by Brian Kelly on December 8th, 2008

The news that a Library Partnership Preserves End-of-Term Government Web Sites was announced in August 2008 (it concerns the end of George W Bush’s term of office). However, I think it’s still worth drawing attention to the article for those with an interest in the preservation of Web sites. One thing that caught my eye was the comment that:

the Internet Archive will undertake a comprehensive crawl of the .gov domain.

The article concluded with a summary of the role of the Internet Archive:

The Internet Archive is a high-tech nonprofit, founded in 1996 by Brewster Kahle as an “Internet library” to provide universal and permanent access to digital information for educators, researchers, historians, and the general public. The Internet Archive captures, stores and provides access to born-digital and digitized content, and leads the development of Heritrix, the open-source archival web crawler, used to facilitate the collection of web data for this project.

What role might the Internet Archive have in the UK, I wonder?

Posted in Policies | 1 Comment »

Future in Bits

Posted by Marieke Guy on December 3rd, 2008

The BBC News Web site has published an interesting article entitled Future in Bits, asking how the ever-changing Web can be archived, bearing in mind the dilemma posed by the malleable nature of digital information.

The article draws attention to the fact that no UK-based commercial online newspapers are currently being archived.

David Stuart, a research fellow in Web 2.0 Technologies at the University of Wolverhampton, is quoted as saying:

“The lack of an exhaustive archive of the UK web space not only risks the loss of information on web pages that are changed or taken down,” he said. “It also undermines the value of pages that link to them; the value of the web comes as much from the hyperlinks between pages as the contents of the web pages. This is especially true in the blogosphere, where so much of the content created by the public is built upon the foundations of traditional news stories.”

Jessie Owen, digital continuity project manager at the National Archives, explains that the key to archiving is preparation.

This is something the JISC PoWR handbook can offer help with.

Posted in Challenges | 1 Comment »

Managing the Crowd: Rethinking records management for the Web 2.0 world

Posted by Marieke Guy on November 19th, 2008

My review of the Steve Bailey text Managing the Crowd: Rethinking records management for the Web 2.0 world has now been published in the latest Ariadne magazine.

This text has been mentioned at PoWR workshops, on the PoWR blog and on the JISC Information Environment Team blog. I can honestly say that it has had quite an impact on my thinking with regard to preservation and Web 2.0 resources; other members of the PoWR team may agree.

As I say in the conclusion:

This book offers up much food for thought. Bailey wants to wake up and shake his community. He wants to make them see that all is not well in the records management world and that if they don’t start moving with the times then they will be pushed out of the way. He contends there is a very real possibility that records management as we know it will cease to exist; it will be outsourced.

Go on, have a read.

Posted in Future, Records management | No Comments »

JISC Study on Digital Preservation Policies

Posted by Marieke Guy on November 10th, 2008

JISC have announced the publication of a study on Digital Preservation Policies which can be downloaded in PDF format from the JISC Web site.

This study aims to provide an outline model for digital preservation policies and to analyse the role that digital preservation can play in supporting and delivering key strategies for Higher and Further Education Institutions. Although focussing on the UK Higher and Further Education sectors, the study draws widely on policy and implementations from other sectors and countries and will be of interest to those wishing to develop policy and justify investment in digital preservation within a wide range of institutions.

The study concludes that “for institutions digital preservation must be seen as ‘a means to an end’ rather than an end in itself: any digital preservation policy must be framed in terms of the key business drivers and strategies of the institution”.

Two tools have been created in the study:

1) a model/framework for digital preservation policy and implementation clauses based on examination of existing digital preservation policies;

2) a series of mappings of digital preservation to other key institutional strategies in UK universities and colleges including Research, Teaching and Learning, Information, Libraries, and Records Management.

These tools are definitely worth taking a look at if you are embarking on a Web preservation strategy.

Posted in Digital preservation | No Comments »