Category Archives: Web 1.0

Making any Upgrades to your Blog Sir?

This blog is hosted by JISC Involve who provide blogs for the JISC community.

Until recently JISC Involve was running on an old version of WordPress (1.2.5). Earlier this month the JISC Digital Communications Team upgraded their server to the latest version of WordPress (2.9.2) and then migrated all the JISC Involve blogs over to the new installation.

Although all blog posts, comments, attachments, user accounts, permissions and customisations were supposed to move over easily, JISC Involve users were encouraged to back up the content of drafts etc. ‘just in case’.

Unfortunately there were some technical problems migrating the content and as a consequence the original theme was lost and URLs now redirect.

Luckily the JISC PoWR team were able to locate the original theme and reinstall it.

However the process has made them aware of the need to record details of the technical components and architecture of the blog. This information can be critical in a migration process and when ‘closing down’ a blog.

The JISC PoWR team will ensure that such information is routinely recorded.

Is there any other information that is important for preservation or migration purposes?

“A Fifth Of BBC Sites Are Already Dead”

The Paid:Content:UK blog has recently published an article which informs us that “A Fifth Of BBC Sites Are Already Dead”. The article begins by announcing that “Nearly half of the websites most likely to be closed as part of its big Strategic Review have already long been shut, some for as much as eight years”.

The article lists a number of the sites which have been ‘mothballed’. Some of the sites are for programmes that have ceased broadcasting (e.g. On The Record) and others are for events which are now over (e.g. Politics ‘97).

I was particularly interested to read about the BBC policies regarding the decommissioning of such Web sites. The article provides a link to the BBC’s policy, which explains that inactive pages are left online for reference as “We don’t want to delete pages which users may have bookmarked or linked to in other ways. In general, our policy is only to remove pages where the information provided has become so outdated that it may lead to actual harm or damage.”

With the promise of large cuts for public sector organisations in the offing after the general election I suspect that we will find Web sites in many higher education organisations being decommissioned. But will content simply be deleted, will the content be left ‘as is’ or will a more managed approach to such decommissioning take place?

I feel there will be a renewed interest in the decommissioning of Web sites. I hope the JISC PoWR’s Handbook on the Preservation of Web Resources will be of interest to organisations which find themselves in this position.

Official Launch of the UK Web Archive

The British Library has officially launched the UK Web Archive, offering access in perpetuity to thousands of UK websites for generations of researchers.

The site was unveiled earlier this week by the Minister for Culture and Tourism, the Rt Hon Margaret Hodge MBE MP, and the Chief Executive of the British Library, Dame Lynne Brindley. The project demonstrates the importance and value of the nation’s digital memory.

Websites included in the UK Web Archive include:

  • The Credit Crunch – initiated in July 2008, this collection contains records of high-street victims of the recession – including Woolworths and Zavvi.
  • Antony Gormley’s ‘One & Other’ Trafalgar Square Fourth Plinth Project – involving 2,400 participants and streamed live by Sky Arts over the web to an audience of millions, this site will no longer exist online from March 2010.
  • 2010 General Election – work has started to preserve the websites of MPs such as Derek Wyatt, who will be retiring at the next election, creating a permanent record of his time as a Member of Parliament.

This important research resource has been developed in partnership with the National Library of Wales, JISC and the Wellcome Library, as well as technology partners such as IBM.

British Library Chief Executive, Dame Lynne Brindley said:

“Since 2004 the British Library has led the UK Web Archive in its mission to archive a record of the major cultural and social issues being discussed online. Throughout the project the Library has worked directly with copyright holders to capture and preserve over 6,000 carefully selected websites, helping to avoid the creation of a ‘digital black hole’ in the nation’s memory.

“Limited by the existing legal position, at the current rate it will be feasible to collect just 1% of all free UK websites by 2011. We hope the current DCMS consultation will enact the 2003 Legal Deposit Libraries Act and extend the provision of legal deposit through regulation to cover freely available UK websites, providing regular snapshots of the free UK web domain for the benefit of future research.”

Further details are available from the British Library.

The Demise of Geocities – But a Renewed Interest in Web Site Archeology

An article published today on the Guardian Technology Web site entitled “Geocities: dead but not lost” describes how Geocities, which was founded in 1994 and was at one stage the third most-browsed site on the web, is now dead.

We discussed Yahoo’s announcement that the Geocities service was to be shut down some time ago in a post entitled ““Seething With Anger” at the Demise of Geocities”. What I find interesting in the article is the information that “… there’s the real effort, by the Archive Team, who have been trying to archive as many Geocities pages and sites as they could”.

I’d not come across the Archive Team wiki before. They describe themselves as a “project composed of volunteers, currently coordinated by Jason Scott” which invites:

  • Writers, who can create clear essays and instructions for archivists and concerned parties.
  • People with Lots of Hosted Disk Space who have a proper hosted webserver and fat pipe, who are willing (when asked) to consider hosting mirrored dead sites or archives.
  • People who love setting up torrents who can do the same as the mirror folks, but do so hosting torrents.
  • OCD-rich individuals who want to download things who will respond to our alerts and call outs and download entire sites or diagnose ways to get at obfuscated data.

The wiki home page informs us that “This website is intended to be an offloading point and information depot for a number of archiving projects, all related to saving websites or data that is in danger of being lost. Besides serving as a hub for team-based pulling down and mirroring of data, this site will provide advice on managing your own data and rescuing it from the brink of destruction.”

Hmm. I wonder how effective a volunteer organisation is likely to be? My initial thoughts were fairly sceptical, but other volunteer-led initiatives, such as Wikipedia, do seem to be successful. What are your thoughts?

What’s the average lifespan of a Web page?

…or is it easier to ask how long is a piece of string?

The statistic much bandied about (for Web pages not pieces of string!) is 44 days, believed to originate in an article by Brewster Kahle (of Internet Archive fame) published in 1997 and titled Preserving the Internet. Brewster’s original quote is specifically about URLs: “…estimates put the average lifetime for a URL at 44 days.”

Whether this figure still stands today is a matter currently being discussed on the CURATORS@LIST.NETPRESERVE.ORG list after a query from Abigail Grotke of the Library of Congress.

Abbie offered up the 44 day statistic and pointed out that on the Digital Preservation Web site they have a graphic that discusses Web volatility stating “44% of the sites available on the internet in 1998 had vanished one year later“.
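The two figures above measure different things: one is an average lifetime for a URL, the other a fraction of sites lost over a year. As a rough way of putting them on the same footing, the sketch below converts a "fraction lost per period" into a half-life and a mean lifetime. It assumes that sites vanish at a constant rate (simple exponential decay); that assumption is ours rather than anything claimed by the sources, and the function names are purely illustrative.

```python
import math


def loss_rate_per_day(fraction_lost, period_days=365.0):
    """Daily loss rate implied by losing `fraction_lost` of items in one period.

    Assumes a constant loss rate (exponential decay), which is a simplification.
    """
    if not 0.0 < fraction_lost < 1.0:
        raise ValueError("fraction_lost must be strictly between 0 and 1")
    return -math.log(1.0 - fraction_lost) / period_days


def half_life_days(fraction_lost, period_days=365.0):
    """Days until half of the original items have vanished."""
    return math.log(2.0) / loss_rate_per_day(fraction_lost, period_days)


def mean_lifetime_days(fraction_lost, period_days=365.0):
    """Average lifetime of an item under the same constant-rate assumption."""
    return 1.0 / loss_rate_per_day(fraction_lost, period_days)
```

Calling `half_life_days(0.44)` for the "44% vanished within a year" figure gives a half-life of a little over a year, far longer than a 44 day average for a URL, which suggests the two statistics should not be treated as interchangeable.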

The other figure often cited is 75 days, from Michael Day’s report Collecting and preserving the world wide web:

The dynamic nature of the Web means that pages and whole sites are continually evolving, meaning that pages are frequently changed or deleted. Alexa Internet once estimated that Web pages disappear after an average time of 75 days. (Lawrence, et al., 2001, p. 30).

Another figure sometimes suggested is 100 days. This seems to come from Rick Weiss’s article for The Washington Post, Washington, DC, 24 November 2003, On the Web, Research Work Proves Ephemeral – no longer available.

So what is the average lifespan of a Web page today? Is it getting shorter or longer? The Internet Archive now gives 44-75 days as its ballpark figure. I’d hazard a guess that with the rise in use of Web 2.0 technologies the Web is actually getting more transient by the day.

Is this OK?

Maybe if it’s just a tweet you sent your friend; however, if it’s something more substantial that’s disappearing then it’s a real worry.

New Study – Web Archives: Now and in the Future

A news item on The National Archives Web site has recently announced a new study on “Web Archives: Now and in the Future“. This study, which is funded by the JISC and will take place in collaboration with the UK Web Archiving Consortium, will look into how archived Web sites are collected and made available to users.

The study aims to:

  • Investigate how UK Web archives are delivered to users now, and how they might be delivered in the future
  • Define the long-term historical and research value of online content in the UK
  • Look at different organisations that collect Web archives, and their interests

The study will run until late July 2009, and the results will be published on The National Archives and UK Web Archiving Consortium Web sites in August 2009.

We’ll publish details on the availability of the study once it is released.

“Seething With Anger” at the Demise of Geocities

A blog post entitled “The Death and Life of Geocities” has been published recently on the Adactio blog by Jeremy Keith, a Web developer living and working in Brighton, England. In the post Jeremy describes how he is “seething with anger” but then goes on to add that “I hope I can tap into that anger to do something productive“. The reason for the anger is his concern that “Yahoo are planning to destroy their Geocities property. All those URLs, all that content, all those memories will be lost …like tears in the rain“.

Although in an update to his post Jeremy does admit that “no data has been destroyed yet; no links have rotted” and that his “toys-from-pram-throwage may yet prove to be completely unfounded” Jeremy is right to raise concerns regarding the recent announcement that “Yahoo [is] to shut down GeoCities“.

Some people, as illustrated by JR Raphael’s article in PC World entitled “So Long, GeoCities: We Forgot You Still Existed”, are not losing any sleep over GeoCities’ demise, whilst others, such as the Online Lunchpail blog, feel that “the demise of GeoCities … proves my point that the U.S. government never should have approved the takeover of GeoCities by Yahoo!”.

From my perspective I feel that the concerns raised by Jeremy Keith (who, it should be pointed out, is a professional Web developer) will become more widely appreciated as ordinary Web users, who might have used the first generation of public-facing Web-hosting services such as GeoCities for their initial simple Web development activities, realise that there may be sentimental attachments to one’s early work – just as I regret having lost my scrap book from primary school (I remember writing “When I grow up I want to be a Beatle, sing ‘She loves you, yer, yer, yer’ and earn £100 a week”). And what of the social historians – have we lost our cultural memories of the initial take-up of the Web outside of the universities and business sector?

In a blog post by Jason Scott on the ASCII  “weblog of computer history, punditry and trivia” Jason describes the efforts being made to preserve content published on GeoCities. But Jason admits that

“I can’t do this alone. I’m going to be pulling data from these twitching, blood-in-mouth websites for weeks, in the background. I could use help, even if we end up being redundant. More is better. We’re in #archiveteam on EFnet. Stop by. Bring bandwidth and disks. Help me save Geocities. Not because we love it. We hate it. But if you only save the things you love, your archive is a very poor reflection indeed.”

What is to be done? Should the digital preservation for the general public’s digital heritage (as opposed to an institutional digital heritage) be left to volunteers? Or will future generations regard us as having failed in our responsibilities as previous generations failed to preserve the built environment and left us with the soulless shopping centres and high-rise building which were developed during the 1960s?

“Your List Will Be Closed In One Week’s Time”

The dangers of reliance on externally-hosted Web 2.0 services have been mentioned previously. And there have been recent incidents in which companies have given a short period of notice of impending closure of services, with users having little time to migrate their data to alternative providers. A recent article in The Guardian (Thursday 2 April 2009) entitled “Can I assume that my online data is safe for ever?” addressed such concerns in discussing the closure of the Filefront.com service, which gave its users just 5 days to migrate their data.

Coincidentally I recently received the following email from a service I subscribe to:

Our previous request to you to provide a new owner for the  list has not produced a response.  Therefore, we assume the list is no longer useful and aim to close it in one week’s time.
We would be happy to provide a zipped copy of the archives and any files on deletion of the list, should they be required.

In this case it appears that the service has been little used for over a year. And yet what if useful information is still available on the service? Is a week’s notice enough for users of the service to consider the implications of this decision, identify appropriate solutions and then implement them? And let’s not forget that this email was sent outside of term time when researchers could be away.

The email did not make it clear if data was to be deleted, the service was to continue to be made available in a read-only mode or the interface to the data hidden – all possible solutions if it is felt necessary for a little-used service to be withdrawn.

There’s still a need to establish the best practices when Web-based interfaces to services are to be removed, I feel. And such issues do not just affect the third party services outside of our community.

Who Should Preserve The Web?

Members of the JISC PoWR Team will be participating at next week’s JISC conference, which takes place in Edinburgh on 24th March 2009.

In the session, entitled “Who should preserve the web?” a panel will

“Outline the key issues with archiving and preserving the web and will describe practical ways of approaching these issues. Looking at the international picture and the role of major consortia working in this area, the session will also offer practical advice from the JISC Preservation of Web Resources (PoWR) project on the institutional benefits of preserving web resources, what tools and processes are needed, and how a records management approach may be appropriate.”

If you are attending the conference we hope you will attend the session and participate in the discussions. If you are attending one of the other parallel sessions you can meet the UKOLN members of the JISC PoWR team at the UKOLN stand. And if you haven’t booked a place at the conference (which is now fully subscribed) feel free to participate in the discussions on the online forum.

TASI Is No More! Welcome To JISC Digital Media

The JISC-funded TASI (Technical Advisory Service for Images) is no more. This service, which is based at ILRT, University of Bristol, has been reborn as JISC Digital Media, with an expanded remit for supporting digital media in general and not just images, which was the focus of the TASI service. Further information is available on the JISC Web site.

This change has been accompanied by a new domain name – http://www.jiscdigitalmedia.ac.uk/ rather than http://www.tasi.ac.uk/.

Now the TASI service provided many useful resources on best practices for digitisation.  But what has happened to links to these resources? Will we get a 404 error message? Or, even worse, will we get a message saying the domain no longer exists?

The QA Focus briefing document on “Improving The Quality Of Digitised Images” contains a reference to a Digital Imaging Basics resource which was available at the URL <http://www.tasi.ac.uk/advice/using/basics.html>. Following the link takes you to the resource, which is now available at <http://www.jiscdigitalmedia.ac.uk/advice/using/basics.html>.

There seems to have been a simple mapping of resources from the TASI domain to the new JISC Digital Media domain. And as the original resources had ‘cool URIs’ (i.e. they had no dependencies on a specific technology such as a CMS, Java Server Pages, etc.) it was technically not a difficult task to migrate the links to the new domain.
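To illustrate why technology-neutral URIs make this kind of migration easy, here is a minimal sketch of a host-for-host mapping in which the path is carried over unchanged. It is not a description of how TASI or JISC Digital Media actually implemented their redirects; the `HOST_MAP` table and the `redirect_target` function are hypothetical names used only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table of retired hosts and the hosts that replace them.
HOST_MAP = {
    "www.tasi.ac.uk": "www.jiscdigitalmedia.ac.uk",
}


def redirect_target(old_url):
    """Return the equivalent URL on the new host, or None if the host is not mapped."""
    parts = urlsplit(old_url)
    new_host = HOST_MAP.get(parts.netloc.lower())
    if new_host is None:
        return None
    # Keep the scheme, path, query and fragment; swap only the host name.
    return urlunsplit((parts.scheme, new_host, parts.path, parts.query, parts.fragment))
```

Given the Digital Imaging Basics address on the old domain, `redirect_target` returns the same path on the new one. Had the old addresses embedded something specific to a particular CMS, a per-page lookup table would probably have been needed instead of this one-line rule.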

Well done TASI / JISC Digital Media. The challenge now is to see how long such redirects will continue to function.

JISC Advisory Services to be Closed – But Don’t Panic!

A message sent to the JISC infoNet  JISCMail (and other) lists back in November described significant changes to the structure of the JISC Advisory Services:

JISC and the Advisory Services have been looking at ways to be more agile and flexible to respond to the changing needs and demands of the further and higher education communities. The outcome of this review is to create a new company called JISC Services.

JISC infoNet, JISC Legal, JISC TechDis, Netskills, Procureweb and TASI are coming together to create JISC Services which will formally come into existence on 1 August 2009.

The aim of the new company is to create a more flexible and comprehensive source of advice, with increased opportunities for addressing new and changing needs across the community. This change is designed to ensure that our services continue to offer the internationally acclaimed advice for which they are renowned. Putting the further and higher education communities at the centre of what we do will be strengthened by working together as one company to deliver expertise and advice.

You will still be able to access all of the services you currently value via the usual channels and over the next few months the services will increasingly join together at events, on projects and in producing resources.

Find out more about the JISC Services at: http://www.jisc-services.ac.uk

I recently wrote about the closure of organisations and best practices for preserving the resources hosted on the organisational Web sites. This case is rather different – rather than closing down organisations JISC is building on the strengths of the advisory services and seeking to provide benefits to the user community by providing a more seamless interface (and remember, if the advisory services were regarded as failing to deliver a valuable service we might have expected the organisational changes to have provided an opportunity to close any lame ducks).

The challenge, from the perspective of Web site preservation, is to try to ensure that valuable resources are not lost in the merger process. I feel that this change could provide valuable lessons for the wider community – the JISC Advisory Services, after all, won’t be the last organisations to be reorganised! And let’s hope that the lessons are based on a successful migration of the Web resources, and not lessons on what can go wrong!

Heritage Records and the Changing Filter through which we View our World

At both of the JISC-PoWR workshops delegates have been keen for the project team to spell out the reasons why institutions might want to preserve Web resources. These ‘drivers’ then give fuel to their case for the funds needed to archive the institutional Web site.

The idea of ‘heritage records’ is one that is often mentioned. Using Web sites as a ‘cultural snap shot’ has the potential to be a highly useful activity.

In his interesting and functional text Managing the Crowd: Rethinking Records Management for the Web 2.0 World Steve Bailey puts forward the point that deciding what will be important in the future is a tricky business. As he explains in the section on appraisal, retention and destruction: “The passage of time inevitably changes the filter through which we view our world and assess its priorities.”

Steve gives the example of the current plethora of Web sites that offer what we might call ‘quack’ remedies for medical problems. These sites may not seem to be of great interest right now but they may be invaluable to future historians who wish to demonstrate the distrust of the medical profession exhibited in 21st century western culture.

James Currall, in his plenary talk at the recent Institutional Web Management Workshop, used the example of blog posts made by soldiers out in Iraq and Afghanistan to demonstrate the irony of modern technology: these highly informative records could easily be lost while the diaries of World War II soldiers remain accessible.

Preservation mistakes have been made aplenty in the past. The destruction of many of the BBC’s flagship programmes in the 1970s has been well documented and in 2001 the BBC launched a treasure hunt campaign to locate recordings of pre-1980 television or radio programmes. Ironically the Web site is no longer being updated, though it is still hosted on the BBC server.

So who can know what the future will bring? Which Web resources will we wish we had kept? Which student blog writer will go on to be a future prime minister or an infamous criminal? What bit of the terabytes is the most important?

As Steve Bailey points out, there is no crystal ball. It always has been, and always will be, very difficult to predict what resources may prove to be valuable to future generations.

Although this offers little recompense for those making these choices, it does at least argue the case that we do need to preserve and we need to do so soon.

RSS Feeds Of Changes To Web Pages

Lorcan Dempsey picked up on the work of the JISC PoWR project in a blog post entitled The institutional record and web archiving. Lorcan described the presentation given at the first JISC PoWR workshop by Alison Wildish and Lizzie Richmond in which they described the changes to the University of Bath printed prospectus over the lifetime of the University of Bath.  Lorcan drew parallels between this print publication and the digital environment:

The University would always have kept the print manifestation; what now to do with the web manifestation? One of the interesting changes they note over this time is the ‘rise of the logo’, and tracing changes in how the institution presents itself over time is also interesting.

In a response to Lorcan’s post Tony Hirst referenced a blog post by Michael Nolan on the Edge Hill Web Services team blog in which Michael pointed out “one [example of interesting use of RSS] that caught my eye was the University of Warwick’s recent changes feed which allows you to subscribe to find out when the homepage changes. Better still, they have this for every page in their CMS.”

An example of this can be seen for the Research page on the University of Warwick Web site. Although not normally visible to most end users who visit this page, there is a link to an RSS feed of recent changes to the page. Using tools such as the Greasemonkey RSS Panel (available for Firefox) you can view the changes, as shown below.

News feed of change on a page on the University of Warwick Web site 

In his comment on Lorcan’s blog Tony Hirst went on to suggest that “A change feed, like on a wiki, could be one way (maybe) of facilitating 1st, 2nd or 3rd party web page archiving?“. I think Tony might be right. And maybe we are seeing the University of Warwick pioneering this approach, as the feed of recent changes seems to be provided by their in-house Sitebuilder 2 software, “the University’s web publishing tool“.
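As a rough illustration of Tony’s suggestion, the sketch below reads an RSS 2.0 change feed and reports the entries that have not been seen before, which is the point at which an archiving tool might decide to take a fresh copy of the page. It uses only the Python standard library. The function name and the idea of keeping a set of already-seen identifiers are our own assumptions; this is not how Sitebuilder or any particular archiving service works.

```python
import xml.etree.ElementTree as ET


def changed_items(rss_text, seen_guids):
    """Return (title, link, pubDate) tuples for feed items not yet in seen_guids.

    seen_guids is a set that is updated in place, so calling the function
    again with the same feed text reports nothing new.
    """
    root = ET.fromstring(rss_text)
    new_items = []
    for item in root.iter("item"):
        # Fall back to the link when a feed item carries no guid element.
        guid = item.findtext("guid") or item.findtext("link") or ""
        if guid and guid not in seen_guids:
            new_items.append((
                item.findtext("title", ""),
                item.findtext("link", ""),
                item.findtext("pubDate", ""),
            ))
            seen_guids.add(guid)
    return new_items
```

A script run on a schedule could fetch the feed, pass its text to `changed_items` and hand each new link to whatever capture tool the institution uses. Fetching and storage are deliberately left out here, since those choices depend on local policy.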

Perhaps when institutions are next procuring a CMS they should be asking if vendors provide RSS feeds of changes to pages.

The History of Your Institution’s Web Site

A recent blog post by Lorcan Dempsey on “The institutional web presence again” provided a link to a page on “the history of U.Va. on the web” which provides details of 14 years’ history for the University of Virginia’s Web site from 1994-2008.

The page provides details of the Web usage statistics in the early years, with screen images shown of major changes to the home page from 1997 (unfortunately no screen images are available for the first three years of the service).

Information is provided on the people and groups responsible for the design, the changes which were made as new technologies became available, significant additional content that was added and details of awards which the site won.

This is an approach which I feel all institutions should consider taking.  And let’s start recording the history of those early years quickly, before the first generation of institutional Web managers start to retire, leave or forget the details of the institution’s Web history.

University of Virginia Web Site History

Collective Memory For Our Web Sites

I recently posted an article about the history of the University of Bath home page which included a link to a display of versions of the home page, based on data taken from the Internet Archive from 1997-2007.

Andy Powell, a former colleague of mine who used to work at the University of Bath, posted a Twitter message in response to my post in which he said:

@briankelly all pages prior to http://tinyurl.com/47pydq were mine – that’s what web design was like back then! – but all records now lost

03:56 PM June 18, 2008 from twhirl in reply to briankelly

But although formal records of the decisions made related to the home page (its design, the content, the links and the technologies used) may have been lost (or perhaps not even kept) I do wonder whether it may be possible to document such history based on anecdotal evidence from those who were either directly involved with the decision-making process or perhaps who observed the results of the decisions.

From the museums sector and the experiences of The National Archives (with the public Wiki service) we know that the general public does seem willing to provide anecdotal information on resources such as old photographs.

This approach seems to reflect some of the discussions held at the first JISC PoWR workshop. As described in Ed Pinsent’s summary of the event “there was a lot of ‘folk memory’ and anecdotal evidence, also sometimes called ‘tacit knowledge’”.

Would it be possible, I wonder, to provide access to images of an institution’s old Web pages and, through use of social networking technologies, encourage members of the institution (and perhaps the wider community) to document their recollections of the Web site?

When Web Sites Outlast Their Welcome

The JISC PoWR is concerned with ensuring that Web sites and their content don’t disappear. Right? Actually this would be to misunderstand what Web site preservation is about. Sometimes there may be a need for Web sites to be deleted. Indeed there may be dangers (both in terms of brand management and legal issues) if the content of Web sites outlasts its welcome.

Take, for example, the Web site for the National Open Centre, which is illustrated.

National Open Centre Home Page
If you visit the Web site you will find a nicely designed and easy to use Web site for the National Open Centre (NOC) which is a:

“national policy institute, a think tank to understand and articulate strategies to make effective use of Open Source Software and Open Standards (OS&S) for the benefit of all. It will focus on nationally relevant issues leading to proactive strategies to ensure that the UK effectively exploits the opportunities that arise with OS&S. The NOC will be independent, strategic and proactive and seeks the participation of interested and informed people.”

A very worthy organisation, it would seem (and I should add that I was a member of the NOC’s Advisory Group and attended the first meeting). Sadly, despite having a launch event at the Houses of Parliament, the NOC was unsuccessful in its attempts to gain funding. To paraphrase the Monty Python sketch “the NOC is not resting. The NOC is no more! It is bereft of life. It is an ex-NOC!”.

But this isn’t what you’d think if you explored the Web site. The home page urges visitors to “Get Involved!” and describes how it has “established the first set of subject panels. The topics being researched/discussed are: Public procurement, Open Standards and Open Source/Open Standard for SMEs.” The Get Involved page then encourages visitors to participate with the NOC in a number of ways, including joining the Advisory Board, Subject Panels or the NOC Community. The only subtle indications that the NOC is no longer operational are the dates on various pages (2006 or 2007) and the broken link to the NOC’s wiki from the Get Involved page.

The failure to provide any indication that the NOC failed to receive funding may be embarrassing to the partners of the service, which are listed on the home page. But as well as such possible embarrassment, what would happen if visitors arrived at the Events page and read details of the one-day event on Document Standards planned for 4 July, which is illustrated below?

National Open Centre Home Page

There is no indication that this refers to an event which was planned for 4 July 2007. And there are no details about registration, although a location for the event is given (NCC offices, London). What might happen if someone travels to London to attend the workshop (which covers interesting aspects related to open document formats, with apparent participation from companies such as Microsoft)? If this happened, I’m sure the potential participants would be pretty upset to discover that the NOC folded last year.

This is, I would agree, unlikely to happen. But what if the information about the event had been held on one of the NOC’s partner organisations, such as Birmingham City Council?

This example is taken from the wider public sector. But within the higher and further education sector, with short term project funding provided for much development work, institutions may find themselves in a similar situation, with the intentions of a project team failing to be realised due to a failure to win funding, and perhaps a loss of project staff.

How should this possible scenario be addressed? This is something to be addressed in future posts, but for now your comments and suggestions would be welcomed.

Seeing Eye to Eye: Web Managers and Records Managers

The technological and cultural changes brought about by the advancement of the Web have, on numerous occasions, required co-ordinated interdisciplinary work. One of the intended aims of the JISC-PoWR project is to help to bring together the differing perspectives of information professionals such as records managers and Web managers in the context of the preservation of Web resources – and there are probably at least four sets of expertise involved: Web content creation (as perceived by Web authors), Web content management from a technical perspective (as perceived by those who choose or configure the underlying software), records and/or information management, and digital preservation. So there is the bringing together of intellectual perspectives (What content needs to be preserved? How long for? Who is responsible?) and there are the technical perspectives, assuming that the above questions come up with anything that needs preserving (How do we do it? Are site-level tools more appropriate than national services? Does CMS X make preservation easier or harder than CMS Y? Is a more accessible site also a more preservable one? Are there configuration choices that affect preservation without (significantly) affecting other aspects of management?)

Within the JISC-PoWR team there have been a number of interesting discussions that have highlighted how differently the different players see Web preservation. To quote Ed Pinsent:

“The fundamental thing here is bringing together two sets of information professionals from differing backgrounds who, in many cases, don’t tend to speak to each other. Many records managers and archivists are, quite simply, afraid of IT and are content to let it remain a mystery. Conversely, it is quite possible to work in an IT career path in any organisation (not just HE/FE) and never be troubled by retention or preservation issues of any sort.”

The clichéd view might regard Web managers as concerning themselves primarily with the day-to-day running of an organisation’s Web site, with preservation as an afterthought, and records managers as focussing mainly on the preservation of resources and failing to understand some of the technical challenges presented. Although this may be a superficial description of the complexities of the ways in which institutions go about managing their digital resources, perhaps, as with many clichés, there is an element of truth in such views.


The History of the University of Bath Home Page

How has your institutional home page changed over time? And have you kept records of the changes and the decisions which were made?

In order to illustrate how an institution’s home page may change over a period of 11 years, the Internet Archive’s Wayback Machine was used to view the first occurrence of the University of Bath home page in every year from 1997 until 2007. (Note that in browsers which support Flash you can interact with the display, and more interactive access can be obtained if you install the PicLens plugin, although there are also links to the static images and an automated rolling display of the pages in the Internet Archive.)

In addition to this display a 4 minute video with accompanying commentary has also been created, which discusses some of the changes to the home page over the 11 years. A screenshot of the video is given below:

 

Is this example of interest to other institutions? Would it be helpful if tools could be provided to assist the creation of a similar visualisation of the history of your institutional home page?
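As a starting point for such a tool, the following is a minimal sketch in Python of how the yearly snapshots might be located. It relies on the Wayback Machine's URL pattern of a 14-digit timestamp followed by the original address, which the service resolves to the capture nearest that timestamp. Note the assumptions: the function name yearly_snapshot_urls is hypothetical, the home page address http://www.bath.ac.uk/ is supplied here for illustration, and asking for 1 January returns the nearest capture rather than strictly the first capture of the year, so the result may need checking by hand.

```python
WAYBACK_PREFIX = "https://web.archive.org/web"


def yearly_snapshot_urls(page_url, first_year, last_year):
    """Return {year: Wayback Machine URL}, one entry per year (inclusive).

    Each URL asks for the capture closest to midnight on 1 January of that
    year. This is an approximation of "first occurrence in the year".
    """
    urls = {}
    for year in range(first_year, last_year + 1):
        timestamp = f"{year}0101000000"  # YYYYMMDDhhmmss
        urls[year] = f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"
    return urls


if __name__ == "__main__":
    # Assumed address for the University of Bath home page.
    snapshots = yearly_snapshot_urls("http://www.bath.ac.uk/", 1997, 2007)
    for year, url in snapshots.items():
        print(year, url)
```

The resulting list of addresses could then be fed to a screenshot tool or simply opened in a browser to build up a year-by-year record.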

[Note image of video replaced by embedded YouTube video on 20 July 2009.]

Don’t Web Managers Care About Preservation?

In response to a post on ULCC’s DA Blog, Chris Rusbridge, director of the DCC (and contributor to the Digital Curation Blog), commented:

The enthusiastic way in which web-site owners “re-brand” or “re-launch” their web-sites suggests that they are not particularly interested, long-term, in the details of the experience; continuous improvement means continuous discarding. One hopes that they are more interested in the information content, in some more abstract sense. Maybe we could measure this by tracking older pages across re-launches?

Perhaps a measure of commitment to the “look and feel” might be the lifetime since last reorganised?

Is this right? Don’t Web site owners care about preservation, preferring instead to continually add new features to their services?

I have to say that I disagree. I would argue that, rather than being driven by Web site owners’ enthusiasms, such changes usually occur in response to user needs and expectations, the growing importance of Web services (which means that institutions have greater expectations of the services which will be provided) and an increasing understanding of the limitations of approaches taken to Web site development in the past.

One example of this has been the obligation (for legal and moral reasons) to enhance the accessibility of Web resources. Initially HTML authoring tools and Content Management Systems (CMSs) provided little support for enhancing accessibility; indeed many CMSs generated low quality HTML which could not be processed by assistive technologies.

Case Study for the Exploit Interactive and Cultivate Interactive E-Journals

Exploit Interactive was an e-journal which was funded by the EU’s Telematics for Libraries programme. Nine issues of the journal were published between May 1999 and October 2000.

After the project funding had ceased, additional funding from the EU was obtained to publish a new e-journal known as Cultivate Interactive which was launched in July 2000. However there was a need to define a policy and accompanying procedures for the preservation of the Exploit Interactive Web site and content.

As described in a case study document on Providing Access to an EU-funded Project Web Site after Completion of Funding the following policy decisions were taken:

  • The Web site’s domain name will be kept for at least 3 years after the end of funding.
  • We will seek to ensure the Web site continues for at least 10 years after the end of funding.
  • We will seek to ensure that the Web site continues to function, although we cannot give an absolute commitment to this.
  • We will not commit to fixing broken links to external resources.
  • We will not commit to fixing non-compliant HTML resources.

The case study went on to measure the disk storage used by the Web site and to quantify the costs. The storage requirements of less than 500 MB of disk space were not significant, so it was agreed to continue to pay for the www.exploit-lib.org domain name until at least October 2008.

Periodic automated link checks were carried out on the Web site to ensure that its internal links continued to work.
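The case study does not say which tool was used for these checks. Purely as an illustration, a minimal sketch of an internal link check for a single page, using only the Python standard library, might look like the following. All of the function and class names are hypothetical, and the example paths under www.exploit-lib.org are invented for the purpose of the sketch.

```python
from html.parser import HTMLParser
from urllib.error import URLError
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(page_url, html):
    """Return absolute URLs of links that stay on the same host as page_url."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    links = []
    for href in collector.hrefs:
        # Resolve relative links and drop any #fragment.
        absolute = urljoin(page_url, href).split("#", 1)[0]
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links


def is_reachable(url, timeout=10):
    """Return True if the URL can be fetched without an HTTP or network error."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (URLError, OSError):
        return False


def broken_internal_links(page_url, html, check=is_reachable):
    """Return the internal links on the page for which check() fails."""
    return [link for link in internal_links(page_url, html) if not check(link)]


if __name__ == "__main__":
    # Hypothetical page content; a real check would fetch each page of the site.
    sample = '<a href="/issue1/">Issue 1</a> <a href="http://example.org/">External</a>'
    print(internal_links("http://www.exploit-lib.org/issue3/", sample))
```

External links are deliberately ignored, in keeping with the policy decision not to commit to fixing broken links to external resources. Running such a script on a schedule over every page of the site would give the periodic check.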

A similar process was established for the Cultivate Interactive e-journal when its funding ceased in February 2003.