UK Library To Archive Every British Web Page On The Internet

The British Library plans to record every single British web page on the internet. Over 4.8 million websites will be committed to self-replicating servers in order to preserve the nation's "digital memory" for future historians.

The library points out that in just a few short years, firsthand accounts of events like the 2005 London transit bombings and Britain's 2010 election campaign have already vanished.

"Stuff out there on the Web is ephemeral," says Lucie Burgess, the library's head of content strategy.

"The average life of a web page is only 75 days, because websites change, and contents get taken down... If we don't capture this material, a critical piece of the jigsaw puzzle of our understanding of the 21st century will be lost."

The British Library appears serious in its intent to record every web page: in the demo of the project shown so far, records have been made of everything from Mumsnet (a parenting resource website) to pages from Amazon Marketplace to a blog kept by a 9-year-old girl about her school lunches.

Previously, the library could not do this because it had to ask site owners for permission before taking a snapshot of their sites. Now, however, it can record the contents of every site ending in the suffix .uk; that's over 4.8 million websites, with over 1 billion individual pages.

Those 1 billion pages will be recorded at least once a year, and as often as once a day for news websites. For the sake of precious mother earth, the majority of this work will not be kept on paper, but rather as self-replicating copies on servers around the country, which will be constantly updated as technology evolves.

"It is trying to capture an unstable, dynamic process in a fixed way, which is all a librarian can hope to do, but it is missing one of the most positive and negative aspects of the web," said technology historian Edward Tenner.

"Librarians want things as fixed as possible, so people know where something is, people know the content of something. The problem is, the goals of the library profession and the structure of information have been diverging."

Will the British Library be able to keep a constant, fixed source of digital information, a still steel anchor in the ever-swirling maelstrom of information around it? It's hard to say yet, but for now, it looks like they're committed.

 

Source: The British Library