Press release

Posted January 27, 2011
Areas: General, Events.

Press release originally published in German by L3S.
January 13, 2011

Internet pages are like stars in the sky: there are countless numbers of them, and new ones appear every day. They bring new texts, information, and pictures into existence, some of which exist only on the Internet. But who is to decide which pages are worth preserving? Libraries and archives are currently ill-equipped to deal with such gigantic amounts of data. With the current state of technology, it is not feasible to review and select all material for archiving. But this is about to change.
Two European projects will help to preserve the digital cultural heritage of the World Wide Web. The “Archive Community Memories” project focuses on the automatic selection of Web content that is socially relevant. A new archiving method will specifically search for topics or events on the Web and rate their importance. To achieve this, not only the Web pages of organizations or companies are evaluated, but also private content such as publicly accessible blogs or social networks like Facebook. Social networks can be very helpful in discovering important Web pages, as users suggest such pages to their friends. By harnessing this and other information, the project will help to optimize and speed up the reviewing process of national libraries and archives.
The total budget of the EU project is eight million euros. L3S Research Center at Leibniz University Hannover receives one million euros and leads the scientific management. The overall management is led by researchers from the University of Sheffield. Several other partners are also involved in the project, such as Yahoo!, Südwestrundfunk, and Deutsche Welle.
The project is a follow-up to “Living Web Archives”, on which researchers from L3S Research Center and other European partners have been working in recent years. The goal of that project was to improve the quality of Web archives, especially with regard to multimedia content and spam detection, and to keep the archive usable for future generations.