LiWA technologies released as open source

Posted August 30, 2010
Areas : Archive Fidelity, Spam Cleansing, Temporal Coherence, Semantic Evolution, Social Web, Rich Media, General.

The LiWA partners are pleased to announce the open-source release of the complete set of components and tools produced by the LiWA project.

They are all grouped under the “liwa-technologies” project on Google Code:
http://code.google.com/p/liwa-technologies/.


1° The Rich Media Capture Module - a plug-in dedicated to the capture of streaming video content:
http://code.google.com/p/liwa-technologies/source/browse/rich-media-capture
http://code.google.com/p/liwa-technologies/downloads/detail?name=rich-media-capture-plugin-1.0.jar

2° The Temporal Coherence Analyser - a plug-in dedicated to the analysis of the temporal coherence of the archived Web content:
http://code.google.com/p/liwa-technologies/source/browse/temporal-coherence

3° The Spam Assessment Interface - a Web service that enables the quality assessment of the archived Web content:
http://code.google.com/p/liwa-technologies/source/browse/assessment-interface

4° The Semantic Analyser - a component dedicated to the detection of terminology evolution:
http://code.google.com/p/liwa-technologies/source/browse/SemanticAnalyser
http://code.google.com/p/liwa-technologies/downloads/detail?name=SemanticAnalyser-1.0.zip

5° The Web Archive UI Framework - a client-side framework that helps create user-interface helpers for Web archive browsing:
http://code.google.com/p/liwa-technologies/source/browse/web-archive-ui-framework



To learn more about each component, see the wiki space of the Google project, which gives a brief description of each module and the steps needed to deploy it: http://code.google.com/p/liwa-technologies/w/list



You are all welcome to download and try out the LiWA components. Your feedback and comments will be greatly appreciated and will help us improve the documentation and the usability of the technologies.

 

LiWA Third Newsletter published

Posted April 07, 2011
Areas : Archive Fidelity, Spam Cleansing, Temporal Coherence, Semantic Evolution, Social Web, Rich Media, General, Events.

The LiWA Newsletter No. 3 is now available, summarizing the findings and results of the 36-month project. Enjoy reading it!

 

The SHARC framework for data quality in Web archiving

Posted March 10, 2011
Areas : Archive Fidelity, General, Events.

The paper “The SHARC framework for data quality in Web archiving”, co-written by D. Denev, A. Mazeika, M. Spaniol and G. Weikum, has been accepted for publication in the VLDB Journal 2011 (impact factor: 4.517 (2009)).

The paper is available for download via Online First in the VLDB Journal.

 

LiWA development mentioned at FIAT 2010

Posted December 01, 2010
Areas : Archive Fidelity, General, Events.

The poster entitled “What if web archiving was as reliable as a simple button?” was presented at FIAT 2010, held from the 16th to the 18th of October in Dublin.

This poster focused on:
- Archivethe.net, an example of a shared platform dedicated to heritage institutions
- archiving web video, its main issues and developments

 

LiWA papers at IWAW10

Posted September 21, 2010
Areas : Archive Fidelity, Temporal Coherence, Semantic Evolution, General, Events.

IWAW10 takes place on the 22nd and 23rd of September in Vienna, at the Austrian National Library.

The following papers have been accepted for presentation at this International Web Archiving Workshop:
- “Archiving Web Video”, Radu Pop, Gabriel Vasile and Julien Masanes
- “The SOLAR System for Sharp Web Archiving”, Arturas Mazeika, Dimitar Denev, Marc Spaniol and Gerhard Weikum
- “Terminology Evolution Module for Web Archives in the LiWA Context”, Nina Tahmasebi, Gideon Zenz, Tereza Iofciu and Thomas Risse
- “Archiving Data Objects using Web Feeds”, Marilena Oita and Pierre Senellart.

 

IWAW Proceedings online

Posted November 13, 2009
Areas : Archive Fidelity, Temporal Coherence, Semantic Evolution, Social Web, General, Events.

IWAW09 took place on the 30th of September and the 1st of October 2009, in conjunction with ECDL in Corfu (Greece). The proceedings are now available online.

Around 40 participants attended IWAW2009, which took place on Sep. 30 / Oct. 1 2009, in conjunction with ECDL in Corfu (Greece). The workshop provided a comprehensive overview of active research and practice in the preservation of the Web. This year’s workshop also addressed several new approaches and research topics (from the preservation of virtual worlds to the temporal dimension of Web archives) as well as practical issues faced by archiving institutions, specifically with respect to managing the storage of large volumes of digital material. In this context, a special session was devoted to the WARC storage format, which has been accepted as a new ISO standard (ISO 28500:2009), as well as to emerging tool support for handling these container objects. In general, scalability issues and the management of large-volume crawls were topics of intensive discussion, drawing on the growing body of experience now available in numerous institutions that run Web archiving activities in a range of different configurations.

 

Talk about “Turning pure Web Page Storages into Living Web Archives” at Cultural Heritage on line

Posted October 20, 2009
Areas : Archive Fidelity, Temporal Coherence, Social Web, Rich Media, General, Events.

The LiWA applications and their R&D challenges will be presented at the conference “Cultural Heritage on line. Empowering users: an active role for user communities” in Florence, Italy, on the 15th and 16th of December 2009.

Web content plays an increasingly important role in the knowledge-based society, and the preservation and long-term accessibility of Web history has high value (e.g., for scholarly studies, market analyses and intellectual property disputes). Interest in its preservation is growing rapidly among libraries and archival organizations as well as emerging industrial services. The characteristics of Web content (high dynamics, volatility, and variety of contributors and formats) make adequate Web archiving a challenge.
LiWA will look beyond the pure long-term “freezing” of Web content snapshots, transforming pure snapshot storage into a “Living” Web Archive. In order to create Living Web Archives, the LiWA project will address R&D challenges in three areas: Archive Fidelity, Archive Coherence and Archive Interpretability. The results of the project will be demonstrated within two application scenarios, namely “Streaming Archive” and “Social Web Archive”. The Streaming Archive application will showcase the building of an audio-visual Web archive and show how Web information related to audio and video broadcasts can be preserved. The Social Web application will demonstrate how Web archives can capture the dynamics and the different types of user interaction of the social Web.

 

Talk “From Web page storages to Living Web Archive” in London

Posted June 17, 2009
Areas : Archive Fidelity, Temporal Coherence, General, Events.

Dr. Thomas Risse (L3S) will give a talk at the “JISC, the DPC and the UK Web Archiving Consortium Workshop”, at The British Library Conference Centre in London, on July 21st.

The paper “From Web page storages to Living Web Archive” will be presented by Dr. Thomas Risse at the JISC, the DPC and the UK Web Archiving Consortium Workshop, which will take place at The British Library Conference Centre in London on July 21st.

 

Half day session on LiWA during IWAW

Posted September 18, 2008
Areas : Archive Fidelity, Spam Cleansing, Temporal Coherence, Semantic Evolution, General, Events.

A dedicated session took place during the 8th International Web Archiving Workshop.

Over 70 web archivists and researchers in this domain attended the 8th edition of IWAW, during which a full session was dedicated to presenting research objectives and early results from LiWA.
There were lots of questions and interest from the audience, which is a good sign for us. Links to the presentations from this session are listed below:

Web Spam: a Survey with Vision for the Archivist
Andras Benczur, David Siklosi, Jacint Szabo, Istvan Biro, Zsolt Fekete, Miklos Kurucz, Attila Pereszlenyi, Simon Racz, Adrienn Szabo (paper, presentation)

Terminology Evolution in Web Archiving: Open Issues
Nina Tahmasebi, Tereza Iofciu, Thomas Risse, Claudia Niederée, Wolf Siberski (paper, presentation)

LiWA Architecture
Radu Pop, Wolf Siberski, Mark Williamson (presentation)

“Catch me if you can”. Temporal Coherence of Web Archives
Marc Spaniol (presentation)

The Challenge of Dynamic Links
Mark Williamson (presentation)