SHARC: Framework for Quality-Conscious Web Archiving

Posted May 29, 2009
Areas: Temporal Coherence, General, Events.

A paper on quality-conscious web archiving has been accepted at the 35th International Conference on Very Large Data Bases (VLDB 2009).

The paper on quality-conscious web archiving by Dimitar Denev, Arturas Mazeika, Marc Spaniol, and Gerhard Weikum has been accepted for presentation at the 35th International Conference on Very Large Data Bases (VLDB 2009). The conference takes place on 24-28 August in Lyon, France. The paper presents the SHARC framework for assessing data quality in Web archives and for tuning capturing strategies towards better quality with given resources. It defines quality measures, characterises their properties, and derives a suite of quality-conscious scheduling strategies for archive crawling.