“Catch me if you can”: Visual Analysis of Coherence Defects in Web Archiving

Posted November 13, 2009
Areas: Temporal Coherence

A paper on “Visual Analysis of Coherence Defects in Web Archiving” has been published as part of IWAW09, which took place on 30 September and 1 October 2009 in conjunction with ECDL in Corfu (Greece). The paper is available online as part of the IWAW 2009 proceedings.

The paper “‘Catch me if you can’: Visual Analysis of Coherence Defects in Web Archiving” by Marc Spaniol, Arturas Mazeika, Dimitar Denev and Gerhard Weikum deals with the problems in Web archiving that arise because the World Wide Web is a continuously evolving network of contents (e.g. Web pages, images, sound files, etc.) and an interconnecting link structure. The paper discusses the questions that arise in detecting, measuring and, finally, understanding coherence defects. To this end, it presents visualization strategies that can be applied at different levels of granularity: working with (in the ideal case) properly set last-modified timestamps, based on metadata extracted from the crawler in accelerated crawl-revisit pairs, or from the Internet Archive’s WARC files. To help the archivist understand the nature of these defects, the paper investigates means for visualizing change behavior and archive coherence.