<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
    xmlns:admin="http://webns.net/mvcb/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:content="http://purl.org/rss/1.0/modules/content/">

    <channel>
    
    <title>Liwa : News</title>
    <link>http://liwa-project.eu/index.php/news/</link>
    <description></description>
    <dc:language>en</dc:language>
    <dc:creator>nathalie@europarchive.org</dc:creator>
    <dc:rights>Copyright 2011</dc:rights>
    <dc:date>2011-04-22T13:42:31+00:00</dc:date>
    <admin:generatorAgent rdf:resource="http://expressionengine.com/" />
    

    <item>
      <title>Press release</title>
      <link>http://liwa-project.eu/index.php/news/press_release/</link>
      <guid>http://liwa-project.eu/index.php/news/press_release/#When:09:31:50Z</guid>
      <description>Press release originally published in German by L3S.
January 13, 2011
Internet pages are like stars in the sky: they are countless, and new ones appear every day. They bring new texts, information, and pictures into existence, some of which exist only on the Internet. But who is to decide which pages are worth preserving? Libraries and archives are currently ill&#45;equipped to deal with such gigantic amounts of data. With the current state of technology, it is not feasible to review and select all material for archiving. But this is about to change.
Two European projects will help to preserve the digital cultural heritage of the World Wide Web. The &#8220;Archive Community Memories&#8221; project focuses on the automatic selection of Web content that is socially relevant. A new archiving method will now specifically search for topics or events on the Web and rate their importance. To achieve this, not only the Web pages of organizations or companies are evaluated, but also private content such as publicly accessible blogs and social networks such as Facebook. Social networks can be very helpful in discovering important Web pages, as users suggest such pages to their friends. By harnessing this and other information, the project will help to optimize and speed up the review process at national libraries and archives.
The total budget of the EU project is eight million euros. L3S Research Center at Leibniz University Hannover receives one million euros and leads the scientific management. The overall management is led by researchers from the University of Sheffield. Several other partners are also involved in the project, such as Yahoo!, Südwestrundfunk, and Deutsche Welle.
The project is a follow&#45;up to &#8220;Living Web Archives&#8221;, on which researchers from L3S Research Center and other European partners have been working in recent years. Its goal was to improve the quality of Web archives, especially regarding multimedia content and spam detection, and to keep the archive usable for future generations.</description>
      <dc:subject>General, Events</dc:subject>
      <dc:date>2011-01-27T09:31:50+00:00</dc:date>
    </item>

    <item>
      <title>LiWA technologies released in Open Source</title>
      <link>http://liwa-project.eu/index.php/news/liwa_technologies_released_in_open_source/</link>
      <guid>http://liwa-project.eu/index.php/news/liwa_technologies_released_in_open_source/#When:08:18:22Z</guid>
      <description>LiWA partners are pleased to announce the open&#45;source release of the complete set of components and tools produced by the LiWA project.
They are all grouped under the “liwa&#45;technologies” project on Google Code:
http://code.google.com/p/liwa&#45;technologies/.


1° The Rich Media Capture Module &#45; a plug&#45;in dedicated to the capture of streaming video content:
http://code.google.com/p/liwa&#45;technologies/source/browse/rich&#45;media&#45;capture
http://code.google.com/p/liwa&#45;technologies/downloads/detail?name=rich&#45;media&#45;capture&#45;plugin&#45;1.0.jar

2° The Temporal Coherence Analyser &#45; a plug&#45;in dedicated to the analysis of the temporal coherence of the archived Web content:
http://code.google.com/p/liwa&#45;technologies/source/browse/temporal&#45;coherence

3° The Spam Assessment Interface &#45; a Web service that enables the quality assessment of the archived Web content:
http://code.google.com/p/liwa&#45;technologies/source/browse/assessment&#45;interface

4° The Semantic Analyser &#45; a component dedicated to the detection of terminology evolution:
http://code.google.com/p/liwa&#45;technologies/source/browse/SemanticAnalyser
http://code.google.com/p/liwa&#45;technologies/downloads/detail?name=SemanticAnalyser&#45;1.0.zip

5° The Web Archive UI Framework &#45; a client&#45;side framework that helps create user interface helpers for Web archive browsing:
http://code.google.com/p/liwa&#45;technologies/source/browse/web&#45;archive&#45;ui&#45;framework

To learn more about each component, see the project wiki on Google Code, which gives a brief description of each module and the steps needed to deploy it: http://code.google.com/p/liwa&#45;technologies/w/list

You are all welcome to download and try out the LiWA components. Your feedback and comments will be greatly appreciated and will help us improve the documentation and the usability of the technologies.</description>
      <dc:subject>Archive Fidelity, Spam Cleansing, Temporal Coherence, Semantic Evolution, Social Web, Rich Media, General</dc:subject>
      <dc:date>2010-08-30T08:18:22+00:00</dc:date>
    </item>

    <item>
      <title>LiWA Evolution Tracking Module released</title>
      <link>http://liwa-project.eu/index.php/news/liwa_evolution_tracking_module_released/</link>
      <guid>http://liwa-project.eu/index.php/news/liwa_evolution_tracking_module_released/#When:13:42:31Z</guid>
      <description>LiWA partners are pleased to announce the open&#45;source release of the LiWA Evolution Tracking Module.
The LiWA Terminology Evolution Tracking Module is a Java module for word sense evolution tracking, released under the “liwa&#45;technologies” project on Google Code:
http://code.google.com/p/liwa&#45;technologies/downloads/detail?name=LiWAEvoTracking.zip&amp;amp;can=2&amp;amp;q=</description>
      <dc:subject>Semantic Evolution</dc:subject>
      <dc:date>2011-04-22T13:42:31+00:00</dc:date>
    </item>

    <item>
      <title>LiWA Third Newsletter published</title>
      <link>http://liwa-project.eu/index.php/news/liwa_third_newsletter_published/</link>
      <guid>http://liwa-project.eu/index.php/news/liwa_third_newsletter_published/#When:16:36:53Z</guid>
      <description>The LiWA Newsletter No. 3 is now available, summarizing the findings and results of the 36&#45;month project. Enjoy reading it!</description>
      <dc:subject>Archive Fidelity, Spam Cleansing, Temporal Coherence, Semantic Evolution, Social Web, Rich Media, General, Events</dc:subject>
      <dc:date>2011-04-07T16:36:53+00:00</dc:date>
    </item>

    <item>
      <title>The SHARC framework for data quality in Web archiving</title>
      <link>http://liwa-project.eu/index.php/news/the_sharc_framework_for_data_quality_in_web_archiving/</link>
      <guid>http://liwa-project.eu/index.php/news/the_sharc_framework_for_data_quality_in_web_archiving/#When:17:38:41Z</guid>
      <description>The paper &#8220;The SHARC framework for data quality in Web archiving&#8221;, co&#45;written by D. Denev, A. Mazeika, M. Spaniol and G. Weikum, has been accepted for publication in the VLDB Journal 2011 (impact factor: 4.517 in 2009).
The paper is available for download via Online First in the VLDB Journal.</description>
      <dc:subject>Archive Fidelity, General, Events</dc:subject>
      <dc:date>2011-03-10T17:38:41+00:00</dc:date>
    </item>

    <item>
      <title>Web spam classification: a few features worth more</title>
      <link>http://liwa-project.eu/index.php/news/web_spam_classification_a_few_features_worth_more/</link>
      <guid>http://liwa-project.eu/index.php/news/web_spam_classification_a_few_features_worth_more/#When:17:25:40Z</guid>
      <description>The paper entitled &#8220;Web spam classification: a few features worth more&#8221;, co&#45;written by M. Erdélyi, A. Garzó, and A. A. Benczúr, has been accepted for presentation at Joint Web Quality 2011, held in conjunction with WWW2011, Hyderabad, India (ACM Press 2011).
In this paper we investigate how much various classes of Web spam features, some requiring very high computational effort, add to the classification accuracy. We find that advances in machine learning, an area that has received less attention in the adversarial IR community, yield more improvement than new features and result in low&#45;cost yet accurate spam filters.</description>
      <dc:subject>Spam Cleansing, General, Events</dc:subject>
      <dc:date>2011-03-10T17:25:40+00:00</dc:date>
    </item>

    <item>
      <title>Temporal Analysis for Web Spam Detection: An Overview</title>
      <link>http://liwa-project.eu/index.php/news/temporal_analysis_for_web_spam_detection_an_overview/</link>
      <guid>http://liwa-project.eu/index.php/news/temporal_analysis_for_web_spam_detection_an_overview/#When:17:25:28Z</guid>
      <description>The paper &#8220;Temporal Analysis for Web Spam Detection: An Overview&#8221;, co&#45;written by M. Erdélyi and A. A. Benczúr, has been accepted for presentation at TWAW 2011, held in conjunction with WWW2011, Hyderabad, India (CEUR Workshop Proceedings 2011).
In this paper we give a comprehensive overview of temporal features devised for Web spam detection, providing measurements for different feature sets.</description>
      <dc:subject>Temporal Coherence, General, Events</dc:subject>
      <dc:date>2011-03-10T17:25:28+00:00</dc:date>
    </item>

    <item>
      <title>Language Evolution On The Go</title>
      <link>http://liwa-project.eu/index.php/news/language_evolution_on_the_go1/</link>
      <guid>http://liwa-project.eu/index.php/news/language_evolution_on_the_go1/#When:17:24:46Z</guid>
      <description>G. Zenz, N. Tahmasebi, and T. Risse have been invited to submit their paper &#8220;Language Evolution On The Go&#8221; (Extended Version) to the Journal on Multimedia Tools and Applications.</description>
      <dc:subject>Semantic Evolution, General, Events</dc:subject>
      <dc:date>2011-03-10T17:24:46+00:00</dc:date>
    </item>

    <item>
      <title>On the Applicability of Word Sense Discrimination on 201 Years of Modern English</title>
      <link>http://liwa-project.eu/index.php/news/on_the_applicability_of_word_sense_discrimination_on_201_years_of_modern_en/</link>
      <guid>http://liwa-project.eu/index.php/news/on_the_applicability_of_word_sense_discrimination_on_201_years_of_modern_en/#When:17:22:56Z</guid>
      <description>The paper &#8220;On the Applicability of Word Sense Discrimination on 201 Years of Modern English&#8221;, co&#45;written by N. Tahmasebi, K. Niklas, G. Zenz, and T. Risse, has been submitted to the Journal of Computational Linguistics.
Word sense discrimination is the first important step towards automatic detection of language evolution within large, historic document collections. By comparing the word senses found over time, we can reveal and use important information that will improve the understanding and accessibility of a digital archive. Algorithms for word sense discrimination have been developed with today’s language in mind and have thus been evaluated on well&#45;selected, modern datasets. The quality of the word senses found in the discrimination step has a large impact on the detection of language evolution. Therefore, as a first step, we verify that word sense discrimination can successfully be applied to digitized historic documents and that the results correctly correspond to word senses. Because the accessibility of digitized historic collections is also influenced by the quality of the optical character recognition (OCR), as a second step we investigate the effects of OCR errors on word sense discrimination results. All evaluations in this paper are performed on The Times Archive, a collection of newspaper articles from 1785 to 1985.</description>
      <dc:subject>Semantic Evolution, General, Events</dc:subject>
      <dc:date>2011-03-10T17:22:56+00:00</dc:date>
    </item>

    <item>
      <title>Talk at IWAW 2010</title>
      <link>http://liwa-project.eu/index.php/news/talk_at_iwaw_2010/</link>
      <guid>http://liwa-project.eu/index.php/news/talk_at_iwaw_2010/#When:10:14:47Z</guid>
      <description>Jaap Blom gave a talk about Internet Archiving in an Audiovisual Institute at IWAW 2010 in Vienna, Austria, on September 22, 2010.
The presentation covered three use cases:

&amp;nbsp;   * preserve Dutch public broadcasting websites (preservation of Dutch cultural heritage)
&amp;nbsp;   * collect Internet AV materials (mainly AV content that is broadcast on the Internet but not on traditional media)
&amp;nbsp;   * preserve web context (to be used by archivists to look up relevant context information when annotating Radio &amp;amp; Television items)</description>
      <dc:subject>Rich Media, General, Events</dc:subject>
      <dc:date>2011-02-01T10:14:47+00:00</dc:date>
    </item>

    
    </channel>
</rss>