OCR exploration server

Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation

Internal identifier: 002206 (Istex/Curation); previous: 002205; next: 002207

Authors: Marco Turchi [Italy]; Josef Steinberger [Italy]; Mijail Kabadjov [Italy]; Ralf Steinberger [Italy]

RBID : ISTEX:0E86CDCBC38DF61735F16B424BFE01559A3650F6

Abstract

We present a method for evaluating multilingual multi-document summarisation that saves precious annotation time and makes evaluation results directly comparable across languages. The approach is based on manually selecting the most important sentences in a cluster of documents from a sentence-aligned parallel corpus and projecting that selection to various target languages. We also present two ways of exploiting inter-annotator agreement levels, apply both to a baseline sentence-extraction summariser in seven languages, and discuss the differences in results between the two evaluation versions, as well as a preliminary comparison between languages. The same method can in principle be used to evaluate single-document summarisers or information extraction tools.
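
The projection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `project_selection` and `recall`, the toy three-language corpus, and the use of plain recall as the score are all assumptions made for illustration. The only idea taken from the abstract is that sentences are selected once and the selection is carried over to other languages through the sentence alignment.

```python
from typing import Dict, List, Set

# Toy sentence-aligned corpus (hypothetical data): index i holds the same
# sentence in every language.
corpus: Dict[str, List[str]] = {
    "en": ["The council met on Monday.", "Delegates arrived early.", "A new treaty was signed."],
    "fr": ["Le conseil s'est réuni lundi.", "Les délégués sont arrivés tôt.", "Un nouveau traité a été signé."],
    "de": ["Der Rat tagte am Montag.", "Die Delegierten kamen früh an.", "Ein neuer Vertrag wurde unterzeichnet."],
}


def project_selection(selected_ids: Set[int], aligned: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Carry a sentence selection made once over to every aligned language."""
    lengths = {len(sentences) for sentences in aligned.values()}
    if len(lengths) > 1:
        raise ValueError("languages differ in sentence count; corpus is not sentence-aligned")
    order = sorted(selected_ids)
    return {lang: [sentences[i] for i in order] for lang, sentences in aligned.items()}


def recall(system_ids: Set[int], gold_ids: Set[int]) -> float:
    """Fraction of the annotator-selected sentences that the summariser also picked."""
    if not gold_ids:
        return 0.0
    return len(system_ids & gold_ids) / len(gold_ids)


# An annotator marks sentences 0 and 2 as important, working in one language only.
gold_ids = {0, 2}
gold_by_language = project_selection(gold_ids, corpus)

# A hypothetical summariser run on the French text extracts sentences 0 and 1.
french_score = recall({0, 1}, gold_ids)
```

Because the gold selection is expressed as sentence indices rather than text, the same annotation scores a summariser in any language of the corpus, which is what makes the results comparable across languages.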

DOI: 10.1007/978-3-642-15998-5_7

Links to Exploration step

ISTEX:0E86CDCBC38DF61735F16B424BFE01559A3650F6

Curation

No country items

Marco Turchi
<affiliation>
<mods:affiliation>E-mail: marco.turchi@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: marco.turchi@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
Josef Steinberger
<affiliation>
<mods:affiliation>E-mail: josef.steinberger@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: josef.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
Mijail Kabadjov
<affiliation>
<mods:affiliation>E-mail: mijail.kabadjov@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: mijail.kabadjov@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
Ralf Steinberger
<affiliation>
<mods:affiliation>E-mail: ralf.steinberger@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: ralf.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation</title>
<author>
<name sortKey="Turchi, Marco" sort="Turchi, Marco" uniqKey="Turchi M" first="Marco" last="Turchi">Marco Turchi</name>
<affiliation wicri:level="1">
<mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: marco.turchi@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: marco.turchi@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Steinberger, Josef" sort="Steinberger, Josef" uniqKey="Steinberger J" first="Josef" last="Steinberger">Josef Steinberger</name>
<affiliation wicri:level="1">
<mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: josef.steinberger@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: josef.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Kabadjov, Mijail" sort="Kabadjov, Mijail" uniqKey="Kabadjov M" first="Mijail" last="Kabadjov">Mijail Kabadjov</name>
<affiliation wicri:level="1">
<mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: mijail.kabadjov@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: mijail.kabadjov@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Steinberger, Ralf" sort="Steinberger, Ralf" uniqKey="Steinberger R" first="Ralf" last="Steinberger">Ralf Steinberger</name>
<affiliation wicri:level="1">
<mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: ralf.steinberger@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: ralf.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:0E86CDCBC38DF61735F16B424BFE01559A3650F6</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1007/978-3-642-15998-5_7</idno>
<idno type="url">https://api.istex.fr/document/0E86CDCBC38DF61735F16B424BFE01559A3650F6/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">002369</idno>
<idno type="wicri:Area/Istex/Curation">002206</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation</title>
<author>
<name sortKey="Turchi, Marco" sort="Turchi, Marco" uniqKey="Turchi M" first="Marco" last="Turchi">Marco Turchi</name>
<affiliation wicri:level="1">
<mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: marco.turchi@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: marco.turchi@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Steinberger, Josef" sort="Steinberger, Josef" uniqKey="Steinberger J" first="Josef" last="Steinberger">Josef Steinberger</name>
<affiliation wicri:level="1">
<mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: josef.steinberger@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: josef.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Kabadjov, Mijail" sort="Kabadjov, Mijail" uniqKey="Kabadjov M" first="Mijail" last="Kabadjov">Mijail Kabadjov</name>
<affiliation wicri:level="1">
<mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: mijail.kabadjov@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: mijail.kabadjov@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Steinberger, Ralf" sort="Steinberger, Ralf" uniqKey="Steinberger R" first="Ralf" last="Steinberger">Ralf Steinberger</name>
<affiliation wicri:level="1">
<mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: ralf.steinberger@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: ralf.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2010</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">0E86CDCBC38DF61735F16B424BFE01559A3650F6</idno>
<idno type="DOI">10.1007/978-3-642-15998-5_7</idno>
<idno type="ChapterID">7</idno>
<idno type="ChapterID">Chap7</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: We are presenting a method for the evaluation of multilingual multi-document summarisation that allows saving precious annotation time and that makes the evaluation results across languages directly comparable. The approach is based on the manual selection of the most important sentences in a cluster of documents from a sentence-aligned parallel corpus, and by projecting the sentence selection to various target languages. We also present two ways of exploiting inter-annotator agreement levels, apply them both to a baseline sentence extraction summariser in seven languages, and discuss the result differences between the two evaluation versions, as well as a preliminary analysis between languages. The same method can in principle be used to evaluate single-document summarisers or information extraction tools.</div>
</front>
</TEI>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002206 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Curation/biblio.hfd -nk 002206 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Istex
   |étape=   Curation
   |type=    RBID
   |clé=     ISTEX:0E86CDCBC38DF61735F16B424BFE01559A3650F6
   |texte=   Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024