Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation
Internal identifier: 002206 (Istex/Curation); previous: 002205; next: 002207
Authors: Marco Turchi [Italy]; Josef Steinberger [Italy]; Mijail Kabadjov [Italy]; Ralf Steinberger [Italy]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2010.
Abstract
We present a method for evaluating multilingual multi-document summarisation that saves annotation time and makes evaluation results directly comparable across languages. The approach is based on manually selecting the most important sentences in a cluster of documents from a sentence-aligned parallel corpus and projecting that selection to various target languages. We also present two ways of exploiting inter-annotator agreement levels, apply both to a baseline sentence-extraction summariser in seven languages, and discuss how the results differ between the two evaluation versions, together with a preliminary comparison across languages. The same method can in principle be used to evaluate single-document summarisers or information extraction tools.
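The projection step described in the abstract can be illustrated with a minimal sketch. It assumes a toy sentence-aligned corpus keyed by sentence id and uses a simple vote threshold as a stand-in for inter-annotator agreement; the function names `agreed_ids` and `project_selection`, the data layout and the threshold rule are all hypothetical and are not taken from the paper, whose two agreement-based evaluation variants are not specified in this record.

```python
from collections import Counter


def agreed_ids(annotations, min_votes):
    """Keep sentence ids chosen by at least `min_votes` annotators.

    Assumption: a plain vote threshold, used here only for illustration.
    """
    votes = Counter(sid for selection in annotations for sid in set(selection))
    return {sid for sid, n in votes.items() if n >= min_votes}


def project_selection(selected_ids, aligned_corpus, target_lang):
    """Return the target-language sentences aligned with the selected ids."""
    return [aligned_corpus[sid][target_lang] for sid in sorted(selected_ids)]


# Toy sentence-aligned corpus: sentence id -> {language code: sentence}.
corpus = {
    0: {"en": "The council met on Monday.", "fr": "Le conseil s'est réuni lundi."},
    1: {"en": "It adopted the budget.", "fr": "Il a adopté le budget."},
    2: {"en": "The session ended at noon.", "fr": "La séance s'est terminée à midi."},
}

# Each annotator marks the sentences judged most important, in one language only.
annotations = [{0, 1}, {1, 2}, {1}]

gold = agreed_ids(annotations, min_votes=2)
gold_fr = project_selection(gold, corpus, "fr")
```

Because the corpus is aligned at sentence level, the selection made once by the annotators carries over to any other language by looking up the same sentence ids, which is what makes the resulting gold standards comparable across languages.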
DOI: 10.1007/978-3-642-15998-5_7
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 002369
Links to Exploration step
ISTEX:0E86CDCBC38DF61735F16B424BFE01559A3650F6Curation
No country items
Marco Turchi<affiliation><mods:affiliation>E-mail: marco.turchi@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: marco.turchi@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
<affiliation><mods:affiliation>E-mail: josef.steinberger@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: josef.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
<affiliation><mods:affiliation>E-mail: mijail.kabadjov@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: mijail.kabadjov@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
<affiliation><mods:affiliation>E-mail: ralf.steinberger@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: ralf.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation</title>
<author><name sortKey="Turchi, Marco" sort="Turchi, Marco" uniqKey="Turchi M" first="Marco" last="Turchi">Marco Turchi</name>
<affiliation wicri:level="1"><mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation><mods:affiliation>E-mail: marco.turchi@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: marco.turchi@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Steinberger, Josef" sort="Steinberger, Josef" uniqKey="Steinberger J" first="Josef" last="Steinberger">Josef Steinberger</name>
<affiliation wicri:level="1"><mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation><mods:affiliation>E-mail: josef.steinberger@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: josef.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Kabadjov, Mijail" sort="Kabadjov, Mijail" uniqKey="Kabadjov M" first="Mijail" last="Kabadjov">Mijail Kabadjov</name>
<affiliation wicri:level="1"><mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation><mods:affiliation>E-mail: mijail.kabadjov@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: mijail.kabadjov@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Steinberger, Ralf" sort="Steinberger, Ralf" uniqKey="Steinberger R" first="Ralf" last="Steinberger">Ralf Steinberger</name>
<affiliation wicri:level="1"><mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation><mods:affiliation>E-mail: ralf.steinberger@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: ralf.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:0E86CDCBC38DF61735F16B424BFE01559A3650F6</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1007/978-3-642-15998-5_7</idno>
<idno type="url">https://api.istex.fr/document/0E86CDCBC38DF61735F16B424BFE01559A3650F6/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">002369</idno>
<idno type="wicri:Area/Istex/Curation">002206</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation</title>
<author><name sortKey="Turchi, Marco" sort="Turchi, Marco" uniqKey="Turchi M" first="Marco" last="Turchi">Marco Turchi</name>
<affiliation wicri:level="1"><mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation><mods:affiliation>E-mail: marco.turchi@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: marco.turchi@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Steinberger, Josef" sort="Steinberger, Josef" uniqKey="Steinberger J" first="Josef" last="Steinberger">Josef Steinberger</name>
<affiliation wicri:level="1"><mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation><mods:affiliation>E-mail: josef.steinberger@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: josef.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Kabadjov, Mijail" sort="Kabadjov, Mijail" uniqKey="Kabadjov M" first="Mijail" last="Kabadjov">Mijail Kabadjov</name>
<affiliation wicri:level="1"><mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation><mods:affiliation>E-mail: mijail.kabadjov@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: mijail.kabadjov@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Steinberger, Ralf" sort="Steinberger, Ralf" uniqKey="Steinberger R" first="Ralf" last="Steinberger">Ralf Steinberger</name>
<affiliation wicri:level="1"><mods:affiliation>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA), Italy</mods:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
</affiliation>
<affiliation><mods:affiliation>E-mail: ralf.steinberger@jrc.ec.europa.eu</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: ralf.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2010</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">0E86CDCBC38DF61735F16B424BFE01559A3650F6</idno>
<idno type="DOI">10.1007/978-3-642-15998-5_7</idno>
<idno type="ChapterID">7</idno>
<idno type="ChapterID">Chap7</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: We are presenting a method for the evaluation of multilingual multi-document summarisation that allows saving precious annotation time and that makes the evaluation results across languages directly comparable. The approach is based on the manual selection of the most important sentences in a cluster of documents from a sentence-aligned parallel corpus, and by projecting the sentence selection to various target languages. We also present two ways of exploiting inter-annotator agreement levels, apply them both to a baseline sentence extraction summariser in seven languages, and discuss the result differences between the two evaluation versions, as well as a preliminary analysis between languages. The same method can in principle be used to evaluate single-document summarisers or information extraction tools.</div>
</front>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002206 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Istex/Curation/biblio.hfd -nk 002206 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Istex |étape= Curation |type= RBID |clé= ISTEX:0E86CDCBC38DF61735F16B424BFE01559A3650F6 |texte= Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation }}
This area was generated with Dilib version V0.6.32.