Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation
Internal identifier: 000646 (Main/Merge); previous: 000645; next: 000647
Authors: Marco Turchi [Italy]; Josef Steinberger [Italy]; Mijail Kabadjov [Italy]; Ralf Steinberger [Italy]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2010.
Abstract
We present a method for evaluating multilingual multi-document summarisation that saves precious annotation time and makes evaluation results directly comparable across languages. The approach is based on manually selecting the most important sentences in a cluster of documents from a sentence-aligned parallel corpus and projecting the sentence selection to various target languages. We also present two ways of exploiting inter-annotator agreement levels, apply them both to a baseline sentence extraction summariser in seven languages, and discuss the differences in results between the two evaluation versions, as well as a preliminary analysis across languages. The same method can in principle be used to evaluate single-document summarisers or information extraction tools.
DOI: 10.1007/978-3-642-15998-5_7
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 002369
- to stream Istex, to step Curation: 002206
- to stream Istex, to step Checkpoint: 000221
Links to Exploration step
ISTEX:0E86CDCBC38DF61735F16B424BFE01559A3650F6
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation</title>
<author><name sortKey="Turchi, Marco" sort="Turchi, Marco" uniqKey="Turchi M" first="Marco" last="Turchi">Marco Turchi</name>
</author>
<author><name sortKey="Steinberger, Josef" sort="Steinberger, Josef" uniqKey="Steinberger J" first="Josef" last="Steinberger">Josef Steinberger</name>
</author>
<author><name sortKey="Kabadjov, Mijail" sort="Kabadjov, Mijail" uniqKey="Kabadjov M" first="Mijail" last="Kabadjov">Mijail Kabadjov</name>
</author>
<author><name sortKey="Steinberger, Ralf" sort="Steinberger, Ralf" uniqKey="Steinberger R" first="Ralf" last="Steinberger">Ralf Steinberger</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:0E86CDCBC38DF61735F16B424BFE01559A3650F6</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1007/978-3-642-15998-5_7</idno>
<idno type="url">https://api.istex.fr/document/0E86CDCBC38DF61735F16B424BFE01559A3650F6/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">002369</idno>
<idno type="wicri:Area/Istex/Curation">002206</idno>
<idno type="wicri:Area/Istex/Checkpoint">000221</idno>
<idno type="wicri:doubleKey">0302-9743:2010:Turchi M:using:parallel:corpora</idno>
<idno type="wicri:Area/Main/Merge">000646</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation</title>
<author><name sortKey="Turchi, Marco" sort="Turchi, Marco" uniqKey="Turchi M" first="Marco" last="Turchi">Marco Turchi</name>
<affiliation wicri:level="1"><country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
<wicri:noRegion>Ispra (VA)</wicri:noRegion>
</affiliation>
<affiliation><wicri:noCountry code="no comma">E-mail: marco.turchi@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Steinberger, Josef" sort="Steinberger, Josef" uniqKey="Steinberger J" first="Josef" last="Steinberger">Josef Steinberger</name>
<affiliation wicri:level="1"><country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
<wicri:noRegion>Ispra (VA)</wicri:noRegion>
</affiliation>
<affiliation><wicri:noCountry code="no comma">E-mail: josef.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Kabadjov, Mijail" sort="Kabadjov, Mijail" uniqKey="Kabadjov M" first="Mijail" last="Kabadjov">Mijail Kabadjov</name>
<affiliation wicri:level="1"><country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
<wicri:noRegion>Ispra (VA)</wicri:noRegion>
</affiliation>
<affiliation><wicri:noCountry code="no comma">E-mail: mijail.kabadjov@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Steinberger, Ralf" sort="Steinberger, Ralf" uniqKey="Steinberger R" first="Ralf" last="Steinberger">Ralf Steinberger</name>
<affiliation wicri:level="1"><country xml:lang="fr">Italie</country>
<wicri:regionArea>European Commission - Joint Research Centre (JRC), IPSC - GlobSec, Via Fermi 2749, 21027, Ispra (VA)</wicri:regionArea>
<wicri:noRegion>Ispra (VA)</wicri:noRegion>
</affiliation>
<affiliation><wicri:noCountry code="no comma">E-mail: ralf.steinberger@jrc.ec.europa.eu</wicri:noCountry>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2010</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">0E86CDCBC38DF61735F16B424BFE01559A3650F6</idno>
<idno type="DOI">10.1007/978-3-642-15998-5_7</idno>
<idno type="ChapterID">7</idno>
<idno type="ChapterID">Chap7</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: We are presenting a method for the evaluation of multilingual multi-document summarisation that allows saving precious annotation time and that makes the evaluation results across languages directly comparable. The approach is based on the manual selection of the most important sentences in a cluster of documents from a sentence-aligned parallel corpus, and by projecting the sentence selection to various target languages. We also present two ways of exploiting inter-annotator agreement levels, apply them both to a baseline sentence extraction summariser in seven languages, and discuss the result differences between the two evaluation versions, as well as a preliminary analysis between languages. The same method can in principle be used to evaluate single-document summarisers or information extraction tools.</div>
</front>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000646 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 000646 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Merge |type= RBID |clé= ISTEX:0E86CDCBC38DF61735F16B424BFE01559A3650F6 |texte= Using Parallel Corpora for Multilingual (Multi-document) Summarisation Evaluation }}
This area was generated with Dilib version V0.6.32.