Exploration server on computer science research in Lorraine

Note: this site is under development.
Warning: this site is generated automatically from raw corpora.
The information has therefore not been validated.

"This sentence is wrong." Detecting errors in machine-translated sentences

Internal identifier: 000132 (PascalFrancis/Corpus); previous: 000131; next: 000133

"This sentence is wrong." Detecting errors in machine-translated sentences

Authors: Sylvain Raybaud; David Langlois; Kamel Smaïli

RBID : Francis:12-0189218

Abstract

Machine translation systems are not reliable enough to be used "as is": except for the most simple tasks, they can only be used to grasp the general meaning of a text or assist human translators. The purpose of confidence measures is to detect erroneous words or sentences produced by a machine translation system. In this article, after reviewing the mathematical foundations of confidence estimation, we propose a comparison of several state-of-the-art confidence measures, predictive parameters and classifiers. We also propose two original confidence measures based on Mutual Information and a method for automatically generating data for training and testing classifiers. We applied these techniques to data from the WMT campaign 2008 and found that the best confidence measures yielded an Equal Error Rate of 36.3% at word level and 34.2% at sentence level, but combining different measures reduced these rates to 35.0% and 29.0%, respectively. We also present the results of an experiment aimed at determining how helpful confidence measures are in a post-editing task. Preliminary results suggest that our system is not yet ready to efficiently help post-editors, but we now have both software and a protocol that we can apply to further experiments, and user feedback has indicated aspects which must be improved in order to increase the level of helpfulness of confidence measures.
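The abstract reports results as an Equal Error Rate (EER): the operating point at which the rate of correct words flagged as erroneous equals the rate of erroneous words accepted as correct. As a minimal sketch of how that metric is computed from confidence scores, the function below sweeps candidate thresholds and returns the error rate where the two curves cross. The scores and labels shown are hypothetical illustrations, not data from the article.

```python
def equal_error_rate(scores, labels):
    """Compute the Equal Error Rate for binary confidence classification.

    scores: confidence score per word (higher = more likely correct).
    labels: 1 for a correct word, 0 for an erroneous one.
    At threshold t, a word is accepted as correct when its score >= t.
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)           # number of correct words
    n_neg = len(labels) - n_pos   # number of erroneous words
    best_gap, eer = float("inf"), 1.0
    for t, _ in pairs:
        # False negatives: correct words rejected below the threshold.
        fn = sum(1 for s, l in pairs if l == 1 and s < t)
        # False positives: erroneous words accepted at or above it.
        fp = sum(1 for s, l in pairs if l == 0 and s >= t)
        fnr, fpr = fn / n_pos, fp / n_neg
        # Keep the threshold where the two error rates are closest.
        if abs(fnr - fpr) < best_gap:
            best_gap, eer = abs(fnr - fpr), (fnr + fpr) / 2
    return eer

# Hypothetical example: scores that perfectly separate the classes
# yield an EER of 0.0.
print(equal_error_rate([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))
```

In practice EER is usually read off a DET or ROC curve over a large test set; this brute-force sweep is only meant to make the definition concrete.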

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

pA  
A01 01  1    @0 0922-6567
A03   1    @0 Mach. transl.
A05       @2 25
A06       @2 1
A08 01  1  ENG  @1 "This sentence is wrong." Detecting errors in machine-translated sentences
A11 01  1    @1 RAYBAUD (Sylvain)
A11 02  1    @1 LANGLOIS (David)
A11 03  1    @1 SMAÏLI (Kamel)
A14 01      @1 PAROLE, LORIA, BP 239 @2 54506 Nancy @3 FRA @Z 1 aut. @Z 2 aut. @Z 3 aut.
A20       @1 1-34
A21       @1 2011
A23 01      @0 ENG
A43 01      @1 INIST @2 21070 @5 354000508919860010
A44       @0 0000 @1 © 2012 INIST-CNRS. All rights reserved.
A45       @0 1 p.1/4
A47 01  1    @0 12-0189218
A60       @1 P
A61       @0 A
A64 01  1    @0 Machine translation
A66 01      @0 DEU
C01 01    ENG  @0 Machine translation systems are not reliable enough to be used "as is": except for the most simple tasks, they can only be used to grasp the general meaning of a text or assist human translators. The purpose of confidence measures is to detect erroneous words or sentences produced by a machine translation system. In this article, after reviewing the mathematical foundations of confidence estimation, we propose a comparison of several state-of-the-art confidence measures, predictive parameters and classifiers. We also propose two original confidence measures based on Mutual Information and a method for automatically generating data for training and testing classifiers. We applied these techniques to data from the WMT campaign 2008 and found that the best confidence measures yielded an Equal Error Rate of 36.3% at word level and 34.2% at sentence level, but combining different measures reduced these rates to 35.0% and 29.0%, respectively. We also present the results of an experiment aimed at determining how helpful confidence measures are in a post-editing task. Preliminary results suggest that our system is not yet ready to efficiently help post-editors, but we now have both software and a protocol that we can apply to further experiments, and user feedback has indicated aspects which must be improved in order to increase the level of helpfulness of confidence measures.
C02 01  L    @0 52477 @1 XV
C02 02  L    @0 524
C03 01  L  FRE  @0 Traduction automatique @2 NI @5 01
C03 01  L  ENG  @0 Machine translation @2 NI @5 01
C03 02  L  FRE  @0 Classificateur @2 NI @5 02
C03 02  L  ENG  @0 Classifier @2 NI @5 02
C03 03  L  FRE  @0 Linguistique informatique @2 NI @5 03
C03 03  L  ENG  @0 Computational linguistics @2 NI @5 03
C03 04  L  FRE  @0 Traitement automatique des langues naturelles @2 NI @5 04
C03 04  L  ENG  @0 Natural language processing @2 NI @5 04
C03 05  L  FRE  @0 Erreur @2 NI @5 05
C03 05  L  ENG  @0 Error @2 NI @5 05
C03 06  L  FRE  @0 Evaluation @2 NI @5 06
C03 06  L  ENG  @0 Assessment @2 NI @5 06
C03 07  L  FRE  @0 Réseau neuronal @2 NI @5 07
C03 07  L  ENG  @0 Neural network @2 NI @5 07
C03 08  L  FRE  @0 Mathématiques @2 NI @5 08
C03 08  L  ENG  @0 Mathematics @2 NI @5 08
C03 09  L  FRE  @0 Machine vecteur support @4 INC @5 31
N21       @1 149

Inist format (server)

NO : FRANCIS 12-0189218 INIST
ET : "This sentence is wrong." Detecting errors in machine-translated sentences
AU : RAYBAUD (Sylvain); LANGLOIS (David); SMAÏLI (Kamel)
AF : PAROLE, LORIA, BP 239/54506 Nancy/France (1 aut., 2 aut., 3 aut.)
DT : Publication en série; Niveau analytique
SO : Machine translation; ISSN 0922-6567; Allemagne; Da. 2011; Vol. 25; No. 1; Pp. 1-34; Bibl. 1 p.1/4
LA : Anglais
EA : Machine translation systems are not reliable enough to be used "as is": except for the most simple tasks, they can only be used to grasp the general meaning of a text or assist human translators. The purpose of confidence measures is to detect erroneous words or sentences produced by a machine translation system. In this article, after reviewing the mathematical foundations of confidence estimation, we propose a comparison of several state-of-the-art confidence measures, predictive parameters and classifiers. We also propose two original confidence measures based on Mutual Information and a method for automatically generating data for training and testing classifiers. We applied these techniques to data from the WMT campaign 2008 and found that the best confidence measures yielded an Equal Error Rate of 36.3% at word level and 34.2% at sentence level, but combining different measures reduced these rates to 35.0% and 29.0%, respectively. We also present the results of an experiment aimed at determining how helpful confidence measures are in a post-editing task. Preliminary results suggest that our system is not yet ready to efficiently help post-editors, but we now have both software and a protocol that we can apply to further experiments, and user feedback has indicated aspects which must be improved in order to increase the level of helpfulness of confidence measures.
CC : 52477; 524
FD : Traduction automatique; Classificateur; Linguistique informatique; Traitement automatique des langues naturelles; Erreur; Evaluation; Réseau neuronal; Mathématiques; Machine vecteur support
ED : Machine translation; Classifier; Computational linguistics; Natural language processing; Error; Assessment; Neural network; Mathematics
LO : INIST-21070.354000508919860010
ID : 12-0189218

Links to Exploration step

Francis:12-0189218

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">"This sentence is wrong." Detecting errors in machine-translated sentences</title>
<author>
<name sortKey="Raybaud, Sylvain" sort="Raybaud, Sylvain" uniqKey="Raybaud S" first="Sylvain" last="Raybaud">Sylvain Raybaud</name>
<affiliation>
<inist:fA14 i1="01">
<s1>PAROLE, LORIA, BP 239</s1>
<s2>54506 Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Langlois, David" sort="Langlois, David" uniqKey="Langlois D" first="David" last="Langlois">David Langlois</name>
<affiliation>
<inist:fA14 i1="01">
<s1>PAROLE, LORIA, BP 239</s1>
<s2>54506 Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Smaili, Kamel" sort="Smaili, Kamel" uniqKey="Smaili K" first="Kamel" last="Smaïli">Kamel Smaïli</name>
<affiliation>
<inist:fA14 i1="01">
<s1>PAROLE, LORIA, BP 239</s1>
<s2>54506 Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">12-0189218</idno>
<date when="2011">2011</date>
<idno type="stanalyst">FRANCIS 12-0189218 INIST</idno>
<idno type="RBID">Francis:12-0189218</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000132</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">"This sentence is wrong." Detecting errors in machine-translated sentences</title>
<author>
<name sortKey="Raybaud, Sylvain" sort="Raybaud, Sylvain" uniqKey="Raybaud S" first="Sylvain" last="Raybaud">Sylvain Raybaud</name>
<affiliation>
<inist:fA14 i1="01">
<s1>PAROLE, LORIA, BP 239</s1>
<s2>54506 Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Langlois, David" sort="Langlois, David" uniqKey="Langlois D" first="David" last="Langlois">David Langlois</name>
<affiliation>
<inist:fA14 i1="01">
<s1>PAROLE, LORIA, BP 239</s1>
<s2>54506 Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Smaili, Kamel" sort="Smaili, Kamel" uniqKey="Smaili K" first="Kamel" last="Smaïli">Kamel Smaïli</name>
<affiliation>
<inist:fA14 i1="01">
<s1>PAROLE, LORIA, BP 239</s1>
<s2>54506 Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Machine translation</title>
<title level="j" type="abbreviated">Mach. transl.</title>
<idno type="ISSN">0922-6567</idno>
<imprint>
<date when="2011">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Machine translation</title>
<title level="j" type="abbreviated">Mach. transl.</title>
<idno type="ISSN">0922-6567</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Assessment</term>
<term>Classifier</term>
<term>Computational linguistics</term>
<term>Error</term>
<term>Machine translation</term>
<term>Mathematics</term>
<term>Natural language processing</term>
<term>Neural network</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Traduction automatique</term>
<term>Classificateur</term>
<term>Linguistique informatique</term>
<term>Traitement automatique des langues naturelles</term>
<term>Erreur</term>
<term>Evaluation</term>
<term>Réseau neuronal</term>
<term>Mathématiques</term>
<term>Machine vecteur support</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Machine translation systems are not reliable enough to be used "as is": except for the most simple tasks, they can only be used to grasp the general meaning of a text or assist human translators. The purpose of confidence measures is to detect erroneous words or sentences produced by a machine translation system. In this article, after reviewing the mathematical foundations of confidence estimation, we propose a comparison of several state-of-the-art confidence measures, predictive parameters and classifiers. We also propose two original confidence measures based on Mutual Information and a method for automatically generating data for training and testing classifiers. We applied these techniques to data from the WMT campaign 2008 and found that the best confidence measures yielded an Equal Error Rate of 36.3% at word level and 34.2% at sentence level, but combining different measures reduced these rates to 35.0% and 29.0%, respectively. We also present the results of an experiment aimed at determining how helpful confidence measures are in a post-editing task. Preliminary results suggest that our system is not yet ready to efficiently help post-editors, but we now have both software and a protocol that we can apply to further experiments, and user feedback has indicated aspects which must be improved in order to increase the level of helpfulness of confidence measures.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>0922-6567</s0>
</fA01>
<fA03 i2="1">
<s0>Mach. transl.</s0>
</fA03>
<fA05>
<s2>25</s2>
</fA05>
<fA06>
<s2>1</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG">
<s1>"This sentence is wrong." Detecting errors in machine-translated sentences</s1>
</fA08>
<fA11 i1="01" i2="1">
<s1>RAYBAUD (Sylvain)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>LANGLOIS (David)</s1>
</fA11>
<fA11 i1="03" i2="1">
<s1>SMAÏLI (Kamel)</s1>
</fA11>
<fA14 i1="01">
<s1>PAROLE, LORIA, BP 239</s1>
<s2>54506 Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</fA14>
<fA20>
<s1>1-34</s1>
</fA20>
<fA21>
<s1>2011</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA43 i1="01">
<s1>INIST</s1>
<s2>21070</s2>
<s5>354000508919860010</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2012 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>1 p.1/4</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>12-0189218</s0>
</fA47>
<fA60>
<s1>P</s1>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>Machine translation</s0>
</fA64>
<fA66 i1="01">
<s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>Machine translation systems are not reliable enough to be used "as is": except for the most simple tasks, they can only be used to grasp the general meaning of a text or assist human translators. The purpose of confidence measures is to detect erroneous words or sentences produced by a machine translation system. In this article, after reviewing the mathematical foundations of confidence estimation, we propose a comparison of several state-of-the-art confidence measures, predictive parameters and classifiers. We also propose two original confidence measures based on Mutual Information and a method for automatically generating data for training and testing classifiers. We applied these techniques to data from the WMT campaign 2008 and found that the best confidence measures yielded an Equal Error Rate of 36.3% at word level and 34.2% at sentence level, but combining different measures reduced these rates to 35.0% and 29.0%, respectively. We also present the results of an experiment aimed at determining how helpful confidence measures are in a post-editing task. Preliminary results suggest that our system is not yet ready to efficiently help post-editors, but we now have both software and a protocol that we can apply to further experiments, and user feedback has indicated aspects which must be improved in order to increase the level of helpfulness of confidence measures.</s0>
</fC01>
<fC02 i1="01" i2="L">
<s0>52477</s0>
<s1>XV</s1>
</fC02>
<fC02 i1="02" i2="L">
<s0>524</s0>
</fC02>
<fC03 i1="01" i2="L" l="FRE">
<s0>Traduction automatique</s0>
<s2>NI</s2>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="L" l="ENG">
<s0>Machine translation</s0>
<s2>NI</s2>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="L" l="FRE">
<s0>Classificateur</s0>
<s2>NI</s2>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="L" l="ENG">
<s0>Classifier</s0>
<s2>NI</s2>
<s5>02</s5>
</fC03>
<fC03 i1="03" i2="L" l="FRE">
<s0>Linguistique informatique</s0>
<s2>NI</s2>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="L" l="ENG">
<s0>Computational linguistics</s0>
<s2>NI</s2>
<s5>03</s5>
</fC03>
<fC03 i1="04" i2="L" l="FRE">
<s0>Traitement automatique des langues naturelles</s0>
<s2>NI</s2>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="L" l="ENG">
<s0>Natural language processing</s0>
<s2>NI</s2>
<s5>04</s5>
</fC03>
<fC03 i1="05" i2="L" l="FRE">
<s0>Erreur</s0>
<s2>NI</s2>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="L" l="ENG">
<s0>Error</s0>
<s2>NI</s2>
<s5>05</s5>
</fC03>
<fC03 i1="06" i2="L" l="FRE">
<s0>Evaluation</s0>
<s2>NI</s2>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="L" l="ENG">
<s0>Assessment</s0>
<s2>NI</s2>
<s5>06</s5>
</fC03>
<fC03 i1="07" i2="L" l="FRE">
<s0>Réseau neuronal</s0>
<s2>NI</s2>
<s5>07</s5>
</fC03>
<fC03 i1="07" i2="L" l="ENG">
<s0>Neural network</s0>
<s2>NI</s2>
<s5>07</s5>
</fC03>
<fC03 i1="08" i2="L" l="FRE">
<s0>Mathématiques</s0>
<s2>NI</s2>
<s5>08</s5>
</fC03>
<fC03 i1="08" i2="L" l="ENG">
<s0>Mathematics</s0>
<s2>NI</s2>
<s5>08</s5>
</fC03>
<fC03 i1="09" i2="L" l="FRE">
<s0>Machine vecteur support</s0>
<s4>INC</s4>
<s5>31</s5>
</fC03>
<fN21>
<s1>149</s1>
</fN21>
</pA>
</standard>
<server>
<NO>FRANCIS 12-0189218 INIST</NO>
<ET>"This sentence is wrong." Detecting errors in machine-translated sentences</ET>
<AU>RAYBAUD (Sylvain); LANGLOIS (David); SMAÏLI (Kamel)</AU>
<AF>PAROLE, LORIA, BP 239/54506 Nancy/France (1 aut., 2 aut., 3 aut.)</AF>
<DT>Publication en série; Niveau analytique</DT>
<SO>Machine translation; ISSN 0922-6567; Allemagne; Da. 2011; Vol. 25; No. 1; Pp. 1-34; Bibl. 1 p.1/4</SO>
<LA>Anglais</LA>
<EA>Machine translation systems are not reliable enough to be used "as is": except for the most simple tasks, they can only be used to grasp the general meaning of a text or assist human translators. The purpose of confidence measures is to detect erroneous words or sentences produced by a machine translation system. In this article, after reviewing the mathematical foundations of confidence estimation, we propose a comparison of several state-of-the-art confidence measures, predictive parameters and classifiers. We also propose two original confidence measures based on Mutual Information and a method for automatically generating data for training and testing classifiers. We applied these techniques to data from the WMT campaign 2008 and found that the best confidence measures yielded an Equal Error Rate of 36.3% at word level and 34.2% at sentence level, but combining different measures reduced these rates to 35.0% and 29.0%, respectively. We also present the results of an experiment aimed at determining how helpful confidence measures are in a post-editing task. Preliminary results suggest that our system is not yet ready to efficiently help post-editors, but we now have both software and a protocol that we can apply to further experiments, and user feedback has indicated aspects which must be improved in order to increase the level of helpfulness of confidence measures.</EA>
<CC>52477; 524</CC>
<FD>Traduction automatique; Classificateur; Linguistique informatique; Traitement automatique des langues naturelles; Erreur; Evaluation; Réseau neuronal; Mathématiques; Machine vecteur support</FD>
<ED>Machine translation; Classifier; Computational linguistics; Natural language processing; Error; Assessment; Neural network; Mathematics</ED>
<LO>INIST-21070.354000508919860010</LO>
<ID>12-0189218</ID>
</server>
</inist>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000132 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000132 | SxmlIndent | more

To link to this page from the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Francis:12-0189218
   |texte=   "This sentence is wrong." Detecting errors in machine-translated sentences
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022