Exploration server for computer science research in Lorraine

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

The Importance of Segmentation Probability in Segment Based Speech Recognizers

Internal identifier: 001E88 (Crin/Curation); previous: 001E87; next: 001E89

Authors: Jan Verhasselt; Irina Illina; Jean-Pierre Martens; Yifan Gong; Jean-Paul Haton

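Source: International Conference on Acoustics, Speech, and Signal Processing (ICASSP'97), Munich, Germany, April 1997, vol. 2, pp. 1407-1410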

RBID : CRIN:verhasselt97b

English descriptors

segment-based recognizer; speech recognition

Abstract

In segment based recognizers, variable length speech segments are mapped to the basic speech units (phones, diphones, ...). In this paper, we address the acoustical modeling of these basic units in the framework of segmental posterior distribution models (SPDM). The joint posterior probability of a unit sequence \underline{u} and a segmentation \underline{s}, Pr(\underline{u},\underline{s}|\underline{\bf x}), can be written as the product of the segmentation probability Pr(\underline{s}|\underline{\bf x}) and the unit classification probability Pr(\underline{u}|\underline{s},\underline{\bf x}), where \underline{\bf x} is the sequence of acoustic observation parameter vectors. In particular, we point out the role of the segmentation probability and demonstrate that it does improve the recognition accuracy. We present evidence for this in two different tasks (speaker dependent continuous word recognition in French and speaker independent phone recognition in American English) in combination with two different unit classification models.
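
The factorization stated in the abstract, Pr(u,s|x) = Pr(s|x) Pr(u|s,x), can be illustrated with a small toy sketch. It is not the authors' recognizer: the two hypotheses, their probability values, the function names, and the treatment of Pr(u|s,x) as a product of independent per-segment posteriors are all assumptions made for illustration only.

```python
import math


def joint_log_posterior(seg_prob, unit_probs, use_segmentation=True):
    """Return log Pr(u, s | x) = log Pr(s | x) + log Pr(u | s, x).

    Assumption of this sketch: Pr(u | s, x) is approximated by a product
    of independent per-segment unit posteriors.
    """
    score = sum(math.log(p) for p in unit_probs)
    if use_segmentation:
        score += math.log(seg_prob)
    return score


# Hypothetical hypotheses (made-up numbers, not results from the paper):
# (label, Pr(s | x), per-segment unit posteriors Pr(u_i | s_i, x)).
hypotheses = [
    ("two segments", 0.7, [0.6, 0.5]),
    ("three segments", 0.1, [0.8, 0.7, 0.6]),
]


def best(use_segmentation):
    """Pick the hypothesis with the highest (log) score."""
    return max(
        hypotheses,
        key=lambda h: joint_log_posterior(h[1], h[2], use_segmentation),
    )[0]


if __name__ == "__main__":
    print("with Pr(s|x):   ", best(True))
    print("without Pr(s|x):", best(False))
```

With these made-up numbers, the ranking of the two hypotheses changes depending on whether the segmentation term is included, which is the kind of effect the abstract attributes to the segmentation probability.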

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" wicri:score="492">The Importance of Segmentation Probability in Segment Based Speech Recognizers</title>
</titleStmt>
<publicationStmt>
<idno type="RBID">CRIN:verhasselt97b</idno>
<date when="1997" year="1997">1997</date>
<idno type="wicri:Area/Crin/Corpus">001E88</idno>
<idno type="wicri:Area/Crin/Curation">001E88</idno>
<idno type="wicri:explorRef" wicri:stream="Crin" wicri:step="Curation">001E88</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">The Importance of Segmentation Probability in Segment Based Speech Recognizers</title>
<author>
<name sortKey="Verhasselt, Jan" sort="Verhasselt, Jan" uniqKey="Verhasselt J" first="Jan" last="Verhasselt">Jan Verhasselt</name>
</author>
<author>
<name sortKey="Illina, Irina" sort="Illina, Irina" uniqKey="Illina I" first="Irina" last="Illina">Irina Illina</name>
</author>
<author>
<name sortKey="Martens, Jean Pierre" sort="Martens, Jean Pierre" uniqKey="Martens J" first="Jean-Pierre" last="Martens">Jean-Pierre Martens</name>
</author>
<author>
<name sortKey="Gong, Yifan" sort="Gong, Yifan" uniqKey="Gong Y" first="Yifan" last="Gong">Yifan Gong</name>
</author>
<author>
<name sortKey="Haton, Jean Paul" sort="Haton, Jean Paul" uniqKey="Haton J" first="Jean-Paul" last="Haton">Jean-Paul Haton</name>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>segment-based recognizer</term>
<term>speech recognition</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en" wicri:score="2697">In segment based recognizers, variable length speech segments are mapped to the basic speech units (phones, diphones, ...). In this paper, we address the acoustical modeling of these basic units in the framework of segmental posterior distribution models (SPDM). The joint posterior probability of a unit sequence \underline{u} and a segmentation \underline{s}, Pr(\underline{u},\underline{s}|\underline{\bf x}), can be written as the product of the segmentation probability Pr(\underline{s}|\underline{\bf x}) and the unit classification probability Pr(\underline{u}|\underline{s},\underline{\bf x}), where \underline{\bf x} is the sequence of acoustic observation parameter vectors. In particular, we point out the role of the segmentation probability and demonstrate that it does improve the recognition accuracy. We present evidence for this in two different tasks (speaker dependent continuous word recognition in French and speaker independent phone recognition in American English) in combination with two different unit classification models.</div>
</front>
</TEI>
<BibTex type="inproceedings">
<ref>verhasselt97b</ref>
<crinnumber>97-R-002</crinnumber>
<category>3</category>
<equipe>RFIA</equipe>
<author>
<e>Verhasselt, Jan</e>
<e>Illina, Irina</e>
<e>Martens, Jean-Pierre</e>
<e>Gong, Yifan</e>
<e>Haton, Jean-Paul</e>
</author>
<title>The Importance of Segmentation Probability in Segment Based Speech Recognizers</title>
<booktitle>{International Conference on Acoustics, Speech, and Signal Processing - ICASSP'97, Munich, Germany}</booktitle>
<year>1997</year>
<volume>2</volume>
<pages>1407-1410</pages>
<month>apr</month>
<publisher>IEEE Computer Society Press</publisher>
<keywords>
<e>speech recognition</e>
<e>segment-based recognizer</e>
</keywords>
<abstract>In segment based recognizers, variable length speech segments are mapped to the basic speech units (phones, diphones, ...). In this paper, we address the acoustical modeling of these basic units in the framework of segmental posterior distribution models (SPDM). The joint posterior probability of a unit sequence \underline{u} and a segmentation \underline{s}, Pr(\underline{u},\underline{s}|\underline{\bf x}), can be written as the product of the segmentation probability Pr(\underline{s}|\underline{\bf x}) and the unit classification probability Pr(\underline{u}|\underline{s},\underline{\bf x}), where \underline{\bf x} is the sequence of acoustic observation parameter vectors. In particular, we point out the role of the segmentation probability and demonstrate that it does improve the recognition accuracy. We present evidence for this in two different tasks (speaker dependent continuous word recognition in French and speaker independent phone recognition in American English) in combination with two different unit classification models.</abstract>
</BibTex>
</record>
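
As a minimal sketch of how the `<BibTex>` part of such a record might be turned into a plain BibTeX entry, the following uses Python's standard `xml.etree.ElementTree`. The function name `to_bibtex`, the trimmed `RECORD` sample, and the rule of joining `<e>` items with " and " for authors and ", " for other lists are assumptions of this example, not part of Dilib.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <BibTex> element above (some fields omitted for brevity).
RECORD = """<BibTex type="inproceedings">
<ref>verhasselt97b</ref>
<author>
<e>Verhasselt, Jan</e>
<e>Illina, Irina</e>
</author>
<title>The Importance of Segmentation Probability in Segment Based Speech Recognizers</title>
<year>1997</year>
<pages>1407-1410</pages>
</BibTex>"""


def to_bibtex(xml_text):
    """Convert a <BibTex> element into a BibTeX entry string (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    key = ""
    fields = {}
    for child in root:
        # List-valued fields store their items in <e> children.
        items = [e.text.strip() for e in child.findall("e")]
        if child.tag == "ref":
            key = child.text.strip()
        elif child.tag == "author":
            fields["author"] = " and ".join(items)
        elif items:
            fields[child.tag] = ", ".join(items)
        else:
            fields[child.tag] = (child.text or "").strip()
    lines = [f"@{root.get('type')}{{{key},"]
    lines += [f"  {name} = {{{value}}}," for name, value in fields.items()]
    lines.append("}")
    return "\n".join(lines)


entry = to_bibtex(RECORD)

if __name__ == "__main__":
    print(entry)
```

Only the `<BibTex>` element is parsed here; the `<TEI>` part uses the `wicri:` prefix without a namespace declaration, so a strict XML parser would likely need that prefix bound before it could read the full record.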

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Crin/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001E88 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Crin/Curation/biblio.hfd -nk 001E88 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Crin
   |étape=   Curation
   |type=    RBID
   |clé=     CRIN:verhasselt97b
   |texte=   The Importance of Segmentation Probability in Segment Based Speech Recognizers
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022