
A General Joint Additive and Convolutive Bias Compensation Approach Applied to Noisy Lombard Speech Recognition

Internal identifier: 00B649 (Main/Merge); previous: 00B648; next: 00B650

Authors: Mohamed Afify; Yifan Gong; Jean-Paul Haton [France]

Source:

IEEE Transactions on Speech and Audio Processing; 1998.

RBID : CRIN:afify98b

English descriptors

KwdEn :
- HMM, Lombard Speech, compensation, noise.

Abstract

In this paper, a unified approach to the acoustic mismatch problem is proposed. A maximum likelihood state-based additive bias compensation algorithm is developed for the continuous density hidden Markov model (CDHMM). Based on this technique, specific bias models in the mel cepstral and the linear spectral domains are presented. Among these, a new polynomial trend bias model in the mel cepstral domain is derived, which proved effective for Lombard speech compensation. In addition, a joint estimation algorithm for additive and convolutive bias compensation is proposed. This algorithm is based on applying the above EM technique in both domains, in conjunction with a parallel model combination (PMC) based transformation. The compensation of the difference coefficients in the proposed framework is also studied. The database consists of a 21 confusable word vocabulary uttered by 24 speakers. Three mismatched versions of the database are considered, i.e., Lombard speech, 15 dB noisy Lombard speech, and 5 dB noisy Lombard speech. The proposed techniques result in 50.9%, 74.6%, and 67.3% reduction in the performance difference between matched and uncompensated word error rates for the three mismatch conditions, respectively. When dynamic coefficients are considered, the corresponding reductions are 46.8%, 72.4%, and 70.9%.

Links toward previous steps (curation, corpus...)

to stream Crin, to step Corpus: 002409
to stream Crin, to step Curation: 002409
to stream Crin, to step Checkpoint: 002176

Links to Exploration step

CRIN:afify98b

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" wicri:score="703">A General Joint Additive and  Convolutive Bias Compensation Approach Applied to Noisy Lombard Speech Recognition</title>
</titleStmt>
<publicationStmt><idno type="RBID">CRIN:afify98b</idno>
<date when="1998" year="1998">1998</date>
<idno type="wicri:Area/Crin/Corpus">002409</idno>
<idno type="wicri:Area/Crin/Curation">002409</idno>
<idno type="wicri:explorRef" wicri:stream="Crin" wicri:step="Curation">002409</idno>
<idno type="wicri:Area/Crin/Checkpoint">002176</idno>
<idno type="wicri:explorRef" wicri:stream="Crin" wicri:step="Checkpoint">002176</idno>
<idno type="wicri:Area/Main/Merge">00B649</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">A General Joint Additive and  Convolutive Bias Compensation Approach Applied to Noisy Lombard Speech Recognition</title>
<author><name sortKey="Afify, Mohamed" sort="Afify, Mohamed" uniqKey="Afify M" first="Mohamed" last="Afify">Mohamed Afify</name>
</author>
<author><name sortKey="Gong, Yifan" sort="Gong, Yifan" uniqKey="Gong Y" first="Yifan" last="Gong">Yifan Gong</name>
</author>
<author><name sortKey="Haton, Jean Paul" sort="Haton, Jean Paul" uniqKey="Haton J" first="Jean-Paul" last="Haton">Jean-Paul Haton</name>
<affiliation><country>France</country>
<placeName><settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="laboratoire" n="5">Laboratoire lorrain de recherche en informatique et ses applications</orgName>
<orgName type="university">Université de Lorraine</orgName>
<orgName type="institution">Centre national de la recherche scientifique</orgName>
<orgName type="institution">Institut national de recherche en informatique et en automatique</orgName>
</affiliation>
</author>
</analytic>
<series><title level="j">IEEE transactions on Speech and Audio processing</title>
<imprint><date when="1998" type="published">1998</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>HMM</term>
<term>Lombard Speech</term>
<term>compensation</term>
<term>noise</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en" wicri:score="3806">In this paper, a unified approach to the acoustic mismatch problem is proposed. A maximum likelihood state-based additive bias compensation algorithm is developed for the continuous density hidden Markov model (CDHMM). Based on this technique, specific bias models in the mel cepstral and the linear spectral domains are presented. Among these, a new polynomial trend bias model in the mel cepstral domain is derived, which proved effective for Lombard speech compensation. In addition, a joint estimation algorithm for additive and convolutive bias compensation is proposed. This algorithm is based on applying the above EM technique in both domains, in conjunction with a parallel model combination (PMC) based transformation. The compensation of the difference coefficients in the proposed framework is also studied. The database consists of a 21 confusable word vocabulary uttered by 24 speakers. Three mismatched versions of the database are considered, i.e., Lombard speech, 15 dB noisy Lombard speech, and 5 dB noisy Lombard speech. The proposed techniques result in 50.9%, 74.6%, and 67.3% reduction in the performance difference between matched and uncompensated word error rates for the three mismatch conditions, respectively. When dynamic coefficients are considered, the corresponding reductions are 46.8%, 72.4%, and 70.9%.</div>
</front>
</TEI>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Merge

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 00B649 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 00B649 | SxmlIndent | more
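
Where Dilib is not installed, the XML record shown above can also be read with general-purpose tools. The following is a minimal sketch in Python, not part of Dilib: the function name `parse_record`, the placeholder namespace URI, and the abbreviated `SAMPLE` record are assumptions made for illustration. The record uses the `wicri:` prefix without declaring it, so the sketch injects a declaration before parsing; otherwise a namespace-aware parser would reject the input.

```python
import xml.etree.ElementTree as ET

# Placeholder URI (assumption): the record never declares the "wicri:" prefix.
WICRI_NS = "urn:example:wicri"


def clean(text):
    """Collapse runs of whitespace and tolerate missing elements."""
    return " ".join((text or "").split())


def parse_record(xml_text):
    """Return RBID, title, authors and keywords of one <record> (hypothetical helper)."""
    # Declare the wicri prefix on the root element so the parser accepts it.
    patched = xml_text.replace(
        "<record>", '<record xmlns:wicri="%s">' % WICRI_NS, 1
    )
    root = ET.fromstring(patched)
    return {
        "rbid": clean(root.findtext(".//idno[@type='RBID']")),
        "title": clean(root.findtext(".//analytic/title")),
        "authors": [clean(n.text) for n in root.iterfind(".//author/name")],
        "keywords": [clean(t.text) for t in root.iterfind(".//keywords/term")],
    }


# Abbreviated copy of the record above, kept short for the example.
SAMPLE = """<record><TEI><teiHeader><fileDesc>
<titleStmt><title xml:lang="en" wicri:score="703">A General Joint Additive and  Convolutive Bias Compensation Approach Applied to Noisy Lombard Speech Recognition</title></titleStmt>
<publicationStmt><idno type="RBID">CRIN:afify98b</idno></publicationStmt>
<sourceDesc><biblStruct><analytic>
<title xml:lang="en">A General Joint Additive and  Convolutive Bias Compensation Approach Applied to Noisy Lombard Speech Recognition</title>
<author><name sortKey="Afify, Mohamed">Mohamed Afify</name></author>
<author><name sortKey="Gong, Yifan">Yifan Gong</name></author>
</analytic></biblStruct></sourceDesc></fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>HMM</term>
<term>noise</term></keywords></textClass></profileDesc>
</teiHeader></TEI></record>"""

info = parse_record(SAMPLE)
print(info["rbid"])
print(info["authors"])
```

In practice the input would be the output of one of the `HfdSelect` commands above saved to a file; the whitespace normalization in `clean` also removes the doubled space that appears in the stored title.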

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Main
   |étape=   Merge
   |type=    RBID
   |clé=     CRIN:afify98b
   |texte=   A General Joint Additive and  Convolutive Bias Compensation Approach Applied to Noisy Lombard Speech Recognition
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022

	Exploration server for computer science research in Lorraine
	Warning: this site is under development! It was generated by automated means from raw corpora, so the information has not been validated.
