Exploration server on haptic devices

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Robust speech perception: Recognize the familiar, generalize to the similar, and adapt to the novel

Internal identifier: 001674 (Pmc/Curation); previous: 001673; next: 001675


Authors: Dave F. Kleinschmidt; T. Florian Jaeger

Source:

RBID : PMC:4744792

Abstract

Successful speech perception requires that listeners map the acoustic signal to linguistic categories. These mappings are not only probabilistic, but change depending on the situation. For example, one talker’s /p/ might be physically indistinguishable from another talker’s /b/ (cf. lack of invariance). We characterize the computational problem posed by such a subjectively non-stationary world and propose that the speech perception system overcomes this challenge by (1) recognizing previously encountered situations, (2) generalizing to other situations based on previous similar experience, and (3) adapting to novel situations. We formalize this proposal in the ideal adapter framework: (1) to (3) can be understood as inference under uncertainty about the appropriate generative model for the current talker, thereby facilitating robust speech perception despite the lack of invariance. We focus on two critical aspects of the ideal adapter. First, in situations that clearly deviate from previous experience, listeners need to adapt. We develop a distributional (belief-updating) learning model of incremental adaptation. The model provides a good fit against known and novel phonetic adaptation data, including perceptual recalibration and selective adaptation. Second, robust speech recognition requires listeners learn to represent the structured component of cross-situation variability in the speech signal. We discuss how these two aspects of the ideal adapter provide a unifying explanation for adaptation, talker-specificity, and generalization across talkers and groups of talkers (e.g., accents and dialects). The ideal adapter provides a guiding framework for future investigations into speech perception and adaptation, and more broadly language comprehension.


Url:
DOI: 10.1037/a0038695
PubMed: 25844873
PubMed Central: 4744792

Links to previous steps (curation, corpus...)


Links to Exploration step

PMC:4744792

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Robust speech perception: Recognize the familiar, generalize to the similar, and adapt to the novel</title>
<author>
<name sortKey="Kleinschmidt, Dave F" sort="Kleinschmidt, Dave F" uniqKey="Kleinschmidt D" first="Dave F." last="Kleinschmidt">Dave F. Kleinschmidt</name>
</author>
<author>
<name sortKey="Jaeger, T Florian" sort="Jaeger, T Florian" uniqKey="Jaeger T" first="T. Florian" last="Jaeger">T. Florian Jaeger</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">25844873</idno>
<idno type="pmc">4744792</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4744792</idno>
<idno type="RBID">PMC:4744792</idno>
<idno type="doi">10.1037/a0038695</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">001674</idno>
<idno type="wicri:Area/Pmc/Curation">001674</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Robust speech perception: Recognize the familiar, generalize to the similar, and adapt to the novel</title>
<author>
<name sortKey="Kleinschmidt, Dave F" sort="Kleinschmidt, Dave F" uniqKey="Kleinschmidt D" first="Dave F." last="Kleinschmidt">Dave F. Kleinschmidt</name>
</author>
<author>
<name sortKey="Jaeger, T Florian" sort="Jaeger, T Florian" uniqKey="Jaeger T" first="T. Florian" last="Jaeger">T. Florian Jaeger</name>
</author>
</analytic>
<series>
<title level="j">Psychological review</title>
<idno type="ISSN">0033-295X</idno>
<idno type="eISSN">1939-1471</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p id="P1">Successful speech perception requires that listeners map the acoustic signal to linguistic categories. These mappings are not only probabilistic, but change depending on the situation. For example, one talker’s /p/ might be physically indistinguishable from another talker’s /b/ (cf.
<italic>lack of invariance</italic>
). We characterize the computational problem posed by such a subjectively non-stationary world and propose that the speech perception system overcomes this challenge by (1) recognizing previously encountered situations, (2) generalizing to other situations based on previous similar experience, and (3) adapting to novel situations. We formalize this proposal in the
<italic>ideal adapter</italic>
framework: (1) to (3) can be understood as inference under uncertainty about the appropriate generative model for the current talker, thereby facilitating robust speech perception despite the lack of invariance. We focus on two critical aspects of the ideal adapter. First, in situations that clearly deviate from previous experience, listeners need to adapt. We develop a distributional (belief-updating) learning model of incremental adaptation. The model provides a good fit against known and novel phonetic adaptation data, including perceptual recalibration and selective adaptation. Second, robust speech recognition requires listeners learn to represent the
<italic>structured</italic>
component of cross-situation variability in the speech signal. We discuss how these two aspects of the ideal adapter provide a unifying explanation for adaptation, talker-specificity, and generalization across talkers and groups of talkers (e.g., accents and dialects). The ideal adapter provides a guiding framework for future investigations into speech perception and adaptation, and more broadly language comprehension.</p>
</div>
</front>
</TEI>
<pmc article-type="research-article">
<pmc-comment>The publisher of this article does not allow downloading of the full text in XML form.</pmc-comment>
<pmc-dir>properties manuscript</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-journal-id">0376476</journal-id>
<journal-id journal-id-type="pubmed-jr-id">6783</journal-id>
<journal-id journal-id-type="nlm-ta">Psychol Rev</journal-id>
<journal-id journal-id-type="iso-abbrev">Psychol Rev</journal-id>
<journal-title-group>
<journal-title>Psychological review</journal-title>
</journal-title-group>
<issn pub-type="ppub">0033-295X</issn>
<issn pub-type="epub">1939-1471</issn>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">25844873</article-id>
<article-id pub-id-type="pmc">4744792</article-id>
<article-id pub-id-type="doi">10.1037/a0038695</article-id>
<article-id pub-id-type="manuscript">NIHMS755563</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Robust speech perception: Recognize the familiar, generalize to the similar, and adapt to the novel</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Kleinschmidt</surname>
<given-names>Dave F.</given-names>
</name>
<aff id="A1">University of Rochester, Department of Brain and Cognitive Sciences</aff>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Jaeger</surname>
<given-names>T. Florian</given-names>
</name>
<aff id="A2">University of Rochester, Departments of Brain and Cognitive Sciences and Computer Science</aff>
</contrib>
</contrib-group>
<author-notes>
<corresp id="cor1">Corresponding author:
<email>dkleinschmidt@bcs.rochester.edu</email>
</corresp>
</author-notes>
<pub-date pub-type="nihms-submitted">
<day>2</day>
<month>2</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="ppub">
<month>4</month>
<year>2015</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>07</day>
<month>2</month>
<year>2016</year>
</pub-date>
<volume>122</volume>
<issue>2</issue>
<fpage>148</fpage>
<lpage>203</lpage>
<pmc-comment>elocation-id from pubmed: 10.1037/a0038695</pmc-comment>
<abstract>
<p id="P1">Successful speech perception requires that listeners map the acoustic signal to linguistic categories. These mappings are not only probabilistic, but change depending on the situation. For example, one talker’s /p/ might be physically indistinguishable from another talker’s /b/ (cf.
<italic>lack of invariance</italic>
). We characterize the computational problem posed by such a subjectively non-stationary world and propose that the speech perception system overcomes this challenge by (1) recognizing previously encountered situations, (2) generalizing to other situations based on previous similar experience, and (3) adapting to novel situations. We formalize this proposal in the
<italic>ideal adapter</italic>
framework: (1) to (3) can be understood as inference under uncertainty about the appropriate generative model for the current talker, thereby facilitating robust speech perception despite the lack of invariance. We focus on two critical aspects of the ideal adapter. First, in situations that clearly deviate from previous experience, listeners need to adapt. We develop a distributional (belief-updating) learning model of incremental adaptation. The model provides a good fit against known and novel phonetic adaptation data, including perceptual recalibration and selective adaptation. Second, robust speech recognition requires listeners learn to represent the
<italic>structured</italic>
component of cross-situation variability in the speech signal. We discuss how these two aspects of the ideal adapter provide a unifying explanation for adaptation, talker-specificity, and generalization across talkers and groups of talkers (e.g., accents and dialects). The ideal adapter provides a guiding framework for future investigations into speech perception and adaptation, and more broadly language comprehension.</p>
</abstract>
<kwd-group>
<kwd>speech perception</kwd>
<kwd>generalization</kwd>
<kwd>adaptation</kwd>
<kwd>statistical learning</kwd>
<kwd>hierarchical structure</kwd>
<kwd>lack of invariance</kwd>
<kwd>non-stationarity</kwd>
</kwd-group>
</article-meta>
</front>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/HapticV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001674 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 001674 | SxmlIndent | more
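For readers without the Dilib toolchain, here is a minimal sketch of pulling a single identifier out of the exported record with standard Unix tools only. The `record.xml` file name and the trimmed-down record contents are assumptions for illustration; the identifier values are those shown in the record above.

```shell
# Illustrative only: a trimmed-down copy of the <idno> identifiers
# from the <record> shown above, saved to a scratch file.
cat > record.xml <<'EOF'
<record>
<idno type="pmid">25844873</idno>
<idno type="pmc">4744792</idno>
<idno type="doi">10.1037/a0038695</idno>
</record>
EOF

# Extract the DOI: print only the text content of the typed <idno> element.
doi=$(sed -n 's/.*<idno type="doi">\([^<]*\)<\/idno>.*/\1/p' record.xml)
echo "$doi"    # → 10.1037/a0038695
```

The same `sed` pattern works on the full indented output of `HfdSelect ... | SxmlIndent` for any simple, single-line element; for anything nested, a real XML parser is the safer choice.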

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    HapticV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:4744792
   |texte=   Robust speech perception: Recognize the familiar, generalize to the similar, and adapt to the novel
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:25844873" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a HapticV1 

Wicri

This area was generated with Dilib version V0.6.23.
Data generation: Mon Jun 13 01:09:46 2016. Site generation: Wed Mar 6 09:54:07 2024