Exploration server on music in Saarland


Feeling backwards? How temporal order in speech affects the time course of vocal emotion recognition

Internal identifier: 000144 (Pmc/Corpus); previous: 000143; next: 000145


Authors: Simon Rigoulot; Eugen Wassiliwizky; Marc D. Pell

Source :

RBID : PMC:3690349

Abstract

Recent studies suggest that the time course for recognizing vocal expressions of basic emotion in speech varies significantly by emotion type, implying that listeners uncover acoustic evidence about emotions at different rates in speech (e.g., fear is recognized most quickly whereas happiness and disgust are recognized relatively slowly; Pell and Kotz, 2011). To investigate whether vocal emotion recognition is largely dictated by the amount of time listeners are exposed to speech or the position of critical emotional cues in the utterance, 40 English participants judged the meaning of emotionally-inflected pseudo-utterances presented in a gating paradigm, where utterances were gated as a function of their syllable structure in segments of increasing duration from the end of the utterance (i.e., gated syllable-by-syllable from the offset rather than the onset of the stimulus). Accuracy for detecting six target emotions in each gate condition and the mean identification point for each emotion in milliseconds were analyzed and compared to results from Pell and Kotz (2011). We again found significant emotion-specific differences in the time needed to accurately recognize emotions from speech prosody, and new evidence that utterance-final syllables tended to facilitate listeners' accuracy in many conditions when compared to utterance-initial syllables. The time needed to recognize fear, anger, sadness, and neutral from speech cues was not influenced by how utterances were gated, although happiness and disgust were recognized significantly faster when listeners heard the end of utterances first. Our data provide new clues about the relative time course for recognizing vocally-expressed emotions within the 400–1200 ms time window, while highlighting that emotion recognition from prosody can be shaped by the temporal properties of speech.


Url:
DOI: 10.3389/fpsyg.2013.00367
PubMed: 23805115
PubMed Central: 3690349

Links to Exploration step

PMC:3690349

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Feeling backwards? How temporal order in speech affects the time course of vocal emotion recognition</title>
<author>
<name sortKey="Rigoulot, Simon" sort="Rigoulot, Simon" uniqKey="Rigoulot S" first="Simon" last="Rigoulot">Simon Rigoulot</name>
<affiliation>
<nlm:aff id="aff1">
<institution>Faculty of Medicine, School of Communication Sciences and Disorders, McGill University</institution>
<country>Montreal, QC, Canada</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff2">
<institution>McGill Centre for Research on Brain, Language and Music</institution>
<country>Montreal, QC, Canada</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wassiliwizky, Eugen" sort="Wassiliwizky, Eugen" uniqKey="Wassiliwizky E" first="Eugen" last="Wassiliwizky">Eugen Wassiliwizky</name>
<affiliation>
<nlm:aff id="aff1">
<institution>Faculty of Medicine, School of Communication Sciences and Disorders, McGill University</institution>
<country>Montreal, QC, Canada</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff3">
<institution>Cluster of Excellence “Languages of Emotion”, Freie Universität Berlin</institution>
<country>Berlin, Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Pell, Marc D" sort="Pell, Marc D" uniqKey="Pell M" first="Marc D." last="Pell">Marc D. Pell</name>
<affiliation>
<nlm:aff id="aff1">
<institution>Faculty of Medicine, School of Communication Sciences and Disorders, McGill University</institution>
<country>Montreal, QC, Canada</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff2">
<institution>McGill Centre for Research on Brain, Language and Music</institution>
<country>Montreal, QC, Canada</country>
</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">23805115</idno>
<idno type="pmc">3690349</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3690349</idno>
<idno type="RBID">PMC:3690349</idno>
<idno type="doi">10.3389/fpsyg.2013.00367</idno>
<date when="2013">2013</date>
<idno type="wicri:Area/Pmc/Corpus">000144</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000144</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Feeling backwards? How temporal order in speech affects the time course of vocal emotion recognition</title>
<author>
<name sortKey="Rigoulot, Simon" sort="Rigoulot, Simon" uniqKey="Rigoulot S" first="Simon" last="Rigoulot">Simon Rigoulot</name>
<affiliation>
<nlm:aff id="aff1">
<institution>Faculty of Medicine, School of Communication Sciences and Disorders, McGill University</institution>
<country>Montreal, QC, Canada</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff2">
<institution>McGill Centre for Research on Brain, Language and Music</institution>
<country>Montreal, QC, Canada</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wassiliwizky, Eugen" sort="Wassiliwizky, Eugen" uniqKey="Wassiliwizky E" first="Eugen" last="Wassiliwizky">Eugen Wassiliwizky</name>
<affiliation>
<nlm:aff id="aff1">
<institution>Faculty of Medicine, School of Communication Sciences and Disorders, McGill University</institution>
<country>Montreal, QC, Canada</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff3">
<institution>Cluster of Excellence “Languages of Emotion”, Freie Universität Berlin</institution>
<country>Berlin, Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Pell, Marc D" sort="Pell, Marc D" uniqKey="Pell M" first="Marc D." last="Pell">Marc D. Pell</name>
<affiliation>
<nlm:aff id="aff1">
<institution>Faculty of Medicine, School of Communication Sciences and Disorders, McGill University</institution>
<country>Montreal, QC, Canada</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff2">
<institution>McGill Centre for Research on Brain, Language and Music</institution>
<country>Montreal, QC, Canada</country>
</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Frontiers in Psychology</title>
<idno type="eISSN">1664-1078</idno>
<imprint>
<date when="2013">2013</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Recent studies suggest that the time course for recognizing vocal expressions of basic emotion in speech varies significantly by emotion type, implying that listeners uncover acoustic evidence about emotions at different rates in speech (e.g.,
<italic>fear</italic>
is recognized most quickly whereas
<italic>happiness</italic>
and
<italic>disgust</italic>
are recognized relatively slowly; Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
). To investigate whether vocal emotion recognition is largely dictated by the amount of time listeners are exposed to speech or the position of critical emotional cues in the utterance, 40 English participants judged the meaning of emotionally-inflected pseudo-utterances presented in a gating paradigm, where utterances were gated as a function of their syllable structure in segments of increasing duration from the
<italic>end</italic>
of the utterance (i.e., gated syllable-by-syllable from the
<italic>offset</italic>
rather than the onset of the stimulus). Accuracy for detecting six target emotions in each gate condition and the mean identification point for each emotion in milliseconds were analyzed and compared to results from Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
). We again found significant emotion-specific differences in the time needed to accurately recognize emotions from speech prosody, and new evidence that utterance-final syllables tended to facilitate listeners' accuracy in many conditions when compared to utterance-initial syllables. The time needed to recognize
<italic>fear</italic>
,
<italic>anger</italic>
,
<italic>sadness</italic>
, and
<italic>neutral</italic>
from speech cues was not influenced by how utterances were gated, although
<italic>happiness</italic>
and
<italic>disgust</italic>
were recognized significantly faster when listeners heard the end of utterances first. Our data provide new clues about the relative time course for recognizing vocally-expressed emotions within the 400–1200 ms time window, while highlighting that emotion recognition from prosody can be shaped by the temporal properties of speech.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Alter, K" uniqKey="Alter K">K. Alter</name>
</author>
<author>
<name sortKey="Rank, E" uniqKey="Rank E">E. Rank</name>
</author>
<author>
<name sortKey="Kotz, S A" uniqKey="Kotz S">S. A. Kotz</name>
</author>
<author>
<name sortKey="Toepel, U" uniqKey="Toepel U">U. Toepel</name>
</author>
<author>
<name sortKey="Besson, M" uniqKey="Besson M">M. Besson</name>
</author>
<author>
<name sortKey="Schirmer, A" uniqKey="Schirmer A">A. Schirmer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Banse, R" uniqKey="Banse R">R. Banse</name>
</author>
<author>
<name sortKey="Scherer, K R" uniqKey="Scherer K">K. R. Scherer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Barkhuysen, P" uniqKey="Barkhuysen P">P. Barkhuysen</name>
</author>
<author>
<name sortKey="Krahmer, E" uniqKey="Krahmer E">E. Krahmer</name>
</author>
<author>
<name sortKey="Swerts, M" uniqKey="Swerts M">M. Swerts</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Becker, D V" uniqKey="Becker D">D. V. Becker</name>
</author>
<author>
<name sortKey="Neel, R" uniqKey="Neel R">R. Neel</name>
</author>
<author>
<name sortKey="Srinivasan, N" uniqKey="Srinivasan N">N. Srinivasan</name>
</author>
<author>
<name sortKey="Neufeld, S" uniqKey="Neufeld S">S. Neufeld</name>
</author>
<author>
<name sortKey="Kumar, D" uniqKey="Kumar D">D. Kumar</name>
</author>
<author>
<name sortKey="Fouse, S" uniqKey="Fouse S">S. Fouse</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Boersma, P" uniqKey="Boersma P">P. Boersma</name>
</author>
<author>
<name sortKey="Weenink, D" uniqKey="Weenink D">D. Weenink</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bolinger, D" uniqKey="Bolinger D">D. Bolinger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bolinger, D" uniqKey="Bolinger D">D. Bolinger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cacioppo, J T" uniqKey="Cacioppo J">J. T. Cacioppo</name>
</author>
<author>
<name sortKey="Gardner, W L" uniqKey="Gardner W">W. L. Gardner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Calder, A J" uniqKey="Calder A">A. J. Calder</name>
</author>
<author>
<name sortKey="Keane, J" uniqKey="Keane J">J. Keane</name>
</author>
<author>
<name sortKey="Lawrence, A D" uniqKey="Lawrence A">A. D. Lawrence</name>
</author>
<author>
<name sortKey="Manes, F" uniqKey="Manes F">F. Manes</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Calder, A J" uniqKey="Calder A">A. J. Calder</name>
</author>
<author>
<name sortKey="Lawrence, A D" uniqKey="Lawrence A">A. D. Lawrence</name>
</author>
<author>
<name sortKey="Young, A W" uniqKey="Young A">A. W. Young</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Calder, A J" uniqKey="Calder A">A. J. Calder</name>
</author>
<author>
<name sortKey="Keane, J" uniqKey="Keane J">J. Keane</name>
</author>
<author>
<name sortKey="Young, A W" uniqKey="Young A">A. W. Young</name>
</author>
<author>
<name sortKey="Lawrence, A D" uniqKey="Lawrence A">A. D. Lawrence</name>
</author>
<author>
<name sortKey="Mason, S" uniqKey="Mason S">S. Mason</name>
</author>
<author>
<name sortKey="Barker, R" uniqKey="Barker R">R. Barker</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Calvo, M G" uniqKey="Calvo M">M. G. Calvo</name>
</author>
<author>
<name sortKey="Nummenmaa, L" uniqKey="Nummenmaa L">L. Nummenmaa</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Carretie, L" uniqKey="Carretie L">L. Carretié</name>
</author>
<author>
<name sortKey="Mercado, F" uniqKey="Mercado F">F. Mercado</name>
</author>
<author>
<name sortKey="Tapia, M" uniqKey="Tapia M">M. Tapia</name>
</author>
<author>
<name sortKey="Hinojosa, J A" uniqKey="Hinojosa J">J. A. Hinojosa</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cornew, L" uniqKey="Cornew L">L. Cornew</name>
</author>
<author>
<name sortKey="Carver, L" uniqKey="Carver L">L. Carver</name>
</author>
<author>
<name sortKey="Love, T" uniqKey="Love T">T. Love</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cosmides, L" uniqKey="Cosmides L">L. Cosmides</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ekman, P" uniqKey="Ekman P">P. Ekman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ekman, P" uniqKey="Ekman P">P. Ekman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ekman, P" uniqKey="Ekman P">P. Ekman</name>
</author>
<author>
<name sortKey="Levenson, R W" uniqKey="Levenson R">R. W. Levenson</name>
</author>
<author>
<name sortKey="Friesen, W V" uniqKey="Friesen W">W. V. Friesen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Grosjean, F" uniqKey="Grosjean F">F. Grosjean</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Grosjean, F" uniqKey="Grosjean F">F. Grosjean</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hess, U" uniqKey="Hess U">U. Hess</name>
</author>
<author>
<name sortKey="Beaupre, M G" uniqKey="Beaupre M">M. G. Beaupré</name>
</author>
<author>
<name sortKey="Cheung, N" uniqKey="Cheung N">N. Cheung</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Izard, C E" uniqKey="Izard C">C. E. Izard</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jaywant, A" uniqKey="Jaywant A">A. Jaywant</name>
</author>
<author>
<name sortKey="Pell, M D" uniqKey="Pell M">M. D. Pell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Johnstone, T" uniqKey="Johnstone T">T. Johnstone</name>
</author>
<author>
<name sortKey="Scherer, K R" uniqKey="Scherer K">K. R. Scherer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Juslin, P N" uniqKey="Juslin P">P. N. Juslin</name>
</author>
<author>
<name sortKey="Laukka, P" uniqKey="Laukka P">P. Laukka</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ladd, D R" uniqKey="Ladd D">D. R. Ladd</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ladd, D R" uniqKey="Ladd D">D. R. Ladd</name>
</author>
<author>
<name sortKey="Silverman, K" uniqKey="Silverman K">K. Silverman</name>
</author>
<author>
<name sortKey="Tolkmitt, F" uniqKey="Tolkmitt F">F. Tolkmitt</name>
</author>
<author>
<name sortKey="Bergmann, G" uniqKey="Bergmann G">G. Bergmann</name>
</author>
<author>
<name sortKey="Scherer, K R" uniqKey="Scherer K">K. R. Scherer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Laukka, P" uniqKey="Laukka P">P. Laukka</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Laukka, P" uniqKey="Laukka P">P. Laukka</name>
</author>
<author>
<name sortKey="Juslin, P" uniqKey="Juslin P">P. Juslin</name>
</author>
<author>
<name sortKey="Bresin, R" uniqKey="Bresin R">R. Bresin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Leinonen, L" uniqKey="Leinonen L">L. Leinonen</name>
</author>
<author>
<name sortKey="Hiltunen, T" uniqKey="Hiltunen T">T. Hiltunen</name>
</author>
<author>
<name sortKey="Linnankoski, I" uniqKey="Linnankoski I">I. Linnankoski</name>
</author>
<author>
<name sortKey="Laakso, M L" uniqKey="Laakso M">M.-L. Laakso</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Levenson, R W" uniqKey="Levenson R">R. W. Levenson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Levitt, E A" uniqKey="Levitt E">E. A. Levitt</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Oller, D K" uniqKey="Oller D">D. K. Oller</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Palermo, R" uniqKey="Palermo R">R. Palermo</name>
</author>
<author>
<name sortKey="Coltheart, M" uniqKey="Coltheart M">M. Coltheart</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Paulmann, S" uniqKey="Paulmann S">S. Paulmann</name>
</author>
<author>
<name sortKey="Kotz, S A" uniqKey="Kotz S">S. A. Kotz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Paulmann, S" uniqKey="Paulmann S">S. Paulmann</name>
</author>
<author>
<name sortKey="Pell, M D" uniqKey="Pell M">M. D. Pell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Paulmann, S" uniqKey="Paulmann S">S. Paulmann</name>
</author>
<author>
<name sortKey="Pell, M D" uniqKey="Pell M">M. D. Pell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Paulmann, S" uniqKey="Paulmann S">S. Paulmann</name>
</author>
<author>
<name sortKey="Pell, M D" uniqKey="Pell M">M. D. Pell</name>
</author>
<author>
<name sortKey="Kotz, S A" uniqKey="Kotz S">S. A. Kotz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pell, M D" uniqKey="Pell M">M. D. Pell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pell, M D" uniqKey="Pell M">M. D. Pell</name>
</author>
<author>
<name sortKey="Baum, S R" uniqKey="Baum S">S. R. Baum</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pell, M D" uniqKey="Pell M">M. D. Pell</name>
</author>
<author>
<name sortKey="Kotz, S" uniqKey="Kotz S">S. Kotz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pell, M D" uniqKey="Pell M">M. D. Pell</name>
</author>
<author>
<name sortKey="Paulmann, S" uniqKey="Paulmann S">S. Paulmann</name>
</author>
<author>
<name sortKey="Dara, C" uniqKey="Dara C">C. Dara</name>
</author>
<author>
<name sortKey="Alasseri, A" uniqKey="Alasseri A">A. Alasseri</name>
</author>
<author>
<name sortKey="Kotz, S A" uniqKey="Kotz S">S. A. Kotz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pell, M D" uniqKey="Pell M">M. D. Pell</name>
</author>
<author>
<name sortKey="Skorup, V" uniqKey="Skorup V">V. Skorup</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rigoulot, S" uniqKey="Rigoulot S">S. Rigoulot</name>
</author>
<author>
<name sortKey="Pell, M D" uniqKey="Pell M">M. D. Pell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rozin, P" uniqKey="Rozin P">P. Rozin</name>
</author>
<author>
<name sortKey="Lowery, L" uniqKey="Lowery L">L. Lowery</name>
</author>
<author>
<name sortKey="Ebert, R" uniqKey="Ebert R">R. Ebert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sauter, D" uniqKey="Sauter D">D. Sauter</name>
</author>
<author>
<name sortKey="Eisner, F" uniqKey="Eisner F">F. Eisner</name>
</author>
<author>
<name sortKey="Ekman, P" uniqKey="Ekman P">P. Ekman</name>
</author>
<author>
<name sortKey="Scott, S K" uniqKey="Scott S">S. K. Scott</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sauter, D" uniqKey="Sauter D">D. Sauter</name>
</author>
<author>
<name sortKey="Scott, S K" uniqKey="Scott S">S. K. Scott</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Scherer, K R" uniqKey="Scherer K">K. R. Scherer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Scherer, K R" uniqKey="Scherer K">K. R. Scherer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Scherer, K R" uniqKey="Scherer K">K. R. Scherer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Scherer, K R" uniqKey="Scherer K">K. R. Scherer</name>
</author>
<author>
<name sortKey="Banse, R" uniqKey="Banse R">R. Banse</name>
</author>
<author>
<name sortKey="Wallbott, H G" uniqKey="Wallbott H">H. G. Wallbott</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Scherer, K R" uniqKey="Scherer K">K. R. Scherer</name>
</author>
<author>
<name sortKey="Banse, R" uniqKey="Banse R">R. Banse</name>
</author>
<author>
<name sortKey="Wallbott, H G" uniqKey="Wallbott H">H. G. Wallbott</name>
</author>
<author>
<name sortKey="Goldbeck, T" uniqKey="Goldbeck T">T. Goldbeck</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schirmer, A" uniqKey="Schirmer A">A. Schirmer</name>
</author>
<author>
<name sortKey="Kotz, S A" uniqKey="Kotz S">S. A. Kotz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Simon Thomas, E" uniqKey="Simon Thomas E">E. Simon-Thomas</name>
</author>
<author>
<name sortKey="Keltner, D" uniqKey="Keltner D">D. Keltner</name>
</author>
<author>
<name sortKey="Sauter, D" uniqKey="Sauter D">D. Sauter</name>
</author>
<author>
<name sortKey="Sinicropi Yao, L" uniqKey="Sinicropi Yao L">L. Sinicropi-Yao</name>
</author>
<author>
<name sortKey="Abramson, A" uniqKey="Abramson A">A. Abramson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sobin, C" uniqKey="Sobin C">C. Sobin</name>
</author>
<author>
<name sortKey="Alpert, M" uniqKey="Alpert M">M. Alpert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Szameitat, D P" uniqKey="Szameitat D">D. P. Szameitat</name>
</author>
<author>
<name sortKey="Kreifelts, B" uniqKey="Kreifelts B">B. Kreifelts</name>
</author>
<author>
<name sortKey="Alter, K" uniqKey="Alter K">K. Alter</name>
</author>
<author>
<name sortKey="Szameitat, A J" uniqKey="Szameitat A">A. J. Szameitat</name>
</author>
<author>
<name sortKey="Sterr, A" uniqKey="Sterr A">A. Sterr</name>
</author>
<author>
<name sortKey="Grodd, W" uniqKey="Grodd W">W. Grodd</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Thompson, W F" uniqKey="Thompson W">W. F. Thompson</name>
</author>
<author>
<name sortKey="Balkwill, L L" uniqKey="Balkwill L">L.-L. Balkwill</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tracy, J L" uniqKey="Tracy J">J. L. Tracy</name>
</author>
<author>
<name sortKey="Robins, R W" uniqKey="Robins R">R. W. Robins</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Bezooijen, R" uniqKey="Van Bezooijen R">R. Van Bezooijen</name>
</author>
<author>
<name sortKey="Otto, S" uniqKey="Otto S">S. Otto</name>
</author>
<author>
<name sortKey="Heenan, T" uniqKey="Heenan T">T. Heenan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wagner, H L" uniqKey="Wagner H">H. L. Wagner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wallbott, H G" uniqKey="Wallbott H">H. G. Wallbott</name>
</author>
<author>
<name sortKey="Scherer, K R" uniqKey="Scherer K">K. R. Scherer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wilson, D" uniqKey="Wilson D">D. Wilson</name>
</author>
<author>
<name sortKey="Wharton, T" uniqKey="Wharton T">T. Wharton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zuckerman, M" uniqKey="Zuckerman M">M. Zuckerman</name>
</author>
<author>
<name sortKey="Lipets, M S" uniqKey="Lipets M">M. S. Lipets</name>
</author>
<author>
<name sortKey="Koivumaki, J H" uniqKey="Koivumaki J">J. H. Koivumaki</name>
</author>
<author>
<name sortKey="Rosenthal, R" uniqKey="Rosenthal R">R. Rosenthal</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Front Psychol</journal-id>
<journal-id journal-id-type="iso-abbrev">Front Psychol</journal-id>
<journal-id journal-id-type="publisher-id">Front. Psychol.</journal-id>
<journal-title-group>
<journal-title>Frontiers in Psychology</journal-title>
</journal-title-group>
<issn pub-type="epub">1664-1078</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">23805115</article-id>
<article-id pub-id-type="pmc">3690349</article-id>
<article-id pub-id-type="doi">10.3389/fpsyg.2013.00367</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Psychology</subject>
<subj-group>
<subject>Original Research Article</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Feeling backwards? How temporal order in speech affects the time course of vocal emotion recognition</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Rigoulot</surname>
<given-names>Simon</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wassiliwizky</surname>
<given-names>Eugen</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Pell</surname>
<given-names>Marc D.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Faculty of Medicine, School of Communication Sciences and Disorders, McGill University</institution>
<country>Montreal, QC, Canada</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>McGill Centre for Research on Brain, Language and Music</institution>
<country>Montreal, QC, Canada</country>
</aff>
<aff id="aff3">
<sup>3</sup>
<institution>Cluster of Excellence “Languages of Emotion”, Freie Universität Berlin</institution>
<country>Berlin, Germany</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Anjali Bhatara, Université Paris Descartes, France</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: David V. Becker, Arizona State University, USA; Emiel Krahmer, Tilburg University, Netherlands</p>
</fn>
<corresp id="fn001">*Correspondence: Simon Rigoulot, Faculty of Medicine, School of Communication Sciences and Disorders, McGill University, 1266 Avenue des Pins Ouest, Montreal, QC H3G 1A8, Canada e-mail:
<email xlink:type="simple">simon.rigoulot@mail.mcgill.ca</email>
</corresp>
<fn fn-type="other" id="fn002">
<p>This article was submitted to Frontiers in Emotion Science, a specialty of Frontiers in Psychology.</p>
</fn>
</author-notes>
<pub-date pub-type="epreprint">
<day>27</day>
<month>3</month>
<year>2013</year>
</pub-date>
<pub-date pub-type="epub">
<day>24</day>
<month>6</month>
<year>2013</year>
</pub-date>
<pub-date pub-type="collection">
<year>2013</year>
</pub-date>
<volume>4</volume>
<elocation-id>367</elocation-id>
<history>
<date date-type="received">
<day>22</day>
<month>2</month>
<year>2013</year>
</date>
<date date-type="accepted">
<day>04</day>
<month>6</month>
<year>2013</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2013 Rigoulot, Wassiliwizky and Pell.</copyright-statement>
<copyright-year>2013</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/3.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.</license-p>
</license>
</permissions>
<abstract>
<p>Recent studies suggest that the time course for recognizing vocal expressions of basic emotion in speech varies significantly by emotion type, implying that listeners uncover acoustic evidence about emotions at different rates in speech (e.g.,
<italic>fear</italic>
is recognized most quickly whereas
<italic>happiness</italic>
and
<italic>disgust</italic>
are recognized relatively slowly; Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
). To investigate whether vocal emotion recognition is largely dictated by the amount of time listeners are exposed to speech or the position of critical emotional cues in the utterance, 40 English participants judged the meaning of emotionally-inflected pseudo-utterances presented in a gating paradigm, where utterances were gated as a function of their syllable structure in segments of increasing duration from the
<italic>end</italic>
of the utterance (i.e., gated syllable-by-syllable from the
<italic>offset</italic>
rather than the onset of the stimulus). Accuracy for detecting six target emotions in each gate condition and the mean identification point for each emotion in milliseconds were analyzed and compared to results from Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
). We again found significant emotion-specific differences in the time needed to accurately recognize emotions from speech prosody, and new evidence that utterance-final syllables tended to facilitate listeners' accuracy in many conditions when compared to utterance-initial syllables. The time needed to recognize
<italic>fear</italic>
,
<italic>anger</italic>
,
<italic>sadness</italic>
, and
<italic>neutral</italic>
from speech cues was not influenced by how utterances were gated, although
<italic>happiness</italic>
and
<italic>disgust</italic>
were recognized significantly faster when listeners heard the end of utterances first. Our data provide new clues about the relative time course for recognizing vocally-expressed emotions within the 400–1200 ms time window, while highlighting that emotion recognition from prosody can be shaped by the temporal properties of speech.</p>
</abstract>
<kwd-group>
<kwd>vocal emotions</kwd>
<kwd>prosody</kwd>
<kwd>speech perception</kwd>
<kwd>auditory gating</kwd>
<kwd>acoustics</kwd>
</kwd-group>
<counts>
<fig-count count="4"></fig-count>
<table-count count="2"></table-count>
<equation-count count="0"></equation-count>
<ref-count count="63"></ref-count>
<page-count count="14"></page-count>
<word-count count="10628"></word-count>
</counts>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>Emotional events, and more specifically social displays of emotion—the expression of a face, the tone of a speaker's voice, and/or their body posture and movements—must be decoded successfully and
<italic>quickly</italic>
to avoid negative outcomes and to promote individual goals. Emotional expressions vary according to many factors, such as their mode of expression (auditory/visual), valence (positive/negative), power to arouse (low/high), antecedents, and potential outcomes (see Scherer,
<xref ref-type="bibr" rid="B50">2009</xref>
for a discussion). As early as the seventeenth century, these differences raised the question of the
<italic>specificity</italic>
of emotions; in his treatise
<italic>“Les Passions de l'Ame,”</italic>
the French philosopher Descartes proposed the existence of six “primary” emotions from which all other emotions are derived. In recent decades, studies demonstrating accurate pan-cultural recognition of emotional faces (Izard,
<xref ref-type="bibr" rid="B22">1971</xref>
; Ekman,
<xref ref-type="bibr" rid="B16">1972</xref>
) and distinct patterns of autonomic nervous system activity in response to certain emotions (e.g., Ekman et al.,
<xref ref-type="bibr" rid="B18">1983</xref>
; Levenson,
<xref ref-type="bibr" rid="B31">1992</xref>
) have served to fuel the idea of a fixed set of discrete and hypothetically “basic” emotions, typically
<italic>anger, fear, disgust, sadness</italic>
, and
<italic>happiness</italic>
, although opinions vary (see Ekman,
<xref ref-type="bibr" rid="B17">1992</xref>
; Sauter et al.,
<xref ref-type="bibr" rid="B46">2010</xref>
). Within this theoretical framework, expressions of basic emotion possess unique physical characteristics that render them discrete in communication when conveyed in the face as well as in the voice (Ekman,
<xref ref-type="bibr" rid="B17">1992</xref>
), although the vast majority of this work has focused on communication in the facial channel.</p>
<p>The structure of
<italic>vocal</italic>
emotion expressions embedded in spoken language, or
<italic>emotional prosody</italic>
, is now being investigated systematically from different perspectives. Perceptual-acoustic studies show that basic emotions can be reliably identified and differentiated at high accuracy levels from prosodic cues alone, and that these expressions are marked by distinct acoustic patterns characterized by differences in perceived duration, speech rate, intensity, pitch register and variation, and other speech parameters (among many others, Cosmides,
<xref ref-type="bibr" rid="B15">1983</xref>
; Scherer et al.,
<xref ref-type="bibr" rid="B52">1991</xref>
; Banse and Scherer,
<xref ref-type="bibr" rid="B2">1996</xref>
; Sobin and Alpert,
<xref ref-type="bibr" rid="B56">1999</xref>
; Johnstone and Scherer,
<xref ref-type="bibr" rid="B24">2000</xref>
; Juslin and Laukka,
<xref ref-type="bibr" rid="B25">2003</xref>
; Laukka et al.,
<xref ref-type="bibr" rid="B29">2005</xref>
; Pell et al.,
<xref ref-type="bibr" rid="B42">2009</xref>
). For example, speech rate tends to decrease when speakers are sad and increase when speakers experience fear; at the same time, differences in relative pitch height, variation, and other cue configurations serve to differentiate these (and other) emotional meanings (see Juslin and Laukka,
<xref ref-type="bibr" rid="B25">2003</xref>
for a comprehensive review). Similar to observations in the visual modality, cross-cultural studies on the identification of vocal emotions show that
<italic>anger</italic>
,
<italic>fear</italic>
,
<italic>sadness</italic>
,
<italic>happiness</italic>
, and
<italic>disgust</italic>
can be recognized by listeners at levels significantly above chance when they hear semantically-meaningless “pseudo-utterances” or utterances spoken in a foreign language (Scherer et al.,
<xref ref-type="bibr" rid="B51">2001</xref>
; Thompson and Balkwill,
<xref ref-type="bibr" rid="B58">2006</xref>
; Pell et al.,
<xref ref-type="bibr" rid="B42">2009</xref>
; Sauter et al.,
<xref ref-type="bibr" rid="B46">2010</xref>
). These data argue that basic emotions conveyed by speech prosody exhibit a core set of unique physical/acoustic properties that are emotion-specific and seemingly shared across languages (Scherer et al.,
<xref ref-type="bibr" rid="B51">2001</xref>
; Pell et al.,
<xref ref-type="bibr" rid="B42">2009</xref>
).</p>
<p>A critical process that has been underestimated in the characterization of how vocal emotions are communicated is the
<italic>time course</italic>
for recognizing basic emotions in speech. In the visual modality, the time course for recognizing emotional facial expressions has been investigated by presenting static displays of facial expressions (Tracy and Robins,
<xref ref-type="bibr" rid="B59">2008</xref>
) or animated face stimuli (Becker et al.,
<xref ref-type="bibr" rid="B4">2012</xref>
). In this latter study, the authors used a morphed continuum running from a neutral exemplar to either a happy or an angry expression and found that happy faces were recognized faster than angry faces, suggesting temporal specificities in the process for recognizing basic emotions in the visual modality (see Palermo and Coltheart,
<xref ref-type="bibr" rid="B34">2004</xref>
). Since emotional meanings encoded by prosody can
<italic>only</italic>
be accessed from their temporal acoustic structure, it is surprising that comparative data on the time course for recognizing basic emotions from prosody remain sparse.</p>
<p>Recently, two studies (Cornew et al.,
<xref ref-type="bibr" rid="B14">2010</xref>
; Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
) examined the temporal processing of vocal emotion expressions using a modified version of Grosjean's (
<xref ref-type="bibr" rid="B19">1980</xref>
) gating paradigm. The auditory gating procedure—originally designed to pinpoint how much acoustic information is needed for lexical access and word recognition—consists of artificially constructing “gates” as a function of specific time increments or of relevant linguistic units of spoken language; the gated stimuli are judged by listeners in blocks of increasing gate duration, typically starting at the onset of the relevant stimulus, where the last gate presented usually corresponds to the entire stimulus event (see Grosjean,
<xref ref-type="bibr" rid="B20">1996</xref>
for a discussion of methodological variables). An emotional variant of this paradigm considers how much acoustic information is needed for vocal emotions to be registered and consciously accessed for explicit recognition, using a forced-choice emotion-labeling paradigm. Given the hypothesis that acoustic patterns reflect “natural codes” that progressively activate stored conceptual information about basic emotions (e.g., Schirmer and Kotz,
<xref ref-type="bibr" rid="B54">2006</xref>
; Wilson and Wharton,
<xref ref-type="bibr" rid="B63">2006</xref>
), this emotional gating procedure allows inferences about the time course of emotion processing in the specific context of speech, and whether the time needed varies as a function of the emotional signal being transmitted.</p>
<p>In the first study, Cornew and colleagues (
<xref ref-type="bibr" rid="B14">2010</xref>
) presented English-like pseudo-utterances spoken in a
<italic>happy</italic>
,
<italic>angry</italic>
, or
<italic>neutral</italic>
prosody to English listeners spliced into 250 millisecond (ms) gates of increasing duration. Following each stimulus, participants made a three-choice forced response to identify the meaning conveyed. The authors found that listeners required less time (i.e., exposure to acoustic information) to identify
<italic>neutral</italic>
sentences when compared to
<italic>angry</italic>
and
<italic>happy</italic>
sentences, suggesting that vocal emotion expressions unfold at different rates (an effect the authors attributed to a
<italic>neutral</italic>
bias in perception). The idea that vocal emotions unfold at different rates was replicated by Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
), who gated English-like pseudo-utterances as a function of their
<italic>syllable structure</italic>
as opposed to specific time increments. Forty-eight English participants listened to 7-syllable utterances conveying one of five basic emotions (anger, disgust, fear, sadness, happiness) or neutral prosody, beginning with presentation of only the first syllable of the utterance, the first two syllables, and so forth until the full sentence was presented (a six-choice forced response was recorded). Emotion identification times were then calculated by converting the number of syllables needed to accurately identify the target emotion of each utterance without further changes in the participant's response at longer gate intervals, into their actual duration for recognition.</p>
<p>Results showed that there were important emotion-specific differences in the accuracy and time course for recognizing vocal emotions, with specific evidence that
<italic>fear</italic>
,
<italic>sadness</italic>
,
<italic>neutral</italic>
, and
<italic>anger</italic>
were recognized from significantly less acoustic information than
<italic>happiness</italic>
or
<italic>disgust</italic>
, from otherwise identical pseudo-utterances. Prosodic cues conveying
<italic>neutral</italic>
,
<italic>fear</italic>
,
<italic>sadness</italic>
, and
<italic>anger</italic>
could be detected from utterances lasting approximately 500–700 ms (
<italic>M</italic>
= 510, 517, 576, and 710 ms, respectively), whereas
<italic>happiness</italic>
(
<italic>M</italic>
= 977 ms) and
<italic>disgust</italic>
(
<italic>M</italic>
= 1486 ms) required substantially longer stimulus analysis. Despite the fact that Cornew et al. (
<xref ref-type="bibr" rid="B14">2010</xref>
) focused on a restricted set of emotions when compared to Pell and Kotz (3-choice vs. 6-choice task), and gated their stimuli in a different manner (250 ms increments vs. syllables), there were notable similarities between the two studies in the average times needed to identify neutral (444 vs. 510 ms), angry (723 vs. 710 ms), and happy expressions (802 vs. 977 ms, respectively), although Pell and Kotz's (
<xref ref-type="bibr" rid="B41">2011</xref>
) results show that this does not reflect a bias for recognizing neutral prosody as initially proposed (Cornew et al.,
<xref ref-type="bibr" rid="B14">2010</xref>
). Together, these studies establish that the time course of vocal emotion recognition in speech varies significantly according to the emotional meaning being conveyed, in line with results demonstrating emotion-specificity in facial emotion recognition (Becker et al.,
<xref ref-type="bibr" rid="B4">2012</xref>
), although the relative pattern of emotion-specific differences observed in the auditory vs. visual modality appears to be quite different as noted elsewhere in the literature using different experimental paradigms (e.g., Wallbott and Scherer,
<xref ref-type="bibr" rid="B62">1986</xref>
; Paulmann and Pell,
<xref ref-type="bibr" rid="B36">2011</xref>
).</p>
<p>Of interest here, closer inspection of Pell and Kotz's (
<xref ref-type="bibr" rid="B41">2011</xref>
) data reveal that recognition of
<italic>happiness</italic>
and
<italic>disgust</italic>
, in contrast to other basic emotions, improved at relatively long utterance durations (5–7 syllables); in fact, when full sentences were presented, recognition of
<italic>happy</italic>
prosody was comparable in accuracy to
<italic>sadness, anger</italic>
, and
<italic>fear</italic>
despite the fact that these latter emotions were recognized much more accurately than happiness following brief stimulus exposure. Some emotions such as
<italic>happiness</italic>
and
<italic>fear</italic>
seemed to be particularly salient when the last syllables were presented, leading to significant increases in recognition accuracy at the end of utterances in that study. These results imply that the amount of time needed to identify basic emotions from prosody depends partly on the
<italic>position</italic>
of salient acoustic properties in speech, at least for certain emotions. Interestingly, Pell (
<xref ref-type="bibr" rid="B39">2001</xref>
) reported that
<italic>happy</italic>
utterances exhibit unique acoustic differences in sentence-final position when compared to linguistically identical
<italic>angry</italic>
,
<italic>sad</italic>
, and
<italic>neutral</italic>
utterances, arguing that the position of acoustic cues, and not just time, is a key factor in communicating vocal emotions in speech. Other data underscore that the ability to recognize basic emotions varies significantly depending on the channel of expression—i.e., whether conveyed by facial expressions, vocal expressions, or linguistic content (Paulmann and Pell,
<xref ref-type="bibr" rid="B36">2011</xref>
)—with evidence that
<italic>fear, sadness, anger</italic>
, and
<italic>neutral</italic>
are effectively conveyed by speech prosody, whereas other emotions such as
<italic>happiness</italic>
or
<italic>disgust</italic>
are much more salient in other channels (Paulmann and Pell,
<xref ref-type="bibr" rid="B36">2011</xref>
). These findings raise the possibility that when basic emotions are preferentially communicated in channels other than the voice, vocal concomitants of these emotions are encoded and recognized somewhat differently; for example, they could be partly marked by local variations in acoustic cues that signal the interpersonal function or social relevance of these cues to the listener at the end of a discourse, similar to how the smile may reflect
<italic>happiness</italic>
or may serve social functions such as appeasement or dominance (Hess et al.,
<xref ref-type="bibr" rid="B21">2002</xref>
).</p>
<p>Further investigations are clearly needed to understand the time course of vocal emotion recognition in speech and to inform whether temporal specificities documented by initial studies (Cornew et al.,
<xref ref-type="bibr" rid="B14">2010</xref>
; Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
) are solely dictated by the
<italic>amount of time</italic>
listeners require to identify vocal emotions, or whether linguistic structure plays a role for identifying some emotions. We tested this question using the same gating paradigm and emotionally-inflected utterances as Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
), although here we presented pseudo-utterances gated syllable-by-syllable from the
<italic>offset</italic>
rather than the onset of the stimulus (i.e., in a “backwards” or reverse direction) to test whether recognition times depend on how utterances are presented. If the critical factor for recognizing certain basic emotions in the voice is the unfolding of acoustic evidence over a set period of time, we expected similar outcomes/emotion identification times as those reported by Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
) irrespective of how utterances were gated; this result would establish that modal acoustic properties for understanding emotions tend to permeate the speech signal (perhaps due to their association with distinct physiological “push effects,” e.g., Scherer,
<xref ref-type="bibr" rid="B48">1986</xref>
,
<xref ref-type="bibr" rid="B50">2009</xref>
) and are decoded according to a standard time course. However, if important acoustic cues for recognizing vocal emotions are differentially encoded within an utterance, we should witness significantly different emotion identification times here when utterances are gated from their offset when compared to when they are presented from their onset (Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
). This result could supply evidence that some emotions are “socialized” to a greater extent in the context of speech prosody through functionally distinct encoding processes.</p>
</sec>
<sec sec-type="methods" id="s2">
<title>Methods</title>
<sec>
<title>Participants</title>
<p>Forty native English speakers recruited through campus advertisements (20 men/20 women, mean age: 25 ± 5 years) took part in the study. All participants were right-handed and reported normal hearing and normal or corrected-to-normal vision. Informed written consent was obtained from each participant prior to the study, which was ethically approved by the Faculty of Medicine Institutional Review Board at McGill University (Montréal, Canada). Before the experiment, each participant completed a questionnaire to establish basic demographic information (age, education, language skills).</p>
</sec>
<sec>
<title>Stimuli</title>
<p>As described by Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
), the stimuli were emotionally-inflected pseudo-utterances (e.g.,
<italic>The placter jabored the tozz</italic>
) selected from an existing database of recorded exemplars, validated and successfully used in previous work (e.g., Pell et al.,
<xref ref-type="bibr" rid="B42">2009</xref>
; Paulmann and Pell,
<xref ref-type="bibr" rid="B35">2010</xref>
; Rigoulot and Pell,
<xref ref-type="bibr" rid="B44">2012</xref>
). Pseudo-utterances mimic the phonotactic and morpho-syntactic properties of the target language but lack meaningful lexical-semantic cues about emotion, allowing researchers to study the isolated effects of emotional prosody in speech (see Scherer et al.,
<xref ref-type="bibr" rid="B52">1991</xref>
; Pell and Baum,
<xref ref-type="bibr" rid="B40">1997</xref>
for earlier examples). The selected utterances were digitally recorded by two male and two female speakers in a sound-attenuated booth, saved as individual audio files, and perceptually validated by a group of 24 native listeners using a seven-alternative forced-choice emotion recognition task (see Pell et al.,
<xref ref-type="bibr" rid="B42">2009</xref>
, for full details). For this study we selected a subset of 120 pseudo-utterances that reliably conveyed
<italic>anger, disgust, fear, happiness, sadness</italic>
and
<italic>neutral</italic>
expressions to listeners (20 exemplars per emotion). Thirteen unique pseudo-utterance phrases produced by the four speakers to convey each emotion were repeated throughout the experiment (see Section Appendix). These sentences were the same in their (pseudo) linguistic content as those presented by Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
), although the precise recordings selected here were sometimes different because some phrases were emotionally expressed by a different speaker (75% of the chosen recordings were identical to those presented by Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
). For all emotions, the target meaning encoded by prosody for these items was recognized at very high accuracy levels based on data from the validation study (anger = 86%; disgust = 76%; fear = 91%; happiness = 84%; sadness = 93%; neutral = 83%, where chance in the validation study was approximately 14%). Pseudo-utterances conveying each emotion were produced in equal numbers by two male and two female speakers and were all seven syllables in length prior to gate construction.</p>
</sec>
<sec>
<title>Gate construction</title>
<p>Each utterance was deconstructed into seven gates according to the syllable structure of the sentence using Praat speech analysis software (Boersma and Weenink,
<xref ref-type="bibr" rid="B5">2012</xref>
). As we were interested in the time course of emotion recognition when utterances were presented from their end to their beginning, the first Gate (Gate_7) of each stimulus consisted of only the last syllable of the utterance, the second gate (Gate_6-7) consisted of the last two syllables, and so on to Gate_1-7 (presentation of the full utterance). For each of the 120 items, this procedure produced seven gated stimuli (Gate_7, Gate_6-7, Gate_5-7, Gate_4-7, Gate_3-7, Gate_2-7, Gate_1-7) each composed of a different number of syllables (120 × 7 = 840 unique items). Note that since the onset of most gated stimuli occurred at a syllable break
<italic>within</italic>
the utterance (with the exception of Gate_1-7), these stimuli gave the impression of being “chopped off” at the beginning and starting abruptly. As shown in Table
<xref ref-type="table" rid="T1">1</xref>
, the duration of items presented in each gate condition differed by emotion type due to well-documented temporal differences in the specification of vocal emotion expressions (Juslin and Laukka,
<xref ref-type="bibr" rid="B25">2003</xref>
; Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
).</p>
<table-wrap id="T1" position="float">
<label>Table 1</label>
<caption>
<p>
<bold>Duration of the stimuli presented in the experiment in each gate duration condition as a function of emotion</bold>
.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th rowspan="1" colspan="1"></th>
<th align="left" rowspan="1" colspan="1">
<bold>Emotion</bold>
</th>
<th align="center" colspan="7" rowspan="1">
<bold>Gate condition (# syllables)</bold>
</th>
</tr>
<tr>
<th rowspan="1" colspan="1"></th>
<th rowspan="1" colspan="1"></th>
<th align="left" rowspan="1" colspan="1">
<bold>G_7</bold>
</th>
<th align="left" rowspan="1" colspan="1">
<bold>G_6-7</bold>
</th>
<th align="left" rowspan="1" colspan="1">
<bold>G_5-7</bold>
</th>
<th align="left" rowspan="1" colspan="1">
<bold>G_4-7</bold>
</th>
<th align="left" rowspan="1" colspan="1">
<bold>G_3-7</bold>
</th>
<th align="left" rowspan="1" colspan="1">
<bold>G_2-7</bold>
</th>
<th align="left" rowspan="1" colspan="1">
<bold>G_1-7</bold>
</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">Duration</td>
<td align="left" rowspan="1" colspan="1">Anger</td>
<td align="left" rowspan="1" colspan="1">370</td>
<td align="left" rowspan="1" colspan="1">585</td>
<td align="left" rowspan="1" colspan="1">771</td>
<td align="left" rowspan="1" colspan="1">1004</td>
<td align="left" rowspan="1" colspan="1">1230</td>
<td align="left" rowspan="1" colspan="1">1581</td>
<td align="left" rowspan="1" colspan="1">1759</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">Disgust</td>
<td align="left" rowspan="1" colspan="1">481</td>
<td align="left" rowspan="1" colspan="1">748</td>
<td align="left" rowspan="1" colspan="1">984</td>
<td align="left" rowspan="1" colspan="1">1290</td>
<td align="left" rowspan="1" colspan="1">1555</td>
<td align="left" rowspan="1" colspan="1">1958</td>
<td align="left" rowspan="1" colspan="1">2153</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">Fear</td>
<td align="left" rowspan="1" colspan="1">329</td>
<td align="left" rowspan="1" colspan="1">498</td>
<td align="left" rowspan="1" colspan="1">636</td>
<td align="left" rowspan="1" colspan="1">795</td>
<td align="left" rowspan="1" colspan="1">930</td>
<td align="left" rowspan="1" colspan="1">1151</td>
<td align="left" rowspan="1" colspan="1">1269</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">Sadness</td>
<td align="left" rowspan="1" colspan="1">405</td>
<td align="left" rowspan="1" colspan="1">626</td>
<td align="left" rowspan="1" colspan="1">815</td>
<td align="left" rowspan="1" colspan="1">1071</td>
<td align="left" rowspan="1" colspan="1">1286</td>
<td align="left" rowspan="1" colspan="1">1645</td>
<td align="left" rowspan="1" colspan="1">1846</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">Happiness</td>
<td align="left" rowspan="1" colspan="1">375</td>
<td align="left" rowspan="1" colspan="1">601</td>
<td align="left" rowspan="1" colspan="1">763</td>
<td align="left" rowspan="1" colspan="1">978</td>
<td align="left" rowspan="1" colspan="1">1164</td>
<td align="left" rowspan="1" colspan="1">1478</td>
<td align="left" rowspan="1" colspan="1">1648</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">Neutral</td>
<td align="left" rowspan="1" colspan="1">354</td>
<td align="left" rowspan="1" colspan="1">540</td>
<td align="left" rowspan="1" colspan="1">703</td>
<td align="left" rowspan="1" colspan="1">896</td>
<td align="left" rowspan="1" colspan="1">1122</td>
<td align="left" rowspan="1" colspan="1">1401</td>
<td align="left" rowspan="1" colspan="1">1553</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>
<italic>Pseudo-utterances were always gated at syllable boundaries from the offset of the utterance in gates of increasing syllable duration</italic>
.</p>
</table-wrap-foot>
</table-wrap>
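<p>To make the offset-gating manipulation concrete, the following minimal sketch shows how excerpts of increasing syllable duration could be extracted from the end of a recorded pseudo-utterance, given its syllable boundary times. This is an illustrative sketch only, not the software pipeline actually used in the study; the file name and syllable onset values are hypothetical.</p>
<preformat>
import soundfile as sf  # assumed available for reading/writing audio files

def backward_gates(samples, sr, syllable_onsets_s):
    """Return offset-gated excerpts: last syllable, last two syllables, ..., full utterance.
    syllable_onsets_s: onset time (in seconds) of each syllable, in temporal order."""
    boundaries = [int(round(t * sr)) for t in syllable_onsets_s]
    gates = []
    for start in reversed(boundaries):
        gates.append(samples[start:])  # keep everything from this syllable onset to the end
    return gates  # gates[0] = final syllable only, gates[-1] = the whole utterance

# Illustrative use with hypothetical values (a 7-syllable utterance of about 1.76 s).
samples, sr = sf.read("pseudo_utterance_anger_01.wav")       # hypothetical file name
onsets = [0.00, 0.18, 0.53, 0.76, 0.99, 1.17, 1.39]          # hypothetical syllable onsets (s)
for i, gate in enumerate(backward_gates(samples, sr, onsets), start=1):
    sf.write(f"gate_last_{i}_syllables.wav", gate, sr)
</preformat>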
</sec>
<sec>
<title>Experimental design/procedure</title>
<p>Participants were invited to take part in a study of “communication and emotion”; they were seated in a quiet, dimly lit room at a 75 cm distance from a laptop screen. SuperLab 4.0 software (Cedrus, USA) was used to present auditory stimuli played over volume-adjustable, high-quality headphones.</p>
<p>Seven presentation blocks were built, each containing the 120 items of a single gate duration (i.e., number of syllables); blocks were presented successively in order of increasing syllable duration. The first block contained all Gate_7 stimuli (tokens with only the last syllable), the second block contained all Gate_6-7 stimuli (last two syllables), and so on until the Gate_1-7 block containing the full utterances was presented. As in Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
), this block design was chosen to mitigate potential artifacts such as response perseveration (Grosjean,
<xref ref-type="bibr" rid="B20">1996</xref>
). Individual stimuli were randomized within blocks, and participants were instructed to identify the emotion expressed by the speaker as accurately and quickly as possible from six alternatives presented on the computer screen (
<italic>anger, disgust, fear, sadness, happiness, neutral</italic>
). Responses were recorded by a mouse click on the corresponding emotion label. Following the emotion response, a new screen appeared asking participants to rate how confident they were about their emotional decision along a 7-point scale, where 1 indicated they were “very unsure” and 7 meant that they were “very sure” about their judgment. Once the confidence rating was recorded, a 2 s interval separated the response from the onset of the next trial.</p>
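<p>The block structure described above can be summarized schematically as follows; this is a simplified sketch of the design logic only (a fixed block order of increasing gate duration, with the 120 items randomized within each block), not the actual SuperLab script, and the trial labels are placeholders.</p>
<preformat>
import random

EMOTIONS = ["anger", "disgust", "fear", "sadness", "happiness", "neutral"]
GATES = ["G_7", "G_6-7", "G_5-7", "G_4-7", "G_3-7", "G_2-7", "G_1-7"]  # increasing duration
ITEMS_PER_EMOTION = 20  # 6 emotions x 20 items = 120 stimuli per block

def build_presentation_list(seed=1):
    rng = random.Random(seed)
    blocks = []
    for gate in GATES:  # blocks are always presented in order of increasing gate duration
        trials = [(gate, emotion, item)
                  for emotion in EMOTIONS
                  for item in range(1, ITEMS_PER_EMOTION + 1)]
        rng.shuffle(trials)  # individual stimuli are randomized within each block
        blocks.append(trials)
    return blocks

blocks = build_presentation_list()
print(len(blocks), "blocks of", len(blocks[0]), "trials each")  # 7 blocks of 120 trials
</preformat>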
<p>Participants completed ten practice trials at the beginning of the testing session and additional practice trials prior to each block to become familiar with stimuli representing each gate duration condition. Participants were allowed to adjust the volume during the first practice block of each session. Since the volume of our stimuli was homogenized, only one adjustment at the beginning was necessary to meet the participants' individual preferences. The full experiment was administered during two separate 60-min sessions (session 1 = first three gate conditions, session 2 = last four gate conditions) to reduce fatigue and familiarity with the stimuli. Participants received $25 CAD compensation for their involvement.</p>
</sec>
<sec>
<title>Statistical analyses</title>
<p>Participants' ability to identify emotional target meanings (% correct) and their associated confidence ratings (7-pt scale) were each analyzed. From the uncorrected accuracy (hit) rates of each participant, Hu-scores were computed for each gate and emotion to adjust for individual response biases when several emotion categories are used (see Wagner,
<xref ref-type="bibr" rid="B61">1993</xref>
). The computation of Hu-scores takes into account how many stimulus categories and answer possibilities are given in the forced choice task. If only two stimulus categories and two answer possibilities are used (e.g., neutral and anger) the Hu-score for the correct identification of one category, say anger, would be computed as follows:
<italic>H</italic>u = [<italic>a</italic>/(<italic>a</italic> + <italic>b</italic>)] × [<italic>a</italic>/(<italic>a</italic> + <italic>c</italic>)]. Here
<italic>a</italic>
is the number of correctly identified stimuli (anger was recognized as anger),
<italic>b</italic>
is the number of misidentifications, in which anger was incorrectly labeled as neutral, whereas
<italic>c</italic>
is the number of misidentifications, in which neutral was incorrectly labeled as anger. Wagner (
<xref ref-type="bibr" rid="B61">1993</xref>
) describes the Hu-scores as “[…] the joint probability that a stimulus category is correctly identified given that it is presented at all and that a response is correctly used given that it is used at all.”</p>
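<p>As a worked illustration of this correction, the short sketch below computes Hu-scores from a confusion matrix; the counts are hypothetical and the two-category case matches the worked example above, but the same function applies to the 6 × 6 confusion matrix of the present task.</p>
<preformat>
import numpy as np

def hu_scores(confusion):
    """Unbiased hit rates (Hu-scores) following Wagner (1993).
    confusion[i, j] = number of stimuli from category i answered with label j.
    Hu for category i = (hits / stimuli presented in i) * (hits / responses using label i)."""
    confusion = np.asarray(confusion, dtype=float)
    hits = np.diag(confusion)
    presented = confusion.sum(axis=1)   # row totals: how often each category was presented
    responded = confusion.sum(axis=0)   # column totals: how often each label was used
    return (hits / presented) * (hits / responded)

# Hypothetical two-category example (anger, neutral), matching the notation above:
# a = anger answered as anger, b = anger answered as neutral, c = neutral answered as anger.
conf_matrix = [[15, 5],    # anger presented:   a = 15, b = 5
               [3, 17]]    # neutral presented: c = 3
print(hu_scores(conf_matrix))  # Hu(anger) = (15 / 20) * (15 / 18) = 0.625
</preformat>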
<p>Hu-scores and confidence scores were submitted to separate 7 × 6 ANOVAs with repeated measures of gate duration (seven levels) and emotion (
<italic>anger, disgust, fear, happiness, sadness, neutral</italic>
). To infer how much time participants required to correctly identify emotions, we computed the “emotion identification point” for each of the 120 pseudo-utterances by determining the gate condition where a participant identified the target emotion without subsequent changes at longer gate durations of the same stimulus. The emotion identification points were then transformed into “emotion identification times” by converting the number of syllables needed to identify the target into the exact speech duration in milliseconds, which was then averaged across items for each participant (see Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
for detailed procedures). Of the 4800 possible identification points (20 items × 6 emotions × 40 participants), 419 items that were not correctly identified by a participant even when the full utterance was presented were labeled as “errors” and omitted from the calculation of emotion identification times (a total of 4381 data points were included). Mean emotion identification times were submitted to a one-way ANOVA with repeated measures on emotion (
<italic>anger, disgust, fear, happiness, sadness, neutral</italic>
).</p>
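<p>The logic of this conversion can be sketched as follows; the helper name, the response sequence, and the use of the mean fear gate durations from Table 1 are illustrative assumptions rather than the exact analysis scripts.</p>
<preformat>
def identification_time(responses, gate_durations_ms, target):
    """responses: the listener's emotion label at each gate, ordered from the shortest
    gate (final syllable only) to the full utterance.
    gate_durations_ms: cumulative speech duration (ms) at each of those gates.
    Returns the duration at the identification point, or None for error trials."""
    if responses[-1] != target:
        return None  # not recognized even from the full utterance: an error trial
    # Walk backwards to find the earliest gate from which all later responses stay correct.
    ident_gate = len(responses) - 1
    for g in range(len(responses) - 1, -1, -1):
        if responses[g] != target:
            break
        ident_gate = g
    return gate_durations_ms[ident_gate]

# Hypothetical fear trial whose response stabilizes at the second gate (last two syllables).
durations = [329, 498, 636, 795, 930, 1151, 1269]   # mean fear gate durations (Table 1)
answers = ["anger", "fear", "fear", "fear", "fear", "fear", "fear"]
print(identification_time(answers, durations, "fear"))  # 498 (ms)
</preformat>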
<p>Since the stimuli, procedures, and analyses adopted here were virtually identical to those of Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
), our experiment allows unprecedented comparisons of how recognition of emotional prosody evolves over time as a function of the gating
<italic>direction</italic>
, shedding light on how the position of acoustic patterns for detecting emotions influences recognition processes. For each of our three dependent measures (accuracy scores, confidence ratings, emotion identification times), we therefore performed a second analysis to directly compare the current results to those of Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
) by entering the between-groups factor of Presentation Direction (gating from offset vs. onset). Separate
<italic>t</italic>
-tests first compared the age and education (in years) of the current participant group (
<italic>n</italic>
= 40) with participants studied by Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
,
<italic>n</italic>
= 48); there was no difference in the formal education of the two samples [17 vs. 16 years, respectively;
<italic>t</italic>
<sub>(86)</sub>
= 1.548;
<italic>p</italic>
= 0.125], although participants in the present study were older on average [25 vs. 22 years;
<italic>t</italic>
<sub>(86)</sub>
= 2.578;
<italic>p</italic>
= 0.012]. Given the age difference, we entered age as a covariate in separate mixed ANCOVAs on the Hu-scores, confidence ratings, and emotion identification times as described above, with the additional between-groups variable of presentation Direction (onset, offset), which was of key theoretical interest in these analyses. For all statistical analyses, a significance level of 5% (two-sided) was selected and
<italic>post-hoc</italic>
comparisons (Tukey's HSD,
<italic>p</italic>
< 0.05) were applied whenever a significant main or interactive effect was observed.</p>
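<p>For readers who wish to reproduce this type of between-study comparison, the sketch below illustrates how the Direction factor and the age covariate can enter a single model; it uses randomly generated placeholder data and an analogous linear mixed-model formulation rather than the exact repeated-measures ANCOVAs reported here.</p>
<preformat>
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Placeholder long-format data: one Hu-score per participant x gate x emotion,
# with the between-groups factor (gating direction) and age as covariate.
rng = np.random.default_rng(0)
rows = []
for direction, n in [("offset", 40), ("onset", 48)]:
    for p in range(n):
        age = rng.normal(25 if direction == "offset" else 22, 3)
        for gate in range(1, 8):
            for emotion in ["anger", "disgust", "fear", "happiness", "sadness", "neutral"]:
                rows.append(dict(subject=f"{direction}_{p}", direction=direction, age=age,
                                 gate=gate, emotion=emotion, hu=rng.uniform(0.2, 0.9)))
df = pd.DataFrame(rows)

# Analogous (not identical) formulation of the Direction x Gate x Emotion analysis with age
# as covariate; a random intercept per subject stands in for the repeated-measures structure.
model = smf.mixedlm("hu ~ C(direction) * C(gate) * C(emotion) + age",
                    data=df, groups="subject").fit()
print(model.summary())
</preformat>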
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<sec>
<title>Accuracy (HU-scores) and confidence ratings</title>
<sec>
<title>Effects of backwards gating on accuracy and confidence scores</title>
<p>Table
<xref ref-type="table" rid="T2">2</xref>
shows the mean accuracy of participants (% correct target recognition) in each emotion and gate condition when utterances were presented from their offset, prior to correcting these scores for participant response bias. A 7 (Gate) × 6 (Emotion) ANOVA performed on the
<italic>unbiased</italic>
emotion recognition rates (i.e., calculated Hu-Scores) revealed a main effect of Gate duration [
<italic>F</italic>
<sub>(6, 228)</sub>
= 390.48;
<italic>p</italic>
< 0.001], Emotion [
<italic>F</italic>
<sub>(5, 190)</sub>
= 142.57;
<italic>p</italic>
< 0.001], and a significant interaction of these factors [
<italic>F</italic>
<sub>(30, 1140)</sub>
= 10.684;
<italic>p</italic>
< 0.001]. Post hoc (Tukey's) tests of the interaction first considered how the recognition of each emotion evolved as a function of gate duration when sentences were gated from their offset. As shown in Figure
<xref ref-type="fig" rid="F1">1</xref>
, the recognition of
<italic>fear</italic>
,
<italic>anger</italic>
, and
<italic>sadness</italic>
significantly improved over the course of hearing the first three gates (i.e., the last three syllables of the utterance,
<italic>p</italic>
s < 0.003) with no further accuracy gains by the fourth gate condition (Gate_4-7,
<italic>p</italic>
s > 0.115). In contrast, accurate recognition of
<italic>neutral</italic>
,
<italic>happiness</italic>
, and
<italic>disgust</italic>
each significantly improved over a longer time frame corresponding to the first four gate conditions (Gate_7 to Gate_4-7,
<italic>p</italic>
s < 0.001) without further changes after this point (
<italic>p</italic>
s > 0.087).</p>
<table-wrap id="T2" position="float">
<label>Table 2</label>
<caption>
<p>
<bold>Mean accuracy (% target recognition) of the 40 listeners who judged pseudo-utterances conveying each emotion according to the gate duration, when utterances were gated from the offset of the sentence</bold>
.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th rowspan="1" colspan="1"></th>
<th align="left" rowspan="1" colspan="1">
<bold>Emotion</bold>
</th>
<th align="center" colspan="7" rowspan="1">
<bold>Gate condition (# syllables)</bold>
</th>
</tr>
<tr>
<th rowspan="1" colspan="1"></th>
<th rowspan="1" colspan="1"></th>
<th align="left" rowspan="1" colspan="1">
<bold>G_7</bold>
</th>
<th align="left" rowspan="1" colspan="1">
<bold>G_6-7</bold>
</th>
<th align="left" rowspan="1" colspan="1">
<bold>G_5-7</bold>
</th>
<th align="left" rowspan="1" colspan="1">
<bold>G_4-7</bold>
</th>
<th align="left" rowspan="1" colspan="1">
<bold>G_3-7</bold>
</th>
<th align="left" rowspan="1" colspan="1">
<bold>G_2-7</bold>
</th>
<th align="left" rowspan="1" colspan="1">
<bold>G_1-7</bold>
</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">Accuracy</td>
<td align="left" rowspan="1" colspan="1">Anger</td>
<td align="left" rowspan="1" colspan="1">51.9 (33.9)</td>
<td align="left" rowspan="1" colspan="1">73.0 (25.7)</td>
<td align="left" rowspan="1" colspan="1">79.9 (22.6)</td>
<td align="left" rowspan="1" colspan="1">79.0 (26.9)</td>
<td align="left" rowspan="1" colspan="1">80.3 (26.3)</td>
<td align="left" rowspan="1" colspan="1">81.8 (22.9)</td>
<td align="left" rowspan="1" colspan="1">85.8 (17.9)</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">Disgust</td>
<td align="left" rowspan="1" colspan="1">27.5 (16.9)</td>
<td align="left" rowspan="1" colspan="1">44.3 (17.2)</td>
<td align="left" rowspan="1" colspan="1">59.3 (13.1)</td>
<td align="left" rowspan="1" colspan="1">64.3 (15.7)</td>
<td align="left" rowspan="1" colspan="1">71.0 (15.2)</td>
<td align="left" rowspan="1" colspan="1">71.4 (15.8)</td>
<td align="left" rowspan="1" colspan="1">74.5 (14.5)</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">Fear</td>
<td align="left" rowspan="1" colspan="1">77.5 (15.6)</td>
<td align="left" rowspan="1" colspan="1">85.4 (16.0)</td>
<td align="left" rowspan="1" colspan="1">91.9 (8.3)</td>
<td align="left" rowspan="1" colspan="1">95.9 (3.7)</td>
<td align="left" rowspan="1" colspan="1">96.3 (4.1)</td>
<td align="left" rowspan="1" colspan="1">95.4 (4.7)</td>
<td align="left" rowspan="1" colspan="1">94.6 (3.9)</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">Sadness</td>
<td align="left" rowspan="1" colspan="1">65.6 (22.8)</td>
<td align="left" rowspan="1" colspan="1">83.9 (13.9)</td>
<td align="left" rowspan="1" colspan="1">87.0 (11.3)</td>
<td align="left" rowspan="1" colspan="1">90.9 (12.1)</td>
<td align="left" rowspan="1" colspan="1">92.8 (7.5)</td>
<td align="left" rowspan="1" colspan="1">95.1 (5.0)</td>
<td align="left" rowspan="1" colspan="1">94.4 (6.4)</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">Happiness</td>
<td align="left" rowspan="1" colspan="1">30.6 (27.9)</td>
<td align="left" rowspan="1" colspan="1">53.8 (34.4)</td>
<td align="left" rowspan="1" colspan="1">66.1 (32.7)</td>
<td align="left" rowspan="1" colspan="1">71.8 (32.0)</td>
<td align="left" rowspan="1" colspan="1">77.6 (25.9)</td>
<td align="left" rowspan="1" colspan="1">82.4 (24.5)</td>
<td align="left" rowspan="1" colspan="1">89.1 (13.5)</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">Neutral</td>
<td align="left" rowspan="1" colspan="1">55.3 (12.8)</td>
<td align="left" rowspan="1" colspan="1">68.5 (11.8)</td>
<td align="left" rowspan="1" colspan="1">73.4 (12.5)</td>
<td align="left" rowspan="1" colspan="1">83.9 (10.8)</td>
<td align="left" rowspan="1" colspan="1">81.5 (11.2)</td>
<td align="left" rowspan="1" colspan="1">85.4 (8.5)</td>
<td align="left" rowspan="1" colspan="1">86.6 (8.2)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>
<italic>Standard deviations are shown in parentheses</italic>
.</p>
</table-wrap-foot>
</table-wrap>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption>
<p>
<bold>Mean Hu-scores (unbiased accuracy) for each emotion as a function of the gate duration (number of syllables)</bold>
.</p>
</caption>
<graphic xlink:href="fpsyg-04-00367-g0001"></graphic>
</fig>
<p>Further inspection of the interaction then examined differences in accuracy across emotions at each gate condition. When listeners heard only the utterance-final syllable (Gate_7),
<italic>fear</italic>
and
<italic>anger</italic>
prosody were recognized significantly better than all other emotional voices (
<italic>p</italic>
s < 0.006), and
<italic>fear</italic>
was also recognized significantly better than
<italic>anger</italic>
(
<italic>p</italic>
< 0.001). After fear and anger,
<italic>sad</italic>
expressions were identified significantly better from the last syllable than
<italic>happy</italic>
and
<italic>neutral</italic>
expressions (
<italic>p</italic>
s < 0.001), which did not differ (
<italic>p</italic>
= 1.000), followed by
<italic>disgust</italic>
which was recognized more poorly than any other emotion (
<italic>p</italic>
s < 0.046). This pattern was similar for stimuli composed of the last two and the last three syllables (Gate_6-7 and Gate_5-7, respectively) but changed somewhat as stimulus duration increased. After presenting the last four syllables (Gate_4-7),
<italic>fear</italic>
continued to exhibit the highest accuracy score (this was true in all gate conditions;
<italic>p</italic>
s < 0.017) but recognition of
<italic>anger</italic>
and
<italic>sad</italic>
expressions was equivalent (
<italic>p</italic>
= 1.0), followed by
<italic>happiness</italic>
which was recognized significantly better than
<italic>disgust</italic>
(
<italic>p</italic>
< 0.001). After the last five syllables were presented (Gate_3-7),
<italic>angry</italic>
,
<italic>sad</italic>
and
<italic>happy</italic>
sentences were recognized at a similar rate (
<italic>p</italic>
s > 0.555), surpassing
<italic>neutral</italic>
and
<italic>disgust</italic>
(
<italic>p</italic>
s < 0.001). In the two longest gate conditions (Gate_2-7, Gate_1-7), accuracy scores for
<italic>anger</italic>
,
<italic>sad</italic>
,
<italic>happy</italic>
and
<italic>neutral</italic>
sentences were not statistically different (
<italic>p</italic>
s > 0.407) while vocal expressions of
<italic>fear</italic>
and
<italic>disgust</italic>
were respectively the best and worst recognized from speech prosody (
<italic>p</italic>
s < 0.017).</p>
<p>The analysis of associated confidence ratings (on a scale of 1–7) was restricted to trials in which the emotional target of the prosody was correctly identified. Two male participants who failed to recognize any of the
<italic>disgust</italic>
expressions (producing an empty cell) were excluded from this analysis. The ANOVA on the confidence scores revealed a main effect of gate duration [
<italic>F</italic>
<sub>(6, 192)</sub>
= 48.653;
<italic>p</italic>
< 0.001], a main effect of emotional prosody [
<italic>F</italic>
<sub>(5, 160)</sub>
= 46.991;
<italic>p</italic>
< 0.001] and a significant interaction of Gate × Emotion [
<italic>F</italic>
<sub>(30, 960)</sub>
= 3.814;
<italic>p</italic>
< 0.001]. Confidence scores tended to increase with stimulus/gate duration, although there were differences across emotions as a function of gate duration. After listening to the final one or two syllables, participants were significantly more confident about their detection of
<italic>fear</italic>
and
<italic>anger</italic>
(
<italic>p</italic>
s < 0.001) and least confident when they correctly recognized
<italic>neutral</italic>
and
<italic>disgust</italic>
(
<italic>p</italic>
s < 0.001). Confidence ratings for
<italic>happiness</italic>
and
<italic>sadness</italic>
were between those extremes, differing significantly from the other two emotion sets (
<italic>p</italic>
s < 0.048). By the third gate condition (Gate_5-7), confidence about
<italic>neutral</italic>
prosody began to exceed that for
<italic>disgust</italic>
(
<italic>p</italic>
< 0.001), and from the fourth gate condition onwards (i.e., with exposure to longer stimuli), confidence ratings for
<italic>fear</italic>
,
<italic>anger</italic>
,
<italic>happiness</italic>
, and
<italic>sadness</italic>
were all comparable, although confidence about
<italic>disgust</italic>
remained significantly lower even when full utterances were presented (Gate_1-7).</p>
</sec>
<sec>
<title>Impact of gating direction on accuracy and confidence scores</title>
<p>The 2 × 7 × 6 ANCOVA on Hu-scores gathered here and by Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
) showed a significant three-way interaction of Direction, Gate duration, and Emotion [
<italic>F</italic>
<sub>(30, 2550)</sub>
= 12.636;
<italic>p</italic>
< 0.001]. This interaction allowed us to explore the influence of presentation direction (onset vs. offset) on the accuracy of emotional prosody recognition as additional syllables revealed acoustic evidence about each emotion; these relationships are demonstrated for each emotion in Figure
<xref ref-type="fig" rid="F2">2</xref>
. Step-down analyses (2 × 7 ANOVAs) showed that the interaction of Direction × Gate duration was significant for
<italic>anger</italic>
[
<italic>F</italic>
<sub>(6, 516)</sub>
= 14.218;
<italic>p</italic>
< 0.001],
<italic>fear</italic>
[
<italic>F</italic>
<sub>(6, 516)</sub>
= 33.096;
<italic>p</italic>
< 0.001],
<italic>disgust</italic>
[
<italic>F</italic>
<sub>(6, 516)</sub>
= 10.851;
<italic>p</italic>
< 0.001],
<italic>sadness</italic>
[
<italic>F</italic>
<sub>(6, 516)</sub>
= 11.846;
<italic>p</italic>
< 0.001], and
<italic>happiness</italic>
[
<italic>F</italic>
<sub>(6, 516)</sub>
= 9.663;
<italic>p</italic>
< 0.001]. For each of these emotions, recognition always improved when the
<italic>end</italic>
of utterances was heard first (i.e., when gated from their offset vs. onset), although the temporal region where accuracy improved within the utterance varied by emotion type.
<italic>Post-hoc</italic>
comparisons showed that
<italic>anger</italic>
and
<italic>fear</italic>
were recognized significantly better in the offset presentation condition even when little acoustic evidence was available; listeners detected
<italic>anger</italic>
better over the course of the first to third syllable in the offset vs. onset condition, and over the course of the first to sixth syllables for
<italic>fear</italic>
(
<italic>p</italic>
s < 0.001).
<italic>Happiness</italic>
showed an advantage in the offset condition beginning at the second up to the fourth gate (
<italic>p</italic>
s = 0.027),
<italic>disgust</italic>
showed a similar advantage beginning at the third to the fifth gate (
<italic>p</italic>
< 0.049), and
<italic>sadness</italic>
displayed the offset advantage beginning at the third up to the sixth gate (
<italic>p</italic>
s < 0.031). Interestingly, there was no effect of the direction of utterance presentation on the recognition of
<italic>neutral</italic>
prosody [
<italic>F</italic>
<sub>(6, 516)</sub>
= 0.409;
<italic>p</italic>
= 0.873].</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption>
<p>
<bold>Comparison of mean accuracy (Hu) scores for each emotion as a function of gate duration (number of syllables) and the direction of presentation (forward vs. backward)</bold>
. Data in the forward condition are taken from Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
).</p>
</caption>
<graphic xlink:href="fpsyg-04-00367-g0002"></graphic>
</fig>
<p>The ANCOVA on confidence ratings between studies yielded a significant three-way interaction of Direction, Gate duration and Emotion [
<italic>F</italic>
<sub>(30, 2370)</sub>
= 4.337;
<italic>p</italic>
< 0.001]. Step-down analyses (2 × 7 ANOVAs) run separately by emotion showed that the interaction of Direction × Gate duration was significant for
<italic>anger</italic>
[
<italic>F</italic>
<sub>(6, 516)</sub>
= 35.800;
<italic>p</italic>
< 0.001],
<italic>fear</italic>
[
<italic>F</italic>
<sub>(6, 516)</sub>
= 19.656;
<italic>p</italic>
< 0.001],
<italic>happiness</italic>
[
<italic>F</italic>
<sub>(6, 504)</sub>
= 18.783;
<italic>p</italic>
< 0.001], and
<italic>sadness</italic>
[
<italic>F</italic>
<sub>(6, 516)</sub>
= 10.898;
<italic>p</italic>
< 0.001]. For these four emotions, presentation direction affected confidence only when a single syllable was presented in isolation (i.e., at the first gate duration,
<italic>ps</italic>
< 0.049): listeners were more confident when they heard the sentence-final as opposed to the sentence-initial syllable. For
<italic>disgust</italic>
and
<italic>neutral</italic>
, the two-way interaction was also significant [
<italic>F</italic>
<sub>(6, 492)</sub>
= 7.522;
<italic>p</italic>
< 0.001;
<italic>F</italic>
<sub>(6, 516)</sub>
= 7.618;
<italic>p</italic>
< 0.001, respectively] but
<italic>post hoc</italic>
tests revealed only minor differences in the pattern of confidence ratings in each presentation condition with no differences in listener confidence at specific gates (
<italic>p</italic>
s > 0.618). These patterns are illustrated for each emotion in Figure
<xref ref-type="fig" rid="F3">3</xref>
.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption>
<p>
<bold>Comparison of mean confidence ratings for each emotion as a function of gate duration (number of syllables) and the direction of presentation (forward vs. backward)</bold>
. Data in the forward condition are taken from Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
).</p>
</caption>
<graphic xlink:href="fpsyg-04-00367-g0003"></graphic>
</fig>
</sec>
</sec>
<sec>
<title>Emotion identification times</title>
<sec>
<title>Effects of backwards gating on the time course of vocal emotion recognition</title>
<p>As described earlier, emotion identification times were computed by identifying the gate condition from sentence offset where the target emotion was correctly recognized for each item and participant, which was then converted into the precise time value of the gated syllables in milliseconds. A one-way ANOVA performed on the mean emotion identification times with repeated measures of emotion type (
<italic>anger, disgust, fear</italic>
,
<italic>happiness</italic>
,
<italic>sadness</italic>
and
<italic>neutral</italic>
) revealed a highly significant effect of emotion [
<italic>F</italic>
<sub>(5, 190)</sub>
= 113.68;
<italic>p</italic>
< 0.001]. As can be seen in Figure
<xref ref-type="fig" rid="F3">3</xref>
,
<italic>fearful</italic>
voices were correctly identified at the shortest presentation times (
<italic>M</italic>
= 427 ms), significantly faster than
<italic>sadness</italic>
(
<italic>M</italic>
= 612 ms),
<italic>neutral</italic>
(
<italic>M</italic>
= 654 ms) and
<italic>anger</italic>
(
<italic>M</italic>
= 672 ms) which did not significantly differ one from another. These emotions required significantly less time to identify than
<italic>happiness</italic>
(
<italic>M</italic>
= 811 ms), which in turn took significantly less time than
<italic>disgust</italic>
(
<italic>M</italic>
= 1197 ms) which required the longest stimulus exposure for accurate recognition (all
<italic>p</italic>
s < 0.001).</p>
</sec>
<sec>
<title>Impact of gating direction on emotion identification times</title>
<p>Finally, a 2 × 6 (Direction × Emotion) mixed ANCOVA was performed on the emotion identification times to compare the present results to those of Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
); this analysis revealed a significant interaction of presentation Direction and Emotion [
<italic>F</italic>
<sub>(5, 425)</sub>
= 13.235;
<italic>p</italic>
< 0.001] as also shown in Figure
<xref ref-type="fig" rid="F4">4</xref>
. The average time listeners required to correctly identify emotional prosody was significantly reduced when syllables were presented from the offset vs. onset of utterances, but only for
<italic>disgust</italic>
(
<italic>p</italic>
< 0.001) and
<italic>happiness</italic>
(
<italic>p</italic>
= 0.050). In contrast to accuracy and confidence ratings, the manner in which utterances were gated had no significant impact on the amount of time listeners needed to recognize
<italic>fear</italic>
,
<italic>sadness</italic>
,
<italic>anger</italic>
, or
<italic>neutral</italic>
prosody (all
<italic>p</italic>
s > 0.157).</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption>
<p>
<bold>Comparison of mean identification points (in milliseconds) for each emotion as a function of direction of presentation (forward vs. backward)</bold>
. Data in the forward condition are taken from Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
).</p>
</caption>
<graphic xlink:href="fpsyg-04-00367-g0004"></graphic>
</fig>
</sec>
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<p>Following recent work (Cornew et al.,
<xref ref-type="bibr" rid="B14">2010</xref>
; Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
), this experiment sought a clearer understanding of how vocal expressions of basic emotion reveal their meanings in speech using a modified version of the gating paradigm, where emotionally-inflected pseudo-utterances were truncated and presented in excerpts of increasing syllable duration from the
<italic>end</italic>
of an utterance. While the current manner of presenting our stimuli bears no immediate resemblance to how emotional speech is encountered in structured conversations, especially because our stimuli were only auditory and not spontaneously produced (see Barkhuysen et al.,
<xref ref-type="bibr" rid="B3">2010</xref>
for a discussion of this topic), our performance measures may help to clarify some of the processes involved when listeners “walk in” on an emotional conversation, or have their attention directed to emotional speech in the environment that is already in progress, an experience that is common in everyday life. Critically, our design allowed important hypotheses to be tested concerning the evolution and associated time course of emotional prosody recognition (in English) as listeners are progressively exposed to representative acoustic cue configurations. In line with past findings, we found that listeners tended to be most accurate at recognizing vocal expressions of
<italic>fear</italic>
(Levitt,
<xref ref-type="bibr" rid="B32">1964</xref>
; Zuckerman et al.,
<xref ref-type="bibr" rid="B64">1975</xref>
; Paulmann and Pell,
<xref ref-type="bibr" rid="B36">2011</xref>
; Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
) and least accurate for
<italic>disgust</italic>
(e.g., Scherer et al.,
<xref ref-type="bibr" rid="B52">1991</xref>
; Banse and Scherer,
<xref ref-type="bibr" rid="B2">1996</xref>
) irrespective of how many syllables/gates were presented. Expressions of
<italic>fear</italic>
were also recognized from the shortest stimulus duration, implying that listeners need minimal input to recognize this emotion in speech (Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
). Interestingly, emotion identification times were significantly reduced for certain emotions (
<italic>happiness, disgust</italic>
) when sentences were presented from their offset rather than their onset, and there were other apparent “advantages” to recognizing emotion prosody when listeners were first exposed to the
<italic>end</italic>
of utterances. These effects and their implications are discussed in detail below.</p>
<sec>
<title>Effects of gating direction and cue location on vocal emotion recognition</title>
<p>Our data show that recognition of vocal emotions generally improves with the number of syllables presented, even when listeners hear utterance fragments in reverse order, but reaches a plateau for all emotions after hearing the last three to four syllables of the utterance. When viewed broadly, these findings suggest that “prototypical” acoustic properties for accessing knowledge about basic emotions from speech (Laukka,
<xref ref-type="bibr" rid="B28">2005</xref>
; Pell et al.,
<xref ref-type="bibr" rid="B42">2009</xref>
) are decoded and consciously recognized at peak accuracy levels after processing three to four spoken syllables, approximating a mean stimulus duration of 600–1200 ms depending on the emotion in question (see Table
<xref ref-type="table" rid="T1">1</xref>
). This broad conclusion fits with observations of two previous gating studies that gated emotional utterances in syllabic units (Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
) or in 250 ms increments (Cornew et al.,
<xref ref-type="bibr" rid="B14">2010</xref>
). However, there were notable emotion-specific recognition patterns as a function of gate duration; when stimuli were very short (i.e., only the final one or two syllables were presented) there was a marked advantage for detecting
<italic>fear</italic>
and
<italic>anger</italic>
when compared to the other expression types, and listeners were significantly more confident that they had correctly identified these two emotions based solely on the utterance-final syllable. As the gate duration gradually increased to five syllables (Gate_3-7), no further differences were observed in the ability to recognize
<italic>anger, sadness</italic>
, and
<italic>happiness</italic>
, although participants remained significantly more accurate for
<italic>fear</italic>
and significantly less accurate for
<italic>disgust</italic>
at all stimulus durations.</p>
<p>The observation that
<italic>fear</italic>
, and to a lesser extent
<italic>anger</italic>
, were highly salient to listeners at the end of utterances even when minimal acoustic information was present (i.e., the final syllable) is noteworthy. Leinonen and colleagues (
<xref ref-type="bibr" rid="B30">1997</xref>
) presented two-syllable emotional utterances in Finnish (the word [saara]) and reported higher recognition scores and distinct acoustic attributes of productions conveying
<italic>fear</italic>
and
<italic>anger</italic>
when compared to eight other emotional-motivational states, suggesting that these emotions are highly salient to listeners in acoustic stimuli of brief duration. Similarly, Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
) reported that recognition of most emotions improved over the
<italic>full course</italic>
of the utterance when they were gated from sentence onset and that certain emotions, such as
<italic>happiness</italic>
and
<italic>fear</italic>
, demonstrated clear gains in that study when listeners processed the last two syllables of the utterance. When combined with our current findings, this implies that syllables located towards the
<italic>end</italic>
of an utterance provide especially powerful cues for identifying basic emotions encoded in spoken language. This argument is supported by our direct statistical comparisons of the two data sets when utterances were gated from their onset vs. offset; we found that presentation
<italic>direction</italic>
had a significant impact on the accuracy and confidence levels of English listeners, with improved recognition of all emotions except
<italic>neutral</italic>
when participants heard utterances commencing with the last syllable. Gating utterances from their offset also reduced mean emotion identification times for some emotions (
<italic>happiness</italic>
,
<italic>disgust</italic>
) as elaborated below. In contrast, there was no evidence in our data that listeners were at an advantage to recognize emotional prosody when utterances were gated from their onset, with the possible exception of accuracy rates for
<italic>sadness</italic>
that were somewhat higher in the onset condition at very short gate intervals.</p>
<p>Why would natural, presumably biologically-specified codes for signaling emotions in the voice (e.g., Ekman,
<xref ref-type="bibr" rid="B17">1992</xref>
; Wilson and Wharton,
<xref ref-type="bibr" rid="B63">2006</xref>
) bear an important relationship to the temporal features of spoken language? This phenomenon, which has been highlighted at different times (Cosmides,
<xref ref-type="bibr" rid="B15">1983</xref>
; Scherer,
<xref ref-type="bibr" rid="B49">1988</xref>
), could be explained by the accent structure of utterances we presented for emotion recognition and by natural processes of speech production, factors which both contribute to the “socialization” or shaping of vocal emotion expressions in the context of spoken language. It is well known that the accent/phrase structure of speech, or the relative pattern of weak vs. strong syllables (or segments) in a language, can be altered when speakers experience and convey vocal emotions (Ladd,
<xref ref-type="bibr" rid="B26">1996</xref>
). For example, speakers may increase or decrease the relative prominence of stressed syllables (through local changes in duration and pitch variation) and/or shift the location or frequency of syllables that are typically accented in a language, which may serve as an important perceptual correlate of vocal emotion expressions (Bolinger,
<xref ref-type="bibr" rid="B7">1972</xref>
; Cosmides,
<xref ref-type="bibr" rid="B15">1983</xref>
). Related to the notion of local prominence, there is a well-documented propensity for speakers to lengthen syllables located in word- or phrase-final position (“sentence-final lengthening,” Oller,
<xref ref-type="bibr" rid="B33">1973</xref>
; Pell,
<xref ref-type="bibr" rid="B39">2001</xref>
), sometimes on the penultimate syllable of certain languages (Bolinger,
<xref ref-type="bibr" rid="B6">1978</xref>
), and other evidence that speakers modulate their pitch in final positions to encode gradient acoustic cues that refer directly to their emotional state (Pell,
<xref ref-type="bibr" rid="B39">2001</xref>
), giving the final position of sentences a special role in the identification of the emotional quality of the voice.</p>
<p>The observation here that cues located toward the end of an utterance facilitated accurate recognition of most emotions in English likely re-asserts the importance of accent structure during vocal emotion processing (Cosmides,
<xref ref-type="bibr" rid="B15">1983</xref>
; Ladd et al.,
<xref ref-type="bibr" rid="B27">1985</xref>
). More specifically, it implies that sentence-final syllables in many languages could act as a vehicle for reinforcing the speaker's emotion state
<italic>vis-à-vis</italic>
the listener in an unambiguous and highly differentiated manner during discourse (especially for
<italic>fear</italic>
and
<italic>anger</italic>
). Inspection of the mean syllable durations of gated stimuli presented here and by Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
) confirm that while there were natural temporal variations across emotions, the duration of utterance-final syllables (
<italic>M</italic>
= 386 ms, range = 329–481) was more than double that of utterance-initial syllables (
<italic>M</italic>
= 165 ms, range = 119–198), the latter of which were always unstressed in our study. In comparison, differences in the cumulative duration of gates composed of two syllables (
<italic>M</italic>
= 600 vs. 516 in the offset vs. onset conditions, respectively) or three syllables (
<italic>M</italic>
= 779 vs. 711) were relatively modest between the two studies, and these stimulus durations were always composed of both weak and stressed syllables. This durational difference is in line with the propensity, described above, for speakers to lengthen syllables in sentence-final position. Also, given the structure of the pseudo-utterances (see Section Appendix), it should be noted that the forward presentation of pseudo-utterances might differ from the backward presentation in terms of the expectations it creates in participants. In Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
), the first gate was always a pronoun or a determiner and was always followed by the first syllable of a pseudo-verb, whereas in the present experiment, the first two gates were always the two final syllables of a pseudo-word. It is difficult to know whether participants developed expectations about the following syllable and to what extent such expectations could have affected the identification of the prosody. We cannot exclude the possibility that these expectations were more difficult to form in the backward condition, when the gates were presented in reverse order, altering how participants focused on the emotional prosody of the sentences. However, such an interpretation would not explain why the direction of presentation did not influence participants' performance when sentences were uttered in a neutral tone, nor why this influence was limited to specific gates when the sentences were spoken with emotional prosody.</p>
<p>Nevertheless, these results suggest that there is a certain alignment in how speakers realize acoustic targets that refer to semantically-dictated stress patterns and emotional meanings in speech, demonstrating that recognition of vocal emotional expressions is shaped to some extent by differences in the temporal (accent) structure of language
<italic>and</italic>
that emotional cues are probably not equally salient throughout the speech signal. Further studies that compare our findings with data from other languages will clearly be needed to advance specific hypotheses about how vocal emotion expressions may have become “domesticated” in the context of spoken language. For example, forward and backward gating experiments could be replicated in another language with lexical stress, such as German, to see whether critical cues for identifying certain emotions are located at different positions in the sentence. Forward and backward presentation of pseudo-sentences could also be compared in a language that does not use contrastive lexical stress, such as French, which would presumably yield similar emotion identification times irrespective of the direction in which the sentences are presented.</p>
<sec>
<title>Further reflections on the time course of vocal emotion recognition</title>
<p>While our data show that the position of emotionally meaningful cues plays a role in how vocal emotions are revealed to listeners, they simultaneously argue that the average
<italic>time</italic>
needed to accurately decode most basic emotions in speech is relatively constant irrespective of gating method (syllables vs. 250 ms increments) or stimulus set (Cornew et al.,
<xref ref-type="bibr" rid="B14">2010</xref>
; Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
). When mean emotion identification times were computed here,
<italic>fear</italic>
required the least amount of stimulus exposure to recognize (
<italic>M</italic>
= 427 ms), followed by
<italic>sadness</italic>
(
<italic>M</italic>
= 612 ms),
<italic>neutral</italic>
(
<italic>M</italic>
= 654 ms),
<italic>anger</italic>
(
<italic>M</italic>
= 677 ms),
<italic>happiness</italic>
(
<italic>M</italic>
= 811 ms), and
<italic>disgust</italic>
(
<italic>M</italic>
= 1197 ms). With the exception of
<italic>neutral</italic>
which took slightly (although not significantly) longer to detect when utterances were gated in reverse, this emotion-specific pattern precisely mirrors the one reported by Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
) for the same six emotions and replicates Cornew et al.'s (
<xref ref-type="bibr" rid="B14">2010</xref>
) data for
<italic>neutral</italic>
,
<italic>anger</italic>
, and
<italic>happy</italic>
expressions when utterances were gated in 250 ms units. When the mean emotion identification times recorded here are compared to those reported by Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
) and Cornew et al. (
<xref ref-type="bibr" rid="B14">2010</xref>
), it can be said that recognition of
<italic>fear</italic>
occurs approximately in the range of 425–525 ms (427, 517 ms),
<italic>sadness</italic>
in the range of 600 ms (612, 576 ms),
<italic>anger</italic>
in the range of 700 ms (677, 710, 723 ms),
<italic>happiness</italic>
in the range of 800–900 ms (811, 977, 802 ms), and
<italic>disgust</italic>
requires analysis of at least 1200 ms of speech (1197, 1486 ms). As pointed out by Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
), the time needed to identify basic emotions from their underlying acoustic cues does not simply reflect characteristic differences in articulation rate across emotions (e.g., Banse and Scherer,
<xref ref-type="bibr" rid="B2">1996</xref>
; Pell et al.,
<xref ref-type="bibr" rid="B42">2009</xref>
), since expressions of
<italic>sadness</italic>
are routinely slower and often twice the duration of comparable
<italic>fear</italic>
expressions, and yet these two emotions are accurately recognized from speech stimuli of the shortest duration. Rather, it can be claimed that prototypical cues for understanding vocal emotions are decoded and consciously retrievable over slightly different epochs in the 400–1200 ms time window, or after hearing roughly 2–4 syllables in speech. The idea that emotional meanings begin to be differentiated after hearing around 400 ms of speech fits with recent priming data using behavioral paradigms (Pell and Skorup,
<xref ref-type="bibr" rid="B43">2008</xref>
) and event-related potentials (ERPs, Paulmann and Pell,
<xref ref-type="bibr" rid="B35">2010</xref>
) as well as recent neuro-cognitive models on the time course and cognitive processing structure of vocal emotion processing (Schirmer and Kotz,
<xref ref-type="bibr" rid="B54">2006</xref>
).</p>
<p>Evidence that vocal expressions of certain negative emotions, such as
<italic>fear, sadness</italic>
, or
<italic>anger</italic>
, require systematically less auditory input to decode accurately, whereas expressions of
<italic>happiness</italic>
and
<italic>disgust</italic>
take much longer, may be partly explained by the evolutionary prevalence and significance of negative emotions over positive emotions (e.g., Cacioppo and Gardner,
<xref ref-type="bibr" rid="B8">1999</xref>
). Expressions that signal threat or loss must be decoded rapidly to avoid detrimental outcomes of great urgency to the organism; this negativity bias has been observed elsewhere in response to facial (Carretié et al.,
<xref ref-type="bibr" rid="B13">2001</xref>
) and vocal expressions of fear and anger (Calder et al.,
<xref ref-type="bibr" rid="B10">2001</xref>
,
<xref ref-type="bibr" rid="B9">2004</xref>
), and would explain why
<italic>fear</italic>
prosody was recognized more accurately and
<italic>faster</italic>
than any other emotional expression in the voice (Levitt,
<xref ref-type="bibr" rid="B32">1964</xref>
). The biological importance of rapidly differentiating negative vocal signals (e.g., Scherer,
<xref ref-type="bibr" rid="B48">1986</xref>
) potentially explains why the
<italic>amount</italic>
of temporal acoustic information, and not the position of critical cues, appears to be the key factor governing the time course of recognizing
<italic>fear, anger</italic>
, and
<italic>sadness</italic>
, since we found no significant differences in emotion identification times for these emotions between our two studies.</p>
<p>In contrast,
<italic>happiness</italic>
and
<italic>disgust</italic>
took significantly longer to identify and were the only emotions for which recognition times varied significantly as a function of gating direction (with a reduction in emotion recognition times of approximately 200 ms and 300 ms between studies, respectively). Difficulties recognizing
<italic>disgust</italic>
from prosody are well documented in the literature (Scherer,
<xref ref-type="bibr" rid="B48">1986</xref>
; Scherer et al.,
<xref ref-type="bibr" rid="B52">1991</xref>
; Jaywant and Pell,
<xref ref-type="bibr" rid="B23">2012</xref>
) and are sometimes attributed to the fact that
<italic>disgust</italic>
in the auditory modality is more typical in the form of affective bursts such as “yuck” or “eeeew” (Scherer,
<xref ref-type="bibr" rid="B49">1988</xref>
; Simon-Thomas et al.,
<xref ref-type="bibr" rid="B55">2009</xref>
). It is possible that identifying disgust from running speech, as required here and by Pell and Kotz (
<xref ref-type="bibr" rid="B41">2011</xref>
), activates additional social meanings that take more time to analyze and infer than the decoding of pure biological signals such as
<italic>fear, sadness</italic>
, and
<italic>anger</italic>
. For example, it has been suggested that there are qualitatively different expressions of disgust in the visual (Rozin et al.,
<xref ref-type="bibr" rid="B45">1994</xref>
) and auditory (Calder et al.,
<xref ref-type="bibr" rid="B11">2010</xref>
) modality, including a variant related to violations of moral standards that is often conveyed in running speech, as opposed to physical/visceral expressions of disgust which are better conveyed through exclamations (yuck!). If presentation of disgust utterances engendered processes for inferring a speaker's social or moral attitude from vocal cues, a more symbolic function of prosody, one might expect a much slower time course as witnessed here. A similar argument may apply to our results for
<italic>happiness</italic>
; although this emotion is typically the quickest emotion to be recognized in the visual modality (Tracy and Robins,
<xref ref-type="bibr" rid="B59">2008</xref>
; Palermo and Coltheart,
<xref ref-type="bibr" rid="B34">2004</xref>
; Calvo and Nummenmaa,
<xref ref-type="bibr" rid="B12">2009</xref>
), it exhibits a systematically slower time course in spoken language (Cornew et al.,
<xref ref-type="bibr" rid="B14">2010</xref>
; Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
). Like disgust,
<italic>happiness</italic>
may also be communicated in a more rapid and reliable manner by other types of vocal cues that accompany speech, such as laughter (e.g., Szameitat et al.,
<xref ref-type="bibr" rid="B57">2010</xref>
). In addition, there is probably a need to differentiate between different types of vocal expressions of happiness which yield different rates of perceptual recognition (Sauter and Scott,
<xref ref-type="bibr" rid="B47">2007</xref>
). Nonetheless, our results strongly imply that speakers use prosody to signal
<italic>happiness</italic>
, particularly towards the end of an utterance, as a conventionalized social cue directed to the listener for communicating this emotion state (Pell,
<xref ref-type="bibr" rid="B39">2001</xref>
; Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
), perhaps as a form of self-presentation and inter-personal expression of social affiliation. Further inquiry will be needed to test why
<italic>disgust</italic>
and
<italic>happiness</italic>
appear to be more socially mediated than other basic emotions, influencing the time course of their recognition in speech, and to define the
<italic>contexts</italic>
that produce variations in these expressions.</p>
<p>Interestingly, the recognition of
<italic>neutral</italic>
prosody was uniquely unaffected by the manner in which acoustic information was unveiled in the utterance, with no significant effects of presentation direction on accuracy, confidence ratings, or emotion identification times between studies. This tentatively suggests that the identification of neutrality, or a lack of emotionality in the voice, can be reliably inferred following a relatively standard amount of time in the range of 400–650 ms of stimulus exposure (Cornew et al.,
<xref ref-type="bibr" rid="B14">2010</xref>
; Pell and Kotz,
<xref ref-type="bibr" rid="B41">2011</xref>
). Since our measures of recognition include conscious interpretative (naming) processes and are biased somewhat by the gating method, our data on the time course for
<italic>neutral</italic>
prosody are not inconsistent with results showing the
<italic>on-line</italic>
differentiation of neutrality/emotionality in the voice at around 200 ms after speech onset, as inferred from amplitude differences in the P200 ERP component when German utterances were presented to listeners (Paulmann et al.,
<xref ref-type="bibr" rid="B37">2008</xref>
). One can speculate that listeners use a heuristic or default process for recognizing
<italic>neutral</italic>
voices whenever on-line analysis of prosody does not uncover evidence of emotionally meaningful cue configurations; presumably, this process for rejecting the presence of known acoustic patterns referring to emotions, like the process for decoding known patterns, is accomplished over a relatively stable time interval. To test these possibilities, it would be interesting to modify neutral sentences by inserting local variations in emotionally-meaningful acoustic features at critical junctures in time to determine if this “resets the clock” for inferring the presence or absence of emotion in speech.</p>
</sec>
</sec>
</sec>
<sec sec-type="conclusion" id="s5">
<title>Conclusion</title>
<p>Following recent on-line (ERP) studies demonstrating that vocal emotions are distinguished from neutral voices after 200 ms of speech processing (Paulmann and Kotz,
<xref ref-type="bibr" rid="B37a">2008</xref>
), and that emotion-specific differences begin to be detected in the 200–400 ms time window (Alter et al.,
<xref ref-type="bibr" rid="B1">2003</xref>
; Paulmann and Pell,
<xref ref-type="bibr" rid="B35">2010</xref>
), our data shed critical light on the time interval where different emotion-specific meanings of vocal expressions are fully recognized and available for conscious retrieval. While it seems likely that the phrase structure of language governs local opportunities for speakers to encode emotionally-meaningful cues that are highly salient to the listener, at least in certain contexts, there are remarkable consistencies in the
<italic>amount</italic>
of time listeners must monitor vocal cue configurations to decode emotional (particularly threatening) meanings. As such, the idea that there are systematic differences in the time course for arriving at vocal emotional meanings is confirmed. To gather further information on how social factors influence the communication of vocal emotional meanings, future studies using the gating paradigm could present emotional utterances to listeners in their native vs. a foreign language; this could reveal whether specificities in the time course for recognizing emotions manifest in a similar way for native speakers of different languages, while testing the hypothesis that accurate decoding of vocal emotions in a foreign language is systematically delayed due to interference at the phonological level (Van Bezooijen et al.,
<xref ref-type="bibr" rid="B60">1983</xref>
; Pell and Skorup,
<xref ref-type="bibr" rid="B43">2008</xref>
).</p>
<sec>
<title>Conflict of interest statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</sec>
</body>
<back>
<ack>
<p>This research was financially supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (RGPIN 203708-11 to Marc D. Pell). Assistance from the McGill University Faculty of Medicine (McLaughlin Postdoctoral Fellowship to Simon Rigoulot) and the Konrad-Adenauer-Foundation (to Eugen Wassiliwizky) are also gratefully acknowledged.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Alter</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Rank</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Kotz</surname>
<given-names>S. A.</given-names>
</name>
<name>
<surname>Toepel</surname>
<given-names>U.</given-names>
</name>
<name>
<surname>Besson</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Schirmer</surname>
<given-names>A.</given-names>
</name>
<etal></etal>
</person-group>
(
<year>2003</year>
).
<article-title>Affective encoding in the speech signal and in event-related brain potentials</article-title>
.
<source>Speech Commun</source>
.
<volume>40</volume>
,
<fpage>61</fpage>
<lpage>70</lpage>
<pub-id pub-id-type="doi">10.1016/S0167-6393(02)00075-4</pub-id>
</mixed-citation>
</ref>
<ref id="B2">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Banse</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Scherer</surname>
<given-names>K. R.</given-names>
</name>
</person-group>
(
<year>1996</year>
).
<article-title>Acoustic profiles in vocal emotion expression</article-title>
.
<source>J. Pers. Soc. Psychol</source>
.
<volume>70</volume>
,
<fpage>614</fpage>
<lpage>636</lpage>
<pub-id pub-id-type="doi">10.1037/0022-3514.70.3.614</pub-id>
<pub-id pub-id-type="pmid">8851745</pub-id>
</mixed-citation>
</ref>
<ref id="B3">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Barkhuysen</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Krahmer</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Swerts</surname>
<given-names>M.</given-names>
</name>
</person-group>
(
<year>2010</year>
).
<article-title>Crossmodal and incremental perception of audiovisual cues to emotional speech</article-title>
.
<source>Lang. Speech</source>
<volume>53</volume>
,
<fpage>1</fpage>
<lpage>30</lpage>
<pub-id pub-id-type="doi">10.1177/0023830909348993</pub-id>
<pub-id pub-id-type="pmid">20415000</pub-id>
</mixed-citation>
</ref>
<ref id="B4">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Becker</surname>
<given-names>D. V.</given-names>
</name>
<name>
<surname>Neel</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Srinivasan</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Neufeld</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Fouse</surname>
<given-names>S.</given-names>
</name>
</person-group>
(
<year>2012</year>
).
<article-title>The vividness of happiness in dynamic facial displays of emotion</article-title>
.
<source>PLoS ONE</source>
<volume>7</volume>
:
<fpage>e26551</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0026551</pub-id>
<pub-id pub-id-type="pmid">22247755</pub-id>
</mixed-citation>
</ref>
<ref id="B5">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Boersma</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Weenink</surname>
<given-names>D.</given-names>
</name>
</person-group>
(
<year>2012</year>
).
<source>Praat: Doing Phonetics by Computer [Computer Program]</source>
. Version 5.3.21.</mixed-citation>
</ref>
<ref id="B6">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Bolinger</surname>
<given-names>D.</given-names>
</name>
</person-group>
(
<year>1978</year>
).
<article-title>Intonation across languages</article-title>
, in
<source>Universals of Human Language</source>
,
<volume>vol. II: Phonology</volume>
, ed
<person-group person-group-type="editor">
<name>
<surname>Greenberg</surname>
<given-names>J.</given-names>
</name>
</person-group>
(
<publisher-loc>Palo Alto, CA</publisher-loc>
:
<publisher-name>Stanford University Press</publisher-name>
),
<fpage>471</fpage>
<lpage>524</lpage>
</mixed-citation>
</ref>
<ref id="B7">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bolinger</surname>
<given-names>D.</given-names>
</name>
</person-group>
(
<year>1972</year>
).
<article-title>Accent is predictable (if you're a mind reader)</article-title>
.
<source>Language</source>
<volume>48</volume>
,
<fpage>633</fpage>
<lpage>644</lpage>
<pub-id pub-id-type="doi">10.2307/412039</pub-id>
</mixed-citation>
</ref>
<ref id="B8">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cacioppo</surname>
<given-names>J. T.</given-names>
</name>
<name>
<surname>Gardner</surname>
<given-names>W. L.</given-names>
</name>
</person-group>
(
<year>1999</year>
).
<article-title>Emotion</article-title>
.
<source>Annu. Rev. Psychol</source>
.
<volume>50</volume>
,
<fpage>191</fpage>
<lpage>214</lpage>
<pub-id pub-id-type="doi">10.1146/annurev.psych.50.1.191</pub-id>
<pub-id pub-id-type="pmid">10074678</pub-id>
</mixed-citation>
</ref>
<ref id="B9">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Calder</surname>
<given-names>A. J.</given-names>
</name>
<name>
<surname>Keane</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Lawrence</surname>
<given-names>A. D.</given-names>
</name>
<name>
<surname>Manes</surname>
<given-names>F.</given-names>
</name>
</person-group>
(
<year>2004</year>
).
<article-title>Impaired recognition of anger following damage to the ventral striatum</article-title>
.
<source>Brain</source>
<volume>127</volume>
,
<fpage>1958</fpage>
<lpage>1969</lpage>
<pub-id pub-id-type="doi">10.1093/brain/awh214</pub-id>
<pub-id pub-id-type="pmid">15289264</pub-id>
</mixed-citation>
</ref>
<ref id="B10">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Calder</surname>
<given-names>A. J.</given-names>
</name>
<name>
<surname>Lawrence</surname>
<given-names>A. D.</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>A. W.</given-names>
</name>
</person-group>
(
<year>2001</year>
).
<article-title>Neuropsychology of fear and loathing</article-title>
.
<source>Nat. Rev. Neurosci</source>
.
<volume>2</volume>
,
<fpage>352</fpage>
<lpage>363</lpage>
<pub-id pub-id-type="doi">10.1038/35072584</pub-id>
<pub-id pub-id-type="pmid">11331919</pub-id>
</mixed-citation>
</ref>
<ref id="B11">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Calder</surname>
<given-names>A. J.</given-names>
</name>
<name>
<surname>Keane</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>A. W.</given-names>
</name>
<name>
<surname>Lawrence</surname>
<given-names>A. D.</given-names>
</name>
<name>
<surname>Mason</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Barker</surname>
<given-names>R.</given-names>
</name>
</person-group>
(
<year>2010</year>
).
<article-title>The relation between anger and different forms of disgust: implications for emotion recognition impairments in Huntington's disease</article-title>
.
<source>Neuropsychologia</source>
<volume>48</volume>
,
<fpage>2719</fpage>
<lpage>2729</lpage>
<pub-id pub-id-type="doi">10.1016/j.neuropsychologia.2010.05.019</pub-id>
<pub-id pub-id-type="pmid">20580641</pub-id>
</mixed-citation>
</ref>
<ref id="B12">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Calvo</surname>
<given-names>M. G.</given-names>
</name>
<name>
<surname>Nummenmaa</surname>
<given-names>L.</given-names>
</name>
</person-group>
(
<year>2009</year>
).
<article-title>Eye-movement assessment of the time course in facial expression recognition: neurophysiological implications</article-title>
.
<source>Cogn. Affect. Behav. Neurosci</source>
.
<volume>9</volume>
,
<fpage>398</fpage>
<lpage>411</lpage>
<pub-id pub-id-type="doi">10.3758/CABN.9.4.398</pub-id>
<pub-id pub-id-type="pmid">19897793</pub-id>
</mixed-citation>
</ref>
<ref id="B13">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Carretié</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Mercado</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Tapia</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Hinojosa</surname>
<given-names>J. A.</given-names>
</name>
</person-group>
(
<year>2001</year>
).
<article-title>Emotion, attention, and the “negativity bias”, studied through event-related potentials</article-title>
.
<source>Int. J. Psychophysiol</source>
.
<volume>41</volume>
,
<fpage>75</fpage>
<lpage>85</lpage>
<pub-id pub-id-type="doi">10.1016/S0167-8760(00)00195-1</pub-id>
<pub-id pub-id-type="pmid">11239699</pub-id>
</mixed-citation>
</ref>
<ref id="B14">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cornew</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Carver</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Love</surname>
<given-names>T.</given-names>
</name>
</person-group>
(
<year>2010</year>
).
<article-title>There's more to emotion than meets the eye: a processing bias for neutral content in the domain of emotional prosody</article-title>
.
<source>Cogn. Emot</source>
.
<volume>24</volume>
,
<fpage>1133</fpage>
<lpage>1152</lpage>
<pub-id pub-id-type="doi">10.1080/02699930903247492</pub-id>
<pub-id pub-id-type="pmid">21552425</pub-id>
</mixed-citation>
</ref>
<ref id="B15">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cosmides</surname>
<given-names>L.</given-names>
</name>
</person-group>
(
<year>1983</year>
).
<article-title>Invariances in the acoustic expression of emotion during speech</article-title>
.
<source>J. Exp. Psychol. Hum. Percept. Perform</source>
.
<volume>9</volume>
(6),
<fpage>864</fpage>
<lpage>881</lpage>
<pub-id pub-id-type="doi">10.1037/0096-1523.9.6.864</pub-id>
<pub-id pub-id-type="pmid">6227697</pub-id>
</mixed-citation>
</ref>
<ref id="B16">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Ekman</surname>
<given-names>P.</given-names>
</name>
</person-group>
(
<year>1972</year>
).
<article-title>Universal and cultural differences in facial expressions of emotions</article-title>
, in
<source>Nebraska Symposium on Motivation, 1971</source>
, ed
<person-group person-group-type="editor">
<name>
<surname>Cole</surname>
<given-names>J. K.</given-names>
</name>
</person-group>
(
<publisher-loc>Lincoln</publisher-loc>
:
<publisher-name>University of Nebraska Press</publisher-name>
),
<fpage>207</fpage>
<lpage>283</lpage>
</mixed-citation>
</ref>
<ref id="B17">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ekman</surname>
<given-names>P.</given-names>
</name>
</person-group>
(
<year>1992</year>
).
<article-title>An argument for basic emotions</article-title>
.
<source>Cogn. Emot</source>
.
<volume>6</volume>
,
<fpage>169</fpage>
<lpage>200</lpage>
<pub-id pub-id-type="doi">10.1080/02699939208411068</pub-id>
</mixed-citation>
</ref>
<ref id="B18">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ekman</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Levenson</surname>
<given-names>R. W.</given-names>
</name>
<name>
<surname>Friesen</surname>
<given-names>W. V.</given-names>
</name>
</person-group>
(
<year>1983</year>
).
<article-title>Autonomic nervous system activity distinguishes between emotions</article-title>
.
<source>Science</source>
<volume>221</volume>
,
<fpage>1208</fpage>
<lpage>1210</lpage>
<pub-id pub-id-type="doi">10.1126/science.6612338</pub-id>
<pub-id pub-id-type="pmid">6612338</pub-id>
</mixed-citation>
</ref>
<ref id="B19">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Grosjean</surname>
<given-names>F.</given-names>
</name>
</person-group>
(
<year>1980</year>
).
<article-title>Spoken word recognition processes and the gating paradigm</article-title>
.
<source>Percept. Psychophys</source>
.
<volume>28</volume>
,
<fpage>267</fpage>
<lpage>283</lpage>
<pub-id pub-id-type="doi">10.3758/BF03204386</pub-id>
<pub-id pub-id-type="pmid">7465310</pub-id>
</mixed-citation>
</ref>
<ref id="B20">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Grosjean</surname>
<given-names>F.</given-names>
</name>
</person-group>
(
<year>1996</year>
).
<article-title>Gating</article-title>
.
<source>Lang. Cogn. Process</source>
.
<volume>11</volume>
,
<fpage>597</fpage>
<lpage>604</lpage>
<pub-id pub-id-type="doi">10.1080/016909696386999</pub-id>
</mixed-citation>
</ref>
<ref id="B21">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Hess</surname>
<given-names>U.</given-names>
</name>
<name>
<surname>Beaupré</surname>
<given-names>M. G.</given-names>
</name>
<name>
<surname>Cheung</surname>
<given-names>N.</given-names>
</name>
</person-group>
(
<year>2002</year>
).
<article-title>Who to whom and why – cultural differences and similarities in the function of smiles</article-title>
, in
<source>An Empirical Reflection on the Smile</source>
, eds
<person-group person-group-type="editor">
<name>
<surname>Abel</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Ceia</surname>
<given-names>C. H.</given-names>
</name>
</person-group>
(
<publisher-loc>Lewiston, NY</publisher-loc>
:
<publisher-name>The Edwin Mellen Press</publisher-name>
),
<fpage>187</fpage>
<lpage>216</lpage>
</mixed-citation>
</ref>
<ref id="B22">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Izard</surname>
<given-names>C. E.</given-names>
</name>
</person-group>
(
<year>1971</year>
).
<source>The Face of Emotion</source>
.
<publisher-loc>New York, NY</publisher-loc>
:
<publisher-name>Appleton-Century-Crofts</publisher-name>
</mixed-citation>
</ref>
<ref id="B23">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jaywant</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Pell</surname>
<given-names>M. D.</given-names>
</name>
</person-group>
(
<year>2012</year>
).
<article-title>Categorical processing of negative emotions from speech prosody</article-title>
.
<source>Speech Commun</source>
.
<volume>54</volume>
,
<fpage>1</fpage>
<lpage>10</lpage>
<pub-id pub-id-type="doi">10.1016/j.specom.2011.05.011</pub-id>
</mixed-citation>
</ref>
<ref id="B24">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Johnstone</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Scherer</surname>
<given-names>K. R.</given-names>
</name>
</person-group>
(
<year>2000</year>
).
<article-title>Vocal communication of emotion</article-title>
, in
<source>Handbook of Emotions, 2nd Edn</source>
, eds
<person-group person-group-type="editor">
<name>
<surname>Lewis</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Haviland</surname>
<given-names>J.</given-names>
</name>
</person-group>
(
<publisher-loc>New York, NY</publisher-loc>
:
<publisher-name>Guilford Press</publisher-name>
),
<fpage>220</fpage>
<lpage>235</lpage>
</mixed-citation>
</ref>
<ref id="B25">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Juslin</surname>
<given-names>P. N.</given-names>
</name>
<name>
<surname>Laukka</surname>
<given-names>P.</given-names>
</name>
</person-group>
(
<year>2003</year>
).
<article-title>Communication of emotions in vocal expression and music performance: different channels, same code?</article-title>
<source>Psychol. Bull</source>
.
<volume>129</volume>
,
<fpage>770</fpage>
<lpage>814</lpage>
<pub-id pub-id-type="doi">10.1037/0033-2909.129.5.770</pub-id>
<pub-id pub-id-type="pmid">12956543</pub-id>
</mixed-citation>
</ref>
<ref id="B26">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Ladd</surname>
<given-names>D. R.</given-names>
</name>
</person-group>
(
<year>1996</year>
).
<source>Intonational Phonology</source>
.
<publisher-loc>Cambridge, UK</publisher-loc>
:
<publisher-name>Cambridge University Press</publisher-name>
</mixed-citation>
</ref>
<ref id="B27">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ladd</surname>
<given-names>D. R.</given-names>
</name>
<name>
<surname>Silverman</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Tolkmitt</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Bergmann</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Scherer</surname>
<given-names>K. R.</given-names>
</name>
</person-group>
(
<year>1985</year>
).
<article-title>Evidence for the independent function of intonation contour type, voice quality and F0 range in signaling speaker affect</article-title>
.
<source>J. Acoust. Soc. Am</source>
.
<volume>78</volume>
,
<fpage>435</fpage>
<lpage>444</lpage>
<pub-id pub-id-type="doi">10.1121/1.392466</pub-id>
</mixed-citation>
</ref>
<ref id="B28">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Laukka</surname>
<given-names>P.</given-names>
</name>
</person-group>
(
<year>2005</year>
).
<article-title>Categorical perception of vocal emotion expressions</article-title>
.
<source>Emotion</source>
<volume>5</volume>
,
<fpage>277</fpage>
<lpage>295</lpage>
<pub-id pub-id-type="doi">10.1037/1528-3542.5.3.277</pub-id>
<pub-id pub-id-type="pmid">16187864</pub-id>
</mixed-citation>
</ref>
<ref id="B29">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Laukka</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Juslin</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Bresin</surname>
<given-names>R.</given-names>
</name>
</person-group>
(
<year>2005</year>
).
<article-title>A dimensional approach to vocal expression of emotion</article-title>
.
<source>Cogn. Emot</source>
.
<volume>19</volume>
,
<fpage>633</fpage>
<lpage>653</lpage>
<pub-id pub-id-type="doi">10.1080/02699930441000445</pub-id>
</mixed-citation>
</ref>
<ref id="B30">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Leinonen</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Hiltunen</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Linnankoski</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Laakso</surname>
<given-names>M.-L.</given-names>
</name>
</person-group>
(
<year>1997</year>
).
<article-title>Expression of emotional-motivational connotations with a one-word utterance</article-title>
.
<source>J. Acoust. Soc. Am</source>
.
<volume>102</volume>
,
<fpage>1853</fpage>
<lpage>1863</lpage>
<pub-id pub-id-type="doi">10.1121/1.420109</pub-id>
<pub-id pub-id-type="pmid">9301063</pub-id>
</mixed-citation>
</ref>
<ref id="B31">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Levenson</surname>
<given-names>R. W.</given-names>
</name>
</person-group>
(
<year>1992</year>
).
<article-title>Autonomic nervous system differences among emotions</article-title>
.
<source>Psychol. Sci</source>
.
<volume>3</volume>
,
<fpage>23</fpage>
<lpage>27</lpage>
<pub-id pub-id-type="doi">10.1111/j.1467-9280.1992.tb00251.x</pub-id>
</mixed-citation>
</ref>
<ref id="B32">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Levitt</surname>
<given-names>E. A.</given-names>
</name>
</person-group>
(
<year>1964</year>
).
<article-title>The relationship between abilities to express emotional meanings vocally and facially</article-title>
, in
<source>The Communication of Emotional Meaning</source>
, ed
<person-group person-group-type="editor">
<name>
<surname>Davitz</surname>
<given-names>J. R.</given-names>
</name>
</person-group>
(
<publisher-loc>New York, NY</publisher-loc>
:
<publisher-name>McGraw-Hill</publisher-name>
),
<fpage>87</fpage>
<lpage>100</lpage>
</mixed-citation>
</ref>
<ref id="B33">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Oller</surname>
<given-names>D. K.</given-names>
</name>
</person-group>
(
<year>1973</year>
).
<article-title>The effect of position in utterance on speech segment duration in English</article-title>
.
<source>J. Acoust. Soc. Am</source>
.
<volume>54</volume>
,
<fpage>1235</fpage>
<lpage>1247</lpage>
<pub-id pub-id-type="doi">10.1121/1.1914393</pub-id>
<pub-id pub-id-type="pmid">4765808</pub-id>
</mixed-citation>
</ref>
<ref id="B34">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Palermo</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Coltheart</surname>
<given-names>M.</given-names>
</name>
</person-group>
(
<year>2004</year>
).
<article-title>Photographs of facial expression: accuracy, response times, and ratings of intensity</article-title>
.
<source>Behav. Res. Meth. Instrum. Comput</source>
.
<volume>36</volume>
,
<fpage>634</fpage>
<lpage>638</lpage>
<pub-id pub-id-type="doi">10.3758/BF03206544</pub-id>
<pub-id pub-id-type="pmid">15641409</pub-id>
</mixed-citation>
</ref>
<ref id="B37a">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Paulmann</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Kotz</surname>
<given-names>S. A.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>Early emotional prosody perception based on different speaker voices</article-title>
.
<source>Neuroreport</source>
,
<volume>19</volume>
,
<fpage>209</fpage>
<lpage>213</lpage>
<pub-id pub-id-type="doi">10.1097/WNR.0b013e3282f454db</pub-id>
<pub-id pub-id-type="pmid">18185110</pub-id>
</mixed-citation>
</ref>
<ref id="B35">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Paulmann</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Pell</surname>
<given-names>M. D.</given-names>
</name>
</person-group>
(
<year>2010</year>
).
<article-title>Contextual influences of emotional speech prosody on face processing: how much is enough?</article-title>
<source>Cogn. Affect. Behav. Neurosci</source>
.
<volume>10</volume>
,
<fpage>230</fpage>
<lpage>242</lpage>
<pub-id pub-id-type="doi">10.3758/CABN.10.2.230</pub-id>
<pub-id pub-id-type="pmid">20498347</pub-id>
</mixed-citation>
</ref>
<ref id="B36">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Paulmann</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Pell</surname>
<given-names>M. D.</given-names>
</name>
</person-group>
(
<year>2011</year>
).
<article-title>Is there an advantage for recognizing multi-modal emotional stimuli?</article-title>
<source>Mot. Emot</source>
.
<volume>35</volume>
,
<fpage>192</fpage>
<lpage>201</lpage>
<pub-id pub-id-type="doi">10.1007/s11031-011-9206-0</pub-id>
</mixed-citation>
</ref>
<ref id="B37">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Paulmann</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Pell</surname>
<given-names>M. D.</given-names>
</name>
<name>
<surname>Kotz</surname>
<given-names>S. A.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>Functional contributions of the basal ganglia to emotional prosody: evidence from ERPs</article-title>
.
<source>Brain Res</source>
.
<volume>1217</volume>
,
<fpage>171</fpage>
<lpage>178</lpage>
<pub-id pub-id-type="doi">10.1016/j.brainres.2008.04.032</pub-id>
<pub-id pub-id-type="pmid">18501336</pub-id>
</mixed-citation>
</ref>
<ref id="B39">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pell</surname>
<given-names>M. D.</given-names>
</name>
</person-group>
(
<year>2001</year>
).
<article-title>Influence of emotion and focus location on prosody in matched statements and questions</article-title>
.
<source>J. Acoust. Soc. Am</source>
.
<volume>109</volume>
,
<fpage>1668</fpage>
<lpage>1680</lpage>
<pub-id pub-id-type="doi">10.1121/1.1352088</pub-id>
<pub-id pub-id-type="pmid">11325135</pub-id>
</mixed-citation>
</ref>
<ref id="B40">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pell</surname>
<given-names>M. D.</given-names>
</name>
<name>
<surname>Baum</surname>
<given-names>S. R.</given-names>
</name>
</person-group>
(
<year>1997</year>
).
<article-title>Unilateral brain damage, prosodic comprehension deficits, and the acoustic cues to prosody</article-title>
.
<source>Brain Lang</source>
.
<volume>57</volume>
,
<fpage>195</fpage>
<lpage>214</lpage>
<pub-id pub-id-type="doi">10.1006/brln.1997.1736</pub-id>
<pub-id pub-id-type="pmid">9126413</pub-id>
</mixed-citation>
</ref>
<ref id="B41">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pell</surname>
<given-names>M. D.</given-names>
</name>
<name>
<surname>Kotz</surname>
<given-names>S.</given-names>
</name>
</person-group>
(
<year>2011</year>
).
<article-title>On the time course of vocal emotion recognition</article-title>
.
<source>PLoS ONE</source>
<volume>6</volume>
:
<fpage>e27256</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0027256</pub-id>
<pub-id pub-id-type="pmid">22087275</pub-id>
</mixed-citation>
</ref>
<ref id="B42">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pell</surname>
<given-names>M. D.</given-names>
</name>
<name>
<surname>Paulmann</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Dara</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Alasseri</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kotz</surname>
<given-names>S. A.</given-names>
</name>
</person-group>
(
<year>2009</year>
).
<article-title>Factors in the recognition of vocally expressed emotions: a comparison of four languages</article-title>
.
<source>J. Phonet</source>
.
<volume>37</volume>
,
<fpage>417</fpage>
<lpage>435</lpage>
<pub-id pub-id-type="doi">10.1016/j.wocn.2009.07.005</pub-id>
</mixed-citation>
</ref>
<ref id="B43">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pell</surname>
<given-names>M. D.</given-names>
</name>
<name>
<surname>Skorup</surname>
<given-names>V.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>Implicit processing of emotional prosody in a foreign versus native language</article-title>
.
<source>Speech Commun</source>
.
<volume>50</volume>
,
<fpage>519</fpage>
<lpage>530</lpage>
<pub-id pub-id-type="doi">10.1016/j.specom.2008.03.006</pub-id>
</mixed-citation>
</ref>
<ref id="B44">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rigoulot</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Pell</surname>
<given-names>M. D.</given-names>
</name>
</person-group>
(
<year>2012</year>
).
<article-title>Seeing emotion with your ears: emotional prosody implicitly guides visual attention to faces</article-title>
.
<source>PLoS ONE</source>
<volume>7</volume>
:
<fpage>e30740</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0030740</pub-id>
<pub-id pub-id-type="pmid">22303454</pub-id>
</mixed-citation>
</ref>
<ref id="B45">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rozin</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Lowery</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Ebert</surname>
<given-names>R.</given-names>
</name>
</person-group>
(
<year>1994</year>
).
<article-title>Varieties of disgust faces and the structure of disgust</article-title>
.
<source>J. Pers. Soc. Psychol</source>
.
<volume>66</volume>
,
<fpage>870</fpage>
<lpage>881</lpage>
<pub-id pub-id-type="doi">10.1037/0022-3514.66.5.870</pub-id>
<pub-id pub-id-type="pmid">8014832</pub-id>
</mixed-citation>
</ref>
<ref id="B46">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sauter</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Eisner</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Ekman</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Scott</surname>
<given-names>S. K.</given-names>
</name>
</person-group>
(
<year>2010</year>
).
<article-title>Cross-cultural recognition of basic emotions through nonverbal emotional vocalizations</article-title>
.
<source>Proc. Natl. Acad. Sci. U.S.A</source>
.
<volume>107</volume>
,
<fpage>2408</fpage>
<lpage>2412</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0908239106</pub-id>
<pub-id pub-id-type="pmid">20133790</pub-id>
</mixed-citation>
</ref>
<ref id="B47">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sauter</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Scott</surname>
<given-names>S. K.</given-names>
</name>
</person-group>
(
<year>2007</year>
).
<article-title>More than one kind of happiness: can we recognize vocal expressions of different positive states?</article-title>
<source>Mot. Emot</source>
.
<volume>31</volume>
,
<fpage>192</fpage>
<lpage>199</lpage>
<pub-id pub-id-type="doi">10.1007/s11031-007-9065-x</pub-id>
</mixed-citation>
</ref>
<ref id="B48">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Scherer</surname>
<given-names>K. R.</given-names>
</name>
</person-group>
(
<year>1986</year>
).
<article-title>Vocal affect expression: a review and a model for future research</article-title>
.
<source>Psychol. Bull</source>
.
<volume>99</volume>
,
<fpage>143</fpage>
<lpage>165</lpage>
<pub-id pub-id-type="doi">10.1037/0033-2909.99.2.143</pub-id>
<pub-id pub-id-type="pmid">3515381</pub-id>
</mixed-citation>
</ref>
<ref id="B49">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Scherer</surname>
<given-names>K. R.</given-names>
</name>
</person-group>
(
<year>1988</year>
).
<article-title>On the symbolic functions of vocal affect expression</article-title>
.
<source>J. Lang. Soc. Psychol</source>
.
<volume>7</volume>
,
<fpage>79</fpage>
<lpage>100</lpage>
<pub-id pub-id-type="doi">10.1177/0261927X8800700201</pub-id>
</mixed-citation>
</ref>
<ref id="B50">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Scherer</surname>
<given-names>K. R.</given-names>
</name>
</person-group>
(
<year>2009</year>
).
<article-title>Emotions are emergent processes: they require a dynamic computational architecture</article-title>
.
<source>Philos. Trans. R. Soc. B</source>
<volume>364</volume>
,
<fpage>3459</fpage>
<lpage>3474</lpage>
<pub-id pub-id-type="doi">10.1098/rstb.2009.0141</pub-id>
<pub-id pub-id-type="pmid">19884141</pub-id>
</mixed-citation>
</ref>
<ref id="B51">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Scherer</surname>
<given-names>K. R.</given-names>
</name>
<name>
<surname>Banse</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Wallbott</surname>
<given-names>H. G.</given-names>
</name>
</person-group>
(
<year>2001</year>
).
<article-title>Emotion inferences from vocal expression correlate across languages and cultures</article-title>
.
<source>J. Cross Cult. Psychol</source>
.
<volume>32</volume>
,
<fpage>76</fpage>
<lpage>92</lpage>
<pub-id pub-id-type="doi">10.1177/0022022101032001009</pub-id>
</mixed-citation>
</ref>
<ref id="B52">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Scherer</surname>
<given-names>K. R.</given-names>
</name>
<name>
<surname>Banse</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Wallbott</surname>
<given-names>H. G.</given-names>
</name>
<name>
<surname>Goldbeck</surname>
<given-names>T.</given-names>
</name>
</person-group>
(
<year>1991</year>
).
<article-title>Vocal cues in emotion encoding and decoding</article-title>
.
<source>Mot. Emot</source>
.
<volume>15</volume>
,
<fpage>123</fpage>
<lpage>148</lpage>
<pub-id pub-id-type="doi">10.1007/BF00995674</pub-id>
</mixed-citation>
</ref>
<ref id="B54">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schirmer</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kotz</surname>
<given-names>S. A.</given-names>
</name>
</person-group>
(
<year>2006</year>
).
<article-title>Beyond the right hemisphere: brain mechanisms mediating vocal emotional processing</article-title>
.
<source>Trends Cogn. Sci</source>
.
<volume>10</volume>
,
<fpage>24</fpage>
<lpage>30</lpage>
<pub-id pub-id-type="doi">10.1016/j.tics.2005.11.009</pub-id>
<pub-id pub-id-type="pmid">16321562</pub-id>
</mixed-citation>
</ref>
<ref id="B55">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Simon-Thomas</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Keltner</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Sauter</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Sinicropi-Yao</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Abramson</surname>
<given-names>A.</given-names>
</name>
</person-group>
(
<year>2009</year>
).
<article-title>The voice conveys specific emotions: evidence from vocal burst displays</article-title>
.
<source>Emotion</source>
<volume>9</volume>
,
<fpage>838</fpage>
<lpage>846</lpage>
<pub-id pub-id-type="doi">10.1037/a0017810</pub-id>
<pub-id pub-id-type="pmid">20001126</pub-id>
</mixed-citation>
</ref>
<ref id="B56">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sobin</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Alpert</surname>
<given-names>M.</given-names>
</name>
</person-group>
(
<year>1999</year>
).
<article-title>Emotions in speech: the acoustic attributes of fear, anger, sadness and joy</article-title>
.
<source>J. Psycholinguist. Res</source>
.
<volume>28</volume>
,
<fpage>347</fpage>
<lpage>365</lpage>
<pub-id pub-id-type="doi">10.1023/A:1023237014909</pub-id>
<pub-id pub-id-type="pmid">10380660</pub-id>
</mixed-citation>
</ref>
<ref id="B57">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Szameitat</surname>
<given-names>D. P.</given-names>
</name>
<name>
<surname>Kreifelts</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Alter</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Szameitat</surname>
<given-names>A. J.</given-names>
</name>
<name>
<surname>Sterr</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Grodd</surname>
<given-names>W.</given-names>
</name>
<etal></etal>
</person-group>
(
<year>2010</year>
).
<article-title>It is not always tickling: distinct cerebral responses during perception of different laughter types</article-title>
.
<source>Neuroimage</source>
<volume>53</volume>
,
<fpage>1264</fpage>
<lpage>1271</lpage>
<pub-id pub-id-type="doi">10.1016/j.neuroimage.2010.06.028</pub-id>
<pub-id pub-id-type="pmid">20600991</pub-id>
</mixed-citation>
</ref>
<ref id="B58">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Thompson</surname>
<given-names>W. F.</given-names>
</name>
<name>
<surname>Balkwill</surname>
<given-names>L.-L.</given-names>
</name>
</person-group>
(
<year>2006</year>
).
<article-title>Decoding speech prosody in five languages</article-title>
.
<source>Semiotica</source>
<volume>158</volume>
,
<fpage>407</fpage>
<lpage>424</lpage>
</mixed-citation>
</ref>
<ref id="B59">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tracy</surname>
<given-names>J. L.</given-names>
</name>
<name>
<surname>Robins</surname>
<given-names>R. W.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>The automaticity of emotion recognition</article-title>
.
<source>Emotion</source>
<volume>8</volume>
,
<fpage>81</fpage>
<lpage>95</lpage>
<pub-id pub-id-type="doi">10.1037/1528-3542.8.1.81</pub-id>
<pub-id pub-id-type="pmid">18266518</pub-id>
</mixed-citation>
</ref>
<ref id="B60">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Van Bezooijen</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Otto</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Heenan</surname>
<given-names>T.</given-names>
</name>
</person-group>
(
<year>1983</year>
).
<article-title>Recognition of vocal expressions of emotion: a three-nation study to identify universal characteristics</article-title>
.
<source>J. Cross Cult. Psychol</source>
.
<volume>14</volume>
,
<fpage>387</fpage>
<lpage>406</lpage>
<pub-id pub-id-type="doi">10.1177/0022002183014004001</pub-id>
</mixed-citation>
</ref>
<ref id="B61">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wagner</surname>
<given-names>H. L.</given-names>
</name>
</person-group>
(
<year>1993</year>
).
<article-title>On measuring performance in category judgment studies of nonverbal behavior</article-title>
.
<source>J. Nonverb. Behav</source>
.
<volume>17</volume>
,
<fpage>3</fpage>
<lpage>28</lpage>
<pub-id pub-id-type="doi">10.1007/BF00987006</pub-id>
</mixed-citation>
</ref>
<ref id="B62">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wallbott</surname>
<given-names>H. G.</given-names>
</name>
<name>
<surname>Scherer</surname>
<given-names>K. R.</given-names>
</name>
</person-group>
(
<year>1986</year>
).
<article-title>Cues and channels in emotion recognition</article-title>
.
<source>J. Pers. Soc. Psychol</source>
.
<volume>51</volume>
,
<fpage>690</fpage>
<lpage>699</lpage>
<pub-id pub-id-type="doi">10.1037/0022-3514.51.4.690</pub-id>
</mixed-citation>
</ref>
<ref id="B63">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wilson</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Wharton</surname>
<given-names>T.</given-names>
</name>
</person-group>
(
<year>2006</year>
).
<article-title>Relevance and prosody</article-title>
.
<source>J. Pragmat</source>
.
<volume>38</volume>
,
<fpage>1559</fpage>
<lpage>1579</lpage>
<pub-id pub-id-type="doi">10.1016/j.pragma.2005.04.012</pub-id>
</mixed-citation>
</ref>
<ref id="B64">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zuckerman</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Lipets</surname>
<given-names>M. S.</given-names>
</name>
<name>
<surname>Koivumaki</surname>
<given-names>J. H.</given-names>
</name>
<name>
<surname>Rosenthal</surname>
<given-names>R.</given-names>
</name>
</person-group>
(
<year>1975</year>
).
<article-title>Encoding and decoding nonverbal cues of emotion</article-title>
.
<source>J. Pers. Soc. Psychol</source>
.
<volume>32</volume>
,
<fpage>1068</fpage>
<lpage>1076</lpage>
<pub-id pub-id-type="doi">10.1037/0022-3514.32.6.1068</pub-id>
<pub-id pub-id-type="pmid">1214214</pub-id>
</mixed-citation>
</ref>
</ref-list>
<app-group>
<app id="A1">
<title>Appendix</title>
<p>A list of the pseudo-utterances, produced to convey each target emotion, that were gated for presentation in the experiment.</p>
<list list-type="order">
<list-item>
<p>I tropped for swinty gowers.</p>
</list-item>
<list-item>
<p>She kuvelled the noralind.</p>
</list-item>
<list-item>
<p>The placter jabored the tozz.</p>
</list-item>
<list-item>
<p>The moger is chalestic.</p>
</list-item>
<list-item>
<p>The rivix joled the silling.</p>
</list-item>
<list-item>
<p>The crinklet is boritate.</p>
</list-item>
<list-item>
<p>She krayed a jad ralition.</p>
</list-item>
<list-item>
<p>We wanced on the nonitor.</p>
</list-item>
<list-item>
<p>They pannifered the moser.</p>
</list-item>
<list-item>
<p>We groffed for vappy laurits.</p>
</list-item>
<list-item>
<p>I marlipped the tovity.</p>
</list-item>
<list-item>
<p>The varmalit was raffid.</p>
</list-item>
<list-item>
<p>They rilted the prubition.</p>
</list-item>
</list>
</app>
</app-group>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Sarre/explor/MusicSarreV3/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000144 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000144 | SxmlIndent | more
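
For example, a minimal variation of the command above (assuming EXPLOR_STEP is set as shown; the output filename record-000144.xml is purely illustrative) saves the indented record to a file instead of paging through it:

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000144 | SxmlIndent > record-000144.xml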

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Wicri/Sarre
   |area=    MusicSarreV3
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:3690349
   |texte=   Feeling backwards? How temporal order in speech affects the time course of vocal emotion recognition
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:23805115" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a MusicSarreV3 
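
Assuming NlmPubMed2Wicri writes the generated wiki markup to standard output (an assumption, not verified here), the same pipeline can be captured in a file; the filename page-000144.wiki is purely illustrative:

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:23805115" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a MusicSarreV3 > page-000144.wiki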

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Sun Jul 15 18:16:09 2018. Site generation: Tue Mar 5 19:21:25 2024