Exploration server on music in Saarland

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Testing hypotheses about compound stress assignment in English: a corpus-based investigation

Internal identifier: 000858 (Istex/Corpus); previous: 000857; next: 000859

Authors: Ingo Plag; Gero Kunter; Sabine Lappe

Source: Corpus Linguistics and Linguistic Theory, vol. 3, no. 2 (2007), pp. 199-232

RBID : ISTEX:519348FA143EF197827D6C5F8553F9626EBC142F

Abstract

This paper tests three factors that have been held to be responsible for the variable stress behavior of noun-noun constructs in English: argument structure, semantics, and analogy. In a large-scale investigation of some 4500 compounds extracted from the CELEX lexical database (Baayen et al. 1995), we show that traditional claims about noun-noun stress cannot be upheld. Argument structure plays a role only with synthetic compounds ending in the agentive suffix -er. The semantic categories and relations assumed in the literature to trigger rightward stress do not show the expected effects. As an alternative to the rule-based approaches, the data were modeled computationally and probabilistically using a memory-based analogical algorithm (TiMBL 5.1) and logistic regression, respectively. It turns out that probabilistic models and the analogical algorithm are more successful in predicting stress assignment correctly than any of the rules proposed in the literature. Furthermore, the results of the analogical modeling suggest that the left and right constituent are the most important factor in compound stress assignment. This is in line with recent findings on the semi-regular behavior of compounds in other languages.
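
The analogical idea mentioned in the abstract can be illustrated with a minimal sketch, assuming a plain nearest-neighbor vote over the left and right constituents only. The training items, the names `overlap_distance` and `predict_stress`, and the value of `k` are illustrative assumptions, not data or settings from the study; the sketch does not call TiMBL and does not reproduce its configuration. The stress labels follow the familiar street/avenue contrast.

```python
from collections import Counter

# Toy, hand-made items (NOT taken from CELEX): (left, right, stress position).
TRAINING = [
    ("oxford", "street", "left"),
    ("madison", "street", "left"),
    ("baker", "street", "left"),
    ("fifth", "avenue", "right"),
    ("park", "avenue", "right"),
    ("madison", "avenue", "right"),
]


def overlap_distance(a, b):
    """Number of features (left and right constituent) on which two items differ."""
    return sum(x != y for x, y in zip(a, b))


def predict_stress(left, right, k=3):
    """Majority vote among the k stored items closest to (left, right)."""
    ranked = sorted(
        TRAINING,
        key=lambda item: overlap_distance((left, right), item[:2]),
    )
    votes = Counter(stress for _, _, stress in ranked[:k])
    return votes.most_common(1)[0][0]


print(predict_stress("lexington", "avenue"))
print(predict_stress("wall", "street"))
```

In this toy setting, an unseen compound inherits the stress pattern of stored compounds that share its right constituent, which is the kind of constituent-family effect the abstract refers to.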

Url: https://api.istex.fr/document/519348FA143EF197827D6C5F8553F9626EBC142F/fulltext/pdf
DOI: 10.1515/CLLT.2007.012

Links to Exploration step

ISTEX:519348FA143EF197827D6C5F8553F9626EBC142F

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Testing hypotheses about compound stress assignment in English: a corpus-based investigation</title>
<author>
<name sortKey="Plag, Ingo" sort="Plag, Ingo" uniqKey="Plag I" first="Ingo" last="Plag">Ingo Plag</name>
</author>
<author>
<name sortKey="Kunter, Gero" sort="Kunter, Gero" uniqKey="Kunter G" first="Gero" last="Kunter">Gero Kunter</name>
</author>
<author>
<name sortKey="Lappe, Sabine" sort="Lappe, Sabine" uniqKey="Lappe S" first="Sabine" last="Lappe">Sabine Lappe</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:519348FA143EF197827D6C5F8553F9626EBC142F</idno>
<date when="2007" year="2007">2007</date>
<idno type="doi">10.1515/CLLT.2007.012</idno>
<idno type="url">https://api.istex.fr/document/519348FA143EF197827D6C5F8553F9626EBC142F/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000858</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">000858</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Testing hypotheses about compound stress assignment in English: a corpus-based investigation</title>
<author>
<name sortKey="Plag, Ingo" sort="Plag, Ingo" uniqKey="Plag I" first="Ingo" last="Plag">Ingo Plag</name>
</author>
<author>
<name sortKey="Kunter, Gero" sort="Kunter, Gero" uniqKey="Kunter G" first="Gero" last="Kunter">Gero Kunter</name>
</author>
<author>
<name sortKey="Lappe, Sabine" sort="Lappe, Sabine" uniqKey="Lappe S" first="Sabine" last="Lappe">Sabine Lappe</name>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Corpus Linguistics and Linguistic Theory</title>
<title level="j" type="abbrev">Corpus Linguistics and Linguistic Theory</title>
<idno type="ISSN">1613-7027</idno>
<idno type="eISSN">1613-7035</idno>
<imprint>
<publisher>Walter de Gruyter</publisher>
<date type="published" when="2007-12-11">2007-12-11</date>
<biblScope unit="volume">3</biblScope>
<biblScope unit="issue">2</biblScope>
<biblScope unit="page" from="199">199</biblScope>
<biblScope unit="page" to="232">232</biblScope>
</imprint>
<idno type="ISSN">1613-7027</idno>
</series>
<idno type="istex">519348FA143EF197827D6C5F8553F9626EBC142F</idno>
<idno type="DOI">10.1515/CLLT.2007.012</idno>
<idno type="ArticleID">cllt.3.2.199</idno>
<idno type="pdf">cllt.2007.012.pdf</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">1613-7027</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="Teeft" xml:lang="en">
<term>Agentive suffix</term>
<term>Algorithm</term>
<term>Analogical</term>
<term>Analogical algorithm</term>
<term>Analogical effects</term>
<term>Analogical factors</term>
<term>Analogical hypothesis</term>
<term>Analogical model</term>
<term>Analogical modeling</term>
<term>Analogical models</term>
<term>Argstruct morphright</term>
<term>Argument head compounds show</term>
<term>Argument structure</term>
<term>Argument structure effect</term>
<term>Argumenthead compounds</term>
<term>Assistant professor</term>
<term>Authorship relation</term>
<term>Baayen</term>
<term>Boston university radio speech corpus</term>
<term>Cambridge university press</term>
<term>Categorical rules</term>
<term>Celex</term>
<term>Celex compilers</term>
<term>Celex database</term>
<term>Celex frequencies</term>
<term>Celex frequency</term>
<term>Cobuild</term>
<term>Cobuild corpus</term>
<term>Compound</term>
<term>Compound constituents</term>
<term>Compound interpretation</term>
<term>Compound semantics</term>
<term>Compound stress</term>
<term>Compound stress assignment</term>
<term>Compound stress rule</term>
<term>Compound stress variability</term>
<term>Constituent</term>
<term>Constituent families</term>
<term>Constituent family</term>
<term>Constituent family information</term>
<term>Continuous letter strings</term>
<term>Copulative compounds</term>
<term>Corpus data</term>
<term>Correct predictions</term>
<term>Dance hall</term>
<term>Data computationally</term>
<term>Data points</term>
<term>Database</term>
<term>Deriv model</term>
<term>Dictionary data</term>
<term>Distance space</term>
<term>Dutch compounds</term>
<term>English compound stress</term>
<term>English language</term>
<term>English linguistics</term>
<term>Exemplar</term>
<term>Experimental psychology</term>
<term>Feature values</term>
<term>Fifth avenue</term>
<term>Final model</term>
<term>First element</term>
<term>Frequency information</term>
<term>Gagne</term>
<term>Gero kunter</term>
<term>Giegerich</term>
<term>Google</term>
<term>Google frequencies</term>
<term>Harald baayen</term>
<term>Head morphology</term>
<term>Higher frequency</term>
<term>Higher proportion</term>
<term>Hyphenated compounds</term>
<term>Hypothesis accuracy</term>
<term>Important factor</term>
<term>Ingo plag</term>
<term>John benjamins</term>
<term>Krott</term>
<term>Kunter</term>
<term>Language processing</term>
<term>Lappe</term>
<term>Lappe figure</term>
<term>Large number</term>
<term>Latest version</term>
<term>Leftward</term>
<term>Leftward stress</term>
<term>Less lexicalized compounds</term>
<term>Lexicalization</term>
<term>Lexicalization effect</term>
<term>Lexicalized</term>
<term>Lexicalized compounds</term>
<term>Lexicon</term>
<term>Liberman</term>
<term>Linguistic data consortium</term>
<term>Logistic</term>
<term>Logistic regression</term>
<term>Logistic regression analysis</term>
<term>Logistic regression model</term>
<term>Madison avenue</term>
<term>Many compounds</term>
<term>Mental lexicon</term>
<term>Modeling</term>
<term>Modifierhead compounds</term>
<term>More detail</term>
<term>Morpheme</term>
<term>Morphology</term>
<term>Morphright</term>
<term>Music hall</term>
<term>Nearest neighbors</term>
<term>Noun</term>
<term>Noun phrases</term>
<term>Novel compounds</term>
<term>Orthographic words</term>
<term>Other compounds</term>
<term>Other features</term>
<term>Other languages</term>
<term>Other words</term>
<term>Overall accuracy</term>
<term>Pertinent compounds</term>
<term>Pertinent examples</term>
<term>Plag</term>
<term>Predictive accuracy</term>
<term>Predictor</term>
<term>Present authors</term>
<term>Present study</term>
<term>Probabilistic</term>
<term>Proper noun</term>
<term>Raters</term>
<term>Regression</term>
<term>Regression analysis</term>
<term>Regression model</term>
<term>Right constituent</term>
<term>Right constituents</term>
<term>Right predictions</term>
<term>Right stress</term>
<term>Right stresses</term>
<term>Rightward</term>
<term>Rightward stress</term>
<term>Rightward stress assignment</term>
<term>Rightward stresses</term>
<term>Robert schreuder</term>
<term>Sabine lappe</term>
<term>Same data</term>
<term>Semantic</term>
<term>Semantic categories</term>
<term>Semantic entities</term>
<term>Semantic features</term>
<term>Semantic hypotheses</term>
<term>Semantic hypothesis</term>
<term>Semantic relation</term>
<term>Semantic relations</term>
<term>Significant influence</term>
<term>Significant predictors</term>
<term>Spelling</term>
<term>Sproat</term>
<term>Stress assignment</term>
<term>Stress pattern</term>
<term>Stress patterns</term>
<term>Stress position</term>
<term>Structural hypothesis</term>
<term>Subset</term>
<term>Synthetic compounds</term>
<term>Test item</term>
<term>Timbl</term>
<term>Timbl analysis</term>
<term>Traditional claims</term>
<term>Truck driver</term>
<term>Usual model simplification process</term>
<term>Variability</term>
<term>Variable compound behavior</term>
<term>Vast majority</term>
<term>Worm hole</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">This paper tests three factors that have been held to be responsible for the variable stress behavior of noun-noun constructs in English: argument structure, semantics, and analogy. In a large-scale investigation of some 4500 compounds extracted from the CELEX lexical database (Baayen et al. 1995), we show that traditional claims about noun-noun stress cannot be upheld. Argument structure plays a role only with synthetic compounds ending in the agentive suffix -er. The semantic categories and relations assumed in the literature to trigger rightward stress do not show the expected effects. As an alternative to the rule-based approaches, the data were modeled computationally and probabilistically using a memory-based analogical algorithm (TiMBL 5.1) and logistic regression, respectively. It turns out that probabilistic models and the analogical algorithm are more successful in predicting stress assignment correctly than any of the rules proposed in the literature. Furthermore, the results of the analogical modeling suggest that the left and right constituent are the most important factor in compound stress assignment. This is in line with recent findings on the semi-regular behavior of compounds in other languages.</div>
</front>
</TEI>
<istex>
<corpusName>degruyter-journals</corpusName>
<keywords>
<teeft>
<json:string>celex</json:string>
<json:string>plag</json:string>
<json:string>analogical</json:string>
<json:string>compound stress</json:string>
<json:string>lappe</json:string>
<json:string>kunter</json:string>
<json:string>lexicalization</json:string>
<json:string>rightward</json:string>
<json:string>structural hypothesis</json:string>
<json:string>timbl</json:string>
<json:string>predictor</json:string>
<json:string>rightward stress</json:string>
<json:string>leftward</json:string>
<json:string>right constituent</json:string>
<json:string>argument structure</json:string>
<json:string>leftward stress</json:string>
<json:string>semantic hypothesis</json:string>
<json:string>stress assignment</json:string>
<json:string>semantic categories</json:string>
<json:string>right stresses</json:string>
<json:string>stress position</json:string>
<json:string>database</json:string>
<json:string>google</json:string>
<json:string>krott</json:string>
<json:string>cobuild</json:string>
<json:string>compound</json:string>
<json:string>nearest neighbors</json:string>
<json:string>giegerich</json:string>
<json:string>morpheme</json:string>
<json:string>lexicalized</json:string>
<json:string>liberman</json:string>
<json:string>constituent family</json:string>
<json:string>subset</json:string>
<json:string>sproat</json:string>
<json:string>cobuild corpus</json:string>
<json:string>gagne</json:string>
<json:string>baayen</json:string>
<json:string>logistic regression analysis</json:string>
<json:string>semantic relations</json:string>
<json:string>raters</json:string>
<json:string>morphright</json:string>
<json:string>algorithm</json:string>
<json:string>logistic regression model</json:string>
<json:string>regression model</json:string>
<json:string>stress pattern</json:string>
<json:string>lexicalization effect</json:string>
<json:string>compound stress assignment</json:string>
<json:string>english linguistics</json:string>
<json:string>constituent</json:string>
<json:string>noun</json:string>
<json:string>right stress</json:string>
<json:string>semantic relation</json:string>
<json:string>analogical algorithm</json:string>
<json:string>harald baayen</json:string>
<json:string>celex frequencies</json:string>
<json:string>google frequencies</json:string>
<json:string>semantic</json:string>
<json:string>morphology</json:string>
<json:string>logistic</json:string>
<json:string>lexicalized compounds</json:string>
<json:string>test item</json:string>
<json:string>english language</json:string>
<json:string>proper noun</json:string>
<json:string>music hall</json:string>
<json:string>analogical model</json:string>
<json:string>analogical effects</json:string>
<json:string>analogical models</json:string>
<json:string>mental lexicon</json:string>
<json:string>deriv model</json:string>
<json:string>boston university radio speech corpus</json:string>
<json:string>stress patterns</json:string>
<json:string>compound stress rule</json:string>
<json:string>dutch compounds</json:string>
<json:string>analogical modeling</json:string>
<json:string>modeling</json:string>
<json:string>probabilistic</json:string>
<json:string>analogical hypothesis</json:string>
<json:string>noun phrases</json:string>
<json:string>argstruct morphright</json:string>
<json:string>argument structure effect</json:string>
<json:string>cambridge university press</json:string>
<json:string>overall accuracy</json:string>
<json:string>more detail</json:string>
<json:string>celex frequency</json:string>
<json:string>celex database</json:string>
<json:string>semantic hypotheses</json:string>
<json:string>compound constituents</json:string>
<json:string>categorical rules</json:string>
<json:string>english compound stress</json:string>
<json:string>logistic regression</json:string>
<json:string>right constituents</json:string>
<json:string>copulative compounds</json:string>
<json:string>sabine lappe</json:string>
<json:string>constituent families</json:string>
<json:string>gero kunter</json:string>
<json:string>many compounds</json:string>
<json:string>authorship relation</json:string>
<json:string>feature values</json:string>
<json:string>regression</json:string>
<json:string>lexicon</json:string>
<json:string>variability</json:string>
<json:string>exemplar</json:string>
<json:string>spelling</json:string>
<json:string>correct predictions</json:string>
<json:string>continuous letter strings</json:string>
<json:string>ingo plag</json:string>
<json:string>modifierhead compounds</json:string>
<json:string>right predictions</json:string>
<json:string>predictive accuracy</json:string>
<json:string>rightward stresses</json:string>
<json:string>argumenthead compounds</json:string>
<json:string>important factor</json:string>
<json:string>head morphology</json:string>
<json:string>lappe figure</json:string>
<json:string>fifth avenue</json:string>
<json:string>other languages</json:string>
<json:string>madison avenue</json:string>
<json:string>traditional claims</json:string>
<json:string>language processing</json:string>
<json:string>synthetic compounds</json:string>
<json:string>variable compound behavior</json:string>
<json:string>hypothesis accuracy</json:string>
<json:string>large number</json:string>
<json:string>analogical factors</json:string>
<json:string>first element</json:string>
<json:string>argument head compounds show</json:string>
<json:string>data points</json:string>
<json:string>truck driver</json:string>
<json:string>pertinent examples</json:string>
<json:string>rightward stress assignment</json:string>
<json:string>compound stress variability</json:string>
<json:string>semantic entities</json:string>
<json:string>vast majority</json:string>
<json:string>higher frequency</json:string>
<json:string>worm hole</json:string>
<json:string>compound interpretation</json:string>
<json:string>compound semantics</json:string>
<json:string>data computationally</json:string>
<json:string>other words</json:string>
<json:string>usual model simplification process</json:string>
<json:string>significant predictors</json:string>
<json:string>final model</json:string>
<json:string>pertinent compounds</json:string>
<json:string>less lexicalized compounds</json:string>
<json:string>regression analysis</json:string>
<json:string>semantic features</json:string>
<json:string>other compounds</json:string>
<json:string>dance hall</json:string>
<json:string>present study</json:string>
<json:string>agentive suffix</json:string>
<json:string>dictionary data</json:string>
<json:string>corpus data</json:string>
<json:string>distance space</json:string>
<json:string>higher proportion</json:string>
<json:string>other features</json:string>
<json:string>frequency information</json:string>
<json:string>constituent family information</json:string>
<json:string>timbl analysis</json:string>
<json:string>same data</json:string>
<json:string>latest version</json:string>
<json:string>significant influence</json:string>
<json:string>assistant professor</json:string>
<json:string>hyphenated compounds</json:string>
<json:string>linguistic data consortium</json:string>
<json:string>celex compilers</json:string>
<json:string>present authors</json:string>
<json:string>orthographic words</json:string>
<json:string>experimental psychology</json:string>
<json:string>robert schreuder</json:string>
<json:string>john benjamins</json:string>
<json:string>novel compounds</json:string>
</teeft>
</keywords>
<author>
<json:item>
<name>Ingo Plag</name>
</json:item>
<json:item>
<name>Gero Kunter</name>
</json:item>
<json:item>
<name>Sabine Lappe</name>
</json:item>
</author>
<subject>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>compound</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>stress assignment</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>analogy</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>constituent family</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>CELEX</value>
</json:item>
</subject>
<articleId>
<json:string>cllt.3.2.199</json:string>
</articleId>
<language>
<json:string>eng</json:string>
</language>
<originalGenre>
<json:string>research-article</json:string>
</originalGenre>
<abstract>This paper tests three factors that have been held to be responsible for the variable stress behavior of noun-noun constructs in English: argument structure, semantics, and analogy. In a large-scale investigation of some 4500 compounds extracted from the CELEX lexical database (Baayen et al. 1995), we show that traditional claims about noun-noun stress cannot be upheld. Argument structure plays a role only with synthetic compounds ending in the agentive suffix -er. The semantic categories and relations assumed in the literature to trigger rightward stress do not show the expected effects. As an alternative to the rule-based approaches, the data were modeled computationally and probabilistically using a memory-based analogical algorithm (TiMBL 5.1) and logistic regression, respectively. It turns out that probabilistic models and the analogical algorithm are more successful in predicting stress assignment correctly than any of the rules proposed in the literature. Furthermore, the results of the analogical modeling suggest that the left and right constituent are the most important factor in compound stress assignment. This is in line with recent findings on the semi-regular behavior of compounds in other languages.</abstract>
<qualityIndicators>
<score>9.172</score>
<pdfVersion>1.3</pdfVersion>
<pdfPageSize>419.528 x 637.795 pts</pdfPageSize>
<refBibsNative>false</refBibsNative>
<keywordCount>5</keywordCount>
<abstractCharCount>1232</abstractCharCount>
<pdfWordCount>10664</pdfWordCount>
<pdfCharCount>65588</pdfCharCount>
<pdfPageCount>34</pdfPageCount>
<abstractWordCount>181</abstractWordCount>
</qualityIndicators>
<title>Testing hypotheses about compound stress assignment in English: a corpus-based investigation</title>
<genre>
<json:string>research-article</json:string>
</genre>
<host>
<title>Corpus Linguistics and Linguistic Theory</title>
<language>
<json:string>unknown</json:string>
</language>
<issn>
<json:string>1613-7027</json:string>
</issn>
<eissn>
<json:string>1613-7035</json:string>
</eissn>
<publisherId>
<json:string>cllt</json:string>
</publisherId>
<volume>3</volume>
<issue>2</issue>
<pages>
<first>199</first>
<last>232</last>
</pages>
<genre>
<json:string>journal</json:string>
</genre>
</host>
<categories>
<wos>
<json:string>social science</json:string>
<json:string>linguistics</json:string>
</wos>
<scienceMetrix>
<json:string>arts &amp; humanities</json:string>
<json:string>communication &amp; textual studies</json:string>
<json:string>languages &amp; linguistics</json:string>
</scienceMetrix>
<inist>
<json:string>sciences humaines et sociales</json:string>
</inist>
</categories>
<publicationDate>2007</publicationDate>
<copyrightDate>2007</copyrightDate>
<doi>
<json:string>10.1515/CLLT.2007.012</json:string>
</doi>
<id>519348FA143EF197827D6C5F8553F9626EBC142F</id>
<score>1</score>
<fulltext>
<json:item>
<extension>pdf</extension>
<original>true</original>
<mimetype>application/pdf</mimetype>
<uri>https://api.istex.fr/document/519348FA143EF197827D6C5F8553F9626EBC142F/fulltext/pdf</uri>
</json:item>
<json:item>
<extension>zip</extension>
<original>false</original>
<mimetype>application/zip</mimetype>
<uri>https://api.istex.fr/document/519348FA143EF197827D6C5F8553F9626EBC142F/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/519348FA143EF197827D6C5F8553F9626EBC142F/fulltext/tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a" type="main" xml:lang="en">Testing hypotheses about compound stress assignment in English: a corpus-based investigation</title>
<respStmt>
<resp>Bibliographic references retrieved via GROBID</resp>
<name resp="ISTEX-API">ISTEX-API (INIST-CNRS)</name>
</respStmt>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher>Walter de Gruyter</publisher>
<availability>
<p>© Walter de Gruyter, 2007</p>
</availability>
<date>2007-12-18</date>
</publicationStmt>
<sourceDesc>
<biblStruct type="inbook">
<analytic>
<title level="a" type="main" xml:lang="en">Testing hypotheses about compound stress assignment in English: a corpus-based investigation</title>
<author xml:id="author-1">
<persName>
<forename type="first">Ingo</forename>
<surname>Plag</surname>
</persName>
</author>
<author xml:id="author-2">
<persName>
<forename type="first">Gero</forename>
<surname>Kunter</surname>
</persName>
</author>
<author xml:id="author-3">
<persName>
<forename type="first">Sabine</forename>
<surname>Lappe</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Corpus Linguistics and Linguistic Theory</title>
<title level="j" type="abbrev">Corpus Linguistics and Linguistic Theory</title>
<idno type="pISSN">1613-7027</idno>
<idno type="eISSN">1613-7035</idno>
<imprint>
<publisher>Walter de Gruyter</publisher>
<date type="published" when="2007-12-11"></date>
<biblScope unit="volume">3</biblScope>
<biblScope unit="issue">2</biblScope>
<biblScope unit="page" from="199">199</biblScope>
<biblScope unit="page" to="232">232</biblScope>
</imprint>
</monogr>
<idno type="istex">519348FA143EF197827D6C5F8553F9626EBC142F</idno>
<idno type="DOI">10.1515/CLLT.2007.012</idno>
<idno type="ArticleID">cllt.3.2.199</idno>
<idno type="pdf">cllt.2007.012.pdf</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<creation>
<date>2007-12-18</date>
</creation>
<langUsage>
<language ident="en">en</language>
</langUsage>
<abstract xml:lang="en">
<p>This paper tests three factors that have been held to be responsible for the variable stress behavior of noun-noun constructs in English: argument structure, semantics, and analogy. In a large-scale investigation of some 4500 compounds extracted from the CELEX lexical database (Baayen et al. 1995), we show that traditional claims about noun-noun stress cannot be upheld. Argument structure plays a role only with synthetic compounds ending in the agentive suffix -er. The semantic categories and relations assumed in the literature to trigger rightward stress do not show the expected effects. As an alternative to the rule-based approaches, the data were modeled computationally and probabilistically using a memory-based analogical algorithm (TiMBL 5.1) and logistic regression, respectively. It turns out that probabilistic models and the analogical algorithm are more successful in predicting stress assignment correctly than any of the rules proposed in the literature. Furthermore, the results of the analogical modeling suggest that the left and right constituent are the most important factor in compound stress assignment. This is in line with recent findings on the semi-regular behavior of compounds in other languages.</p>
</abstract>
<textClass>
<keywords scheme="keyword">
<list>
<head>Keywords</head>
<item>
<term>compound</term>
</item>
<item>
<term>stress assignment</term>
</item>
<item>
<term>analogy</term>
</item>
<item>
<term>constituent family</term>
</item>
<item>
<term>CELEX</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc>
<change when="2007-12-18">Created</change>
<change when="2007-12-11">Published</change>
<change xml:id="refBibs-istex" who="#ISTEX-API" when="2017-01-17">References added</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<extension>txt</extension>
<original>false</original>
<mimetype>text/plain</mimetype>
<uri>https://api.istex.fr/document/519348FA143EF197827D6C5F8553F9626EBC142F/fulltext/txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="corpus degruyter-journals" wicri:toSee="no header">
<istex:xmlDeclaration>version="1.0" encoding="UTF-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//Atypon//DTD Atypon Systems Archival NLM DTD Suite v2.2.0 20090301//EN" URI="nlm-dtd/archivearticle.dtd" name="istex:docType"></istex:docType>
<istex:document>
<article article-type="research-article" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">cllt</journal-id>
<abbrev-journal-title abbrev-type="full">Corpus Linguistics and Linguistic Theory</abbrev-journal-title>
<issn pub-type="ppub">1613-7027</issn>
<issn pub-type="epub">1613-7035</issn>
<publisher>
<publisher-name>Walter de Gruyter</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">cllt.3.2.199</article-id>
<article-id pub-id-type="doi">10.1515/CLLT.2007.012</article-id>
<title-group>
<article-title>Testing hypotheses about compound stress assignment in English: a corpus-based investigation</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<given-names>Ingo</given-names>
<x> </x>
<surname>Plag</surname>
<x>, </x>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<given-names>Gero</given-names>
<x> </x>
<surname>Kunter</surname>
<x>, </x>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<given-names>Sabine</given-names>
<x> </x>
<surname>Lappe</surname>
</name>
</contrib>
</contrib-group>
<pub-date pub-type="ppub">
<day>11</day>
<month>12</month>
<year>2007</year>
<string-date>December 2007</string-date>
</pub-date>
<pub-date pub-type="epub">
<day>18</day>
<month>12</month>
<year>2007</year>
</pub-date>
<volume>3</volume>
<issue>2</issue>
<fpage>199</fpage>
<lpage>232</lpage>
<copyright-statement>© Walter de Gruyter</copyright-statement>
<copyright-year>2007</copyright-year>
<related-article related-article-type="pdf" xlink:href="cllt.2007.012.pdf"></related-article>
<abstract>
<title>Abstract</title>
<p>This paper tests three factors that have been held to be responsible for the variable stress behavior of noun-noun constructs in English: argument structure, semantics, and analogy. In a large-scale investigation of some 4500 compounds extracted from the CELEX lexical database (Baayen et al. 1995), we show that traditional claims about noun-noun stress cannot be upheld. Argument structure plays a role only with synthetic compounds ending in the agentive suffix -<italic>er</italic>. The semantic categories and relations assumed in the literature to trigger rightward stress do not show the expected effects. As an alternative to the rule-based approaches, the data were modeled computationally and probabilistically using a memory-based analogical algorithm (TiMBL 5.1) and logistic regression, respectively. It turns out that probabilistic models and the analogical algorithm are more successful in predicting stress assignment correctly than any of the rules proposed in the literature. Furthermore, the results of the analogical modeling suggest that the left and right constituent are the most important factor in compound stress assignment. This is in line with recent findings on the semi-regular behavior of compounds in other languages.</p>
</abstract>
<kwd-group>
<title>Keywords</title>
<kwd>compound</kwd>
<x>; </x>
<kwd>stress assignment</kwd>
<x>; </x>
<kwd>analogy</kwd>
<x>; </x>
<kwd>constituent family</kwd>
<x>; </x>
<kwd>CELEX</kwd>
<x>. </x>
</kwd-group>
</article-meta>
</front>
</article>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo lang="en">
<title>Testing hypotheses about compound stress assignment in English: a corpus-based investigation</title>
</titleInfo>
<titleInfo type="alternative" lang="en" contentType="CDATA">
<title>Testing hypotheses about compound stress assignment in English: a corpus-based investigation</title>
</titleInfo>
<name type="personal">
<namePart type="given">Ingo</namePart>
<namePart type="family">Plag</namePart>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gero</namePart>
<namePart type="family">Kunter</namePart>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sabine</namePart>
<namePart type="family">Lappe</namePart>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="research-article" displayLabel="research-article"></genre>
<originInfo>
<publisher>Walter de Gruyter</publisher>
<dateIssued encoding="w3cdtf">2007-12-11</dateIssued>
<dateCreated encoding="w3cdtf">2007-12-18</dateCreated>
<copyrightDate encoding="w3cdtf">2007</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
</language>
<physicalDescription>
<internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract lang="en">This paper tests three factors that have been held to be responsible for the variable stress behavior of noun-noun constructs in English: argument structure, semantics, and analogy. In a large-scale investigation of some 4500 compounds extracted from the CELEX lexical database (Baayen et al. 1995), we show that traditional claims about noun-noun stress cannot be upheld. Argument structure plays a role only with synthetic compounds ending in the agentive suffix -er. The semantic categories and relations assumed in the literature to trigger rightward stress do not show the expected effects. As an alternative to the rule-based approaches, the data were modeled computationally and probabilistically using a memory-based analogical algorithm (TiMBL 5.1) and logistic regression, respectively. It turns out that probabilistic models and the analogical algorithm are more successful in predicting stress assignment correctly than any of the rules proposed in the literature. Furthermore, the results of the analogical modeling suggest that the left and right constituent are the most important factor in compound stress assignment. This is in line with recent findings on the semi-regular behavior of compounds in other languages.</abstract>
<subject>
<genre>Keywords</genre>
<topic>compound</topic>
<topic>stress assignment</topic>
<topic>analogy</topic>
<topic>constituent family</topic>
<topic>CELEX</topic>
</subject>
<relatedItem type="host">
<titleInfo>
<title>Corpus Linguistics and Linguistic Theory</title>
</titleInfo>
<titleInfo type="abbreviated">
<title>Corpus Linguistics and Linguistic Theory</title>
</titleInfo>
<genre type="journal">journal</genre>
<identifier type="ISSN">1613-7027</identifier>
<identifier type="eISSN">1613-7035</identifier>
<identifier type="PublisherID">cllt</identifier>
<part>
<date>2007</date>
<detail type="volume">
<caption>vol.</caption>
<number>3</number>
</detail>
<detail type="issue">
<caption>no.</caption>
<number>2</number>
</detail>
<extent unit="pages">
<start>199</start>
<end>232</end>
</extent>
</part>
</relatedItem>
<identifier type="istex">519348FA143EF197827D6C5F8553F9626EBC142F</identifier>
<identifier type="DOI">10.1515/CLLT.2007.012</identifier>
<identifier type="ArticleID">cllt.3.2.199</identifier>
<identifier type="pdf">cllt.2007.012.pdf</identifier>
<accessCondition type="use and reproduction" contentType="copyright">© Walter de Gruyter, 2007</accessCondition>
<recordInfo>
<recordContentSource>De Gruyter</recordContentSource>
</recordInfo>
</mods>
</metadata>
<serie></serie>
</istex>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Sarre/explor/MusicSarreV3/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000858 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 000858 | SxmlIndent | more
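
Once the record has been printed by one of the commands above, a few fields can be pulled out of it with a short script. The following is a minimal sketch, assuming the command output has the same shape as the record displayed on this page. The record as shown uses prefixes such as `wicri:` and `json:` without namespace declarations, so a strict XML parser may reject it; the sketch therefore falls back on regular expressions. The function name `extract_fields` and the choice of fields are illustrative assumptions, not part of Dilib.

```python
import re


def extract_fields(record_text):
    """Pull title, DOI and authors out of the record text with regular expressions."""
    title = re.search(r'<title xml:lang="en">(.*?)</title>', record_text, re.S)
    doi = re.search(r'<idno type="doi">(.*?)</idno>', record_text)
    # <name sortKey=...> elements occur in both titleStmt and analytic,
    # so duplicates are dropped while the original order is kept.
    names = re.findall(r'<name sortKey="[^"]*"[^>]*>(.*?)</name>', record_text)
    authors = list(dict.fromkeys(names))
    return {
        "title": title.group(1) if title else None,
        "doi": doi.group(1) if doi else None,
        "authors": authors,
    }
```

A script that calls `extract_fields(sys.stdin.read())` could be placed at the end of the pipeline in place of `more`; fields that are not found come back as `None` or as an empty list rather than raising an error.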

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Sarre
   |area=    MusicSarreV3
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:519348FA143EF197827D6C5F8553F9626EBC142F
   |texte=   Testing hypotheses about compound stress assignment in English: a corpus-based investigation
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Sun Jul 15 18:16:09 2018. Site generation: Tue Mar 5 19:21:25 2024