Learning the Logic of Simple Phonotactics
Internal identifier: 001D08 (Istex/Curation); previous: 001D07; next: 001D09
Authors: F. Tjong Kim Sang [Belgium]; John Nerbonne [Netherlands]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ]; 2000.
Abstract
We report on experiments which demonstrate that by abductive inference it is possible to learn enough simple phonotactics to distinguish words from non-words for a simplified set of Dutch, the monosyllables. The monosyllables are distinguished in input so that segmentation is not problematic. Frequency information is withheld as is negative data. The methods are all tested using ten-fold cross-validation as well as a fixed number of randomly generated strings. Orthographic and phonetic representations are compared. The work presented in this chapter is part of a larger project comparing different machine learning techniques on linguistic data.
DOI: 10.1007/3-540-40030-3_7
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 001E31
Links to Exploration step
ISTEX:70C7D08C9A5C9D7C6CC886ADF30144BD359087A2
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct:series"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Learning the Logic of Simple Phonotactics</title>
<author><name sortKey="Tjong Kim Sang, F" sort="Tjong Kim Sang, F" uniqKey="Tjong Kim Sang F" first="F." last="Tjong Kim Sang">F. Tjong Kim Sang</name>
<affiliation wicri:level="1"><mods:affiliation>CNTS - Language Technology Group, University of Antwerp, Belgium</mods:affiliation>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>CNTS - Language Technology Group, University of Antwerp</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: erikt@uia.ua.ac.be</mods:affiliation>
<country wicri:rule="url">Belgique</country>
</affiliation>
</author>
<author><name sortKey="Nerbonne, John" sort="Nerbonne, John" uniqKey="Nerbonne J" first="John" last="Nerbonne">John Nerbonne</name>
<affiliation wicri:level="1"><mods:affiliation>Alfa-informatica, BCN, University of Groningen, The Netherlands</mods:affiliation>
<country xml:lang="fr">Pays-Bas</country>
<wicri:regionArea>Alfa-informatica, BCN, University of Groningen</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: nerbonne@let.rug.nl</mods:affiliation>
<country wicri:rule="url">Pays-Bas</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:70C7D08C9A5C9D7C6CC886ADF30144BD359087A2</idno>
<date when="2000" year="2000">2000</date>
<idno type="doi">10.1007/3-540-40030-3_7</idno>
<idno type="url">https://api.istex.fr/document/70C7D08C9A5C9D7C6CC886ADF30144BD359087A2/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001E31</idno>
<idno type="wicri:Area/Istex/Curation">001D08</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Learning the Logic of Simple Phonotactics</title>
<author><name sortKey="Tjong Kim Sang, F" sort="Tjong Kim Sang, F" uniqKey="Tjong Kim Sang F" first="F." last="Tjong Kim Sang">F. Tjong Kim Sang</name>
<affiliation wicri:level="1"><mods:affiliation>CNTS - Language Technology Group, University of Antwerp, Belgium</mods:affiliation>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>CNTS - Language Technology Group, University of Antwerp</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: erikt@uia.ua.ac.be</mods:affiliation>
<country wicri:rule="url">Belgique</country>
</affiliation>
</author>
<author><name sortKey="Nerbonne, John" sort="Nerbonne, John" uniqKey="Nerbonne J" first="John" last="Nerbonne">John Nerbonne</name>
<affiliation wicri:level="1"><mods:affiliation>Alfa-informatica, BCN, University of Groningen, The Netherlands</mods:affiliation>
<country xml:lang="fr">Pays-Bas</country>
<wicri:regionArea>Alfa-informatica, BCN, University of Groningen</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: nerbonne@let.rug.nl</mods:affiliation>
<country wicri:rule="url">Pays-Bas</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2000</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">70C7D08C9A5C9D7C6CC886ADF30144BD359087A2</idno>
<idno type="DOI">10.1007/3-540-40030-3_7</idno>
<idno type="ChapterID">7</idno>
<idno type="ChapterID">Chap7</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: We report on experiments which demonstrate that by abductive inference it is possible to learn enough simple phonotactics to distinguish words from non-words for a simplified set of Dutch, the monosyllables. The monosyllables are distinguished in input so that segmentation is not problematic. Frequency information is withheld as is negative data. The methods are all tested using ten-fold cross-validation as well as a fixed number of randomly generated strings. Orthographic and phonetic representations are compared. The work presented in this chapter is part of a larger project comparing different machine learning techniques on linguistic data.</div>
</front>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001D08 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Istex/Curation/biblio.hfd -nk 001D08 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Istex |étape= Curation |type= RBID |clé= ISTEX:70C7D08C9A5C9D7C6CC886ADF30144BD359087A2 |texte= Learning the Logic of Simple Phonotactics }}
This area was generated with Dilib version V0.6.32.