TEI exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

The construction of a corpus of spoken Sylheti

Internal identifier: 000164 (Istex/Corpus); previous: 000163; next: 000165

Authors: P. Baker; M. Lie; T. McEnery; M. Sebba

Source: Literary and Linguistic Computing, vol. 15, no. 4 (2000-12), pp. 421-432

RBID : ISTEX:119A659D89E449E179924ECE6FEC847B19693758

Abstract

This paper describes the construction of a corpus of spoken Sylheti. The corpus was created to examine difficulties in the creation of spoken language corpora in which features such as code switching (simply described here as the process of switching from one language to another during the course of an interaction; however, this description disguises a host of situations, which will be examined in the paper) are common. The paper also presents a transliteration scheme for Sylheti based around the Roman alphabet.

Url:
DOI: 10.1093/llc/15.4.421

Links to Exploration step

ISTEX:119A659D89E449E179924ECE6FEC847B19693758

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">The construction of a corpus of spoken Sylheti</title>
<author>
<name sortKey="Baker, P" sort="Baker, P" uniqKey="Baker P" first="P" last="Baker">P. Baker</name>
<affiliation>
<mods:affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Lie, M" sort="Lie, M" uniqKey="Lie M" first="M" last="Lie">M. Lie</name>
<affiliation>
<mods:affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Mcenery, T" sort="Mcenery, T" uniqKey="Mcenery T" first="T" last="Mcenery">T. Mcenery</name>
<affiliation>
<mods:affiliation>Corresponding author E-mail: mcenery@comp.lancs.ac.uk</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Sebba, M" sort="Sebba, M" uniqKey="Sebba M" first="M" last="Sebba">M. Sebba</name>
<affiliation>
<mods:affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:119A659D89E449E179924ECE6FEC847B19693758</idno>
<date when="2000" year="2000">2000</date>
<idno type="doi">10.1093/llc/15.4.421</idno>
<idno type="url">https://api.istex.fr/document/119A659D89E449E179924ECE6FEC847B19693758/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000164</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">The construction of a corpus of spoken Sylheti</title>
<author>
<name sortKey="Baker, P" sort="Baker, P" uniqKey="Baker P" first="P" last="Baker">P. Baker</name>
<affiliation>
<mods:affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Lie, M" sort="Lie, M" uniqKey="Lie M" first="M" last="Lie">M. Lie</name>
<affiliation>
<mods:affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Mcenery, T" sort="Mcenery, T" uniqKey="Mcenery T" first="T" last="Mcenery">T. Mcenery</name>
<affiliation>
<mods:affiliation>Corresponding author E-mail: mcenery@comp.lancs.ac.uk</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Sebba, M" sort="Sebba, M" uniqKey="Sebba M" first="M" last="Sebba">M. Sebba</name>
<affiliation>
<mods:affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Literary and Linguistic Computing</title>
<title level="j" type="abbrev">Lit Linguist Computing</title>
<idno type="ISSN">0268-1145</idno>
<idno type="eISSN">1477-4615</idno>
<imprint>
<publisher>Oxford University Press</publisher>
<date type="published" when="2000-12">2000-12</date>
<biblScope unit="volume">15</biblScope>
<biblScope unit="issue">4</biblScope>
<biblScope unit="page" from="421">421</biblScope>
<biblScope unit="page" to="432">432</biblScope>
</imprint>
<idno type="ISSN">0268-1145</idno>
</series>
<idno type="istex">119A659D89E449E179924ECE6FEC847B19693758</idno>
<idno type="DOI">10.1093/llc/15.4.421</idno>
<idno type="local">3</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0268-1145</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">This paper describes the construction of a corpus of spoken Sylheti. The corpus was created to examine difficulties in the creation of spoken language corpora in which features such as code switching (simply described here as the process of switching from one language to another during the course of an interaction; however, this description disguises a host of situations, which will be examined in the paper) are common. The paper also presents a transliteration scheme for Sylheti based around the Roman alphabet.</div>
</front>
</TEI>
<istex>
<corpusName>oup</corpusName>
<author>
<json:item>
<name>P Baker</name>
<affiliations>
<json:string>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</json:string>
</affiliations>
</json:item>
<json:item>
<name>M Lie</name>
<affiliations>
<json:string>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</json:string>
</affiliations>
</json:item>
<json:item>
<name>T McEnery</name>
<affiliations>
<json:string>Corresponding author E-mail: mcenery@comp.lancs.ac.uk</json:string>
<json:string>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</json:string>
</affiliations>
</json:item>
<json:item>
<name>M Sebba</name>
<affiliations>
<json:string>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</json:string>
</affiliations>
</json:item>
</author>
<language>
<json:string>eng</json:string>
</language>
<originalGenre>
<json:string>research-article</json:string>
</originalGenre>
<abstract>This paper describes the construction of a corpus of spoken Sylheti. The corpus was created to examine difficulties in the creation of spoken language corpora in which features such as code switching (simply described here as the process of switching from one language to another during the course of an interaction; however, this description disguises a host of situations, which will be examined in the paper) are common. The paper also presents a transliteration scheme for Sylheti based around the Roman alphabet.</abstract>
<qualityIndicators>
<score>5.484</score>
<pdfVersion>1.3</pdfVersion>
<pdfPageSize>521 x 704 pts</pdfPageSize>
<refBibsNative>false</refBibsNative>
<keywordCount>0</keywordCount>
<abstractCharCount>517</abstractCharCount>
<pdfWordCount>4500</pdfWordCount>
<pdfCharCount>26994</pdfCharCount>
<pdfPageCount>11</pdfPageCount>
<abstractWordCount>82</abstractWordCount>
</qualityIndicators>
<title>The construction of a corpus of spoken Sylheti</title>
<genre>
<json:string>research-article</json:string>
</genre>
<host>
<volume>15</volume>
<publisherId>
<json:string>litlin</json:string>
</publisherId>
<pages>
<last>432</last>
<first>421</first>
</pages>
<issn>
<json:string>0268-1145</json:string>
</issn>
<issue>4</issue>
<genre>
<json:string>journal</json:string>
</genre>
<language>
<json:string>unknown</json:string>
</language>
<eissn>
<json:string>1477-4615</json:string>
</eissn>
<title>Literary and Linguistic Computing</title>
</host>
<categories>
<wos>
<json:string>LINGUISTICS</json:string>
<json:string>LITERATURE</json:string>
</wos>
</categories>
<publicationDate>2000</publicationDate>
<copyrightDate>2000</copyrightDate>
<doi>
<json:string>10.1093/llc/15.4.421</json:string>
</doi>
<id>119A659D89E449E179924ECE6FEC847B19693758</id>
<score>0.24625446</score>
<fulltext>
<json:item>
<original>true</original>
<mimetype>application/pdf</mimetype>
<extension>pdf</extension>
<uri>https://api.istex.fr/document/119A659D89E449E179924ECE6FEC847B19693758/fulltext/pdf</uri>
</json:item>
<json:item>
<original>false</original>
<mimetype>application/zip</mimetype>
<extension>zip</extension>
<uri>https://api.istex.fr/document/119A659D89E449E179924ECE6FEC847B19693758/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/119A659D89E449E179924ECE6FEC847B19693758/fulltext/tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a" type="main" xml:lang="en">The construction of a corpus of spoken Sylheti</title>
<respStmt xml:id="ISTEX-API" resp="Références bibliographiques récupérées via GROBID" name="ISTEX-API (INIST-CNRS)"></respStmt>
<respStmt>
<resp>Références bibliographiques récupérées via GROBID</resp>
<name resp="ISTEX-API">ISTEX-API (INIST-CNRS)</name>
</respStmt>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher>Oxford University Press</publisher>
<availability>
<p>OUP</p>
</availability>
<date>2000</date>
</publicationStmt>
<sourceDesc>
<biblStruct type="inbook">
<analytic>
<title level="a" type="main" xml:lang="en">The construction of a corpus of spoken Sylheti</title>
<author>
<persName>
<forename type="first">P</forename>
<surname>Baker</surname>
</persName>
<affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</affiliation>
</author>
<author>
<persName>
<forename type="first">M</forename>
<surname>Lie</surname>
</persName>
<affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</affiliation>
</author>
<author>
<persName>
<forename type="first">T</forename>
<surname>McEnery</surname>
</persName>
<email>mcenery@comp.lancs.ac.uk</email>
<affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</affiliation>
</author>
<author>
<persName>
<forename type="first">M</forename>
<surname>Sebba</surname>
</persName>
<affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</affiliation>
</author>
</analytic>
<monogr>
<title level="j">Literary and Linguistic Computing</title>
<title level="j" type="abbrev">Lit Linguist Computing</title>
<idno type="pISSN">0268-1145</idno>
<idno type="eISSN">1477-4615</idno>
<imprint>
<publisher>Oxford University Press</publisher>
<date type="published" when="2000-12"></date>
<biblScope unit="volume">15</biblScope>
<biblScope unit="issue">4</biblScope>
<biblScope unit="page" from="421">421</biblScope>
<biblScope unit="page" to="432">432</biblScope>
</imprint>
</monogr>
<idno type="istex">119A659D89E449E179924ECE6FEC847B19693758</idno>
<idno type="DOI">10.1093/llc/15.4.421</idno>
<idno type="local">3</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<creation>
<date>2000</date>
</creation>
<langUsage>
<language ident="en">en</language>
</langUsage>
<abstract xml:lang="en">
<p>This paper describes the construction of a corpus of spoken Sylheti. The corpus was created to examine difficulties in the creation of spoken language corpora in which features such as code switching (simply described here as the process of switching from one language to another during the course of an interaction; however, this description disguises a host of situations, which will be examined in the paper) are common. The paper also presents a transliteration scheme for Sylheti based around the Roman alphabet.</p>
</abstract>
</profileDesc>
<revisionDesc>
<change when="2000-12">Published</change>
<change xml:id="refBibs-istex" who="#ISTEX-API" when="2016-3-15">References added</change>
<change xml:id="refBibs-istex" who="#ISTEX-API" when="2016-3-21">References added</change>
<change xml:id="refBibs-istex" who="#ISTEX-API" when="2016-07-27">References added</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<original>false</original>
<mimetype>text/plain</mimetype>
<extension>txt</extension>
<uri>https://api.istex.fr/document/119A659D89E449E179924ECE6FEC847B19693758/fulltext/txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="corpus oup" wicri:toSee="no header">
<istex:xmlDeclaration>version="1.0" encoding="US-ASCII"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" URI="journalpublishing.dtd" name="istex:docType"></istex:docType>
<istex:document>
<article xml:lang="en" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">litlin</journal-id>
<journal-id journal-id-type="hwp">litlin</journal-id>
<journal-title>Literary and Linguistic Computing</journal-title>
<abbrev-journal-title abbrev-type="publisher">Lit Linguist Computing</abbrev-journal-title>
<issn pub-type="ppub">0268-1145</issn>
<issn pub-type="epub">1477-4615</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="other">3</article-id>
<article-id pub-id-type="doi">10.1093/llc/15.4.421</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>The construction of a corpus of spoken Sylheti</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Baker</surname>
<given-names>P</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Lie</surname>
<given-names>M</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>McEnery</surname>
<given-names>T</given-names>
</name>
<xref rid="Z">Z</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Sebba</surname>
<given-names>M</given-names>
</name>
</contrib>
<aff> Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK
<target target-type="aff" id="Z"></target>
<label>Z</label>
Corresponding author E-mail: mcenery@comp.lancs.ac.uk </aff>
</contrib-group>
<pub-date pub-type="ppub">
<month>12</month>
<year>2000</year>
</pub-date>
<volume>15</volume>
<issue>4</issue>
<fpage>421</fpage>
<lpage>432</lpage>
<permissions>
<copyright-statement>Copyright 2000</copyright-statement>
<copyright-year>2000</copyright-year>
</permissions>
<abstract xml:lang="en">
<p>This paper describes the construction of a corpus of spoken Sylheti. The corpus was created to examine difficulties in the creation of spoken language corpora in which features such as code switching (simply described here as the process of switching from one language to another during the course of an interaction; however, this description disguises a host of situations, which will be examined in the paper) are common. The paper also presents a transliteration scheme for Sylheti based around the Roman alphabet.</p>
</abstract>
<custom-meta-wrap>
<custom-meta>
<meta-name>hwp-legacy-fpage</meta-name>
<meta-value>421</meta-value>
</custom-meta>
<custom-meta>
<meta-name>hwp-legacy-dochead</meta-name>
<meta-value>Article</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
</article>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo lang="en">
<title>The construction of a corpus of spoken Sylheti</title>
</titleInfo>
<titleInfo type="alternative" lang="en" contentType="CDATA">
<title>The construction of a corpus of spoken Sylheti</title>
</titleInfo>
<name type="personal">
<namePart type="given">P</namePart>
<namePart type="family">Baker</namePart>
<affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">M</namePart>
<namePart type="family">Lie</namePart>
<affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">T</namePart>
<namePart type="family">McEnery</namePart>
<affiliation>Corresponding author E-mail: mcenery@comp.lancs.ac.uk</affiliation>
<affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">M</namePart>
<namePart type="family">Sebba</namePart>
<affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="research-article" displayLabel="research-article"></genre>
<originInfo>
<publisher>Oxford University Press</publisher>
<dateIssued encoding="w3cdtf">2000-12</dateIssued>
<copyrightDate encoding="w3cdtf">2000</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
</language>
<physicalDescription>
<internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract lang="en">This paper describes the construction of a corpus of spoken Sylheti. The corpus was created to examine difficulties in the creation of spoken language corpora in which features such as code switching (simply described here as the process of switching from one language to another during the course of an interaction; however, this description disguises a host of situations, which will be examined in the paper) are common. The paper also presents a transliteration scheme for Sylheti based around the Roman alphabet.</abstract>
<relatedItem type="host">
<titleInfo>
<title>Literary and Linguistic Computing</title>
</titleInfo>
<titleInfo type="abbreviated">
<title>Lit Linguist Computing</title>
</titleInfo>
<genre type="journal">journal</genre>
<identifier type="ISSN">0268-1145</identifier>
<identifier type="eISSN">1477-4615</identifier>
<identifier type="PublisherID">litlin</identifier>
<identifier type="PublisherID-hwp">litlin</identifier>
<part>
<date>2000</date>
<detail type="volume">
<caption>vol.</caption>
<number>15</number>
</detail>
<detail type="issue">
<caption>no.</caption>
<number>4</number>
</detail>
<extent unit="pages">
<start>421</start>
<end>432</end>
</extent>
</part>
</relatedItem>
<identifier type="istex">119A659D89E449E179924ECE6FEC847B19693758</identifier>
<identifier type="DOI">10.1093/llc/15.4.421</identifier>
<identifier type="local">3</identifier>
<accessCondition type="use and reproduction" contentType="copyright">Copyright 2000</accessCondition>
<recordInfo>
<recordContentSource>OUP</recordContentSource>
</recordInfo>
</mods>
</metadata>
<enrichments>
<istex:catWosTEI uri="https://api.istex.fr/document/119A659D89E449E179924ECE6FEC847B19693758/enrichments/catWos">
<teiHeader>
<profileDesc>
<textClass>
<classCode scheme="WOS">LINGUISTICS</classCode>
<classCode scheme="WOS">LITERATURE</classCode>
</textClass>
</profileDesc>
</teiHeader>
</istex:catWosTEI>
<json:item>
<type>refBibs</type>
<uri>https://api.istex.fr/document/119A659D89E449E179924ECE6FEC847B19693758/enrichments/refBibs</uri>
</json:item>
</enrichments>
<serie></serie>
</istex>
</record>
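
As displayed above, the record uses the prefixes wicri:, mods:, json: and istex: without declaring them, so a namespace-aware XML parser would likely reject it as is. The following is a minimal sketch of one way to read such a record with Python's standard library. The namespace URIs, the function names, and the abridged sample are assumptions made for illustration; they are not part of Dilib or ISTEX.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption): the record as displayed does not
# declare them, so these exist only to let the parser accept the prefixes.
PLACEHOLDER_NAMESPACES = {
    "wicri": "urn:example:wicri",
    "mods": "urn:example:mods",
    "json": "urn:example:json",
    "istex": "urn:example:istex",
}

# Abridged excerpt of the record shown above (two of the four authors).
SAMPLE_RECORD = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">The construction of a corpus of spoken Sylheti</title>
<author>
<name sortKey="Baker, P" first="P" last="Baker">P. Baker</name>
<affiliation>
<mods:affiliation>Department of Linguistics and Modern English Language, Lancaster University, Bailrigg, Lancaster LA1 4YT, UK</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Sebba, M" first="M" last="Sebba">M. Sebba</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="RBID">ISTEX:119A659D89E449E179924ECE6FEC847B19693758</idno>
<idno type="doi">10.1093/llc/15.4.421</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def parse_record(xml_text):
    """Bind the undeclared prefixes on the root element, then parse."""
    declarations = " ".join(
        f'xmlns:{prefix}="{uri}"'
        for prefix, uri in PLACEHOLDER_NAMESPACES.items()
    )
    patched = xml_text.replace("<record>", f"<record {declarations}>", 1)
    return ET.fromstring(patched)


def summarize(record):
    """Return title, author names, and DOI from the record's TEI header."""
    title_stmt = record.find("./TEI/teiHeader/fileDesc/titleStmt")
    publication = record.find("./TEI/teiHeader/fileDesc/publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "doi": publication.findtext("idno[@type='doi']"),
    }


summary = summarize(parse_record(SAMPLE_RECORD))
print(summary)
```

Patching the root tag keeps the rest of the record untouched; if the real output of the tools already declares these prefixes, the patching step would be unnecessary.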

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Ticri/explor/TeiVM2/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000164 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 000164 | SxmlIndent | more
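
When the same lookup has to be repeated for several record keys, the command line can be assembled programmatically. The sketch below only builds the command string and does not run it. It reuses the `-h` and `-nk` options and the `biblio.hfd` file name exactly as they appear in the commands above; the function name and the example directory are hypothetical.

```python
import shlex


def build_hfdselect_pipeline(explor_step, key):
    """Build the HfdSelect | SxmlIndent pipeline for one record key.

    explor_step: directory containing biblio.hfd (hypothetical value below).
    key: internal record identifier, for example "000164".
    """
    hfd_path = f"{explor_step}/biblio.hfd"
    select = ["HfdSelect", "-h", hfd_path, "-nk", key]
    # shlex.join quotes each argument so unusual paths stay shell-safe.
    return f"{shlex.join(select)} | SxmlIndent"


# Hypothetical installation path, used only for illustration.
pipeline = build_hfdselect_pipeline("/data/TeiVM2/Data/Istex/Corpus", "000164")
print(pipeline)
```

The resulting string can be pasted into a shell on a machine where Dilib is installed.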

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Ticri
   |area=    TeiVM2
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:119A659D89E449E179924ECE6FEC847B19693758
   |texte=   The construction of a corpus of spoken Sylheti
}}
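
The template call above can also be generated from a record's fields, for instance when linking many records at once. This is a minimal sketch: the parameter names and the column layout are copied from the call shown above, while the function name is hypothetical and nothing here is an official Wicri tool.

```python
def render_explor_link(fields):
    """Render an {{Explor lien}} call, one parameter per line.

    fields: mapping of parameter name to value, in display order.
    """
    lines = ["{{Explor lien"]
    for name, value in fields.items():
        # Pad "|name=" to 10 characters so the values line up in a column.
        label = f"|{name}="
        lines.append(f"   {label:<10}{value}")
    lines.append("}}")
    return "\n".join(lines)


link = render_explor_link({
    "wiki": "Wicri/Ticri",
    "area": "TeiVM2",
    "flux": "Istex",
    "étape": "Corpus",
    "type": "RBID",
    "clé": "ISTEX:119A659D89E449E179924ECE6FEC847B19693758",
    "texte": "The construction of a corpus of spoken Sylheti",
})
print(link)
```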

Wicri

This area was generated with Dilib version V0.6.31.
Data generation: Mon Oct 30 21:59:18 2017. Site generation: Sun Feb 11 23:16:06 2024