TEI Exploration Server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

SusTEInability of linguistic resources through feature structures

Internal identifier: 000247 (Istex/Corpus); previous: 000246; next: 000248

Authors: Andreas Witt; Georg Rehm; Erhard Hinrichs; Timm Lehmberg; Jens Stegmann

Source:

RBID : ISTEX:9F3C60DBB95AD64EA616839B33A16ACA18E60DB9

Abstract

This article shows that the TEI tag set for feature structures can be adopted to represent a heterogeneous set of linguistic corpora. The majority of corpora is annotated using markup languages that are based on the Annotation Graph framework, the upcoming Linguistic Annotation Format ISO standard, or according to tag sets defined by or based upon the TEI guidelines. A unified representation comprises the separation of conceptually different annotation layers contained in the original corpus data (e.g. syntax, phonology, and semantics) into multiple XML files. These annotation layers are linked to each other implicitly by the identical textual content of all files. A suitable data structure for the representation of these annotations is a multi-rooted tree that again can be represented by the TEI and ISO tag set for feature structures. The mapping process and representational issues are discussed as well as the advantages and drawbacks associated with the use of the TEI tag set for feature structures as a storage and exchange format for linguistically annotated data.

Url:
DOI: 10.1093/llc/fqp024

Links to Exploration step

ISTEX:9F3C60DBB95AD64EA616839B33A16ACA18E60DB9

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>SusTEInability of linguistic resources through feature structures</title>
<author wicri:is="90%">
<name sortKey="Witt, Andreas" sort="Witt, Andreas" uniqKey="Witt A" first="Andreas" last="Witt">Andreas Witt</name>
<affiliation>
<mods:affiliation>Institut für Deutsche Sprache, Mannheim, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>vionto GmbH, Berlin, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Tübingen University, General and Computational Linguistics, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Hamburg University, SFB Multilingualism, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: witt@ids-mannheim.de</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Rehm, Georg" sort="Rehm, Georg" uniqKey="Rehm G" first="Georg" last="Rehm">Georg Rehm</name>
<affiliation>
<mods:affiliation>Institut für Deutsche Sprache, Mannheim, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>vionto GmbH, Berlin, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Tübingen University, General and Computational Linguistics, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Hamburg University, SFB Multilingualism, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Hinrichs, Erhard" sort="Hinrichs, Erhard" uniqKey="Hinrichs E" first="Erhard" last="Hinrichs">Erhard Hinrichs</name>
<affiliation>
<mods:affiliation>Institut für Deutsche Sprache, Mannheim, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>vionto GmbH, Berlin, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Tübingen University, General and Computational Linguistics, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Hamburg University, SFB Multilingualism, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Lehmberg, Timm" sort="Lehmberg, Timm" uniqKey="Lehmberg T" first="Timm" last="Lehmberg">Timm Lehmberg</name>
<affiliation>
<mods:affiliation>Institut für Deutsche Sprache, Mannheim, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>vionto GmbH, Berlin, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Tübingen University, General and Computational Linguistics, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Hamburg University, SFB Multilingualism, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Stegmann, Jens" sort="Stegmann, Jens" uniqKey="Stegmann J" first="Jens" last="Stegmann">Jens Stegmann</name>
<affiliation>
<mods:affiliation>Institut für Deutsche Sprache, Mannheim, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>vionto GmbH, Berlin, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Tübingen University, General and Computational Linguistics, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Hamburg University, SFB Multilingualism, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:9F3C60DBB95AD64EA616839B33A16ACA18E60DB9</idno>
<date when="2009" year="2009">2009</date>
<idno type="doi">10.1093/llc/fqp024</idno>
<idno type="url">https://api.istex.fr/document/9F3C60DBB95AD64EA616839B33A16ACA18E60DB9/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000247</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a">SusTEInability of linguistic resources through feature structures</title>
<author wicri:is="90%">
<name sortKey="Witt, Andreas" sort="Witt, Andreas" uniqKey="Witt A" first="Andreas" last="Witt">Andreas Witt</name>
<affiliation>
<mods:affiliation>Institut für Deutsche Sprache, Mannheim, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>vionto GmbH, Berlin, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Tübingen University, General and Computational Linguistics, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Hamburg University, SFB Multilingualism, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: witt@ids-mannheim.de</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Rehm, Georg" sort="Rehm, Georg" uniqKey="Rehm G" first="Georg" last="Rehm">Georg Rehm</name>
<affiliation>
<mods:affiliation>Institut für Deutsche Sprache, Mannheim, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>vionto GmbH, Berlin, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Tübingen University, General and Computational Linguistics, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Hamburg University, SFB Multilingualism, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Hinrichs, Erhard" sort="Hinrichs, Erhard" uniqKey="Hinrichs E" first="Erhard" last="Hinrichs">Erhard Hinrichs</name>
<affiliation>
<mods:affiliation>Institut für Deutsche Sprache, Mannheim, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>vionto GmbH, Berlin, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Tübingen University, General and Computational Linguistics, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Hamburg University, SFB Multilingualism, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Lehmberg, Timm" sort="Lehmberg, Timm" uniqKey="Lehmberg T" first="Timm" last="Lehmberg">Timm Lehmberg</name>
<affiliation>
<mods:affiliation>Institut für Deutsche Sprache, Mannheim, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>vionto GmbH, Berlin, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Tübingen University, General and Computational Linguistics, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Hamburg University, SFB Multilingualism, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</mods:affiliation>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Stegmann, Jens" sort="Stegmann, Jens" uniqKey="Stegmann J" first="Jens" last="Stegmann">Jens Stegmann</name>
<affiliation>
<mods:affiliation>Institut für Deutsche Sprache, Mannheim, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>vionto GmbH, Berlin, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Tübingen University, General and Computational Linguistics, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Hamburg University, SFB Multilingualism, Germany</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Literary and Linguistic Computing</title>
<idno type="ISSN">0268-1145</idno>
<idno type="eISSN">1477-4615</idno>
<imprint>
<publisher>Oxford University Press</publisher>
<date type="published" when="2009-09">2009-09</date>
<biblScope unit="volume">24</biblScope>
<biblScope unit="issue">3</biblScope>
<biblScope unit="page" from="363">363</biblScope>
<biblScope unit="page" to="372">372</biblScope>
</imprint>
<idno type="ISSN">0268-1145</idno>
</series>
<idno type="istex">9F3C60DBB95AD64EA616839B33A16ACA18E60DB9</idno>
<idno type="DOI">10.1093/llc/fqp024</idno>
<idno type="ArticleID">fqp024</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0268-1145</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract">This article shows that the TEI tag set for feature structures can be adopted to represent a heterogeneous set of linguistic corpora. The majority of corpora is annotated using markup languages that are based on the Annotation Graph framework, the upcoming Linguistic Annotation Format ISO standard, or according to tag sets defined by or based upon the TEI guidelines. A unified representation comprises the separation of conceptually different annotation layers contained in the original corpus data (e.g. syntax, phonology, and semantics) into multiple XML files. These annotation layers are linked to each other implicitly by the identical textual content of all files. A suitable data structure for the representation of these annotations is a multi-rooted tree that again can be represented by the TEI and ISO tag set for feature structures. The mapping process and representational issues are discussed as well as the advantages and drawbacks associated with the use of the TEI tag set for feature structures as a storage and exchange format for linguistically annotated data.</div>
</front>
</TEI>
<istex>
<corpusName>oup</corpusName>
<author>
<json:item>
<name>Andreas Witt</name>
<affiliations>
<json:string>Institut für Deutsche Sprache, Mannheim, Germany</json:string>
<json:string>vionto GmbH, Berlin, Germany</json:string>
<json:string>Tübingen University, General and Computational Linguistics, Germany</json:string>
<json:string>Hamburg University, SFB Multilingualism, Germany</json:string>
<json:string>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</json:string>
<json:string>E-mail: witt@ids-mannheim.de</json:string>
</affiliations>
</json:item>
<json:item>
<name>Georg Rehm</name>
<affiliations>
<json:string>Institut für Deutsche Sprache, Mannheim, Germany</json:string>
<json:string>vionto GmbH, Berlin, Germany</json:string>
<json:string>Tübingen University, General and Computational Linguistics, Germany</json:string>
<json:string>Hamburg University, SFB Multilingualism, Germany</json:string>
<json:string>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</json:string>
</affiliations>
</json:item>
<json:item>
<name>Erhard Hinrichs</name>
<affiliations>
<json:string>Institut für Deutsche Sprache, Mannheim, Germany</json:string>
<json:string>vionto GmbH, Berlin, Germany</json:string>
<json:string>Tübingen University, General and Computational Linguistics, Germany</json:string>
<json:string>Hamburg University, SFB Multilingualism, Germany</json:string>
<json:string>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</json:string>
</affiliations>
</json:item>
<json:item>
<name>Timm Lehmberg</name>
<affiliations>
<json:string>Institut für Deutsche Sprache, Mannheim, Germany</json:string>
<json:string>vionto GmbH, Berlin, Germany</json:string>
<json:string>Tübingen University, General and Computational Linguistics, Germany</json:string>
<json:string>Hamburg University, SFB Multilingualism, Germany</json:string>
<json:string>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</json:string>
</affiliations>
</json:item>
<json:item>
<name>Jens Stegmann</name>
<affiliations>
<json:string>Institut für Deutsche Sprache, Mannheim, Germany</json:string>
<json:string>vionto GmbH, Berlin, Germany</json:string>
<json:string>Tübingen University, General and Computational Linguistics, Germany</json:string>
<json:string>Hamburg University, SFB Multilingualism, Germany</json:string>
<json:string>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</json:string>
</affiliations>
</json:item>
</author>
<subject>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Original Articles</value>
</json:item>
</subject>
<articleId>
<json:string>fqp024</json:string>
</articleId>
<language>
<json:string>eng</json:string>
</language>
<originalGenre>
<json:string>research-article</json:string>
</originalGenre>
<abstract>This article shows that the TEI tag set for feature structures can be adopted to represent a heterogeneous set of linguistic corpora. The majority of corpora is annotated using markup languages that are based on the Annotation Graph framework, the upcoming Linguistic Annotation Format ISO standard, or according to tag sets defined by or based upon the TEI guidelines. A unified representation comprises the separation of conceptually different annotation layers contained in the original corpus data (e.g. syntax, phonology, and semantics) into multiple XML files. These annotation layers are linked to each other implicitly by the identical textual content of all files. A suitable data structure for the representation of these annotations is a multi-rooted tree that again can be represented by the TEI and ISO tag set for feature structures. The mapping process and representational issues are discussed as well as the advantages and drawbacks associated with the use of the TEI tag set for feature structures as a storage and exchange format for linguistically annotated data.</abstract>
<qualityIndicators>
<score>6.167</score>
<pdfVersion>1.4</pdfVersion>
<pdfPageSize>538.583 x 697.323 pts</pdfPageSize>
<refBibsNative>true</refBibsNative>
<keywordCount>1</keywordCount>
<abstractCharCount>1083</abstractCharCount>
<pdfWordCount>3651</pdfWordCount>
<pdfCharCount>25464</pdfCharCount>
<pdfPageCount>10</pdfPageCount>
<abstractWordCount>168</abstractWordCount>
</qualityIndicators>
<title>SusTEInability of linguistic resources through feature structures</title>
<genre>
<json:string>research-article</json:string>
</genre>
<host>
<volume>24</volume>
<publisherId>
<json:string>litlin</json:string>
</publisherId>
<pages>
<last>372</last>
<first>363</first>
</pages>
<issn>
<json:string>0268-1145</json:string>
</issn>
<issue>3</issue>
<genre>
<json:string>journal</json:string>
</genre>
<language>
<json:string>unknown</json:string>
</language>
<eissn>
<json:string>1477-4615</json:string>
</eissn>
<title>Literary and Linguistic Computing</title>
</host>
<categories>
<wos>
<json:string>LINGUISTICS</json:string>
<json:string>LITERATURE</json:string>
</wos>
</categories>
<publicationDate>2009</publicationDate>
<copyrightDate>2009</copyrightDate>
<doi>
<json:string>10.1093/llc/fqp024</json:string>
</doi>
<id>9F3C60DBB95AD64EA616839B33A16ACA18E60DB9</id>
<score>0.20106593</score>
<fulltext>
<json:item>
<original>true</original>
<mimetype>application/pdf</mimetype>
<extension>pdf</extension>
<uri>https://api.istex.fr/document/9F3C60DBB95AD64EA616839B33A16ACA18E60DB9/fulltext/pdf</uri>
</json:item>
<json:item>
<original>false</original>
<mimetype>application/zip</mimetype>
<extension>zip</extension>
<uri>https://api.istex.fr/document/9F3C60DBB95AD64EA616839B33A16ACA18E60DB9/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/9F3C60DBB95AD64EA616839B33A16ACA18E60DB9/fulltext/tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a">SusTEInability of linguistic resources through feature structures</title>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher>Oxford University Press</publisher>
<availability>
<p>OUP</p>
</availability>
<date>2009-06-11</date>
</publicationStmt>
<sourceDesc>
<biblStruct type="inbook">
<analytic>
<title level="a">SusTEInability of linguistic resources through feature structures</title>
<author>
<persName>
<forename type="first">Andreas</forename>
<surname>Witt</surname>
</persName>
<email>witt@ids-mannheim.de</email>
<affiliation>Institut für Deutsche Sprache, Mannheim, Germany</affiliation>
<affiliation>vionto GmbH, Berlin, Germany</affiliation>
<affiliation>Tübingen University, General and Computational Linguistics, Germany</affiliation>
<affiliation>Hamburg University, SFB Multilingualism, Germany</affiliation>
<affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</affiliation>
</author>
<author>
<persName>
<forename type="first">Georg</forename>
<surname>Rehm</surname>
</persName>
<affiliation>Institut für Deutsche Sprache, Mannheim, Germany</affiliation>
<affiliation>vionto GmbH, Berlin, Germany</affiliation>
<affiliation>Tübingen University, General and Computational Linguistics, Germany</affiliation>
<affiliation>Hamburg University, SFB Multilingualism, Germany</affiliation>
<affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</affiliation>
</author>
<author>
<persName>
<forename type="first">Erhard</forename>
<surname>Hinrichs</surname>
</persName>
<affiliation>Institut für Deutsche Sprache, Mannheim, Germany</affiliation>
<affiliation>vionto GmbH, Berlin, Germany</affiliation>
<affiliation>Tübingen University, General and Computational Linguistics, Germany</affiliation>
<affiliation>Hamburg University, SFB Multilingualism, Germany</affiliation>
<affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</affiliation>
</author>
<author>
<persName>
<forename type="first">Timm</forename>
<surname>Lehmberg</surname>
</persName>
<affiliation>Institut für Deutsche Sprache, Mannheim, Germany</affiliation>
<affiliation>vionto GmbH, Berlin, Germany</affiliation>
<affiliation>Tübingen University, General and Computational Linguistics, Germany</affiliation>
<affiliation>Hamburg University, SFB Multilingualism, Germany</affiliation>
<affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</affiliation>
</author>
<author>
<persName>
<forename type="first">Jens</forename>
<surname>Stegmann</surname>
</persName>
<affiliation>Institut für Deutsche Sprache, Mannheim, Germany</affiliation>
<affiliation>vionto GmbH, Berlin, Germany</affiliation>
<affiliation>Tübingen University, General and Computational Linguistics, Germany</affiliation>
<affiliation>Hamburg University, SFB Multilingualism, Germany</affiliation>
<affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</affiliation>
</author>
</analytic>
<monogr>
<title level="j">Literary and Linguistic Computing</title>
<idno type="pISSN">0268-1145</idno>
<idno type="eISSN">1477-4615</idno>
<imprint>
<publisher>Oxford University Press</publisher>
<date type="published" when="2009-09"></date>
<biblScope unit="volume">24</biblScope>
<biblScope unit="issue">3</biblScope>
<biblScope unit="page" from="363">363</biblScope>
<biblScope unit="page" to="372">372</biblScope>
</imprint>
</monogr>
<idno type="istex">9F3C60DBB95AD64EA616839B33A16ACA18E60DB9</idno>
<idno type="DOI">10.1093/llc/fqp024</idno>
<idno type="ArticleID">fqp024</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<creation>
<date>2009-06-11</date>
</creation>
<langUsage>
<language ident="en">en</language>
</langUsage>
<abstract>
<p>This article shows that the TEI tag set for feature structures can be adopted to represent a heterogeneous set of linguistic corpora. The majority of corpora is annotated using markup languages that are based on the Annotation Graph framework, the upcoming Linguistic Annotation Format ISO standard, or according to tag sets defined by or based upon the TEI guidelines. A unified representation comprises the separation of conceptually different annotation layers contained in the original corpus data (e.g. syntax, phonology, and semantics) into multiple XML files. These annotation layers are linked to each other implicitly by the identical textual content of all files. A suitable data structure for the representation of these annotations is a multi-rooted tree that again can be represented by the TEI and ISO tag set for feature structures. The mapping process and representational issues are discussed as well as the advantages and drawbacks associated with the use of the TEI tag set for feature structures as a storage and exchange format for linguistically annotated data.</p>
</abstract>
<textClass>
<keywords scheme="keyword">
<list>
<item>
<term>Original Articles</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc>
<change when="2009-06-11">Created</change>
<change when="2009-09">Published</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<original>false</original>
<mimetype>text/plain</mimetype>
<extension>txt</extension>
<uri>https://api.istex.fr/document/9F3C60DBB95AD64EA616839B33A16ACA18E60DB9/fulltext/txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="corpus oup" wicri:toSee="no header">
<istex:xmlDeclaration>version="1.0" encoding="utf-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" URI="journalpublishing.dtd" name="istex:docType"></istex:docType>
<istex:document>
<article article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">litlin</journal-id>
<journal-id journal-id-type="hwp">litlin</journal-id>
<journal-title>Literary and Linguistic Computing</journal-title>
<issn pub-type="ppub">0268-1145</issn>
<issn pub-type="epub">1477-4615</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.1093/llc/fqp024</article-id>
<article-id pub-id-type="publisher-id">fqp024</article-id>
<article-categories>
<subj-group>
<subject>Original Articles</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>SusTEInability of linguistic resources through feature structures</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Witt</surname>
<given-names>Andreas</given-names>
</name>
</contrib>
</contrib-group>
<aff>Institut für Deutsche Sprache, Mannheim, Germany</aff>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Rehm</surname>
<given-names>Georg</given-names>
</name>
</contrib>
</contrib-group>
<aff>vionto GmbH, Berlin, Germany</aff>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Hinrichs</surname>
<given-names>Erhard</given-names>
</name>
</contrib>
</contrib-group>
<aff>Tübingen University, General and Computational Linguistics, Germany</aff>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Lehmberg</surname>
<given-names>Timm</given-names>
</name>
</contrib>
</contrib-group>
<aff>Hamburg University, SFB Multilingualism, Germany</aff>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Stegmann</surname>
<given-names>Jens</given-names>
</name>
</contrib>
</contrib-group>
<aff>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</aff>
<author-notes>
<corresp>
<bold>Correspondence:</bold>
Andreas Witt, Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim, Germany.
<bold>E-mail:</bold>
<email>witt@ids-mannheim.de</email>
</corresp>
</author-notes>
<pub-date pub-type="ppub">
<month>9</month>
<year>2009</year>
</pub-date>
<pub-date pub-type="epub">
<day>11</day>
<month>6</month>
<year>2009</year>
</pub-date>
<volume>24</volume>
<issue>3</issue>
<fpage>363</fpage>
<lpage>372</lpage>
<permissions>
<copyright-statement>© The Author 2009. Published by Oxford University Press on behalf of ALLC and ACH. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org</copyright-statement>
<copyright-year>2009</copyright-year>
</permissions>
<abstract>
<p>This article shows that the TEI tag set for feature structures can be adopted to represent a heterogeneous set of linguistic corpora. The majority of corpora is annotated using markup languages that are based on the Annotation Graph framework, the upcoming Linguistic Annotation Format ISO standard, or according to tag sets defined by or based upon the TEI guidelines. A unified representation comprises the separation of conceptually different annotation layers contained in the original corpus data (e.g. syntax, phonology, and semantics) into multiple XML files. These annotation layers are linked to each other implicitly by the identical textual content of all files. A suitable data structure for the representation of these annotations is a multi-rooted tree that again can be represented by the TEI and ISO tag set for feature structures. The mapping process and representational issues are discussed as well as the advantages and drawbacks associated with the use of the TEI tag set for feature structures as a storage and exchange format for linguistically annotated data.</p>
</abstract>
</article-meta>
</front>
</article>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo>
<title>SusTEInability of linguistic resources through feature structures</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA">
<title>SusTEInability of linguistic resources through feature structures</title>
</titleInfo>
<name type="personal">
<namePart type="given">Andreas</namePart>
<namePart type="family">Witt</namePart>
<affiliation>Institut für Deutsche Sprache, Mannheim, Germany</affiliation>
<affiliation>vionto GmbH, Berlin, Germany</affiliation>
<affiliation>Tübingen University, General and Computational Linguistics, Germany</affiliation>
<affiliation>Hamburg University, SFB Multilingualism, Germany</affiliation>
<affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</affiliation>
<affiliation>E-mail: witt@ids-mannheim.de</affiliation>
</name>
<name type="personal">
<namePart type="given">Georg</namePart>
<namePart type="family">Rehm</namePart>
<affiliation>Institut für Deutsche Sprache, Mannheim, Germany</affiliation>
<affiliation>vionto GmbH, Berlin, Germany</affiliation>
<affiliation>Tübingen University, General and Computational Linguistics, Germany</affiliation>
<affiliation>Hamburg University, SFB Multilingualism, Germany</affiliation>
<affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</affiliation>
</name>
<name type="personal">
<namePart type="given">Erhard</namePart>
<namePart type="family">Hinrichs</namePart>
<affiliation>Institut für Deutsche Sprache, Mannheim, Germany</affiliation>
<affiliation>vionto GmbH, Berlin, Germany</affiliation>
<affiliation>Tübingen University, General and Computational Linguistics, Germany</affiliation>
<affiliation>Hamburg University, SFB Multilingualism, Germany</affiliation>
<affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</affiliation>
</name>
<name type="personal">
<namePart type="given">Timm</namePart>
<namePart type="family">Lehmberg</namePart>
<affiliation>Institut für Deutsche Sprache, Mannheim, Germany</affiliation>
<affiliation>vionto GmbH, Berlin, Germany</affiliation>
<affiliation>Tübingen University, General and Computational Linguistics, Germany</affiliation>
<affiliation>Hamburg University, SFB Multilingualism, Germany</affiliation>
<affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</affiliation>
</name>
<name type="personal">
<namePart type="given">Jens</namePart>
<namePart type="family">Stegmann</namePart>
<affiliation>Institut für Deutsche Sprache, Mannheim, Germany</affiliation>
<affiliation>vionto GmbH, Berlin, Germany</affiliation>
<affiliation>Tübingen University, General and Computational Linguistics, Germany</affiliation>
<affiliation>Hamburg University, SFB Multilingualism, Germany</affiliation>
<affiliation>Bielefeld University, Faculty of Linguistics and Literary Studies, Germany</affiliation>
</name>
<typeOfResource>text</typeOfResource>
<genre type="research-article" displayLabel="research-article"></genre>
<subject>
<topic>Original Articles</topic>
</subject>
<originInfo>
<publisher>Oxford University Press</publisher>
<dateIssued encoding="w3cdtf">2009-09</dateIssued>
<dateCreated encoding="w3cdtf">2009-06-11</dateCreated>
<copyrightDate encoding="w3cdtf">2009</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
</language>
<physicalDescription>
<internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract>This article shows that the TEI tag set for feature structures can be adopted to represent a heterogeneous set of linguistic corpora. The majority of corpora is annotated using markup languages that are based on the Annotation Graph framework, the upcoming Linguistic Annotation Format ISO standard, or according to tag sets defined by or based upon the TEI guidelines. A unified representation comprises the separation of conceptually different annotation layers contained in the original corpus data (e.g. syntax, phonology, and semantics) into multiple XML files. These annotation layers are linked to each other implicitly by the identical textual content of all files. A suitable data structure for the representation of these annotations is a multi-rooted tree that again can be represented by the TEI and ISO tag set for feature structures. The mapping process and representational issues are discussed as well as the advantages and drawbacks associated with the use of the TEI tag set for feature structures as a storage and exchange format for linguistically annotated data.</abstract>
<relatedItem type="host">
<titleInfo>
<title>Literary and Linguistic Computing</title>
</titleInfo>
<genre type="journal">journal</genre>
<identifier type="ISSN">0268-1145</identifier>
<identifier type="eISSN">1477-4615</identifier>
<identifier type="PublisherID">litlin</identifier>
<identifier type="PublisherID-hwp">litlin</identifier>
<part>
<date>2009</date>
<detail type="volume">
<caption>vol.</caption>
<number>24</number>
</detail>
<detail type="issue">
<caption>no.</caption>
<number>3</number>
</detail>
<extent unit="pages">
<start>363</start>
<end>372</end>
</extent>
</part>
</relatedItem>
<identifier type="istex">9F3C60DBB95AD64EA616839B33A16ACA18E60DB9</identifier>
<identifier type="DOI">10.1093/llc/fqp024</identifier>
<identifier type="ArticleID">fqp024</identifier>
<accessCondition type="use and reproduction" contentType="copyright">© The Author 2009. Published by Oxford University Press on behalf of ALLC and ACH. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org</accessCondition>
<recordInfo>
<recordContentSource>OUP</recordContentSource>
</recordInfo>
</mods>
</metadata>
<covers>
<json:item>
<original>true</original>
<mimetype>image/tiff</mimetype>
<extension>tiff</extension>
<uri>https://api.istex.fr/document/9F3C60DBB95AD64EA616839B33A16ACA18E60DB9/covers/tiff</uri>
</json:item>
</covers>
<annexes>
<json:item>
<original>true</original>
<mimetype>image/jpeg</mimetype>
<extension>jpeg</extension>
<uri>https://api.istex.fr/document/9F3C60DBB95AD64EA616839B33A16ACA18E60DB9/annexes/jpeg</uri>
</json:item>
<json:item>
<original>true</original>
<mimetype>image/gif</mimetype>
<extension>gif</extension>
<uri>https://api.istex.fr/document/9F3C60DBB95AD64EA616839B33A16ACA18E60DB9/annexes/gif</uri>
</json:item>
<json:item>
<original>true</original>
<mimetype>application/pdf</mimetype>
<extension>pdf</extension>
<uri>https://api.istex.fr/document/9F3C60DBB95AD64EA616839B33A16ACA18E60DB9/annexes/pdf</uri>
</json:item>
</annexes>
<enrichments>
<istex:catWosTEI uri="https://api.istex.fr/document/9F3C60DBB95AD64EA616839B33A16ACA18E60DB9/enrichments/catWos">
<teiHeader>
<profileDesc>
<textClass>
<classCode scheme="WOS">LINGUISTICS</classCode>
<classCode scheme="WOS">LITERATURE</classCode>
</textClass>
</profileDesc>
</teiHeader>
</istex:catWosTEI>
</enrichments>
<serie></serie>
</istex>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Ticri/explor/TeiVM2/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000247 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 000247 | SxmlIndent | more
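
Once the record has been printed by one of the commands above, individual fields can be pulled out of it with a short script. The following is a minimal Python sketch and not part of Dilib: the function name `extract_fields` is hypothetical, and it uses regular expressions rather than an XML parser on the assumption that the namespace prefixes in the record (`wicri:`, `mods:`, `json:`, `istex:`) are not declared in the output, which a strict parser would reject.

```python
import re


def extract_fields(record_text):
    """Return the first title and the first DOI found in a record.

    Hypothetical helper: either value is None when the field is absent.
    """
    # Match <title> or <title level="a"> but not <titleStmt> or <titleInfo>.
    title = re.search(r"<title(?:\s[^>]*)?>(.*?)</title>", record_text, re.S)
    # The record spells the type attribute both as "doi" and as "DOI".
    doi = re.search(r'<idno type="doi">(.*?)</idno>', record_text, re.S | re.I)
    return {
        "title": title.group(1).strip() if title else None,
        "doi": doi.group(1).strip() if doi else None,
    }
```

A wrapper script could, for instance, call `extract_fields(sys.stdin.read())` on the piped output of `HfdSelect`.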

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Ticri
   |area=    TeiVM2
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:9F3C60DBB95AD64EA616839B33A16ACA18E60DB9
   |texte=   SusTEInability of linguistic resources through feature structures
}}

Wicri

This area was generated with Dilib version V0.6.31.
Data generation: Mon Oct 30 21:59:18 2017. Site generation: Sun Feb 11 23:16:06 2024