Exploration server on the TEI

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Integration of an XML electronic dictionary with linguistic tools for natural language processing

Internal identifier: 000031 (PascalFrancis/Corpus); previous: 000030; next: 000032

Authors: Octavio Santana Suarez; Francisco J. Carreras Riudavets; Zenon Hernandez Figueroa; Antonio C. Gonzalez Cabrera

Source:

RBID: Francis:08-0332344

French descriptors

English descriptors

Abstract

This study proposes the codification of lexical information in electronic dictionaries, in accordance with a generic and extendable XML scheme model, and its conjunction with linguistic tools for the processing of natural language. Our approach is different from other similar studies in that we propose XML coding of those items from a dictionary of meanings that are less related to the lexical units. Linguistic information, such as morphology, syllables, phonology, etc., will be included by means of specific linguistic tools. The use of XML as a container for the information allows the use of other XML tools for carrying out searches or for enabling presentation of the information in different resources. This model is particularly important as it combines two parallel paradigms (extendable labelling of documents and computational linguistics), and it is also applicable to other languages. We have included a comparison with the labelling proposal of printed dictionaries carried out by the Text Encoding Initiative (TEI). The proposed design has been validated with a dictionary of more than 145,000 accepted meanings.

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

pA  
A01 01  1    @0 0306-4573
A02 01      @0 IPMADK
A03   1    @0 Inf. process. manag.
A05       @2 43
A06       @2 4
A08 01  1  ENG  @1 Integration of an XML electronic dictionary with linguistic tools for natural language processing
A11 01  1    @1 SANTANA SUAREZ (Octavio)
A11 02  1    @1 CARRERAS RIUDAVETS (Francisco J.)
A11 03  1    @1 FIGUEROA (Zenon Hernandez)
A11 04  1    @1 GONZALEZ CABRERA (Antonio C.)
A14 01      @1 Department of Informática y Sistemas, Edificio de Informática y Matemáticas, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria @2 35017 Las Palmas @3 ESP @Z 1 aut. @Z 2 aut. @Z 3 aut. @Z 4 aut.
A20       @1 946-957
A21       @1 2007
A23 01      @0 ENG
A43 01      @1 INIST @2 10246 @5 354000145663510070
A44       @0 0000 @1 © 2008 INIST-CNRS. All rights reserved.
A45       @0 1/4 p.
A47 01  1    @0 08-0332344
A60       @1 P
A61       @0 A
A64 01  1    @0 Information processing & management
A66 01      @0 GBR
C01 01    ENG  @0 This study proposes the codification of lexical information in electronic dictionaries, in accordance with a generic and extendable XML scheme model, and its conjunction with linguistic tools for the processing of natural language. Our approach is different from other similar studies in that we propose XML coding of those items from a dictionary of meanings that are less related to the lexical units. Linguistic information, such as morphology, syllables, phonology, etc., will be included by means of specific linguistic tools. The use of XML as a container for the information allows the use of other XML tools for carrying out searches or for enabling presentation of the information in different resources. This model is particularly important as it combines two parallel paradigms-extendable labelling of documents and computational linguistics-and it is also applicable to other languages. We have included a comparison with the labelling proposal of printed dictionaries carried out by the Text Encoding Initiative (TEI). The proposed design has been validated with a dictionary of more than 145000 accepted meanings.
C02 01  X    @0 790F02B @1 VI
C03 01  X  FRE  @0 Traitement du langage naturel @5 04
C03 01  X  ENG  @0 Natural language processing @5 04
C03 01  X  SPA  @0 Tratamiento del lenguaje natural @5 04
C03 02  X  FRE  @0 Linguistique mathématique @5 05
C03 02  X  ENG  @0 Computational linguistics @5 05
C03 02  X  SPA  @0 Linguística matemática @5 05
C03 03  X  FRE  @0 Outil linguistique @5 06
C03 03  X  ENG  @0 Linguistic tool @5 06
C03 03  X  SPA  @0 Instrumento lingüístico @5 06
C03 04  X  FRE  @0 Codage @5 07
C03 04  X  ENG  @0 Coding @5 07
C03 04  X  SPA  @0 Codificación @5 07
C03 05  X  FRE  @0 Langage XML @5 08
C03 05  X  ENG  @0 XML language @5 08
C03 05  X  SPA  @0 Lenguaje XML @5 08
N21       @1 210
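
The field listing above can be read mechanically. The following is a minimal sketch in Python, assuming a layout inferred only from the lines shown here: a three-character field tag, optional indicators or a language code, then subfields introduced by `@` followed by a single character. The function name `parse_field` and both regular expressions are illustrative and are not part of Dilib or of the Inist documentation.

```python
import re

# Assumed layout, inferred only from the listing above (not from the Inist spec):
# tag, optional indicators / language code, then "@x value" subfields.
FIELD_RE = re.compile(r"^([A-Z]\d{2})\s+(.*?)\s*(@.*)$")
SUBFIELD_RE = re.compile(r"@(\w)\s+(.*?)(?=\s+@\w\s|$)")


def parse_field(line):
    """Split one listing line into (tag, indicators, [(code, value), ...]).

    Returns None for lines that do not look like a field, such as "pA".
    """
    match = FIELD_RE.match(line.strip())
    if match is None:
        return None
    tag, indicators, body = match.groups()
    return tag, indicators.split(), SUBFIELD_RE.findall(body)


if __name__ == "__main__":
    print(parse_field("A43 01      @1 INIST @2 10246 @5 354000145663510070"))
    print(parse_field("A08 01  1  ENG  @1 Integration of an XML electronic dictionary"))
```

A value that itself contains `@` followed by a character and a space would be split incorrectly; no such value occurs in this record.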

Inist format (server)

NO : FRANCIS 08-0332344 INIST
ET : Integration of an XML electronic dictionary with linguistic tools for natural language processing
AU : SANTANA SUAREZ (Octavio); CARRERAS RIUDAVETS (Francisco J.); FIGUEROA (Zenon Hernandez); GONZALEZ CABRERA (Antonio C.)
AF : Department of Informática y Sistemas, Edificio de Informática y Matemáticas, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria/35017 Las Palmas/Espagne (1 aut., 2 aut., 3 aut., 4 aut.)
DT : Publication en série; Niveau analytique
SO : Information processing & management; ISSN 0306-4573; Coden IPMADK; Royaume-Uni; Da. 2007; Vol. 43; No. 4; Pp. 946-957; Bibl. 1/4 p.
LA : Anglais
EA : This study proposes the codification of lexical information in electronic dictionaries, in accordance with a generic and extendable XML scheme model, and its conjunction with linguistic tools for the processing of natural language. Our approach is different from other similar studies in that we propose XML coding of those items from a dictionary of meanings that are less related to the lexical units. Linguistic information, such as morphology, syllables, phonology, etc., will be included by means of specific linguistic tools. The use of XML as a container for the information allows the use of other XML tools for carrying out searches or for enabling presentation of the information in different resources. This model is particularly important as it combines two parallel paradigms-extendable labelling of documents and computational linguistics-and it is also applicable to other languages. We have included a comparison with the labelling proposal of printed dictionaries carried out by the Text Encoding Initiative (TEI). The proposed design has been validated with a dictionary of more than 145000 accepted meanings.
CC : 790F02B
FD : Traitement du langage naturel; Linguistique mathématique; Outil linguistique; Codage; Langage XML
ED : Natural language processing; Computational linguistics; Linguistic tool; Coding; XML language
SD : Tratamiento del lenguaje natural; Linguística matemática; Instrumento lingüístico; Codificación; Lenguaje XML
LO : INIST-10246.354000145663510070
ID : 08-0332344
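
The server listing above uses one `XX : value` line per field, with `;` separating the values of multi-valued fields such as ED. As a rough sketch (the helper names `parse_server_record` and `split_values` are hypothetical, and the line shape is assumed from this record alone), it could be loaded into a dictionary like this:

```python
def parse_server_record(text):
    """Map each two-letter label to its value for lines shaped like 'XX : value'."""
    record = {}
    for line in text.splitlines():
        label, separator, value = line.partition(" : ")
        # Keep only lines that start with a two-letter uppercase label.
        if separator and len(label) == 2 and label.isupper():
            record[label] = value.strip()
    return record


def split_values(value):
    """Split a multi-valued field on the ';' separator used in the listing."""
    return [part.strip() for part in value.split(";") if part.strip()]


sample = "\n".join([
    "NO : FRANCIS 08-0332344 INIST",
    "LA : Anglais",
    "ED : Natural language processing; Computational linguistics; "
    "Linguistic tool; Coding; XML language",
])
record = parse_server_record(sample)
print(record["NO"])
print(split_values(record["ED"]))
```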

Links to Exploration step

Francis:08-0332344

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Integration of an XML electronic dictionary with linguistic tools for natural language processing</title>
<author>
<name sortKey="Santana Suarez, Octavio" sort="Santana Suarez, Octavio" uniqKey="Santana Suarez O" first="Octavio" last="Santana Suarez">Octavio Santana Suarez</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Department of Informática y Sistemas, Edificio de Informática y Matemáticas, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria</s1>
<s2>35017 Las Palmas</s2>
<s3>ESP</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Carreras Riudavets, Francisco J" sort="Carreras Riudavets, Francisco J" uniqKey="Carreras Riudavets F" first="Francisco J." last="Carreras Riudavets">Francisco J. Carreras Riudavets</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Department of Informática y Sistemas, Edificio de Informática y Matemáticas, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria</s1>
<s2>35017 Las Palmas</s2>
<s3>ESP</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Figueroa, Zenon Hernandez" sort="Figueroa, Zenon Hernandez" uniqKey="Figueroa Z" first="Zenon Hernandez" last="Figueroa">Zenon Hernandez Figueroa</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Department of Informática y Sistemas, Edificio de Informática y Matemáticas, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria</s1>
<s2>35017 Las Palmas</s2>
<s3>ESP</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Gonzalez Cabrera, Antonio C" sort="Gonzalez Cabrera, Antonio C" uniqKey="Gonzalez Cabrera A" first="Antonio C." last="Gonzalez Cabrera">Antonio C. Gonzalez Cabrera</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Department of Informática y Sistemas, Edificio de Informática y Matemáticas, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria</s1>
<s2>35017 Las Palmas</s2>
<s3>ESP</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">08-0332344</idno>
<date when="2007">2007</date>
<idno type="stanalyst">FRANCIS 08-0332344 INIST</idno>
<idno type="RBID">Francis:08-0332344</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000031</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Integration of an XML electronic dictionary with linguistic tools for natural language processing</title>
<author>
<name sortKey="Santana Suarez, Octavio" sort="Santana Suarez, Octavio" uniqKey="Santana Suarez O" first="Octavio" last="Santana Suarez">Octavio Santana Suarez</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Department of Informática y Sistemas, Edificio de Informática y Matemáticas, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria</s1>
<s2>35017 Las Palmas</s2>
<s3>ESP</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Carreras Riudavets, Francisco J" sort="Carreras Riudavets, Francisco J" uniqKey="Carreras Riudavets F" first="Francisco J." last="Carreras Riudavets">Francisco J. Carreras Riudavets</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Department of Informática y Sistemas, Edificio de Informática y Matemáticas, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria</s1>
<s2>35017 Las Palmas</s2>
<s3>ESP</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Figueroa, Zenon Hernandez" sort="Figueroa, Zenon Hernandez" uniqKey="Figueroa Z" first="Zenon Hernandez" last="Figueroa">Zenon Hernandez Figueroa</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Department of Informática y Sistemas, Edificio de Informática y Matemáticas, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria</s1>
<s2>35017 Las Palmas</s2>
<s3>ESP</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Gonzalez Cabrera, Antonio C" sort="Gonzalez Cabrera, Antonio C" uniqKey="Gonzalez Cabrera A" first="Antonio C." last="Gonzalez Cabrera">Antonio C. Gonzalez Cabrera</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Department of Informática y Sistemas, Edificio de Informática y Matemáticas, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria</s1>
<s2>35017 Las Palmas</s2>
<s3>ESP</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Information processing &amp; management</title>
<title level="j" type="abbreviated">Inf. process. manag.</title>
<idno type="ISSN">0306-4573</idno>
<imprint>
<date when="2007">2007</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Information processing &amp; management</title>
<title level="j" type="abbreviated">Inf. process. manag.</title>
<idno type="ISSN">0306-4573</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Coding</term>
<term>Computational linguistics</term>
<term>Linguistic tool</term>
<term>Natural language processing</term>
<term>XML language</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Traitement du langage naturel</term>
<term>Linguistique mathématique</term>
<term>Outil linguistique</term>
<term>Codage</term>
<term>Langage XML</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">This study proposes the codification of lexical information in electronic dictionaries, in accordance with a generic and extendable XML scheme model, and its conjunction with linguistic tools for the processing of natural language. Our approach is different from other similar studies in that we propose XML coding of those items from a dictionary of meanings that are less related to the lexical units. Linguistic information, such as morphology, syllables, phonology, etc., will be included by means of specific linguistic tools. The use of XML as a container for the information allows the use of other XML tools for carrying out searches or for enabling presentation of the information in different resources. This model is particularly important as it combines two parallel paradigms-extendable labelling of documents and computational linguistics-and it is also applicable to other languages. We have included a comparison with the labelling proposal of printed dictionaries carried out by the Text Encoding Initiative (TEI). The proposed design has been validated with a dictionary of more than 145000 accepted meanings.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>0306-4573</s0>
</fA01>
<fA02 i1="01">
<s0>IPMADK</s0>
</fA02>
<fA03 i2="1">
<s0>Inf. process. manag.</s0>
</fA03>
<fA05>
<s2>43</s2>
</fA05>
<fA06>
<s2>4</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG">
<s1>Integration of an XML electronic dictionary with linguistic tools for natural language processing</s1>
</fA08>
<fA11 i1="01" i2="1">
<s1>SANTANA SUAREZ (Octavio)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>CARRERAS RIUDAVETS (Francisco J.)</s1>
</fA11>
<fA11 i1="03" i2="1">
<s1>FIGUEROA (Zenon Hernandez)</s1>
</fA11>
<fA11 i1="04" i2="1">
<s1>GONZALEZ CABRERA (Antonio C.)</s1>
</fA11>
<fA14 i1="01">
<s1>Department of Informática y Sistemas, Edificio de Informática y Matemáticas, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria</s1>
<s2>35017 Las Palmas</s2>
<s3>ESP</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</fA14>
<fA20>
<s1>946-957</s1>
</fA20>
<fA21>
<s1>2007</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA43 i1="01">
<s1>INIST</s1>
<s2>10246</s2>
<s5>354000145663510070</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2008 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>1/4 p.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>08-0332344</s0>
</fA47>
<fA60>
<s1>P</s1>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>Information processing &amp; management</s0>
</fA64>
<fA66 i1="01">
<s0>GBR</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>This study proposes the codification of lexical information in electronic dictionaries, in accordance with a generic and extendable XML scheme model, and its conjunction with linguistic tools for the processing of natural language. Our approach is different from other similar studies in that we propose XML coding of those items from a dictionary of meanings that are less related to the lexical units. Linguistic information, such as morphology, syllables, phonology, etc., will be included by means of specific linguistic tools. The use of XML as a container for the information allows the use of other XML tools for carrying out searches or for enabling presentation of the information in different resources. This model is particularly important as it combines two parallel paradigms-extendable labelling of documents and computational linguistics-and it is also applicable to other languages. We have included a comparison with the labelling proposal of printed dictionaries carried out by the Text Encoding Initiative (TEI). The proposed design has been validated with a dictionary of more than 145000 accepted meanings.</s0>
</fC01>
<fC02 i1="01" i2="X">
<s0>790F02B</s0>
<s1>VI</s1>
</fC02>
<fC03 i1="01" i2="X" l="FRE">
<s0>Traitement du langage naturel</s0>
<s5>04</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG">
<s0>Natural language processing</s0>
<s5>04</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA">
<s0>Tratamiento del lenguaje natural</s0>
<s5>04</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE">
<s0>Linguistique mathématique</s0>
<s5>05</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG">
<s0>Computational linguistics</s0>
<s5>05</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA">
<s0>Linguística matemática</s0>
<s5>05</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Outil linguistique</s0>
<s5>06</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Linguistic tool</s0>
<s5>06</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Instrumento lingüístico</s0>
<s5>06</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Codage</s0>
<s5>07</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Coding</s0>
<s5>07</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Codificación</s0>
<s5>07</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE">
<s0>Langage XML</s0>
<s5>08</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG">
<s0>XML language</s0>
<s5>08</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA">
<s0>Lenguaje XML</s0>
<s5>08</s5>
</fC03>
<fN21>
<s1>210</s1>
</fN21>
</pA>
</standard>
<server>
<NO>FRANCIS 08-0332344 INIST</NO>
<ET>Integration of an XML electronic dictionary with linguistic tools for natural language processing</ET>
<AU>SANTANA SUAREZ (Octavio); CARRERAS RIUDAVETS (Francisco J.); FIGUEROA (Zenon Hernandez); GONZALEZ CABRERA (Antonio C.)</AU>
<AF>Department of Informática y Sistemas, Edificio de Informática y Matemáticas, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria/35017 Las Palmas/Espagne (1 aut., 2 aut., 3 aut., 4 aut.)</AF>
<DT>Publication en série; Niveau analytique</DT>
<SO>Information processing &amp; management; ISSN 0306-4573; Coden IPMADK; Royaume-Uni; Da. 2007; Vol. 43; No. 4; Pp. 946-957; Bibl. 1/4 p.</SO>
<LA>Anglais</LA>
<EA>This study proposes the codification of lexical information in electronic dictionaries, in accordance with a generic and extendable XML scheme model, and its conjunction with linguistic tools for the processing of natural language. Our approach is different from other similar studies in that we propose XML coding of those items from a dictionary of meanings that are less related to the lexical units. Linguistic information, such as morphology, syllables, phonology, etc., will be included by means of specific linguistic tools. The use of XML as a container for the information allows the use of other XML tools for carrying out searches or for enabling presentation of the information in different resources. This model is particularly important as it combines two parallel paradigms-extendable labelling of documents and computational linguistics-and it is also applicable to other languages. We have included a comparison with the labelling proposal of printed dictionaries carried out by the Text Encoding Initiative (TEI). The proposed design has been validated with a dictionary of more than 145000 accepted meanings.</EA>
<CC>790F02B</CC>
<FD>Traitement du langage naturel; Linguistique mathématique; Outil linguistique; Codage; Langage XML</FD>
<ED>Natural language processing; Computational linguistics; Linguistic tool; Coding; XML language</ED>
<SD>Tratamiento del lenguaje natural; Linguística matemática; Instrumento lingüístico; Codificación; Lenguaje XML</SD>
<LO>INIST-10246.354000145663510070</LO>
<ID>08-0332344</ID>
</server>
</inist>
</record>
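
The title and keyword lists can be pulled from a record of this shape with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a trimmed fragment copied from the record above. The fragment deliberately omits the `inist:`-prefixed elements, because that prefix is not declared in the record as displayed and a strict parser would likely reject it. The function name `keywords_by_language` is illustrative only.

```python
import xml.etree.ElementTree as ET

# Trimmed fragment of the record shown above (title and two keyword lists only).
FRAGMENT = """
<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Integration of an XML electronic dictionary with linguistic tools for natural language processing</title>
</titleStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Coding</term>
<term>Computational linguistics</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Codage</term>
<term>Langage XML</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>
""".strip()

# ElementTree exposes xml:lang under the predefined XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def keywords_by_language(root):
    """Return {language code: [terms]} for every <keywords> element."""
    result = {}
    for keywords in root.iter("keywords"):
        terms = [term.text for term in keywords.findall("term")]
        result[keywords.get(XML_LANG)] = terms
    return result


root = ET.fromstring(FRAGMENT)
title = root.find(".//titleStmt/title").text
print(title)
print(keywords_by_language(root))
```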

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Ticri/explor/TeiVM2/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000031 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000031 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Ticri
   |area=    TeiVM2
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Francis:08-0332344
   |texte=   Integration of an XML electronic dictionary with linguistic tools for natural language processing
}}

This area was generated with Dilib version V0.6.31.
Data generation: Mon Oct 30 21:59:18 2017. Site generation: Sun Feb 11 23:16:06 2024