SGML Exploration Server

Warning: this site is under development.
Warning: this site was generated by automated means from raw corpora.
The information has therefore not been validated.

Marking up in TATOE and exporting to SGML: Rule development for identifying NITF categories

Internal identifier: 000334 (PascalFrancis/Corpus); previous: 000333; next: 000335

Authors: L. Rostek; M. Alexa

Source: Computers and the humanities, ISSN 0010-4817, 1997, Vol. 31, No. 4, pp. 311-326

RBID : Francis:524-98-12881

French descriptors

English descriptors

Abstract

This paper presents a method for developing limited-context grammar rules that mark up text automatically by attaching specific text segments to a small number of well-defined, application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used to support the iterative development of the rule set and to construct and manage the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic markup of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF), to facilitate their further exchange. The paper also describes the implemented mechanism for exporting the semantic markup to NITF.
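To make the idea of a limited-context rule concrete, here is a minimal Python sketch. It is not the authors' rule formalism and does not use TATOE. The lexicon entries, the trigger words, and the tag names `person`, `location`, and `org` are all assumptions chosen for illustration, and the output is only SGML-like, not validated NITF. One rule tags one or two capitalized tokens to the right of a trigger word as a person. A second step tags any word found in a small lexicon.

```python
import re

# Hypothetical lexicon: surface form -> semantic category (names are illustrative).
LEXICON = {"Bonn": "location", "Berlin": "location", "Bundesbank": "org"}

# Hypothetical trigger words that license a person name immediately to their right.
TITLES = ("Herr", "Frau", "Kanzler")

# A capitalized German token (simplified on purpose).
CAP = r"[A-ZÄÖÜ][a-zäöüß]+"

# One alternation: the context rule is tried first, then the plain lexicon lookup.
PATTERN = re.compile(
    r"\b(?P<title>" + "|".join(TITLES) + r") (?P<name>" + CAP + r"(?: " + CAP + r")?)"
    + r"|\b(?P<lex>" + "|".join(re.escape(w) for w in LEXICON) + r")\b"
)


def mark_up(text):
    """Wrap matched segments in SGML-like tags; everything else is left untouched."""
    def repl(match):
        if match.group("name"):
            # Context rule fired: keep the trigger word outside the tag.
            return f'{match.group("title")} <person>{match.group("name")}</person>'
        word = match.group("lex")
        category = LEXICON[word]
        return f"<{category}>{word}</{category}>"

    return PATTERN.sub(repl, text)


if __name__ == "__main__":
    print(mark_up("Kanzler Helmut Kohl sprach in Bonn mit der Bundesbank."))
```

Because the rule only inspects the token immediately to the left, it will over-generate on capitalized German nouns that follow a name. An iterative cycle of the kind the abstract describes (run the rules, inspect the output, refine the rules and lexicon) is how such errors would be found and reduced.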

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

pA  
A01 01  1    @0 0010-4817
A02 01      @0 COHUAD
A03   1    @0 Comput. humanit.
A05       @2 31
A06       @2 4
A08 01  1  ENG  @1 Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories
A11 01  1    @1 ROSTEK (L.)
A11 02  1    @1 ALEXA (M.)
A12 01  1    @1 LESSARD (Greg) @9 introd.
A12 02  1    @1 LEVISON (Michael) @9 introd.
A14 01      @1 GMD - Integrated Publication and Information Systems Institute @3 DEU @Z 1 aut.
A14 02      @1 Center for Survey Research and Methodology (ZUMA) @3 INC @Z 2 aut.
A15 01      @1 French Studies, Queen's University @2 Kingston, Ontario K7L 3N6 @3 CAN @Z 1 aut.
A15 02      @1 Computing and Information Science, Queen's University @2 Kingston, Ontario K7L 3N6 @3 CAN @Z 2 aut.
A18 01  1    @1 Association for Computers in the Humanities @3 INC @9 patr.
A18 02  1    @1 Association for Literary and Linguistic Computing @3 INC @9 patr.
A20       @1 311-326
A21       @1 1997
A23 01      @0 ENG
A43 01      @1 INIST @2 14902 @5 354000072618530040
A44       @0 0000 @1 © 1998 INIST-CNRS. All rights reserved.
A45       @0 14 ref.
A47 01  1    @0 524-98-12881
A60       @1 P @2 C
A61       @0 A
A64   1    @0 Computers and the humanities
A66 01      @0 NLD
A68 01  1  FRE  @1 Marquage dans TATOE et exportation vers SGML : Développement de règles pour l'identification des catégories du NITF
C01 01    ENG  @0 This paper presents a method for developing limited-context grammar rules in order to mark up text automatically, by attaching specific text segments to a small number of well-defined and application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used in order to support the iterative process of developing a set of rules as well as for constructing and managing the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic mark up of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF) to facilitate their further exchange. The implemented export mechanism of the semantic mark up into NITF is also described in the paper
C02 01  L    @0 52478 @1 XV
C02 02  L    @0 524
C03 01  L  FRE  @0 Linguistique informatique @5 01
C03 01  L  ENG  @0 Computational linguistics @5 01
C03 02  L  FRE  @0 Texte électronique @5 03
C03 02  L  ENG  @0 Electronic text @5 03
C03 03  L  FRE  @0 Etiquetage automatique @5 04
C03 03  L  ENG  @0 Tagging @5 04
C03 04  L  FRE  @0 Catégorie sémantique @5 05
C03 04  L  ENG  @0 Semantic category @5 05
C03 05  L  FRE  @0 Règles @5 06
C03 05  L  ENG  @0 Rule @5 06
C03 06  L  FRE  @0 Message @5 07
C03 06  L  ENG  @0 Message @5 07
C03 07  L  FRE  @0 Presse @5 08
C03 07  L  ENG  @0 Newspaper @5 08
C03 08  L  FRE  @0 Allemand @2 NL @5 09
C03 09  L  FRE  @0 Encodage @4 INC @5 31
C03 10  L  FRE  @0 Système TATOE @4 INC @5 32
C03 11  L  FRE  @0 SGML @4 INC @5 33
C03 12  L  FRE  @0 TEI @2 NI @4 CD @5 96
C03 12  L  ENG  @0 TEI @2 NI @4 CD @5 96
C03 13  L  FRE  @0 Corpus annoté @2 NI @4 CD @5 97
C03 13  L  ENG  @0 Annotated corpus @2 NI @4 CD @5 97
N21       @1 327
pR  
A30 01  1  ENG  @1 1997 Joint Annual Meeting of the Association for Computers in the Humanities and the Association for Literary and Linguistic Computing @3 Kingston, ON CAN @4 1997
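
The field lines above can be read mechanically. The following Python sketch splits one line into a field tag, the qualifiers that follow it, and its `@`-prefixed subfields. The interpretation of the layout is inferred from this listing alone, not from the Inist format documentation, and the function name `parse_field` is hypothetical.

```python
import re

# A subfield is "@<code> <value>", running up to the next " @<code> " or end of line.
SUBFIELD = re.compile(r"@(\w) (.*?)(?= @\w |$)")


def parse_field(line):
    """Split one non-empty display line into (tag, qualifiers, subfields)."""
    head, _, _ = line.partition("@")       # everything before the first subfield
    parts = head.split()
    tag, qualifiers = parts[0], parts[1:]  # e.g. "A14" and ["01"]
    subfields = [(code, value.strip()) for code, value in SUBFIELD.findall(line)]
    return tag, qualifiers, subfields


if __name__ == "__main__":
    line = ("A14 01      @1 GMD - Integrated Publication and Information "
            "Systems Institute @3 DEU @Z 1 aut.")
    print(parse_field(line))
```

The sketch assumes that subfield values never contain the sequence ` @x ` themselves, which holds for this record but is not guaranteed in general.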

Inist format (server)

NO : FRANCIS 524-98-12881 INIST
FT : (Marquage dans TATOE et exportation vers SGML : Développement de règles pour l'identification des catégories du NITF)
ET : Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories
AU : ROSTEK (L.); ALEXA (M.); LESSARD (Greg); LEVISON (Michael)
AF : GMD - Integrated Publication and Information Systems Institute/Allemagne (1 aut.); Center for Survey Research and Methodology (ZUMA)/Inconnu (2 aut.); French Studies, Queen's University/Kingston, Ontario K7L 3N6/Canada (1 aut.); Computing and Information Science, Queen's University/Kingston, Ontario K7L 3N6/Canada (2 aut.)
DT : Publication en série; Congrès; Niveau analytique
SO : Computers and the humanities; ISSN 0010-4817; Coden COHUAD; Pays-Bas; Da. 1997; Vol. 31; No. 4; Pp. 311-326; Bibl. 14 ref.
LA : Anglais
EA : This paper presents a method for developing limited-context grammar rules in order to mark up text automatically, by attaching specific text segments to a small number of well-defined and application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used in order to support the iterative process of developing a set of rules as well as for constructing and managing the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic mark up of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF) to facilitate their further exchange. The implemented export mechanism of the semantic mark up into NITF is also described in the paper
CC : 52478; 524
FD : Linguistique informatique; Texte électronique; Etiquetage automatique; Catégorie sémantique; Règles; Message; Presse; Allemand; Encodage; Système TATOE; SGML; TEI; Corpus annoté
ED : Computational linguistics; Electronic text; Tagging; Semantic category; Rule; Message; Newspaper; TEI; Annotated corpus
LO : INIST-14902.354000072618530040
ID : 524
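
The server format above is a list of two-letter keys, each followed by ` : ` and a value, with multi-valued fields separated by `; `. A minimal Python sketch for loading it into a dictionary follows. The function names are hypothetical, and the field semantics are assumed from this listing only.

```python
def parse_server_record(text):
    """Map each two-letter key to its raw value string."""
    record = {}
    for line in text.splitlines():
        # Split on the first " : " only, since titles may contain " : " themselves.
        key, sep, value = line.partition(" : ")
        if sep and len(key) == 2 and key.isupper():
            record[key] = value.strip()
    return record


def split_values(value):
    """Split a multi-valued field such as AU, CC, FD, or ED on '; '."""
    return [item.strip() for item in value.split("; ")]


if __name__ == "__main__":
    sample = (
        "NO : FRANCIS 524-98-12881 INIST\n"
        "ET : Marking up in TATOE and exporting to SGML : Rule development "
        "for identifying NITF categories\n"
        "AU : ROSTEK (L.); ALEXA (M.); LESSARD (Greg); LEVISON (Michael)\n"
        "CC : 52478; 524\n"
    )
    record = parse_server_record(sample)
    print(record["ET"])
    print(split_values(record["AU"]))
```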

Links to Exploration step

Francis:524-98-12881

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories</title>
<author>
<name sortKey="Rostek, L" sort="Rostek, L" uniqKey="Rostek L" first="L." last="Rostek">L. Rostek</name>
<affiliation>
<inist:fA14 i1="01">
<s1>GMD - Integrated Publication and Information Systems Institute</s1>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Alexa, M" sort="Alexa, M" uniqKey="Alexa M" first="M." last="Alexa">M. Alexa</name>
<affiliation>
<inist:fA14 i1="02">
<s1>Center for Survey Research and Methodology (ZUMA)</s1>
<s3>INC</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">524-98-12881</idno>
<date when="1997">1997</date>
<idno type="stanalyst">FRANCIS 524-98-12881 INIST</idno>
<idno type="RBID">Francis:524-98-12881</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000334</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories</title>
<author>
<name sortKey="Rostek, L" sort="Rostek, L" uniqKey="Rostek L" first="L." last="Rostek">L. Rostek</name>
<affiliation>
<inist:fA14 i1="01">
<s1>GMD - Integrated Publication and Information Systems Institute</s1>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Alexa, M" sort="Alexa, M" uniqKey="Alexa M" first="M." last="Alexa">M. Alexa</name>
<affiliation>
<inist:fA14 i1="02">
<s1>Center for Survey Research and Methodology (ZUMA)</s1>
<s3>INC</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Computers and the humanities</title>
<title level="j" type="abbreviated">Comput. humanit.</title>
<idno type="ISSN">0010-4817</idno>
<imprint>
<date when="1997">1997</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Computers and the humanities</title>
<title level="j" type="abbreviated">Comput. humanit.</title>
<idno type="ISSN">0010-4817</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Annotated corpus</term>
<term>Computational linguistics</term>
<term>Electronic text</term>
<term>Message</term>
<term>Newspaper</term>
<term>Rule</term>
<term>Semantic category</term>
<term>TEI</term>
<term>Tagging</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Linguistique informatique</term>
<term>Texte électronique</term>
<term>Etiquetage automatique</term>
<term>Catégorie sémantique</term>
<term>Règles</term>
<term>Message</term>
<term>Presse</term>
<term>Allemand</term>
<term>Encodage</term>
<term>Système TATOE</term>
<term>SGML</term>
<term>TEI</term>
<term>Corpus annoté</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">This paper presents a method for developing limited-context grammar rules in order to mark up text automatically, by attaching specific text segments to a small number of well-defined and application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used in order to support the iterative process of developing a set of rules as well as for constructing and managing the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic mark up of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF) to facilitate their further exchange. The implemented export mechanism of the semantic mark up into NITF is also described in the paper</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>0010-4817</s0>
</fA01>
<fA02 i1="01">
<s0>COHUAD</s0>
</fA02>
<fA03 i2="1">
<s0>Comput. humanit.</s0>
</fA03>
<fA05>
<s2>31</s2>
</fA05>
<fA06>
<s2>4</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG">
<s1>Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories</s1>
</fA08>
<fA11 i1="01" i2="1">
<s1>ROSTEK (L.)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>ALEXA (M.)</s1>
</fA11>
<fA12 i1="01" i2="1">
<s1>LESSARD (Greg)</s1>
<s9>introd.</s9>
</fA12>
<fA12 i1="02" i2="1">
<s1>LEVISON (Michael)</s1>
<s9>introd.</s9>
</fA12>
<fA14 i1="01">
<s1>GMD - Integrated Publication and Information Systems Institute</s1>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</fA14>
<fA14 i1="02">
<s1>Center for Survey Research and Methodology (ZUMA)</s1>
<s3>INC</s3>
<sZ>2 aut.</sZ>
</fA14>
<fA15 i1="01">
<s1>French Studies, Queen's University</s1>
<s2>Kingston, Ontario K7L 3N6</s2>
<s3>CAN</s3>
<sZ>1 aut.</sZ>
</fA15>
<fA15 i1="02">
<s1>Computing and Information Science, Queen's University</s1>
<s2>Kingston, Ontario K7L 3N6</s2>
<s3>CAN</s3>
<sZ>2 aut.</sZ>
</fA15>
<fA18 i1="01" i2="1">
<s1>Association for Computers in the Humanities</s1>
<s3>INC</s3>
<s9>patr.</s9>
</fA18>
<fA18 i1="02" i2="1">
<s1>Association for Literary and Linguistic Computing</s1>
<s3>INC</s3>
<s9>patr.</s9>
</fA18>
<fA20>
<s1>311-326</s1>
</fA20>
<fA21>
<s1>1997</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA43 i1="01">
<s1>INIST</s1>
<s2>14902</s2>
<s5>354000072618530040</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 1998 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>14 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>524-98-12881</s0>
</fA47>
<fA60>
<s1>P</s1>
<s2>C</s2>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i2="1">
<s0>Computers and the humanities</s0>
</fA64>
<fA66 i1="01">
<s0>NLD</s0>
</fA66>
<fA68 i1="01" i2="1" l="FRE">
<s1>Marquage dans TATOE et exportation vers SGML : Développement de règles pour l'identification des catégories du NITF</s1>
</fA68>
<fC01 i1="01" l="ENG">
<s0>This paper presents a method for developing limited-context grammar rules in order to mark up text automatically, by attaching specific text segments to a small number of well-defined and application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used in order to support the iterative process of developing a set of rules as well as for constructing and managing the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic mark up of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF) to facilitate their further exchange. The implemented export mechanism of the semantic mark up into NITF is also described in the paper</s0>
</fC01>
<fC02 i1="01" i2="L">
<s0>52478</s0>
<s1>XV</s1>
</fC02>
<fC02 i1="02" i2="L">
<s0>524</s0>
</fC02>
<fC03 i1="01" i2="L" l="FRE">
<s0>Linguistique informatique</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="L" l="ENG">
<s0>Computational linguistics</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="L" l="FRE">
<s0>Texte électronique</s0>
<s5>03</s5>
</fC03>
<fC03 i1="02" i2="L" l="ENG">
<s0>Electronic text</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="L" l="FRE">
<s0>Etiquetage automatique</s0>
<s5>04</s5>
</fC03>
<fC03 i1="03" i2="L" l="ENG">
<s0>Tagging</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="L" l="FRE">
<s0>Catégorie sémantique</s0>
<s5>05</s5>
</fC03>
<fC03 i1="04" i2="L" l="ENG">
<s0>Semantic category</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="L" l="FRE">
<s0>Règles</s0>
<s5>06</s5>
</fC03>
<fC03 i1="05" i2="L" l="ENG">
<s0>Rule</s0>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="L" l="FRE">
<s0>Message</s0>
<s5>07</s5>
</fC03>
<fC03 i1="06" i2="L" l="ENG">
<s0>Message</s0>
<s5>07</s5>
</fC03>
<fC03 i1="07" i2="L" l="FRE">
<s0>Presse</s0>
<s5>08</s5>
</fC03>
<fC03 i1="07" i2="L" l="ENG">
<s0>Newspaper</s0>
<s5>08</s5>
</fC03>
<fC03 i1="08" i2="L" l="FRE">
<s0>Allemand</s0>
<s2>NL</s2>
<s5>09</s5>
</fC03>
<fC03 i1="09" i2="L" l="FRE">
<s0>Encodage</s0>
<s4>INC</s4>
<s5>31</s5>
</fC03>
<fC03 i1="10" i2="L" l="FRE">
<s0>Système TATOE</s0>
<s4>INC</s4>
<s5>32</s5>
</fC03>
<fC03 i1="11" i2="L" l="FRE">
<s0>SGML</s0>
<s4>INC</s4>
<s5>33</s5>
</fC03>
<fC03 i1="12" i2="L" l="FRE">
<s0>TEI</s0>
<s2>NI</s2>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="12" i2="L" l="ENG">
<s0>TEI</s0>
<s2>NI</s2>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="13" i2="L" l="FRE">
<s0>Corpus annoté</s0>
<s2>NI</s2>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fC03 i1="13" i2="L" l="ENG">
<s0>Annotated corpus</s0>
<s2>NI</s2>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fN21>
<s1>327</s1>
</fN21>
</pA>
<pR>
<fA30 i1="01" i2="1" l="ENG">
<s1>1997 Joint Annual Meeting of the Association for Computers in the Humanities and the Association for Literary and Linguistic Computing</s1>
<s3>Kingston, ON CAN</s3>
<s4>1997</s4>
</fA30>
</pR>
</standard>
<server>
<NO>FRANCIS 524-98-12881 INIST</NO>
<FT>(Marquage dans TATOE et exportation vers SGML : Développement de règles pour l'identification des catégories du NITF)</FT>
<ET>Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories</ET>
<AU>ROSTEK (L.); ALEXA (M.); LESSARD (Greg); LEVISON (Michael)</AU>
<AF>GMD - Integrated Publication and Information Systems Institute/Allemagne (1 aut.); Center for Survey Research and Methodology (ZUMA)/Inconnu (2 aut.); French Studies, Queen's University/Kingston, Ontario K7L 3N6/Canada (1 aut.); Computing and Information Science, Queen's University/Kingston, Ontario K7L 3N6/Canada (2 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>Computers and the humanities; ISSN 0010-4817; Coden COHUAD; Pays-Bas; Da. 1997; Vol. 31; No. 4; Pp. 311-326; Bibl. 14 ref.</SO>
<LA>Anglais</LA>
<EA>This paper presents a method for developing limited-context grammar rules in order to mark up text automatically, by attaching specific text segments to a small number of well-defined and application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used in order to support the iterative process of developing a set of rules as well as for constructing and managing the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic mark up of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF) to facilitate their further exchange. The implemented export mechanism of the semantic mark up into NITF is also described in the paper</EA>
<CC>52478; 524</CC>
<FD>Linguistique informatique; Texte électronique; Etiquetage automatique; Catégorie sémantique; Règles; Message; Presse; Allemand; Encodage; Système TATOE; SGML; TEI; Corpus annoté</FD>
<ED>Computational linguistics; Electronic text; Tagging; Semantic category; Rule; Message; Newspaper; TEI; Annotated corpus</ED>
<LO>INIST-14902.354000072618530040</LO>
<ID>524</ID>
</server>
</inist>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Informatique/explor/SgmlV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000334 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000334 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Informatique
   |area=    SgmlV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Francis:524-98-12881
   |texte=   Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jul 1 14:26:08 2019. Site generation: Wed Apr 28 21:40:44 2021