Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories
Internal identifier: 000334 (PascalFrancis/Corpus)
Authors: L. Rostek; M. Alexa
Source: Computers and the humanities [ISSN 0010-4817]; 1997.
RBID : Francis:524-98-12881
French descriptors
- Pascal (Inist): Linguistique informatique, Texte électronique, Etiquetage automatique, Catégorie sémantique, Règles, Message, Presse, Allemand, Encodage, Système TATOE, SGML, TEI, Corpus annoté.
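English descriptors
- Computational linguistics, Electronic text, Tagging, Semantic category, Rule, Message, Newspaper, TEI, Annotated corpus.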
Abstract
This paper presents a method for developing limited-context grammar rules to mark up text automatically by assigning specific text segments to a small number of well-defined, application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used to support the iterative development of the rule set and to construct and manage the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic markup of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF), to facilitate their further exchange. The implemented mechanism for exporting the semantic markup into NITF is also described in the paper.
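To make the idea of a limited-context rule concrete, here is a minimal sketch in Python. It is not the authors' rule set and does not use TATOE: the trigger words, the two categories, and the element names `person` and `location` are all hypothetical placeholders chosen for illustration. Each rule looks at one trigger token of left context and wraps the following segment in an SGML-style element.

```python
import re

# Hypothetical rules: (category tag, pattern). Group 1 is the segment to mark up;
# the trigger word in front of it is the "limited context".
RULES = [
    ("person", re.compile(r"\b(?:Herr|Frau|Präsident)\s+([A-ZÄÖÜ][\w-]+)")),
    ("location", re.compile(r"\b(?:in|aus|nach)\s+(Berlin|Bonn|Hamburg)\b")),
]


def wrap(match, tag):
    """Return the matched text with group 1 wrapped in <tag>...</tag>."""
    whole = match.group(0)
    start = match.start(1) - match.start(0)
    end = match.end(1) - match.start(0)
    return f"{whole[:start]}<{tag}>{whole[start:end]}</{tag}>{whole[end:]}"


def mark_up(text):
    """Apply every rule in order and return the marked-up text."""
    for tag, pattern in RULES:
        text = pattern.sub(lambda m, tag=tag: wrap(m, tag), text)
    return text


print(mark_up("Frau Schmidt reiste nach Berlin."))
```

In an iterative workflow like the one the abstract describes, a rule list of this kind would be run over a corpus, the output inspected, and the patterns and lexicon refined; the sketch only shows a single pass.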
Record in standard format (ISO 2709)
See the documentation on the Inist Standard format.
pA

Field | i1 | i2 | l | Subfields |
---|---|---|---|---|
A01 | 01 | 1 | | @0 0010-4817 |
A02 | 01 | | | @0 COHUAD |
A03 | | 1 | | @0 Comput. humanit. |
A05 | | | | @2 31 |
A06 | | | | @2 4 |
A08 | 01 | 1 | ENG | @1 Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories |
A11 | 01 | 1 | | @1 ROSTEK (L.) |
A11 | 02 | 1 | | @1 ALEXA (M.) |
A12 | 01 | 1 | | @1 LESSARD (Greg) @9 introd. |
A12 | 02 | 1 | | @1 LEVISON (Michael) @9 introd. |
A14 | 01 | | | @1 GMD - Integrated Publication and Information Systems Institute @3 DEU @Z 1 aut. |
A14 | 02 | | | @1 Center for Survey Research and Methodology (ZUMA) @3 INC @Z 2 aut. |
A15 | 01 | | | @1 French Studies, Queen's University @2 Kingston, Ontario K7L 3N6 @3 CAN @Z 1 aut. |
A15 | 02 | | | @1 Computing and Information Science, Queen's University @2 Kingston, Ontario K7L 3N6 @3 CAN @Z 2 aut. |
A18 | 01 | 1 | | @1 Association for Computers in the Humanities @3 INC @9 patr. |
A18 | 02 | 1 | | @1 Association for Literary and Linguistic Computing @3 INC @9 patr. |
A20 | | | | @1 311-326 |
A21 | | | | @1 1997 |
A23 | 01 | | | @0 ENG |
A43 | 01 | | | @1 INIST @2 14902 @5 354000072618530040 |
A44 | | | | @0 0000 @1 © 1998 INIST-CNRS. All rights reserved. |
A45 | | | | @0 14 ref. |
A47 | 01 | 1 | | @0 524-98-12881 |
A60 | | | | @1 P @2 C |
A61 | | | | @0 A |
A64 | | 1 | | @0 Computers and the humanities |
A66 | 01 | | | @0 NLD |
A68 | 01 | 1 | FRE | @1 Marquage dans TATOE et exportation vers SGML : Développement de règles pour l'identification des catégories du NITF |
C01 | 01 | | ENG | @0 This paper presents a method for developing limited-context grammar rules in order to mark up text automatically, by attaching specific text segments to a small number of well-defined and application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used in order to support the iterative process of developing a set of rules as well as for constructing and managing the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic mark up of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF) to facilitate their further exchange. The implemented export mechanism of the semantic mark up into NITF is also described in the paper |
C02 | 01 | L | | @0 52478 @1 XV |
C02 | 02 | L | | @0 524 |
C03 | 01 | L | FRE | @0 Linguistique informatique @5 01 |
C03 | 01 | L | ENG | @0 Computational linguistics @5 01 |
C03 | 02 | L | FRE | @0 Texte électronique @5 03 |
C03 | 02 | L | ENG | @0 Electronic text @5 03 |
C03 | 03 | L | FRE | @0 Etiquetage automatique @5 04 |
C03 | 03 | L | ENG | @0 Tagging @5 04 |
C03 | 04 | L | FRE | @0 Catégorie sémantique @5 05 |
C03 | 04 | L | ENG | @0 Semantic category @5 05 |
C03 | 05 | L | FRE | @0 Règles @5 06 |
C03 | 05 | L | ENG | @0 Rule @5 06 |
C03 | 06 | L | FRE | @0 Message @5 07 |
C03 | 06 | L | ENG | @0 Message @5 07 |
C03 | 07 | L | FRE | @0 Presse @5 08 |
C03 | 07 | L | ENG | @0 Newspaper @5 08 |
C03 | 08 | L | FRE | @0 Allemand @2 NL @5 09 |
C03 | 09 | L | FRE | @0 Encodage @4 INC @5 31 |
C03 | 10 | L | FRE | @0 Système TATOE @4 INC @5 32 |
C03 | 11 | L | FRE | @0 SGML @4 INC @5 33 |
C03 | 12 | L | FRE | @0 TEI @2 NI @4 CD @5 96 |
C03 | 12 | L | ENG | @0 TEI @2 NI @4 CD @5 96 |
C03 | 13 | L | FRE | @0 Corpus annoté @2 NI @4 CD @5 97 |
C03 | 13 | L | ENG | @0 Annotated corpus @2 NI @4 CD @5 97 |
N21 | | | | @1 327 |

pR

Field | i1 | i2 | l | Subfields |
---|---|---|---|---|
A30 | 01 | 1 | ENG | @1 1997 Joint Annual Meeting of the Association for Computers in the Humanities and the Association for Literary and Linguistic Computing @3 Kingston, ON CAN @4 1997 |
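The rows of this field listing follow a regular shape: a field tag, up to three short cells, and a content cell whose subfields are introduced by `@` plus a one-character code. The sketch below shows one way to read such a row in Python. The cell names `i1`, `i2` and `lang` are an assumption, inferred from the attributes `i1`, `i2` and `l` used in the XML rendering of the same record; `parse_row` is a hypothetical helper, not part of Dilib or any Inist tool, and it assumes the content itself contains no `|` or `@` characters.

```python
import re


def parse_row(row):
    """Split one pipe-delimited row into tag, indicator cells and subfields."""
    cells = [cell.strip() for cell in row.strip().rstrip("|").split("|")]
    # Pad to five cells so short rows do not raise on unpacking.
    tag, i1, i2, lang, content = (cells + [""] * 5)[:5]
    subfields = {
        code: value.strip()
        for code, value in re.findall(r"@(\w)\s+([^@]*)", content)
    }
    return {"tag": tag, "i1": i1, "i2": i2, "lang": lang, "subfields": subfields}


row = "A12 | 01 | 1 | | @1 LESSARD (Greg) @9 introd. |"
print(parse_row(row))
```

A dictionary keyed by subfield code keeps the sketch short; a field that repeated the same subfield code within one row would need a list of pairs instead.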
Inist format (server)
NO : | FRANCIS 524-98-12881 INIST |
FT : | (Marquage dans TATOE et exportation vers SGML : Développement de règles pour l'identification des catégories du NITF) |
ET : | Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories |
AU : | ROSTEK (L.); ALEXA (M.); LESSARD (Greg); LEVISON (Michael) |
AF : | GMD - Integrated Publication and Information Systems Institute/Allemagne (1 aut.); Center for Survey Research and Methodology (ZUMA)/Inconnu (2 aut.); French Studies, Queen's University/Kingston, Ontario K7L 3N6/Canada (1 aut.); Computing and Information Science, Queen's University/Kingston, Ontario K7L 3N6/Canada (2 aut.) |
DT : | Publication en série; Congrès; Niveau analytique |
SO : | Computers and the humanities; ISSN 0010-4817; Coden COHUAD; Pays-Bas; Da. 1997; Vol. 31; No. 4; Pp. 311-326; Bibl. 14 ref. |
LA : | Anglais |
EA : | This paper presents a method for developing limited-context grammar rules in order to mark up text automatically, by attaching specific text segments to a small number of well-defined and application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used in order to support the iterative process of developing a set of rules as well as for constructing and managing the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic mark up of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF) to facilitate their further exchange. The implemented export mechanism of the semantic mark up into NITF is also described in the paper |
CC : | 52478; 524 |
FD : | Linguistique informatique; Texte électronique; Etiquetage automatique; Catégorie sémantique; Règles; Message; Presse; Allemand; Encodage; Système TATOE; SGML; TEI; Corpus annoté |
ED : | Computational linguistics; Electronic text; Tagging; Semantic category; Rule; Message; Newspaper; TEI; Annotated corpus |
LO : | INIST-14902.354000072618530040 |
ID : | 524 |
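The `SO` field packs the citation details into one semicolon-separated string. As a rough illustration, the Python sketch below pulls the ISSN, year, volume, issue and page range out of it. The label prefixes (`Da.`, `Vol.`, `No.`, `Pp.`) are taken from this one record only, so treating them as fixed is an assumption, and `parse_so` is a hypothetical helper name.

```python
import re

SO = ("Computers and the humanities; ISSN 0010-4817; Coden COHUAD; Pays-Bas; "
      "Da. 1997; Vol. 31; No. 4; Pp. 311-326; Bibl. 14 ref.")

# One pattern per labelled part; labels as they appear in this record.
PATTERNS = {
    "issn": r"ISSN (\d{4}-\d{3}[\dX])",
    "year": r"Da\. (\d{4})",
    "volume": r"Vol\. (\d+)",
    "issue": r"No\. (\d+)",
    "pages": r"Pp\. (\d+-\d+)",
}


def parse_so(so):
    """Return the journal title plus every labelled part that is present."""
    result = {"journal": so.split(";")[0].strip()}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, so)
        if match:
            result[key] = match.group(1)
    return result


print(parse_so(SO))
```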
Links to Exploration step
Francis:524-98-12881
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories</title>
<author><name sortKey="Rostek, L" sort="Rostek, L" uniqKey="Rostek L" first="L." last="Rostek">L. Rostek</name>
<affiliation><inist:fA14 i1="01"><s1>GMD - Integrated Publication and Information Systems Institute</s1>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Alexa, M" sort="Alexa, M" uniqKey="Alexa M" first="M." last="Alexa">M. Alexa</name>
<affiliation><inist:fA14 i1="02"><s1>Center for Survey Research and Methodology (ZUMA)</s1>
<s3>INC</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">524-98-12881</idno>
<date when="1997">1997</date>
<idno type="stanalyst">FRANCIS 524-98-12881 INIST</idno>
<idno type="RBID">Francis:524-98-12881</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000334</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories</title>
<author><name sortKey="Rostek, L" sort="Rostek, L" uniqKey="Rostek L" first="L." last="Rostek">L. Rostek</name>
<affiliation><inist:fA14 i1="01"><s1>GMD - Integrated Publication and Information Systems Institute</s1>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Alexa, M" sort="Alexa, M" uniqKey="Alexa M" first="M." last="Alexa">M. Alexa</name>
<affiliation><inist:fA14 i1="02"><s1>Center for Survey Research and Methodology (ZUMA)</s1>
<s3>INC</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Computers and the humanities</title>
<title level="j" type="abbreviated">Comput. humanit.</title>
<idno type="ISSN">0010-4817</idno>
<imprint><date when="1997">1997</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Computers and the humanities</title>
<title level="j" type="abbreviated">Comput. humanit.</title>
<idno type="ISSN">0010-4817</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Annotated corpus</term>
<term>Computational linguistics</term>
<term>Electronic text</term>
<term>Message</term>
<term>Newspaper</term>
<term>Rule</term>
<term>Semantic category</term>
<term>TEI</term>
<term>Tagging</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Linguistique informatique</term>
<term>Texte électronique</term>
<term>Etiquetage automatique</term>
<term>Catégorie sémantique</term>
<term>Règles</term>
<term>Message</term>
<term>Presse</term>
<term>Allemand</term>
<term>Encodage</term>
<term>Système TATOE</term>
<term>SGML</term>
<term>TEI</term>
<term>Corpus annoté</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">This paper presents a method for developing limited-context grammar rules in order to mark up text automatically, by attaching specific text segments to a small number of well-defined and application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used in order to support the iterative process of developing a set of rules as well as for constructing and managing the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic mark up of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF) to facilitate their further exchange. The implemented export mechanism of the semantic mark up into NITF is also described in the paper</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>0010-4817</s0>
</fA01>
<fA02 i1="01"><s0>COHUAD</s0>
</fA02>
<fA03 i2="1"><s0>Comput. humanit.</s0>
</fA03>
<fA08 i1="01" i2="1" l="ENG"><s1>Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories</s1>
</fA08>
<fA11 i1="01" i2="1"><s1>ROSTEK (L.)</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>ALEXA (M.)</s1>
</fA11>
<fA12 i1="01" i2="1"><s1>LESSARD (Greg)</s1>
<s9>introd.</s9>
</fA12>
<fA12 i1="02" i2="1"><s1>LEVISON (Michael)</s1>
<s9>introd.</s9>
</fA12>
<fA14 i1="01"><s1>GMD - Integrated Publication and Information Systems Institute</s1>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</fA14>
<fA14 i1="02"><s1>Center for Survey Research and Methodology (ZUMA)</s1>
<s3>INC</s3>
<sZ>2 aut.</sZ>
</fA14>
<fA15 i1="01"><s1>French Studies, Queen's University</s1>
<s2>Kingston, Ontario K7L 3N6</s2>
<s3>CAN</s3>
<sZ>1 aut.</sZ>
</fA15>
<fA15 i1="02"><s1>Computing and Information Science, Queen's University</s1>
<s2>Kingston, Ontario K7L 3N6</s2>
<s3>CAN</s3>
<sZ>2 aut.</sZ>
</fA15>
<fA18 i1="01" i2="1"><s1>Association for Computers in the Humanities</s1>
<s3>INC</s3>
<s9>patr.</s9>
</fA18>
<fA18 i1="02" i2="1"><s1>Association for Literary and Linguistic Computing</s1>
<s3>INC</s3>
<s9>patr.</s9>
</fA18>
<fA20><s1>311-326</s1>
</fA20>
<fA21><s1>1997</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA43 i1="01"><s1>INIST</s1>
<s2>14902</s2>
<s5>354000072618530040</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 1998 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>14 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>524-98-12881</s0>
</fA47>
<fA60><s1>P</s1>
<s2>C</s2>
</fA60>
<fA64 i2="1"><s0>Computers and the humanities</s0>
</fA64>
<fA66 i1="01"><s0>NLD</s0>
</fA66>
<fA68 i1="01" i2="1" l="FRE"><s1>Marquage dans TATOE et exportation vers SGML : Développement de règles pour l'identification des catégories du NITF</s1>
</fA68>
<fC01 i1="01" l="ENG"><s0>This paper presents a method for developing limited-context grammar rules in order to mark up text automatically, by attaching specific text segments to a small number of well-defined and application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used in order to support the iterative process of developing a set of rules as well as for constructing and managing the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic mark up of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF) to facilitate their further exchange. The implemented export mechanism of the semantic mark up into NITF is also described in the paper</s0>
</fC01>
<fC02 i1="01" i2="L"><s0>52478</s0>
<s1>XV</s1>
</fC02>
<fC02 i1="02" i2="L"><s0>524</s0>
</fC02>
<fC03 i1="01" i2="L" l="FRE"><s0>Linguistique informatique</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="L" l="ENG"><s0>Computational linguistics</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="L" l="FRE"><s0>Texte électronique</s0>
<s5>03</s5>
</fC03>
<fC03 i1="02" i2="L" l="ENG"><s0>Electronic text</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="L" l="FRE"><s0>Etiquetage automatique</s0>
<s5>04</s5>
</fC03>
<fC03 i1="03" i2="L" l="ENG"><s0>Tagging</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="L" l="FRE"><s0>Catégorie sémantique</s0>
<s5>05</s5>
</fC03>
<fC03 i1="04" i2="L" l="ENG"><s0>Semantic category</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="L" l="FRE"><s0>Règles</s0>
<s5>06</s5>
</fC03>
<fC03 i1="05" i2="L" l="ENG"><s0>Rule</s0>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="L" l="FRE"><s0>Message</s0>
<s5>07</s5>
</fC03>
<fC03 i1="06" i2="L" l="ENG"><s0>Message</s0>
<s5>07</s5>
</fC03>
<fC03 i1="07" i2="L" l="FRE"><s0>Presse</s0>
<s5>08</s5>
</fC03>
<fC03 i1="07" i2="L" l="ENG"><s0>Newspaper</s0>
<s5>08</s5>
</fC03>
<fC03 i1="08" i2="L" l="FRE"><s0>Allemand</s0>
<s2>NL</s2>
<s5>09</s5>
</fC03>
<fC03 i1="09" i2="L" l="FRE"><s0>Encodage</s0>
<s4>INC</s4>
<s5>31</s5>
</fC03>
<fC03 i1="10" i2="L" l="FRE"><s0>Système TATOE</s0>
<s4>INC</s4>
<s5>32</s5>
</fC03>
<fC03 i1="11" i2="L" l="FRE"><s0>SGML</s0>
<s4>INC</s4>
<s5>33</s5>
</fC03>
<fC03 i1="12" i2="L" l="FRE"><s0>TEI</s0>
<s2>NI</s2>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="12" i2="L" l="ENG"><s0>TEI</s0>
<s2>NI</s2>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="13" i2="L" l="FRE"><s0>Corpus annoté</s0>
<s2>NI</s2>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fC03 i1="13" i2="L" l="ENG"><s0>Annotated corpus</s0>
<s2>NI</s2>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fN21><s1>327</s1>
</fN21>
</pA>
<pR><fA30 i1="01" i2="1" l="ENG"><s1>1997 Joint Annual Meeting of the Association for Computers in the Humanities and the Association for Literary and Linguistic Computing</s1>
<s3>Kingston, ON CAN</s3>
<s4>1997</s4>
</fA30>
</pR>
</standard>
<server><NO>FRANCIS 524-98-12881 INIST</NO>
<FT>(Marquage dans TATOE et exportation vers SGML : Développement de règles pour l'identification des catégories du NITF)</FT>
<ET>Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories</ET>
<AU>ROSTEK (L.); ALEXA (M.); LESSARD (Greg); LEVISON (Michael)</AU>
<AF>GMD - Integrated Publication and Information Systems Institute/Allemagne (1 aut.); Center for Survey Research and Methodology (ZUMA)/Inconnu (2 aut.); French Studies, Queen's University/Kingston, Ontario K7L 3N6/Canada (1 aut.); Computing and Information Science, Queen's University/Kingston, Ontario K7L 3N6/Canada (2 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>Computers and the humanities; ISSN 0010-4817; Coden COHUAD; Pays-Bas; Da. 1997; Vol. 31; No. 4; Pp. 311-326; Bibl. 14 ref.</SO>
<LA>Anglais</LA>
<EA>This paper presents a method for developing limited-context grammar rules in order to mark up text automatically, by attaching specific text segments to a small number of well-defined and application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used in order to support the iterative process of developing a set of rules as well as for constructing and managing the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic mark up of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF) to facilitate their further exchange. The implemented export mechanism of the semantic mark up into NITF is also described in the paper</EA>
<CC>52478; 524</CC>
<FD>Linguistique informatique; Texte électronique; Etiquetage automatique; Catégorie sémantique; Règles; Message; Presse; Allemand; Encodage; Système TATOE; SGML; TEI; Corpus annoté</FD>
<ED>Computational linguistics; Electronic text; Tagging; Semantic category; Rule; Message; Newspaper; TEI; Annotated corpus</ED>
<LO>INIST-14902.354000072618530040</LO>
<ID>524</ID>
</server>
</inist>
</record>
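For readers who want to pull a few values out of the XML record, the following Python sketch reads a title and the author surnames with the standard library's `xml.etree.ElementTree`. It works on a trimmed, hand-copied fragment of the `titleStmt` element rather than on the full record: as displayed here, the record uses the `inist:` prefix without a visible namespace declaration, which a namespace-aware parser would be expected to reject. The fragment and the helper name `summarize` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the titleStmt element; affiliations are omitted on purpose.
FRAGMENT = """
<titleStmt>
  <title xml:lang="en" level="a">Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories</title>
  <author><name first="L." last="Rostek">L. Rostek</name></author>
  <author><name first="M." last="Alexa">M. Alexa</name></author>
</titleStmt>
"""


def summarize(xml_text):
    """Return (title, list of author surnames) from a titleStmt fragment."""
    root = ET.fromstring(xml_text.strip())
    title = root.findtext("title")
    surnames = [name.get("last") for name in root.iter("name")]
    return title, surnames


title, surnames = summarize(FRAGMENT)
print(title)
print(surnames)
```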
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Informatique/explor/SgmlV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000334 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000334 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien
|wiki= Wicri/Informatique
|area= SgmlV1
|flux= PascalFrancis
|étape= Corpus
|type= RBID
|clé= Francis:524-98-12881
|texte= Marking up in TATOE and exporting to SGML : Rule development for identifying NITF categories
}}
This area was generated with Dilib version V0.6.33. Data generation: Mon Jul 1 14:26:08 2019. Site generation: Wed Apr 28 21:40:44 2021