ODIL : an SGML description language of the layout structure of documents
Internal identifier: 000A09 (PascalFrancis/Corpus)
Authors: P. Lefevre; F. Reynaud
Source: Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique; 1995; No. 6; 13 p.
RBID : Pascal:96-0263445
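French descriptors: Recherche documentaire; Reconnaissance forme; Reconnaissance image; SGML; ODIL; Structure document
English descriptors: Document retrieval; Pattern recognition; Image recognition; SGML; ODIL; Document structure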
Abstract
This paper describes an SGML coding format for the output of a document recognition prototype. Our proposal is a DTD named "ODIL" (Office Document Image Description Language) that precisely describes the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined as SGML elements, and their characteristics are defined by SGML attributes. The basic objects are blocks, each containing homogeneous information. The ODIL language supports five types of information: text, photos, line graphics, tables, and mathematical formulas. The ODIL representation of the recognition results is well suited to subsequent logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DTD makes it possible to use SGML tools for logical structure recognition, which is viewed as an SGML up-conversion problem.
Record in standard format (ISO 2709)
pA

| Field | i1 | i2 | l | Subfields |
|---|---|---|---|---|
| A06 | | | | @2 6 |
| A09 | 01 | 1 | ENG | @1 ODIL : an SGML description language of the layout structure of documents |
| A09 | 02 | 5 | FRE | @1 ODIL, langage de description SGML de la structure physique des documents |
| A12 | 01 | 1 | | @1 LEFEVRE (P.) |
| A12 | 02 | 1 | | @1 REYNAUD (F.) |
| A18 | 01 | 1 | | @1 Electricité de France. Direction des études et recherches @2 Clamart @3 FRA |
| A18 | 02 | 1 | | @1 Electricité de France. Département systèmes d'information et de documentation @2 Clamart @3 FRA |
| A21 | | | | @1 1995 |
| A23 | 01 | | | @0 ENG |
| A24 | 01 | | | @0 fre |
| A24 | 02 | | | @0 eng |
| A29 | | | | @1 13 p. @2 30 cm @3 ill., fig. |
| A43 | 01 | | | @1 INIST @2 26165 D @5 354000044784840000 |
| A44 | | | | @0 0000 |
| A45 | | | | @0 10 ref. |
| A47 | 01 | 1 | | @0 96-0263445 |
| A60 | | | | @1 P @2 R |
| A61 | | | | @0 M |
| A64 | 01 | 2 | | @0 Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique |
| A66 | 01 | | | @0 FRA |
| C01 | 01 | | ENG | @0 This paper describes a coding format in SGML for the output of a document recognition prototype. Our proposal is a DTD named "ODIL" - Office Document Image Description Language - that describes precisely the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined in the form of SGML Elements, and their characteristics are defined by SGML Attributes. The basis objects are blocks, containing homogeneous information. Five types of information are supported by the ODIL language : texts, photos, line graphics, tables, mathematic formulas. The ODIL representation of the recognition results is well adapted to a further logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DTD will permit to use SGML tools for the logical structure recognition, which is viewed as an SGML up-conversion problem. |
| C02 | 01 | 8 | | @0 430A09J |
| C02 | 02 | 8 | | @0 430A05B |
| C02 | 03 | X | | @0 001D02C03 |
| C03 | 01 | X | FRE | @0 Recherche documentaire @5 01 |
| C03 | 01 | X | ENG | @0 Document retrieval @5 01 |
| C03 | 01 | X | SPA | @0 Recuperación documental @5 01 |
| C03 | 02 | X | FRE | @0 Reconnaissance forme @5 02 |
| C03 | 02 | X | ENG | @0 Pattern recognition @5 02 |
| C03 | 02 | X | GER | @0 Mustererkennung @5 02 |
| C03 | 02 | X | SPA | @0 Reconocimiento patrón @5 02 |
| C03 | 03 | X | FRE | @0 Reconnaissance image @5 03 |
| C03 | 03 | X | ENG | @0 Image recognition @5 03 |
| C03 | 03 | X | SPA | @0 Reconocimiento imagen @5 03 |
| C03 | 04 | 8 | FRE | @0 SGML @4 CD @5 96 |
| C03 | 04 | 8 | ENG | @0 SGML @4 CD @5 96 |
| C03 | 05 | 8 | FRE | @0 ODIL @4 CD @5 97 |
| C03 | 05 | 8 | ENG | @0 ODIL @4 CD @5 97 |
| C03 | 06 | 8 | FRE | @0 Structure document @4 CD @5 98 |
| C03 | 06 | 8 | ENG | @0 Document structure @4 CD @5 98 |
| N21 | | | | @1 183 |

pR

| Field | i1 | i2 | l | Subfields |
|---|---|---|---|---|
| A39 | 01 | | | @1 EDF-DER @2 96-NO-00006 |
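The field listing above can be read mechanically. The following is a minimal Python sketch, not part of Dilib or of any Inist tooling: it assumes that each row carries five pipe-separated cells (field tag, two indicators, language, subfields) and that a subfield is introduced by `@` followed by a one-character code. Both conventions are inferred from the display only, and the names `parse_subfields` and `parse_row` are hypothetical.

```python
import re

# Assumption: a subfield starts with "@" plus a one-character code, as in the
# listing above. This is inferred from the display, not from a format spec.
SUBFIELD = re.compile(r"@(\w)\s+(.*?)(?=\s+@\w\s|$)")


def parse_subfields(text):
    """Return (code, value) pairs from a displayed subfield string."""
    return [(code, value.strip()) for code, value in SUBFIELD.findall(text.strip())]


def parse_row(row):
    """Split one pipe-delimited row into tag, indicators, language, subfields."""
    # Leading and trailing pipes are optional; the last cell keeps any later "|".
    cells = [cell.strip() for cell in row.strip().strip("|").split("|", 4)]
    if len(cells) != 5:
        raise ValueError(f"expected 5 cells, got {len(cells)}: {row!r}")
    field, i1, i2, lang, subfields = cells
    return {
        "field": field,
        "i1": i1,
        "i2": i2,
        "lang": lang,
        "subfields": parse_subfields(subfields),
    }


if __name__ == "__main__":
    print(parse_row("A43 | 01 | | | @1 INIST @2 26165 D @5 354000044784840000 |"))
```

A value that itself contains a sequence such as ` @2 ` would be split incorrectly by this pattern; none of the rows in this record do.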
Inist format (server)
NO : | PASCAL 96-0263445 INIST |
FT : | ODIL, langage de description SGML de la structure physique des documents |
ET : | ODIL : an SGML description language of the layout structure of documents |
AU : | LEFEVRE (P.); REYNAUD (F.) |
AF : | Electricité de France. Direction des études et recherches/Clamart/France; Electricité de France. Département systèmes d'information et de documentation/Clamart/France |
DT : | Publication en série; Rapport; Niveau monographique |
SO : | Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique; France; Da. 1995; No. 6; ; Pp. 13 p.; Abs. français/anglais; Bibl. 10 ref.; ill., fig. |
LA : | Anglais |
EA : | This paper describes a coding format in SGML for the output of a document recognition prototype. Our proposal is a DTD named "ODIL" - Office Document Image Description Language - that describes precisely the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined in the form of SGML Elements, and their characteristics are defined by SGML Attributes. The basis objects are blocks, containing homogeneous information. Five types of information are supported by the ODIL language : texts, photos, line graphics, tables, mathematic formulas. The ODIL representation of the recognition results is well adapted to a further logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DTD will permit to use SGML tools for the logical structure recognition, which is viewed as an SGML up-conversion problem. |
CC : | 430A09J; 430A05B; 001D02C03 |
FD : | Recherche documentaire; Reconnaissance forme; Reconnaissance image; SGML; ODIL; Structure document |
ED : | Document retrieval; Pattern recognition; Image recognition; SGML; ODIL; Document structure |
GD : | Mustererkennung |
SD : | Recuperación documental; Reconocimiento patrón; Reconocimiento imagen |
LO : | INIST-26165 D.354000044784840000 |
ID : | 96-0263445 |
Links to Exploration step
Pascal:96-0263445
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="m">ODIL : an SGML description language of the layout structure of documents</title>
<author><name sortKey="Lefevre, P" sort="Lefevre, P" uniqKey="Lefevre P" first="P." last="Lefevre">P. Lefevre</name>
<affiliation><inist:fA14 i1="01" i2="1"><s1>Electricité de France. Direction des études et recherches</s1>
<s2>Clamart</s2>
<s3>FRA</s3>
</inist:fA14>
</affiliation>
<affiliation><inist:fA14 i1="02" i2="1"><s1>Electricité de France. Département systèmes d'information et de documentation</s1>
<s2>Clamart</s2>
<s3>FRA</s3>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Reynaud, F" sort="Reynaud, F" uniqKey="Reynaud F" first="F." last="Reynaud">F. Reynaud</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">96-0263445</idno>
<date when="1995">1995</date>
<idno type="stanalyst">PASCAL 96-0263445 INIST</idno>
<idno type="RBID">Pascal:96-0263445</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000A09</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic></analytic>
<series><title level="j" type="main">Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique</title>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique</title>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Document retrieval</term>
<term>Document structure</term>
<term>Image recognition</term>
<term>ODIL</term>
<term>Pattern recognition</term>
<term>SGML</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Recherche documentaire</term>
<term>Reconnaissance forme</term>
<term>Reconnaissance image</term>
<term>SGML</term>
<term>ODIL</term>
<term>Structure document</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">This paper describes a coding format in SGML for the output of a document recognition prototype. Our proposal is a DTD named "ODIL" - Office Document Image Description Language - that describes precisely the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined in the form of SGML Elements, and their characteristics are defined by SGML Attributes. The basis objects are blocks, containing homogeneous information. Five types of information are supported by the ODIL language : texts, photos, line graphics, tables, mathematic formulas. The ODIL representation of the recognition results is well adapted to a further logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DTD will permit to use SGML tools for the logical structure recognition, which is viewed as an SGML up-conversion problem.</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA09 i1="01" i2="1" l="ENG"><s1>ODIL : an SGML description language of the layout structure of documents</s1>
</fA09>
<fA09 i1="02" i2="5" l="FRE"><s1>ODIL, langage de description SGML de la structure physique des documents</s1>
</fA09>
<fA12 i1="01" i2="1"><s1>LEFEVRE (P.)</s1>
</fA12>
<fA12 i1="02" i2="1"><s1>REYNAUD (F.)</s1>
</fA12>
<fA18 i1="01" i2="1"><s1>Electricité de France. Direction des études et recherches</s1>
<s2>Clamart</s2>
<s3>FRA</s3>
</fA18>
<fA18 i1="02" i2="1"><s1>Electricité de France. Département systèmes d'information et de documentation</s1>
<s2>Clamart</s2>
<s3>FRA</s3>
</fA18>
<fA21><s1>1995</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA24 i1="01"><s0>fre</s0>
</fA24>
<fA24 i1="02"><s0>eng</s0>
</fA24>
<fA29><s1>13 p.</s1>
<s2>30 cm</s2>
<s3>ill., fig.</s3>
</fA29>
<fA43 i1="01"><s1>INIST</s1>
<s2>26165 D</s2>
<s5>354000044784840000</s5>
</fA43>
<fA44><s0>0000</s0>
</fA44>
<fA45><s0>10 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>96-0263445</s0>
</fA47>
<fA60><s1>P</s1>
<s2>R</s2>
</fA60>
<fA64 i1="01" i2="2"><s0>Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique</s0>
</fA64>
<fA66 i1="01"><s0>FRA</s0>
</fA66>
<fC01 i1="01" l="ENG"><s0>This paper describes a coding format in SGML for the output of a document recognition prototype. Our proposal is a DTD named "ODIL" - Office Document Image Description Language - that describes precisely the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined in the form of SGML Elements, and their characteristics are defined by SGML Attributes. The basis objects are blocks, containing homogeneous information. Five types of information are supported by the ODIL language : texts, photos, line graphics, tables, mathematic formulas. The ODIL representation of the recognition results is well adapted to a further logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DTD will permit to use SGML tools for the logical structure recognition, which is viewed as an SGML up-conversion problem.</s0>
</fC01>
<fC02 i1="01" i2="8"><s0>430A09J</s0>
</fC02>
<fC02 i1="02" i2="8"><s0>430A05B</s0>
</fC02>
<fC02 i1="03" i2="X"><s0>001D02C03</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE"><s0>Recherche documentaire</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG"><s0>Document retrieval</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA"><s0>Recuperación documental</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE"><s0>Reconnaissance forme</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG"><s0>Pattern recognition</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="GER"><s0>Mustererkennung</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA"><s0>Reconocimiento patrón</s0>
<s5>02</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE"><s0>Reconnaissance image</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG"><s0>Image recognition</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA"><s0>Reconocimiento imagen</s0>
<s5>03</s5>
</fC03>
<fC03 i1="04" i2="8" l="FRE"><s0>SGML</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="04" i2="8" l="ENG"><s0>SGML</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="05" i2="8" l="FRE"><s0>ODIL</s0>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fC03 i1="05" i2="8" l="ENG"><s0>ODIL</s0>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fC03 i1="06" i2="8" l="FRE"><s0>Structure document</s0>
<s4>CD</s4>
<s5>98</s5>
</fC03>
<fC03 i1="06" i2="8" l="ENG"><s0>Document structure</s0>
<s4>CD</s4>
<s5>98</s5>
</fC03>
<fN21><s1>183</s1>
</fN21>
</pA>
<pR><fA39 i1="01"><s1>EDF-DER</s1>
<s2>96-NO-00006</s2>
</fA39>
</pR>
</standard>
<server><NO>PASCAL 96-0263445 INIST</NO>
<FT>ODIL, langage de description SGML de la structure physique des documents</FT>
<ET>ODIL : an SGML description language of the layout structure of documents</ET>
<AU>LEFEVRE (P.); REYNAUD (F.)</AU>
<AF>Electricité de France. Direction des études et recherches/Clamart/France; Electricité de France. Département systèmes d'information et de documentation/Clamart/France</AF>
<DT>Publication en série; Rapport; Niveau monographique</DT>
<SO>Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique; France; Da. 1995; No. 6; ; Pp. 13 p.; Abs. français/anglais; Bibl. 10 ref.; ill., fig.</SO>
<LA>Anglais</LA>
<EA>This paper describes a coding format in SGML for the output of a document recognition prototype. Our proposal is a DTD named "ODIL" - Office Document Image Description Language - that describes precisely the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined in the form of SGML Elements, and their characteristics are defined by SGML Attributes. The basis objects are blocks, containing homogeneous information. Five types of information are supported by the ODIL language : texts, photos, line graphics, tables, mathematic formulas. The ODIL representation of the recognition results is well adapted to a further logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DTD will permit to use SGML tools for the logical structure recognition, which is viewed as an SGML up-conversion problem.</EA>
<CC>430A09J; 430A05B; 001D02C03</CC>
<FD>Recherche documentaire; Reconnaissance forme; Reconnaissance image; SGML; ODIL; Structure document</FD>
<ED>Document retrieval; Pattern recognition; Image recognition; SGML; ODIL; Document structure</ED>
<GD>Mustererkennung</GD>
<SD>Recuperación documental; Reconocimiento patrón; Reconocimiento imagen</SD>
<LO>INIST-26165 D.354000044784840000</LO>
<ID>96-0263445</ID>
</server>
</inist>
</record>
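As an illustration of how the XML form of the record might be consumed, here is a minimal Python sketch that groups the `<term>` entries of the `<keywords>` blocks by their `xml:lang` attribute, using the standard library's `xml.etree.ElementTree`. It parses only a fragment copied from the record: the record as displayed uses an `inist:` prefix with no visible namespace declaration, which a namespace-aware parser would likely reject. The function name `keywords_by_language` and the `SAMPLE` constant are hypothetical.

```python
import xml.etree.ElementTree as ET

# The "xml" prefix is bound to this namespace by the XML specification itself,
# so ElementTree exposes xml:lang under this qualified name.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

# Fragment copied from the <profileDesc> part of the record above.
SAMPLE = """<textClass><keywords scheme="KwdEn" xml:lang="en"><term>Document retrieval</term>
<term>Document structure</term>
<term>Image recognition</term>
<term>ODIL</term>
<term>Pattern recognition</term>
<term>SGML</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Recherche documentaire</term>
<term>Reconnaissance forme</term>
<term>Reconnaissance image</term>
<term>SGML</term>
<term>ODIL</term>
<term>Structure document</term>
</keywords>
</textClass>"""


def keywords_by_language(xml_text):
    """Map each xml:lang value to the list of <term> texts found under it."""
    root = ET.fromstring(xml_text)
    result = {}
    for keywords in root.iter("keywords"):
        lang = keywords.get(XML_NS + "lang", "")
        terms = [term.text for term in keywords.findall("term")]
        result.setdefault(lang, []).extend(terms)
    return result


if __name__ == "__main__":
    for lang, terms in keywords_by_language(SAMPLE).items():
        print(lang, "; ".join(terms))
```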
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A09 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000A09 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien
|wiki= Ticri/CIDE
|area= OcrV1
|flux= PascalFrancis
|étape= Corpus
|type= RBID
|clé= Pascal:96-0263445
|texte= ODIL : an SGML description language of the layout structure of documents
}}
This area was generated with Dilib version V0.6.32. Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024.