OCR exploration server

ODIL : an SGML description language of the layout structure of documents

Internal identifier: 000A09 (PascalFrancis/Corpus); previous: 000A08; next: 000A10

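Authors: P. Lefevre; F. Reynaud

Source: Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique; 1995; No. 6; 13 p.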

RBID : Pascal:96-0263445

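French descriptors: Recherche documentaire; Reconnaissance forme; Reconnaissance image; SGML; ODIL; Structure document

English descriptors: Document retrieval; Pattern recognition; Image recognition; SGML; ODIL; Document structure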

Abstract

This paper describes an SGML coding format for the output of a document recognition prototype. Our proposal is a DTD named "ODIL" (Office Document Image Description Language) that precisely describes the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined as SGML elements, and their characteristics are defined by SGML attributes. The basic objects are blocks, each containing homogeneous information. The ODIL language supports five types of information: text, photos, line graphics, tables, and mathematical formulas. The ODIL representation of the recognition results is well suited to subsequent logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DTD makes it possible to use SGML tools for logical structure recognition, which is viewed as an SGML up-conversion problem.
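
To make the idea of blocks described by elements and attributes more concrete, here is a minimal Python sketch. It is not the ODIL DTD: the element name `block`, the attribute names (`kind`, `x`, `y`, `width`, `height`) and the five kind labels are assumptions chosen for illustration, since the record gives only the abstract and not the actual element or attribute names.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape, quoteattr

# Hypothetical labels for the five information types named in the abstract.
BLOCK_KINDS = ("text", "photo", "graphic", "table", "formula")


@dataclass
class Block:
    """A layout block holding one homogeneous kind of information."""
    kind: str
    x: int          # assumed: left edge of the block on the page image
    y: int          # assumed: top edge of the block on the page image
    width: int
    height: int
    content: str = ""

    def __post_init__(self):
        # Reject anything outside the five supported kinds.
        if self.kind not in BLOCK_KINDS:
            raise ValueError(f"unsupported block kind: {self.kind!r}")

    def to_markup(self) -> str:
        """Serialize the block as one element with its traits as attributes."""
        attrs = (
            f"kind={quoteattr(self.kind)} "
            f'x="{self.x}" y="{self.y}" '
            f'width="{self.width}" height="{self.height}"'
        )
        return f"<block {attrs}>{escape(self.content)}</block>"


title_block = Block("text", 10, 20, 300, 40, "A & B")
print(title_block.to_markup())
```

Keeping geometry in attributes and recognized text in element content mirrors the split the abstract describes; a later logical-structure pass could then read the same markup with ordinary SGML tooling.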

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

pA  
A06       @2 6
A09 01  1  ENG  @1 ODIL : an SGML description language of the layout structure of documents
A09 02  5  FRE  @1 ODIL, langage de description SGML de la structure physique des documents
A12 01  1    @1 LEFEVRE (P.)
A12 02  1    @1 REYNAUD (F.)
A18 01  1    @1 Electricité de France. Direction des études et recherches @2 Clamart @3 FRA
A18 02  1    @1 Electricité de France. Département systèmes d'information et de documentation @2 Clamart @3 FRA
A21       @1 1995
A23 01      @0 ENG
A24 01      @0 fre
A24 02      @0 eng
A29       @1 13 p. @2 30 cm @3 ill., fig.
A43 01      @1 INIST @2 26165 D @5 354000044784840000
A44       @0 0000
A45       @0 10 ref.
A47 01  1    @0 96-0263445
A60       @1 P @2 R
A61       @0 M
A64 01  2    @0 Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique
A66 01      @0 FRA
C01 01    ENG  @0 This paper describes a coding format in SGML for the output of a document recognition prototype. Our proposal is a DTD named "ODIL" - Office Document Image Description Language - that describes precisely the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined in the form of SGML Elements, and their characteristics are defined by SGML Attributes. The basis objects are blocks, containing homogeneous information. Five types of information are supported by the ODIL language : texts, photos, line graphics, tables, mathematic formulas. The ODIL representation of the recognition results is well adapted to a further logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DID will permit to use SGML tools for the logical structure recognition, which is viewed as an SGML up-conversion problem.
C02 01  8    @0 430A09J
C02 02  8    @0 430A05B
C02 03  X    @0 001D02C03
C03 01  X  FRE  @0 Recherche documentaire @5 01
C03 01  X  ENG  @0 Document retrieval @5 01
C03 01  X  SPA  @0 Recuperación documental @5 01
C03 02  X  FRE  @0 Reconnaissance forme @5 02
C03 02  X  ENG  @0 Pattern recognition @5 02
C03 02  X  GER  @0 Mustererkennung @5 02
C03 02  X  SPA  @0 Reconocimiento patrón @5 02
C03 03  X  FRE  @0 Reconnaissance image @5 03
C03 03  X  ENG  @0 Image recognition @5 03
C03 03  X  SPA  @0 Reconocimiento imagen @5 03
C03 04  8  FRE  @0 SGML @4 CD @5 96
C03 04  8  ENG  @0 SGML @4 CD @5 96
C03 05  8  FRE  @0 ODIL @4 CD @5 97
C03 05  8  ENG  @0 ODIL @4 CD @5 97
C03 06  8  FRE  @0 Structure document @4 CD @5 98
C03 06  8  ENG  @0 Document structure @4 CD @5 98
N21       @1 183
pR  
A39 01      @1 EDF-DER @2 96-NO-00006
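
The field lines above follow a visible pattern: a field tag, optional indicators, then subfields introduced by `@` and a digit. The following Python sketch splits one such line into its tag and subfields. It is based only on the layout shown on this page, not on the Inist Standard documentation, and the function name `parse_field` is hypothetical; indicators and language codes are deliberately ignored.

```python
import re

# A subfield is "@<digit> value", ending at the next subfield or at end of line.
SUBFIELD = re.compile(r"@(\d)\s+(.*?)(?=\s+@\d\s|$)")


def parse_field(line: str):
    """Return (tag, {subfield_code: value}) for one field line of the dump."""
    head, _, _ = line.partition("@")
    parts = head.split()
    tag = parts[0] if parts else ""   # e.g. "A18"; empty for a blank line
    subfields = {code: value.strip() for code, value in SUBFIELD.findall(line)}
    return tag, subfields


line = ("A18 01  1    @1 Electricité de France. Direction des études "
        "et recherches @2 Clamart @3 FRA")
tag, subfields = parse_field(line)
print(tag, subfields)
```

Section markers such as `pA` and `pR` carry no subfields, so they come back with an empty dictionary.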

Inist format (server)

NO : PASCAL 96-0263445 INIST
FT : ODIL, langage de description SGML de la structure physique des documents
ET : ODIL : an SGML description language of the layout structure of documents
AU : LEFEVRE (P.); REYNAUD (F.)
AF : Electricité de France. Direction des études et recherches/Clamart/France; Electricité de France. Département systèmes d'information et de documentation/Clamart/France
DT : Publication en série; Rapport; Niveau monographique
SO : Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique; France; Da. 1995; No. 6; ; Pp. 13 p.; Abs. français/anglais; Bibl. 10 ref.; ill., fig.
LA : Anglais
EA : This paper describes a coding format in SGML for the output of a document recognition prototype. Our proposal is a DTD named "ODIL" - Office Document Image Description Language - that describes precisely the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined in the form of SGML Elements, and their characteristics are defined by SGML Attributes. The basis objects are blocks, containing homogeneous information. Five types of information are supported by the ODIL language : texts, photos, line graphics, tables, mathematic formulas. The ODIL representation of the recognition results is well adapted to a further logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DID will permit to use SGML tools for the logical structure recognition, which is viewed as an SGML up-conversion problem.
CC : 430A09J; 430A05B; 001D02C03
FD : Recherche documentaire; Reconnaissance forme; Reconnaissance image; SGML; ODIL; Structure document
ED : Document retrieval; Pattern recognition; Image recognition; SGML; ODIL; Document structure
GD : Mustererkennung
SD : Recuperación documental; Reconocimiento patrón; Reconocimiento imagen
LO : INIST-26165 D.354000044784840000
ID : 96-0263445
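
The server format above is a list of `TAG : value` lines in which some fields hold several values separated by semicolons. The Python sketch below reads such lines into a dictionary. The choice of which tags to treat as multi-valued (`MULTI_VALUED`) is an assumption inferred from this one record, and `parse_server_record` is a hypothetical helper name.

```python
# Tags assumed to hold semicolon-separated lists, judging from this record only.
MULTI_VALUED = {"AU", "CC", "FD", "ED", "GD", "SD"}


def parse_server_record(text: str) -> dict:
    """Map each 'TAG : value' line to its value (a list for multi-valued tags)."""
    record = {}
    for line in text.splitlines():
        # Split on the first " : " only, so titles such as "ODIL : an SGML ..."
        # keep their own colon.
        tag, sep, value = line.partition(" : ")
        if not sep:
            continue  # skip blank or malformed lines
        tag, value = tag.strip(), value.strip()
        if tag in MULTI_VALUED:
            record[tag] = [item.strip() for item in value.split(";")]
        else:
            record[tag] = value
    return record


sample = """NO : PASCAL 96-0263445 INIST
ET : ODIL : an SGML description language of the layout structure of documents
AU : LEFEVRE (P.); REYNAUD (F.)
CC : 430A09J; 430A05B; 001D02C03
LA : Anglais"""

record = parse_server_record(sample)
print(record["AU"])
```

Fields such as `AF` and `SO` also contain semicolons but mix several levels of structure, so the sketch leaves them as plain strings.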

Links to Exploration step

Pascal:96-0263445

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="m">ODIL : an SGML description language of the layout structure of documents</title>
<author>
<name sortKey="Lefevre, P" sort="Lefevre, P" uniqKey="Lefevre P" first="P." last="Lefevre">P. Lefevre</name>
<affiliation>
<inist:fA14 i1="01" i2="1">
<s1>Electricité de France. Direction des études et recherches</s1>
<s2>Clamart</s2>
<s3>FRA</s3>
</inist:fA14>
</affiliation>
<affiliation>
<inist:fA14 i1="02" i2="1">
<s1>Electricité de France. Département systèmes d'information et de documentation</s1>
<s2>Clamart</s2>
<s3>FRA</s3>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Reynaud, F" sort="Reynaud, F" uniqKey="Reynaud F" first="F." last="Reynaud">F. Reynaud</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">96-0263445</idno>
<date when="1995">1995</date>
<idno type="stanalyst">PASCAL 96-0263445 INIST</idno>
<idno type="RBID">Pascal:96-0263445</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000A09</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic></analytic>
<series>
<title level="j" type="main">Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique</title>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique</title>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Document retrieval</term>
<term>Document structure</term>
<term>Image recognition</term>
<term>ODIL</term>
<term>Pattern recognition</term>
<term>SGML</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Recherche documentaire</term>
<term>Reconnaissance forme</term>
<term>Reconnaissance image</term>
<term>SGML</term>
<term>ODIL</term>
<term>Structure document</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">This paper describes a coding format in SGML for the output of a document recognition prototype. Our proposal is a DTD named "ODIL" - Office Document Image Description Language - that describes precisely the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined in the form of SGML Elements, and their characteristics are defined by SGML Attributes. The basis objects are blocks, containing homogeneous information. Five types of information are supported by the ODIL language : texts, photos, line graphics, tables, mathematic formulas. The ODIL representation of the recognition results is well adapted to a further logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DID will permit to use SGML tools for the logical structure recognition, which is viewed as an SGML up-conversion problem.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA06>
<s2>6</s2>
</fA06>
<fA09 i1="01" i2="1" l="ENG">
<s1>ODIL : an SGML description language of the layout structure of documents</s1>
</fA09>
<fA09 i1="02" i2="5" l="FRE">
<s1>ODIL, langage de description SGML de la structure physique des documents</s1>
</fA09>
<fA12 i1="01" i2="1">
<s1>LEFEVRE (P.)</s1>
</fA12>
<fA12 i1="02" i2="1">
<s1>REYNAUD (F.)</s1>
</fA12>
<fA18 i1="01" i2="1">
<s1>Electricité de France. Direction des études et recherches</s1>
<s2>Clamart</s2>
<s3>FRA</s3>
</fA18>
<fA18 i1="02" i2="1">
<s1>Electricité de France. Département systèmes d'information et de documentation</s1>
<s2>Clamart</s2>
<s3>FRA</s3>
</fA18>
<fA21>
<s1>1995</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA24 i1="01">
<s0>fre</s0>
</fA24>
<fA24 i1="02">
<s0>eng</s0>
</fA24>
<fA29>
<s1>13 p.</s1>
<s2>30 cm</s2>
<s3>ill., fig.</s3>
</fA29>
<fA43 i1="01">
<s1>INIST</s1>
<s2>26165 D</s2>
<s5>354000044784840000</s5>
</fA43>
<fA44>
<s0>0000</s0>
</fA44>
<fA45>
<s0>10 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>96-0263445</s0>
</fA47>
<fA60>
<s1>P</s1>
<s2>R</s2>
</fA60>
<fA61>
<s0>M</s0>
</fA61>
<fA64 i1="01" i2="2">
<s0>Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique</s0>
</fA64>
<fA66 i1="01">
<s0>FRA</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>This paper describes a coding format in SGML for the output of a document recognition prototype. Our proposal is a DTD named "ODIL" - Office Document Image Description Language - that describes precisely the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined in the form of SGML Elements, and their characteristics are defined by SGML Attributes. The basis objects are blocks, containing homogeneous information. Five types of information are supported by the ODIL language : texts, photos, line graphics, tables, mathematic formulas. The ODIL representation of the recognition results is well adapted to a further logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DID will permit to use SGML tools for the logical structure recognition, which is viewed as an SGML up-conversion problem.</s0>
</fC01>
<fC02 i1="01" i2="8">
<s0>430A09J</s0>
</fC02>
<fC02 i1="02" i2="8">
<s0>430A05B</s0>
</fC02>
<fC02 i1="03" i2="X">
<s0>001D02C03</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE">
<s0>Recherche documentaire</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG">
<s0>Document retrieval</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA">
<s0>Recuperación documental</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE">
<s0>Reconnaissance forme</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG">
<s0>Pattern recognition</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="GER">
<s0>Mustererkennung</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA">
<s0>Reconocimiento patrón</s0>
<s5>02</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Reconnaissance image</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Image recognition</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Reconocimiento imagen</s0>
<s5>03</s5>
</fC03>
<fC03 i1="04" i2="8" l="FRE">
<s0>SGML</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="04" i2="8" l="ENG">
<s0>SGML</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="05" i2="8" l="FRE">
<s0>ODIL</s0>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fC03 i1="05" i2="8" l="ENG">
<s0>ODIL</s0>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fC03 i1="06" i2="8" l="FRE">
<s0>Structure document</s0>
<s4>CD</s4>
<s5>98</s5>
</fC03>
<fC03 i1="06" i2="8" l="ENG">
<s0>Document structure</s0>
<s4>CD</s4>
<s5>98</s5>
</fC03>
<fN21>
<s1>183</s1>
</fN21>
</pA>
<pR>
<fA39 i1="01">
<s1>EDF-DER</s1>
<s2>96-NO-00006</s2>
</fA39>
</pR>
</standard>
<server>
<NO>PASCAL 96-0263445 INIST</NO>
<FT>ODIL, langage de description SGML de la structure physique des documents</FT>
<ET>ODIL : an SGML description language of the layout structure of documents</ET>
<AU>LEFEVRE (P.); REYNAUD (F.)</AU>
<AF>Electricité de France. Direction des études et recherches/Clamart/France; Electricité de France. Département systèmes d'information et de documentation/Clamart/France</AF>
<DT>Publication en série; Rapport; Niveau monographique</DT>
<SO>Collection de notes internes de la Direction des études et recherches. Organisation, information, environnement social et économique; France; Da. 1995; No. 6; ; Pp. 13 p.; Abs. français/anglais; Bibl. 10 ref.; ill., fig.</SO>
<LA>Anglais</LA>
<EA>This paper describes a coding format in SGML for the output of a document recognition prototype. Our proposal is a DTD named "ODIL" - Office Document Image Description Language - that describes precisely the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined in the form of SGML Elements, and their characteristics are defined by SGML Attributes. The basis objects are blocks, containing homogeneous information. Five types of information are supported by the ODIL language : texts, photos, line graphics, tables, mathematic formulas. The ODIL representation of the recognition results is well adapted to a further logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DID will permit to use SGML tools for the logical structure recognition, which is viewed as an SGML up-conversion problem.</EA>
<CC>430A09J; 430A05B; 001D02C03</CC>
<FD>Recherche documentaire; Reconnaissance forme; Reconnaissance image; SGML; ODIL; Structure document</FD>
<ED>Document retrieval; Pattern recognition; Image recognition; SGML; ODIL; Document structure</ED>
<GD>Mustererkennung</GD>
<SD>Recuperación documental; Reconocimiento patrón; Reconocimiento imagen</SD>
<LO>INIST-26165 D.354000044784840000</LO>
<ID>96-0263445</ID>
</server>
</inist>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A09 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000A09 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:96-0263445
   |texte=   ODIL : an SGML description language of the layout structure of documents
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024