OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

DL architecture for Indic scripts

Internal identifier: 000278 (PascalFrancis/Curation); previous: 000277; next: 000279

Authors: Suryaprakash Kompalli [United States]; Srirangaraj Setlur [United States]; Venugopal Govindaraju [United States]

RBID : Pascal:04-0536231

French descriptors

English descriptors

Abstract

In this study, we outline computational issues in the design of a Digital Library (DL) for Indic languages. The complicated character structure of Indic scripts entails novel OCR analysis techniques and user interface (UI) designs. This paper describes a multi-tier software architecture, which provides text and image processing tools as independent, reusable entities. Techniques for measuring and evaluating different stages of an Indic script recognition engine are outlined.
pA  
A01 01  1    @0 0302-9743
A05       @2 3163
A08 01  1  ENG  @1 DL architecture for Indic scripts
A09 01  1  ENG  @1 DAS 2004 : document analysis systems VI : Florence, 8-10 September 2004
A11 01  1    @1 KOMPALLI (Suryaprakash)
A11 02  1    @1 SETLUR (Srirangaraj)
A11 03  1    @1 GOVINDARAJU (Venugopal)
A12 01  1    @1 MARINAI (Simone) @9 ed.
A12 02  1    @1 DENGEL (Andreas) @9 ed.
A14 01      @1 CEDAR, UB Commons, 520 Lee Entrance, Suite 202 @2 Amherst, NY 14228 @3 USA @Z 1 aut. @Z 2 aut. @Z 3 aut.
A20       @1 28-38
A21       @1 2004
A23 01      @0 ENG
A26 01      @0 3-540-23060-2
A43 01      @1 INIST @2 16343 @5 354000124343050030
A44       @0 0000 @1 © 2004 INIST-CNRS. All rights reserved.
A45       @0 22 ref.
A47 01  1    @0 04-0536231
A60       @1 P @2 C
A61       @0 A
A64 01  1    @0 Lecture notes in computer science
A66 01      @0 DEU
C01 01    ENG  @0 In this study, we outline computational issues in the design of a Digital Library (DL) for Indic languages. The complicated character structure of Indic scripts entails novel OCR analysis techniques and user interface (UI) designs. This paper describes a multi-tier software architecture, which provides text and image processing tools as independent, reusable entities. Techniques for measuring and evaluating different stages of an Indic script recognition engine are outlined.
C02 01  X    @0 001D02B07B
C03 01  X  FRE  @0 Structure document @5 01
C03 01  X  ENG  @0 Document structure @5 01
C03 01  X  SPA  @0 Estructura documental @5 01
C03 02  X  FRE  @0 Analyse donnée @5 02
C03 02  X  ENG  @0 Data analysis @5 02
C03 02  X  SPA  @0 Análisis datos @5 02
C03 03  X  FRE  @0 Bibliothèque électronique @5 06
C03 03  X  ENG  @0 Electronic library @5 06
C03 03  X  SPA  @0 Biblioteca electronica @5 06
C03 04  X  FRE  @0 Reconnaissance caractère @5 07
C03 04  X  ENG  @0 Character recognition @5 07
C03 04  X  SPA  @0 Reconocimiento carácter @5 07
C03 05  X  FRE  @0 Reconnaissance optique caractère @5 08
C03 05  X  ENG  @0 Optical character recognition @5 08
C03 05  X  SPA  @0 Reconocimento óptico de caracteres @5 08
C03 06  X  FRE  @0 Interface utilisateur @5 09
C03 06  X  ENG  @0 User interface @5 09
C03 06  X  SPA  @0 Interfase usuario @5 09
C03 07  X  FRE  @0 Texte @5 10
C03 07  X  ENG  @0 Text @5 10
C03 07  X  SPA  @0 Texto @5 10
C03 08  X  FRE  @0 Traitement image @5 11
C03 08  X  ENG  @0 Image processing @5 11
C03 08  X  SPA  @0 Procesamiento imagen @5 11
C03 09  3  FRE  @0 Architecture logiciel @5 18
C03 09  3  ENG  @0 Software architecture @5 18
N21       @1 306
N44 01      @1 OTO
N82       @1 OTO
pR  
A30 01  1  ENG  @1 International workshop on document analysis systems @2 6 @3 Florence ITA @4 2004-09-08
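
The flat field listing above can be read mechanically. The following is a minimal Python sketch, not part of Dilib or any INIST tooling; the function name `parse_line` and the parsing rules are assumptions inferred only from the lines shown here (a field tag, optional indicator and language tokens, then `@`-prefixed subfields). Values that themselves contain an `@` followed by a single character and a space would confuse it.

```python
import re

# Assumption: a subfield starts with "@" plus one code character, either at the
# start of the field body or after whitespace, and is followed by whitespace.
SUBFIELD = re.compile(r"(?:^|\s)@(\w)\s+")

SAMPLE = """A11 01  1    @1 KOMPALLI (Suryaprakash)
C03 01  X  FRE  @0 Structure document @5 01
A14 01      @1 CEDAR, UB Commons, 520 Lee Entrance, Suite 202 @2 Amherst, NY 14228 @3 USA @Z 1 aut. @Z 2 aut. @Z 3 aut."""


def parse_line(line):
    """Parse one flat-format line into tag, header tokens, language, subfields."""
    at = line.find("@")
    head = line if at == -1 else line[:at]
    body = "" if at == -1 else line[at:]
    tokens = head.split()
    if not tokens:
        return None  # blank line, or a line with no tag before the first "@"
    # re.split with one capture group yields: text, code, text, code, text, ...
    parts = SUBFIELD.split(body)
    subfields = [(parts[i], parts[i + 1].strip())
                 for i in range(1, len(parts) - 1, 2)]
    header = tokens[1:]
    # Assumption: a three-letter uppercase header token is a language code.
    lang = next((t for t in header
                 if len(t) == 3 and t.isalpha() and t.isupper()), None)
    return {"tag": tokens[0], "header": header, "lang": lang,
            "subfields": subfields}


for raw in SAMPLE.splitlines():
    print(parse_line(raw))
```

Section markers such as `pA` and `pR` carry no subfields, so the sketch returns them with an empty subfield list rather than treating them as errors.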

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">DL architecture for Indic scripts</title>
<author>
<name sortKey="Kompalli, Suryaprakash" sort="Kompalli, Suryaprakash" uniqKey="Kompalli S" first="Suryaprakash" last="Kompalli">Suryaprakash Kompalli</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>CEDAR, UB Commons, 520 Lee Entrance, Suite 202</s1>
<s2>Amherst, NY 14228</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Setlur, Srirangaraj" sort="Setlur, Srirangaraj" uniqKey="Setlur S" first="Srirangaraj" last="Setlur">Srirangaraj Setlur</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>CEDAR, UB Commons, 520 Lee Entrance, Suite 202</s1>
<s2>Amherst, NY 14228</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Govindaraju, Venugopal" sort="Govindaraju, Venugopal" uniqKey="Govindaraju V" first="Venugopal" last="Govindaraju">Venugopal Govindaraju</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>CEDAR, UB Commons, 520 Lee Entrance, Suite 202</s1>
<s2>Amherst, NY 14228</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">04-0536231</idno>
<date when="2004">2004</date>
<idno type="stanalyst">PASCAL 04-0536231 INIST</idno>
<idno type="RBID">Pascal:04-0536231</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000511</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000278</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">DL architecture for Indic scripts</title>
<author>
<name sortKey="Kompalli, Suryaprakash" sort="Kompalli, Suryaprakash" uniqKey="Kompalli S" first="Suryaprakash" last="Kompalli">Suryaprakash Kompalli</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>CEDAR, UB Commons, 520 Lee Entrance, Suite 202</s1>
<s2>Amherst, NY 14228</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Setlur, Srirangaraj" sort="Setlur, Srirangaraj" uniqKey="Setlur S" first="Srirangaraj" last="Setlur">Srirangaraj Setlur</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>CEDAR, UB Commons, 520 Lee Entrance, Suite 202</s1>
<s2>Amherst, NY 14228</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Govindaraju, Venugopal" sort="Govindaraju, Venugopal" uniqKey="Govindaraju V" first="Venugopal" last="Govindaraju">Venugopal Govindaraju</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>CEDAR, UB Commons, 520 Lee Entrance, Suite 202</s1>
<s2>Amherst, NY 14228</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Lecture notes in computer science</title>
<idno type="ISSN">0302-9743</idno>
<imprint>
<date when="2004">2004</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Lecture notes in computer science</title>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Data analysis</term>
<term>Document structure</term>
<term>Electronic library</term>
<term>Image processing</term>
<term>Optical character recognition</term>
<term>Software architecture</term>
<term>Text</term>
<term>User interface</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Structure document</term>
<term>Analyse donnée</term>
<term>Bibliothèque électronique</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Interface utilisateur</term>
<term>Texte</term>
<term>Traitement image</term>
<term>Architecture logiciel</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">In this study, we outline computational issues in the design of a Digital Library (DL) for Indic languages. The complicated character structure of Indic scripts entails novel OCR analysis techniques and user interface (UI) designs. This paper describes a multi-tier software architecture, which provides text and image processing tools as independent, reusable entities. Techniques for measuring and evaluating different stages of an Indic script recognition engine are outlined.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>0302-9743</s0>
</fA01>
<fA05>
<s2>3163</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG">
<s1>DL architecture for Indic scripts</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG">
<s1>DAS 2004 : document analysis systems VI : Florence, 8-10 September 2004</s1>
</fA09>
<fA11 i1="01" i2="1">
<s1>KOMPALLI (Suryaprakash)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>SETLUR (Srirangaraj)</s1>
</fA11>
<fA11 i1="03" i2="1">
<s1>GOVINDARAJU (Venugopal)</s1>
</fA11>
<fA12 i1="01" i2="1">
<s1>MARINAI (Simone)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1">
<s1>DENGEL (Andreas)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01">
<s1>CEDAR, UB Commons, 520 Lee Entrance, Suite 202</s1>
<s2>Amherst, NY 14228</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</fA14>
<fA20>
<s1>28-38</s1>
</fA20>
<fA21>
<s1>2004</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA26 i1="01">
<s0>3-540-23060-2</s0>
</fA26>
<fA43 i1="01">
<s1>INIST</s1>
<s2>16343</s2>
<s5>354000124343050030</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2004 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>22 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>04-0536231</s0>
</fA47>
<fA60>
<s1>P</s1>
<s2>C</s2>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>Lecture notes in computer science</s0>
</fA64>
<fA66 i1="01">
<s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>In this study, we outline computational issues in the design of a Digital Library (DL) for Indic languages. The complicated character structure of Indic scripts entails novel OCR analysis techniques and user interface (UI) designs. This paper describes a multi-tier software architecture, which provides text and image processing tools as independent, reusable entities. Techniques for measuring and evaluating different stages of an Indic script recognition engine are outlined.</s0>
</fC01>
<fC02 i1="01" i2="X">
<s0>001D02B07B</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE">
<s0>Structure document</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG">
<s0>Document structure</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA">
<s0>Estructura documental</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE">
<s0>Analyse donnée</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG">
<s0>Data analysis</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA">
<s0>Análisis datos</s0>
<s5>02</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Bibliothèque électronique</s0>
<s5>06</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Electronic library</s0>
<s5>06</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Biblioteca electronica</s0>
<s5>06</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Reconnaissance caractère</s0>
<s5>07</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Character recognition</s0>
<s5>07</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Reconocimiento carácter</s0>
<s5>07</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE">
<s0>Reconnaissance optique caractère</s0>
<s5>08</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG">
<s0>Optical character recognition</s0>
<s5>08</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA">
<s0>Reconocimento óptico de caracteres</s0>
<s5>08</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE">
<s0>Interface utilisateur</s0>
<s5>09</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG">
<s0>User interface</s0>
<s5>09</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA">
<s0>Interfase usuario</s0>
<s5>09</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE">
<s0>Texte</s0>
<s5>10</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG">
<s0>Text</s0>
<s5>10</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA">
<s0>Texto</s0>
<s5>10</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE">
<s0>Traitement image</s0>
<s5>11</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG">
<s0>Image processing</s0>
<s5>11</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA">
<s0>Procesamiento imagen</s0>
<s5>11</s5>
</fC03>
<fC03 i1="09" i2="3" l="FRE">
<s0>Architecture logiciel</s0>
<s5>18</s5>
</fC03>
<fC03 i1="09" i2="3" l="ENG">
<s0>Software architecture</s0>
<s5>18</s5>
</fC03>
<fN21>
<s1>306</s1>
</fN21>
<fN44 i1="01">
<s1>OTO</s1>
</fN44>
<fN82>
<s1>OTO</s1>
</fN82>
</pA>
<pR>
<fA30 i1="01" i2="1" l="ENG">
<s1>International workshop on document analysis systems</s1>
<s2>6</s2>
<s3>Florence ITA</s3>
<s4>2004-09-08</s4>
</fA30>
</pR>
</standard>
</inist>
</record>
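
As an illustration of how the descriptor fields in this record might be consumed, here is a hedged Python sketch that groups the `<fC03>` terms by their `l` (language) attribute using the standard `xml.etree.ElementTree` module. The helper name `descriptors_by_language` is hypothetical. The sketch works on a small fragment copied from the `<pA>` block rather than on the whole record, because the record as displayed uses the `wicri:` and `inist:` prefixes without namespace declarations, which a namespace-aware parser is likely to reject.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# A fragment copied from the <pA> block of the record (a few fC03 fields only).
FRAGMENT = """<pA>
<fC03 i1="01" i2="X" l="FRE"><s0>Structure document</s0><s5>01</s5></fC03>
<fC03 i1="01" i2="X" l="ENG"><s0>Document structure</s0><s5>01</s5></fC03>
<fC03 i1="01" i2="X" l="SPA"><s0>Estructura documental</s0><s5>01</s5></fC03>
<fC03 i1="09" i2="3" l="FRE"><s0>Architecture logiciel</s0><s5>18</s5></fC03>
<fC03 i1="09" i2="3" l="ENG"><s0>Software architecture</s0><s5>18</s5></fC03>
</pA>"""


def descriptors_by_language(xml_text):
    """Return {language code: [descriptor terms]} for every fC03 field."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for field in root.iter("fC03"):
        term = field.findtext("s0")  # s0 holds the descriptor text
        if term:
            groups[field.get("l")].append(term.strip())
    return dict(groups)


for lang, terms in descriptors_by_language(FRAGMENT).items():
    print(lang, terms)
```

Terms are kept in document order within each language, which matches the order of the `i1` indicator in the fragment.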

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000278 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Curation/biblio.hfd -nk 000278 | SxmlIndent | more
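
To fetch other records the same way, the command line can be assembled programmatically. The sketch below is an assumption-laden convenience, not a Dilib API: the function name `hfdselect_pipeline` is hypothetical, it only builds the pipeline string from the `HfdSelect -h ... -nk ...` and `SxmlIndent` invocations shown above, and it does not run anything. The step directory must be passed already expanded, since quoting would prevent the shell from expanding `$EXPLOR_STEP`.

```python
import shlex


def hfdselect_pipeline(step_dir, key, base="biblio.hfd"):
    """Build the shell pipeline that selects one record and indents its XML."""
    hfd_path = f"{step_dir.rstrip('/')}/{base}"
    # Same options as in the commands above: -h <hfd file> -nk <record key>.
    select = ["HfdSelect", "-h", hfd_path, "-nk", key]
    # shlex.join quotes each argument so unusual paths stay a single word.
    return " | ".join([shlex.join(select), "SxmlIndent"])


print(hfdselect_pipeline("/data/PascalFrancis/Curation", "000278"))
```

The interactive pager (`more`) is left out on purpose so that the output can be redirected to a file or another tool.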

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Curation
   |type=    RBID
   |clé=     Pascal:04-0536231
   |texte=   DL architecture for Indic scripts
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024