OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A blackboard approach towards integrated Farsi OCR system

Internal identifier: 000584 (PascalFrancis/Curation); previous: 000583; next: 000585

Authors: Hossein Khosravi [Iran]; Ehsanollah Kabir [Iran]

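Source: International journal on document analysis and recognition (Print), ISSN 1433-2833, 2009, vol. 12, no. 1, pp. 21-32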

RBID : Pascal:10-0180822

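French descriptors: Reconnaissance caractère; Reconnaissance optique caractère; Texte; Système information; Classification; Analyse statistique; Approche probabiliste; Extraction forme; Segmentation

English descriptors: Character recognition; Classification; Information system; Optical character recognition; Pattern extraction; Probabilistic approach; Segmentation; Statistical analysis; Text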

Abstract

An integrated OCR system for Farsi text is proposed. The system uses information from several knowledge sources (KSs) and manages them in a blackboard approach. Some KSs, such as classifiers, are acquired a priori through an offline training process, while others, such as statistical features, are extracted online during recognition. An arbiter controls the interactions between the solution blackboard and the KSs. The system was tested on 20 real-life scanned documents in ten popular Farsi fonts and achieved recognition rates of 97.05% at the word level and 99.03% at the character level.
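
The blackboard organisation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names (`Blackboard`, `KnowledgeSource`, `Arbiter`), the three toy knowledge sources and the "run the first ready source" scheduling rule are all assumptions made for illustration. A page is modelled as a plain string and its glyphs as single characters.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Blackboard:
    """Shared solution state that every knowledge source reads and writes."""
    data: Dict[str, object] = field(default_factory=dict)


@dataclass
class KnowledgeSource:
    name: str
    needs: List[str]                      # keys that must already be posted
    provides: str                         # key this source posts
    run: Callable[[Blackboard], object]   # computes the posted value

    def can_run(self, bb: Blackboard) -> bool:
        # Ready when all inputs are present and the output is still missing.
        return self.provides not in bb.data and all(k in bb.data for k in self.needs)


class Arbiter:
    """Hypothetical controller: runs the first ready source until none is ready."""

    def __init__(self, sources: List[KnowledgeSource]):
        self.sources = list(sources)

    def solve(self, bb: Blackboard) -> List[str]:
        trace = []
        while True:
            ready = [ks for ks in self.sources if ks.can_run(bb)]
            if not ready:
                return trace
            ks = ready[0]
            bb.data[ks.provides] = ks.run(bb)
            trace.append(ks.name)


# Toy stand-ins for real knowledge sources (all hypothetical).
LEXICON = {"ocr"}
SOURCES = [
    KnowledgeSource("lexicon", ["text"], "valid",
                    lambda bb: bb.data["text"] in LEXICON),
    KnowledgeSource("classifier", ["glyphs"], "text",
                    lambda bb: "".join(g.lower() for g in bb.data["glyphs"])),
    KnowledgeSource("segmenter", ["page"], "glyphs",
                    lambda bb: list(bb.data["page"])),
]


def run_demo(page: str = "OCR"):
    bb = Blackboard({"page": page})
    trace = Arbiter(SOURCES).solve(bb)
    return bb, trace
```

The sources are deliberately listed out of order: the arbiter decides what runs next from the state of the blackboard, not from the order of the list. A realistic arbiter would also have to rank competing hypotheses, which this sketch leaves out.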
pA  
A01 01  1    @0 1433-2833
A03   1    @0 Int. j. doc. anal. recognit. : (Print)
A05       @2 12
A06       @2 1
A08 01  1  ENG  @1 A blackboard approach towards integrated Farsi OCR system
A11 01  1    @1 KHOSRAVI (Hossein)
A11 02  1    @1 KABIR (Ehsanollah)
A14 01      @1 Department of Electrical Engineering, Tarbiat Modarres University @2 Tehran @3 IRN @Z 1 aut. @Z 2 aut.
A20       @1 21-32
A21       @1 2009
A23 01      @0 ENG
A43 01      @1 INIST @2 26790 @5 354000170255460020
A44       @0 0000 @1 © 2010 INIST-CNRS. All rights reserved.
A45       @0 24 ref.
A47 01  1    @0 10-0180822
A60       @1 P
A61       @0 A
A64 01  1    @0 International journal on document analysis and recognition : (Print)
A66 01      @0 DEU
C01 01    ENG  @0 An integrated OCR system for Farsi text is proposed. The system uses information from several knowledge sources (KSs) and manages them in a blackboard approach. Some KSs like classifiers are acquired a priori through an offline training process while others like statistical features are extracted online while recognizing. An arbiter controls the interactions between the solution blackboard and KSs. The system has been tested on 20 real-life scanned documents with ten popular Farsi fonts and a recognition rate of 97.05% in word level and 99.03% in character level has been achieved.
C02 01  X    @0 001D02C03
C03 01  X  FRE  @0 Reconnaissance caractère @5 06
C03 01  X  ENG  @0 Character recognition @5 06
C03 01  X  SPA  @0 Reconocimiento carácter @5 06
C03 02  X  FRE  @0 Reconnaissance optique caractère @5 07
C03 02  X  ENG  @0 Optical character recognition @5 07
C03 02  X  SPA  @0 Reconocimiento óptico de caracteres @5 07
C03 03  X  FRE  @0 Texte @5 08
C03 03  X  ENG  @0 Text @5 08
C03 03  X  SPA  @0 Texto @5 08
C03 04  X  FRE  @0 Système information @5 09
C03 04  X  ENG  @0 Information system @5 09
C03 04  X  SPA  @0 Sistema información @5 09
C03 05  X  FRE  @0 Classification @5 10
C03 05  X  ENG  @0 Classification @5 10
C03 05  X  SPA  @0 Clasificación @5 10
C03 06  X  FRE  @0 Analyse statistique @5 23
C03 06  X  ENG  @0 Statistical analysis @5 23
C03 06  X  SPA  @0 Análisis estadístico @5 23
C03 07  X  FRE  @0 Approche probabiliste @5 24
C03 07  X  ENG  @0 Probabilistic approach @5 24
C03 07  X  SPA  @0 Enfoque probabilista @5 24
C03 08  X  FRE  @0 Extraction forme @5 25
C03 08  X  ENG  @0 Pattern extraction @5 25
C03 08  X  SPA  @0 Extracción forma @5 25
C03 09  X  FRE  @0 Segmentation @5 26
C03 09  X  ENG  @0 Segmentation @5 26
C03 09  X  SPA  @0 Segmentación @5 26
N21       @1 123
N44 01      @1 OTO
N82       @1 OTO
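
The flat listing above can be read mechanically. The following Python sketch assumes, from the layout of the listing alone and not from any INIST specification, that each line holds a field tag, optional indicators, and then subfields introduced by `@` plus a one-character code. The names `Field`, `parse_line` and `descriptors` are hypothetical.

```python
import re
from typing import Iterable, List, NamedTuple, Tuple


class Field(NamedTuple):
    tag: str                           # e.g. "C03"
    indicators: List[str]              # e.g. ["01", "X", "ENG"]
    subfields: List[Tuple[str, str]]   # e.g. [("0", "Text"), ("5", "08")]


# A subfield is "@<code> <value>", ending before the next " @<code> " or at end of line.
SUBFIELD = re.compile(r"@(\w) (.*?)(?= @\w |$)")


def parse_line(line: str) -> Field:
    """Split one non-blank line of the listing into tag, indicators and subfields."""
    head, sep, body = line.strip().partition("@")
    parts = head.split()
    return Field(parts[0], parts[1:], SUBFIELD.findall(sep + body))


def descriptors(lines: Iterable[str], lang: str) -> List[str]:
    """Collect the @0 values of C03 fields whose indicators include `lang`."""
    found = []
    for line in lines:
        if not line.strip():
            continue
        f = parse_line(line)
        if f.tag == "C03" and lang in f.indicators:
            found.extend(value for code, value in f.subfields if code == "0")
    return found


SAMPLE = [
    "A14 01      @1 Department of Electrical Engineering, Tarbiat Modarres University @2 Tehran @3 IRN @Z 1 aut. @Z 2 aut.",
    "C03 01  X  FRE  @0 Reconnaissance caractère @5 06",
    "C03 01  X  ENG  @0 Character recognition @5 06",
    "C03 02  X  ENG  @0 Optical character recognition @5 07",
]
```

With these sample lines, `descriptors(SAMPLE, "ENG")` returns the two English descriptors, and `parse_line(SAMPLE[0]).subfields` lists the five affiliation subfields in order.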

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">A blackboard approach towards integrated Farsi OCR system</title>
<author>
<name sortKey="Khosravi, Hossein" sort="Khosravi, Hossein" uniqKey="Khosravi H" first="Hossein" last="Khosravi">Hossein Khosravi</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Electrical Engineering, Tarbiat Modarres University</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
</affiliation>
</author>
<author>
<name sortKey="Kabir, Ehsanollah" sort="Kabir, Ehsanollah" uniqKey="Kabir E" first="Ehsanollah" last="Kabir">Ehsanollah Kabir</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Electrical Engineering, Tarbiat Modarres University</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">10-0180822</idno>
<date when="2009">2009</date>
<idno type="stanalyst">PASCAL 10-0180822 INIST</idno>
<idno type="RBID">Pascal:10-0180822</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000193</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000584</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">A blackboard approach towards integrated Farsi OCR system</title>
<author>
<name sortKey="Khosravi, Hossein" sort="Khosravi, Hossein" uniqKey="Khosravi H" first="Hossein" last="Khosravi">Hossein Khosravi</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Electrical Engineering, Tarbiat Modarres University</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
</affiliation>
</author>
<author>
<name sortKey="Kabir, Ehsanollah" sort="Kabir, Ehsanollah" uniqKey="Kabir E" first="Ehsanollah" last="Kabir">Ehsanollah Kabir</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Electrical Engineering, Tarbiat Modarres University</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint>
<date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Classification</term>
<term>Information system</term>
<term>Optical character recognition</term>
<term>Pattern extraction</term>
<term>Probabilistic approach</term>
<term>Segmentation</term>
<term>Statistical analysis</term>
<term>Text</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Texte</term>
<term>Système information</term>
<term>Classification</term>
<term>Analyse statistique</term>
<term>Approche probabiliste</term>
<term>Extraction forme</term>
<term>Segmentation</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Classification</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">An integrated OCR system for Farsi text is proposed. The system uses information from several knowledge sources (KSs) and manages them in a blackboard approach. Some KSs like classifiers are acquired a priori through an offline training process while others like statistical features are extracted online while recognizing. An arbiter controls the interactions between the solution blackboard and KSs. The system has been tested on 20 real-life scanned documents with ten popular Farsi fonts and a recognition rate of 97.05% in word level and 99.03% in character level has been achieved.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>1433-2833</s0>
</fA01>
<fA03 i2="1">
<s0>Int. j. doc. anal. recognit. : (Print)</s0>
</fA03>
<fA05>
<s2>12</s2>
</fA05>
<fA06>
<s2>1</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG">
<s1>A blackboard approach towards integrated Farsi OCR system</s1>
</fA08>
<fA11 i1="01" i2="1">
<s1>KHOSRAVI (Hossein)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>KABIR (Ehsanollah)</s1>
</fA11>
<fA14 i1="01">
<s1>Department of Electrical Engineering, Tarbiat Modarres University</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA14>
<fA20>
<s1>21-32</s1>
</fA20>
<fA21>
<s1>2009</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA43 i1="01">
<s1>INIST</s1>
<s2>26790</s2>
<s5>354000170255460020</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2010 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>24 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>10-0180822</s0>
</fA47>
<fA60>
<s1>P</s1>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>International journal on document analysis and recognition : (Print)</s0>
</fA64>
<fA66 i1="01">
<s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>An integrated OCR system for Farsi text is proposed. The system uses information from several knowledge sources (KSs) and manages them in a blackboard approach. Some KSs like classifiers are acquired a priori through an offline training process while others like statistical features are extracted online while recognizing. An arbiter controls the interactions between the solution blackboard and KSs. The system has been tested on 20 real-life scanned documents with ten popular Farsi fonts and a recognition rate of 97.05% in word level and 99.03% in character level has been achieved.</s0>
</fC01>
<fC02 i1="01" i2="X">
<s0>001D02C03</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE">
<s0>Reconnaissance caractère</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG">
<s0>Character recognition</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA">
<s0>Reconocimiento carácter</s0>
<s5>06</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE">
<s0>Reconnaissance optique caractère</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG">
<s0>Optical character recognition</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA">
<s0>Reconocimiento óptico de caracteres</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Texte</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Text</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Texto</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Système information</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Information system</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Sistema información</s0>
<s5>09</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE">
<s0>Classification</s0>
<s5>10</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG">
<s0>Classification</s0>
<s5>10</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA">
<s0>Clasificación</s0>
<s5>10</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE">
<s0>Analyse statistique</s0>
<s5>23</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG">
<s0>Statistical analysis</s0>
<s5>23</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA">
<s0>Análisis estadístico</s0>
<s5>23</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE">
<s0>Approche probabiliste</s0>
<s5>24</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG">
<s0>Probabilistic approach</s0>
<s5>24</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA">
<s0>Enfoque probabilista</s0>
<s5>24</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE">
<s0>Extraction forme</s0>
<s5>25</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG">
<s0>Pattern extraction</s0>
<s5>25</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA">
<s0>Extracción forma</s0>
<s5>25</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE">
<s0>Segmentation</s0>
<s5>26</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG">
<s0>Segmentation</s0>
<s5>26</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA">
<s0>Segmentación</s0>
<s5>26</s5>
</fC03>
<fN21>
<s1>123</s1>
</fN21>
<fN44 i1="01">
<s1>OTO</s1>
</fN44>
<fN82>
<s1>OTO</s1>
</fN82>
</pA>
</standard>
</inist>
</record>
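
As a rough illustration of how the `<inist>` part of the record can be queried, the Python sketch below groups `fC03` descriptors by their `l` (language) attribute using the standard `xml.etree.ElementTree` module. It works on a short excerpt copied from the record. The function name `descriptors_by_language` is hypothetical. Note that the full record as listed uses the prefixes `wicri:` and `inist:` without declaring them, so a namespace-aware parser such as ElementTree would be expected to reject it until those prefixes are declared or removed; the excerpt avoids them.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

# Excerpt of the <pA> element from the record above (three descriptors only).
EXCERPT = """
<pA>
<fC03 i1="01" i2="X" l="FRE"><s0>Reconnaissance caractère</s0><s5>06</s5></fC03>
<fC03 i1="01" i2="X" l="ENG"><s0>Character recognition</s0><s5>06</s5></fC03>
<fC03 i1="09" i2="X" l="ENG"><s0>Segmentation</s0><s5>26</s5></fC03>
</pA>
"""


def descriptors_by_language(xml_text: str) -> Dict[str, List[str]]:
    """Map each language code to the <s0> texts of its fC03 elements."""
    root = ET.fromstring(xml_text.strip())
    grouped = defaultdict(list)
    for node in root.iter("fC03"):
        grouped[node.get("l")].append(node.findtext("s0"))
    return dict(grouped)
```

Calling `descriptors_by_language(EXCERPT)` yields one list of descriptor strings per language code found in the excerpt.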

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000584 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Curation/biblio.hfd -nk 000584 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Curation
   |type=    RBID
   |clé=     Pascal:10-0180822
   |texte=   A blackboard approach towards integrated Farsi OCR system
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024