Complex documents images segmentation based on steerable pyramid features
Internal identifier: 000633 (PascalFrancis/Curation); previous: 000632; next: 000634
Authors: Mohamed Benjelil [France]; Slim Kanoun [Tunisia]; Rémy Mullot [France]; Adel M. Alimi [Tunisia]
Source:
- International journal on document analysis and recognition : (Print) [ 1433-2833 ] ; 2010.
French descriptors
- Pascal (Inist)
- Traitement document, Traitement image, Classification, Analyse documentaire, Reconnaissance caractère, Reconnaissance optique caractère, Système complexe, Texte, Caractère manuscrit, Représentation graphique, Image multiple, Banque image, Présentation document, Décomposition sous bande, Multilinguisme, Structure document, Document officiel, Extraction forme, Invariant, Analyse multirésolution, Segmentation image.
- Wicri :
- topic : Classification, Multilinguisme, Document officiel.
English descriptors
- KwdEn :
- Character recognition, Classification, Complex system, Document analysis, Document layout, Document processing, Document structure, Graphics, Image databank, Image processing, Image segmentation, Invariant, Manuscript character, Multilingualism, Multiple image, Multiresolution analysis, Official document, Optical character recognition, Pattern extraction, Subband decomposition, Text.
Abstract
Page segmentation and classification is very important in document layout analysis system before it is presented to an OCR system or for any other subsequent processing steps. In this paper, we propose an accurate and suitably designed system for complex documents segmentation. This system is based on steerable pyramid transform. The features extracted from pyramid sub-bands serve to locate and classify regions into text (either machine-printed or handwritten) and non-text (images, graphics, drawings or paintings) in some noise-infected, deformed, multilingual, multi-script document images. These documents contain tabular structures, logos, stamps, handwritten script blocks, photographs, etc. The encouraging and promising results obtained on 1,000 official complex document images data set are presented in this research paper. We compared our results with those from existing state-of-the-art methods. This comparison shows that the proposed method performs consistently well on large sets of complex document images.
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000140
Links to Exploration step
Pascal:11-0227897
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Complex documents images segmentation based on steerable pyramid features</title>
<author><name sortKey="Benjelil, Mohamed" sort="Benjelil, Mohamed" uniqKey="Benjelil M" first="Mohamed" last="Benjelil">Mohamed Benjelil</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>L3I, University of La Rochelle, Avenue Michel Crépeau</s1>
<s2>17042 La Rochelle</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Kanoun, Slim" sort="Kanoun, Slim" uniqKey="Kanoun S" first="Slim" last="Kanoun">Slim Kanoun</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>REGIM-ENIS, B.P 1173</s1>
<s2>3038 Sfax</s2>
<s3>TUN</s3>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>Tunisie</country>
</affiliation>
</author>
<author><name sortKey="Mullot, Remy" sort="Mullot, Remy" uniqKey="Mullot R" first="Rémy" last="Mullot">Rémy Mullot</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>L3I, University of La Rochelle, Avenue Michel Crépeau</s1>
<s2>17042 La Rochelle</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Alimi, Adel M" sort="Alimi, Adel M" uniqKey="Alimi A" first="Adel M." last="Alimi">Adel M. Alimi</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>REGIM-ENIS, B.P 1173</s1>
<s2>3038 Sfax</s2>
<s3>TUN</s3>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>Tunisie</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">11-0227897</idno>
<date when="2010">2010</date>
<idno type="stanalyst">PASCAL 11-0227897 INIST</idno>
<idno type="RBID">Pascal:11-0227897</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000140</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000633</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Complex documents images segmentation based on steerable pyramid features</title>
<author><name sortKey="Benjelil, Mohamed" sort="Benjelil, Mohamed" uniqKey="Benjelil M" first="Mohamed" last="Benjelil">Mohamed Benjelil</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>L3I, University of La Rochelle, Avenue Michel Crépeau</s1>
<s2>17042 La Rochelle</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Kanoun, Slim" sort="Kanoun, Slim" uniqKey="Kanoun S" first="Slim" last="Kanoun">Slim Kanoun</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>REGIM-ENIS, B.P 1173</s1>
<s2>3038 Sfax</s2>
<s3>TUN</s3>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>Tunisie</country>
</affiliation>
</author>
<author><name sortKey="Mullot, Remy" sort="Mullot, Remy" uniqKey="Mullot R" first="Rémy" last="Mullot">Rémy Mullot</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>L3I, University of La Rochelle, Avenue Michel Crépeau</s1>
<s2>17042 La Rochelle</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Alimi, Adel M" sort="Alimi, Adel M" uniqKey="Alimi A" first="Adel M." last="Alimi">Adel M. Alimi</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>REGIM-ENIS, B.P 1173</s1>
<s2>3038 Sfax</s2>
<s3>TUN</s3>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>Tunisie</country>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint><date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Character recognition</term>
<term>Classification</term>
<term>Complex system</term>
<term>Document analysis</term>
<term>Document layout</term>
<term>Document processing</term>
<term>Document structure</term>
<term>Graphics</term>
<term>Image databank</term>
<term>Image processing</term>
<term>Image segmentation</term>
<term>Invariant</term>
<term>Manuscript character</term>
<term>Multilingualism</term>
<term>Multiple image</term>
<term>Multiresolution analysis</term>
<term>Official document</term>
<term>Optical character recognition</term>
<term>Pattern extraction</term>
<term>Subband decomposition</term>
<term>Text</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Traitement document</term>
<term>Traitement image</term>
<term>Classification</term>
<term>Analyse documentaire</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Système complexe</term>
<term>Texte</term>
<term>Caractère manuscrit</term>
<term>Représentation graphique</term>
<term>Image multiple</term>
<term>Banque image</term>
<term>Présentation document</term>
<term>Décomposition sous bande</term>
<term>Multilinguisme</term>
<term>Structure document</term>
<term>Document officiel</term>
<term>Extraction forme</term>
<term>Invariant</term>
<term>Analyse multirésolution</term>
<term>Segmentation image</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Classification</term>
<term>Multilinguisme</term>
<term>Document officiel</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Page segmentation and classification is very important in document layout analysis system before it is presented to an OCR system or for any other subsequent processing steps. In this paper, we propose an accurate and suitably designed system for complex documents segmentation. This system is based on steerable pyramid transform. The features extracted from pyramid sub-bands serve to locate and classify regions into text (either machine-printed or handwritten) and non-text (images, graphics, drawings or paintings) in some noise-infected, deformed, multilingual, multi-script document images. These documents contain tabular structures, logos, stamps, handwritten script blocks, photographs, etc. The encouraging and promising results obtained on 1,000 official complex document images data set are presented in this research paper. We compared our results with those from existing state-of-the-art methods. This comparison shows that the proposed method performs consistently well on large sets of complex document images.</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>1433-2833</s0>
</fA01>
<fA03 i2="1"><s0>Int. j. doc. anal. recognit. : (Print)</s0>
</fA03>
<fA05><s2>13</s2>
</fA05>
<fA06><s2>3</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG"><s1>Complex documents images segmentation based on steerable pyramid features</s1>
</fA08>
<fA11 i1="01" i2="1"><s1>BENJELIL (Mohamed)</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>KANOUN (Slim)</s1>
</fA11>
<fA11 i1="03" i2="1"><s1>MULLOT (Rémy)</s1>
</fA11>
<fA11 i1="04" i2="1"><s1>ALIMI (Adel M.)</s1>
</fA11>
<fA14 i1="01"><s1>REGIM-ENIS, B.P 1173</s1>
<s2>3038 Sfax</s2>
<s3>TUN</s3>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</fA14>
<fA14 i1="02"><s1>L3I, University of La Rochelle, Avenue Michel Crépeau</s1>
<s2>17042 La Rochelle</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</fA14>
<fA20><s1>209-228</s1>
</fA20>
<fA21><s1>2010</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA43 i1="01"><s1>INIST</s1>
<s2>26790</s2>
<s5>354000194295480030</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 2011 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>58 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>11-0227897</s0>
</fA47>
<fA60><s1>P</s1>
</fA60>
<fA61><s0>A</s0>
</fA61>
<fA64 i1="01" i2="1"><s0>International journal on document analysis and recognition : (Print)</s0>
</fA64>
<fA66 i1="01"><s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG"><s0>Page segmentation and classification is very important in document layout analysis system before it is presented to an OCR system or for any other subsequent processing steps. In this paper, we propose an accurate and suitably designed system for complex documents segmentation. This system is based on steerable pyramid transform. The features extracted from pyramid sub-bands serve to locate and classify regions into text (either machine-printed or handwritten) and non-text (images, graphics, drawings or paintings) in some noise-infected, deformed, multilingual, multi-script document images. These documents contain tabular structures, logos, stamps, handwritten script blocks, photographs, etc. The encouraging and promising results obtained on 1,000 official complex document images data set are presented in this research paper. We compared our results with those from existing state-of-the-art methods. This comparison shows that the proposed method performs consistently well on large sets of complex document images.</s0>
</fC01>
<fC02 i1="01" i2="X"><s0>001D02C03</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE"><s0>Traitement document</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG"><s0>Document processing</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA"><s0>Tratamiento documento</s0>
<s5>06</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE"><s0>Traitement image</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG"><s0>Image processing</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA"><s0>Procesamiento imagen</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE"><s0>Classification</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG"><s0>Classification</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA"><s0>Clasificación</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE"><s0>Analyse documentaire</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG"><s0>Document analysis</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA"><s0>Análisis documental</s0>
<s5>09</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE"><s0>Reconnaissance caractère</s0>
<s5>10</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG"><s0>Character recognition</s0>
<s5>10</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA"><s0>Reconocimiento carácter</s0>
<s5>10</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE"><s0>Reconnaissance optique caractère</s0>
<s5>11</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG"><s0>Optical character recognition</s0>
<s5>11</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA"><s0>Reconocimento óptico de caracteres</s0>
<s5>11</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE"><s0>Système complexe</s0>
<s5>12</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG"><s0>Complex system</s0>
<s5>12</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA"><s0>Sistema complejo</s0>
<s5>12</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE"><s0>Texte</s0>
<s5>13</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG"><s0>Text</s0>
<s5>13</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA"><s0>Texto</s0>
<s5>13</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE"><s0>Caractère manuscrit</s0>
<s5>14</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG"><s0>Manuscript character</s0>
<s5>14</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA"><s0>Carácter manuscrito</s0>
<s5>14</s5>
</fC03>
<fC03 i1="10" i2="X" l="FRE"><s0>Représentation graphique</s0>
<s5>15</s5>
</fC03>
<fC03 i1="10" i2="X" l="ENG"><s0>Graphics</s0>
<s5>15</s5>
</fC03>
<fC03 i1="10" i2="X" l="SPA"><s0>Grafo (curva)</s0>
<s5>15</s5>
</fC03>
<fC03 i1="11" i2="X" l="FRE"><s0>Image multiple</s0>
<s5>16</s5>
</fC03>
<fC03 i1="11" i2="X" l="ENG"><s0>Multiple image</s0>
<s5>16</s5>
</fC03>
<fC03 i1="11" i2="X" l="SPA"><s0>Imagen múltiple</s0>
<s5>16</s5>
</fC03>
<fC03 i1="12" i2="X" l="FRE"><s0>Banque image</s0>
<s5>17</s5>
</fC03>
<fC03 i1="12" i2="X" l="ENG"><s0>Image databank</s0>
<s5>17</s5>
</fC03>
<fC03 i1="12" i2="X" l="SPA"><s0>Banco imagen</s0>
<s5>17</s5>
</fC03>
<fC03 i1="13" i2="X" l="FRE"><s0>Présentation document</s0>
<s5>18</s5>
</fC03>
<fC03 i1="13" i2="X" l="ENG"><s0>Document layout</s0>
<s5>18</s5>
</fC03>
<fC03 i1="13" i2="X" l="SPA"><s0>Presentación documento</s0>
<s5>18</s5>
</fC03>
<fC03 i1="14" i2="X" l="FRE"><s0>Décomposition sous bande</s0>
<s5>19</s5>
</fC03>
<fC03 i1="14" i2="X" l="ENG"><s0>Subband decomposition</s0>
<s5>19</s5>
</fC03>
<fC03 i1="14" i2="X" l="SPA"><s0>Descomposición subbanda</s0>
<s5>19</s5>
</fC03>
<fC03 i1="15" i2="X" l="FRE"><s0>Multilinguisme</s0>
<s5>20</s5>
</fC03>
<fC03 i1="15" i2="X" l="ENG"><s0>Multilingualism</s0>
<s5>20</s5>
</fC03>
<fC03 i1="15" i2="X" l="SPA"><s0>Multilingüismo</s0>
<s5>20</s5>
</fC03>
<fC03 i1="16" i2="X" l="FRE"><s0>Structure document</s0>
<s5>21</s5>
</fC03>
<fC03 i1="16" i2="X" l="ENG"><s0>Document structure</s0>
<s5>21</s5>
</fC03>
<fC03 i1="16" i2="X" l="SPA"><s0>Estructura documental</s0>
<s5>21</s5>
</fC03>
<fC03 i1="17" i2="X" l="FRE"><s0>Document officiel</s0>
<s5>22</s5>
</fC03>
<fC03 i1="17" i2="X" l="ENG"><s0>Official document</s0>
<s5>22</s5>
</fC03>
<fC03 i1="17" i2="X" l="SPA"><s0>Documento oficial</s0>
<s5>22</s5>
</fC03>
<fC03 i1="18" i2="X" l="FRE"><s0>Extraction forme</s0>
<s5>23</s5>
</fC03>
<fC03 i1="18" i2="X" l="ENG"><s0>Pattern extraction</s0>
<s5>23</s5>
</fC03>
<fC03 i1="18" i2="X" l="SPA"><s0>Extracción forma</s0>
<s5>23</s5>
</fC03>
<fC03 i1="19" i2="X" l="FRE"><s0>Invariant</s0>
<s5>24</s5>
</fC03>
<fC03 i1="19" i2="X" l="ENG"><s0>Invariant</s0>
<s5>24</s5>
</fC03>
<fC03 i1="19" i2="X" l="SPA"><s0>Invariante</s0>
<s5>24</s5>
</fC03>
<fC03 i1="20" i2="X" l="FRE"><s0>Analyse multirésolution</s0>
<s5>41</s5>
</fC03>
<fC03 i1="20" i2="X" l="ENG"><s0>Multiresolution analysis</s0>
<s5>41</s5>
</fC03>
<fC03 i1="20" i2="X" l="SPA"><s0>Análisis multiresolución</s0>
<s5>41</s5>
</fC03>
<fC03 i1="21" i2="X" l="FRE"><s0>Segmentation image</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="21" i2="X" l="ENG"><s0>Image segmentation</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="21" i2="X" l="SPA"><s0>Segmentación de imágenes</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fN21><s1>150</s1>
</fN21>
<fN44 i1="01"><s1>OTO</s1>
</fN44>
<fN82><s1>OTO</s1>
</fN82>
</pA>
</standard>
</inist>
</record>
To work with this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000633 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Curation/biblio.hfd -nk 000633 | SxmlIndent | more
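Once a record has been extracted, its `<inist>` block can be read with a standard XML parser. The sketch below is a minimal illustration in Python, not part of Dilib: the function name `summarize_inist` and the trimmed excerpt are hypothetical, and the meaning given to each field (`fA01` as ISSN, `fA05` as volume, `fA06` as issue, `fA08` as title, `fA11` as author, `fA20` as pages, `fA21` as year) is an assumption inferred from the values in this record. Only the `<inist>` block is parsed because the TEI part uses the `wicri:` and `inist:` prefixes without declaring them in the record as shown, which a namespace-aware parser would likely reject.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical copy of the <inist> block from the record above.
# Only a few fields are kept so the example stays short.
INIST_EXCERPT = """
<inist><standard h6="B"><pA>
<fA01 i1="01" i2="1"><s0>1433-2833</s0></fA01>
<fA05><s2>13</s2></fA05>
<fA06><s2>3</s2></fA06>
<fA08 i1="01" i2="1" l="ENG"><s1>Complex documents images segmentation based on steerable pyramid features</s1></fA08>
<fA11 i1="01" i2="1"><s1>BENJELIL (Mohamed)</s1></fA11>
<fA11 i1="02" i2="1"><s1>KANOUN (Slim)</s1></fA11>
<fA11 i1="03" i2="1"><s1>MULLOT (Rémy)</s1></fA11>
<fA11 i1="04" i2="1"><s1>ALIMI (Adel M.)</s1></fA11>
<fA20><s1>209-228</s1></fA20>
<fA21><s1>2010</s1></fA21>
<fC03 i1="01" i2="X" l="FRE"><s0>Traitement document</s0><s5>06</s5></fC03>
<fC03 i1="01" i2="X" l="ENG"><s0>Document processing</s0><s5>06</s5></fC03>
</pA></standard></inist>
"""


def summarize_inist(xml_text):
    """Return a small dict of bibliographic fields from an <inist> block.

    The field-to-meaning mapping is an assumption based on this record's values.
    """
    root = ET.fromstring(xml_text.strip())
    pa = root.find("./standard/pA")
    if pa is None:
        raise ValueError("no <standard>/<pA> element found")

    return {
        "issn": pa.findtext("fA01/s0"),
        "volume": pa.findtext("fA05/s2"),
        "issue": pa.findtext("fA06/s2"),
        "title": pa.findtext("fA08/s1"),
        # One <fA11> element per author, in document order.
        "authors": [a.findtext("s1") for a in pa.findall("fA11")],
        "pages": pa.findtext("fA20/s1"),
        "year": pa.findtext("fA21/s1"),
        # Keep only descriptors whose language attribute is English.
        "keywords_en": [
            k.findtext("s0") for k in pa.findall("fC03") if k.get("l") == "ENG"
        ],
    }


if __name__ == "__main__":
    for key, value in summarize_inist(INIST_EXCERPT).items():
        print(f"{key}: {value}")
```

In practice the input would be the `<inist>` element cut from the output of one of the `HfdSelect` commands above rather than an inline string.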
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= PascalFrancis |étape= Curation |type= RBID |clé= Pascal:11-0227897 |texte= Complex documents images segmentation based on steerable pyramid features }}
This area was generated with Dilib version V0.6.32.