OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Classification of binary document images into textual or nontextual data blocks using neural network models

Internal identifier: 000A27 (Istex/Corpus); previous: 000A26; next: 000A28

Authors: Daniel X. Le; George R. Thoma; Harry Wechsler

Source:

RBID : ISTEX:492D826F5B6760747F834966C56C9CDBA401E9E1

Abstract

This paper describes a new method for the classification of binary document images as textual or nontextual data blocks using neural network models. Binary document images are first segmented into blocks by the constrained run-length algorithm (CRLA). The component-labeling procedure is used to label the resulting blocks. The features for each block, calculated from the coordinates of its extremities, are then fed into the input layer of a neural network for classification. Four neural networks were considered, and they include back propagation (BP), radial basis functions (RBF), probabilistic neural network (PNN), and Kohonen's self-organizing feature maps (SOFMs). The performance and behavior of these neural network models are analyzed and compared in terms of training times, memory requirements, and classification accuracy. The experiments carried out on a variety of medical journals show the feasibility of using the neural network approach for textual block classification and indicate that in terms of both accuracy and training time RBF should be preferred.
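
The segmentation step named in the abstract can be illustrated with a minimal run-length smoothing sketch. This is not the authors' implementation: the record gives no parameters, so the function names (`smooth_runs`, `smooth_image`) and the gap thresholds (`max_gap`, `h_gap`, `v_gap`) are hypothetical, and the row/column AND combination follows the general run-length smoothing idea rather than anything stated here. Pixels are assumed to be 1 for black and 0 for white.

```python
def smooth_runs(row, max_gap):
    """Fill runs of 0s no longer than max_gap that lie between two 1s."""
    out = list(row)
    last_black = None  # index of the most recent black pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            # Fill the white gap only if it is non-empty and short enough.
            if last_black is not None and 0 < i - last_black - 1 <= max_gap:
                for j in range(last_black + 1, i):
                    out[j] = 1
            last_black = i
    return out


def smooth_image(image, h_gap, v_gap):
    """Smooth rows and columns separately, then AND the two results."""
    horizontal = [smooth_runs(r, h_gap) for r in image]
    # Transpose, smooth each column as if it were a row, transpose back.
    columns = [smooth_runs(list(c), v_gap) for c in zip(*image)]
    vertical = [list(r) for r in zip(*columns)]
    return [[h & v for h, v in zip(hr, vr)]
            for hr, vr in zip(horizontal, vertical)]


if __name__ == "__main__":
    # The short gap is filled; the long gap is left untouched.
    print(smooth_runs([1, 0, 0, 1, 0, 0, 0, 0, 1], max_gap=2))
```

Connected regions of the smoothed image would then be labeled as blocks, and features derived from each block's bounding coordinates would be passed to a classifier; those later steps are not sketched here because the record does not specify them.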

Url:
DOI: 10.1007/BF01211490

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Classification of binary document images into textual or nontextual data blocks using neural network models</title>
<author>
<name sortKey="Le, X" sort="Le, X" uniqKey="Le X" first="X." last="Le">X. Le</name>
<affiliation>
<mods:affiliation>Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike Blvd., Bldg. 38A, MS 55, 20894, Bethesda, Md., USA</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Thoma, R" sort="Thoma, R" uniqKey="Thoma R" first="R." last="Thoma">R. Thoma</name>
<affiliation>
<mods:affiliation>Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike Blvd., Bldg. 38A, MS 55, 20894, Bethesda, Md., USA</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Wechsler, Harry" sort="Wechsler, Harry" uniqKey="Wechsler H" first="Harry" last="Wechsler">Harry Wechsler</name>
<affiliation>
<mods:affiliation>Department of Computer Science, George Mason University, 22030-4444, Fairfax, Va., USA</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:492D826F5B6760747F834966C56C9CDBA401E9E1</idno>
<date when="1995" year="1995">1995</date>
<idno type="doi">10.1007/BF01211490</idno>
<idno type="url">https://api.istex.fr/document/492D826F5B6760747F834966C56C9CDBA401E9E1/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000A27</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Classification of binary document images into textual or nontextual data blocks using neural network models</title>
<author>
<name sortKey="Le, X" sort="Le, X" uniqKey="Le X" first="X." last="Le">X. Le</name>
<affiliation>
<mods:affiliation>Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike Blvd., Bldg. 38A, MS 55, 20894, Bethesda, Md., USA</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Thoma, R" sort="Thoma, R" uniqKey="Thoma R" first="R." last="Thoma">R. Thoma</name>
<affiliation>
<mods:affiliation>Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike Blvd., Bldg. 38A, MS 55, 20894, Bethesda, Md., USA</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Wechsler, Harry" sort="Wechsler, Harry" uniqKey="Wechsler H" first="Harry" last="Wechsler">Harry Wechsler</name>
<affiliation>
<mods:affiliation>Department of Computer Science, George Mason University, 22030-4444, Fairfax, Va., USA</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Machine Vision and Applications</title>
<title level="j" type="sub">An International Journal</title>
<title level="j" type="abbrev">Machine Vis. Apps.</title>
<idno type="ISSN">0932-8092</idno>
<idno type="eISSN">1432-1769</idno>
<imprint>
<publisher>Springer-Verlag</publisher>
<pubPlace>Berlin/Heidelberg</pubPlace>
<date type="published" when="1995-09-01">1995-09-01</date>
<biblScope unit="volume">8</biblScope>
<biblScope unit="issue">5</biblScope>
<biblScope unit="page" from="289">289</biblScope>
<biblScope unit="page" to="304">304</biblScope>
</imprint>
<idno type="ISSN">0932-8092</idno>
</series>
<idno type="istex">492D826F5B6760747F834966C56C9CDBA401E9E1</idno>
<idno type="DOI">10.1007/BF01211490</idno>
<idno type="ArticleID">BF01211490</idno>
<idno type="ArticleID">Art4</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0932-8092</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: This paper describes a new method for the classification of binary document images as textual or nontextual data blocks using neural network models. Binary document images are first segmented into blocks by the constrained run-length algorithm (CRLA). The component-labeling procedure is used to label the resulting blocks. The features for each block, calculated from the coordinates of its extremities, are then fed into the input layer of a neural network for classification. Four neural networks were considered, and they include back propagation (BP), radial basis functions (RBF), probabilistic neural network (PNN), and Kohonen's self-organizing feature maps (SOFMs). The performance and behavior of these neural network models are analyzed and compared in terms of training times, memory requirements, and classification accuracy. The experiments carried out on a variety of medical journals show the feasibility of using the neural network approach for textual block classification and indicate that in terms of both accuracy and training time RBF should be preferred.</div>
</front>
</TEI>
<istex>
<corpusName>springer</corpusName>
<author>
<json:item>
<name>Daniel X. Le</name>
<affiliations>
<json:string>Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike Blvd., Bldg. 38A, MS 55, 20894, Bethesda, Md., USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>George R. Thoma</name>
<affiliations>
<json:string>Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike Blvd., Bldg. 38A, MS 55, 20894, Bethesda, Md., USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Harry Wechsler</name>
<affiliations>
<json:string>Department of Computer Science, George Mason University, 22030-4444, Fairfax, Va., USA</json:string>
</affiliations>
</json:item>
</author>
<articleId>
<json:string>BF01211490</json:string>
<json:string>Art4</json:string>
</articleId>
<language>
<json:string>eng</json:string>
</language>
<abstract>Abstract: This paper describes a new method for the classification of binary document images as textual or nontextual data blocks using neural network models. Binary document images are first segmented into blocks by the constrained run-length algorithm (CRLA). The component-labeling procedure is used to label the resulting blocks. The features for each block, calculated from the coordinates of its extremities, are then fed into the input layer of a neural network for classification. Four neural networks were considered, and they include back propagation (BP), radial basis functions (RBF), probabilistic neural network (PNN), and Kohonen's self-organizing feature maps (SOFMs). The performance and behavior of these neural network models are analyzed and compared in terms of training times, memory requirements, and classification accuracy. The experiments carried out on a variety of medical journals show the feasibility of using the neural network approach for textual block classification and indicate that in terms of both accuracy and training time RBF should be preferred.</abstract>
<qualityIndicators>
<score>6.908</score>
<pdfVersion>1.3</pdfVersion>
<pdfPageSize>594 x 774 pts</pdfPageSize>
<refBibsNative>false</refBibsNative>
<keywordCount>0</keywordCount>
<abstractCharCount>1087</abstractCharCount>
<pdfWordCount>9025</pdfWordCount>
<pdfCharCount>46390</pdfCharCount>
<pdfPageCount>16</pdfPageCount>
<abstractWordCount>159</abstractWordCount>
</qualityIndicators>
<title>Classification of binary document images into textual or nontextual data blocks using neural network models</title>
<genre.original>
<json:string>OriginalPaper</json:string>
</genre.original>
<genre>
<json:string>research-article</json:string>
</genre>
<host>
<issue>5</issue>
<subject>
<json:item>
<value>Image Processing</value>
</json:item>
<json:item>
<value>Communications Engineering, Networks</value>
</json:item>
</subject>
<journalId>
<json:string>138</json:string>
</journalId>
<language>
<json:string>unknown</json:string>
</language>
<eissn>
<json:string>1432-1769</json:string>
</eissn>
<title>Machine Vision and Applications</title>
<genre.original>
<json:string>Archive Journal</json:string>
</genre.original>
<volume>8</volume>
<pages>
<last>304</last>
<first>289</first>
</pages>
<issn>
<json:string>0932-8092</json:string>
</issn>
<genre>
<json:string>Journal</json:string>
</genre>
<publicationDate>1995</publicationDate>
<copyrightDate>1995</copyrightDate>
</host>
<publicationDate>1995</publicationDate>
<copyrightDate>1995</copyrightDate>
<doi>
<json:string>10.1007/BF01211490</json:string>
</doi>
<id>492D826F5B6760747F834966C56C9CDBA401E9E1</id>
<fulltext>
<json:item>
<original>true</original>
<mimetype>application/pdf</mimetype>
<extension>pdf</extension>
<uri>https://api.istex.fr/document/492D826F5B6760747F834966C56C9CDBA401E9E1/fulltext/pdf</uri>
</json:item>
<json:item>
<original>false</original>
<mimetype>application/zip</mimetype>
<extension>zip</extension>
<uri>https://api.istex.fr/document/492D826F5B6760747F834966C56C9CDBA401E9E1/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/492D826F5B6760747F834966C56C9CDBA401E9E1/fulltext/tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a" type="main" xml:lang="en">Classification of binary document images into textual or nontextual data blocks using neural network models</title>
<respStmt xml:id="ISTEX-API" resp="Références bibliographiques récupérées via GROBID" name="ISTEX-API (INIST-CNRS)"></respStmt>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher>Springer-Verlag</publisher>
<pubPlace>Berlin/Heidelberg</pubPlace>
<availability>
<p>SPRINGER</p>
</availability>
<date>1995</date>
</publicationStmt>
<sourceDesc>
<biblStruct type="inbook">
<analytic>
<title level="a" type="main" xml:lang="en">Classification of binary document images into textual or nontextual data blocks using neural network models</title>
<author corresp="yes">
<persName>
<forename type="first">Daniel</forename>
<surname>Le</surname>
</persName>
<affiliation>Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike Blvd., Bldg. 38A, MS 55, 20894, Bethesda, Md., USA</affiliation>
</author>
<author>
<persName>
<forename type="first">George</forename>
<surname>Thoma</surname>
</persName>
<affiliation>Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike Blvd., Bldg. 38A, MS 55, 20894, Bethesda, Md., USA</affiliation>
</author>
<author>
<persName>
<forename type="first">Harry</forename>
<surname>Wechsler</surname>
</persName>
<affiliation>Department of Computer Science, George Mason University, 22030-4444, Fairfax, Va., USA</affiliation>
</author>
</analytic>
<monogr>
<title level="j">Machine Vision and Applications</title>
<title level="j" type="sub">An International Journal</title>
<title level="j" type="abbrev">Machine Vis. Apps.</title>
<idno type="JournalID">138</idno>
<idno type="pISSN">0932-8092</idno>
<idno type="eISSN">1432-1769</idno>
<idno type="IssueArticleCount">11</idno>
<idno type="VolumeIssueCount">6</idno>
<imprint>
<publisher>Springer-Verlag</publisher>
<pubPlace>Berlin/Heidelberg</pubPlace>
<date type="published" when="1995-09-01"></date>
<biblScope unit="volume">8</biblScope>
<biblScope unit="issue">5</biblScope>
<biblScope unit="page" from="289">289</biblScope>
<biblScope unit="page" to="304">304</biblScope>
</imprint>
</monogr>
<idno type="istex">492D826F5B6760747F834966C56C9CDBA401E9E1</idno>
<idno type="DOI">10.1007/BF01211490</idno>
<idno type="ArticleID">BF01211490</idno>
<idno type="ArticleID">Art4</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<creation>
<date>1995</date>
</creation>
<langUsage>
<language ident="en">en</language>
</langUsage>
<abstract xml:lang="en">
<p>Abstract: This paper describes a new method for the classification of binary document images as textual or nontextual data blocks using neural network models. Binary document images are first segmented into blocks by the constrained run-length algorithm (CRLA). The component-labeling procedure is used to label the resulting blocks. The features for each block, calculated from the coordinates of its extremities, are then fed into the input layer of a neural network for classification. Four neural networks were considered, and they include back propagation (BP), radial basis functions (RBF), probabilistic neural network (PNN), and Kohonen's self-organizing feature maps (SOFMs). The performance and behavior of these neural network models are analyzed and compared in terms of training times, memory requirements, and classification accuracy. The experiments carried out on a variety of medical journals show the feasibility of using the neural network approach for textual block classification and indicate that in terms of both accuracy and training time RBF should be preferred.</p>
</abstract>
<textClass>
<keywords scheme="Journal Subject">
<list>
<head>Computer Science</head>
<item>
<term>Image Processing</term>
</item>
<item>
<term>Communications Engineering, Networks</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc>
<change when="1995-09-01">Published</change>
<change xml:id="refBibs-istex" who="#ISTEX-API" when="2016-3-19">References added</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<original>false</original>
<mimetype>text/plain</mimetype>
<extension>txt</extension>
<uri>https://api.istex.fr/document/492D826F5B6760747F834966C56C9CDBA401E9E1/fulltext/txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="Springer, Publisher found" wicri:toSee="no header">
<istex:xmlDeclaration>version="1.0" encoding="UTF-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//Springer-Verlag//DTD A++ V2.4//EN" URI="http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd" name="istex:docType"></istex:docType>
<istex:document>
<Publisher>
<PublisherInfo>
<PublisherName>Springer-Verlag</PublisherName>
<PublisherLocation>Berlin/Heidelberg</PublisherLocation>
</PublisherInfo>
<Journal>
<JournalInfo JournalProductType="ArchiveJournal" NumberingStyle="Unnumbered">
<JournalID>138</JournalID>
<JournalPrintISSN>0932-8092</JournalPrintISSN>
<JournalElectronicISSN>1432-1769</JournalElectronicISSN>
<JournalTitle>Machine Vision and Applications</JournalTitle>
<JournalSubTitle>An International Journal</JournalSubTitle>
<JournalAbbreviatedTitle>Machine Vis. Apps.</JournalAbbreviatedTitle>
<JournalSubjectGroup>
<JournalSubject Type="Primary">Computer Science</JournalSubject>
<JournalSubject Type="Secondary">Image Processing</JournalSubject>
<JournalSubject Type="Secondary">Communications Engineering, Networks</JournalSubject>
</JournalSubjectGroup>
</JournalInfo>
<Volume>
<VolumeInfo VolumeType="Regular" TocLevels="0">
<VolumeIDStart>8</VolumeIDStart>
<VolumeIDEnd>8</VolumeIDEnd>
<VolumeIssueCount>6</VolumeIssueCount>
</VolumeInfo>
<Issue IssueType="Regular">
<IssueInfo TocLevels="0">
<IssueIDStart>5</IssueIDStart>
<IssueIDEnd>5</IssueIDEnd>
<IssueArticleCount>11</IssueArticleCount>
<IssueHistory>
<CoverDate>
<DateString>1995</DateString>
<Year>1995</Year>
<Month>9</Month>
</CoverDate>
</IssueHistory>
<IssueCopyright>
<CopyrightHolderName>Springer-Verlag</CopyrightHolderName>
<CopyrightYear>1995</CopyrightYear>
</IssueCopyright>
</IssueInfo>
<Article ID="Art4">
<ArticleInfo Language="En" ArticleType="OriginalPaper" NumberingStyle="Unnumbered" TocLevels="0" ContainsESM="No">
<ArticleID>BF01211490</ArticleID>
<ArticleDOI>10.1007/BF01211490</ArticleDOI>
<ArticleSequenceNumber>4</ArticleSequenceNumber>
<ArticleTitle Language="En">Classification of binary document images into textual or nontextual data blocks using neural network models</ArticleTitle>
<ArticleFirstPage>289</ArticleFirstPage>
<ArticleLastPage>304</ArticleLastPage>
<ArticleHistory>
<RegistrationDate>
<Year>2005</Year>
<Month>2</Month>
<Day>7</Day>
</RegistrationDate>
</ArticleHistory>
<ArticleCopyright>
<CopyrightHolderName>Springer-Verlag</CopyrightHolderName>
<CopyrightYear>1995</CopyrightYear>
</ArticleCopyright>
<ArticleGrants Type="Regular">
<MetadataGrant Grant="OpenAccess"></MetadataGrant>
<AbstractGrant Grant="OpenAccess"></AbstractGrant>
<BodyPDFGrant Grant="Restricted"></BodyPDFGrant>
<BodyHTMLGrant Grant="Restricted"></BodyHTMLGrant>
<BibliographyGrant Grant="Restricted"></BibliographyGrant>
<ESMGrant Grant="Restricted"></ESMGrant>
</ArticleGrants>
<ArticleContext>
<JournalID>138</JournalID>
<VolumeIDStart>8</VolumeIDStart>
<VolumeIDEnd>8</VolumeIDEnd>
<IssueIDStart>5</IssueIDStart>
<IssueIDEnd>5</IssueIDEnd>
</ArticleContext>
</ArticleInfo>
<ArticleHeader>
<AuthorGroup>
<Author AffiliationIDS="Aff1" CorrespondingAffiliationID="Aff1">
<AuthorName DisplayOrder="Western">
<GivenName>Daniel</GivenName>
<GivenName>X.</GivenName>
<FamilyName>Le</FamilyName>
</AuthorName>
</Author>
<Author AffiliationIDS="Aff1">
<AuthorName DisplayOrder="Western">
<GivenName>George</GivenName>
<GivenName>R.</GivenName>
<FamilyName>Thoma</FamilyName>
</AuthorName>
</Author>
<Author AffiliationIDS="Aff2">
<AuthorName DisplayOrder="Western">
<GivenName>Harry</GivenName>
<FamilyName>Wechsler</FamilyName>
</AuthorName>
</Author>
<Affiliation ID="Aff1">
<OrgDivision>Lister Hill National Center for Biomedical Communications</OrgDivision>
<OrgName>National Library of Medicine</OrgName>
<OrgAddress>
<Street>8600 Rockville Pike Blvd., Bldg. 38A, MS 55</Street>
<Postcode>20894</Postcode>
<City>Bethesda</City>
<State>Md.</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff2">
<OrgDivision>Department of Computer Science</OrgDivision>
<OrgName>George Mason University</OrgName>
<OrgAddress>
<Postcode>22030-4444</Postcode>
<City>Fairfax</City>
<State>Va.</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
</AuthorGroup>
<Abstract ID="Abs1" Language="En">
<Heading>Abstract</Heading>
<Para>This paper describes a new method for the classification of binary document images as textual or nontextual data blocks using neural network models. Binary document images are first segmented into blocks by the constrained run-length algorithm (CRLA). The component-labeling procedure is used to label the resulting blocks. The features for each block, calculated from the coordinates of its extremities, are then fed into the input layer of a neural network for classification. Four neural networks were considered, and they include back propagation (BP), radial basis functions (RBF), probabilistic neural network (PNN), and Kohonen's self-organizing feature maps (SOFMs). The performance and behavior of these neural network models are analyzed and compared in terms of training times, memory requirements, and classification accuracy. The experiments carried out on a variety of medical journals show the feasibility of using the neural network approach for textual block classification and indicate that in terms of both accuracy and training time RBF should be preferred.</Para>
</Abstract>
<KeywordGroup Language="En">
<Heading>Key words</Heading>
<Keyword>Back propagation</Keyword>
<Keyword>Document processing</Keyword>
<Keyword>Probabilistic networks</Keyword>
<Keyword>Radial basis function</Keyword>
<Keyword>Self-organizing feature maps</Keyword>
<Keyword>Textual classification</Keyword>
</KeywordGroup>
</ArticleHeader>
<NoBody></NoBody>
</Article>
</Issue>
</Volume>
</Journal>
</Publisher>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo lang="en">
<title>Classification of binary document images into textual or nontextual data blocks using neural network models</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA" lang="en">
<title>Classification of binary document images into textual or nontextual data blocks using neural network models</title>
</titleInfo>
<name type="personal" displayLabel="corresp">
<namePart type="given">Daniel</namePart>
<namePart type="given">X.</namePart>
<namePart type="family">Le</namePart>
<affiliation>Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike Blvd., Bldg. 38A, MS 55, 20894, Bethesda, Md., USA</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">George</namePart>
<namePart type="given">R.</namePart>
<namePart type="family">Thoma</namePart>
<affiliation>Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike Blvd., Bldg. 38A, MS 55, 20894, Bethesda, Md., USA</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Harry</namePart>
<namePart type="family">Wechsler</namePart>
<affiliation>Department of Computer Science, George Mason University, 22030-4444, Fairfax, Va., USA</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="research-article" displayLabel="OriginalPaper"></genre>
<originInfo>
<publisher>Springer-Verlag</publisher>
<place>
<placeTerm type="text">Berlin/Heidelberg</placeTerm>
</place>
<dateIssued encoding="w3cdtf">1995-09-01</dateIssued>
<copyrightDate encoding="w3cdtf">1995</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
</language>
<physicalDescription>
<internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract lang="en">Abstract: This paper describes a new method for the classification of binary document images as textual or nontextual data blocks using neural network models. Binary document images are first segmented into blocks by the constrained run-length algorithm (CRLA). The component-labeling procedure is used to label the resulting blocks. The features for each block, calculated from the coordinates of its extremities, are then fed into the input layer of a neural network for classification. Four neural networks were considered, and they include back propagation (BP), radial basis functions (RBF), probabilistic neural network (PNN), and Kohonen's self-organizing feature maps (SOFMs). The performance and behavior of these neural network models are analyzed and compared in terms of training times, memory requirements, and classification accuracy. The experiments carried out on a variety of medical journals show the feasibility of using the neural network approach for textual block classification and indicate that in terms of both accuracy and training time RBF should be preferred.</abstract>
<relatedItem type="host">
<titleInfo>
<title>Machine Vision and Applications</title>
<subTitle>An International Journal</subTitle>
</titleInfo>
<titleInfo type="abbreviated">
<title>Machine Vis. Apps.</title>
</titleInfo>
<genre type="Journal" displayLabel="Archive Journal"></genre>
<originInfo>
<dateIssued encoding="w3cdtf">1995-09-01</dateIssued>
<copyrightDate encoding="w3cdtf">1995</copyrightDate>
</originInfo>
<subject>
<genre>Computer Science</genre>
<topic>Image Processing</topic>
<topic>Communications Engineering, Networks</topic>
</subject>
<identifier type="ISSN">0932-8092</identifier>
<identifier type="eISSN">1432-1769</identifier>
<identifier type="JournalID">138</identifier>
<identifier type="IssueArticleCount">11</identifier>
<identifier type="VolumeIssueCount">6</identifier>
<part>
<date>1995</date>
<detail type="volume">
<number>8</number>
<caption>vol.</caption>
</detail>
<detail type="issue">
<number>5</number>
<caption>no.</caption>
</detail>
<extent unit="pages">
<start>289</start>
<end>304</end>
</extent>
</part>
<recordInfo>
<recordOrigin>Springer-Verlag, 1995</recordOrigin>
</recordInfo>
</relatedItem>
<identifier type="istex">492D826F5B6760747F834966C56C9CDBA401E9E1</identifier>
<identifier type="DOI">10.1007/BF01211490</identifier>
<identifier type="ArticleID">BF01211490</identifier>
<identifier type="ArticleID">Art4</identifier>
<accessCondition type="use and reproduction" contentType="copyright">Springer-Verlag, 1995</accessCondition>
<recordInfo>
<recordContentSource>SPRINGER</recordContentSource>
<recordOrigin>Springer-Verlag, 1995</recordOrigin>
</recordInfo>
</mods>
</metadata>
<enrichments>
<istex:refBibTEI uri="https://api.istex.fr/document/492D826F5B6760747F834966C56C9CDBA401E9E1/enrichments/refBib">
<teiHeader></teiHeader>
<text>
<front></front>
<body></body>
<back>
<listBibl>
<biblStruct xml:id="b0">
<analytic>
<title level="a" type="main">Automated entry system for printed documents</title>
<author>
<persName>
<forename type="first">T</forename>
<surname>Akiyama</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">N</forename>
<surname>Hagita</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Patt Recogn</title>
<imprint>
<biblScope unit="volume">23</biblScope>
<biblScope unit="page" from="1141" to="1154"></biblScope>
<date type="published" when="1990"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b1">
<analytic>
<title level="a" type="main">Local learning algorithms</title>
<author>
<persName>
<forename type="first">L</forename>
<surname>Bottou</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">V</forename>
<surname>Vapnik</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Neural Computat</title>
<imprint>
<biblScope unit="volume">4</biblScope>
<biblScope unit="page" from="888" to="900"></biblScope>
<date type="published" when="1992"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b2">
<analytic>
<title level="a" type="main">From statistics to neural networks Neural networks and the bias</title>
<author>
<persName>
<forename type="first">V</forename>
<surname>Cherkassky</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">J</forename>
<surname>Friedman</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Neural Comput</title>
<imprint>
<publisher>Springer</publisher>
<biblScope unit="volume">5</biblScope>
<biblScope unit="page" from="1" to="58"></biblScope>
<date type="published" when="1992"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b3">
<analytic>
<title level="a" type="main">Digital image processing Progress in supervised neural networks -what's new since Lippmann? IEEE Signal Processing</title>
<author>
<persName>
<forename type="first">Rc</forename>
<surname>Gonzalez</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">P</forename>
<surname>Wintz</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Home BG</title>
<imprint>
<publisher>Addison-Wesley</publisher>
<biblScope unit="page" from="8" to="39"></biblScope>
<date type="published" when="1987"></date>
</imprint>
</monogr>
<note>2nd. edn</note>
</biblStruct>
<biblStruct xml:id="b4">
<analytic>
<title level="a" type="main">Text segmentation using gabor filters for automatic document processing. Machine Vis Appl 5:169- 184 Lippmann RP (1987) An introduction to computing with neural nets</title>
<author>
<persName>
<forename type="first">Ak</forename>
<surname>Jain</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">S</forename>
<surname>Bhattacharjee</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">IEEE Acoustics Speech Signal Processing</title>
<imprint>
<biblScope unit="volume">4</biblScope>
<biblScope unit="page" from="4" to="22"></biblScope>
<date type="published" when="1992"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b5">
<analytic>
<title level="a" type="main">Pattern classification using neural networks</title>
<author>
<persName>
<forename type="first">Rp</forename>
<surname>Lippmann</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">IEEE Communications</title>
<imprint>
<biblScope unit="page" from="47" to="62"></biblScope>
<date type="published" when="1989"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b6">
<analytic>
<title level="a" type="main">A comparative study of the practical characteristics of neural network and conventional pattern classifiers The document spectrum for page layout analysis</title>
<author>
<persName>
<forename type="first">K</forename>
<surname>Ng</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">Rp</forename>
<surname>Lippmann</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Massachusetts Institute Of Technology IEEE Trans Pattern Anal Machine Intell</title>
<imprint>
<biblScope unit="volume">5</biblScope>
<biblScope unit="page" from="1162" to="1173"></biblScope>
<date type="published" when="1991"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b7">
<analytic>
<title level="a" type="main">Page segmentation and classification</title>
<author>
<persName>
<forename type="first">T</forename>
<surname>Pavlidis</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">J</forename>
<surname>Zhou</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Comput Vis Graph Image Processing</title>
<imprint>
<biblScope unit="volume">54</biblScope>
<biblScope unit="page" from="484" to="496"></biblScope>
<date type="published" when="1992"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b8">
<analytic>
<title level="a" type="main">Neural network classifiers estimate bayesian a posteriori probabilities</title>
<author>
<persName>
<forename type="first">Md</forename>
<surname>Richard</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">Rp</forename>
<surname>Lippmann</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Neural Comput</title>
<imprint>
<biblScope unit="volume">3</biblScope>
<biblScope unit="page" from="461" to="483"></biblScope>
<date type="published" when="1991"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b9">
<analytic>
<title level="a" type="main">Probabilistic neural networks</title>
<author>
<persName>
<forename type="first">Df</forename>
<surname>Specht</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Neural Networks</title>
<imprint>
<biblScope unit="volume">3</biblScope>
<biblScope unit="page" from="109" to="118"></biblScope>
<date type="published" when="1990"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b10">
<analytic>
<title level="a" type="main">Block segmentation and text extraction in mixed text/image documents</title>
<author>
<persName>
<forename type="first">Fm</forename>
<surname>Wahl</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">Ky</forename>
<surname>Wong</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">Rg</forename>
<surname>Casey</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Comput Graph Image Processing</title>
<imprint>
<biblScope unit="volume">20</biblScope>
<biblScope unit="page" from="375" to="390"></biblScope>
<date type="published" when="1982"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b11">
<analytic>
<title level="a" type="main">Computer systems that learn: classification and prediction methods from statistics, neural nets, machine learning, and expert systems Introduction to artificial neural systems</title>
<author>
<persName>
<forename type="first">Sm</forename>
<surname>Weiss</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">Ca</forename>
<surname>Kulikowski</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Calif Zurada JM</title>
<imprint>
<publisher>Morgan Kaufmann</publisher>
<date type="published" when="1991"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b12">
<analytic>
<title></title>
<author>
<persName>
<forename type="first">Daniel</forename>
<forename type="middle">X</forename>
<surname>Le</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">June 1986 and the M.S. degree in computer science from</title>
<meeting>
<address>
<addrLine>Fairfax, Virginia ; Pasadena, California. From ; McLean, Virginia ; Since ; Bethesda, Maryland</addrLine>
</address>
</meeting>
<imprint>
<publisher>National Library of Medicine</publisher>
<date type="published" when="1957-01-01"></date>
</imprint>
</monogr>
<note>His current work is concerned with document analysis, image quality, and image processing</note>
</biblStruct>
<biblStruct xml:id="b13">
<analytic>
<title level="a" type="main">adaptive signal processing, information theory, genetic algorithms, and neural networks) scale-space and clustering for early and intermediate vision using conjoint (Gabor and wavelet) image representations, attention, functional and selective perception, face recognition, and object recognition. He was Director for the NATO Advanced Study Institutes on "Active Perception and Robot Vision". He authored over 100 scientific papers, his book "Computational Vision"</title>
</analytic>
<monogr>
<title level="m">His research in the areas of Computer Vision and Neural Networks (NN) includes multistrategy learning (statisticsMaratea, Italy, 1989) and on "From Statistics to Neural Networks" (Les Arcs, France, 1993) and he will serve as the co-Chair for the International Conference on Pattern Recognition to be</title>
<meeting>
<address>
<addrLine>Vienna, Austria</addrLine>
</address>
</meeting>
<imprint>
<date type="published" when="1990"></date>
</imprint>
</monogr>
<note>Vol. 1 &amp; 2) published by Academic Press in 1991. He was elected as an IEEE Fellow in 1992</note>
</biblStruct>
</listBibl>
</back>
</text>
</istex:refBibTEI>
</enrichments>
<serie></serie>
</istex>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A27 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 000A27 | SxmlIndent | more
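
Once the record has been extracted, the reference list can be inspected programmatically. The following is a minimal sketch, not part of Dilib: it assumes the XML has been captured as text, uses Python's standard `xml.etree.ElementTree`, and works on a small hypothetical fragment (`SAMPLE`) copied from the record's `<listBibl>`. The helper name `summarize_references` is an illustrative assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment copied from the record's <listBibl>; in practice the
# text would come from the record extracted with the commands above.
SAMPLE = """
<listBibl>
<biblStruct xml:id="b9">
<analytic>
<title level="a" type="main">Probabilistic neural networks</title>
<author>
<persName>
<forename type="first">Df</forename>
<surname>Specht</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Neural Networks</title>
<imprint>
<biblScope unit="volume">3</biblScope>
<biblScope unit="page" from="109" to="118"></biblScope>
<date type="published" when="1990"></date>
</imprint>
</monogr>
</biblStruct>
</listBibl>
"""

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def summarize_references(xml_text):
    """Return one dict per <biblStruct>: id, authors, title, journal, year."""
    root = ET.fromstring(xml_text)
    refs = []
    for bibl in root.iter("biblStruct"):
        authors = []
        for pers in bibl.iter("persName"):
            # Join forename(s) and surname in document order.
            parts = [el.text.strip() for el in pers if el.text and el.text.strip()]
            authors.append(" ".join(parts))
        date = bibl.find("./monogr/imprint/date")
        refs.append({
            "id": bibl.get(XML_ID),
            "authors": authors,
            "title": (bibl.findtext("./analytic/title") or "").strip(),
            "journal": (bibl.findtext("./monogr/title") or "").strip(),
            "year": date.get("when") if date is not None else None,
        })
    return refs


for ref in summarize_references(SAMPLE):
    print(ref)
```

The full record uses prefixed names such as `wicri:` and `istex:`, so parsing it whole would presumably require the matching namespace declarations; the sketch sidesteps this by working on the `<listBibl>` fragment only.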

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:492D826F5B6760747F834966C56C9CDBA401E9E1
   |texte=   Classification of binary document images into textual or nontextual data blocks using neural network models
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024