Binarization of color document images via luminance and saturation color features.
Internal identifier: 000070 (PubMed/Checkpoint); previous: 000069; next: 000071
Authors: Chun-Ming Tsai [Taiwan]; Hsi-Jian Lee
Source:
- IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society [1057-7149]; 2002.
Abstract
This paper presents a novel binarization algorithm for color document images. Conventional thresholding methods do not produce satisfactory binarization results for documents with close or mixed foreground colors and background colors. Initially, statistical image features are extracted from the luminance distribution. Then, a decision-tree based binarization method is proposed, which selects various color features to binarize color document images. First, if the document image colors are concentrated within a limited range, saturation is employed. Second, if the image foreground colors are significant, luminance is adopted. Third, if the image background colors are concentrated within a limited range, luminance is also applied. Fourth, if the total number of pixels with low luminance (less than 60) is limited, saturation is applied; else both luminance and saturation are employed. Our experiments include 519 color images, most of which are uniform invoice and name-card document images. The proposed binarization method generates better results than other available methods in shape and connected-component measurements. Also, the binarization method obtains higher recognition accuracy in a commercial OCR system than other comparable methods.
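The four-branch selection rule described in the abstract can be sketched as follows. This is only an illustration of the order of the tests, not the authors' implementation: the abstract does not say how each condition is measured, so the `LuminanceStats` fields, the function names, and the idea of passing the conditions in as precomputed booleans are all assumptions. The only value taken from the source is the low-luminance cutoff of 60.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# From the abstract: "low luminance (less than 60)".
LOW_LUMINANCE_CUTOFF = 60


@dataclass
class LuminanceStats:
    """Hypothetical summary of the luminance distribution.

    How each flag is derived is NOT given in the abstract; the caller
    is assumed to compute them from the image's luminance statistics.
    """
    colors_concentrated: bool      # image colors lie within a limited range
    foreground_significant: bool   # foreground colors are significant
    background_concentrated: bool  # background colors lie within a limited range
    low_luminance_limited: bool    # few pixels fall below the cutoff


def count_low_luminance(luminances: Iterable[int]) -> int:
    """Count pixels whose luminance is strictly below the cutoff."""
    return sum(1 for value in luminances if value < LOW_LUMINANCE_CUTOFF)


def select_features(stats: LuminanceStats) -> Tuple[str, ...]:
    """Return the color feature(s) to binarize with, testing in the abstract's order."""
    if stats.colors_concentrated:
        return ("saturation",)
    if stats.foreground_significant:
        return ("luminance",)
    if stats.background_concentrated:
        return ("luminance",)
    if stats.low_luminance_limited:
        return ("saturation",)
    # Otherwise both features are employed.
    return ("luminance", "saturation")
```

Because the tests are applied in order, an image that satisfies the first condition is binarized on saturation even if later conditions also hold. What count of low-luminance pixels qualifies as "limited" is left to the caller here, since the abstract gives no figure.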
DOI: 10.1109/TIP.2002.999677
PubMed: 18244645
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Binarization of color document images via luminance and saturation color features.</title>
<author><name sortKey="Tsai, Chun Ming" sort="Tsai, Chun Ming" uniqKey="Tsai C" first="Chun-Ming" last="Tsai">Chun-Ming Tsai</name>
<affiliation wicri:level="1"><nlm:affiliation>Dept. of Comput. Sci. and Inf. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan, R. O. C. chunming@csie.nctu.edu.tw</nlm:affiliation>
<country wicri:rule="url">Taïwan</country>
</affiliation>
</author>
<author><name sortKey="Lee, Hsi Jian" sort="Lee, Hsi Jian" uniqKey="Lee H" first="Hsi-Jian" last="Lee">Hsi-Jian Lee</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2002">2002</date>
<idno type="doi">10.1109/TIP.2002.999677</idno>
<idno type="RBID">pubmed:18244645</idno>
<idno type="pmid">18244645</idno>
<idno type="wicri:Area/PubMed/Corpus">000054</idno>
<idno type="wicri:Area/PubMed/Curation">000054</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000054</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Binarization of color document images via luminance and saturation color features.</title>
<author><name sortKey="Tsai, Chun Ming" sort="Tsai, Chun Ming" uniqKey="Tsai C" first="Chun-Ming" last="Tsai">Chun-Ming Tsai</name>
<affiliation wicri:level="1"><nlm:affiliation>Dept. of Comput. Sci. and Inf. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan, R. O. C. chunming@csie.nctu.edu.tw</nlm:affiliation>
<country wicri:rule="url">Taïwan</country>
</affiliation>
</author>
<author><name sortKey="Lee, Hsi Jian" sort="Lee, Hsi Jian" uniqKey="Lee H" first="Hsi-Jian" last="Lee">Hsi-Jian Lee</name>
</author>
</analytic>
<series><title level="j">IEEE transactions on image processing : a publication of the IEEE Signal Processing Society</title>
<idno type="ISSN">1057-7149</idno>
<imprint><date when="2002" type="published">2002</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">This paper presents a novel binarization algorithm for color document images. Conventional thresholding methods do not produce satisfactory binarization results for documents with close or mixed foreground colors and background colors. Initially, statistical image features are extracted from the luminance distribution. Then, a decision-tree based binarization method is proposed, which selects various color features to binarize color document images. First, if the document image colors are concentrated within a limited range, saturation is employed. Second, if the image foreground colors are significant, luminance is adopted. Third, if the image background colors are concentrated within a limited range, luminance is also applied. Fourth, if the total number of pixels with low luminance (less than 60) is limited, saturation is applied; else both luminance and saturation are employed. Our experiments include 519 color images, most of which are uniform invoice and name-card document images. The proposed binarization method generates better results than other available methods in shape and connected-component measurements. Also, the binarization method obtains higher recognition accuracy in a commercial OCR system than other comparable methods.</div>
</front>
</TEI>
<pubmed><MedlineCitation Owner="NLM" Status="PubMed-not-MEDLINE"><PMID Version="1">18244645</PMID>
<DateCreated><Year>2008</Year>
<Month>02</Month>
<Day>04</Day>
</DateCreated>
<DateCompleted><Year>2009</Year>
<Month>12</Month>
<Day>16</Day>
</DateCompleted>
<Article PubModel="Print"><Journal><ISSN IssnType="Print">1057-7149</ISSN>
<JournalIssue CitedMedium="Print"><Volume>11</Volume>
<Issue>4</Issue>
<PubDate><Year>2002</Year>
</PubDate>
</JournalIssue>
<Title>IEEE transactions on image processing : a publication of the IEEE Signal Processing Society</Title>
<ISOAbbreviation>IEEE Trans Image Process</ISOAbbreviation>
</Journal>
<ArticleTitle>Binarization of color document images via luminance and saturation color features.</ArticleTitle>
<Pagination><MedlinePgn>434-51</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1109/TIP.2002.999677</ELocationID>
<Abstract><AbstractText>This paper presents a novel binarization algorithm for color document images. Conventional thresholding methods do not produce satisfactory binarization results for documents with close or mixed foreground colors and background colors. Initially, statistical image features are extracted from the luminance distribution. Then, a decision-tree based binarization method is proposed, which selects various color features to binarize color document images. First, if the document image colors are concentrated within a limited range, saturation is employed. Second, if the image foreground colors are significant, luminance is adopted. Third, if the image background colors are concentrated within a limited range, luminance is also applied. Fourth, if the total number of pixels with low luminance (less than 60) is limited, saturation is applied; else both luminance and saturation are employed. Our experiments include 519 color images, most of which are uniform invoice and name-card document images. The proposed binarization method generates better results than other available methods in shape and connected-component measurements. Also, the binarization method obtains higher recognition accuracy in a commercial OCR system than other comparable methods.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y"><Author ValidYN="Y"><LastName>Tsai</LastName>
<ForeName>Chun-Ming</ForeName>
<Initials>CM</Initials>
<AffiliationInfo><Affiliation>Dept. of Comput. Sci. and Inf. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan, R. O. C. chunming@csie.nctu.edu.tw</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Lee</LastName>
<ForeName>Hsi-Jian</ForeName>
<Initials>HJ</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList><PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
</Article>
<MedlineJournalInfo><Country>United States</Country>
<MedlineTA>IEEE Trans Image Process</MedlineTA>
<NlmUniqueID>9886191</NlmUniqueID>
<ISSNLinking>1057-7149</ISSNLinking>
</MedlineJournalInfo>
</MedlineCitation>
<PubmedData><History><PubMedPubDate PubStatus="pubmed"><Year>2008</Year>
<Month>2</Month>
<Day>5</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline"><Year>2008</Year>
<Month>2</Month>
<Day>5</Day>
<Hour>9</Hour>
<Minute>1</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez"><Year>2008</Year>
<Month>2</Month>
<Day>5</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList><ArticleId IdType="doi">10.1109/TIP.2002.999677</ArticleId>
<ArticleId IdType="pubmed">18244645</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
<affiliations><list><country><li>Taïwan</li>
</country>
</list>
<tree><noCountry><name sortKey="Lee, Hsi Jian" sort="Lee, Hsi Jian" uniqKey="Lee H" first="Hsi-Jian" last="Lee">Hsi-Jian Lee</name>
</noCountry>
<country name="Taïwan"><noRegion><name sortKey="Tsai, Chun Ming" sort="Tsai, Chun Ming" uniqKey="Tsai C" first="Chun-Ming" last="Tsai">Chun-Ming Tsai</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PubMed/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000070 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd -nk 000070 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= PubMed |étape= Checkpoint |type= RBID |clé= pubmed:18244645 |texte= Binarization of color document images via luminance and saturation color features. }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/RBID.i -Sk "pubmed:18244645" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd \
       | NlmPubMed2Wicri -a OcrV1
This area was generated with Dilib version V0.6.32.