Image-based keyword recognition in oriental language document images

Internal identifier: 001666 (Istex/Corpus); previous: 001665; next: 001667

Authors: Jason Zhu; Tao Hong; Jonathan J. Hull

Source:

Pattern Recognition [ 0031-3203 ] ; 1996.

RBID : ISTEX:FC715F63A7954E7FB306669F696A79E93A2828E0

Abstract

An algorithm is presented for keyword recognition in Oriental language document images. The objective is to recognize keywords composed of more than one consecutive character in document images where there are no explicit visually defined word boundaries. The technique exploits the redundancy expressed by the difference between the number of possible character strings of a fixed length and the number of legal words of that length. Sequences of character images are matched simultaneously to a dictionary of keywords and illegal strings that are visually similar to the keywords. A keyword is located if its image is more likely to occur than any of the illegal strings that are visually similar to it. No intermediate character recognition step is used. The application of contextual information directly to the interpretation of features extracted from the image overcomes noise that could make isolated character recognition impossible and the location of words with conventional post-processing algorithms difficult. Experimental results demonstrate the ability of the proposed algorithm to correctly recognize words in the presence of noise that could not be overcome by conventional character recognition or post-processing algorithms.
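
The decision rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two-dimensional feature vectors, the `TEMPLATES` table, the squared-distance score standing in for a log-likelihood, and the names `toy_log_prob`, `string_score` and `find_keyword` are all assumptions made for illustration. Only the rule itself comes from the abstract: a keyword is reported at a position when it scores higher than every visually similar illegal string, and no per-character recognition step is taken.

```python
# Hypothetical per-character feature templates (assumption: 2-D toy features).
TEMPLATES = {
    "A": (0.0, 0.0),
    "B": (1.0, 0.0),
    "C": (0.0, 1.0),
    "D": (1.0, 1.0),
}


def toy_log_prob(feature, char):
    """Negative squared distance to the template, used as a stand-in log-likelihood."""
    template = TEMPLATES[char]
    return -sum((x - y) ** 2 for x, y in zip(feature, template))


def string_score(window, string, log_prob):
    """Score a whole character string against a window of character-image features."""
    return sum(log_prob(f, c) for f, c in zip(window, string))


def find_keyword(features, keyword, decoys, log_prob):
    """Return offsets where the keyword outscores every visually similar decoy."""
    n = len(keyword)
    hits = []
    for i in range(len(features) - n + 1):
        window = features[i:i + n]
        keyword_score = string_score(window, keyword, log_prob)
        # The keyword is accepted only if it beats all illegal look-alike strings.
        if all(keyword_score > string_score(window, d, log_prob) for d in decoys):
            hits.append(i)
    return hits


# Noisy features for four consecutive character images (toy data).
FEATURES = [(0.9, 0.9), (0.1, 0.2), (0.8, 0.3), (0.1, 0.9)]
HITS = find_keyword(FEATURES, "AB", ["AD", "CB"], toy_log_prob)
print(HITS)
```

With these toy values the keyword "AB" is reported only at offset 1. At offsets 0 and 2 one of the decoy strings scores higher, so the window is rejected even though no individual character was ever classified.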

Url:

https://api.istex.fr/document/FC715F63A7954E7FB306669F696A79E93A2828E0/fulltext/pdf

DOI: 10.1016/S0031-3203(97)83110-X

Links to Exploration step

ISTEX:FC715F63A7954E7FB306669F696A79E93A2828E0

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title>Image-based keyword recognition in oriental language document images</title>
<author><name sortKey="Zhu, Jason" sort="Zhu, Jason" uniqKey="Zhu J" first="Jason" last="Zhu">Jason Zhu</name>
<affiliation><mods:affiliation>Microsoft Corporation Seattle, WA, U.S.A.</mods:affiliation>
</affiliation>
</author>
<author><name sortKey="Hong, Tao" sort="Hong, Tao" uniqKey="Hong T" first="Tao" last="Hong">Tao Hong</name>
<affiliation><mods:affiliation>Center of Excellence for Document Analysis and Recognition, State University of New York at Buffalo, Buffalo, NY 14260, U.S.A.</mods:affiliation>
</affiliation>
</author>
<author><name sortKey="Hull, Jonathan J" sort="Hull, Jonathan J" uniqKey="Hull J" first="Jonathan J." last="Hull">Jonathan J. Hull</name>
<affiliation><mods:affiliation>Author to whom correspondence should be addressed.</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>Ricoh California Research Center, 2882 Sand Hill Road, Suite 115, Menlo Park, CA 94025, U.S.A.</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>E-mail: hull@crc.ricoh.com</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:FC715F63A7954E7FB306669F696A79E93A2828E0</idno>
<date when="1997" year="1997">1997</date>
<idno type="doi">10.1016/S0031-3203(97)83110-X</idno>
<idno type="url">https://api.istex.fr/document/FC715F63A7954E7FB306669F696A79E93A2828E0/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001666</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a">Image-based keyword recognition in oriental language document images</title>
<author><name sortKey="Zhu, Jason" sort="Zhu, Jason" uniqKey="Zhu J" first="Jason" last="Zhu">Jason Zhu</name>
<affiliation><mods:affiliation>Microsoft Corporation Seattle, WA, U.S.A.</mods:affiliation>
</affiliation>
</author>
<author><name sortKey="Hong, Tao" sort="Hong, Tao" uniqKey="Hong T" first="Tao" last="Hong">Tao Hong</name>
<affiliation><mods:affiliation>Center of Excellence for Document Analysis and Recognition, State University of New York at Buffalo, Buffalo, NY 14260, U.S.A.</mods:affiliation>
</affiliation>
</author>
<author><name sortKey="Hull, Jonathan J" sort="Hull, Jonathan J" uniqKey="Hull J" first="Jonathan J." last="Hull">Jonathan J. Hull</name>
<affiliation><mods:affiliation>Author to whom correspondence should be addressed.</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>Ricoh California Research Center, 2882 Sand Hill Road, Suite 115, Menlo Park, CA 94025, U.S.A.</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>E-mail: hull@crc.ricoh.com</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Pattern Recognition</title>
<title level="j" type="abbrev">PR</title>
<idno type="ISSN">0031-3203</idno>
<imprint><publisher>ELSEVIER</publisher>
<date type="published" when="1996">1996</date>
<biblScope unit="volume">30</biblScope>
<biblScope unit="issue">8</biblScope>
<biblScope unit="page" from="1293">1293</biblScope>
<biblScope unit="page" to="1300">1300</biblScope>
</imprint>
<idno type="ISSN">0031-3203</idno>
</series>
<idno type="istex">FC715F63A7954E7FB306669F696A79E93A2828E0</idno>
<idno type="DOI">10.1016/S0031-3203(97)83110-X</idno>
<idno type="PII">S0031-3203(97)83110-X</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0031-3203</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">An algorithm is presented for keyword recognition in Oriental language document images. The objective is to recognize keywords composed of more than one consecutive character in document images where there are no explicit visually defined word boundaries. The technique exploits the redundancy expressed by the difference between the number of possible character strings of a fixed length and the number of legal words of that length. Sequences of character images are matched simultaneously to a dictionary of keywords and illegal strings that are visually similar to the keywords. A keyword is located if its image is more likely to occur than any of the illegal strings that are visually similar to it. No intermediate character recognition step is used. The application of contextual information directly to the interpretation of features extracted from the image overcomes noise that could make isolated character recognition impossible and the location of words with conventional post-processing algorithms difficult. Experimental results demonstrate the ability of the proposed algorithm to correctly recognize words in the presence of noise that could not be overcome by conventional character recognition or post-processing algorithms.</div>
</front>
</TEI>
<istex><corpusName>elsevier</corpusName>
<author><json:item><name>Jason Zhu</name>
<affiliations><json:string>Microsoft Corporation Seattle, WA, U.S.A.</json:string>
</affiliations>
</json:item>
<json:item><name>Tao Hong</name>
<affiliations><json:string>Center of Excellence for Document Analysis and Recognition, State University of New York at Buffalo, Buffalo, NY 14260, U.S.A.</json:string>
</affiliations>
</json:item>
<json:item><name>Jonathan J. Hull</name>
<affiliations><json:string>Author to whom correspondence should be addressed.</json:string>
<json:string>Ricoh California Research Center, 2882 Sand Hill Road, Suite 115, Menlo Park, CA 94025, U.S.A.</json:string>
<json:string>E-mail: hull@crc.ricoh.com</json:string>
</affiliations>
</json:item>
</author>
<subject><json:item><lang><json:string>eng</json:string>
</lang>
<value>Word recognition</value>
</json:item>
<json:item><lang><json:string>eng</json:string>
</lang>
<value>Chinese</value>
</json:item>
<json:item><lang><json:string>eng</json:string>
</lang>
<value>Japanese</value>
</json:item>
<json:item><lang><json:string>eng</json:string>
</lang>
<value>Oriental languages</value>
</json:item>
<json:item><lang><json:string>eng</json:string>
</lang>
<value>Text recognition</value>
</json:item>
<json:item><lang><json:string>eng</json:string>
</lang>
<value>Contextual post-processing</value>
</json:item>
<json:item><lang><json:string>eng</json:string>
</lang>
<value>Word spotting</value>
</json:item>
</subject>
<language><json:string>eng</json:string>
</language>
<abstract>An algorithm is presented for keyword recognition in Oriental language document images. The objective is to recognize keywords composed of more than one consecutive character in document images where there are no explicit visually defined word boundaries. The technique exploits the redundancy expressed by the difference between the number of possible character strings of a fixed length and the number of legal words of that length. Sequences of character images are matched simultaneously to a dictionary of keywords and illegal strings that are visually similar to the keywords. A keyword is located if its image is more likely to occur than any of the illegal strings that are visually similar to it. No intermediate character recognition step is used. The application of contextual information directly to the interpretation of features extracted from the image overcomes noise that could make isolated character recognition impossible and the location of words with conventional post-processing algorithms difficult. Experimental results demonstrate the ability of the proposed algorithm to correctly recognize words in the presence of noise that could not be overcome by conventional character recognition or post-processing algorithms.</abstract>
<qualityIndicators><score>6.617</score>
<pdfVersion>1.2</pdfVersion>
<pdfPageSize>526 x 771 pts</pdfPageSize>
<refBibsNative>true</refBibsNative>
<keywordCount>7</keywordCount>
<abstractCharCount>1244</abstractCharCount>
<pdfWordCount>4421</pdfWordCount>
<pdfCharCount>25239</pdfCharCount>
<pdfPageCount>8</pdfPageCount>
<abstractWordCount>183</abstractWordCount>
</qualityIndicators>
<title>Image-based keyword recognition in oriental language document images</title>
<pii><json:string>S0031-3203(97)83110-X</json:string>
</pii>
<genre><json:string>research-article</json:string>
</genre>
<serie><volume>80</volume>
<pages><last>1058</last>
<first>1029</first>
</pages>
<genre></genre>
<language><json:string>unknown</json:string>
</language>
<title>(7)</title>
</serie>
<host><volume>30</volume>
<pii><json:string>S0031-3203(00)X0030-1</json:string>
</pii>
<pages><last>1300</last>
<first>1293</first>
</pages>
<issn><json:string>0031-3203</json:string>
</issn>
<issue>8</issue>
<genre><json:string>Journal</json:string>
</genre>
<language><json:string>unknown</json:string>
</language>
<title>Pattern Recognition</title>
<publicationDate>1997</publicationDate>
</host>
<categories><wos><json:string>COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE</json:string>
<json:string>ENGINEERING, ELECTRICAL &amp; ELECTRONIC</json:string>
</wos>
</categories>
<publicationDate>1996</publicationDate>
<copyrightDate>1997</copyrightDate>
<doi><json:string>10.1016/S0031-3203(97)83110-X</json:string>
</doi>
<id>FC715F63A7954E7FB306669F696A79E93A2828E0</id>
<fulltext><json:item><original>true</original>
<mimetype>application/pdf</mimetype>
<extension>pdf</extension>
<uri>https://api.istex.fr/document/FC715F63A7954E7FB306669F696A79E93A2828E0/fulltext/pdf</uri>
</json:item>
<json:item><original>true</original>
<mimetype>text/plain</mimetype>
<extension>txt</extension>
<uri>https://api.istex.fr/document/FC715F63A7954E7FB306669F696A79E93A2828E0/fulltext/txt</uri>
</json:item>
<json:item><original>false</original>
<mimetype>application/zip</mimetype>
<extension>zip</extension>
<uri>https://api.istex.fr/document/FC715F63A7954E7FB306669F696A79E93A2828E0/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/FC715F63A7954E7FB306669F696A79E93A2828E0/fulltext/tei"><teiHeader><fileDesc><titleStmt><title level="a">Image-based keyword recognition in oriental language document images</title>
</titleStmt>
<publicationStmt><authority>ISTEX</authority>
<publisher>ELSEVIER</publisher>
<availability><p>ELSEVIER</p>
</availability>
<date>1997</date>
</publicationStmt>
<sourceDesc><biblStruct type="inbook"><analytic><title level="a">Image-based keyword recognition in oriental language document images</title>
<author><persName><forename type="first">Jason</forename>
<surname>Zhu</surname>
</persName>
<affiliation>Microsoft Corporation Seattle, WA, U.S.A.</affiliation>
</author>
<author><persName><forename type="first">Tao</forename>
<surname>Hong</surname>
</persName>
<affiliation>Center of Excellence for Document Analysis and Recognition, State University of New York at Buffalo, Buffalo, NY 14260, U.S.A.</affiliation>
</author>
<author><persName><forename type="first">Jonathan J.</forename>
<surname>Hull</surname>
</persName>
<email>hull@crc.ricoh.com</email>
<affiliation>Author to whom correspondence should be addressed.</affiliation>
<affiliation>Ricoh California Research Center, 2882 Sand Hill Road, Suite 115, Menlo Park, CA 94025, U.S.A.</affiliation>
</author>
</analytic>
<monogr><title level="j">Pattern Recognition</title>
<title level="j" type="abbrev">PR</title>
<idno type="pISSN">0031-3203</idno>
<idno type="PII">S0031-3203(00)X0030-1</idno>
<imprint><publisher>ELSEVIER</publisher>
<date type="published" when="1996"></date>
<biblScope unit="volume">30</biblScope>
<biblScope unit="issue">8</biblScope>
<biblScope unit="page" from="1293">1293</biblScope>
<biblScope unit="page" to="1300">1300</biblScope>
</imprint>
</monogr>
<idno type="istex">FC715F63A7954E7FB306669F696A79E93A2828E0</idno>
<idno type="DOI">10.1016/S0031-3203(97)83110-X</idno>
<idno type="PII">S0031-3203(97)83110-X</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><creation><date>1997</date>
</creation>
<langUsage><language ident="en">en</language>
</langUsage>
<abstract xml:lang="en"><p>An algorithm is presented for keyword recognition in Oriental language document images. The objective is to recognize keywords composed of more than one consecutive character in document images where there are no explicit visually defined word boundaries. The technique exploits the redundancy expressed by the difference between the number of possible character strings of a fixed length and the number of legal words of that length. Sequences of character images are matched simultaneously to a dictionary of keywords and illegal strings that are visually similar to the keywords. A keyword is located if its image is more likely to occur than any of the illegal strings that are visually similar to it. No intermediate character recognition step is used. The application of contextual information directly to the interpretation of features extracted from the image overcomes noise that could make isolated character recognition impossible and the location of words with conventional post-processing algorithms difficult. Experimental results demonstrate the ability of the proposed algorithm to correctly recognize words in the presence of noise that could not be overcome by conventional character recognition or post-processing algorithms.</p>
</abstract>
<textClass><keywords scheme="keyword"><list><head>Keywords</head>
<item><term>Word recognition</term>
</item>
<item><term>Chinese</term>
</item>
<item><term>Japanese</term>
</item>
<item><term>Oriental languages</term>
</item>
<item><term>Text recognition</term>
</item>
<item><term>Contextual post-processing</term>
</item>
<item><term>Word spotting</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc><change when="1996-10-21">Registration</change>
<change when="1996-07-30">Modified</change>
<change when="1996">Published</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
</fulltext>
<metadata><istex:metadataXml wicri:clean="Elsevier, elements deleted: tail"><istex:xmlDeclaration>version="1.0" encoding="utf-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//ES//DTD journal article DTD version 4.5.2//EN//XML" URI="art452.dtd" name="istex:docType"></istex:docType>
<istex:document><converted-article version="4.5.2" docsubtype="fla"><item-info><jid>PR</jid>
<aid>9783110X</aid>
<ce:pii>S0031-3203(97)83110-X</ce:pii>
<ce:doi>10.1016/S0031-3203(97)83110-X</ce:doi>
<ce:copyright type="unknown" year="1997"></ce:copyright>
</item-info>
<head><ce:title>Image-based keyword recognition in oriental language document images</ce:title>
<ce:author-group><ce:author><ce:given-name>Jason</ce:given-name>
<ce:surname>Zhu</ce:surname>
<ce:cross-ref refid="AFF1"><ce:sup>†</ce:sup>
</ce:cross-ref>
</ce:author>
<ce:author><ce:given-name>Tao</ce:given-name>
<ce:surname>Hong</ce:surname>
<ce:cross-ref refid="AFF2"><ce:sup>‡</ce:sup>
</ce:cross-ref>
</ce:author>
<ce:author><ce:given-name>Jonathan J.</ce:given-name>
<ce:surname>Hull</ce:surname>
<ce:cross-ref refid="COR1"><ce:sup>∗</ce:sup>
</ce:cross-ref>
<ce:cross-ref refid="AFF3"><ce:sup>§</ce:sup>
</ce:cross-ref>
<ce:e-address>hull@crc.ricoh.com</ce:e-address>
</ce:author>
<ce:affiliation id="AFF1"><ce:label>†</ce:label>
<ce:textfn>Microsoft Corporation Seattle, WA, U.S.A.</ce:textfn>
</ce:affiliation>
<ce:affiliation id="AFF2"><ce:label>‡</ce:label>
<ce:textfn>Center of Excellence for Document Analysis and Recognition, State University of New York at Buffalo, Buffalo, NY 14260, U.S.A.</ce:textfn>
</ce:affiliation>
<ce:affiliation id="AFF3"><ce:label>§</ce:label>
<ce:textfn>Ricoh California Research Center, 2882 Sand Hill Road, Suite 115, Menlo Park, CA 94025, U.S.A.</ce:textfn>
</ce:affiliation>
<ce:correspondence id="COR1"><ce:label>∗</ce:label>
<ce:text>Author to whom correspondence should be addressed.</ce:text>
</ce:correspondence>
</ce:author-group>
<ce:date-received day="16" month="7" year="1996"></ce:date-received>
<ce:date-revised day="30" month="7" year="1996"></ce:date-revised>
<ce:date-accepted day="21" month="10" year="1996"></ce:date-accepted>
<ce:abstract><ce:section-title>Abstract</ce:section-title>
<ce:abstract-sec><ce:simple-para>An algorithm is presented for keyword recognition in Oriental language document images. The objective is to recognize keywords composed of more than one consecutive character in document images where there are no explicit visually defined word boundaries. The technique exploits the redundancy expressed by the difference between the number of possible character strings of a fixed length and the number of legal words of that length. Sequences of character images are matched simultaneously to a dictionary of keywords and illegal strings that are visually similar to the keywords. A keyword is located if its image is more likely to occur than any of the illegal strings that are visually similar to it. No intermediate character recognition step is used. The application of contextual information directly to the interpretation of features extracted from the image overcomes noise that could make isolated character recognition impossible and the location of words with conventional post-processing algorithms difficult. Experimental results demonstrate the ability of the proposed algorithm to correctly recognize words in the presence of noise that could not be overcome by conventional character recognition or post-processing algorithms.</ce:simple-para>
</ce:abstract-sec>
</ce:abstract>
<ce:keywords><ce:section-title>Keywords</ce:section-title>
<ce:keyword><ce:text>Word recognition</ce:text>
</ce:keyword>
<ce:keyword><ce:text>Chinese</ce:text>
</ce:keyword>
<ce:keyword><ce:text>Japanese</ce:text>
</ce:keyword>
<ce:keyword><ce:text>Oriental languages</ce:text>
</ce:keyword>
<ce:keyword><ce:text>Text recognition</ce:text>
</ce:keyword>
<ce:keyword><ce:text>Contextual post-processing</ce:text>
</ce:keyword>
<ce:keyword><ce:text>Word spotting</ce:text>
</ce:keyword>
</ce:keywords>
</head>
</converted-article>
</istex:document>
</istex:metadataXml>
<mods version="3.6"><titleInfo><title>Image-based keyword recognition in oriental language document images</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA"><title>Image-based keyword recognition in oriental language document images</title>
</titleInfo>
<name type="personal"><namePart type="given">Jason</namePart>
<namePart type="family">Zhu</namePart>
<affiliation>Microsoft Corporation Seattle, WA, U.S.A.</affiliation>
<role><roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Tao</namePart>
<namePart type="family">Hong</namePart>
<affiliation>Center of Excellence for Document Analysis and Recognition, State University of New York at Buffalo, Buffalo, NY 14260, U.S.A.</affiliation>
<role><roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Jonathan J.</namePart>
<namePart type="family">Hull</namePart>
<affiliation>Author to whom correspondence should be addressed.</affiliation>
<affiliation>Ricoh California Research Center, 2882 Sand Hill Road, Suite 115, Menlo Park, CA 94025, U.S.A.</affiliation>
<affiliation>E-mail: hull@crc.ricoh.com</affiliation>
<role><roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="research-article" displayLabel="Full-length article"></genre>
<originInfo><publisher>ELSEVIER</publisher>
<dateIssued encoding="w3cdtf">1996</dateIssued>
<dateValid encoding="w3cdtf">1996-10-21</dateValid>
<dateModified encoding="w3cdtf">1996-07-30</dateModified>
<copyrightDate encoding="w3cdtf">1997</copyrightDate>
</originInfo>
<language><languageTerm type="code" authority="iso639-2b">eng</languageTerm>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
</language>
<physicalDescription><internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract lang="en">An algorithm is presented for keyword recognition in Oriental language document images. The objective is to recognize keywords composed of more than one consecutive character in document images where there are no explicit visually defined word boundaries. The technique exploits the redundancy expressed by the difference between the number of possible character strings of a fixed length and the number of legal words of that length. Sequences of character images are matched simultaneously to a dictionary of keywords and illegal strings that are visually similar to the keywords. A keyword is located if its image is more likely to occur than any of the illegal strings that are visually similar to it. No intermediate character recognition step is used. The application of contextual information directly to the interpretation of features extracted from the image overcomes noise that could make isolated character recognition impossible and the location of words with conventional post-processing algorithms difficult. Experimental results demonstrate the ability of the proposed algorithm to correctly recognize words in the presence of noise that could not be overcome by conventional character recognition or post-processing algorithms.</abstract>
<subject><genre>Keywords</genre>
<topic>Word recognition</topic>
<topic>Chinese</topic>
<topic>Japanese</topic>
<topic>Oriental languages</topic>
<topic>Text recognition</topic>
<topic>Contextual post-processing</topic>
<topic>Word spotting</topic>
</subject>
<relatedItem type="host"><titleInfo><title>Pattern Recognition</title>
</titleInfo>
<titleInfo type="abbreviated"><title>PR</title>
</titleInfo>
<genre type="Journal">journal</genre>
<originInfo><dateIssued encoding="w3cdtf">199708</dateIssued>
</originInfo>
<identifier type="ISSN">0031-3203</identifier>
<identifier type="PII">S0031-3203(00)X0030-1</identifier>
<part><date>199708</date>
<detail type="issue"><title>Oriental Character Recognition</title>
</detail>
<detail type="volume"><number>30</number>
<caption>vol.</caption>
</detail>
<detail type="issue"><number>8</number>
<caption>no.</caption>
</detail>
<extent unit="issue pages"><start>1253</start>
<end>1371</end>
</extent>
<extent unit="pages"><start>1293</start>
<end>1300</end>
</extent>
</part>
</relatedItem>
<identifier type="istex">FC715F63A7954E7FB306669F696A79E93A2828E0</identifier>
<identifier type="DOI">10.1016/S0031-3203(97)83110-X</identifier>
<identifier type="PII">S0031-3203(97)83110-X</identifier>
<recordInfo><recordContentSource>ELSEVIER</recordContentSource>
</recordInfo>
</mods>
</metadata>
<enrichments><istex:catWosTEI uri="https://api.istex.fr/document/FC715F63A7954E7FB306669F696A79E93A2828E0/enrichments/catWos"><teiHeader><profileDesc><textClass><classCode scheme="WOS">COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE</classCode>
<classCode scheme="WOS">ENGINEERING, ELECTRICAL &amp; ELECTRONIC</classCode>
</textClass>
</profileDesc>
</teiHeader>
</istex:catWosTEI>
</enrichments>
</istex>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001666 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 001666 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:FC715F63A7954E7FB306669F696A79E93A2828E0
   |texte=   Image-based keyword recognition in oriental language document images
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
