Document recognition for a Digital Library

Internal identifier: 000B88 (Istex/Corpus); previous: 000B87; next: 000B89

Authors: N. Srihari; W. Lam; J. Hull

Source:

Lecture Notes in Computer Science [0302-9743]; 1995.

RBID : ISTEX:7133932DAB9E25F7D661F69E71FA4085686E2629

Abstract

Our work concerns three major processes that help to build a digital library (DL) database for information retrieval (IR). Several components of the system are shown to be useful for creating a DL database. The design of these processes stresses robustness and ease of adaptation to different documents. The output generated by these processes helps the IR mechanism produce intelligent responses to user queries. This chapter presented an adaptive approach to document understanding, whose robustness was shown to be crucial to successfully processing varied library documents. The chapter also presented an adaptation of the vector space model for information retrieval to improve the performance of a word recognition algorithm. The neighborhoods of visually similar words determined by word recognition are matched to a database of documents, and a subset of documents whose topics are similar to those of the input image is determined.
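
The matching step described at the end of the abstract can be illustrated with a small sketch. This is not the authors' implementation: the TF-IDF weighting, the cosine similarity measure, the pooling of all candidate words into a single query, and the names `tf_idf_vectors`, `cosine`, and `rank_documents` are assumptions chosen only to show how neighborhoods of visually similar words might be matched against a document database in a vector space model.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per document.

    Hypothetical helper: whitespace tokenization and this particular
    IDF formula are illustrative assumptions.
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}
    vectors = [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(neighborhoods, docs, top_k=2):
    """Return indices of the documents most similar to the word neighborhoods.

    Each neighborhood is the list of candidate words the recognizer
    proposed for one word image. All candidates are pooled into one
    query vector (an assumption made for this sketch).
    """
    vectors, idf = tf_idf_vectors(docs)
    query_counts = Counter(w.lower() for hood in neighborhoods for w in hood)
    # Candidates that never occur in the database contribute nothing.
    query = {t: c * idf[t] for t, c in query_counts.items() if t in idf}
    scored = sorted(
        ((cosine(query, vec), idx) for idx, vec in enumerate(vectors)),
        reverse=True,
    )
    return [idx for score, idx in scored[:top_k] if score > 0.0]
```

Calling `rank_documents` with one neighborhood per word image and a list of document texts returns the indices of the documents that share vocabulary with the candidates. In a pipeline of the kind the abstract outlines, the vocabulary of those documents could then be used to prefer some candidates in each neighborhood over others.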

URL:

https://api.istex.fr/document/7133932DAB9E25F7D661F69E71FA4085686E2629/fulltext/pdf

DOI: 10.1007/BFb0026853

Links to Exploration step

ISTEX:7133932DAB9E25F7D661F69E71FA4085686E2629

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Document recognition for a Digital Library</title>
<author><name sortKey="Srihari, N" sort="Srihari, N" uniqKey="Srihari N" first="N." last="Srihari">N. Srihari</name>
<affiliation><mods:affiliation>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>E-mail: srihari@cedar.buffalo.edu</mods:affiliation>
</affiliation>
</author>
<author><name sortKey="Lam, W" sort="Lam, W" uniqKey="Lam W" first="W." last="Lam">W. Lam</name>
<affiliation><mods:affiliation>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>E-mail: lam@cedar.buffalo.edu</mods:affiliation>
</affiliation>
</author>
<author><name sortKey="Hull, J" sort="Hull, J" uniqKey="Hull J" first="J." last="Hull">J. Hull</name>
<affiliation><mods:affiliation>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>E-mail: hull@cedar.buffalo.edu</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:7133932DAB9E25F7D661F69E71FA4085686E2629</idno>
<date when="1995" year="1995">1995</date>
<idno type="doi">10.1007/BFb0026853</idno>
<idno type="url">https://api.istex.fr/document/7133932DAB9E25F7D661F69E71FA4085686E2629/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000B88</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Document recognition for a Digital Library</title>
<author><name sortKey="Srihari, N" sort="Srihari, N" uniqKey="Srihari N" first="N." last="Srihari">N. Srihari</name>
<affiliation><mods:affiliation>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>E-mail: srihari@cedar.buffalo.edu</mods:affiliation>
</affiliation>
</author>
<author><name sortKey="Lam, W" sort="Lam, W" uniqKey="Lam W" first="W." last="Lam">W. Lam</name>
<affiliation><mods:affiliation>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>E-mail: lam@cedar.buffalo.edu</mods:affiliation>
</affiliation>
</author>
<author><name sortKey="Hull, J" sort="Hull, J" uniqKey="Hull J" first="J." last="Hull">J. Hull</name>
<affiliation><mods:affiliation>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>E-mail: hull@cedar.buffalo.edu</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>1995</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">7133932DAB9E25F7D661F69E71FA4085686E2629</idno>
<idno type="DOI">10.1007/BFb0026853</idno>
<idno type="ChapterID">8</idno>
<idno type="ChapterID">Chap8</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Our work concerns three major processes which help to build a DL database for IR. Several components of the system are shown to be useful for creating a DL database. The design of these processes stresses the importance of robustness and ease of adaptation to the processing of different documents. Output generated by these processes facilitates the IR mechanism to produce intelligent response to user queries. An adaptive approach to document understanding was presented in this chapter. Its robustness was shown to be crucial to the success in processing varied library documents. This chapter also presented an adaptation of the vector space model for information retrieval to improving the performance of a word recognition algorithm. The neighborhoods of visually similar words determined by word recognition are matched to a database of documents and a subset of documents with topics that are similar to those of the input image are determined.</div>
</front>
</TEI>
<istex><corpusName>springer</corpusName>
<author><json:item><name>Sargur N. Srihari</name>
<affiliations><json:string>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</json:string>
<json:string>E-mail: srihari@cedar.buffalo.edu</json:string>
</affiliations>
</json:item>
<json:item><name>Stephen W. Lam</name>
<affiliations><json:string>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</json:string>
<json:string>E-mail: lam@cedar.buffalo.edu</json:string>
</affiliations>
</json:item>
<json:item><name>Jonathan J. Hull</name>
<affiliations><json:string>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</json:string>
<json:string>E-mail: hull@cedar.buffalo.edu</json:string>
</affiliations>
</json:item>
</author>
<language><json:string>eng</json:string>
</language>
<abstract>Abstract: Our work concerns three major processes which help to build a DL database for IR. Several components of the system are shown to be useful for creating a DL database. The design of these processes stresses the importance of robustness and ease of adaptation to the processing of different documents. Output generated by these processes facilitates the IR mechanism to produce intelligent response to user queries. An adaptive approach to document understanding was presented in this chapter. Its robustness was shown to be crucial to the success in processing varied library documents. This chapter also presented an adaptation of the vector space model for information retrieval to improving the performance of a word recognition algorithm. The neighborhoods of visually similar words determined by word recognition are matched to a database of documents and a subset of documents with topics that are similar to those of the input image are determined.</abstract>
<qualityIndicators><score>5.052</score>
<pdfVersion>1.3</pdfVersion>
<pdfPageSize>439.208 x 666 pts</pdfPageSize>
<refBibsNative>false</refBibsNative>
<keywordCount>0</keywordCount>
<abstractCharCount>963</abstractCharCount>
<pdfWordCount>3240</pdfWordCount>
<pdfCharCount>18643</pdfCharCount>
<pdfPageCount>10</pdfPageCount>
<abstractWordCount>151</abstractWordCount>
</qualityIndicators>
<title>Document recognition for a Digital Library</title>
<genre.original><json:string>ReviewPaper</json:string>
</genre.original>
<chapterId><json:string>8</json:string>
<json:string>Chap8</json:string>
</chapterId>
<genre><json:string>conference [eBooks]</json:string>
</genre>
<serie><editor><json:item><name>Gerhard Goos</name>
</json:item>
<json:item><name>Juris Hartmanis</name>
</json:item>
<json:item><name>Jan van Leeuwen</name>
</json:item>
</editor>
<issn><json:string>0302-9743</json:string>
</issn>
<language><json:string>unknown</json:string>
</language>
<eissn><json:string>1611-3349</json:string>
</eissn>
<title>Lecture Notes in Computer Science</title>
<copyrightDate>1995</copyrightDate>
</serie>
<host><editor><json:item><name>Nabil R. Adam</name>
</json:item>
<json:item><name>Bharat K. Bhargava</name>
</json:item>
<json:item><name>Yelena Yesha</name>
</json:item>
</editor>
<subject><json:item><value>Computer Science</value>
</json:item>
<json:item><value>Computer Science</value>
</json:item>
<json:item><value>Database Management</value>
</json:item>
<json:item><value>Information Storage and Retrieval</value>
</json:item>
<json:item><value>Information Systems Applications (incl.Internet)</value>
</json:item>
<json:item><value>Computer Communication Networks</value>
</json:item>
<json:item><value>Coding and Information Theory</value>
</json:item>
<json:item><value>Image Processing and Computer Vision</value>
</json:item>
</subject>
<isbn><json:string>978-3-540-59282-2</json:string>
</isbn>
<language><json:string>unknown</json:string>
</language>
<eissn><json:string>1611-3349</json:string>
</eissn>
<title>Digital Libraries Current Issues</title>
<genre.original><json:string>Proceedings</json:string>
</genre.original>
<bookId><json:string>3540592822</json:string>
</bookId>
<volume>916</volume>
<pages><last>128</last>
<first>119</first>
</pages>
<issn><json:string>0302-9743</json:string>
</issn>
<genre><json:string>Book Series</json:string>
</genre>
<eisbn><json:string>978-3-540-49230-6</json:string>
</eisbn>
<copyrightDate>1995</copyrightDate>
<doi><json:string>10.1007/BFb0026845</json:string>
</doi>
</host>
<publicationDate>1995</publicationDate>
<copyrightDate>1995</copyrightDate>
<doi><json:string>10.1007/BFb0026853</json:string>
</doi>
<id>7133932DAB9E25F7D661F69E71FA4085686E2629</id>
<fulltext><json:item><original>true</original>
<mimetype>application/pdf</mimetype>
<extension>pdf</extension>
<uri>https://api.istex.fr/document/7133932DAB9E25F7D661F69E71FA4085686E2629/fulltext/pdf</uri>
</json:item>
<json:item><original>false</original>
<mimetype>application/zip</mimetype>
<extension>zip</extension>
<uri>https://api.istex.fr/document/7133932DAB9E25F7D661F69E71FA4085686E2629/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/7133932DAB9E25F7D661F69E71FA4085686E2629/fulltext/tei"><teiHeader><fileDesc><titleStmt><title level="a" type="main" xml:lang="en">Document recognition for a Digital Library</title>
</titleStmt>
<publicationStmt><authority>ISTEX</authority>
<publisher>Springer Berlin Heidelberg</publisher>
<pubPlace>Berlin, Heidelberg</pubPlace>
<availability><p>SPRINGER</p>
</availability>
<date>1995</date>
</publicationStmt>
<sourceDesc><biblStruct type="inbook"><analytic><title level="a" type="main" xml:lang="en">Document recognition for a Digital Library</title>
<author><persName><forename type="first">Sargur</forename>
<surname>Srihari</surname>
</persName>
<email>srihari@cedar.buffalo.edu</email>
<affiliation>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</affiliation>
</author>
<author><persName><forename type="first">Stephen</forename>
<surname>Lam</surname>
</persName>
<email>lam@cedar.buffalo.edu</email>
<affiliation>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</affiliation>
</author>
<author><persName><forename type="first">Jonathan</forename>
<surname>Hull</surname>
</persName>
<email>hull@cedar.buffalo.edu</email>
<affiliation>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</affiliation>
</author>
</analytic>
<monogr><title level="m">Digital Libraries Current Issues</title>
<title level="m" type="sub">Digital Libraries Workshop DL '94 Newark, NJ, USA, May 19–20, 1994 Selected Papers</title>
<idno type="pISBN">978-3-540-59282-2</idno>
<idno type="eISBN">978-3-540-49230-6</idno>
<idno type="pISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="DOI">10.1007/BFb0026845</idno>
<idno type="BookID">3540592822</idno>
<idno type="BookTitleID">42673</idno>
<idno type="BookVolumeNumber">916</idno>
<idno type="BookChapterCount">17</idno>
<editor><persName><forename type="first">Nabil</forename>
<forename type="first">R.</forename>
<surname>Adam</surname>
</persName>
</editor>
<editor><persName><forename type="first">Bharat</forename>
<forename type="first">K.</forename>
<surname>Bhargava</surname>
</persName>
</editor>
<editor><persName><forename type="first">Yelena</forename>
<surname>Yesha</surname>
</persName>
</editor>
<imprint><publisher>Springer Berlin Heidelberg</publisher>
<pubPlace>Berlin, Heidelberg</pubPlace>
<date type="published" when="1995"></date>
<biblScope unit="volume">916</biblScope>
<biblScope unit="chap">8</biblScope>
<biblScope unit="page" from="119">119</biblScope>
<biblScope unit="page" to="128">128</biblScope>
</imprint>
</monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<editor><persName><forename type="first">Gerhard</forename>
<surname>Goos</surname>
</persName>
</editor>
<editor><persName><forename type="first">Juris</forename>
<surname>Hartmanis</surname>
</persName>
</editor>
<editor><persName><forename type="first">Jan</forename>
<surname>van Leeuwen</surname>
</persName>
</editor>
<biblScope><date>1995</date>
</biblScope>
<idno type="pISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="seriesId">558</idno>
</series>
<idno type="istex">7133932DAB9E25F7D661F69E71FA4085686E2629</idno>
<idno type="DOI">10.1007/BFb0026853</idno>
<idno type="ChapterID">8</idno>
<idno type="ChapterID">Chap8</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><creation><date>1995</date>
</creation>
<langUsage><language ident="en">en</language>
</langUsage>
<abstract xml:lang="en"><p>Abstract: Our work concerns three major processes which help to build a DL database for IR. Several components of the system are shown to be useful for creating a DL database. The design of these processes stresses the importance of robustness and ease of adaptation to the processing of different documents. Output generated by these processes facilitates the IR mechanism to produce intelligent response to user queries. An adaptive approach to document understanding was presented in this chapter. Its robustness was shown to be crucial to the success in processing varied library documents. This chapter also presented an adaptation of the vector space model for information retrieval to improving the performance of a word recognition algorithm. The neighborhoods of visually similar words determined by word recognition are matched to a database of documents and a subset of documents with topics that are similar to those of the input image are determined.</p>
</abstract>
<textClass><keywords scheme="Book Subject Collection"><list><label>SUCO11645</label>
<item><term>Computer Science</term>
</item>
</list>
</keywords>
</textClass>
<textClass><keywords scheme="Book Subject Group"><list><label>I</label>
<item><term>Computer Science</term>
</item>
<label>I18024</label>
<item><term>Database Management</term>
</item>
<label>I18032</label>
<item><term>Information Storage and Retrieval</term>
</item>
<label>I18040</label>
<item><term>Information Systems Applications (incl.Internet)</term>
</item>
<label>I13022</label>
<item><term>Computer Communication Networks</term>
</item>
<label>I15041</label>
<item><term>Coding and Information Theory</term>
</item>
<label>I22021</label>
<item><term>Image Processing and Computer Vision</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc><change when="1995">Published</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item><original>false</original>
<mimetype>text/plain</mimetype>
<extension>txt</extension>
<uri>https://api.istex.fr/document/7133932DAB9E25F7D661F69E71FA4085686E2629/fulltext/txt</uri>
</json:item>
</fulltext>
<metadata><istex:metadataXml wicri:clean="Springer, Publisher found" wicri:toSee="no header"><istex:xmlDeclaration>version="1.0" encoding="UTF-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//Springer-Verlag//DTD A++ V2.4//EN" URI="http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd" name="istex:docType"></istex:docType>
<istex:document><Publisher><PublisherInfo><PublisherName>Springer Berlin Heidelberg</PublisherName>
<PublisherLocation>Berlin, Heidelberg</PublisherLocation>
</PublisherInfo>
<Series><SeriesInfo TocLevels="0"><SeriesID>558</SeriesID>
<SeriesPrintISSN>0302-9743</SeriesPrintISSN>
<SeriesElectronicISSN>1611-3349</SeriesElectronicISSN>
<SeriesTitle Language="En">Lecture Notes in Computer Science</SeriesTitle>
<SeriesAbbreviatedTitle>Lect Notes Comput Sci</SeriesAbbreviatedTitle>
</SeriesInfo>
<SeriesHeader><EditorGroup><Editor><EditorName DisplayOrder="Western"><GivenName>Gerhard</GivenName>
<FamilyName>Goos</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Juris</GivenName>
<FamilyName>Hartmanis</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Jan</GivenName>
<Particle>van</Particle>
<FamilyName>Leeuwen</FamilyName>
</EditorName>
</Editor>
</EditorGroup>
</SeriesHeader>
<Book Language="En"><BookInfo MediaType="eBook" Language="En" BookProductType="Proceedings" TocLevels="0" NumberingStyle="ChapterContent"><BookID>3540592822</BookID>
<BookTitle>Digital Libraries Current Issues</BookTitle>
<BookSubTitle>Digital Libraries Workshop DL '94 Newark, NJ, USA, May 19–20, 1994 Selected Papers</BookSubTitle>
<BookVolumeNumber>916</BookVolumeNumber>
<BookDOI>10.1007/BFb0026845</BookDOI>
<BookTitleID>42673</BookTitleID>
<BookPrintISBN>978-3-540-59282-2</BookPrintISBN>
<BookElectronicISBN>978-3-540-49230-6</BookElectronicISBN>
<BookChapterCount>17</BookChapterCount>
<BookCopyright><CopyrightHolderName>Springer-Verlag</CopyrightHolderName>
<CopyrightYear>1995</CopyrightYear>
</BookCopyright>
<BookSubjectGroup><BookSubject Code="I" Type="Primary">Computer Science</BookSubject>
<BookSubject Code="I18024" Priority="1" Type="Secondary">Database Management</BookSubject>
<BookSubject Code="I18032" Priority="2" Type="Secondary">Information Storage and Retrieval</BookSubject>
<BookSubject Code="I18040" Priority="3" Type="Secondary">Information Systems Applications (incl.Internet)</BookSubject>
<BookSubject Code="I13022" Priority="4" Type="Secondary">Computer Communication Networks</BookSubject>
<BookSubject Code="I15041" Priority="5" Type="Secondary">Coding and Information Theory</BookSubject>
<BookSubject Code="I22021" Priority="6" Type="Secondary">Image Processing and Computer Vision</BookSubject>
<SubjectCollection Code="SUCO11645">Computer Science</SubjectCollection>
</BookSubjectGroup>
</BookInfo>
<BookHeader><EditorGroup><Editor><EditorName DisplayOrder="Western"><GivenName>Nabil</GivenName>
<GivenName>R.</GivenName>
<FamilyName>Adam</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Bharat</GivenName>
<GivenName>K.</GivenName>
<FamilyName>Bhargava</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Yelena</GivenName>
<FamilyName>Yesha</FamilyName>
</EditorName>
</Editor>
</EditorGroup>
</BookHeader>
<Chapter ID="Chap8" Language="En"><ChapterInfo ChapterType="ReviewPaper" NumberingStyle="ChapterContent" TocLevels="0" ContainsESM="No"><ChapterID>8</ChapterID>
<ChapterNumber>8</ChapterNumber>
<ChapterDOI>10.1007/BFb0026853</ChapterDOI>
<ChapterSequenceNumber>8</ChapterSequenceNumber>
<ChapterTitle Language="En">Document recognition for a Digital Library</ChapterTitle>
<ChapterCategory>Information Retrieval/Hypertext</ChapterCategory>
<ChapterFirstPage>119</ChapterFirstPage>
<ChapterLastPage>128</ChapterLastPage>
<ChapterCopyright><CopyrightHolderName>Springer-Verlag</CopyrightHolderName>
<CopyrightYear>1995</CopyrightYear>
</ChapterCopyright>
<ChapterHistory><OnlineDate><Year>2005</Year>
<Month>6</Month>
<Day>18</Day>
</OnlineDate>
</ChapterHistory>
<ChapterGrants Type="Regular"><MetadataGrant Grant="OpenAccess"></MetadataGrant>
<AbstractGrant Grant="OpenAccess"></AbstractGrant>
<BodyPDFGrant Grant="Restricted"></BodyPDFGrant>
<BodyHTMLGrant Grant="Restricted"></BodyHTMLGrant>
<BibliographyGrant Grant="Restricted"></BibliographyGrant>
<ESMGrant Grant="Restricted"></ESMGrant>
</ChapterGrants>
<ChapterContext><SeriesID>558</SeriesID>
<BookID>3540592822</BookID>
<BookTitle>Digital Libraries Current Issues</BookTitle>
</ChapterContext>
</ChapterInfo>
<ChapterHeader><AuthorGroup><Author AffiliationIDS="Aff1"><AuthorName DisplayOrder="Western"><GivenName>Sargur</GivenName>
<GivenName>N.</GivenName>
<FamilyName>Srihari</FamilyName>
</AuthorName>
<Contact><Email>srihari@cedar.buffalo.edu</Email>
</Contact>
</Author>
<Author AffiliationIDS="Aff2"><AuthorName DisplayOrder="Western"><GivenName>Stephen</GivenName>
<GivenName>W.</GivenName>
<FamilyName>Lam</FamilyName>
</AuthorName>
<Contact><Email>lam@cedar.buffalo.edu</Email>
</Contact>
</Author>
<Author AffiliationIDS="Aff3"><AuthorName DisplayOrder="Western"><GivenName>Jonathan</GivenName>
<GivenName>J.</GivenName>
<FamilyName>Hull</FamilyName>
</AuthorName>
<Contact><Email>hull@cedar.buffalo.edu</Email>
</Contact>
</Author>
<Affiliation ID="Aff1"><OrgDivision>Center of Excellence for Document Analysis and Recognition (CEDAR)</OrgDivision>
<OrgName>State University of New York at Buffalo</OrgName>
<OrgAddress><Street>520 Lee Entrance, Suite 202</Street>
<Postcode>14228</Postcode>
<City>Amherst</City>
<State>NY</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff2"><OrgDivision>Center of Excellence for Document Analysis and Recognition (CEDAR)</OrgDivision>
<OrgName>State University of New York at Buffalo</OrgName>
<OrgAddress><Street>520 Lee Entrance, Suite 202</Street>
<Postcode>14228</Postcode>
<City>Amherst</City>
<State>NY</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff3"><OrgDivision>Center of Excellence for Document Analysis and Recognition (CEDAR)</OrgDivision>
<OrgName>State University of New York at Buffalo</OrgName>
<OrgAddress><Street>520 Lee Entrance, Suite 202</Street>
<Postcode>14228</Postcode>
<City>Amherst</City>
<State>NY</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
</AuthorGroup>
<Abstract ID="Abs1" Language="En"><Heading>Abstract</Heading>
<Para>Our work concerns three major processes which help to build a DL database for IR. Several components of the system are shown to be useful for creating a DL database. The design of these processes stresses the importance of robustness and ease of adaptation to the processing of different documents. Output generated by these processes facilitates the IR mechanism to produce intelligent response to user queries. An adaptive approach to document understanding was presented in this chapter. Its robustness was shown to be crucial to the success in processing varied library documents. This chapter also presented an adaptation of the vector space model for information retrieval to improving the performance of a word recognition algorithm. The neighborhoods of visually similar words determined by word recognition are matched to a database of documents and a subset of documents with topics that are similar to those of the input image are determined.</Para>
</Abstract>
</ChapterHeader>
<NoBody></NoBody>
</Chapter>
</Book>
</Series>
</Publisher>
</istex:document>
</istex:metadataXml>
<mods version="3.6"><titleInfo lang="en"><title>Document recognition for a Digital Library</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA" lang="en"><title>Document recognition for a Digital Library</title>
</titleInfo>
<name type="personal"><namePart type="given">Sargur</namePart>
<namePart type="given">N.</namePart>
<namePart type="family">Srihari</namePart>
<affiliation>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</affiliation>
<affiliation>E-mail: srihari@cedar.buffalo.edu</affiliation>
<role><roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Stephen</namePart>
<namePart type="given">W.</namePart>
<namePart type="family">Lam</namePart>
<affiliation>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</affiliation>
<affiliation>E-mail: lam@cedar.buffalo.edu</affiliation>
<role><roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Jonathan</namePart>
<namePart type="given">J.</namePart>
<namePart type="family">Hull</namePart>
<affiliation>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, 520 Lee Entrance, Suite 202, 14228, Amherst, NY, USA</affiliation>
<affiliation>E-mail: hull@cedar.buffalo.edu</affiliation>
<role><roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="conference [eBooks]" displayLabel="ReviewPaper"></genre>
<originInfo><publisher>Springer Berlin Heidelberg</publisher>
<place><placeTerm type="text">Berlin, Heidelberg</placeTerm>
</place>
<dateIssued encoding="w3cdtf">1995</dateIssued>
<copyrightDate encoding="w3cdtf">1995</copyrightDate>
</originInfo>
<language><languageTerm type="code" authority="rfc3066">en</languageTerm>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
</language>
<physicalDescription><internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract lang="en">Abstract: Our work concerns three major processes which help to build a DL database for IR. Several components of the system are shown to be useful for creating a DL database. The design of these processes stresses the importance of robustness and ease of adaptation to the processing of different documents. Output generated by these processes facilitates the IR mechanism to produce intelligent response to user queries. An adaptive approach to document understanding was presented in this chapter. Its robustness was shown to be crucial to the success in processing varied library documents. This chapter also presented an adaptation of the vector space model for information retrieval to improving the performance of a word recognition algorithm. The neighborhoods of visually similar words determined by word recognition are matched to a database of documents and a subset of documents with topics that are similar to those of the input image are determined.</abstract>
<relatedItem type="host"><titleInfo><title>Digital Libraries Current Issues</title>
<subTitle>Digital Libraries Workshop DL '94 Newark, NJ, USA, May 19–20, 1994 Selected Papers</subTitle>
</titleInfo>
<name type="personal"><namePart type="given">Nabil</namePart>
<namePart type="given">R.</namePart>
<namePart type="family">Adam</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Bharat</namePart>
<namePart type="given">K.</namePart>
<namePart type="family">Bhargava</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Yelena</namePart>
<namePart type="family">Yesha</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<genre type="Book Series" displayLabel="Proceedings"></genre>
<originInfo><copyrightDate encoding="w3cdtf">1995</copyrightDate>
<issuance>monographic</issuance>
</originInfo>
<subject><genre>Book Subject Collection</genre>
<topic authority="SpringerSubjectCodes" authorityURI="SUCO11645">Computer Science</topic>
</subject>
<subject><genre>Book Subject Group</genre>
<topic authority="SpringerSubjectCodes" authorityURI="I">Computer Science</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I18024">Database Management</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I18032">Information Storage and Retrieval</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I18040">Information Systems Applications (incl.Internet)</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I13022">Computer Communication Networks</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I15041">Coding and Information Theory</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I22021">Image Processing and Computer Vision</topic>
</subject>
<identifier type="DOI">10.1007/BFb0026845</identifier>
<identifier type="ISBN">978-3-540-59282-2</identifier>
<identifier type="eISBN">978-3-540-49230-6</identifier>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="BookTitleID">42673</identifier>
<identifier type="BookID">3540592822</identifier>
<identifier type="BookChapterCount">17</identifier>
<identifier type="BookVolumeNumber">916</identifier>
<part><date>1995</date>
<detail type="volume"><number>916</number>
<caption>vol.</caption>
</detail>
<detail type="chapter"><number>8</number>
</detail>
<extent unit="pages"><start>119</start>
<end>128</end>
</extent>
</part>
<recordInfo><recordOrigin>Springer-Verlag, 1995</recordOrigin>
</recordInfo>
</relatedItem>
<relatedItem type="series"><titleInfo><title>Lecture Notes in Computer Science</title>
</titleInfo>
<name type="personal"><namePart type="given">Gerhard</namePart>
<namePart type="family">Goos</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Juris</namePart>
<namePart type="family">Hartmanis</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Jan</namePart>
<namePart type="family">van Leeuwen</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<originInfo><copyrightDate encoding="w3cdtf">1995</copyrightDate>
<issuance>serial</issuance>
</originInfo>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="SeriesID">558</identifier>
<part><detail type="chapter"><number>8</number>
</detail>
</part>
<recordInfo><recordOrigin>Springer-Verlag, 1995</recordOrigin>
</recordInfo>
</relatedItem>
<identifier type="istex">7133932DAB9E25F7D661F69E71FA4085686E2629</identifier>
<identifier type="DOI">10.1007/BFb0026853</identifier>
<identifier type="ChapterID">8</identifier>
<identifier type="ChapterID">Chap8</identifier>
<accessCondition type="use and reproduction" contentType="copyright">Springer-Verlag, 1995</accessCondition>
<recordInfo><recordContentSource>SPRINGER</recordContentSource>
<recordOrigin>Springer-Verlag, 1995</recordOrigin>
</recordInfo>
</mods>
</metadata>
<enrichments></enrichments>
</istex>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000B88 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 000B88 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:7133932DAB9E25F7D661F69E71FA4085686E2629
   |texte=   Document recognition for a Digital Library
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
