Processing Handwritten Words by Intelligent Use of OCR Results

Internal identifier: 000618 (Main/Exploration); previous: 000617; next: 000619

Authors: Benjamin Mund [Germany]; Karl-Heinz Steinke [Germany]

Source:

Lecture Notes in Computer Science [0302-9743]; 2010.

RBID : ISTEX:66DBD17F17D9E35575A159D102BB28054409CB98

Abstract

About 3.5 million dried plants mounted on paper sheets are held in the Botanical Museum Berlin in Germany. Many of the sheets carry handwritten annotations (see figure 1), so a procedure had to be developed to process the handwriting on each sheet. The approach presented here tries to identify the writer from handwritten words and to read handwritten keywords. To do so, each word is cut out, transformed into a 6-dimensional time series, and compared with other words, e.g. by means of the DTW (dynamic time warping) method. A recognition rate of 98.6% is achieved with 12 different words (1200 samples). All herbarium documents contain several printed tokens that point to further information about the plant: who found it, where it was found (country and sometimes the town), what kind of plant it is, and so on. By exploiting the local relationships between text elements, more information can be extracted from a herbarium document, e.g. by finding and recognizing handwritten text in a defined area.
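
The comparison step mentioned in the abstract can be illustrated with a minimal sketch of textbook dynamic time warping over multi-dimensional series. This is not the authors' implementation: the abstract does not say which six features are extracted or which DTW variant is used, so the Euclidean frame distance, the unconstrained warping path, and the names `dtw_distance` and `classify` are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of equal-length feature vectors.

    Assumption: frames are compared with the Euclidean distance and the
    warping path is unconstrained (no window, no slope weights).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(sample, references):
    """Return the label of the reference series closest to `sample` under DTW.

    `references` is a list of (label, series) pairs (hypothetical layout).
    """
    return min(references, key=lambda ref: dtw_distance(sample, ref[1]))[0]


# Toy 6-dimensional series standing in for real word features.
refs = [
    ("word_a", [[0.0] * 6, [1.0] * 6]),
    ("word_b", [[5.0] * 6, [5.0] * 6]),
]
query = [[0.0] * 6, [0.0] * 6, [1.0] * 6]  # "word_a" with a stretched first frame
label = classify(query, refs)
```

Because DTW lets one series repeat frames, the stretched query aligns with `word_a` at zero cost, which is the property that makes the method tolerant of variations in writing speed and word width.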

URL:

https://api.istex.fr/document/66DBD17F17D9E35575A159D102BB28054409CB98/fulltext/pdf

DOI: 10.1007/978-3-642-14400-4_14

Affiliations:

Germany

Links to previous steps (curation, corpus...)

to stream Istex, to step Corpus: 000071
to stream Istex, to step Curation: 000069
to stream Istex, to step Checkpoint: 000198
to stream Main, to step Merge: 000623
to stream Main, to step Curation: 000618

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct:series"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Processing Handwritten Words by Intelligent Use of OCR Results</title>
<author><name sortKey="Mund, Benjamin" sort="Mund, Benjamin" uniqKey="Mund B" first="Benjamin" last="Mund">Benjamin Mund</name>
</author>
<author><name sortKey="Steinke, Karl Heinz" sort="Steinke, Karl Heinz" uniqKey="Steinke K" first="Karl-Heinz" last="Steinke">Karl-Heinz Steinke</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:66DBD17F17D9E35575A159D102BB28054409CB98</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1007/978-3-642-14400-4_14</idno>
<idno type="url">https://api.istex.fr/document/66DBD17F17D9E35575A159D102BB28054409CB98/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000071</idno>
<idno type="wicri:Area/Istex/Curation">000069</idno>
<idno type="wicri:Area/Istex/Checkpoint">000198</idno>
<idno type="wicri:doubleKey">0302-9743:2010:Mund B:processing:handwritten:words</idno>
<idno type="wicri:Area/Main/Merge">000623</idno>
<idno type="wicri:Area/Main/Curation">000618</idno>
<idno type="wicri:Area/Main/Exploration">000618</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Processing Handwritten Words by Intelligent Use of OCR Results</title>
<author><name sortKey="Mund, Benjamin" sort="Mund, Benjamin" uniqKey="Mund B" first="Benjamin" last="Mund">Benjamin Mund</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>University of Applied Sciences and Arts, Hanover, Hanover</wicri:regionArea>
<wicri:noRegion>Hanover</wicri:noRegion>
<wicri:noRegion>Hanover</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Steinke, Karl Heinz" sort="Steinke, Karl Heinz" uniqKey="Steinke K" first="Karl-Heinz" last="Steinke">Karl-Heinz Steinke</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>University of Applied Sciences and Arts, Hanover, Hanover</wicri:regionArea>
<wicri:noRegion>Hanover</wicri:noRegion>
<wicri:noRegion>Hanover</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2010</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">66DBD17F17D9E35575A159D102BB28054409CB98</idno>
<idno type="DOI">10.1007/978-3-642-14400-4_14</idno>
<idno type="ChapterID">14</idno>
<idno type="ChapterID">Chap14</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: About 3.5 million dried plants on paper sheets are deposited in the Botanical Museum Berlin in Germany. Frequently they have handwritten annotations (see figure 1). So a procedure had to be developed in order to process the handwriting on the sheet. In the present work an approach tries to identify the writer by handwritten words and to read handwritten keywords. Therefore the word is cut out and transformed into a 6-dimensional time series and compared e.g. by means of DTW-method. A recognition rate of 98.6% is achieved with 12 different words (1200 samples). All herbar documents contain several printed tokens which indicate more information about the plant. With the token it is possible to get information who has found this plant, where this plant was found (country and sometimes the town), what kind of plant it is and so on. By using the local connections of the text it is possible to get more information from the herbar document, e.g. to find and recognize handwritten text in a defined area.</div>
</front>
</TEI>
<affiliations><list><country><li>Allemagne</li>
</country>
</list>
<tree><country name="Allemagne"><noRegion><name sortKey="Mund, Benjamin" sort="Mund, Benjamin" uniqKey="Mund B" first="Benjamin" last="Mund">Benjamin Mund</name>
</noRegion>
<name sortKey="Steinke, Karl Heinz" sort="Steinke, Karl Heinz" uniqKey="Steinke K" first="Karl-Heinz" last="Steinke">Karl-Heinz Steinke</name>
<name sortKey="Steinke, Karl Heinz" sort="Steinke, Karl Heinz" uniqKey="Steinke K" first="Karl-Heinz" last="Steinke">Karl-Heinz Steinke</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000618 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000618 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:66DBD17F17D9E35575A159D102BB28054409CB98
   |texte=   Processing Handwritten Words by Intelligent Use of OCR Results
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development! It was generated automatically from raw corpora, so the information has not been validated.
