OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Increasing the efficiency of digitization workflows for herbarium specimens.

Internal identifier: 000029 (PubMed/Checkpoint); previous: 000028; next: 000030

Authors: Melissa Tulig [United States]; Nicole Tarnowsky; Michael Bevans; Anthony Kirchgessner; Barbara M. Thiers

Source: ZooKeys, 2012, issue 209, pp. 103-13

RBID : pubmed:22859882

Abstract

The New York Botanical Garden Herbarium has been databasing and imaging its estimated 7.3 million plant specimens for the past 17 years. Due to the size of the collection, we have been selectively digitizing fundable subsets of specimens, making successive passes through the herbarium with each new grant. With this strategy, the average rate for databasing complete records has been 10 specimens per hour. With 1.3 million specimens databased, this effort has taken about 130,000 hours of staff time. At this rate, to complete the herbarium and digitize the remaining 6 million specimens, another 600,000 hours would be needed. Given the current biodiversity and economic crises, there is neither the time nor money to complete the collection at this rate. Through a combination of grants over the last few years, The New York Botanical Garden has been testing new protocols and tactics for increasing the rate of digitization through combinations of data collaboration, field book digitization, partial data entry and imaging, and optical character recognition (OCR) of specimen images. With the launch of the National Science Foundation's new Advancing Digitization of Biological Collections program, we hope to move forward with larger, more efficient digitization projects, capturing data from larger portions of the herbarium at a fraction of the cost and time.
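The staff-time figures in the abstract follow from dividing specimen counts by the stated rate of 10 specimens per hour. The short Python sketch below reproduces that arithmetic. The function name `hours_needed` and the 2,000-hour staff-year are illustrative assumptions and do not come from the article.

```python
def hours_needed(specimens, rate_per_hour):
    """Staff hours to database `specimens` records at a constant hourly rate."""
    return specimens / rate_per_hour


# Figures quoted in the abstract.
RATE_PER_HOUR = 10            # complete records databased per hour
TOTAL_SPECIMENS = 7_300_000   # estimated size of the herbarium
DATABASED = 1_300_000         # specimens databased so far

remaining = TOTAL_SPECIMENS - DATABASED
hours_spent = hours_needed(DATABASED, RATE_PER_HOUR)
hours_left = hours_needed(remaining, RATE_PER_HOUR)

# ASSUMPTION (not from the abstract): one staff-year is about 2,000 working hours.
HOURS_PER_STAFF_YEAR = 2_000
staff_years_left = hours_left / HOURS_PER_STAFF_YEAR

print(f"remaining specimens: {remaining:,}")
print(f"hours already spent: {hours_spent:,.0f}")
print(f"hours still needed:  {hours_left:,.0f}")
print(f"staff-years needed (assumed 2,000 h/year): {staff_years_left:,.0f}")
```

Under these inputs the computed hours match the abstract's 130,000 and 600,000; the staff-year conversion depends entirely on the assumed working year.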

DOI: 10.3897/zookeys.209.3125
PubMed: 22859882


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Increasing the efficiency of digitization workflows for herbarium specimens.</title>
<author>
<name sortKey="Tulig, Melissa" sort="Tulig, Melissa" uniqKey="Tulig M" first="Melissa" last="Tulig">Melissa Tulig</name>
<affiliation wicri:level="2">
<nlm:affiliation>William and Lynda Steere Herbarium, The New York Botanical Garden, Bronx, New York, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>William and Lynda Steere Herbarium, The New York Botanical Garden, Bronx, New York</wicri:regionArea>
<placeName>
<region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Tarnowsky, Nicole" sort="Tarnowsky, Nicole" uniqKey="Tarnowsky N" first="Nicole" last="Tarnowsky">Nicole Tarnowsky</name>
</author>
<author>
<name sortKey="Bevans, Michael" sort="Bevans, Michael" uniqKey="Bevans M" first="Michael" last="Bevans">Michael Bevans</name>
</author>
<author>
<name sortKey="Anthony Kirchgessner" sort="Anthony Kirchgessner" uniqKey="Anthony Kirchgessner" last="Anthony Kirchgessner">Anthony Kirchgessner</name>
</author>
<author>
<name sortKey="Thiers, Barbara M" sort="Thiers, Barbara M" uniqKey="Thiers B" first="Barbara M" last="Thiers">Barbara M. Thiers</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2012">2012</date>
<idno type="doi">10.3897/zookeys.209.3125</idno>
<idno type="RBID">pubmed:22859882</idno>
<idno type="pmid">22859882</idno>
<idno type="wicri:Area/PubMed/Corpus">000027</idno>
<idno type="wicri:Area/PubMed/Curation">000027</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000027</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Increasing the efficiency of digitization workflows for herbarium specimens.</title>
<author>
<name sortKey="Tulig, Melissa" sort="Tulig, Melissa" uniqKey="Tulig M" first="Melissa" last="Tulig">Melissa Tulig</name>
<affiliation wicri:level="2">
<nlm:affiliation>William and Lynda Steere Herbarium, The New York Botanical Garden, Bronx, New York, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>William and Lynda Steere Herbarium, The New York Botanical Garden, Bronx, New York</wicri:regionArea>
<placeName>
<region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Tarnowsky, Nicole" sort="Tarnowsky, Nicole" uniqKey="Tarnowsky N" first="Nicole" last="Tarnowsky">Nicole Tarnowsky</name>
</author>
<author>
<name sortKey="Bevans, Michael" sort="Bevans, Michael" uniqKey="Bevans M" first="Michael" last="Bevans">Michael Bevans</name>
</author>
<author>
<name sortKey="Anthony Kirchgessner" sort="Anthony Kirchgessner" uniqKey="Anthony Kirchgessner" last="Anthony Kirchgessner">Anthony Kirchgessner</name>
</author>
<author>
<name sortKey="Thiers, Barbara M" sort="Thiers, Barbara M" uniqKey="Thiers B" first="Barbara M" last="Thiers">Barbara M. Thiers</name>
</author>
</analytic>
<series>
<title level="j">ZooKeys</title>
<idno type="eISSN">1313-2970</idno>
<imprint>
<date when="2012" type="published">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">The New York Botanical Garden Herbarium has been databasing and imaging its estimated 7.3 million plant specimens for the past 17 years. Due to the size of the collection, we have been selectively digitizing fundable subsets of specimens, making successive passes through the herbarium with each new grant. With this strategy, the average rate for databasing complete records has been 10 specimens per hour. With 1.3 million specimens databased, this effort has taken about 130,000 hours of staff time. At this rate, to complete the herbarium and digitize the remaining 6 million specimens, another 600,000 hours would be needed. Given the current biodiversity and economic crises, there is neither the time nor money to complete the collection at this rate. Through a combination of grants over the last few years, The New York Botanical Garden has been testing new protocols and tactics for increasing the rate of digitization through combinations of data collaboration, field book digitization, partial data entry and imaging, and optical character recognition (OCR) of specimen images. With the launch of the National Science Foundation's new Advancing Digitization of Biological Collections program, we hope to move forward with larger, more efficient digitization projects, capturing data from larger portions of the herbarium at a fraction of the cost and time.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Owner="NLM" Status="PubMed-not-MEDLINE">
<PMID Version="1">22859882</PMID>
<DateCreated>
<Year>2012</Year>
<Month>08</Month>
<Day>03</Day>
</DateCreated>
<DateCompleted>
<Year>2012</Year>
<Month>08</Month>
<Day>31</Day>
</DateCompleted>
<DateRevised>
<Year>2013</Year>
<Month>05</Month>
<Day>30</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1313-2970</ISSN>
<JournalIssue CitedMedium="Internet">
<Issue>209</Issue>
<PubDate>
<Year>2012</Year>
</PubDate>
</JournalIssue>
<Title>ZooKeys</Title>
<ISOAbbreviation>Zookeys</ISOAbbreviation>
</Journal>
<ArticleTitle>Increasing the efficiency of digitization workflows for herbarium specimens.</ArticleTitle>
<Pagination>
<MedlinePgn>103-13</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.3897/zookeys.209.3125</ELocationID>
<Abstract>
<AbstractText>The New York Botanical Garden Herbarium has been databasing and imaging its estimated 7.3 million plant specimens for the past 17 years. Due to the size of the collection, we have been selectively digitizing fundable subsets of specimens, making successive passes through the herbarium with each new grant. With this strategy, the average rate for databasing complete records has been 10 specimens per hour. With 1.3 million specimens databased, this effort has taken about 130,000 hours of staff time. At this rate, to complete the herbarium and digitize the remaining 6 million specimens, another 600,000 hours would be needed. Given the current biodiversity and economic crises, there is neither the time nor money to complete the collection at this rate. Through a combination of grants over the last few years, The New York Botanical Garden has been testing new protocols and tactics for increasing the rate of digitization through combinations of data collaboration, field book digitization, partial data entry and imaging, and optical character recognition (OCR) of specimen images. With the launch of the National Science Foundation's new Advancing Digitization of Biological Collections program, we hope to move forward with larger, more efficient digitization projects, capturing data from larger portions of the herbarium at a fraction of the cost and time.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Tulig</LastName>
<ForeName>Melissa</ForeName>
<Initials>M</Initials>
<AffiliationInfo>
<Affiliation>William and Lynda Steere Herbarium, The New York Botanical Garden, Bronx, New York, USA.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Tarnowsky</LastName>
<ForeName>Nicole</ForeName>
<Initials>N</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Bevans</LastName>
<ForeName>Michael</ForeName>
<Initials>M</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Anthony Kirchgessner</LastName>
</Author>
<Author ValidYN="Y">
<LastName>Thiers</LastName>
<ForeName>Barbara M</ForeName>
<Initials>BM</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2012</Year>
<Month>07</Month>
<Day>20</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>Bulgaria</Country>
<MedlineTA>Zookeys</MedlineTA>
<NlmUniqueID>101497933</NlmUniqueID>
<ISSNLinking>1313-2970</ISSNLinking>
</MedlineJournalInfo>
<CommentsCorrectionsList>
<CommentsCorrections RefType="Cites">
<RefSource>Biol Rev Camb Philos Soc. 2010 May;85(2):247-66</RefSource>
<PMID Version="1">19961469</PMID>
</CommentsCorrections>
</CommentsCorrectionsList>
<OtherID Source="NLM">PMC3406470</OtherID>
<KeywordList Owner="NOTNLM">
<Keyword MajorTopicYN="N">Herbarium specimen digitization</Keyword>
<Keyword MajorTopicYN="N">digital imaging</Keyword>
<Keyword MajorTopicYN="N">field books</Keyword>
<Keyword MajorTopicYN="N">georeferencing</Keyword>
<Keyword MajorTopicYN="N">workflows</Keyword>
</KeywordList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="received">
<Year>2012</Year>
<Month>3</Month>
<Day>26</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted">
<Year>2012</Year>
<Month>6</Month>
<Day>25</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="epublish">
<Year>2012</Year>
<Month>7</Month>
<Day>20</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2012</Year>
<Month>8</Month>
<Day>4</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2012</Year>
<Month>8</Month>
<Day>4</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2012</Year>
<Month>8</Month>
<Day>4</Day>
<Hour>6</Hour>
<Minute>1</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="doi">10.3897/zookeys.209.3125</ArticleId>
<ArticleId IdType="pubmed">22859882</ArticleId>
<ArticleId IdType="pmc">PMC3406470</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>État de New York</li>
</region>
</list>
<tree>
<noCountry>
<name sortKey="Anthony Kirchgessner" sort="Anthony Kirchgessner" uniqKey="Anthony Kirchgessner" last="Anthony Kirchgessner">Anthony Kirchgessner</name>
<name sortKey="Bevans, Michael" sort="Bevans, Michael" uniqKey="Bevans M" first="Michael" last="Bevans">Michael Bevans</name>
<name sortKey="Tarnowsky, Nicole" sort="Tarnowsky, Nicole" uniqKey="Tarnowsky N" first="Nicole" last="Tarnowsky">Nicole Tarnowsky</name>
<name sortKey="Thiers, Barbara M" sort="Thiers, Barbara M" uniqKey="Thiers B" first="Barbara M" last="Thiers">Barbara M. Thiers</name>
</noCountry>
<country name="États-Unis">
<region name="État de New York">
<name sortKey="Tulig, Melissa" sort="Tulig, Melissa" uniqKey="Tulig M" first="Melissa" last="Tulig">Melissa Tulig</name>
</region>
</country>
</tree>
</affiliations>
</record>
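
As a rough illustration of how the record above can be read programmatically, the following Python sketch extracts the PMID, title, DOI, and author names from the `<pubmed>` part of the record using only the standard library. It works on a trimmed excerpt embedded as a string, because the `<TEI>` part as displayed here uses the `wicri:` and `nlm:` prefixes without namespace declarations, which a strict XML parser would reject. The names `PUBMED_EXCERPT` and `summarize` are hypothetical helpers, not part of Dilib or Wicri.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the <pubmed> element shown in the record above.
PUBMED_EXCERPT = """
<pubmed>
<MedlineCitation Owner="NLM" Status="PubMed-not-MEDLINE">
<PMID Version="1">22859882</PMID>
<Article PubModel="Print-Electronic">
<ArticleTitle>Increasing the efficiency of digitization workflows for herbarium specimens.</ArticleTitle>
<ELocationID EIdType="doi" ValidYN="Y">10.3897/zookeys.209.3125</ELocationID>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Tulig</LastName>
<ForeName>Melissa</ForeName>
<Initials>M</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Anthony Kirchgessner</LastName>
</Author>
</AuthorList>
</Article>
</MedlineCitation>
</pubmed>
"""


def summarize(pubmed_xml):
    """Return a small dict of bibliographic fields from a <pubmed> element."""
    root = ET.fromstring(pubmed_xml.strip())
    citation = root.find("MedlineCitation")
    article = citation.find("Article")

    authors = []
    for author in article.iter("Author"):
        # Some entries carry the full name in LastName and have no ForeName.
        last = author.findtext("LastName", default="")
        fore = author.findtext("ForeName", default="")
        authors.append(f"{fore} {last}".strip())

    return {
        "pmid": citation.findtext("PMID"),
        "title": article.findtext("ArticleTitle"),
        "doi": article.findtext("ELocationID[@EIdType='doi']"),
        "authors": authors,
    }


if __name__ == "__main__":
    for key, value in summarize(PUBMED_EXCERPT).items():
        print(f"{key}: {value}")
```

Treating a missing `ForeName` as an empty string keeps the entry whose full name is stored in `LastName` readable without guessing how to split it.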

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PubMed/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000029 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd -nk 000029 | SxmlIndent | more

To link to this page from within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PubMed
   |étape=   Checkpoint
   |type=    RBID
   |clé=     pubmed:22859882
   |texte=   Increasing the efficiency of digitization workflows for herbarium specimens.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/RBID.i   -Sk "pubmed:22859882" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd   \
       | NlmPubMed2Wicri -a OcrV1 

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024