OCR Exploration Server


Community Next Steps for Making Globally Unique Identifiers Work for Biocollections Data

Internal identifier: 000224 (Ncbi/Merge); previous: 000223; next: 000225

Authors: Robert P. Guralnick [United States]; Nico Cellinese [United States]; John Deck [United States]; Richard L. Pyle; John Kunze [United States]; Lyubomir Penev [Bulgaria]; Ramona Walls [United States]; Gregor Hagedorn [Germany]; Donat Agosti; John Wieczorek; Terry Catapano; Roderic D. M. Page

RBID: PMC:4400380

Abstract

Biodiversity data is being digitized and made available online at a rapidly increasing rate but current practices typically do not preserve linkages between these data, which impedes interoperation, provenance tracking, and assembly of larger datasets. For data associated with biocollections, the biodiversity community has long recognized that an essential part of establishing and preserving linkages is to apply globally unique identifiers at the point when data are generated in the field and to persist these identifiers downstream, but this is seldom implemented in practice. There has neither been coalescence towards one single identifier solution (as in some other domains), nor even a set of recommended best practices and standards to support multiple identifier schemes sharing consistent responses. In order to further progress towards a broader community consensus, a group of biocollections and informatics experts assembled in Stockholm in October 2014 to discuss community next steps to overcome current roadblocks. The workshop participants divided into four groups focusing on: identifier practice in current field biocollections; identifier application for legacy biocollections; identifiers as applied to biodiversity data records as they are published and made available in semantically marked-up publications; and cross-cutting identifier solutions that bridge across these domains. The main outcome was consensus on key issues, including recognition of differences between legacy and new biocollections processes, the need for identifier metadata profiles that can report information on identifier persistence missions, and the unambiguous indication of the type of object associated with the identifier. Current identifier characteristics are also summarized, and an overview of available schemes and practices is provided.


DOI: 10.3897/zookeys.494.9352
PubMed: 25901117
PubMed Central: 4400380

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Community Next Steps for Making Globally Unique Identifiers Work for Biocollections Data</title>
<author>
<name sortKey="Guralnick, Robert P" sort="Guralnick, Robert P" uniqKey="Guralnick R" first="Robert P." last="Guralnick">Robert P. Guralnick</name>
<affiliation wicri:level="2">
<nlm:aff id="A1">Florida Museum of Natural History, University of Florida, Gainesville, FL 32611-2710 USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Floride</region>
</placeName>
<wicri:cityArea>Florida Museum of Natural History, University of Florida, Gainesville</wicri:cityArea>
</affiliation>
</author>
<author>
<name sortKey="Cellinese, Nico" sort="Cellinese, Nico" uniqKey="Cellinese N" first="Nico" last="Cellinese">Nico Cellinese</name>
<affiliation wicri:level="2">
<nlm:aff id="A1">Florida Museum of Natural History, University of Florida, Gainesville, FL 32611-2710 USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Floride</region>
</placeName>
<wicri:cityArea>Florida Museum of Natural History, University of Florida, Gainesville</wicri:cityArea>
</affiliation>
</author>
<author>
<name sortKey="Deck, John" sort="Deck, John" uniqKey="Deck J" first="John" last="Deck">John Deck</name>
<affiliation wicri:level="2">
<nlm:aff id="A2">Berkeley Natural History Museums, University of California, Berkeley, California, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Berkeley Natural History Museums, University of California, Berkeley, California</wicri:regionArea>
<placeName>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Pyle, Richard L" sort="Pyle, Richard L" uniqKey="Pyle R" first="Richard L." last="Pyle">Richard L. Pyle</name>
<affiliation>
<nlm:aff id="A3">Department of Natural Sciences, Bernice P. Bishop Museum, Honolulu, HI USA 96817</nlm:aff>
<wicri:noCountry code="subfield">HI USA 96817</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Kunze, John" sort="Kunze, John" uniqKey="Kunze J" first="John" last="Kunze">John Kunze</name>
<affiliation wicri:level="2">
<nlm:aff id="A4">California Digital Library, University of California Office of the President, Oakland, CA USA</nlm:aff>
<country>États-Unis</country>
<placeName>
<region type="state">Californie</region>
</placeName>
<wicri:cityArea>California Digital Library, University of California Office of the President, Oakland</wicri:cityArea>
</affiliation>
</author>
<author>
<name sortKey="Penev, Lyubomir" sort="Penev, Lyubomir" uniqKey="Penev L" first="Lyubomir" last="Penev">Lyubomir Penev</name>
<affiliation wicri:level="3">
<nlm:aff id="A5">Institute of Biodiversity and Ecosystem Research, Bulgarian Academy of Sciences, and Pensoft Publishers, Sofia, Bulgaria</nlm:aff>
<country xml:lang="fr">Bulgarie</country>
<wicri:regionArea>Institute of Biodiversity and Ecosystem Research, Bulgarian Academy of Sciences, and Pensoft Publishers, Sofia</wicri:regionArea>
<placeName>
<settlement type="city">Sofia</settlement>
<region nuts="2">Sofia-ville (oblast)</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Walls, Ramona" sort="Walls, Ramona" uniqKey="Walls R" first="Ramona" last="Walls">Ramona Walls</name>
<affiliation wicri:level="2">
<nlm:aff id="A6">iPlant Collaborative, University of Arizona, Tucson, AZ 85721</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Arizona</region>
</placeName>
<wicri:cityArea>iPlant Collaborative, University of Arizona, Tucson</wicri:cityArea>
</affiliation>
</author>
<author>
<name sortKey="Hagedorn, Gregor" sort="Hagedorn, Gregor" uniqKey="Hagedorn G" first="Gregor" last="Hagedorn">Gregor Hagedorn</name>
<affiliation wicri:level="3">
<nlm:aff id="A7">Museum für Naturkunde, Leibniz-Institut für Evolutions- und Biodiversitätsforschung, Invalidenstraße 43, 10115 Berlin, Germany</nlm:aff>
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Museum für Naturkunde, Leibniz-Institut für Evolutions- und Biodiversitätsforschung, Invalidenstraße 43, 10115 Berlin</wicri:regionArea>
<placeName>
<region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Agosti, Donat" sort="Agosti, Donat" uniqKey="Agosti D" first="Donat" last="Agosti">Donat Agosti</name>
<affiliation>
<nlm:aff id="A8">Plazi, Zinggstrasse 16, 3007 Bern, Switzerland</nlm:aff>
<wicri:noCountry code="subfield">Switzerland</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Wieczorek, John" sort="Wieczorek, John" uniqKey="Wieczorek J" first="John" last="Wieczorek">John Wieczorek</name>
<affiliation>
<nlm:aff id="A9">Museum of Vertebrate Zoology, University of California, Berkeley, CA USA. United States of America. 94720-3160</nlm:aff>
<wicri:noCountry code="subfield">CA USA. United States of America. 94720-3160</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Catapano, Terry" sort="Catapano, Terry" uniqKey="Catapano T" first="Terry" last="Catapano">Terry Catapano</name>
<affiliation>
<nlm:aff id="A8">Plazi, Zinggstrasse 16, 3007 Bern, Switzerland</nlm:aff>
<wicri:noCountry code="subfield">Switzerland</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Page, Roderic D M" sort="Page, Roderic D M" uniqKey="Page R" first="Roderic D. M." last="Page">Roderic D. M. Page</name>
<affiliation>
<nlm:aff id="A10">Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, G12 8QQ. UK</nlm:aff>
<wicri:noCountry code="subfield">G12 8QQ. UK</wicri:noCountry>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">25901117</idno>
<idno type="pmc">4400380</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4400380</idno>
<idno type="RBID">PMC:4400380</idno>
<idno type="doi">10.3897/zookeys.494.9352</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000205</idno>
<idno type="wicri:Area/Pmc/Curation">000205</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000027</idno>
<idno type="wicri:Area/Ncbi/Merge">000224</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Community Next Steps for Making Globally Unique Identifiers Work for Biocollections Data</title>
<author>
<name sortKey="Guralnick, Robert P" sort="Guralnick, Robert P" uniqKey="Guralnick R" first="Robert P." last="Guralnick">Robert P. Guralnick</name>
<affiliation wicri:level="2">
<nlm:aff id="A1">Florida Museum of Natural History, University of Florida, Gainesville, FL 32611-2710 USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Floride</region>
</placeName>
<wicri:cityArea>Florida Museum of Natural History, University of Florida, Gainesville</wicri:cityArea>
</affiliation>
</author>
<author>
<name sortKey="Cellinese, Nico" sort="Cellinese, Nico" uniqKey="Cellinese N" first="Nico" last="Cellinese">Nico Cellinese</name>
<affiliation wicri:level="2">
<nlm:aff id="A1">Florida Museum of Natural History, University of Florida, Gainesville, FL 32611-2710 USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Floride</region>
</placeName>
<wicri:cityArea>Florida Museum of Natural History, University of Florida, Gainesville</wicri:cityArea>
</affiliation>
</author>
<author>
<name sortKey="Deck, John" sort="Deck, John" uniqKey="Deck J" first="John" last="Deck">John Deck</name>
<affiliation wicri:level="2">
<nlm:aff id="A2">Berkeley Natural History Museums, University of California, Berkeley, California, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Berkeley Natural History Museums, University of California, Berkeley, California</wicri:regionArea>
<placeName>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Pyle, Richard L" sort="Pyle, Richard L" uniqKey="Pyle R" first="Richard L." last="Pyle">Richard L. Pyle</name>
<affiliation>
<nlm:aff id="A3">Department of Natural Sciences, Bernice P. Bishop Museum, Honolulu, HI USA 96817</nlm:aff>
<wicri:noCountry code="subfield">HI USA 96817</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Kunze, John" sort="Kunze, John" uniqKey="Kunze J" first="John" last="Kunze">John Kunze</name>
<affiliation wicri:level="2">
<nlm:aff id="A4">California Digital Library, University of California Office of the President, Oakland, CA USA</nlm:aff>
<country>États-Unis</country>
<placeName>
<region type="state">Californie</region>
</placeName>
<wicri:cityArea>California Digital Library, University of California Office of the President, Oakland</wicri:cityArea>
</affiliation>
</author>
<author>
<name sortKey="Penev, Lyubomir" sort="Penev, Lyubomir" uniqKey="Penev L" first="Lyubomir" last="Penev">Lyubomir Penev</name>
<affiliation wicri:level="3">
<nlm:aff id="A5">Institute of Biodiversity and Ecosystem Research, Bulgarian Academy of Sciences, and Pensoft Publishers, Sofia, Bulgaria</nlm:aff>
<country xml:lang="fr">Bulgarie</country>
<wicri:regionArea>Institute of Biodiversity and Ecosystem Research, Bulgarian Academy of Sciences, and Pensoft Publishers, Sofia</wicri:regionArea>
<placeName>
<settlement type="city">Sofia</settlement>
<region nuts="2">Sofia-ville (oblast)</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Walls, Ramona" sort="Walls, Ramona" uniqKey="Walls R" first="Ramona" last="Walls">Ramona Walls</name>
<affiliation wicri:level="2">
<nlm:aff id="A6">iPlant Collaborative, University of Arizona, Tucson, AZ 85721</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Arizona</region>
</placeName>
<wicri:cityArea>iPlant Collaborative, University of Arizona, Tucson</wicri:cityArea>
</affiliation>
</author>
<author>
<name sortKey="Hagedorn, Gregor" sort="Hagedorn, Gregor" uniqKey="Hagedorn G" first="Gregor" last="Hagedorn">Gregor Hagedorn</name>
<affiliation wicri:level="3">
<nlm:aff id="A7">Museum für Naturkunde, Leibniz-Institut für Evolutions- und Biodiversitätsforschung, Invalidenstraße 43, 10115 Berlin, Germany</nlm:aff>
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Museum für Naturkunde, Leibniz-Institut für Evolutions- und Biodiversitätsforschung, Invalidenstraße 43, 10115 Berlin</wicri:regionArea>
<placeName>
<region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Agosti, Donat" sort="Agosti, Donat" uniqKey="Agosti D" first="Donat" last="Agosti">Donat Agosti</name>
<affiliation>
<nlm:aff id="A8">Plazi, Zinggstrasse 16, 3007 Bern, Switzerland</nlm:aff>
<wicri:noCountry code="subfield">Switzerland</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Wieczorek, John" sort="Wieczorek, John" uniqKey="Wieczorek J" first="John" last="Wieczorek">John Wieczorek</name>
<affiliation>
<nlm:aff id="A9">Museum of Vertebrate Zoology, University of California, Berkeley, CA USA. United States of America. 94720-3160</nlm:aff>
<wicri:noCountry code="subfield">CA USA. United States of America. 94720-3160</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Catapano, Terry" sort="Catapano, Terry" uniqKey="Catapano T" first="Terry" last="Catapano">Terry Catapano</name>
<affiliation>
<nlm:aff id="A8">Plazi, Zinggstrasse 16, 3007 Bern, Switzerland</nlm:aff>
<wicri:noCountry code="subfield">Switzerland</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Page, Roderic D M" sort="Page, Roderic D M" uniqKey="Page R" first="Roderic D. M." last="Page">Roderic D. M. Page</name>
<affiliation>
<nlm:aff id="A10">Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, G12 8QQ. UK</nlm:aff>
<wicri:noCountry code="subfield">G12 8QQ. UK</wicri:noCountry>
</affiliation>
</author>
</analytic>
<series>
<title level="j">ZooKeys</title>
<idno type="ISSN">1313-2989</idno>
<idno type="eISSN">1313-2970</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<label>Abstract</label>
<p>Biodiversity data is being digitized and made available online at a rapidly increasing rate but current practices typically do not preserve linkages between these data, which impedes interoperation, provenance tracking, and assembly of larger datasets. For data associated with biocollections, the biodiversity community has long recognized that an essential part of establishing and preserving linkages is to apply globally unique identifiers at the point when data are generated in the field and to persist these identifiers downstream, but this is seldom implemented in practice. There has neither been coalescence towards one single identifier solution (as in some other domains), nor even a set of recommended best practices and standards to support multiple identifier schemes sharing consistent responses. In order to further progress towards a broader community consensus, a group of biocollections and informatics experts assembled in Stockholm
in October 2014 to discuss community next steps to overcome current roadblocks. The workshop participants divided into four groups focusing on: identifier practice in current field biocollections; identifier application for legacy biocollections; identifiers as applied to biodiversity data records as they are published and made available in semantically marked-up publications; and cross-cutting identifier solutions that bridge across these domains. The main outcome was consensus on key issues, including recognition of differences between legacy and new biocollections processes, the need for identifier metadata profiles that can report information on identifier persistence missions, and the unambiguous indication of the type of object associated with the identifier. Current identifier characteristics are also summarized, and an overview of available schemes and practices is provided.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Baskauf, S" uniqKey="Baskauf S">S Baskauf</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Catapano, T" uniqKey="Catapano T">T Catapano</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cryer, P" uniqKey="Cryer P">P Cryer</name>
</author>
<author>
<name sortKey="Hyam, R" uniqKey="Hyam R">R Hyam</name>
</author>
<author>
<name sortKey="Miller, C" uniqKey="Miller C">C Miller</name>
</author>
<author>
<name sortKey="Nicolson, N" uniqKey="Nicolson N">N Nicolson</name>
</author>
<author>
<name sortKey=" Tuama, E" uniqKey=" Tuama E">É Ó Tuama</name>
</author>
<author>
<name sortKey="Page, R" uniqKey="Page R">R Page</name>
</author>
<author>
<name sortKey="Rees, J" uniqKey="Rees J">J Rees</name>
</author>
<author>
<name sortKey="Riccardi, G" uniqKey="Riccardi G">G Riccardi</name>
</author>
<author>
<name sortKey="Richards, K" uniqKey="Richards K">K Richards</name>
</author>
<author>
<name sortKey="White, R" uniqKey="White R">R White</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Deck, J" uniqKey="Deck J">J Deck</name>
</author>
<author>
<name sortKey="Guralnick, R" uniqKey="Guralnick R">R Guralnick</name>
</author>
<author>
<name sortKey="Walls, R" uniqKey="Walls R">R Walls</name>
</author>
<author>
<name sortKey="Blum, S" uniqKey="Blum S">S Blum</name>
</author>
<author>
<name sortKey="Haendel, M" uniqKey="Haendel M">M Haendel</name>
</author>
<author>
<name sortKey="Matsunaga, A" uniqKey="Matsunaga A">A Matsunaga</name>
</author>
<author>
<name sortKey="Wieczorek, J" uniqKey="Wieczorek J">J Wieczorek</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Guralnick, R" uniqKey="Guralnick R">R Guralnick</name>
</author>
<author>
<name sortKey="Conlin, T" uniqKey="Conlin T">T Conlin</name>
</author>
<author>
<name sortKey="Deck, J" uniqKey="Deck J">J Deck</name>
</author>
<author>
<name sortKey="Stucky, B" uniqKey="Stucky B">B Stucky</name>
</author>
<author>
<name sortKey="Cellinese, N" uniqKey="Cellinese N">N Cellinese</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hagedorn, G" uniqKey="Hagedorn G">G Hagedorn</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hagedorn, G" uniqKey="Hagedorn G">G Hagedorn</name>
</author>
<author>
<name sortKey="Catapano, T" uniqKey="Catapano T">T Catapano</name>
</author>
<author>
<name sortKey="Guntsch, A" uniqKey="Guntsch A">A Güntsch</name>
</author>
<author>
<name sortKey="Mietchen, D" uniqKey="Mietchen D">D Mietchen</name>
</author>
<author>
<name sortKey="Endresen, D" uniqKey="Endresen D">D Endresen</name>
</author>
<author>
<name sortKey="Sierra, S" uniqKey="Sierra S">S Sierra</name>
</author>
<author>
<name sortKey="Groom, Q" uniqKey="Groom Q">Q Groom</name>
</author>
<author>
<name sortKey="Biserkov, J" uniqKey="Biserkov J">J Biserkov</name>
</author>
<author>
<name sortKey="Glockler, F" uniqKey="Glockler F">F Glöckler</name>
</author>
<author>
<name sortKey="Morris, R" uniqKey="Morris R">R Morris</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Page, Rdm" uniqKey="Page R">RDM Page</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Page, Rdm" uniqKey="Page R">RDM Page</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Penev, L" uniqKey="Penev L">L Penev</name>
</author>
<author>
<name sortKey="Agosti, D" uniqKey="Agosti D">D Agosti</name>
</author>
<author>
<name sortKey="Georgiev, T" uniqKey="Georgiev T">T Georgiev</name>
</author>
<author>
<name sortKey="Catapano, T" uniqKey="Catapano T">T Catapano</name>
</author>
<author>
<name sortKey="Miller, J" uniqKey="Miller J">J Miller</name>
</author>
<author>
<name sortKey="Blagoderov, V" uniqKey="Blagoderov V">V Blagoderov</name>
</author>
<author>
<name sortKey="Roberts, D" uniqKey="Roberts D">D Roberts</name>
</author>
<author>
<name sortKey="Smith, Vs" uniqKey="Smith V">VS Smith</name>
</author>
<author>
<name sortKey="Brake, I" uniqKey="Brake I">I Brake</name>
</author>
<author>
<name sortKey="Rycroft, S" uniqKey="Rycroft S">S Rycroft</name>
</author>
<author>
<name sortKey="Scott, B" uniqKey="Scott B">B Scott</name>
</author>
<author>
<name sortKey="Johnson, Nf" uniqKey="Johnson N">NF Johnson</name>
</author>
<author>
<name sortKey="Morris, Ra" uniqKey="Morris R">RA Morris</name>
</author>
<author>
<name sortKey="Sautter, G" uniqKey="Sautter G">G Sautter</name>
</author>
<author>
<name sortKey="Chavan, V" uniqKey="Chavan V">V Chavan</name>
</author>
<author>
<name sortKey="Robertson, T" uniqKey="Robertson T">T Robertson</name>
</author>
<author>
<name sortKey="Remsen, D" uniqKey="Remsen D">D Remsen</name>
</author>
<author>
<name sortKey="Stoev, P" uniqKey="Stoev P">P Stoev</name>
</author>
<author>
<name sortKey="Parr, C" uniqKey="Parr C">C Parr</name>
</author>
<author>
<name sortKey="Knapp, S" uniqKey="Knapp S">S Knapp</name>
</author>
<author>
<name sortKey="Kress, Wj" uniqKey="Kress W">WJ Kress</name>
</author>
<author>
<name sortKey="Thompson, Fc" uniqKey="Thompson F">FC Thompson</name>
</author>
<author>
<name sortKey="Erwin, T" uniqKey="Erwin T">T Erwin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Penev, L" uniqKey="Penev L">L Penev</name>
</author>
<author>
<name sortKey="Catapano, T" uniqKey="Catapano T">T Catapano</name>
</author>
<author>
<name sortKey="Agosti, D" uniqKey="Agosti D">D Agosti</name>
</author>
<author>
<name sortKey="Sautter, G" uniqKey="Sautter G">G Sautter</name>
</author>
<author>
<name sortKey="Stoev, P" uniqKey="Stoev P">P Stoev</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pereira, R" uniqKey="Pereira R">R Pereira</name>
</author>
<author>
<name sortKey="Hobern, D" uniqKey="Hobern D">D Hobern</name>
</author>
<author>
<name sortKey="Hyam, R" uniqKey="Hyam R">R Hyam</name>
</author>
<author>
<name sortKey="Belbin, L" uniqKey="Belbin L">L Belbin</name>
</author>
<author>
<name sortKey="Richards, K" uniqKey="Richards K">K Richards</name>
</author>
<author>
<name sortKey="Blum, S" uniqKey="Blum S">S Blum</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pyle, Rl" uniqKey="Pyle R">RL Pyle</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Richards, K" uniqKey="Richards K">K Richards</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Richards, K" uniqKey="Richards K">K Richards</name>
</author>
<author>
<name sortKey="White, R" uniqKey="White R">R White</name>
</author>
<author>
<name sortKey="Nicolson, N" uniqKey="Nicolson N">N Nicolson</name>
</author>
<author>
<name sortKey="Pyle, R" uniqKey="Pyle R">R Pyle</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Smith, V" uniqKey="Smith V">V Smith</name>
</author>
<author>
<name sortKey="Georgiev, T" uniqKey="Georgiev T">T Georgiev</name>
</author>
<author>
<name sortKey="Stoev, P" uniqKey="Stoev P">P Stoev</name>
</author>
<author>
<name sortKey="Biserkov, J" uniqKey="Biserkov J">J Biserkov</name>
</author>
<author>
<name sortKey="Miller, J" uniqKey="Miller J">J Miller</name>
</author>
<author>
<name sortKey="Livermore, L" uniqKey="Livermore L">L Livermore</name>
</author>
<author>
<name sortKey="Baker, E" uniqKey="Baker E">E Baker</name>
</author>
<author>
<name sortKey="Mietchen, D" uniqKey="Mietchen D">D Mietchen</name>
</author>
<author>
<name sortKey="Couvreur, T" uniqKey="Couvreur T">T Couvreur</name>
</author>
<author>
<name sortKey="Mueller, G" uniqKey="Mueller G">G Mueller</name>
</author>
<author>
<name sortKey="Dikow, T" uniqKey="Dikow T">T Dikow</name>
</author>
<author>
<name sortKey="Helgen, K" uniqKey="Helgen K">K Helgen</name>
</author>
<author>
<name sortKey="Frank, J" uniqKey="Frank J">J Frank</name>
</author>
<author>
<name sortKey="Agosti, D" uniqKey="Agosti D">D Agosti</name>
</author>
<author>
<name sortKey="Roberts, D" uniqKey="Roberts D">D Roberts</name>
</author>
<author>
<name sortKey="Penev, L" uniqKey="Penev L">L Penev</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Starr, J" uniqKey="Starr J">J Starr</name>
</author>
<author>
<name sortKey="Castro, E" uniqKey="Castro E">E Castro</name>
</author>
<author>
<name sortKey="Crosas, M" uniqKey="Crosas M">M Crosas</name>
</author>
<author>
<name sortKey="Dumontier, M" uniqKey="Dumontier M">M Dumontier</name>
</author>
<author>
<name sortKey="Downs, Rr" uniqKey="Downs R">RR Downs</name>
</author>
<author>
<name sortKey="Duerr, R" uniqKey="Duerr R">R Duerr</name>
</author>
<author>
<name sortKey="Haak, L" uniqKey="Haak L">L Haak</name>
</author>
<author>
<name sortKey="Haendel, M" uniqKey="Haendel M">M Haendel</name>
</author>
<author>
<name sortKey="Herman, I" uniqKey="Herman I">I Herman</name>
</author>
<author>
<name sortKey="Hodson, S" uniqKey="Hodson S">S Hodson</name>
</author>
<author>
<name sortKey="Hourcle, J" uniqKey="Hourcle J">J Hourclé</name>
</author>
<author>
<name sortKey="Kratz, Je" uniqKey="Kratz J">JE Kratz</name>
</author>
<author>
<name sortKey="Lin, J" uniqKey="Lin J">J Lin</name>
</author>
<author>
<name sortKey="Nielsen, Lh" uniqKey="Nielsen L">LH Nielsen</name>
</author>
<author>
<name sortKey="Nurnberger, A" uniqKey="Nurnberger A">A Nurnberger</name>
</author>
<author>
<name sortKey="Proll, S" uniqKey="Proll S">S Pröll</name>
</author>
<author>
<name sortKey="Rauber, A" uniqKey="Rauber A">A Rauber</name>
</author>
<author>
<name sortKey="Sacchi, S" uniqKey="Sacchi S">S Sacchi</name>
</author>
<author>
<name sortKey="Smith, Ap" uniqKey="Smith A">AP Smith</name>
</author>
<author>
<name sortKey="Taylor, M" uniqKey="Taylor M">M Taylor</name>
</author>
<author>
<name sortKey="Clark, T" uniqKey="Clark T">T Clark</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stucky, B" uniqKey="Stucky B">B Stucky</name>
</author>
<author>
<name sortKey="Deck, J" uniqKey="Deck J">J Deck</name>
</author>
<author>
<name sortKey="Conlin, T" uniqKey="Conlin T">T Conlin</name>
</author>
<author>
<name sortKey="Ziemba, L" uniqKey="Ziemba L">L Ziemba</name>
</author>
<author>
<name sortKey="Cellinese, N" uniqKey="Cellinese N">N Cellinese</name>
</author>
<author>
<name sortKey="Guralnick, R" uniqKey="Guralnick R">R Guralnick</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Walls, Rl" uniqKey="Walls R">RL Walls</name>
</author>
<author>
<name sortKey="Deck, J" uniqKey="Deck J">J Deck</name>
</author>
<author>
<name sortKey="Guralnick, R" uniqKey="Guralnick R">R Guralnick</name>
</author>
<author>
<name sortKey="Baskauf, S" uniqKey="Baskauf S">S Baskauf</name>
</author>
<author>
<name sortKey="Beaman, R" uniqKey="Beaman R">R Beaman</name>
</author>
<author>
<name sortKey="Blum, S" uniqKey="Blum S">S Blum</name>
</author>
<author>
<name sortKey="Bowers, S" uniqKey="Bowers S">S Bowers</name>
</author>
<author>
<name sortKey="Buttigieg, Pl" uniqKey="Buttigieg P">PL Buttigieg</name>
</author>
<author>
<name sortKey="Davies, N" uniqKey="Davies N">N Davies</name>
</author>
<author>
<name sortKey="Endresen, D" uniqKey="Endresen D">D Endresen</name>
</author>
<author>
<name sortKey="Gandolfo, Ma" uniqKey="Gandolfo M">MA Gandolfo</name>
</author>
<author>
<name sortKey="Hanner, R" uniqKey="Hanner R">R Hanner</name>
</author>
<author>
<name sortKey="Janning, A" uniqKey="Janning A">A Janning</name>
</author>
<author>
<name sortKey="Krishtalka, L" uniqKey="Krishtalka L">L Krishtalka</name>
</author>
<author>
<name sortKey="Matsunaga, A" uniqKey="Matsunaga A">A Matsunaga</name>
</author>
<author>
<name sortKey="Midford, P" uniqKey="Midford P">P Midford</name>
</author>
<author>
<name sortKey="Morrison, N" uniqKey="Morrison N">N Morrison</name>
</author>
<author>
<name sortKey=" Tuama, E" uniqKey=" Tuama E">E Ó Tuama</name>
</author>
<author>
<name sortKey="Schildhauer, M" uniqKey="Schildhauer M">M Schildhauer</name>
</author>
<author>
<name sortKey="Smith, B" uniqKey="Smith B">B Smith</name>
</author>
<author>
<name sortKey="Stucky, Bj" uniqKey="Stucky B">BJ Stucky</name>
</author>
<author>
<name sortKey="Thomer, A" uniqKey="Thomer A">A Thomer</name>
</author>
<author>
<name sortKey="Wieczorek, J" uniqKey="Wieczorek J">J Wieczorek</name>
</author>
<author>
<name sortKey="Whitacre, J" uniqKey="Whitacre J">J Whitacre</name>
</author>
<author>
<name sortKey="Wooley, J" uniqKey="Wooley J">J Wooley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wieczorek, J" uniqKey="Wieczorek J">J Wieczorek</name>
</author>
<author>
<name sortKey="Bloom, D" uniqKey="Bloom D">D Bloom</name>
</author>
<author>
<name sortKey="Guralnick, R" uniqKey="Guralnick R">R Guralnick</name>
</author>
<author>
<name sortKey="Blum, S" uniqKey="Blum S">S Blum</name>
</author>
<author>
<name sortKey="Doring, M" uniqKey="Doring M">M Döring</name>
</author>
<author>
<name sortKey="De Giovanni, R" uniqKey="De Giovanni R">R De Giovanni</name>
</author>
<author>
<name sortKey="Robertson, T" uniqKey="Robertson T">T Robertson</name>
</author>
<author>
<name sortKey="Vieglais, D" uniqKey="Vieglais D">D Vieglais</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Zookeys</journal-id>
<journal-id journal-id-type="iso-abbrev">Zookeys</journal-id>
<journal-id journal-id-type="publisher-id">ZooKeys</journal-id>
<journal-title-group>
<journal-title>ZooKeys</journal-title>
</journal-title-group>
<issn pub-type="ppub">1313-2989</issn>
<issn pub-type="epub">1313-2970</issn>
<publisher>
<publisher-name>Pensoft Publishers</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">25901117</article-id>
<article-id pub-id-type="pmc">4400380</article-id>
<article-id pub-id-type="doi">10.3897/zookeys.494.9352</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Review Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Community Next Steps for Making Globally Unique Identifiers Work for Biocollections Data</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Guralnick</surname>
<given-names>Robert P.</given-names>
</name>
<xref ref-type="aff" rid="A1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Cellinese</surname>
<given-names>Nico</given-names>
</name>
<xref ref-type="aff" rid="A1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Deck</surname>
<given-names>John</given-names>
</name>
<xref ref-type="aff" rid="A2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Pyle</surname>
<given-names>Richard L.</given-names>
</name>
<xref ref-type="aff" rid="A3">3</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kunze</surname>
<given-names>John</given-names>
</name>
<xref ref-type="aff" rid="A4">4</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Penev</surname>
<given-names>Lyubomir</given-names>
</name>
<xref ref-type="aff" rid="A5">5</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Walls</surname>
<given-names>Ramona</given-names>
</name>
<xref ref-type="aff" rid="A6">6</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hagedorn</surname>
<given-names>Gregor</given-names>
</name>
<xref ref-type="aff" rid="A7">7</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Agosti</surname>
<given-names>Donat</given-names>
</name>
<xref ref-type="aff" rid="A8">8</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wieczorek</surname>
<given-names>John</given-names>
</name>
<xref ref-type="aff" rid="A9">9</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Catapano</surname>
<given-names>Terry</given-names>
</name>
<xref ref-type="aff" rid="A8">8</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Page</surname>
<given-names>Roderic D. M.</given-names>
</name>
<xref ref-type="aff" rid="A10">10</xref>
</contrib>
</contrib-group>
<aff id="A1">
<label>1</label>
Florida Museum of Natural History, University of Florida, Gainesville, FL 32611-2710 USA</aff>
<aff id="A2">
<label>2</label>
Berkeley Natural History Museums, University of California, Berkeley, California, USA</aff>
<aff id="A3">
<label>3</label>
Department of Natural Sciences, Bernice P. Bishop Museum, Honolulu, HI USA 96817</aff>
<aff id="A4">
<label>4</label>
California Digital Library, University of California Office of the President, Oakland, CA USA</aff>
<aff id="A5">
<label>5</label>
Institute of Biodiversity and Ecosystem Research, Bulgarian Academy of Sciences, and Pensoft Publishers, Sofia, Bulgaria</aff>
<aff id="A6">
<label>6</label>
iPlant Collaborative, University of Arizona, Tucson, AZ 85721</aff>
<aff id="A7">
<label>7</label>
Museum für Naturkunde, Leibniz-Institut für Evolutions- und Biodiversitätsforschung, Invalidenstraße 43, 10115 Berlin, Germany</aff>
<aff id="A8">
<label>8</label>
Plazi, Zinggstrasse 16, 3007 Bern, Switzerland</aff>
<aff id="A9">
<label>9</label>
Museum of Vertebrate Zoology, University of California, Berkeley, CA 94720-3160, USA</aff>
<aff id="A10">
<label>10</label>
Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK</aff>
<author-notes>
<corresp>Corresponding author: Robert P. Guralnick (
<email xlink:type="simple">rguralnick@flmnh.ufl.edu</email>
)</corresp>
<fn fn-type="edited-by">
<p>Academic editor: R. Mesibov</p>
</fn>
</author-notes>
<pub-date pub-type="collection">
<year>2015</year>
</pub-date>
<pub-date pub-type="epub">
<day>6</day>
<month>4</month>
<year>2015</year>
</pub-date>
<issue>494</issue>
<fpage>133</fpage>
<lpage>154</lpage>
<history>
<date date-type="received">
<day>7</day>
<month>2</month>
<year>2015</year>
</date>
<date date-type="accepted">
<day>17</day>
<month>3</month>
<year>2015</year>
</date>
</history>
<permissions>
<copyright-statement>Robert P. Guralnick, Nico Cellinese, John Deck, Richard L. Pyle, John Kunze, Lyubomir Penev, Ramona Walls, Gregor Hagedorn, Donat Agosti, John Wieczorek, Terry Catapano, Roderic D. M. Page</copyright-statement>
<license license-type="creative-commons-attribution" xlink:href="http://creativecommons.org/licenses/by/4.0">
<license-p>This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</license-p>
</license>
</permissions>
<self-uri content-type="zoobank" xlink:type="simple" xlink:href="http://zoobank.org/88A49A9D-612C-4BA7-93DB-B4A203346506">http://zoobank.org/88A49A9D-612C-4BA7-93DB-B4A203346506</self-uri>
<abstract>
<label>Abstract</label>
<p>Biodiversity data is being digitized and made available online at a rapidly increasing rate but current practices typically do not preserve linkages between these data, which impedes interoperation, provenance tracking, and assembly of larger datasets. For data associated with biocollections, the biodiversity community has long recognized that an essential part of establishing and preserving linkages is to apply globally unique identifiers at the point when data are generated in the field and to persist these identifiers downstream, but this is seldom implemented in practice. There has neither been coalescence towards one single identifier solution (as in some other domains), nor even a set of recommended best practices and standards to support multiple identifier schemes sharing consistent responses. In order to further progress towards a broader community consensus, a group of biocollections and informatics experts assembled in Stockholm
in October 2014 to discuss community next steps to overcome current roadblocks. The workshop participants divided into four groups focusing on: identifier practice in current field biocollections; identifier application for legacy biocollections; identifiers as applied to biodiversity data records as they are published and made available in semantically marked-up publications; and cross-cutting identifier solutions that bridge across these domains. The main outcome was consensus on key issues, including recognition of differences between legacy and new biocollections processes, the need for identifier metadata profiles that can report information on identifier persistence missions, and the unambiguous indication of the type of object associated with the identifier. Current identifier characteristics are also summarized, and an overview of available schemes and practices is provided.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>Biocollections</kwd>
<kwd>identifiers</kwd>
<kwd>Globally Unique Identifiers</kwd>
<kwd>GUIDs</kwd>
<kwd>field collections</kwd>
<kwd>legacy collections</kwd>
<kwd>linked open data</kwd>
<kwd>semantic publishing</kwd>
</kwd-group>
</article-meta>
<notes>
<sec sec-type="Citation">
<title>Citation</title>
<p>Guralnick RP, Cellinese N, Deck J, Pyle RL, Kunze J, Penev L, Walls R, Hagedorn G, Agosti D, Wieczorek J, Catapano T, Page RDM (2015) Community Next Steps for Making Globally Unique Identifiers Work for Biocollections Data. ZooKeys 494: 133–154. doi:
<ext-link ext-link-type="doi" xlink:href="10.3897/zookeys.494.9352">10.3897/zookeys.494.9352</ext-link>
</p>
</sec>
</notes>
</front>
<body>
<sec sec-type="Introduction">
<title>Introduction</title>
<p>The current biodiversity and genomic fields are characterized by large and rapidly growing digital datasets. While this trend in digitizing the global biodiversity knowledge base is valuable and important for accessing and synthesizing biodiversity information in the era of the Internet and Big Data, much of this information remains only loosely integrated. Efforts to cross-link otherwise disconnected silos of data (
<xref rid="B9" ref-type="bibr">Page 2008</xref>
,
<xref rid="B10" ref-type="bibr">2009</xref>
) still rely on largely imprecise points of intersection, such as text-string taxon names (as proxies for taxon concepts), combinations of institution codes, collection codes, and catalog numbers (as labels for biological specimens and other samples), and aggregates of metadata that allow inferring equivalency (e.g., a combination of place, time, and participants for collecting events).</p>
<p>The necessary solution for building more connected, cross-linked and digitally accessible Internet content is to assign recognizable, persistent, globally unique, stable identifiers to biocollections specimens and data objects. While effort has been put forth on applicability statements for both Life Science Identifiers (LSIDs) and globally unique identifiers (GUIDs) (
<xref rid="B13" ref-type="bibr">Pereira et al. 2007</xref>
, Richards 2009), and on other fronts (
<xref rid="B14" ref-type="bibr">Pyle 2006</xref>
,
<xref rid="B4" ref-type="bibr">Cryer et al. 2009</xref>
,
<xref rid="B1" ref-type="bibr">Baskauf 2010</xref>
,
<xref rid="B16" ref-type="bibr">Richards et al. 2011</xref>
,
<xref rid="B20" ref-type="bibr">TDWG 2013</xref>
,
<xref rid="B2" ref-type="bibr">Bouchout Declaration 2014</xref>
), no single solution or clear best practice has taken hold in the biocollections community. To illustrate, Table
<xref ref-type="table" rid="T1">1</xref>
shows some examples of identifiers associated with data mobilised by GBIF, including LSIDs, URNs, HTTP-URIs (URLs) of various types, and DOIs (see Box
<xref ref-type="table" rid="T2">1</xref>
for explanations of abbreviations used in this article). The community has also struggled to define its view on identifier and dereferencing service persistence, and whether physical objects and abstract concepts should have identifiers that include embedded information on dereferencing services and protocols (a dereferenceable identifier contains an Internet protocol that directs a client to information about the resource it identifies), or whether functions of object identification and dereferencing should be decoupled. Further, and perhaps most important, the next steps towards a community-wide GUID solution are unclear.</p>
<table-wrap id="T1" orientation="portrait" position="float">
<label>Table 1.</label>
<caption>
<p>Examples of identifiers in use for biological samples in the GBIF database.</p>
</caption>
<table frame="hsides" rules="all">
<tbody>
<tr>
<th rowspan="1" colspan="1">GBIF occurrence</th>
<th rowspan="1" colspan="1">Identifier type</th>
<th rowspan="1" colspan="1">Identifier</th>
<th rowspan="1" colspan="1">Catalog number</th>
<th rowspan="1" colspan="1">Collection</th>
</tr>
<tr>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.gbif.org/occurrence/872747863">872747863</ext-link>
</td>
<td rowspan="1" colspan="1">LSID</td>
<td rowspan="1" colspan="1">urn:lsid:biosci.ohio-state.edu:osuc_occurrences:OSUC__169968</td>
<td rowspan="1" colspan="1">OSUC 169968</td>
<td rowspan="1" colspan="1">C.A. Triplehorn Insect Collection</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.gbif.org/occurrence/896421698">896421698</ext-link>
</td>
<td rowspan="1" colspan="1">URN</td>
<td rowspan="1" colspan="1">urn:occurrence:Arctos:MVZ:Bird:157675:1526959</td>
<td rowspan="1" colspan="1">MVZ 157675</td>
<td rowspan="1" colspan="1">MVZ Bird Collection</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.gbif.org/occurrence/784060956">784060956</ext-link>
</td>
<td rowspan="1" colspan="1">URN</td>
<td rowspan="1" colspan="1">urn:catalog:UMMZ:Mammals:171041</td>
<td rowspan="1" colspan="1">UMMZ 71041</td>
<td rowspan="1" colspan="1">UMMZ Mammal Collection</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.gbif.org/occurrence/575336458">575336458</ext-link>
</td>
<td rowspan="1" colspan="1">HTTP URI</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://data.rbge.org.uk/herb/E00115694">http://data.rbge.org.uk/herb/E00115694</ext-link>
</td>
<td rowspan="1" colspan="1">E00115694</td>
<td rowspan="1" colspan="1">Royal Botanic Garden Edinburgh Herbarium</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.gbif.org/occurrence/1050474791">1050474791</ext-link>
</td>
<td rowspan="1" colspan="1">HTTP URI</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://arctos.database.museum/guid/UAM:Ento:230092">http://arctos.database.museum/guid/UAM:Ento:230092</ext-link>
</td>
<td rowspan="1" colspan="1">UAM 230092</td>
<td rowspan="1" colspan="1">UAM Entomology Collection</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.gbif.org/occurrence/1050474791">1050474791</ext-link>
</td>
<td rowspan="1" colspan="1">DOI</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="doi" xlink:href="10.7299/X7VQ32SJ">10.7299/X7VQ32SJ</ext-link>
</td>
<td rowspan="1" colspan="1">UAM 230092</td>
<td rowspan="1" colspan="1">UAM Entomology Collection</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.gbif.org/occurrence/624211191">624211191</ext-link>
</td>
<td rowspan="1" colspan="1">UUID</td>
<td rowspan="1" colspan="1">EF0A4D3E-702F-4882-81B8-CA737AEB7B28</td>
<td rowspan="1" colspan="1">UF 161444</td>
<td rowspan="1" colspan="1">UF FLMNH Ichthyology</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.gbif.org/occurrence/476850316">476850316</ext-link>
</td>
<td rowspan="1" colspan="1">Darwin Core Triplet</td>
<td rowspan="1" colspan="1">MCZ:Mamm:8831</td>
<td rowspan="1" colspan="1">MCZ 8831</td>
<td rowspan="1" colspan="1">Museum of Comparative Zoology, Harvard University</td>
</tr>
</tbody>
</table>
</table-wrap>
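<p>The identifier types in Table 1 can largely be told apart by their syntax alone. The following is a minimal Python sketch of that idea, not a method proposed by the workshop: the function name classify_identifier and the regular expressions are illustrative assumptions written to match the example strings in Table 1, and they do not implement the full LSID, URN, or DOI grammars.</p>
<preformat preformat-type="code">
```python
import re

# Illustrative patterns (assumptions), tried in order from most to least specific.
_PATTERNS = [
    ("LSID", re.compile(r"^urn:lsid:[^:]+:[^:]+:[^:]+(:[^:]+)?$", re.IGNORECASE)),
    ("URN", re.compile(r"^urn:[^:]+:.+$", re.IGNORECASE)),
    ("HTTP URI", re.compile(r"^https?://\S+$", re.IGNORECASE)),
    ("DOI", re.compile(r"^10\.\d{4,9}/\S+$")),
    ("UUID", re.compile(
        r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
        re.IGNORECASE)),
    # Three colon-separated parts: institution code, collection code, catalog number.
    ("Darwin Core Triplet", re.compile(r"^[^:\s]+:[^:\s]+:[^:\s]+$")),
]


def classify_identifier(identifier):
    """Return a label for the first pattern the identifier matches, else 'unknown'."""
    text = identifier.strip()
    for label, pattern in _PATTERNS:
        if pattern.match(text):
            return label
    return "unknown"
```
</preformat>
<p>Because the patterns are tried in order, an LSID is reported as an LSID rather than as a generic URN. The check is purely syntactic: a string that looks like a Darwin Core Triplet is labeled as one whether or not the codes it contains are registered anywhere.</p>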
<table-wrap id="T2" orientation="portrait" position="float">
<label>Box 1.</label>
<caption>
<p>Abbreviations used in this article, with their spelled-out forms or more detailed meanings.</p>
</caption>
<table frame="hsides" rules="all">
<tbody>
<tr>
<td rowspan="1" colspan="1">ABCD</td>
<td rowspan="1" colspan="1">Access to Biological Collections Data</td>
</tr>
<tr>
<td rowspan="1" colspan="1">ARK</td>
<td rowspan="1" colspan="1">Archival Resource Key</td>
</tr>
<tr>
<td rowspan="1" colspan="1">BCO</td>
<td rowspan="1" colspan="1">Biological Collections Ontology</td>
</tr>
<tr>
<td rowspan="1" colspan="1">DMP</td>
<td rowspan="1" colspan="1">Data Management Plan</td>
</tr>
<tr>
<td rowspan="1" colspan="1">DOI</td>
<td rowspan="1" colspan="1">Digital Object Identifier</td>
</tr>
<tr>
<td rowspan="1" colspan="1">EZID</td>
<td rowspan="1" colspan="1">A type of identifier and system run by California Digital Library</td>
</tr>
<tr>
<td rowspan="1" colspan="1">GBIF</td>
<td rowspan="1" colspan="1">Global Biodiversity Information Facility</td>
</tr>
<tr>
<td rowspan="1" colspan="1">GRBio</td>
<td rowspan="1" colspan="1">Global Repository of Biorepositories</td>
</tr>
<tr>
<td rowspan="1" colspan="1">GUID</td>
<td rowspan="1" colspan="1">Globally Unique Identifier</td>
</tr>
<tr>
<td rowspan="1" colspan="1">HTTP-URI</td>
<td rowspan="1" colspan="1">HTTP Uniform Resource Identifier</td>
</tr>
<tr>
<td rowspan="1" colspan="1">IGSN</td>
<td rowspan="1" colspan="1">International Geo Sample Number</td>
</tr>
<tr>
<td rowspan="1" colspan="1">LOD</td>
<td rowspan="1" colspan="1">Linked Open Data</td>
</tr>
<tr>
<td rowspan="1" colspan="1">LSID</td>
<td rowspan="1" colspan="1">Life Science Identifier</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NEON</td>
<td rowspan="1" colspan="1">National Ecological Observatory Network</td>
</tr>
<tr>
<td rowspan="1" colspan="1">OCR</td>
<td rowspan="1" colspan="1">Optical Character Recognition</td>
</tr>
<tr>
<td rowspan="1" colspan="1">TDWG</td>
<td rowspan="1" colspan="1">Biodiversity Information Standards</td>
</tr>
<tr>
<td rowspan="1" colspan="1">URI</td>
<td rowspan="1" colspan="1">Uniform Resource Identifier</td>
</tr>
<tr>
<td rowspan="1" colspan="1">URL</td>
<td rowspan="1" colspan="1">Uniform Resource Locator</td>
</tr>
<tr>
<td rowspan="1" colspan="1">URN</td>
<td rowspan="1" colspan="1">Uniform Resource Name</td>
</tr>
<tr>
<td rowspan="1" colspan="1">UUID</td>
<td rowspan="1" colspan="1">Universally Unique Identifier</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The application of identifiers to biocollections and the physical (and conceptual) objects they contain is complicated by both long-standing, ingrained identifier curation practices and a rapidly changing technology landscape. Legacy collections often have
a checkered past of provenance-tracking; as a result, essential linkages between data and collections have been lost due to lack of coordination and data practices predating digital recording. New, “born-digital” sampling methods promise to open floodgates of data and can make it easier to assign globally unique identifiers at the point of data creation. Thus, the optimal identifier solutions for new collections may be different from those for legacy data. Adding to the challenge, vast amounts of biodiversity data are in the scientific literature, which is the oldest form of biodiversity reporting. These data can be mined from the legacy literature but are largely “hidden” in non-semantic formats. In the future, advances in digital publishing will enable data to be more thoroughly linked to the literature, and vice-versa (
<xref rid="B11" ref-type="bibr">Penev et al. 2010</xref>
), thus laying the foundation for new best practices for citing datasets by means of identifiers.</p>
<p>In order to further progress on this critical issue, a group of biocollections and biodiversity informatics experts and stakeholders (Appendix
<xref ref-type="app" rid="App1">1</xref>
) assembled at the Stockholm Museum of Natural History, 25–26 October 2014 to lay out a set of recommendations and next steps for community-wide approaches to globally unique identifier assignment, persistence, and dereferencing. After the opening discussions and compiling of key identifier characteristics (Box
<xref ref-type="table" rid="T3">2</xref>
), the participants organized into four subgroups during the meeting: new biocollections, legacy biocollections, semantically enabled publications, and cross-cutting issues. In this paper we review the workshop results under those four headings and summarise consensus views on what should happen next.</p>
<table-wrap id="T3" orientation="portrait" position="float">
<label>Box 2.</label>
<caption>
<p>The main characteristics of identifier schemes are listed below. The list is not meant to be exhaustive but is intended to cover the major differences across different approaches.</p>
</caption>
<table frame="hsides" rules="all">
<tbody>
<tr>
<td rowspan="1" colspan="1">Identifier Schemes:
<list list-type="bullet">
<list-item>
<p>support
<bold>locally unique</bold>
(e.g., catalog numbers) and/or
<bold>globally unique</bold>
(e.g. DOIs, URLs or UUIDs) identifiers. Global uniqueness is vital to minimize ambiguity;</p>
</list-item>
<list-item>
<p>provide identifiers that are
<bold>actionable.</bold>
Actionable identifiers may rely on special knowledge (e.g. for LSIDs, DOIs, or HTTP services for plain identifiers) or they may rely on Internet standards (URIs);</p>
</list-item>
<list-item>
<p>may require resolvers to support access to the
<bold>object</bold>
and to its
<bold>metadata</bold>
; for example,
<bold>content negotiation</bold>
(e.g., used by Linked Open Data) supports the provision of a human-readable object in one context and machine-readable metadata (e.g., RDF, JSON) in another context; additionally,
<bold>inflections</bold>
(e.g., ARK) let an ordinary user append a suffix to the identifier to request the object or its metadata;</p>
</list-item>
<list-item>
<p>may use
<bold>centralized</bold>
(e.g.,
<ext-link ext-link-type="uri" xlink:href="http://purl.org">purl.org</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://doi.org">doi.org</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://n2t.net">n2t.net</ext-link>
) or
<bold>decentralized dereferencing hosts</bold>
(e.g., an institutional site);</p>
</list-item>
<list-item>
<p>may support
<bold>transparent identifiers</bold>
(e.g., identifier strings that contain information which can lead to semantic guesses by humans, such as collection numbers, collectors’ initials, or institutional names) or
<bold>opaque identifiers</bold>
, e.g. strings of letters and digits created by software (counter, UUID generator, Noid minter);</p>
</list-item>
<list-item>
<p>may come with
<bold>fees</bold>
for creation of an identifier (e.g. DOIs);</p>
</list-item>
<list-item>
<p>may come with
<bold>fees</bold>
for the use of the resolver; these fees, which affect
<bold>scalability</bold>
, are separate from the time and effort required of end-providers no matter which identifier scheme they use (object curation, disk storage, updating resolver data as the object moves, etc.);</p>
</list-item>
<list-item>
<p>may come with
<bold>metadata</bold>
requirements (e.g., DataCite DOIs) or guidelines; presence or absence of
<bold>citation</bold>
metadata can affect
<bold>visibility</bold>
;</p>
</list-item>
<list-item>
<p>may come with administrative tools for central identifier
<bold>registration</bold>
; besides recording metadata, registration enters identifiers into a database so that the resolver host can look them up and forward requests to the object’s current location; for example, user interfaces and APIs exist for EZID ARKs, DataCite DOIs, Handles, and PURLs.</p>
</list-item>
</list>
</td>
</tr>
</tbody>
</table>
</table-wrap>
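<p>Content negotiation, as listed in Box 2, means that one identifier URI can return different representations depending on what the client asks for. The sketch below is an illustration under stated assumptions rather than a description of any particular resolver: it only builds two unsent HTTP requests that differ in their Accept header, using Python's standard urllib.request module. The helper name build_requests and the choice of media types (text/html and application/rdf+xml) are assumptions, and whether a given host honours the header is provider dependent.</p>
<preformat preformat-type="code">
```python
from urllib.request import Request


def build_requests(identifier_uri):
    """Build two unsent requests for one URI that differ only in the Accept header."""
    # A browser-style request for the human-readable page.
    human = Request(identifier_uri, headers={"Accept": "text/html"})
    # A client asking for machine-readable RDF metadata about the same resource.
    machine = Request(identifier_uri, headers={"Accept": "application/rdf+xml"})
    return human, machine
```
</preformat>
<p>Passing either request object to urllib.request.urlopen would send it. The point of the pattern is that the identifier itself stays the same and only the requested representation changes.</p>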
</sec>
<sec>
<title>Application of Identifiers to Newly Collected Field Biocollections</title>
<p>Field biocollections are extraordinarily diverse and continue to grow in scope and scale with the advent of novel technologies such as environmental DNA analyses (e.g., metagenomics), and new continental field-based endeavors such as the National Ecological Observatory Network (NEON;
<ext-link ext-link-type="uri" xlink:href="http://www.neoninc.org/">http://www.neoninc.org/</ext-link>
) in the United States. Current practices in field collecting are highly heterogeneous and often based on traditional practices of local identifier assignment. Traditionally, “field numbers” are assigned prior to the specimen being fully accessioned. More permanent identifiers (which are also often only locally unique within an institution) are assigned when specimens are accessioned in a collection. In some cases, organizations and communities are already using globally unique identifier systems and even assigning permanent UUIDs for field collection objects while still in the field (as is planned by NEON). In contrast, the geology community has rallied around International Geo Sample Numbers (IGSNs;
<ext-link ext-link-type="uri" xlink:href="http://www.geosamples.org/igsnabout">http://www.geosamples.org/igsnabout</ext-link>
), which provide not just global uniqueness, but also minting authority, governance, and a set of services for resolving those numbers that are managed centrally. The lack of consistent practices in biological field sampling compared to what has been accomplished in geology is a lamentable drawback in biodiversity research.</p>
<p>The assignment of local identifiers (e.g., catalog numbers) to specimens for internal management purposes and for external referencing has been the standard practice
of biocollections for centuries. As long as humans need to communicate with other humans about specimens, this practice will (and should) continue. By themselves, however, such local identifiers ultimately lead to reduced value of specimens if they are used as the nexus to which all other derived, digitized data connect. The main problem is that local identifiers are not sufficient for linking data across the Internet; globally unique and persistent identifiers are a requirement for this. Thus, to maximize the value of specimens for both human-human communication and human-computer (as well as computer-computer) communication, globally unique identifiers should be issued to data objects together with local identifiers.</p>
<sec sec-type="Roadblocks">
<title>Roadblocks</title>
<p>Providing a chain of provenance for specimens and related data is a major challenge and has a set of roadblocks along multiple dimensions. Traditional field collecting methods are ingrained in many scientists. The informatics community needs to reach out more effectively and explain to scientists the limitations of existing workflows and why an identifier scheme built around global uniqueness is not only necessary from an
informatics standpoint, but would dramatically enhance the value of data for re-use, syntheses and analyses. Identifier solutions must support scientists’ current practices and create minimal burden during the collecting process. The solution should provide incentives for adoption, both in the field and in downstream information systems. In particular, effort is needed to ensure perpetuation of field-assigned identifiers through to more permanent data curation steps. Whatever underlying identifier system is chosen, it needs to be robust in preventing the same identifiers from being assigned to different objects (and, ideally, reducing circumstances where the same object receives multiple identifiers).</p>
<p>An additional roadblock is a lack of clarity as to which classes of objects, concepts, or events should be assigned identifiers. Should GUIDs be associated with the actual physical specimen or with its digital (e.g. images) or physical (e.g. tissues) derivatives? Focusing on biocollections specimens as material samples helps semantically clarify what bears the identifier, but many other modeling challenges, such as relating measurement processes to specimens, still remain. Even for physical specimens, there are challenges in defining the types of entities that can constitute a specimen, which range from a distinct organism to a part of an organism, to a set of organisms, to abiotic samples containing specimens (e.g., a jar of seawater).</p>
</sec>
<sec sec-type="Next Steps">
<title>Next Steps</title>
<p>For newly collected samples, a highly desirable next step is the ability to assign globally unique identifiers directly to newly collected specimens or mixed samples in the field or shortly thereafter. In many cases, it may be desirable that these identifiers be pre-minted and written into a physical barcode or QR-Code, perhaps in conjunction with a human-friendly identifier. Figures
<xref ref-type="fig" rid="F1">1</xref>
and
<xref ref-type="fig" rid="F2">2</xref>
show different examples, the first depicting mass-labeling of tubes associated with collections samples and the second representing a traditional biocollections object. Assigning GUIDs to specimens at the time of collection allows field researchers to publish references to recently collected specimens without waiting for institutional identifiers that are assigned during the accession process. Beyond simply assigning unique identifiers in the field, it is critical that these identifiers persist perpetually with the objects they identify and all descendant samples, subsamples, analyses, data and publications referring to them, ensuring an unbroken chain of data provenance. Ideally, identifiers assigned in the field are retained as the permanent institutional identifiers during accessioning.</p>
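<p>Pre-minting identifiers before fieldwork can be sketched in a few lines. The Python example below is only an illustration, assuming the label layout described in the caption of Figure 1 (a 5-character prefix plus a 5-digit number, cross-linked to a UUID). The function name premint_labels and the record keys are hypothetical, and encoding each UUID as a QR-Code is left to whatever label-printing tool is in use.</p>
<preformat preformat-type="code">
```python
import uuid


def premint_labels(prefix, start, count):
    """Pair each human-readable label with a freshly generated random UUID."""
    if len(prefix) != 5:
        raise ValueError("prefix must be exactly 5 characters")
    if 0 > start or start + count - 1 > 99999:
        raise ValueError("label numbers must fit in 5 digits")
    records = []
    for number in range(start, start + count):
        records.append({
            # Opaque, globally unique identifier intended for the QR-Code.
            "uuid": str(uuid.uuid4()),
            # Human-readable code printed beside the QR-Code.
            "label": f"{prefix}{number:05d}",
        })
    return records
```
</preformat>
<p>Storing the returned records in the data management system before the labels are printed keeps the human-readable code and the UUID permanently cross-linked, so that either one can be used to look up the other later.</p>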
<fig id="F1" orientation="portrait" position="float">
<label>Figure 1.</label>
<caption>
<p>Example of UUIDs embedded within QR-Codes on microcentrifuge tube labels. The 5 mm × 5 mm QR-Codes (Version 2) are printed with a standard laser printer on sheets of self-adhesive 9 mm dots, and scan reliably with a standard barcode reader, while still providing room for a human-readable 5-character prefix + 5-digit number (the human-readable number and UUID are permanently cross-linked in the data management system). Photo: Robert K. Whitton.</p>
</caption>
<graphic id="oo_41586.jpg" xlink:href="zookeys-494-133-g001"></graphic>
</fig>
<fig id="F2" orientation="portrait" position="float">
<label>Figure 2.</label>
<caption>
<p>Example of a PURL-URI as a QR-Code, in this example attached to a digitised lichen type specimen in the Natural History Museum, University of Oslo. The QR-Code corresponds to
<ext-link ext-link-type="uri" xlink:href="http://purl.org/nhmuio/id/c1a8b878-a4f9-448b-be00-26cbad58b11c">http://purl.org/nhmuio/id/c1a8b878-a4f9-448b-be00-26cbad58b11c</ext-link>
.</p>
</caption>
<graphic id="oo_41587.jpg" xlink:href="zookeys-494-133-g002"></graphic>
</fig>
<p>It is not feasible (or, at this stage, even desirable) for the entire biodiversity community to adopt a single implementation for identifiers. However, evaluation of the available technical solutions is a high priority, and the scope of solutions includes IGSNs, DOIs, EZID ARKs, LOD-URIs and UUIDs (comparisons among many of the different options are shown in Table
<xref ref-type="table" rid="T4">2</xref>
and a comparison of more or less centrally managed mapping and redirection services is shown in Figure
<xref ref-type="fig" rid="F3">3</xref>
). The group explored
several different viewpoints promoting the utilization of HTTP URIs for all identifiers and did not reach a consensus. HTTP URIs have the advantage that they provide a semantic web compatible default dereferencing method through the standard HTTP protocol and can be flexibly constructed (
<xref rid="B8" ref-type="bibr">Hagedorn et al. 2013</xref>
). The advantage of identifiers that are not HTTP URIs is that omitting a default dereferencing method avoids potential confusion and may allow for even greater flexibility. However, we recommend that all identifiers be dereferenceable through at least one HTTP-based service, even if the HTTP form is not preferred.</p>
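<p>One way to read the recommendation above is that every identifier should have at least one HTTP form, even when that form is not the primary one. The Python sketch below illustrates this under stated assumptions: it uses the centralized hosts doi.org and n2t.net named in Box 2 for DOIs and ARKs, and a caller-supplied base URL for everything else. The function name to_http_uri and the local_base parameter are hypothetical, and the sketch does not check that the resulting URI actually resolves.</p>
<preformat preformat-type="code">
```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"
ARK_RESOLVER = "https://n2t.net/"


def to_http_uri(identifier, local_base=None):
    """Return an HTTP form of an identifier without changing the identifier itself."""
    text = identifier.strip()
    lowered = text.lower()
    if lowered.startswith(("http://", "https://")):
        return text  # already an HTTP URI
    if lowered.startswith("doi:"):
        text, lowered = text[4:], lowered[4:]
    if lowered.startswith("10."):
        return DOI_RESOLVER + quote(text, safe="/")
    if lowered.startswith("ark:"):
        return ARK_RESOLVER + quote(text, safe="/:")
    if local_base is not None:
        # Assumption: the provider runs its own resolver under local_base.
        return local_base.rstrip("/") + "/" + quote(text, safe="")
    raise ValueError("no HTTP-based service known for this identifier")
```
</preformat>
<p>Keeping the mapping in a separate function reflects the decoupled view discussed above: the stored identifier remains a plain DOI, ARK, or UUID, and the HTTP form is derived only when a client needs to dereference it.</p>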
<fig id="F3" orientation="portrait" position="float">
<label>Figure 3.</label>
<caption>
<p>Identifier schemes differ in whether redirections and mappings to ensure stability are centrally managed or not. Top: a DOI dereferencing service like CrossRef or DataCite redirects to the actual content provider; the URIs of content data and RDF metadata are publicly visible and can be used as independent (albeit often unstable) identifiers. Bottom: A linked open data pattern, where each content provider assumes the responsibility for maintaining a stable mapping; the content negotiation is internal. Modified after
<xref rid="B7" ref-type="bibr">Hagedorn 2013</xref>
.</p>
</caption>
<graphic id="oo_41588.jpg" xlink:href="zookeys-494-133-g003"></graphic>
</fig>
<table-wrap id="T4" orientation="portrait" position="float">
<label>Table 2.</label>
<caption>
<p>Identifier schemes according to key characteristics noted in part in Box
<xref ref-type="table" rid="T3">2</xref>
.</p>
</caption>
<table frame="hsides" rules="all">
<tbody>
<tr>
<th rowspan="1" colspan="1" style="background-color:#D5D5D5;">Identifier characteristics</th>
<th rowspan="1" colspan="1" style="background-color:#D5D5D5;">DataCite DOI</th>
<th rowspan="1" colspan="1" style="background-color:#D5D5D5;">EZID ARK</th>
<th rowspan="1" colspan="1" style="background-color:#D5D5D5;">OCLC PURL</th>
<th rowspan="1" colspan="1" style="background-color:#D5D5D5;">Self-minted HTTP URI
<xref ref-type="table-fn" rid="TN1">*</xref>
</th>
<th rowspan="1" colspan="1" style="background-color:#D5D5D5;">LSID</th>
<th rowspan="1" colspan="1" style="background-color:#D5D5D5;">DwC Triplet</th>
<th rowspan="1" colspan="1" style="background-color:#D5D5D5;">UUID</th>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Globally Unique</bold>
</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">no</td>
<td rowspan="1" colspan="1">yes</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Service Metadata Required for global uniqueness</bold>
</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">no</td>
<td rowspan="1" colspan="1">no</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Per-identifier Cost</bold>
</td>
<td rowspan="1" colspan="1">per id or subscription fee</td>
<td rowspan="1" colspan="1">yearly subscription fee</td>
<td rowspan="1" colspan="1">free</td>
<td rowspan="1" colspan="1">free</td>
<td rowspan="1" colspan="1">free</td>
<td rowspan="1" colspan="1">free</td>
<td rowspan="1" colspan="1">free</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Identifier Issuance</bold>
</td>
<td rowspan="1" colspan="1">registration</td>
<td rowspan="1" colspan="1">registration
<xref ref-type="table-fn" rid="TN2">**</xref>
</td>
<td rowspan="1" colspan="1">registration</td>
<td rowspan="1" colspan="1">local</td>
<td rowspan="1" colspan="1">local</td>
<td rowspan="1" colspan="1">local</td>
<td rowspan="1" colspan="1">local</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Human-Friendly</bold>
</td>
<td rowspan="1" colspan="1">provider dependent</td>
<td rowspan="1" colspan="1">provider dependent</td>
<td rowspan="1" colspan="1">provider dependent</td>
<td rowspan="1" colspan="1">provider dependent</td>
<td rowspan="1" colspan="1">provider dependent</td>
<td rowspan="1" colspan="1">high</td>
<td rowspan="1" colspan="1">low</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Opacity</bold>
</td>
<td rowspan="1" colspan="1">partial</td>
<td rowspan="1" colspan="1">partial</td>
<td rowspan="1" colspan="1">partial</td>
<td rowspan="1" colspan="1">provider dependent</td>
<td rowspan="1" colspan="1">provider dependent</td>
<td rowspan="1" colspan="1">low</td>
<td rowspan="1" colspan="1">high</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Adoption by biodiversity informatics community</bold>
</td>
<td rowspan="1" colspan="1">biodiversity publishing</td>
<td rowspan="1" colspan="1">low</td>
<td rowspan="1" colspan="1">low</td>
<td rowspan="1" colspan="1">high</td>
<td rowspan="1" colspan="1">low</td>
<td rowspan="1" colspan="1">collections community</td>
<td rowspan="1" colspan="1">variable</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Adoption by broader informatics infrastructures</bold>
</td>
<td rowspan="1" colspan="1">variable</td>
<td rowspan="1" colspan="1">low</td>
<td rowspan="1" colspan="1">variable</td>
<td rowspan="1" colspan="1">high</td>
<td rowspan="1" colspan="1">low</td>
<td rowspan="1" colspan="1">low</td>
<td rowspan="1" colspan="1">high</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Dereferencing Service Integration</bold>
</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">no</td>
<td rowspan="1" colspan="1">no</td>
</tr>
<tr>
<td rowspan="1" colspan="1" style="background-color:#D5D5D5;">
<bold>Dereferencing Characteristics</bold>
</td>
<td rowspan="1" colspan="1" style="background-color:#D5D5D5;"></td>
<td rowspan="1" colspan="1" style="background-color:#D5D5D5;"></td>
<td rowspan="1" colspan="1" style="background-color:#D5D5D5;"></td>
<td rowspan="1" colspan="1" style="background-color:#D5D5D5;"></td>
<td rowspan="1" colspan="1" style="background-color:#D5D5D5;"></td>
<td rowspan="1" colspan="1" style="background-color:#D5D5D5;"></td>
<td rowspan="1" colspan="1" style="background-color:#D5D5D5;"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Dereferencing Type</bold>
</td>
<td rowspan="1" colspan="1">central</td>
<td rowspan="1" colspan="1">central</td>
<td rowspan="1" colspan="1">central</td>
<td rowspan="1" colspan="1">distributed</td>
<td rowspan="1" colspan="1">distributed</td>
<td rowspan="1" colspan="1">N/A</td>
<td rowspan="1" colspan="1">N/A</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Structured Identifier Responses directly from resolver</bold>
<xref ref-type="table-fn" rid="TN3">***</xref>
</td>
<td rowspan="1" colspan="1">HTML, RDF/XML</td>
<td rowspan="1" colspan="1">HTML</td>
<td rowspan="1" colspan="1">HTML</td>
<td rowspan="1" colspan="1">provider dependent</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">N/A</td>
<td rowspan="1" colspan="1">N/A</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Redirection</bold>
</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">possible</td>
<td rowspan="1" colspan="1">possible</td>
<td rowspan="1" colspan="1">N/A</td>
<td rowspan="1" colspan="1">N/A</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Clear Namespace policy and contract</bold>
</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">no</td>
<td rowspan="1" colspan="1">no</td>
<td rowspan="1" colspan="1">no</td>
<td rowspan="1" colspan="1">N/A</td>
<td rowspan="1" colspan="1">N/A</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Resolution service backed by institutions</bold>
</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">yes</td>
<td rowspan="1" colspan="1">no</td>
<td rowspan="1" colspan="1">provider dependent</td>
<td rowspan="1" colspan="1">no</td>
<td rowspan="1" colspan="1">
<xref ref-type="table-fn" rid="TN4">****</xref>
</td>
<td rowspan="1" colspan="1">
<xref ref-type="table-fn" rid="TN4">****</xref>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="TN1">
<label>*</label>
<p>Self-minted HTTP URIs may include ARKs or PURLs as well</p>
</fn>
<fn id="TN2">
<label>**</label>
<p>ARKs have special mechanisms to extend scalability</p>
</fn>
<fn id="TN3">
<label>***</label>
<p>Structured metadata responses may be available after redirection, depending on the provider (e.g.
<ext-link ext-link-type="uri" xlink:href="http://dublincore.org">dublincore.org</ext-link>
returns RDF/XML for PURLs)</p>
</fn>
<fn id="TN4">
<label>****</label>
<p>Perhaps, if hosted by a general service (e.g. GrBio for Biocollections, GBIF for occurrence records, etc.)</p>
</fn>
</table-wrap-foot>
</table-wrap>
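<p>Two rows of the table, global uniqueness and local issuance, can be illustrated with a short sketch. It assumes the common colon-separated form of a Darwin Core triplet and uses made-up institution, collection and catalog codes. A UUID is minted locally with no registry, whereas two institutions that happen to share an acronym can produce the same triplet for different specimens.</p>
<preformat>
```python
import uuid


def dwc_triplet(institution_code, collection_code, catalog_number):
    """Join the three Darwin Core terms into a colon-separated triplet."""
    return ":".join([institution_code, collection_code, catalog_number])


# Two different specimens held by two (made-up) institutions that share the
# acronym "XYZ": the triplets collide, so the triplet is not globally unique.
triplet_a = dwc_triplet("XYZ", "Herps", "1234")
triplet_b = dwc_triplet("XYZ", "Herps", "1234")

# UUIDs are minted locally, without any registration step, and two calls
# give distinct values with overwhelming probability.
guid_a = uuid.uuid4()
guid_b = uuid.uuid4()

print(triplet_a == triplet_b)  # the two triplets are the same string
print(guid_a.urn)              # opaque, low human-friendliness
print(guid_a != guid_b)
```
</preformat>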
<p>The group strongly suggested that an immediate next step would be to prototype solutions for creating persistent identifiers on different existing platforms. Stakeholders would be engaged in testing these prototypes and providing feedback to refine them. The prototypes could also spawn key actions, including more focused workshops or hackathons, perhaps in the context of Taxonomic Databases Working Group (TDWG) meetings, with the goal of reporting the outputs of such trials. TDWG, in particular, is a crucial stakeholder as an international standards organization for biodiversity objects and data.</p>
<p>Scaling up to a larger system will require obtaining funding to support development. A fruitful path would be to align a few organizations that are working nationally or globally (e.g., NEON, iPlant (
<ext-link ext-link-type="uri" xlink:href="http://iplant.org">http://iplant.org</ext-link>
), iDigBio, Critical Zone Observatories, Consortium of European Taxonomic Facilities) to adopt an early version of the system and to demonstrate, as an outcome, interoperability and an enhanced ability to track specimens and their derivatives. For those in the long tail of the specimen curation process, such as smaller biocollections or individual labs, incentives for adopting a system to replace the local numbering systems currently in use could help coalesce efforts, and could further promote the value of such approaches when preparing data management plans (DMPs) for funding agencies. In particular, identifier-specific DMP Tool (
<ext-link ext-link-type="uri" xlink:href="https://dmptool.org/">https://dmptool.org/</ext-link>
) template content should be provided. Finally, with the strong growth of handheld devices, the biodiversity informatics community should work to produce tools for assigning identifiers with such devices.</p>
<p>A more detailed implementation proposal could be specified just for field collections, as part of a TDWG task group, leading to a community input and review process. This would be one key part of a larger effort to identify and reach out to national and international stakeholder groups, including collection managers, aggregators, publishers, scientists, funding agencies, downstream users of the data, and developers of software (e.g., Specify,
<ext-link ext-link-type="uri" xlink:href="http://specifyx.specifysoftware.org/">http://specifyx.specifysoftware.org/</ext-link>
; Symbiota,
<ext-link ext-link-type="uri" xlink:href="http://symbiota.org/docs/">http://symbiota.org/docs/</ext-link>
; and in-house software used by aggregators such as GBIF).</p>
</sec>
</sec>
<sec>
<title>Application of Identifiers to Legacy Data</title>
<p>Legacy specimens can be defined as material already stored in collections. The identifiers being considered here are those referring to collection objects, which may or may not persist in the collections (e.g., living collections, tissue samples for DNA extraction, ecological specimens). A single physical collection object is a curatorial unit, which may represent only a part of a larger thing (e.g., mammal skeleton, fur, tissues), or may be an aggregate (e.g., lots, fossils with multiple organisms, herbarium sheets with multiple specimens). When aggregates are split (e.g., multiple taxa split
into different lots, parasites found on organisms, tissue samples removed), the original identifier generally relates to one of the elements and a new identifier is issued for the additional elements.</p>
<p>Most scientific journals, and even GenBank (
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/genbank/">http://www.ncbi.nlm.nih.gov/genbank/</ext-link>
), make only vague recommendations about citing voucher specimens. The legacy identifier commonly used in the literature for botanical specimens over the past several centuries is the collector’s name and collecting/field number, which often represents the collector’s personal series number. The legacy identifier commonly used for zoological specimens is the institution acronym/catalog number. For example, the American Society of Mammalogists makes the following recommendation for the Journal of Mammalogy (
<ext-link ext-link-type="uri" xlink:href="http://www.mammalsociety.org/uploads/JM%20Author%20Instructions.pdf">http://www.mammalsociety.org/uploads/JM%20Author%20Instructions.pdf</ext-link>
):</p>
<p>
<styled-content style="padding-left:20px; text-indent:40px; display:block;">“All DNA sequences must be submitted to GenBank, and accession numbers provided in the manuscript before publication. Museum catalogue numbers for all voucher specimens (including associated tissue) examined must be included in the manuscript (in an Appendix if numerous).”</styled-content>
</p>
<sec sec-type="Roadblocks">
<title>Roadblocks</title>
<p>The single key roadblock with legacy data is the use of local identifiers at all steps during the collection and accessioning process. While these provide means for local provenance tracking, they are insufficient for managing across collections, and are hard to adapt and scale to an open platform for data discovery such as the Internet. A classic example is botanical “duplicates” that come from the same collecting event, where different clippings of the same plant were sent to multiple museums. Similar issues can be found for cases where specimens were gifted from one collection to another. In these cases, linkages associating those specimens across collections were typically severed when biocollections were accessioned independently into institutional museum repositories. Those past associations can only be inferred from re-compiling data and looking for content-level matches related to the collecting events.</p>
<p>Further, because most collections have effectively developed local curatorial practices, often based on regional and taxon-specific approaches, there is a wide variety of different legacy identifiers associated with specimens and their data. In sum, current practices were and remain highly heterogeneous, and the information that could re-associate specimens across collections is lost, a problem that cannot be solved simply via post-hoc application of new GUIDs. Thus, the challenge with legacy collections is twofold: managing identifiers already in use and dealing with the potential application of new ones.</p>
<p>
<bold>Next Steps:</bold>
As a pragmatic matter, the immediate next steps for legacy collections may not include broad application of globally unique identifiers. Instead, a short-term next step is for biodiversity informaticians and collections staff to work together to standardize practices for assigning unique identifiers that are persistent (remain tightly associated with the objects they identify) and stable (continue to be actionable). At a minimum, institutions should clarify the identifier scheme being used locally via their own internal policies. Further development of community-wide best practices would be more effective because they would not only foster local curatorial practice, but also specify how those locally curated materials and their data eventually become part of the rapidly coalescing global, digital framework. These best practices need to be developed in the context of existing efforts and/or organizations such as the Global Registry of Biorepositories (GRBio;
<ext-link ext-link-type="uri" xlink:href="http://grbio.org/">http://grbio.org/</ext-link>
), which provides a needed framework for publishing repository-specific information like standard acronyms for institutions and collections. Curators should register their collections in GRBio and specify the adopted identifier scheme for the collection.</p>
<p>The legacy group also considered medium-term and longer-term goals, focusing more on broad informatics solutions than local identifier curation practices. One critical step is to assemble identifiers published by curators to aggregators such as GBIF and to assess identifier heterogeneity. This can feed into developing software for comparing identifiers (e.g., resolvers) that is better able to perform fuzzy matching on identifier strings (and fetch such variations), given that identifiers are sometimes expressed in unintended ways (e.g., added spaces or hyphens, capitalization, etc.). Using just a simple string comparison is insufficient and more robust systems should be set in place, which will then forward to the correct identifier. The same applies to whether a URI prefix should be part of the identifier or not.</p>
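<p>The kind of matching described above can be sketched as follows. This is an illustration under stated assumptions, not an existing resolver: the list of prefixes, the set of separator characters and the function names are hypothetical. Each identifier is reduced to a comparison key that ignores case, spacing, punctuation and an optional URI prefix, and a lookup forwards a variant spelling to the canonical identifier(s) sharing that key.</p>
<preformat>
```python
import re

# Assumed prefixes that may or may not be written as part of an identifier.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "http://dx.doi.org/",
    "doi:",
    "urn:uuid:",
)


def normalize(identifier):
    """Reduce an identifier string to a loose comparison key."""
    ident = identifier.strip().lower()
    for prefix in KNOWN_PREFIXES:
        if ident.startswith(prefix):
            ident = ident[len(prefix):]
            break
    # Drop whitespace and common separators (hyphen, underscore, dot, colon, slash).
    return re.sub(r"[\s\-_.:/]+", "", ident)


def build_index(canonical_ids):
    """Map each comparison key to the canonical identifiers that share it."""
    index = {}
    for cid in canonical_ids:
        index.setdefault(normalize(cid), []).append(cid)
    return index


def resolve(query, index):
    """Return the canonical identifiers matching a possibly mangled query."""
    return index.get(normalize(query), [])


# Made-up identifiers, for illustration only.
index = build_index(["XYZ:Herps:1234", "doi:10.1234/example"])
print(resolve("xyz herps-1234", index))
print(resolve("https://doi.org/10.1234/EXAMPLE", index))
```
</preformat>
<p>Because such a key is deliberately lossy, two distinct identifiers can share it; the lookup therefore returns a list of candidates rather than a single value.</p>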
<p>Next, in order to avoid broken URIs, institution-independent resolvers (e.g.,
<ext-link ext-link-type="uri" xlink:href="http://purl.org">purl.org</ext-link>
) or aggregators (e.g., GBIF) should check dereferenceable URIs at certain intervals and inform the responsible contact person when the target URIs return a 404 HTTP status code or are otherwise unavailable. Some providers, such as CrossRef (
<ext-link ext-link-type="uri" xlink:href="http://www.crossref.org/">http://www.crossref.org/</ext-link>
), offer services for policing broken URIs. With regard to the data records associated with specimens and published to aggregators such as GBIF, the legacy identifiers group strongly argued for the longer-term goal of inclusion of proper GUIDs in the occurrenceID (or materialSampleID) Darwin Core (
<xref rid="B22" ref-type="bibr">Wieczorek et al. 2012</xref>
) field, rather than some sort of concatenation of local identifiers, such as a Darwin Core triplet (
<xref rid="B6" ref-type="bibr">Guralnick et al. 2014</xref>
). Finally, we strongly encourage integration of identifier metadata into existing standard schemas (e.g., Darwin Core, ABCD;
<ext-link ext-link-type="uri" xlink:href="http://www.tdwg.org/activities/abcd/">http://www.tdwg.org/activities/abcd/</ext-link>
) as new concepts. Such metadata would include information regarding identifiers, persistence, rules for attribution (use, citation, reference), etc., as is also discussed further in the “cross-cutting solutions” section.</p>
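<p>The periodic check of dereferenceable URIs suggested above might look like the following sketch. The function names, the mapping from URI to contact person and the rule that any status of 400 or above counts as broken are assumptions made for illustration. The status lookup is passed in as a parameter so that the reporting logic can be exercised without network access.</p>
<preformat>
```python
import urllib.error
import urllib.request


def http_status(uri, timeout=10):
    """Return the HTTP status code for a HEAD request, or None if unreachable."""
    request = urllib.request.Request(uri, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, timeout, refused connection, ...


def find_broken(uri_contacts, fetch=http_status):
    """Group broken URIs by the contact person responsible for them.

    uri_contacts maps each URI to a contact; a URI counts as broken when it
    is unreachable or answers with a status of 400 or above.
    """
    report = {}
    for uri, contact in uri_contacts.items():
        status = fetch(uri)
        if status is None or status >= 400:
            report.setdefault(contact, []).append((uri, status))
    return report


# Offline illustration with canned statuses instead of real requests.
canned = {
    "http://example.org/a": 200,
    "http://example.org/b": 404,
    "http://example.org/c": None,
}
contacts = {uri: "curator@example.org" for uri in canned}
print(find_broken(contacts, fetch=canned.get))
```
</preformat>
<p>In practice such a check would run on a schedule and hand the report to whatever notification channel the resolver or aggregator already uses.</p>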
<p>The legacy biocollections group developed a list of immediate action items to most efficiently take the steps listed above. As a priority list, these include:</p>
<list list-type="order">
<list-item>
<p>Assemble current identifiers from aggregate data as a means to determine current practices. Some of this has already been accomplished by
<xref rid="B6" ref-type="bibr">Guralnick et al. (2014)</xref>
to evaluate Darwin Core Triplets and their current use as identifiers in different systems (e.g., VertNet,
<ext-link ext-link-type="uri" xlink:href="http://vertnet.org">http://vertnet.org</ext-link>
; Barcode of Life Data Systems,
<ext-link ext-link-type="uri" xlink:href="http://www.boldsystems.org/">http://www.boldsystems.org/</ext-link>
; GenBank), but further work focusing on GBIF datasets is needed. A critical assessment of current implementations will feed into the next step of generating more informed best practices or appropriate strategies that individual institutions can adopt based on their current application of GUIDs.</p>
</list-item>
<list-item>
<p>Create best practice documentation on known identifier minting schemes. Document best practices with use cases, examples, and pros and cons.</p>
</list-item>
<list-item>
<p>As in the new field-collected biocollections group, there is a need to further clarify what exactly is being identified - MaterialSample vs. Organism vs. Occurrence; physical object vs. digital representation.</p>
</list-item>
<list-item>
<p>Clearly define the scope of the proposed identifying scheme and what benefits can be gained by it.</p>
</list-item>
<list-item>
<p>Demonstrate the implications for publishing in the primary literature.</p>
</list-item>
</list>
</sec>
</sec>
<sec>
<title>Application of Biodiversity Data Identifiers In Publishing</title>
<p>Scientific publications are at the core of science communication and still one of the most powerful means for researchers to share their findings. Biodiversity-oriented publications, including historical ones dating from the time of Linnaeus and before, provide one of the most important sources of data and information, along with the means to quantitatively assess the impact of biocollections, institutions, and taxonomic groups. This enormous resource ultimately provides needed content for museums worldwide in their efforts to
secure continued funding for preserving and digitizing their specimen collections. Although the legacy literature is an essential resource and ultimate home for data derived from biocollections, it remains difficult to mine data from it, and provide the means to cite or track data usage. In the 21st century, these problems magnify as new digital systems are built to support registration of new data and provisioning of older content. By maintaining the currently prevailing model of publishing biodiversity information in formats not readable by machine or not readily harvestable, such as paper or PDF, we further impede efforts to convert data into fluid formats that support new science. One of the solutions to the problem is the wide adoption of identifiers for different data elements normally present in biodiversity publications. We present a set of use cases that would strongly benefit from a system of globally unique identifiers:</p>
<list list-type="order">
<list-item>
<p>Use of identifiers for handling data across a registry (e.g., ZooBank), a publisher (e.g., Pensoft;
<ext-link ext-link-type="uri" xlink:href="http://www.pensoft.net">http://www.pensoft.net</ext-link>
) and a data aggregator (e.g., Plazi;
<ext-link ext-link-type="uri" xlink:href="http://plazi.org">http://plazi.org</ext-link>
), thus providing linkages between all three.</p>
</list-item>
<list-item>
<p>Use of DOI identifiers for legacy literature allowing full citations from specimens to formal taxon treatments to other publications and vice versa.</p>
</list-item>
<list-item>
<p>Enabling of impact tracking of biological specimens, collections, institutions, and biodiversity data across journal articles.</p>
</list-item>
<list-item>
<p>Managing of information about specimens (e.g., occurrence records) in a similar way to publication and citation of data in the scholarly literature. For example, there is no current method to import (e.g., through an API) specimen records from resources such as GBIF into manuscripts, and ensure proper provenance and citations of these.</p>
</list-item>
<list-item>
<p>Import and citing of specimen records in publications with their own identifiers generated by the primary data providers or by aggregators (e.g., VertNet, GBIF, iDigBio;
<ext-link ext-link-type="uri" xlink:href="http://idigbio.org">http://idigbio.org</ext-link>
), paving the way to a wide array of future re-uses, including automated tracking of data usage and impact metrics.</p>
</list-item>
<list-item>
<p>Reconciliation of specimen label data with collection records published in literature (e.g., for transcription purposes or usage tracking of collections data) via the identifiers as a needed mechanism for linkage.</p>
</list-item>
<list-item>
<p>Aggregation of Web content from biodiversity data contained in publications. For example, articles that benefit from semantic markup allow for parsing and linking of independently published biodiversity data.</p>
</list-item>
<list-item>
<p>Use of identifiers to reference needed evidence: “In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited” (
<ext-link ext-link-type="uri" xlink:href="http://www.force11.org/datacitation">http://www.force11.org/datacitation</ext-link>
, principle 3).</p>
</list-item>
</list>
<sec sec-type="Roadblocks">
<title>Roadblocks</title>
<p>The difficulties in managing, tracking, and extracting citations at large scale from any sources other than traditional publications are, in part, due to the paucity of widely adopted, persistent, globally unique and resolvable biodiversity data identifiers. Additionally, extracting specimen, taxon, and other biodiversity data from modern scholarly publications with unstructured formats and little to no markup is needlessly challenging. Another major obstacle is that information about specimens might be published in different places and with different levels of granularity. For example, a specimen might be cited as a holotype in a protologue, then georeferenced and published again in subsequent revisions, perhaps even under a synonym, with images and DNA data appearing separately in other publications. Unless the original specimen collection number is used consistently across all publications, it is difficult, if not impossible, to link together all the important digital derivatives independently generated from that specimen.</p>
<p>A final roadblock is the lack of adoption of advanced publishing approaches, including semantic markup, by almost all publishers in this domain. The TaxPub/Journal Archival Tag Suite (
<xref rid="B3" ref-type="bibr">Catapano 2010</xref>
,
<xref rid="B11" ref-type="bibr">Penev et al. 2010</xref>
,
<xref rid="B12" ref-type="bibr">2012</xref>
) provides all the necessary functionality and has been successfully implemented by Pensoft in 14 journals, including the registration of their articles in PubMed and PubMedCentral. However, it places the burden on publishers to adopt new technical approaches, a demand that is difficult to meet given the lack of resources and of strong incentives for change.</p>
</sec>
<sec sec-type="Next Steps">
<title>Next Steps</title>
<p>The key next step is to establish the best practices to generate and assign identifiers as they either propagate from biocollections into the literature or are created during semantically enabled publishing processes. Such practices will assure that publications follow a set of principles ratified by various stakeholders and governments, and perhaps best described broadly in the Force11 data citation principles (
<ext-link ext-link-type="uri" xlink:href="https://www.force11.org/datacitation">https://www.force11.org/datacitation</ext-link>
), and more directly for the biodiversity community in the Bouchout Declaration (
<xref rid="B2" ref-type="bibr">Bouchout Declaration 2014</xref>
,
<ext-link ext-link-type="uri" xlink:href="http://bouchoutdeclaration.org/">http://bouchoutdeclaration.org/</ext-link>
). Tools are needed to retrieve identifiers assigned to biological names, to taxonomic treatments associated with a name, and to specimen data discovered in the published record and/or stored in domain-specific databases.</p>
<p>Below we summarize critical practices and principles for the use of identifiers in semantically enhanced publications:</p>
<list list-type="order">
<list-item>
<p>Publishers should use GUIDs for formally cited or potentially relevant data (e.g., authors, books, articles, taxon names, taxonomic treatments, gene sequences, specimens, etc.) maintained in well-established and widely used external registries.</p>
</list-item>
<list-item>
<p>Publishers should issue GUIDs for data first made widely available through document publication (e.g., observation on a species published by an amateur naturalist with no GUID issued by or associated with an Institution).</p>
</list-item>
<list-item>
<p>Publishers should provide both human- and machine-readable content (
<xref rid="B18" ref-type="bibr">Starr et al. 2015</xref>
) through resolvable GUIDs for separate elements of an article (e.g., individual images, graphs, tables, supplementary materials, taxonomic treatments, checklists, etc.).</p>
</list-item>
<list-item>
<p>Resolvable GUIDs should be used as widely as possible to annotate published content; for example, the addition of a species to a published checklist should be identified by a GUID, which can be linked to the exact “place” within the published text (e.g., between two species in the checklist).</p>
</list-item>
<list-item>
<p>Publishers should use GUIDs and authority files for authors, e.g., ORCID (
<ext-link ext-link-type="uri" xlink:href="http://orcid.org/">http://orcid.org/</ext-link>
), VIAF (
<ext-link ext-link-type="uri" xlink:href="http://viaf.org/">http://viaf.org/</ext-link>
), authors of plant names (
<ext-link ext-link-type="uri" xlink:href="http://www.kew.org/data/authors.html">http://www.kew.org/data/authors.html</ext-link>
), ZooBank authors (
<ext-link ext-link-type="uri" xlink:href="http://zoobank.org">http://zoobank.org</ext-link>
) or internal systems that unambiguously identify names of authors.</p>
</list-item>
<list-item>
<p>For the conversion of legacy literature, assign GUIDs to relevant elements that are widely used, resolve to content (e.g., articles, treatments, observation records) and can be a source for Linked Open Data. Whenever possible, use an existing identifier service (such as Plazi for treatments), rather than minting additional identifiers.</p>
</list-item>
<list-item>
<p>The identifier system(s) should be sustainable for the long term, highly reliable, and have an API as a backbone service.</p>
</list-item>
<list-item>
<p>We note a preference for identifiers used by indexing services (while such services use many kinds of identifiers, CrossRef and DataCite (
<ext-link ext-link-type="uri" xlink:href="http://datacite.org">http://datacite.org</ext-link>
) DOIs are the most commonly used). Publishers should link data related to an article and the article itself through their GUIDs (e.g., via the CrossRef and DataCite DOI cross-referencing services).</p>
</list-item>
<list-item>
<p>Identifiers and their metadata related to annotations in publications should be housed and made available by an independent party.</p>
</list-item>
</list>
<p>We discussed how systems can be built around identifiers that support all the different participants involved in publishing. Authors are critical participants and should be better able to cite usage of their data from semantically enhanced, rather than unstructured, formats. Publishers can assist authors by making all published data linkable/citable and contributing to specialized databases and/or permanent repositories (e.g., Dryad (
<ext-link ext-link-type="uri" xlink:href="http://datadryad.org/">http://datadryad.org/</ext-link>
) or the Biodiversity Literature Repository (
<ext-link ext-link-type="uri" xlink:href="https://zenodo.org/collection/user-biosyslit">https://zenodo.org/collection/user-biosyslit</ext-link>
)). Publishers can also provide authoring tools (such as the Pensoft Writing Tool (PWT) used by the Biodiversity Data Journal – see
<xref rid="B17" ref-type="bibr">Smith et al. 2013</xref>
) that assist authors with entry of structured data (i.e., upfront pre-submission markup and easy data import into the manuscript) to which new or existing identifiers can be assigned or included. Hence, easy data download and export to aggregators from the published paper can be achieved.</p>
<p>To serve the broader community (i.e., beyond authors), publishers can also provide tools to find cited data (e.g.,
<ext-link ext-link-type="uri" xlink:href="http://refindit.org">http://refindit.org</ext-link>
, which searches across CrossRef, DataCite, Mendeley (
<ext-link ext-link-type="uri" xlink:href="http://www.mendeley.com">http://www.mendeley.com</ext-link>
), RefBank (
<ext-link ext-link-type="uri" xlink:href="http://refbank.org">http://refbank.org</ext-link>
), Global Names Usage Bank (
<ext-link ext-link-type="uri" xlink:href="http://www.globalnames.org/GNUB">http://www.globalnames.org/GNUB</ext-link>
), Biodiversity Heritage Library (
<ext-link ext-link-type="uri" xlink:href="http://www.biodiversitylibrary.org/">http://www.biodiversitylibrary.org/</ext-link>
), Biodiversity Literature Repository (
<ext-link ext-link-type="uri" xlink:href="https://zenodo.org/collection/user-biosyslit">https://zenodo.org/collection/user-biosyslit</ext-link>
) and others), as well as an ORCID lookup linked to data creators or owners. Contributing institutions can much more easily assess their institutional impact in biodiversity research output by tracking the usage of identifiers embedded in the articles, as well as better manage intellectual property. For example,
publishers can work with organizations such as GRBio to create identifiers for institutions so that all can be cited. Funding agencies can better argue that open access is not only a legal mandate but also maximizes their return on investment in terms of products made available to the public. One possible step forward is to create identifiers for funding agencies (e.g., FundRef;
<ext-link ext-link-type="uri" xlink:href="http://www.crossref.org/fundref/">http://www.crossref.org/fundref/</ext-link>
).</p>
</sec>
</sec>
<sec>
<title>Cross-Cutting Issues and Needs</title>
<p>On the second day of the workshop, a subgroup met to broadly consider cross-cutting issues and needs, given the complexity of semantically interlinked publishing, legacy data, new biocollections, and connections to ecological, biomedical, and climate datasets. The group noted that many needed solutions are described in detail by the Cool URIs W3C Interest Group Note (
<ext-link ext-link-type="uri" xlink:href="http://www.w3.org/TR/cooluris/">http://www.w3.org/TR/cooluris/</ext-link>
). In addition, the group suggested that promoting any particular approaches and standards apart from W3C efforts should be undertaken as part of the reinvigoration of the TDWG globally unique identifiers task group (
<ext-link ext-link-type="uri" xlink:href="http://www.tdwg.org/activities/guid/">http://www.tdwg.org/activities/guid/</ext-link>
). Because identifier concerns are cross-cutting and involve research scientists, collectors, curators, publishers, and downstream users, collaboration with additional organizations focused on care of collections, such as the Society for the Preservation of Natural History Collections, is needed. Shared responsibility among stakeholders can also break down barriers and enhance knowledge dissemination, helping to bridge the two worlds of physical and digital objects in curation of biocollections.</p>
<sec sec-type="Defining the Target of the Identifier">
<title>Defining the Target of the Identifier</title>
<p>Not all identifier schemes are unambiguous in declaring which identifier refers to an information resource and which to a physical object or abstract concept or event. For instance, an identifier referencing a photo of an eagle on a tree could be identifying the digital photo itself, a photographic print that was later scanned, a reference to the eagle as a physical specimen stored in a museum, the event of capturing the image, or a reference to an individual eagle that exists in nature. Distinguishing concepts such as “digital media”, “print media”, “individual”, and “specimen” is not trivial and ultimately relies on attaching formal descriptions from a biodiversity or biocollections ontology to the identified object. We encourage the use of the Darwin Core Basis Of Record term (
<ext-link ext-link-type="uri" xlink:href="http://rs.tdwg.org/dwc/terms/basisOfRecord">http://rs.tdwg.org/dwc/terms/basisOfRecord</ext-link>
) to describe the exact nature of the resource. There is a current proposal for tying values for the Basis Of Record term to ontology sources in the Biological Collections Ontology (
<xref rid="B21" ref-type="bibr">Walls et al. 2014</xref>
,
<xref rid="B5" ref-type="bibr">Deck et al. in press</xref>
) which will greatly help in clarifying the concepts underlying identified objects and their downstream use.</p>
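<p>As a minimal sketch of the idea, the following Python fragment attaches a Basis Of Record value to each identified object and rejects objects whose type is missing or unrecognized. The class name, the example.org identifiers, and the short list of allowed values are assumptions made for illustration only; a real profile would draw its values from the Darwin Core documentation or from ontology terms.</p>
<preformat>
```python
from dataclasses import dataclass, field

# Term IRI as cited in the text above.
BASIS_OF_RECORD = "http://rs.tdwg.org/dwc/terms/basisOfRecord"

# Illustrative subset of values (an assumption, not a complete vocabulary).
ALLOWED_BASIS = frozenset({
    "PreservedSpecimen",
    "LivingSpecimen",
    "HumanObservation",
    "MachineObservation",
    "MaterialSample",
})


@dataclass(frozen=True)
class IdentifiedObject:
    """Hypothetical container: an identifier plus term/value metadata."""
    identifier: str
    properties: dict = field(default_factory=dict)


def basis_of_record(obj: IdentifiedObject) -> str:
    """Return the declared type of the object, or fail loudly if it is unclear."""
    value = obj.properties.get(BASIS_OF_RECORD)
    if value not in ALLOWED_BASIS:
        raise ValueError(
            f"{obj.identifier}: basisOfRecord missing or unrecognized: {value!r}"
        )
    return value


# Two identifiers that could both be read as "the eagle" without explicit typing.
eagle_specimen = IdentifiedObject(
    "https://example.org/id/eagle-skin-1",
    {BASIS_OF_RECORD: "PreservedSpecimen"},
)
eagle_sighting = IdentifiedObject(
    "https://example.org/id/eagle-sighting-1",
    {BASIS_OF_RECORD: "HumanObservation"},
)
```
</preformat>
<p>With the type stated explicitly, a consumer can tell the museum specimen from the field observation without guessing from the identifier string itself.</p>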
</sec>
<sec sec-type="Standardizing Identifier Metadata Requests &amp; Responses">
<title>Standardizing Identifier Metadata Requests &amp; Responses</title>
<p>Identifier schemes differ in how they handle requests and return responses; standardized responses are urgently needed. An important example is the standardized content negotiation behavior in the semantic web; another is the unified content negotiation offered by CrossRef and DataCite (
<ext-link ext-link-type="uri" xlink:href="http://crosstech.crossref.org/2012/05/crossref_and_datacite_unify_su.html">http://crosstech.crossref.org/2012/05/crossref_and_datacite_unify_su.html</ext-link>
). Identifier metadata can be requested from the service provider not only using Linked Data patterns (which a user cannot do with just a web browser), but also by manipulating the URL endpoint directly, for example through URL inflections (
<ext-link ext-link-type="uri" xlink:href="https://wiki.ucop.edu/display/Curation/ARK">https://wiki.ucop.edu/display/Curation/ARK</ext-link>
), alternate resolution prefixes, 303 redirects or hash URIs to denote physical objects, or parameter specification in the URL query string. The EZID system can deliver DataCite, Dublin Core, CrossRef, or Dublin Core Kernel (
<ext-link ext-link-type="uri" xlink:href="http://dublincore.org/groups/kernel/spec/">http://dublincore.org/groups/kernel/spec/</ext-link>
) metadata profiles.
<bold>A strong recommendation is to create a biodiversity metadata profile to complement these existing profiles.</bold>
</p>
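<p>The two request styles can be contrasted in a short Python sketch that only builds the requests and sends nothing over the network. The helper names, the example.org ARK, and the choice of media type are assumptions for illustration; the inflection shown follows the ARK convention of appending “?” or “??” to the identifier URL, and actual services may differ in what they return.</p>
<preformat>
```python
from dataclasses import dataclass, field
from urllib.parse import quote


@dataclass(frozen=True)
class MetadataRequest:
    """Hypothetical description of an HTTP GET: a URL plus request headers."""
    url: str
    headers: dict = field(default_factory=dict)


def negotiate(doi: str,
              media_type: str = "application/rdf+xml",
              resolver: str = "http://dx.doi.org/") -> MetadataRequest:
    """Linked Data pattern: the URL stays the same and the Accept header
    selects which representation (here, metadata as RDF) is returned."""
    return MetadataRequest(resolver + quote(doi, safe="/"), {"Accept": media_type})


def inflect(ark_url: str, detailed: bool = False) -> MetadataRequest:
    """URL inflection: a trailing '?' (or '??' for more detail) asks for
    metadata about the object instead of the object itself."""
    return MetadataRequest(ark_url + ("??" if detailed else "?"))


by_header = negotiate("10.1371/journal.pone.0089606")
by_url = inflect("https://example.org/ark:/99999/fk4demo")
```
</preformat>
<p>The first request needs a client that can set headers, whereas the second can be typed directly into a browser address bar, which is the practical difference noted above.</p>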
</sec>
<sec sec-type="Policy and Contracts">
<title>Policy and Contracts</title>
<p>What intention goes into the creation of an identifier, including any contracts and technical specifications? The policies of the identifier-assigning authority convey the expected commitment, longevity, use, and re-use. Some identifier schemes require membership and fees in order to create identifiers, while others are open and free. Some schemes mandate use of a particular table lookup technology, while others do not. Each scheme has its unique history, community, and conditions of use (as described in more detail in Table
<xref ref-type="table" rid="T4">2</xref>
). Whatever method is used for creating the identifier, it
<bold>should be publicized explicitly by the identifier authority.</bold>
Consumers need to know about the persistence mission of the agency and any potential contracts implied by use of the identifier.</p>
</sec>
<sec sec-type="Persisting GUIDs across Systems">
<title>Persisting GUIDs across Systems</title>
<p>The group discussed issues with contracts about retaining identifiers in downstream systems.
<bold>We strongly recommend creating community conventions for data re-use that place special importance on referencing and maintaining earlier identifiers</bold>
, especially those with clear policy and behavior contracts. Such conventions provide significant value for data producers and consumers, for example by enabling data citation networks analogous to those produced by CrossRef for journal publications.</p>
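<p>One possible convention, sketched here in Python under stated assumptions, is for every derived record to carry the full chain of earlier identifiers alongside its own. The field names <italic>identifier</italic> and <italic>previousIdentifiers</italic>, and the use of a random UUID for the new record, are hypothetical choices for this example and not an agreed community standard.</p>
<preformat>
```python
import uuid


def derive_record(source: dict, changes: dict) -> dict:
    """Create a derived record that keeps every earlier identifier.

    The source record is left untouched; the derived record receives a
    newly minted identifier and an ordered list of the identifiers it
    descends from.
    """
    derived = dict(source)
    derived.update(changes)
    earlier = list(source.get("previousIdentifiers", []))
    earlier.append(source["identifier"])
    derived["previousIdentifiers"] = earlier
    derived["identifier"] = uuid.uuid4().urn  # e.g. urn:uuid:...
    return derived


legacy = {
    "identifier": "https://example.org/specimen/123",  # hypothetical
    "scientificName": "Aquila chrysaetos",
}
republished = derive_record(legacy, {"scientificName": "Aquila chrysaetos (Linnaeus, 1758)"})
aggregated = derive_record(republished, {})
```
</preformat>
<p>After two rounds of re-use, the aggregated record still points back to the original identifier, so the link to the source can be re-established without string matching.</p>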
</sec>
<sec sec-type="Content Mutability">
<title>Content Mutability</title>
<p>If a physical object is categorized as “organism” and is later changed to “bulk sample”, does its associated identifier change with it? Does the identifier of a concept change if a spelling error or ambiguous wording is corrected in its definition? Does an information resource identifier change during versioning? Does an identifier guarantee binary identical results, or only identical core content (which may be embedded in a modified template or formatted differently)? In some cases the answer may be a single permanent mutability policy of the identifier scheme itself; in other cases the identifier scheme may support multiple policies, and the mutability policy may be available as metadata on the identified object.
<bold>We recommend using, and where necessary, developing, a vocabulary to document mutability policies and conventions for various content types.</bold>
</p>
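<p>A mutability vocabulary of this kind could be as small as a handful of named policies that travel with the identifier metadata. The Python sketch below is purely illustrative: the three policy names and the check function are invented for this example and do not come from any existing standard.</p>
<preformat>
```python
from enum import Enum


class Mutability(Enum):
    """Hypothetical policy terms that an identifier authority might publish."""
    BINARY_IDENTICAL = "binary-identical"        # every retrieval returns the same bytes
    CORE_CONTENT_STABLE = "core-content-stable"  # formatting may change, content may not
    VERSIONED = "versioned"                      # content may change; versions are tracked


def consistent(policy: Mutability, first: tuple, second: tuple) -> bool:
    """Check two retrievals of the same identifier against its declared policy.

    Each retrieval is a (raw_bytes, core_text) pair, where core_text is the
    content with templates and formatting stripped away.
    """
    if policy is Mutability.BINARY_IDENTICAL:
        return first[0] == second[0]
    if policy is Mutability.CORE_CONTENT_STABLE:
        return first[1] == second[1]
    return True  # VERSIONED: any change is permitted in this sketch


monday = (b"[p]Aquila chrysaetos[/p]", "Aquila chrysaetos")
friday = (b"[div]Aquila chrysaetos[/div]", "Aquila chrysaetos")
```
</preformat>
<p>In this example the two retrievals differ only in their wrapping markup, so they satisfy a core-content policy but would violate a binary-identical one.</p>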
</sec>
<sec sec-type="Resolver Persistence">
<title>Resolver Persistence</title>
<p>Dereferencing, or the automated process that a software tool (e.g., a web browser) employs to go from identifier to content or metadata access, starts with a URL. All identifiers, regardless of scheme, are resolved by a user agent if they are embedded in a URL. As for institutions that have long-term access in their mission, many people think that smaller, newer institutions’ website hostnames will be short-lived compared to those of older, larger institutions (e.g.,
<ext-link ext-link-type="uri" xlink:href="http://loc.gov">loc.gov</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://bnf.fr">bnf.fr</ext-link>
). Some people prefer to trust a hostname backed by a group of institutions, even if comparatively young (e.g.,
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org">dx.doi.org</ext-link>
), rather than by any one institution. Among such group or consortial arrangements, some people prefer to trust those committed to open access (e.g.,
<ext-link ext-link-type="uri" xlink:href="http://gbif.org">gbif.org</ext-link>
). Persistence missions can also far outlast current technological solutions. Will, for example, today’s HTTP look anything like the protocols used in 2065?
<bold>Forecasting about resolver persistence for 10, 20, 50, or 100 years is at best educated guesswork, but it should take into account such things as inevitable technological advances and the resolver organization’s mission, size, business model, openness, and current age.</bold>
</p>
</sec>
<sec sec-type="Identifier Ergonomics and Curation">
<title>Identifier Ergonomics and Curation</title>
<p>Identifier readability and ease of transcription are concerns whenever identifiers are routinely recognized, typed, or written by human beings (e.g., on specimen labels). Non-opaque identifiers (containing recognizable strings) tend to be easy to read and to enter because humans can often spot transcription errors; however, it is difficult to mint them uniquely and quickly, and to keep them persistent (their structure makes them prone to “semantic rot”). It is easy to create UUIDs quickly and in large numbers, which can be especially useful for tracking instances of samples or events in aggregator databases. On the other hand, UUIDs rendered as hexadecimal characters (as opposed to embedded in QR-Codes) are opaque and long, and not as useful in situations where a UUID is
expected to be printed onto an insect pin, placed in a vial, or entered via a user interface by hand. There are other means of generating shorter unique opaque identifiers (e.g., Noid), but they have other disadvantages.
<bold>One solution to this dilemma is to maintain human-friendly identifiers (e.g., catalog numbers) when presenting content to humans in addition to computer-friendly identifiers (LOD, UUID, DOI, ARK, etc.) for electronic cross-linking.</bold>
Such a solution does require curation overhead to ensure that both are managed for the long term. Emerging services such as GRBio map human-friendly Institution and Collection Codes to URIs for biocollections.</p>
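<p>The dual-identifier approach can be sketched as a small registry that mints a UUID once per catalogued item and keeps the human-friendly label alongside it. The class, its method names, and the “XYZ” institution code are hypothetical; the sketch uses Python’s standard <italic>uuid</italic> module and keeps everything in memory, whereas a real system would need persistent storage and curation.</p>
<preformat>
```python
import uuid


class SpecimenRegistry:
    """Hypothetical in-memory map between catalog numbers and UUIDs."""

    def __init__(self):
        self._by_catalog = {}
        self._by_uuid = {}

    def register(self, institution_code: str, catalog_number: str) -> uuid.UUID:
        """Mint a UUID for a catalogued item, or return the one already assigned."""
        key = (institution_code, catalog_number)
        if key not in self._by_catalog:
            guid = uuid.uuid4()  # random, opaque, cheap to mint in bulk
            self._by_catalog[key] = guid
            self._by_uuid[guid] = key
        return self._by_catalog[key]

    def label(self, guid: uuid.UUID) -> str:
        """Short form for humans, e.g. on a specimen label or in a search box."""
        institution_code, catalog_number = self._by_uuid[guid]
        return f"{institution_code} {catalog_number}"

    def link(self, guid: uuid.UUID) -> str:
        """Long form for machines, used for electronic cross-linking."""
        return guid.urn


registry = SpecimenRegistry()
guid = registry.register("XYZ", "Birds-000123")
```
</preformat>
<p>Registering the same catalog number twice returns the same UUID, which is the curation guarantee that keeps the two identifiers in step.</p>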
</sec>
</sec>
<sec>
<title>Conclusions and Planning For The Longer Term</title>
<p>Perhaps the most critical outcome of this workshop was general agreement about a key set of issues, listed below:</p>
<list list-type="order">
<list-item>
<p>Rather than debating particular implementations, which is likely to be counterproductive, the group focused on cross-cutting issues and on delivery mechanisms that help machines and users interpret identifiers, the metadata about them, and the biodiversity data objects to which they point.</p>
</list-item>
<list-item>
<p>New field-based biocollections and legacy biocollections have different immediate and longer-term needs when it comes to identifier solutions. While there is every reason to assign a globally unique, persistent identifier to new data in biocollections, it may be less critical for legacy records. For legacy data, the problem of broken associations already exists and can only be repaired by spending effort to re-assert the relationships.</p>
</list-item>
<list-item>
<p>When a publisher creates records for a new derivative from a legacy collection, it should always copy the “original” identifier field from the legacy record into the new record. Best practices and conventions for doing so still need to be developed.</p>
</list-item>
<list-item>
<p>Publications and data aggregators should not only honor existing identifiers and the metadata about those identifiers, but also follow practices that maximize interoperation with emerging digital library practices regarding data citation.</p>
</list-item>
<list-item>
<p>We see great value in reviving or establishing task groups in (and between) TDWG and SPNHC that can help implement some of the best practices and next steps discussed in this document, in particular the creation of a biodiversity metadata profile for identifiers, which can provide critical information about the type of biodiversity object to which the identifier points.</p>
</list-item>
</list>
<p>It is noteworthy that the assembled group represented people who have expressed sometimes opposing views on which identifier implementation is most likely to best support sharing and linking biodiversity data. The longer term is likely to see a whole suite of differing solutions, and Table
<xref ref-type="table" rid="T4">2</xref>
provides more details about differing identifier implementations and services. More important are the cross-cutting solutions, independent of any one identifier implementation, which can best facilitate a vibrant interconnected graph of specimens, samples, images, descriptions/traits, sequences and published content.</p>
</sec>
</body>
<back>
<ack>
<title>Acknowledgements</title>
<p>We would like to thank the National Science Foundation for supporting the BiSciCol (Biological Science Collections Tagging and Tracking;
<ext-link ext-link-type="uri" xlink:href="http://biscicol.org">biscicol.org</ext-link>
) project and all activities that took place during its development, including this workshop (DEB 0956371, DEB 0956350, and collaborative awards). We also note support from the Research Coordination Network for Genomic Standards Consortium (DBI-0840989) and EAGER: An Interoperable Information Infrastructure for Biodiversity Research (IIS-1255035), which have supported travel and discussion related to this topic. The work of Pensoft’s and Plazi’s teams was partly supported by EU BON (Building the European Biodiversity Observation Network), an FP-7 (European Union Seventh Framework Programme, 2007-2013) grant (No 308454). We are grateful to all those who were involved in the project throughout the years for the constructive feedback and many discussions that took place throughout the course of developing outputs for the project. Robert Whitton provided Figure
<xref ref-type="fig" rid="F1">1</xref>
and shared his insights on workflows involving field-assigned identifiers. We thank the Stockholm Natural History Museum for providing us with the venue and appreciate the coordination with the TDWG executive and organizing committees.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1">
<mixed-citation publication-type="other">
<person-group>
<name>
<surname>Baskauf</surname>
<given-names>S</given-names>
</name>
</person-group>
(
<year>2010</year>
)
<source>Recommendations for implementation of GUIDs in the SERNEC collections community (Ver. 1.3, 2010-01-17)</source>
.
<ext-link ext-link-type="uri" xlink:href="http://bioimages.vanderbilt.edu/guid-10-07-17.pdf">http://bioimages.vanderbilt.edu/guid-10-07-17.pdf</ext-link>
</mixed-citation>
</ref>
<ref id="B2">
<mixed-citation publication-type="other">
<institution>Bouchout Declaration</institution>
(
<year>2014</year>
)
<source>Bouchout Declaration on Open Biodiversity Knowledge Management</source>
.
<ext-link ext-link-type="uri" xlink:href="http://bouchoutdeclaration.org">http://bouchoutdeclaration.org</ext-link>
</mixed-citation>
</ref>
<ref id="B3">
<mixed-citation publication-type="other">
<person-group>
<name>
<surname>Catapano</surname>
<given-names>T</given-names>
</name>
</person-group>
(
<year>2010</year>
)
<article-title>TaxPub: An Extension of the NLM/NCBI Journal Publishing DTD for Taxonomic Descriptions.</article-title>
<source>Proceedings of the Journal Article Tag Suite Conference 2010</source>
.
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/books/NBK47081/">http://www.ncbi.nlm.nih.gov/books/NBK47081/</ext-link>
</mixed-citation>
</ref>
<ref id="B4">
<mixed-citation publication-type="book">
<person-group>
<name>
<surname>Cryer</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Hyam</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Nicolson</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Ó Tuama</surname>
<given-names>É</given-names>
</name>
<name>
<surname>Page</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Rees</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Riccardi</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Richards</surname>
<given-names>K</given-names>
</name>
<name>
<surname>White</surname>
<given-names>R</given-names>
</name>
</person-group>
(
<year>2009</year>
)
<article-title>Adoption of Persistent Identifiers for Biodiversity Informatics.</article-title>
<source>Recommendations of the GBIF LSID GUID Task Group, 6 November 2009</source>
.
<publisher-name>GBIF Secretariat</publisher-name>
,
<publisher-loc>Copenhagen</publisher-loc>
,
<size units="page">23 pp</size>
<ext-link ext-link-type="uri" xlink:href="http://imsgbif.gbif.org/CMS_ORC/?doc_id=2956">http://imsgbif.gbif.org/CMS_ORC/?doc_id=2956</ext-link>
</mixed-citation>
</ref>
<ref id="B5">
<mixed-citation publication-type="other">
<person-group>
<name>
<surname>Deck</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Guralnick</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Walls</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Blum</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Haendel</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Matsunaga</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Wieczorek</surname>
<given-names>J</given-names>
</name>
</person-group>
(
<year>in press</year>
)
<source>Identifying practical applications of ontologies for biodiversity informatics. In review for Stand Genomic Sci</source>
.</mixed-citation>
</ref>
<ref id="B6">
<mixed-citation publication-type="journal">
<person-group>
<name>
<surname>Guralnick</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Conlin</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Deck</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Stucky</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Cellinese</surname>
<given-names>N</given-names>
</name>
</person-group>
(
<year>2014</year>
)
<article-title>The Trouble with Triplets in Biodiversity Informatics: A Data-Driven Case Against Current Identifier Practices.</article-title>
<source>PLoS ONE</source>
<volume>9</volume>
(
<issue>12</issue>
):
<elocation-id></elocation-id>
. doi:
<ext-link ext-link-type="doi" xlink:href="10.1371/journal.pone.0114069">10.1371/journal.pone.0114069</ext-link>
</mixed-citation>
</ref>
<ref id="B7">
<mixed-citation publication-type="book">
<person-group>
<name>
<surname>Hagedorn</surname>
<given-names>G</given-names>
</name>
</person-group>
(
<year>2013</year>
)
<article-title>Beyond Darwin Core – Stable identifiers and then quickly beyond towards linked open data.</article-title>
<source>TDWG 2013</source>
,
<publisher-loc>Florence, Italy</publisher-loc>
<ext-link ext-link-type="uri" xlink:href="http://www.slideshare.net/G.Hagedorn/tdwg-2013-florence-italy-hagedorn-beyond-dw-c-stableids-linkedopendata">http://www.slideshare.net/G.Hagedorn/tdwg-2013-florence-italy-hagedorn-beyond-dw-c-stableids-linkedopendata</ext-link>
</mixed-citation>
</ref>
<ref id="B8">
<mixed-citation publication-type="other">
<person-group>
<name>
<surname>Hagedorn</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Catapano</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Güntsch</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Mietchen</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Endresen</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Sierra</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Groom</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Biserkov</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Glöckler</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Morris</surname>
<given-names>R</given-names>
</name>
</person-group>
(
<year>2013</year>
)
<source>Best practices for stable URIs</source>
.
<ext-link ext-link-type="uri" xlink:href="http://wiki.pro-ibiosphere.eu/wiki/Best_practices_for_stable_URIs">http://wiki.pro-ibiosphere.eu/wiki/Best_practices_for_stable_URIs</ext-link>
</mixed-citation>
</ref>
<ref id="B9">
<mixed-citation publication-type="journal">
<person-group>
<name>
<surname>Page</surname>
<given-names>RDM</given-names>
</name>
</person-group>
(
<year>2008</year>
)
<article-title>Biodiversity informatics: the challenge of linking data and the role of shared identifiers.</article-title>
<source>Briefings in Bioinformatics</source>
<volume>9</volume>
(
<issue>5</issue>
):
<fpage>345</fpage>
<lpage>354</lpage>
. doi:
<ext-link ext-link-type="doi" xlink:href="10.1093/bib/bbn022">10.1093/bib/bbn022</ext-link>
<pub-id pub-id-type="pmid">18445641</pub-id>
</mixed-citation>
</ref>
<ref id="B10">
<mixed-citation publication-type="journal">
<person-group>
<name>
<surname>Page</surname>
<given-names>RDM</given-names>
</name>
</person-group>
(
<year>2009</year>
)
<article-title>bioGUID: resolving, discovering, and minting identifiers for biodiversity informatics.</article-title>
<source>BMC Bioinformatics</source>
<volume>10</volume>
:
<elocation-id></elocation-id>
. doi:
<ext-link ext-link-type="doi" xlink:href="10.1186/1471-2105-10-s14-s5">10.1186/1471-2105-10-s14-s5</ext-link>
</mixed-citation>
</ref>
<ref id="B11">
<mixed-citation publication-type="journal">
<person-group>
<name>
<surname>Penev</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Agosti</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Georgiev</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Catapano</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Blagoderov</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Roberts</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>VS</given-names>
</name>
<name>
<surname>Brake</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Rycroft</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Scott</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>NF</given-names>
</name>
<name>
<surname>Morris</surname>
<given-names>RA</given-names>
</name>
<name>
<surname>Sautter</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Chavan</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Robertson</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Remsen</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Stoev</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Parr</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Knapp</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Kress</surname>
<given-names>WJ</given-names>
</name>
<name>
<surname>Thompson</surname>
<given-names>FC</given-names>
</name>
<name>
<surname>Erwin</surname>
<given-names>T</given-names>
</name>
</person-group>
(
<year>2010</year>
)
<article-title>Semantic tagging of and semantic enhancements to systematics papers. ZooKeys working example.</article-title>
<source>ZooKeys</source>
<volume>50</volume>
:
<fpage>1</fpage>
<lpage>16</lpage>
. doi:
<ext-link ext-link-type="doi" xlink:href="10.3897/zookeys.50.538">10.3897/zookeys.50.538</ext-link>
<pub-id pub-id-type="pmid">21594113</pub-id>
</mixed-citation>
</ref>
<ref id="B12">
<mixed-citation publication-type="book">
<person-group>
<name>
<surname>Penev</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Catapano</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Agosti</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Sautter</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Stoev</surname>
<given-names>P</given-names>
</name>
</person-group>
(
<year>2012</year>
)
<article-title>Implementation of TaxPub, an NLM DTD extension for domain-specific markup in taxonomy, from the experience of a biodiversity publisher.</article-title>
In:
<source>Journal Article Tag Suite Conference (JATS-Con) Proceedings 2012 [Internet]</source>
.
<publisher-name>National Center for Biotechnology Information (US)</publisher-name>
,
<publisher-loc>Bethesda (MD)</publisher-loc>
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/books/NBK100351/">http://www.ncbi.nlm.nih.gov/books/NBK100351/</ext-link>
</mixed-citation>
</ref>
<ref id="B13">
<mixed-citation publication-type="other">
<person-group>
<name>
<surname>Pereira</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Hobern</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Hyam</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Belbin</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Richards</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Blum</surname>
<given-names>S</given-names>
</name>
</person-group>
(
<year>2007</year>
)
<article-title>TDWG Life Sciences Identifiers (LSID) Applicability Statement.</article-title>
<source>Biodiversity Information Standards (TDWG)</source>
,
<size units="page">28 pp</size>
<ext-link ext-link-type="uri" xlink:href="http://www.tdwg.org/fileadmin/subgroups/guid/LSID_Applicability_Statement_draft.pdf">http://www.tdwg.org/fileadmin/subgroups/guid/LSID_Applicability_Statement_draft.pdf</ext-link>
</mixed-citation>
</ref>
<ref id="B14">
<mixed-citation publication-type="other">
<person-group>
<name>
<surname>Pyle</surname>
<given-names>RL</given-names>
</name>
</person-group>
(
<year>2006</year>
)
<article-title>Identifiers for the Life Sciences: A Primer for Biologists.</article-title>
<source>Taxonomic Databases Working Group, Biodiversity Information Standards (TDWG)</source>
,
<size units="page">2 pp</size>
.</mixed-citation>
</ref>
<ref id="B15">
<mixed-citation publication-type="other">
<person-group>
<name>
<surname>Richards</surname>
<given-names>K</given-names>
</name>
</person-group>
(
<year>2010</year>
)
<article-title>TDWG GUID Applicability Statement.</article-title>
<source>Biodiversity Information Standards (TDWG)</source>
,
<size units="page">17 pp</size>
.</mixed-citation>
</ref>
<ref id="B16">
<mixed-citation publication-type="book">
<person-group>
<name>
<surname>Richards</surname>
<given-names>K</given-names>
</name>
<name>
<surname>White</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Nicolson</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Pyle</surname>
<given-names>R</given-names>
</name>
</person-group>
(
<year>2011</year>
)
<source>Beginners’ Guide to Persistent Identifiers Version 1.0</source>
.
<publisher-name>Global Biodiversity Information Facility</publisher-name>
,
<publisher-loc>Copenhagen</publisher-loc>
,
<size units="page">33 pp</size>
<ext-link ext-link-type="uri" xlink:href="http://links.gbif.org/persistent_identifiers_guide_en_v1.pdf">http://links.gbif.org/persistent_identifiers_guide_en_v1.pdf</ext-link>
</mixed-citation>
</ref>
<ref id="B17">
<mixed-citation publication-type="journal">
<person-group>
<name>
<surname>Smith</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Georgiev</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Stoev</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Biserkov</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Livermore</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Mietchen</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Couvreur</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Mueller</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Dikow</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Helgen</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Frank</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Agosti</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Roberts</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Penev</surname>
<given-names>L</given-names>
</name>
</person-group>
(
<year>2013</year>
)
<article-title>Beyond dead trees: integrating the scientific process in the Biodiversity Data Journal.</article-title>
<source>Biodiversity Data Journal</source>
<volume>1</volume>
:
<elocation-id></elocation-id>
. doi:
<ext-link ext-link-type="doi" xlink:href="10.3897/BDJ.1.e995">10.3897/BDJ.1.e995</ext-link>
</mixed-citation>
</ref>
<ref id="B18">
<mixed-citation publication-type="other">
<person-group>
<name>
<surname>Starr</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Castro</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Crosas</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Dumontier</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Downs</surname>
<given-names>RR</given-names>
</name>
<name>
<surname>Duerr</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Haak</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Haendel</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Herman</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Hodson</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Hourclé</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Kratz</surname>
<given-names>JE</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Nielsen</surname>
<given-names>LH</given-names>
</name>
<name>
<surname>Nurnberger</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Pröll</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Rauber</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Sacchi</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>AP</given-names>
</name>
<name>
<surname>Taylor</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Clark</surname>
<given-names>T</given-names>
</name>
</person-group>
(
<year>2015</year>
)
<article-title>Achieving human and machine accessibility of cited data in scholarly publications.</article-title>
<source>PeerJ PrePrints</source>
<volume>3</volume>
:
<elocation-id></elocation-id>
. doi:
<ext-link ext-link-type="doi" xlink:href="10.7287/peerj.preprints.697v3">10.7287/peerj.preprints.697v3</ext-link>
</mixed-citation>
</ref>
<ref id="B19">
<mixed-citation publication-type="journal">
<person-group>
<name>
<surname>Stucky</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Deck</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Conlin</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Ziemba</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Cellinese</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Guralnick</surname>
<given-names>R</given-names>
</name>
</person-group>
(
<year>2014</year>
)
<article-title>The BiSciCol Triplifier: bringing biodiversity data to the Semantic Web.</article-title>
<source>BMC Bioinformatics</source>
<volume>15</volume>
:
<elocation-id></elocation-id>
. doi:
<ext-link ext-link-type="doi" xlink:href="10.1186/1471-2105-15-257">10.1186/1471-2105-15-257</ext-link>
</mixed-citation>
</ref>
<ref id="B20">
<mixed-citation publication-type="other">
<institution>TDWG</institution>
[Biodiversity Information Standards] (
<year>2013</year>
)
<source>Globally Unique Identifiers (GUID) Wiki</source>
.
<ext-link ext-link-type="uri" xlink:href="http://wiki.tdwg.org/twiki/bin/view/GUID">http://wiki.tdwg.org/twiki/bin/view/GUID</ext-link>
[
<date-in-citation content-type="update-date">last updated 21 January 2013</date-in-citation>
;
<date-in-citation content-type="access-date">accessed 19 January 2015</date-in-citation>
]</mixed-citation>
</ref>
<ref id="B21">
<mixed-citation publication-type="journal">
<person-group>
<name>
<surname>Walls</surname>
<given-names>RL</given-names>
</name>
<name>
<surname>Deck</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Guralnick</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Baskauf</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Beaman</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Blum</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Bowers</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Buttigieg</surname>
<given-names>PL</given-names>
</name>
<name>
<surname>Davies</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Endresen</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Gandolfo</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Hanner</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Janning</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Krishtalka</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Matsunaga</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Midford</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Morrison</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Ó Tuama</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Schildhauer</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Stucky</surname>
<given-names>BJ</given-names>
</name>
<name>
<surname>Thomer</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Wieczorek</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Whitacre</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Wooley</surname>
<given-names>J</given-names>
</name>
</person-group>
(
<year>2014</year>
)
<article-title>Semantics in Support of Biodiversity Knowledge Discovery: An Introduction to the Biological Collections Ontology and Related Ontologies.</article-title>
<source>PLoS ONE</source>
<volume>9</volume>
(
<issue>3</issue>
):
<elocation-id></elocation-id>
. doi:
<ext-link ext-link-type="doi" xlink:href="10.1371/journal.pone.0089606">10.1371/journal.pone.0089606</ext-link>
</mixed-citation>
</ref>
<ref id="B22">
<mixed-citation publication-type="journal">
<person-group>
<name>
<surname>Wieczorek</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Bloom</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Guralnick</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Blum</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Döring</surname>
<given-names>M</given-names>
</name>
<name>
<surname>De Giovanni</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Robertson</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Vieglais</surname>
<given-names>D</given-names>
</name>
</person-group>
(
<year>2012</year>
)
<article-title>Darwin Core: An Evolving Community-developed Biodiversity Data Standard.</article-title>
<source>PLoS ONE</source>
<volume>7</volume>
(
<issue>1</issue>
):
<elocation-id></elocation-id>
. doi:
<ext-link ext-link-type="doi" xlink:href="10.1371/journal.pone.0029715">10.1371/journal.pone.0029715</ext-link>
</mixed-citation>
</ref>
</ref-list>
<app-group>
<app id="App1">
<title>Appendix 1</title>
<table-wrap id="T5" orientation="portrait" position="anchor">
<caption>
<p>Participants in the Identifiers Workshop held October 25–26, 2014, at the Stockholm Museum of Natural History.</p>
</caption>
<table frame="hsides" rules="all">
<tbody>
<tr>
<th rowspan="1" colspan="1">Name</th>
<th rowspan="1" colspan="1">Institution/Organization</th>
<th rowspan="1" colspan="1">Email</th>
</tr>
<tr>
<td rowspan="1" colspan="1">Nico Cellinese</td>
<td rowspan="1" colspan="1">University of Florida</td>
<td rowspan="1" colspan="1">ncellinese@flmnh.ufl.edu</td>
</tr>
<tr>
<td rowspan="1" colspan="1">John Deck</td>
<td rowspan="1" colspan="1">University of California, Berkeley</td>
<td rowspan="1" colspan="1">jdeck@berkeley.edu</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Rob Guralnick</td>
<td rowspan="1" colspan="1">University of Colorado, Boulder</td>
<td rowspan="1" colspan="1">Robert.Guralnick@colorado.edu</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Hilmar Lapp</td>
<td rowspan="1" colspan="1">NESCENT, Duke University</td>
<td rowspan="1" colspan="1">hlapp@nescent.org</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Michael Denslow</td>
<td rowspan="1" colspan="1">NEON</td>
<td rowspan="1" colspan="1">mdenslow@neoninc.org</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Richard Pyle</td>
<td rowspan="1" colspan="1">Bishop Museum, Honolulu</td>
<td rowspan="1" colspan="1">deepreef@bishopmuseum.org</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Donat Agosti</td>
<td rowspan="1" colspan="1">Plazi</td>
<td rowspan="1" colspan="1">agosti@plazi.org</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Joan Starr</td>
<td rowspan="1" colspan="1">California Digital Library</td>
<td rowspan="1" colspan="1">Joan.Starr@ucop.edu</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Ramona Walls</td>
<td rowspan="1" colspan="1">iPlant Collaborative</td>
<td rowspan="1" colspan="1">rwalls@iplantcollaborative.org</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Kerstin Lehnert</td>
<td rowspan="1" colspan="1">IGSN</td>
<td rowspan="1" colspan="1">lehnert@ldeo.columbia.edu</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Roderic Page</td>
<td rowspan="1" colspan="1">University of Glasgow</td>
<td rowspan="1" colspan="1">Roderic.Page@glasgow.ac.uk</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Karen Cranston</td>
<td rowspan="1" colspan="1">NESCENT</td>
<td rowspan="1" colspan="1">karen.cranston@nescent.org</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Terence Catapano</td>
<td rowspan="1" colspan="1">Plazi</td>
<td rowspan="1" colspan="1">catapanoth@gmail.com</td>
</tr>
<tr>
<td rowspan="1" colspan="1">John Kunze</td>
<td rowspan="1" colspan="1">California Digital Library</td>
<td rowspan="1" colspan="1">jak@ucop.edu</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Markus Döring</td>
<td rowspan="1" colspan="1">GBIF</td>
<td rowspan="1" colspan="1">mdoering@gbif.org</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Lyubomir Penev</td>
<td rowspan="1" colspan="1">Pensoft</td>
<td rowspan="1" colspan="1">penev@pensoft.net</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Teodor Georgiev</td>
<td rowspan="1" colspan="1">Pensoft</td>
<td rowspan="1" colspan="1">preprint@pensoft.net</td>
</tr>
<tr>
<td rowspan="1" colspan="1">John Wieczorek</td>
<td rowspan="1" colspan="1">Museum of Vertebrate Zoology, University of California, Berkeley</td>
<td rowspan="1" colspan="1">tuco@berkeley.edu</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Dag Endresen</td>
<td rowspan="1" colspan="1">Natural History Museum, Oslo</td>
<td rowspan="1" colspan="1">dag.endresen@nhm.uio.no</td>
</tr>
<tr>
<td rowspan="1" colspan="1">David Schindel</td>
<td rowspan="1" colspan="1">CBOL, Smithsonian</td>
<td rowspan="1" colspan="1">schindeld@si.edu</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Greg Riccardi</td>
<td rowspan="1" colspan="1">Florida State University, iDigBio</td>
<td rowspan="1" colspan="1">griccardi@fsu.edu</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Deb Paul</td>
<td rowspan="1" colspan="1">Florida State University, iDigBio</td>
<td rowspan="1" colspan="1">dpaul@fsu.edu</td>
</tr>
<tr>
<td rowspan="1" colspan="1">David Fichtmueller</td>
<td rowspan="1" colspan="1">Berlin Botanic Garden</td>
<td rowspan="1" colspan="1">d.fichtmueller@bgbm.org</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Falko Gloeckler</td>
<td rowspan="1" colspan="1">Natural History Museum, Berlin</td>
<td rowspan="1" colspan="1">falko.gloeckler@mfn-berlin.de</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Jana Hoffmann</td>
<td rowspan="1" colspan="1">Natural History Museum, Berlin</td>
<td rowspan="1" colspan="1">jana.hoffmann@mfn-berlin.de</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Elspeth Haston</td>
<td rowspan="1" colspan="1">Royal Botanic Garden, Edinburgh</td>
<td rowspan="1" colspan="1">e.haston@rbge.org.uk</td>
</tr>
</tbody>
</table>
</table-wrap>
</app>
</app-group>
</back>
</pmc>
<affiliations>
<list>
<country>
<li>Allemagne</li>
<li>Bulgarie</li>
<li>États-Unis</li>
</country>
<region>
<li>Arizona</li>
<li>Berlin</li>
<li>Californie</li>
<li>Floride</li>
<li>Sofia-ville (oblast)</li>
</region>
<settlement>
<li>Berlin</li>
<li>Sofia</li>
</settlement>
</list>
<tree>
<noCountry>
<name sortKey="Agosti, Donat" sort="Agosti, Donat" uniqKey="Agosti D" first="Donat" last="Agosti">Donat Agosti</name>
<name sortKey="Catapano, Terry" sort="Catapano, Terry" uniqKey="Catapano T" first="Terry" last="Catapano">Terry Catapano</name>
<name sortKey="Page, Roderic D M" sort="Page, Roderic D M" uniqKey="Page R" first="Roderic D. M." last="Page">Roderic D. M. Page</name>
<name sortKey="Pyle, Richard L" sort="Pyle, Richard L" uniqKey="Pyle R" first="Richard L." last="Pyle">Richard L. Pyle</name>
<name sortKey="Wieczorek, John" sort="Wieczorek, John" uniqKey="Wieczorek J" first="John" last="Wieczorek">John Wieczorek</name>
</noCountry>
<country name="États-Unis">
<region name="Floride">
<name sortKey="Guralnick, Robert P" sort="Guralnick, Robert P" uniqKey="Guralnick R" first="Robert P." last="Guralnick">Robert P. Guralnick</name>
</region>
<name sortKey="Cellinese, Nico" sort="Cellinese, Nico" uniqKey="Cellinese N" first="Nico" last="Cellinese">Nico Cellinese</name>
<name sortKey="Deck, John" sort="Deck, John" uniqKey="Deck J" first="John" last="Deck">John Deck</name>
<name sortKey="Kunze, John" sort="Kunze, John" uniqKey="Kunze J" first="John" last="Kunze">John Kunze</name>
<name sortKey="Walls, Ramona" sort="Walls, Ramona" uniqKey="Walls R" first="Ramona" last="Walls">Ramona Walls</name>
</country>
<country name="Bulgarie">
<region name="Sofia-ville (oblast)">
<name sortKey="Penev, Lyubomir" sort="Penev, Lyubomir" uniqKey="Penev L" first="Lyubomir" last="Penev">Lyubomir Penev</name>
</region>
</country>
<country name="Allemagne">
<region name="Berlin">
<name sortKey="Hagedorn, Gregor" sort="Hagedorn, Gregor" uniqKey="Hagedorn G" first="Gregor" last="Hagedorn">Gregor Hagedorn</name>
</region>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Ncbi/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000224 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Ncbi/Merge/biblio.hfd -nk 000224 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Ncbi
   |étape=   Merge
   |type=    RBID
   |clé=     PMC:4400380
   |texte=   Community Next Steps for Making Globally Unique Identifiers Work for Biocollections Data
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Ncbi/Merge/RBID.i   -Sk "pubmed:25901117" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Ncbi/Merge/biblio.hfd   \
       | NlmPubMed2Wicri -a OcrV1 

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024