OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature

Internal identifier: 000182 (Pmc/Curation); previous: 000181; next: 000183

Authors: Wasila M. Dahdul [United States]; James P. Balhoff [United States]; Jeffrey Engeman [United States]; Terry Grande [United States]; Eric J. Hilton [United States]; Cartik Kothari [United States]; Hilmar Lapp [United States]; John G. Lundberg [United States]; Peter E. Midford [United States]; Todd J. Vision [United States]; Monte Westerfield [United States]; Paula M. Mabee [United States]

RBID : PMC:2873956

Abstract

Background

The wealth of phenotypic descriptions documented in the published articles, monographs, and dissertations of phylogenetic systematics is traditionally reported in a free-text format, and it is therefore largely inaccessible for linkage to biological databases for genetics, development, and phenotypes, and difficult to manage for large-scale integrative work. The Phenoscape project aims to represent these complex and detailed descriptions with rich and formal semantics that are amenable to computation and integration with phenotype data from other fields of biology. This entails reconceptualizing the traditional free-text characters into the computable Entity-Quality (EQ) formalism using ontologies.

Methodology/Principal Findings

We used ontologies and the EQ formalism to curate a collection of 47 phylogenetic studies on ostariophysan fishes (including catfishes, characins, minnows, knifefishes) and their relatives with the goal of integrating these complex phenotype descriptions with information from an existing model organism database (zebrafish, http://zfin.org). We developed a curation workflow for the collection of character, taxonomic and specimen data from these publications. A total of 4,617 phenotypic characters (10,512 states) for 3,449 taxa, primarily species, were curated into EQ formalism (for a total of 12,861 EQ statements) using anatomical and taxonomic terms from teleost-specific ontologies (Teleost Anatomy Ontology and Teleost Taxonomy Ontology) in combination with terms from a quality ontology (Phenotype and Trait Ontology). Standards and guidelines for consistently and accurately representing phenotypes were developed in response to the challenges that were evident from two annotation experiments and from feedback from curators.
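
As an illustration of the EQ formalism described above, the following is a minimal Python sketch of a single EQ statement, assuming a simple three-part record (taxon, entity, quality). The class names, the `render` format, the prefixes `TTO`, `TAO` and `PATO`, and the `0000000` identifiers are assumptions made for this example; they are placeholders and are not taken from the curated data set or from the Phenoscape software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """An ontology term: a prefixed identifier plus a human-readable label."""
    term_id: str
    label: str

    @property
    def prefix(self) -> str:
        # "PATO:0000000" -> "PATO"
        return self.term_id.split(":", 1)[0]


@dataclass(frozen=True)
class EQStatement:
    """One Entity-Quality statement attached to a taxon (hypothetical layout)."""
    taxon: Term    # expected to come from the Teleost Taxonomy Ontology
    entity: Term   # expected to come from the Teleost Anatomy Ontology
    quality: Term  # expected to come from the Phenotype and Trait Ontology

    def __post_init__(self) -> None:
        # Assumed prefixes; reject terms drawn from the wrong ontology.
        expected = {"taxon": "TTO", "entity": "TAO", "quality": "PATO"}
        for field_name, prefix in expected.items():
            term = getattr(self, field_name)
            if term.prefix != prefix:
                raise ValueError(
                    f"{field_name} must be a {prefix} term, got {term.term_id}"
                )

    def render(self) -> str:
        """Return a compact, readable form of the statement."""
        return f"{self.taxon.label}: {self.entity.label} [{self.quality.label}]"


def example_statement() -> EQStatement:
    # Placeholder identifiers and labels, not real curated annotations.
    return EQStatement(
        taxon=Term("TTO:0000000", "example catfish species"),
        entity=Term("TAO:0000000", "dorsal fin"),
        quality=Term("PATO:0000000", "absent"),
    )


if __name__ == "__main__":
    print(example_statement().render())
```

Checking the prefix at construction time is one simple way to keep each slot tied to its intended ontology; a real curation tool would more likely validate terms against the loaded ontologies themselves.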

Conclusions/Significance

The challenges we encountered and many of the curation standards and methods for improving consistency that we developed are generally applicable to any effort to represent phenotypes using ontologies. This is because an ontological representation of the detailed variations in phenotype, whether between mutant and wildtype, among individual humans, or across the diversity of species, requires a process by which a precise combination of terms from domain ontologies is selected and organized according to logical relations. The efficiencies that we have developed in this process will be useful for any attempt to annotate complex phenotypic descriptions using ontologies. We also discuss some ramifications of EQ representation for the domain of systematics.


DOI: 10.1371/journal.pone.0010708
PubMed: 20505755
PubMed Central: 2873956

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature</title>
<author>
<name sortKey="Dahdul, Wasila M" sort="Dahdul, Wasila M" uniqKey="Dahdul W" first="Wasila M." last="Dahdul">Wasila M. Dahdul</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Department of Biology, University of South Dakota, Vermillion, South Dakota, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, University of South Dakota, Vermillion, South Dakota</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>National Evolutionary Synthesis Center, Durham, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Evolutionary Synthesis Center, Durham, North Carolina</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Balhoff, James P" sort="Balhoff, James P" uniqKey="Balhoff J" first="James P." last="Balhoff">James P. Balhoff</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>National Evolutionary Synthesis Center, Durham, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Evolutionary Synthesis Center, Durham, North Carolina</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff3">
<addr-line>Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Engeman, Jeffrey" sort="Engeman, Jeffrey" uniqKey="Engeman J" first="Jeffrey" last="Engeman">Jeffrey Engeman</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Department of Biology, University of South Dakota, Vermillion, South Dakota, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, University of South Dakota, Vermillion, South Dakota</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Grande, Terry" sort="Grande, Terry" uniqKey="Grande T" first="Terry" last="Grande">Terry Grande</name>
<affiliation wicri:level="1">
<nlm:aff id="aff4">
<addr-line>Department of Biology, Loyola University Chicago, Chicago, Illinois, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, Loyola University Chicago, Chicago, Illinois</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Hilton, Eric J" sort="Hilton, Eric J" uniqKey="Hilton E" first="Eric J." last="Hilton">Eric J. Hilton</name>
<affiliation wicri:level="1">
<nlm:aff id="aff5">
<addr-line>Department of Fisheries Science, Virginia Institute of Marine Science, College of William and Mary, Gloucester Point, Virginia, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Fisheries Science, Virginia Institute of Marine Science, College of William and Mary, Gloucester Point, Virginia</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Kothari, Cartik" sort="Kothari, Cartik" uniqKey="Kothari C" first="Cartik" last="Kothari">Cartik Kothari</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>National Evolutionary Synthesis Center, Durham, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Evolutionary Synthesis Center, Durham, North Carolina</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff3">
<addr-line>Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Lapp, Hilmar" sort="Lapp, Hilmar" uniqKey="Lapp H" first="Hilmar" last="Lapp">Hilmar Lapp</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>National Evolutionary Synthesis Center, Durham, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Evolutionary Synthesis Center, Durham, North Carolina</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Lundberg, John G" sort="Lundberg, John G" uniqKey="Lundberg J" first="John G." last="Lundberg">John G. Lundberg</name>
<affiliation wicri:level="1">
<nlm:aff id="aff6">
<addr-line>Academy of Natural Sciences, Philadelphia, Pennsylvania, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Academy of Natural Sciences, Philadelphia, Pennsylvania</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Midford, Peter E" sort="Midford, Peter E" uniqKey="Midford P" first="Peter E." last="Midford">Peter E. Midford</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>National Evolutionary Synthesis Center, Durham, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Evolutionary Synthesis Center, Durham, North Carolina</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Vision, Todd J" sort="Vision, Todd J" uniqKey="Vision T" first="Todd J." last="Vision">Todd J. Vision</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>National Evolutionary Synthesis Center, Durham, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Evolutionary Synthesis Center, Durham, North Carolina</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff3">
<addr-line>Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Westerfield, Monte" sort="Westerfield, Monte" uniqKey="Westerfield M" first="Monte" last="Westerfield">Monte Westerfield</name>
<affiliation wicri:level="1">
<nlm:aff id="aff7">
<addr-line>Institute of Neuroscience, University of Oregon, Eugene, Oregon, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute of Neuroscience, University of Oregon, Eugene, Oregon</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Mabee, Paula M" sort="Mabee, Paula M" uniqKey="Mabee P" first="Paula M." last="Mabee">Paula M. Mabee</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Department of Biology, University of South Dakota, Vermillion, South Dakota, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, University of South Dakota, Vermillion, South Dakota</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">20505755</idno>
<idno type="pmc">2873956</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2873956</idno>
<idno type="RBID">PMC:2873956</idno>
<idno type="doi">10.1371/journal.pone.0010708</idno>
<date when="2010">2010</date>
<idno type="wicri:Area/Pmc/Corpus">000182</idno>
<idno type="wicri:Area/Pmc/Curation">000182</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature</title>
<author>
<name sortKey="Dahdul, Wasila M" sort="Dahdul, Wasila M" uniqKey="Dahdul W" first="Wasila M." last="Dahdul">Wasila M. Dahdul</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Department of Biology, University of South Dakota, Vermillion, South Dakota, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, University of South Dakota, Vermillion, South Dakota</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>National Evolutionary Synthesis Center, Durham, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Evolutionary Synthesis Center, Durham, North Carolina</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Balhoff, James P" sort="Balhoff, James P" uniqKey="Balhoff J" first="James P." last="Balhoff">James P. Balhoff</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>National Evolutionary Synthesis Center, Durham, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Evolutionary Synthesis Center, Durham, North Carolina</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff3">
<addr-line>Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Engeman, Jeffrey" sort="Engeman, Jeffrey" uniqKey="Engeman J" first="Jeffrey" last="Engeman">Jeffrey Engeman</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Department of Biology, University of South Dakota, Vermillion, South Dakota, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, University of South Dakota, Vermillion, South Dakota</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Grande, Terry" sort="Grande, Terry" uniqKey="Grande T" first="Terry" last="Grande">Terry Grande</name>
<affiliation wicri:level="1">
<nlm:aff id="aff4">
<addr-line>Department of Biology, Loyola University Chicago, Chicago, Illinois, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, Loyola University Chicago, Chicago, Illinois</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Hilton, Eric J" sort="Hilton, Eric J" uniqKey="Hilton E" first="Eric J." last="Hilton">Eric J. Hilton</name>
<affiliation wicri:level="1">
<nlm:aff id="aff5">
<addr-line>Department of Fisheries Science, Virginia Institute of Marine Science, College of William and Mary, Gloucester Point, Virginia, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Fisheries Science, Virginia Institute of Marine Science, College of William and Mary, Gloucester Point, Virginia</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Kothari, Cartik" sort="Kothari, Cartik" uniqKey="Kothari C" first="Cartik" last="Kothari">Cartik Kothari</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>National Evolutionary Synthesis Center, Durham, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Evolutionary Synthesis Center, Durham, North Carolina</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff3">
<addr-line>Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Lapp, Hilmar" sort="Lapp, Hilmar" uniqKey="Lapp H" first="Hilmar" last="Lapp">Hilmar Lapp</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>National Evolutionary Synthesis Center, Durham, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Evolutionary Synthesis Center, Durham, North Carolina</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Lundberg, John G" sort="Lundberg, John G" uniqKey="Lundberg J" first="John G." last="Lundberg">John G. Lundberg</name>
<affiliation wicri:level="1">
<nlm:aff id="aff6">
<addr-line>Academy of Natural Sciences, Philadelphia, Pennsylvania, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Academy of Natural Sciences, Philadelphia, Pennsylvania</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Midford, Peter E" sort="Midford, Peter E" uniqKey="Midford P" first="Peter E." last="Midford">Peter E. Midford</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>National Evolutionary Synthesis Center, Durham, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Evolutionary Synthesis Center, Durham, North Carolina</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Vision, Todd J" sort="Vision, Todd J" uniqKey="Vision T" first="Todd J." last="Vision">Todd J. Vision</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>National Evolutionary Synthesis Center, Durham, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Evolutionary Synthesis Center, Durham, North Carolina</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff3">
<addr-line>Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Westerfield, Monte" sort="Westerfield, Monte" uniqKey="Westerfield M" first="Monte" last="Westerfield">Monte Westerfield</name>
<affiliation wicri:level="1">
<nlm:aff id="aff7">
<addr-line>Institute of Neuroscience, University of Oregon, Eugene, Oregon, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute of Neuroscience, University of Oregon, Eugene, Oregon</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Mabee, Paula M" sort="Mabee, Paula M" uniqKey="Mabee P" first="Paula M." last="Mabee">Paula M. Mabee</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Department of Biology, University of South Dakota, Vermillion, South Dakota, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Biology, University of South Dakota, Vermillion, South Dakota</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">PLoS ONE</title>
<idno type="eISSN">1932-6203</idno>
<imprint>
<date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>The wealth of phenotypic descriptions documented in the published articles, monographs, and dissertations of phylogenetic systematics is traditionally reported in a free-text format, and it is therefore largely inaccessible for linkage to biological databases for genetics, development, and phenotypes, and difficult to manage for large-scale integrative work. The Phenoscape project aims to represent these complex and detailed descriptions with rich and formal semantics that are amenable to computation and integration with phenotype data from other fields of biology. This entails reconceptualizing the traditional free-text characters into the computable Entity-Quality (EQ) formalism using ontologies.</p>
</sec>
<sec>
<title>Methodology/Principal Findings</title>
<p>We used ontologies and the EQ formalism to curate a collection of 47 phylogenetic studies on ostariophysan fishes (including catfishes, characins, minnows, knifefishes) and their relatives with the goal of integrating these complex phenotype descriptions with information from an existing model organism database (zebrafish,
<ext-link ext-link-type="uri" xlink:href="http://zfin.org">http://zfin.org</ext-link>
). We developed a curation workflow for the collection of character, taxonomic and specimen data from these publications. A total of 4,617 phenotypic characters (10,512 states) for 3,449 taxa, primarily species, were curated into EQ formalism (for a total of 12,861 EQ statements) using anatomical and taxonomic terms from teleost-specific ontologies (Teleost Anatomy Ontology and Teleost Taxonomy Ontology) in combination with terms from a quality ontology (Phenotype and Trait Ontology). Standards and guidelines for consistently and accurately representing phenotypes were developed in response to the challenges that were evident from two annotation experiments and from feedback from curators.</p>
</sec>
<sec>
<title>Conclusions/Significance</title>
<p>The challenges we encountered and many of the curation standards and methods for improving consistency that we developed are generally applicable to any effort to represent phenotypes using ontologies. This is because an ontological representation of the detailed variations in phenotype, whether between mutant and wildtype, among individual humans, or across the diversity of species, requires a process by which a precise combination of terms from domain ontologies is selected and organized according to logical relations. The efficiencies that we have developed in this process will be useful for any attempt to annotate complex phenotypic descriptions using ontologies. We also discuss some ramifications of EQ representation for the domain of systematics.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Grande, L" uniqKey="Grande L">L Grande</name>
</author>
<author>
<name sortKey="Bemis, W" uniqKey="Bemis W">W Bemis</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mayden, Rl" uniqKey="Mayden R">RL Mayden</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Royero, R" uniqKey="Royero R">R Royero</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kailola, Pj" uniqKey="Kailola P">PJ Kailola</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Smith, B" uniqKey="Smith B">B Smith</name>
</author>
<author>
<name sortKey="Ashburner, M" uniqKey="Ashburner M">M Ashburner</name>
</author>
<author>
<name sortKey="Rosse, C" uniqKey="Rosse C">C Rosse</name>
</author>
<author>
<name sortKey="Bard, J" uniqKey="Bard J">J Bard</name>
</author>
<author>
<name sortKey="Bug, W" uniqKey="Bug W">W Bug</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gruber, T" uniqKey="Gruber T">T Gruber</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gkoutos, Gv" uniqKey="Gkoutos G">GV Gkoutos</name>
</author>
<author>
<name sortKey="Green, Ec" uniqKey="Green E">EC Green</name>
</author>
<author>
<name sortKey="Mallon, Am" uniqKey="Mallon A">AM Mallon</name>
</author>
<author>
<name sortKey="Hancock, Jm" uniqKey="Hancock J">JM Hancock</name>
</author>
<author>
<name sortKey="Davidson, D" uniqKey="Davidson D">D Davidson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sprague, J" uniqKey="Sprague J">J Sprague</name>
</author>
<author>
<name sortKey="Bayraktaroglu, L" uniqKey="Bayraktaroglu L">L Bayraktaroglu</name>
</author>
<author>
<name sortKey="Bradford, Y" uniqKey="Bradford Y">Y Bradford</name>
</author>
<author>
<name sortKey="Conlin, T" uniqKey="Conlin T">T Conlin</name>
</author>
<author>
<name sortKey="Dunn, N" uniqKey="Dunn N">N Dunn</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mungall, C" uniqKey="Mungall C">C Mungall</name>
</author>
<author>
<name sortKey="Gkoutos, G" uniqKey="Gkoutos G">G Gkoutos</name>
</author>
<author>
<name sortKey="Washington, N" uniqKey="Washington N">N Washington</name>
</author>
<author>
<name sortKey="Lewis, S" uniqKey="Lewis S">S Lewis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Washington, Nl" uniqKey="Washington N">NL Washington</name>
</author>
<author>
<name sortKey="Haendel, Ma" uniqKey="Haendel M">MA Haendel</name>
</author>
<author>
<name sortKey="Mungall, Cj" uniqKey="Mungall C">CJ Mungall</name>
</author>
<author>
<name sortKey="Ashburner, M" uniqKey="Ashburner M">M Ashburner</name>
</author>
<author>
<name sortKey="Westerfield, M" uniqKey="Westerfield M">M Westerfield</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mik, I" uniqKey="Mik I">I Mikó</name>
</author>
<author>
<name sortKey="Deans, Ar" uniqKey="Deans A">AR Deans</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sereno, Pc" uniqKey="Sereno P">PC Sereno</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dahdul, Wm" uniqKey="Dahdul W">WM Dahdul</name>
</author>
<author>
<name sortKey="Lundberg, Jg" uniqKey="Lundberg J">JG Lundberg</name>
</author>
<author>
<name sortKey="Midford, Pe" uniqKey="Midford P">PE Midford</name>
</author>
<author>
<name sortKey="Balhoff, Jp" uniqKey="Balhoff J">JP Balhoff</name>
</author>
<author>
<name sortKey="Lapp, H" uniqKey="Lapp H">H Lapp</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mabee, Pm" uniqKey="Mabee P">PM Mabee</name>
</author>
<author>
<name sortKey="Ashburner, M" uniqKey="Ashburner M">M Ashburner</name>
</author>
<author>
<name sortKey="Cronk, Q" uniqKey="Cronk Q">Q Cronk</name>
</author>
<author>
<name sortKey="Gkoutos, G" uniqKey="Gkoutos G">G Gkoutos</name>
</author>
<author>
<name sortKey="Haendel, M" uniqKey="Haendel M">M Haendel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mabee, Pm" uniqKey="Mabee P">PM Mabee</name>
</author>
<author>
<name sortKey="Arratia, G" uniqKey="Arratia G">G Arratia</name>
</author>
<author>
<name sortKey="Coburn, M" uniqKey="Coburn M">M Coburn</name>
</author>
<author>
<name sortKey="Haendel, M" uniqKey="Haendel M">M Haendel</name>
</author>
<author>
<name sortKey="Hilton, Ej" uniqKey="Hilton E">EJ Hilton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Muller, Hm" uniqKey="Muller H">HM Muller</name>
</author>
<author>
<name sortKey="Kenny, Ee" uniqKey="Kenny E">EE Kenny</name>
</author>
<author>
<name sortKey="Sternberg, Pw" uniqKey="Sternberg P">PW Sternberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lewis, Se" uniqKey="Lewis S">SE Lewis</name>
</author>
<author>
<name sortKey="Searle, Sm" uniqKey="Searle S">SM Searle</name>
</author>
<author>
<name sortKey="Harris, N" uniqKey="Harris N">N Harris</name>
</author>
<author>
<name sortKey="Gibson, M" uniqKey="Gibson M">M Gibson</name>
</author>
<author>
<name sortKey="Lyer, V" uniqKey="Lyer V">V Lyer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Balhoff, Jp" uniqKey="Balhoff J">JP Balhoff</name>
</author>
<author>
<name sortKey="Dahdul, Wm" uniqKey="Dahdul W">WM Dahdul</name>
</author>
<author>
<name sortKey="Kothari, Cr" uniqKey="Kothari C">CR Kothari</name>
</author>
<author>
<name sortKey="Lapp, H" uniqKey="Lapp H">H Lapp</name>
</author>
<author>
<name sortKey="Lundberg, Jg" uniqKey="Lundberg J">JG Lundberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fink, Sv" uniqKey="Fink S">SV Fink</name>
</author>
<author>
<name sortKey="Fink, Wl" uniqKey="Fink W">WL Fink</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Eschmeyer, Wn" uniqKey="Eschmeyer W">WN Eschmeyer</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Albert, Js" uniqKey="Albert J">JS Albert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Armbruster, Jw" uniqKey="Armbruster J">JW Armbruster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bornbusch, Ah" uniqKey="Bornbusch A">AH Bornbusch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Britto, Mr" uniqKey="Britto M">MR Britto</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chang, Mm" uniqKey="Chang M">MM Chang</name>
</author>
<author>
<name sortKey="Maisey, Jg" uniqKey="Maisey J">JG Maisey</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="De Pinna, Mcc" uniqKey="De Pinna M">MCC de Pinna</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="De Pinna, Mcc" uniqKey="De Pinna M">MCC de Pinna</name>
</author>
<author>
<name sortKey="Ferraris, Cjj" uniqKey="Ferraris C">CJJ Ferraris</name>
</author>
<author>
<name sortKey="Vari, Rp" uniqKey="Vari R">RP Vari</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fang, F" uniqKey="Fang F">F Fang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Grande, T" uniqKey="Grande T">T Grande</name>
</author>
<author>
<name sortKey="Laten, H" uniqKey="Laten H">H Laten</name>
</author>
<author>
<name sortKey="Lopez, Ja" uniqKey="Lopez J">JA Lopez</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Grande, T" uniqKey="Grande T">T Grande</name>
</author>
<author>
<name sortKey="Poyato Ariza, Fj" uniqKey="Poyato Ariza F">FJ Poyato-Ariza</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Johnson, Gd" uniqKey="Johnson G">GD Johnson</name>
</author>
<author>
<name sortKey="Patterson, C" uniqKey="Patterson C">C Patterson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Poyato Ariza, Fj" uniqKey="Poyato Ariza F">FJ Poyato-Ariza</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sanger, Tj" uniqKey="Sanger T">TJ Sanger</name>
</author>
<author>
<name sortKey="Mccune, Ar" uniqKey="Mccune A">AR McCune</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sawada, Y" uniqKey="Sawada Y">Y Sawada</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schaefer, Sa" uniqKey="Schaefer S">SA Schaefer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schaefer, Sa" uniqKey="Schaefer S">SA Schaefer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sidlauskas, Bl" uniqKey="Sidlauskas B">BL Sidlauskas</name>
</author>
<author>
<name sortKey="Vari, Rp" uniqKey="Vari R">RP Vari</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Toledo Piza, M" uniqKey="Toledo Piza M">M Toledo-Piza</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Toledo Piza, M" uniqKey="Toledo Piza M">M Toledo-Piza</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Vari, Rp" uniqKey="Vari R">RP Vari</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Vari, Rp" uniqKey="Vari R">RP Vari</name>
</author>
<author>
<name sortKey="Harold, As" uniqKey="Harold A">AS Harold</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Vigliotta, Tr" uniqKey="Vigliotta T">TR Vigliotta</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zanata, Am" uniqKey="Zanata A">AM Zanata</name>
</author>
<author>
<name sortKey="Vari, Rp" uniqKey="Vari R">RP Vari</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Arratia, G" uniqKey="Arratia G">G Arratia</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Buckup, Pa" uniqKey="Buckup P">PA Buckup</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cavender, Tm" uniqKey="Cavender T">TM Cavender</name>
</author>
<author>
<name sortKey="Coburn, Mm" uniqKey="Coburn M">MM Coburn</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Coburn, Mm" uniqKey="Coburn M">MM Coburn</name>
</author>
<author>
<name sortKey="Cavender, Tm" uniqKey="Cavender T">TM Cavender</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Grande, T" uniqKey="Grande T">T Grande</name>
</author>
<author>
<name sortKey="Grande, L" uniqKey="Grande L">L Grande</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lucena, Cas" uniqKey="Lucena C">CAS Lucena</name>
</author>
<author>
<name sortKey="Menezes, Na" uniqKey="Menezes N">NA Menezes</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lundberg, Jg" uniqKey="Lundberg J">JG Lundberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Malabarba, Lr" uniqKey="Malabarba L">LR Malabarba</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Smith, Gr" uniqKey="Smith G">GR Smith</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Soares Porto, Lm" uniqKey="Soares Porto L">LM Soares-Porto</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Weitzman, Sh" uniqKey="Weitzman S">SH Weitzman</name>
</author>
<author>
<name sortKey="Menezes, Na" uniqKey="Menezes N">NA Menezes</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zaragueta Bagils, R" uniqKey="Zaragueta Bagils R">R Zaragueta Bagils</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bockmann, Fa" uniqKey="Bockmann F">FA Bockmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chen, X P" uniqKey="Chen X">X-P Chen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="De Pinna, Mcc" uniqKey="De Pinna M">MCC de Pinna</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Friel, Jp" uniqKey="Friel J">JP Friel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mo, T" uniqKey="Mo T">T Mo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Shibatta, Oa" uniqKey="Shibatta O">OA Shibatta</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Siebert, Dh" uniqKey="Siebert D">DH Siebert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Di Dario, F" uniqKey="Di Dario F">F Di Dario</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Maddison, Wp" uniqKey="Maddison W">WP Maddison</name>
</author>
<author>
<name sortKey="Maddison, Dr" uniqKey="Maddison D">DR Maddison</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mungall, Cj" uniqKey="Mungall C">CJ Mungall</name>
</author>
<author>
<name sortKey="Gkoutos, Gv" uniqKey="Gkoutos G">GV Gkoutos</name>
</author>
<author>
<name sortKey="Smith, Cl" uniqKey="Smith C">CL Smith</name>
</author>
<author>
<name sortKey="Haendel, Ma" uniqKey="Haendel M">MA Haendel</name>
</author>
<author>
<name sortKey="Lewis, Se" uniqKey="Lewis S">SE Lewis</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fricke, R" uniqKey="Fricke R">R Fricke</name>
</author>
<author>
<name sortKey="Eschmeyer, Wn" uniqKey="Eschmeyer W">WN Eschmeyer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Clarke, Ja" uniqKey="Clarke J">JA Clarke</name>
</author>
<author>
<name sortKey="Ksepka, Dt" uniqKey="Ksepka D">DT Ksepka</name>
</author>
<author>
<name sortKey="Smith, Na" uniqKey="Smith N">NA Smith</name>
</author>
<author>
<name sortKey="Norell, Ma" uniqKey="Norell M">MA Norell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Santagata, S" uniqKey="Santagata S">S Santagata</name>
</author>
<author>
<name sortKey="Cohen, Bl" uniqKey="Cohen B">BL Cohen</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">PLoS One</journal-id>
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="pmc">plosone</journal-id>
<journal-title-group>
<journal-title>PLoS ONE</journal-title>
</journal-title-group>
<issn pub-type="epub">1932-6203</issn>
<publisher>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">20505755</article-id>
<article-id pub-id-type="pmc">2873956</article-id>
<article-id pub-id-type="publisher-id">09-PONE-RA-14234R1</article-id>
<article-id pub-id-type="doi">10.1371/journal.pone.0010708</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
<subj-group subj-group-type="Discipline">
<subject>Evolutionary Biology</subject>
<subject>Evolutionary Biology/Bioinformatics</subject>
<subject>Computational Biology/Bio-ontology</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature</article-title>
<alt-title alt-title-type="running-head">Curation of Evolutionary Data</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Dahdul</surname>
<given-names>Wasila M.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="corresp" rid="cor1">
<sup>*</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Balhoff</surname>
<given-names>James P.</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Engeman</surname>
<given-names>Jeffrey</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Grande</surname>
<given-names>Terry</given-names>
</name>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hilton</surname>
<given-names>Eric J.</given-names>
</name>
<xref ref-type="aff" rid="aff5">
<sup>5</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kothari</surname>
<given-names>Cartik</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Lapp</surname>
<given-names>Hilmar</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Lundberg</surname>
<given-names>John G.</given-names>
</name>
<xref ref-type="aff" rid="aff6">
<sup>6</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Midford</surname>
<given-names>Peter E.</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Vision</surname>
<given-names>Todd J.</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Westerfield</surname>
<given-names>Monte</given-names>
</name>
<xref ref-type="aff" rid="aff7">
<sup>7</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Mabee</surname>
<given-names>Paula M.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="cor1">
<sup>*</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<label>1</label>
<addr-line>Department of Biology, University of South Dakota, Vermillion, South Dakota, United States of America</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>National Evolutionary Synthesis Center, Durham, North Carolina, United States of America</addr-line>
</aff>
<aff id="aff3">
<label>3</label>
<addr-line>Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America</addr-line>
</aff>
<aff id="aff4">
<label>4</label>
<addr-line>Department of Biology, Loyola University Chicago, Chicago, Illinois, United States of America</addr-line>
</aff>
<aff id="aff5">
<label>5</label>
<addr-line>Department of Fisheries Science, Virginia Institute of Marine Science, College of William and Mary, Gloucester Point, Virginia, United States of America</addr-line>
</aff>
<aff id="aff6">
<label>6</label>
<addr-line>Academy of Natural Sciences, Philadelphia, Pennsylvania, United States of America</addr-line>
</aff>
<aff id="aff7">
<label>7</label>
<addr-line>Institute of Neuroscience, University of Oregon, Eugene, Oregon, United States of America</addr-line>
</aff>
<contrib-group>
<contrib contrib-type="editor">
<name>
<surname>Kelso</surname>
<given-names>Janet</given-names>
</name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"></xref>
</contrib>
</contrib-group>
<aff id="edit1">Max Planck Institute for Evolutionary Anthropology, Germany</aff>
<author-notes>
<corresp id="cor1">* E-mail:
<email>wasila.dahdul@usd.edu</email>
(WMD);
<email>pmabee@usd.edu</email>
(PMM)</corresp>
<fn fn-type="con">
<p>Wrote the paper: WMD JPB JE TG EJH CK HL JGL PEM TJV MW PMM. Curated phenotypic data: WMD JE TG EJH JGL PMM. Developed curation standards: WMD JE JGL PMM. Developed curation workflow: WMD JPB HL JGL PEM TJV MW PMM. Maintained and updated ontologies: WMD PEM. Developed the figures: WMD JPB PEM. Provided quantitative data descriptions: JPB CK.</p>
</fn>
</author-notes>
<pub-date pub-type="collection">
<year>2010</year>
</pub-date>
<pub-date pub-type="epub">
<day>20</day>
<month>5</month>
<year>2010</year>
</pub-date>
<volume>5</volume>
<issue>5</issue>
<elocation-id>e10708</elocation-id>
<history>
<date date-type="received">
<day>12</day>
<month>11</month>
<year>2009</year>
</date>
<date date-type="accepted">
<day>6</day>
<month>4</month>
<year>2010</year>
</date>
</history>
<permissions>
<copyright-statement>Dahdul et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</copyright-statement>
</permissions>
<abstract>
<sec>
<title>Background</title>
<p>The wealth of phenotypic descriptions documented in the published articles, monographs, and dissertations of phylogenetic systematics is traditionally reported in a free-text format, and it is therefore largely inaccessible for linkage to biological databases for genetics, development, and phenotypes, and difficult to manage for large-scale integrative work. The Phenoscape project aims to represent these complex and detailed descriptions with rich and formal semantics that are amenable to computation and integration with phenotype data from other fields of biology. This entails reconceptualizing the traditional free-text characters into the computable Entity-Quality (EQ) formalism using ontologies.</p>
</sec>
<sec>
<title>Methodology/Principal Findings</title>
<p>We used ontologies and the EQ formalism to curate a collection of 47 phylogenetic studies on ostariophysan fishes (including catfishes, characins, minnows, knifefishes) and their relatives with the goal of integrating these complex phenotype descriptions with information from an existing model organism database (zebrafish,
<ext-link ext-link-type="uri" xlink:href="http://zfin.org">http://zfin.org</ext-link>
). We developed a curation workflow for the collection of character, taxonomic and specimen data from these publications. A total of 4,617 phenotypic characters (10,512 states) for 3,449 taxa, primarily species, were curated into EQ formalism (for a total of 12,861 EQ statements) using anatomical and taxonomic terms from teleost-specific ontologies (Teleost Anatomy Ontology and Teleost Taxonomy Ontology) in combination with terms from a quality ontology (Phenotype and Trait Ontology). Standards and guidelines for consistently and accurately representing phenotypes were developed in response to the challenges that were evident from two annotation experiments and from feedback from curators.</p>
</sec>
<sec>
<title>Conclusions/Significance</title>
<p>The challenges we encountered and many of the curation standards and methods for improving consistency that we developed are generally applicable to any effort to represent phenotypes using ontologies. This is because an ontological representation of the detailed variations in phenotype, whether between mutant and wild type, among individual humans, or across the diversity of species, requires a process by which a precise combination of terms from domain ontologies is selected and organized according to logical relations. The efficiencies that we have developed in this process will be useful for any attempt to annotate complex phenotypic descriptions using ontologies. We also discuss some ramifications of EQ representation for the domain of systematics.</p>
</sec>
</abstract>
<counts>
<page-count count="12"></page-count>
</counts>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>Variation in observable features, or phenotypes, is intensely studied and richly documented within and between species in the literature of systematic biology (e.g.,
<xref ref-type="bibr" rid="pone.0010708-Grande1">[1]</xref>
), between wild-type and mutant lines in model organism databases (e.g.,
<xref ref-type="bibr" rid="pone.0010708-Mouse1">[2]</xref>
), and among genetic phenotypes of humans (e.g.,
<xref ref-type="bibr" rid="pone.0010708-Online1">[3]</xref>
). Although fundamentally important to our understanding of genetics, development, and evolutionary relationships, phenotypic descriptions exist almost exclusively in a free-text or natural language format that is not amenable to computational processing. For example, the diverse ways of describing the shape of the first infraorbital bone in fishes (“lacrymal bone … flat”
<xref ref-type="bibr" rid="pone.0010708-Mayden1">[4]</xref>
; “lacrimal … triangular”
<xref ref-type="bibr" rid="pone.0010708-Royero1">[5]</xref>
; “first infraorbital (lachrimal) shape…flattened”
<xref ref-type="bibr" rid="pone.0010708-Kailola1">[6]</xref>
) might seem obviously similar to a human but would not be recognized as similar by a computer. Natural language, although allowing the expressive and precise description of biological form, has serious limitations for comparing or integrating data across studies, linking to genetic databases, and data mining.</p>
<p>To facilitate comparison and integration of phenotypes across organisms, model organism communities have spearheaded the representation of mutant phenotypes using ontologies and formal semantics
<xref ref-type="bibr" rid="pone.0010708-Smith1">[7]</xref>
. An ontology extends the notion of a controlled vocabulary by associating names with formally defined entities, which include classes and relationships among those classes (cf.
<xref ref-type="bibr" rid="pone.0010708-Gruber1">[8]</xref>
). Here we use ‘term’ to refer to those names associated with classes, in contrast to names of relationships. The application of ontologies to the curation of phenotype data from the model organism literature and sharing of these annotations in community databases has promoted clarity in communication among researchers and allowed for integration of large quantities of data. In addition to facilitating interoperability among databases, ontologies allow users to query using very specific or broad anatomical terms and obtain organized groups of annotations. For example, a query with the term
<italic>dorsal fin</italic>
will also return
<italic>dorsal fin ray</italic>
because of its
<italic>part_of</italic>
relationship to
<italic>dorsal fin</italic>
. The Entity-Quality (EQ) formalism, which combines ‘entity’ terms from an anatomical or other ontology (e.g., ontologies that describe observable organism features such as behavior), with non-taxon-specific ‘quality’ terms from the Phenotype and Trait Ontology (PATO)
<xref ref-type="bibr" rid="pone.0010708-Gkoutos1">[9]</xref>
,
<xref ref-type="bibr" rid="pone.0010708-Sprague1">[10]</xref>
,
<xref ref-type="bibr" rid="pone.0010708-Mungall1">[11]</xref>
has been employed in phenotype descriptions of model organism mutants, where it has been shown to facilitate the identification of biologically similar phenotypes in different species
<xref ref-type="bibr" rid="pone.0010708-Washington1">[12]</xref>
. Ontologies and the EQ formalism have also recently been applied to the standardization of taxonomic descriptions
<xref ref-type="bibr" rid="pone.0010708-Mik1">[13]</xref>
. While the curation of data from the literature, and the annotation or tagging of those data using ontology terms, may be practices that are less familiar to evolutionary biologists than those in molecular genetics communities, they are nonetheless closely analogous to the curation of museum specimens and their associated metadata, such as locality.</p>
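<p>The query behavior described above can be illustrated with a minimal sketch. It assumes a toy ontology held as (child, relation, parent) triples; only the statement that dorsal fin ray is part_of dorsal fin comes from the text, and the other terms, the data layout, and the function name are hypothetical and do not reflect the Teleost Anatomy Ontology or any Phenoscape software.</p>
<preformat preformat-type="code">
```python
from collections import defaultdict

# Hypothetical toy ontology: (child, relation, parent) triples.
# Only "dorsal fin ray part_of dorsal fin" is taken from the text;
# the remaining triples are illustrative assumptions.
EDGES = [
    ("dorsal fin ray", "part_of", "dorsal fin"),
    ("dorsal fin", "is_a", "median fin"),
    ("anal fin", "is_a", "median fin"),
]


def expand_query(term, edges=EDGES):
    """Return the query term plus every term linked to it, directly or
    transitively, by an is_a or part_of relationship pointing at it."""
    children = defaultdict(list)
    for child, _relation, parent in edges:
        children[parent].append(child)

    found, stack = {term}, [term]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found
```
</preformat>
<p>In this sketch a query for a broad term collects annotations made to more specific terms, which is the grouping behavior the ontology relationships are meant to provide.</p>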
<p>Phenotypic variability across species has been documented in rich natural language in the comparative literature of evolutionary biology and most formally in phylogenetic systematics. This variability is described in systematic characters, which consist of two or more character states contrasting some aspect (e.g., morphology, behavior) of the taxa under study
<xref ref-type="bibr" rid="pone.0010708-Sereno1">[14]</xref>
. Character states are assigned to taxa in a character-by-taxon matrix that is analyzed with phylogenetic methods to infer hypotheses of evolutionary relationships. The EQ formalism has been suggested as a means to integrate data across systematic studies and with phenotypes and genetics of model organisms
<xref ref-type="bibr" rid="pone.0010708-Dahdul1">[15]</xref>
,
<xref ref-type="bibr" rid="pone.0010708-Mabee1">[16]</xref>
,
<xref ref-type="bibr" rid="pone.0010708-Mabee2">[17]</xref>
. For example, a character may describe how a structure (e.g., supraorbital bone) and its attribute (e.g., shape) vary among taxa (
<xref ref-type="fig" rid="pone-0010708-g001">Figure 1</xref>
); the character states specify the value of the attribute (e.g., sigmoid). When EQ syntax is compared with systematic characters, the quality term represents the character state and, by implication through the subtype relationships of the quality ontology, the attribute of the character (e.g.,
<italic>sigmoid</italic>
is a subtype of
<italic>shape</italic>
,
<xref ref-type="fig" rid="pone-0010708-g001">Figure 1</xref>
).</p>
<fig id="pone-0010708-g001" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0010708.g001</object-id>
<label>Figure 1</label>
<caption>
<title>A systematic character and state, compared to a phenotype represented by Entity-Quality syntax.</title>
<p>In EQ syntax, the entity being described is represented by a term from an anatomical ontology, and the variable (“characteristic”) aspect of the entity is represented using a term chosen from a quality ontology. Note that the “shape” attribute is an explicit part of the systematic character but is expressed only implicitly within the quality term.</p>
</caption>
<graphic xlink:href="pone.0010708.g001"></graphic>
</fig>
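<p>The correspondence between a systematic character and an EQ statement can be sketched in a few lines. The class name, the dictionary standing in for a quality ontology, and the lookup method are hypothetical; only the example terms (supraorbital bone, shape, sigmoid) come from the text, and the sketch is not the data model used by Phenex.</p>
<preformat preformat-type="code">
```python
from dataclasses import dataclass

# Hypothetical fragment of a quality ontology, child -> parent (is_a).
# Only "sigmoid is a subtype of shape" is taken from the text.
QUALITY_IS_A = {"sigmoid": "shape", "triangular": "shape", "shape": "quality"}


@dataclass(frozen=True)
class EQStatement:
    entity: str   # term from an anatomy ontology
    quality: str  # term from a quality ontology

    def attribute(self, attributes=("shape",)):
        """Walk up the is_a chain from the quality term until a known
        attribute term is reached; return None if there is none."""
        term = self.quality
        while term is not None:
            if term in attributes:
                return term
            term = QUALITY_IS_A.get(term)
        return None


# Character "supraorbital bone, shape" with state "sigmoid".
statement = EQStatement(entity="supraorbital bone", quality="sigmoid")
```
</preformat>
<p>The attribute is never stored in the statement itself; it is recovered from the subtype hierarchy, which mirrors the implicit treatment of "shape" described in the figure caption.</p>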
<p>Software tools specific to the type of data being curated have proven to be a critical ingredient of an efficient annotation workflow (e.g., for journal articles using Textpresso
<xref ref-type="bibr" rid="pone.0010708-Muller1">[18]</xref>
; or gene structures using Apollo
<xref ref-type="bibr" rid="pone.0010708-Lewis1">[19]</xref>
). To link ontology terms with phenotypic systematic characters and taxa, we developed the Phenex curation tool
<xref ref-type="bibr" rid="pone.0010708-Balhoff1">[20]</xref>
. Upon launch, Phenex automatically downloads the most recent versions of the required anatomy, quality, taxonomy, and other ontologies. Phenex allows curators to use EQ syntax to represent evolutionary characters
<xref ref-type="bibr" rid="pone.0010708-Mabee1">[16]</xref>
,
<xref ref-type="bibr" rid="pone.0010708-Mabee2">[17]</xref>
. Using Phenex, this simple combinatorial EQ syntax can be elaborated, for example, to accommodate multiple related entities and to describe complex entities.</p>
<p>As part of the Phenoscape Project (
<ext-link ext-link-type="uri" xlink:href="http://www.phenoscape.org">http://www.phenoscape.org</ext-link>
), which aims to integrate model organism data with evolutionary phenotype data using ontologies, we mounted a large-scale initiative to curate a significant data set of evolutionarily varying phenotypes from the phylogenetic systematic literature. Specifically, we curated 4,617 characters pertaining to a monophyletic group of teleostean fishes, the Ostariophysi (catfishes, characins, knifefishes, carps, and minnows;
<xref ref-type="bibr" rid="pone.0010708-Fink1">[21]</xref>
), which also includes the model organism, zebrafish (
<italic>Danio rerio</italic>
). There are many decisions and details involved in the implementation that are not necessarily intuitive or straightforward; these decisions affect how the annotated data can be used. As a result of this experience, we recommend standards and procedures that will enable more consistent and efficient translation of complex phenotype descriptions into computable data. These in turn will enable accurate character and phenotype comparisons and integration on a much broader scale. The principles and best practices for the curation of complex phenotypes that we have developed from this exercise are generally applicable, as are the challenges inherent in aligning rich textual descriptions with ontologies and syntactic relations.</p>
</sec>
<sec id="s2" sec-type="methods">
<title>Methods</title>
<sec id="s2a">
<title>Curation software and ontologies</title>
<p>For annotation of the 47 studies from the fish phylogenetic literature with EQ syntax, we used Phenex
<xref ref-type="bibr" rid="pone.0010708-Balhoff1">[20]</xref>
, the annotation software that we developed for evolutionary biologists to link phenotype descriptions (characters and character states) with ontology terms. Phenex can be configured to load any ontology in OBO (Open Biological and Biomedical Ontologies;
<xref ref-type="bibr" rid="pone.0010708-Smith1">[7]</xref>
) format. We configured it to load the Teleost Anatomy Ontology (TAO)
<xref ref-type="bibr" rid="pone.0010708-Dahdul1">[15]</xref>
and Teleost Taxonomy Ontology (TTO) (Midford et al., in prep), in addition to several other shared community ontologies including the Phenotype and Trait Ontology (PATO), Gene Ontology (GO), Spatial Ontology (BSPO), Relations Ontology (RO), Evidence Code Ontology (ECO), and Unit Ontology (UO). These ontologies are available for download from the OBO Foundry
<xref ref-type="bibr" rid="pone.0010708-Open1">[22]</xref>
. A list of museum codes
<xref ref-type="bibr" rid="pone.0010708-Available1">[23]</xref>
derived from the Catalog of Fishes
<xref ref-type="bibr" rid="pone.0010708-Eschmeyer1">[24]</xref>
was also loaded into Phenex. Phenex files were saved in NeXML format, a phylogenetic data exchange standard that permits systematic data to be tagged with ontology terms
<xref ref-type="bibr" rid="pone.0010708-Available2">[25]</xref>
.</p>
</sec>
<sec id="s2b">
<title>Literature selection and collection</title>
<p>An initial list of 420 studies, including published species descriptions, taxonomic revisions, phylogenetic studies, and unpublished theses and dissertations, was compiled from suggestions by 10 experts on the morphology of ostariophysan fishes and close relatives. These experts helped prioritize the list to emphasize studies on higher-level groups for broad taxonomic coverage, and studies that included data matrices for the efficient annotation of phenotypic characters. We describe here the results of curation of all characters reported in a total of 47 studies on ostariophysans, their clupeomorph relatives, and some euteleosts (percomorphs and salmoniforms). These studies were published between 1981 and 2008. They included 26 peer-reviewed publications
<xref ref-type="bibr" rid="pone.0010708-Mayden1">[4]</xref>
,
<xref ref-type="bibr" rid="pone.0010708-Kailola1">[6]</xref>
,
<xref ref-type="bibr" rid="pone.0010708-Fink1">[21]</xref>
,
<xref ref-type="bibr" rid="pone.0010708-Albert1">[26]</xref>
–<xref ref-type="bibr" rid="pone.0010708-Zanata1">[48]</xref>
, 12 book chapters
<xref ref-type="bibr" rid="pone.0010708-Arratia1">[49]</xref>
–<xref ref-type="bibr" rid="pone.0010708-ZaraguetaBagils1">[60]</xref>
, eight Ph.D. dissertations
<xref ref-type="bibr" rid="pone.0010708-Royero1">[5]</xref>
,
<xref ref-type="bibr" rid="pone.0010708-Bockmann1">[61]</xref>
–<xref ref-type="bibr" rid="pone.0010708-Siebert1">[67]</xref>
, and one M.S. thesis
<xref ref-type="bibr" rid="pone.0010708-DiDario1">[68]</xref>
. Because this collection of studies spans the decades prior to the widespread availability of publications in electronic format, we obtained electronic versions from numerous sources: approximately half (23) were scanned from hard copies with text extracted by Optical Character Recognition (OCR), 19 were obtained online from institutional libraries, two hard copies were acquired through Interlibrary Loan, one was purchased from ProQuest UMI Dissertation Publishing, one was downloaded from the Biodiversity Heritage Library, and one was obtained from the corresponding author.</p>
</sec>
<sec id="s2c">
<title>Curator training, curation experiments, and quality control</title>
<p>Curation of the complex phenotypic descriptions contained in systematics publications required the input of domain experts who were knowledgeable about the anatomy of the target taxa. Curation was done by five ichthyologists under the direction of a lead curator (W. Dahdul). Curators were trained one-on-one by the lead curator at annotation workshops or remotely by conference call. A Guide to Character Annotation was maintained on the Phenoscape wiki
<xref ref-type="bibr" rid="pone.0010708-Phenoscape1">[69]</xref>
that kept curators up-to-date on the developing best practices for curation. The phenoscape-curators mailing list
<xref ref-type="bibr" rid="pone.0010708-1">[70]</xref>
was used for discussion and communication of data curation issues, solutions, and progress. Participation in and discussion of issues on several OBO Foundry
<xref ref-type="bibr" rid="pone.0010708-Open1">[22]</xref>
community mailing lists, particularly obo-discuss
<xref ref-type="bibr" rid="pone.0010708-obodiscuss1">[71]</xref>
and obo-phenotype
<xref ref-type="bibr" rid="pone.0010708-obophenotype1">[72]</xref>
, also contributed to the development of standards for the curation process.</p>
<p>As part of our curation quality control, we conducted two annotation experiments at Phenoscape project workshops to identify areas of improvement in curator training, ontology development, and software tools. We wanted to determine how often, and for what reasons, curators choose divergent EQ statements for the same character and character states. Curator training consisted of a hands-on group annotation exercise, and at least one full day of individual work on each curator's own publications with assistance from the lead curator and other project personnel. The Guide to Character Annotation
<xref ref-type="bibr" rid="pone.0010708-Phenoscape1">[69]</xref>
, with examples of character types commonly encountered in the fish systematic literature, was also provided to the curators. In both curation experiments, the same 10 characters sampled from the ichthyological literature were annotated by four or five curators in parallel.</p>
</sec>
<sec id="s2d">
<title>Curation workflow</title>
<p>The workflow for curation of publications (
<xref ref-type="fig" rid="pone-0010708-g002">Figure 2</xref>
) required the coordinated activities of students, taxonomic and anatomical experts, the use of specialized curation software (Phenex), online tools, and community input.</p>
<fig id="pone-0010708-g002" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0010708.g002</object-id>
<label>Figure 2</label>
<caption>
<title>Workflow for the curation of phenotypic characters from systematic studies.</title>
</caption>
<graphic xlink:href="pone.0010708.g002"></graphic>
</fig>
<sec id="s2d1">
<title>Free text entry</title>
<p>Free-text data were manually entered into Phenex by undergraduate student workers. The data entered included character and character state descriptions, taxon names, phylogenetic matrices, and specimen collection numbers. Although a few matrices were obtained from authors, most were transcribed by undergraduate students from the original publications using Mesquite
<xref ref-type="bibr" rid="pone.0010708-Maddison1">[73]</xref>
, and subsequently imported as NEXUS files into Phenex. Matrices for our publications of interest had not been deposited into public data repositories such as TreeBASE (
<ext-link ext-link-type="uri" xlink:href="http://treebase.org">http://treebase.org</ext-link>
), an online source of user-contributed phylogenetic matrices.</p>
<p>Materials (species and specimen) lists were also manually recorded using Phenex. Specimen information in the Materials section in the systematic literature is customarily organized by species, and sometimes by a higher-level taxonomic category such as family. Species names are followed by the examined voucher specimens, their institutional catalogue acronyms and numbers, and often by the number of specimens (in parentheses). The number of specimens is frequently qualified by the number in the lot that were examined and/or the number that were prepared differently (e.g., cleared and stained for bone and/or cartilage, radiographed, dry skeleton, muscles, alcohol-preserved). We curated only those specimens that were prepared for the observations that the authors documented in the character statements (typically only skeletal morphology). Additionally, some authors provided the size range of individuals in the collection and abbreviated locality information, and they sometimes indicated whether the specimens form part of the species type series; such data were not curated.</p>
</sec>
<sec id="s2d2">
<title>Selection of ontology terms for taxa</title>
<p>The Materials list and character matrix of each systematics publication contain names of the taxa and individual specimens (voucher specimens) that were examined by the authors. These form the basis of observations for phylogenetic characters. Taxon names from the Materials list and matrix were linked to currently accepted (according to the Catalog of Fishes, CoF,
<xref ref-type="bibr" rid="pone.0010708-Available2">[25]</xref>
) taxon names from the TTO by undergraduate student workers using Phenex. A taxonomic expert then reviewed the taxon list and, after verifying taxonomic status in the CoF, requested addition of names or synonyms missing from the TTO using the SourceForge term request tracker
<xref ref-type="bibr" rid="pone.0010708-Teleost1">[74]</xref>
. Unknown or unidentified species were added to the TTO with reference to the publication in parentheses (e.g.,
<italic>Akysis sp. 1 (de Pinna 1996)</italic>
TTO:10000093,
<italic>Akysis sp. 2 (de Pinna 1996)</italic>
TTO:10000094). Most synonyms were added to the TTO with a scope of RELATED (rather than BROAD, NARROW or EXACT), indicating that the relationship between the synonym and its primary term was not known. Species names incorrectly spelled by an author were added to the TTO as synonyms with a scope of EXACT and an associated synonym category of ‘misspelling.’ The eleven misspellings and missing taxon names discovered in the CoF through this process were communicated to the CoF administrators for correction or addition.</p>
<p>Some publications partially or wholly replicated the species names from the Materials list in the phylogenetic matrix. However, many publications used higher-level taxa (e.g., genus, family) for the taxonomic units represented in the matrix. Because we recorded phenotypes as properties of species unless specifically asserted to a higher-level taxon by an author, any higher-level taxon used in a matrix was replaced by all the species within that taxon as listed in the Materials list. This procedure sometimes required contacting a taxon expert for assistance in assigning species to the correct higher-level taxon in the matrix.</p>
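<p>The expansion of higher-level matrix taxa into species can be sketched as follows. This is a minimal illustration only: the function name, the dictionary layout, and the placeholder taxon names are assumptions made for the example and are not part of Phenex. The sketch also ignores the exception noted above, in which an author explicitly asserts a phenotype for the higher-level taxon itself.</p>

```python
# Hypothetical sketch; names and data layout are illustrative, not Phenex's API.

def expand_matrix_rows(matrix, materials):
    """Replace each higher-level taxon row with one row per examined species.

    matrix:    dict mapping a matrix taxon name to its list of character states
    materials: dict mapping a higher-level taxon name to the species listed
               under it in the Materials list
    """
    expanded = {}
    for taxon, states in matrix.items():
        # A taxon absent from `materials` is treated as already being a species.
        for species in materials.get(taxon, [taxon]):
            expanded[species] = list(states)
    return expanded


# Placeholder names, not real taxa.
matrix = {"Genus A": ["0", "1"], "Species c": ["1", "1"]}
materials = {"Genus A": ["Species a", "Species b"]}
rows = expand_matrix_rows(matrix, materials)
print(sorted(rows))
```

<p>Each species inherits a copy of the states scored for its higher-level taxon, so later corrections to one species do not alter the others.</p>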
</sec>
<sec id="s2d3">
<title>Selection of anatomy ontology terms for representation of character states</title>
<p>When curators encountered a term in the literature that was not in an ontology, they first assessed its use and context in the publication to determine whether it was a new term or a synonym of an existing term (synonyms include misspellings). This involved reading all uses of the term in the paper and checking figures to see whether the author provided further information. Sometimes this also required searching the referenced literature pertaining to the term. If it was deemed to be a new term, the curator wrote a corresponding genus-differentia definition
<xref ref-type="bibr" rid="pone.0010708-Smith1">[7]</xref>
and proposed the relationships of that term to other terms in the ontology. For example, a term was requested for the hypomaxilla, a bone of the upper jaw in clupeomorph fishes. The request included a proposed definition, “Dermal bone found in the anterior margin of the upper jaw, posterior to the premaxilla,” and proposed relationships to other terms (
<italic>is_a dermal bone</italic>
,
<italic>part_of palatoquadrate arch</italic>
,
<italic>part_of dermatocranium</italic>
). The curator submitted this request through the TAO SourceForge Term Tracker
<xref ref-type="bibr" rid="pone.0010708-Teleost2">[75]</xref>
, which triggered an automated email to the community mailing list
<xref ref-type="bibr" rid="pone.0010708-Teleostdiscuss1">[76]</xref>
. The ontology administrator closed the request after the conclusion of mailing list discussion, and then updated the ontology to include the requested change and associated community comments. A similar term request procedure was followed for quality terms needed for PATO.</p>
<p>An alternative to adding terms to an ontology is to create a new term at the time of annotation by post-composition, which is the process of combining terms from one or more ontologies to create a new term (
<xref ref-type="bibr" rid="pone.0010708-Mungall2">[77]</xref>
; also see Guide to Character Annotation in
<xref ref-type="supplementary-material" rid="pone.0010708.s001">Text S1</xref>
). Frequently, curators needed terms for the processes, margins, and regions of specific structures. Rather than adding these directly to an ontology, relevant terms from the spatial ontology (e.g.,
<italic>anterior margin</italic>
) and anatomy ontology (
<italic>frontal bone</italic>
) can be joined by a relation (
<italic>part_of</italic>
) to create a post-composed term (the
<italic>anterior margin</italic>
that is
<italic>part_of</italic>
the
<italic>frontal bone</italic>
). Generally, terms were post-composed when they were not expected to be used repeatedly in annotation. Those known to exist in multiple species and referenced repeatedly in the literature were added to the anatomy ontology (e.g.,
<italic>supraoccipital process</italic>
).</p>
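<p>One way to picture a post-composed term is as a small nested data structure. The sketch below is an assumption-laden illustration, not the representation used by Phenex or the ontologies: the class names <monospace>Term</monospace> and <monospace>PostComposed</monospace> and the <monospace>render</monospace> helper are hypothetical, and real terms would carry ontology identifiers rather than labels alone.</p>

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Term:
    """A named ontology term (label only in this sketch)."""
    label: str


@dataclass(frozen=True)
class PostComposed:
    """A genus term narrowed by a relation to another (possibly composed) term."""
    genus: Term
    relation: str
    filler: Union[Term, "PostComposed"]


def render(term):
    """Render a term in nested form: genus(relation(filler))."""
    if isinstance(term, Term):
        return term.label
    return f"{render(term.genus)}({term.relation}({render(term.filler)}))"


margin = PostComposed(Term("anterior margin"), "part_of", Term("frontal bone"))
print(render(margin))  # anterior margin(part_of(frontal bone))
```

<p>Because the filler may itself be post-composed, the same structure can express deeper nesting, such as a process that is part of a region that is part of a bone.</p>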
</sec>
<sec id="s2d4">
<title>Granularity of curation</title>
<p>To maximize curation consistency among curators and to meet the needs of the larger purpose of our work, which is to integrate the phenotypic data of evolutionary morphology with phenotype descriptions of zebrafish mutants, we did a ‘first pass’ curation of the characters to a coarse level of granularity. By coarse, we mean that we selected higher-level terms, or those with less specificity, for quality and sometimes entity. Coarse-level qualities from the Phenotype and Trait Ontology (PATO) are those at the attribute level such as
<italic>size</italic>
,
<italic>shape</italic>
, and
<italic>composition</italic>
(i.e., those terms in blue font in
<xref ref-type="fig" rid="pone-0010708-g003">Figure 3</xref>
). The supraorbital bone, for example, is described as having a sigmoid shape in some characiform fishes
<xref ref-type="bibr" rid="pone.0010708-Zanata1">[48]</xref>
. The coarse-level EQ annotation for this phenotype is
<monospace>E:supraorbital bone, Q:shape</monospace>
, whereas the fine-level annotation is
<monospace>E:supraorbital bone, Q:sigmoid </monospace>
(
<xref ref-type="fig" rid="pone-0010708-g004">Figure 4</xref>
). Coarse annotation meets the immediate use of linking to zebrafish genetic phenotypes in the Phenoscape Knowledgebase (
<ext-link ext-link-type="uri" xlink:href="http://kb.phenoscape.org">http://kb.phenoscape.org</ext-link>
), because most of the zebrafish phenotypes are currently annotated to a coarse level by ZFIN. In addition, the coarse-level annotations, though lacking the detail that free-text provides, do express the author's assertion that a change in some aspect of shape is evident between species. Annotations at this coarse level, i.e.,
<italic>shape</italic>
, allow aggregation of all entities and species that have experienced an evolutionary change in shape. After curation of the 47 papers at a coarse level was complete, we did a ‘second pass’ of finer-scale curation of qualities by selecting a more specific child term and finer-scale curation of some entities by using post-composition.</p>
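<p>The relationship between fine-level and coarse-level annotation can be sketched as a walk up the quality hierarchy. The parent links below are limited to two cases stated in this article (<italic>sigmoid</italic> under <italic>shape</italic>, <italic>elongated</italic> under <italic>size</italic>); the table, the function name, and the sample annotations are illustrative assumptions, and a real application would read the hierarchy from PATO.</p>

```python
# Illustrative parent links only; a real application would load these from PATO.
PARENT = {"sigmoid": "shape", "elongated": "size"}


def coarsen(quality):
    """Return the attribute-level ancestor of a quality, or the quality itself."""
    while quality in PARENT:
        quality = PARENT[quality]
    return quality


# Fine-level (entity, quality) pairs used as sample data.
annotations = [("supraorbital bone", "sigmoid"), ("ischiac process", "elongated")]

# Aggregate every entity annotated with some kind of shape change.
shape_entities = [entity for entity, quality in annotations
                  if coarsen(quality) == "shape"]
print(shape_entities)  # ['supraorbital bone']
```

<p>A query at the attribute level therefore retrieves fine-level annotations as well, provided the fine-level term sits under that attribute in the ontology.</p>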
<fig id="pone-0010708-g003" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0010708.g003</object-id>
<label>Figure 3</label>
<caption>
<title>Attribute-level quality terms from the Phenotype and Trait Ontology used to curate systematic characters.</title>
<p>Terms in blue font represent the higher-level concepts used to describe phenotypic variation at a coarse level (see
<xref ref-type="fig" rid="pone-0010708-g004">Figure 4</xref>
). Qualities are divided into those that inhere in a single entity (
<italic>quality of single physical entities</italic>
; green fill) and those that inhere in multiple entities (
<italic>quality of related physical entities</italic>
; red fill). Examples of children of some terms are also shown.</p>
</caption>
<graphic xlink:href="pone.0010708.g003"></graphic>
</fig>
<fig id="pone-0010708-g004" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0010708.g004</object-id>
<label>Figure 4</label>
<caption>
<title>Comparison of coarse-level and fine-level phenotype annotations for the observation of a sigmoid-shaped supraorbital bone.</title>
</caption>
<graphic xlink:href="pone.0010708.g004"></graphic>
</fig>
</sec>
<sec id="s2d5">
<title>Evidence codes for phenotype observations</title>
<p>We recorded phenotype descriptions as properties of species, and these were assigned one of three evidence codes based on the type of evidence given by an author. These codes are part of the Evidence Codes Ontology
<xref ref-type="bibr" rid="pone.0010708-Evidence1">[78]</xref>
, which is used by the broader biological community. A phenotype description that is explicitly tied to a specimen was assigned IVS (Inferred from Voucher Specimen); these referenced an institutional catalog number. A phenotype description in which the author does not reference a specimen was given one of two weaker evidence codes (NAS, Non-traceable Author Statement, or TAS, Traceable Author Statement); no catalog number could be associated. NAS is used for statements that an author makes with no results or citation presented. TAS is used for author statements that are attributable to another source. This same methodology was extended to statements about higher-level taxa. Here, species-level phenotype annotations were generated for every species included in the higher-level taxon (as listed in the Materials list). In this case, these particular species were given a strong evidence code (such as IVS) with catalog numbers attached. When the author did not reference the species that were observed to make character assertions about higher-level taxa, the higher-level taxa were assigned a weaker evidence code. The phenotypes described in most of the publications that we curated were based on observations of voucher specimens and so merited the strong IVS evidence code.</p>
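<p>The choice among the three evidence codes can be summarized as a simple decision rule. The function below is a sketch under stated assumptions: its name and parameters are hypothetical, the catalog number in the usage example is a placeholder, and it does not cover the special handling of higher-level taxa described above.</p>

```python
def evidence_code(catalog_numbers, cites_other_source):
    """Pick an evidence code for one phenotype statement.

    catalog_numbers:    voucher catalog numbers tied to the statement (may be empty)
    cites_other_source: True if the author attributes the statement to another source
    """
    if catalog_numbers:
        return "IVS"  # Inferred from Voucher Specimen
    if cites_other_source:
        return "TAS"  # Traceable Author Statement
    return "NAS"      # Non-traceable Author Statement


# Placeholder catalog number, not a real specimen.
print(evidence_code(["XXX 12345"], False))  # IVS
print(evidence_code([], True))              # TAS
print(evidence_code([], False))             # NAS
```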
</sec>
<sec id="s2d6">
<title>Review of annotations for consistency among curators</title>
<p>Annotation summary reports in the form of a spreadsheet containing annotations for all 47 publications were generated regularly from the Phenoscape Knowledgebase to review annotation consistency. Additionally, Phenex can export files in Excel format, so that consistency can be checked for individual files. We developed a series of review tasks that checked for proper EQ syntax and consistent annotation of different character types. These include checking that a related entity was recorded when a relational quality was used; checking that a related entity was not recorded with a quality of a single entity; checking for incomplete annotations (e.g., only one state annotated); and checking that post-composed terms were nested correctly (e.g.,
<italic>process</italic>
(
<italic>part_of</italic>
(
<italic>anterior region</italic>
(
<italic>part_of</italic>
(
<italic>maxilla</italic>
)))) versus
<italic>process</italic>
(
<italic>part_of</italic>
(
<italic>anterior region</italic>
))((
<italic>part_of</italic>
(
<italic>maxilla</italic>
))) and created with logically correct relations (e.g.,
<italic>part_of</italic>
,
<italic>connected_to</italic>
,
<italic>overlaps_with</italic>
,
<italic>adjacent_to</italic>
and spatial relations such as
<italic>anterior_to</italic>
). Undergraduate student workers also proofread the curated files to check for the correct transcription of data matrices and numerical values of counts.</p>
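<p>Two of the syntax checks described above lend themselves to a short sketch. The code is illustrative and is not the review tooling used by the project: the function name is hypothetical, and the set of relational qualities is a small subset drawn from examples in this article.</p>

```python
# Illustrative subset of relational qualities; not a complete list from PATO.
RELATIONAL_QUALITIES = {"fused with", "separated from"}


def check_annotation(entity, quality, related_entity=None):
    """Return a list of syntax problems found in one EQ annotation."""
    problems = []
    if quality in RELATIONAL_QUALITIES and related_entity is None:
        problems.append("relational quality without a related entity")
    if quality not in RELATIONAL_QUALITIES and related_entity is not None:
        problems.append("related entity recorded with a single-entity quality")
    return problems


print(check_annotation("medial anal-fin radial", "fused with"))
print(check_annotation("supraorbital bone", "shape"))
```

<p>An empty list means the annotation passed both checks; running such checks over an exported spreadsheet would flag rows for manual review rather than correct them automatically.</p>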
</sec>
<sec id="s2d7">
<title>Author contact and verification</title>
<p>After the curation of each publication was completed and verified for consistency, the primary author was notified by email that their published data had been curated for inclusion in the Phenoscape Knowledgebase. Authors were sent a spreadsheet with the original character and taxonomic data and its ontological representation (e.g., free-text character descriptions vs. EQ phenotypes; published vs. currently accepted taxonomic names). Authors were invited to send suggestions or corrections prior to upload of the data to the public version of the Phenoscape Knowledgebase. Two authors returned corrections to their published data.</p>
</sec>
<sec id="s2d8">
<title>Data upload to Phenoscape Knowledgebase</title>
<p>Phenotypes and corresponding matrix information were loaded into the Phenoscape Knowledgebase, a relational database built on the Ontology-Based Database (OBD) schema, in which all data are represented as semantic links between ontology terms (Kothari et al., in preparation). The deductive reasoning and the query interface of OBD support analyses of the anatomical and taxonomic disposition of phenotypic annotations at any level of granularity that is present within the logical structure of the ontologies. In addition to the Knowledgebase itself, we created a web interface (
<ext-link ext-link-type="uri" xlink:href="http://kb.phenoscape.org/">http://kb.phenoscape.org/</ext-link>
) that allows users to browse and query the phenotype data in ways that exploit the ontological context.</p>
</sec>
</sec>
</sec>
<sec id="s3">
<title>Results</title>
<sec id="s3a">
<title>Characteristics of the data reported in the source publications</title>
<p>From our comprehensive undertaking to represent complex phenotype data using ontologies, patterns emerged in how characters, taxa, specimens, and matrices were presented in the different studies. These led to the creation of annotation guidelines (summarized in
<xref ref-type="supplementary-material" rid="pone.0010708.s001">Text S1</xref>
;
<xref ref-type="bibr" rid="pone.0010708-Phenoscape1">[69]</xref>
). The standards and the variation encountered in the literature also drove the development of our annotation software (Phenex).</p>
<sec id="s3a1">
<title>Character and character states and EQ annotation</title>
<p>The curation of 4,617 characters, or 10,512 character states, resulted in 12,861 ontology-based phenotypes or Entity-Quality statements. Characters and character states were divided among several categories in the process of EQ annotation. First we distinguished among characters and states that involved a single entity vs. those that involved two or more entities, terminology that follows the division of quality terms in PATO
<xref ref-type="bibr" rid="pone.0010708-Mungall1">[11]</xref>
. For example, a character might involve a single anatomical structure, such as the shape of the dorsal fin (“dorsal fin … acuminate”
<xref ref-type="bibr" rid="pone.0010708-Arratia1">[49]</xref>
) versus a character that involves the relationship between two structures (“dorsal fin origin anterior to that of pelvic fin”
<xref ref-type="bibr" rid="pone.0010708-Arratia1">[49]</xref>
). Selection of specific entities from the appropriate ontology was generally the next step in the curation process (described further in Discussion). The third step was to determine the particular quality, initially at the attribute level, that is required to represent the phenotype described in the character state. If a single entity is involved, a monadic quality (
<italic>quality of a single physical entity</italic>
;
<xref ref-type="fig" rid="pone-0010708-g003">Figure 3</xref>
), i.e., one that inheres in a single entity, is required. If two (or more) entities are involved in the phenotype, frequently a relational quality (
<italic>quality of related physical entities</italic>
;
<xref ref-type="fig" rid="pone-0010708-g003">Figure 3</xref>
), i.e., one that inheres between multiple entities, is required. Size comparisons among entities require special consideration, and involve monadic qualities (see
<xref ref-type="supplementary-material" rid="pone.0010708.s001">Text S1</xref>
). Last, we considered whether a character state contained single or multiple logical qualities and thus required single (non-composite characters) or multiple (composite characters) EQ statements. Frequently we found that several different attribute qualities (e.g.,
<italic>color</italic>
and
<italic>shape</italic>
;
<xref ref-type="fig" rid="pone-0010708-g003">Figure 3</xref>
) were required for the annotation of composite characters (see Systematic Character Types and Application of EQ Formalism).</p>
</sec>
<sec id="s3a2">
<title>Taxonomic names</title>
<p>From the 47 publications, we curated phenotype data to 3,449 taxa (mostly species). These comprised 2,682 nonredundant names, which corresponded to 2,410 valid names (according to the Catalog of Fishes, CoF
<xref ref-type="bibr" rid="pone.0010708-Eschmeyer1">[24]</xref>
). Of the 2,682 names, 729 are now invalid or were misspelled. The invalid names were annotated to the currently valid name as listed in CoF using the TTO, or the invalid or misspelled name was added as a synonym, if not already present in CoF. Three hundred and ten taxon names were added to the TTO as unknown, uncertainly identified, or unnamed taxa (out of 36,895 taxonomic terms total). We included reference to the author(s) and year in the term name for these 310 publication-specific taxa (e.g.,
<italic>Akysis sp. 1 (de Pinna 1996)</italic>
, TTO:10000093).</p>
</sec>
<sec id="s3a3">
<title>Specimens and materials examined</title>
<p>In most of the publications (42 of the 47) a list of materials was presented, giving the provenance and other information about the particular specimens that were examined. Synthesis papers
<xref ref-type="bibr" rid="pone.0010708-Lundberg1">[55]</xref>
or book chapters
<xref ref-type="bibr" rid="pone.0010708-Cavender1">[51]</xref>
,
<xref ref-type="bibr" rid="pone.0010708-Coburn1">[52]</xref>
frequently did not include a Materials list, but specimen collection numbers were sometimes provided in figure captions (e.g.,
<xref ref-type="bibr" rid="pone.0010708-Johnson1">[36]</xref>
). We obtained a list of materials examined from the authors of these papers where possible. Some authors referred to previous publications for a full list of materials (e.g.,
<xref ref-type="bibr" rid="pone.0010708-Bornbusch1">[28]</xref>
,
<xref ref-type="bibr" rid="pone.0010708-Weitzman1">[59]</xref>
), and where feasible, we curated specimens from these.</p>
</sec>
<sec id="s3a4">
<title>Character-by-taxon matrices</title>
<p>Forty-five of the 47 curated publications included data matrices. Of the two matrices that were lacking, one was supplied by the author
<xref ref-type="bibr" rid="pone.0010708-Mayden1">[4]</xref>
and the other was reconstructed from the text
<xref ref-type="bibr" rid="pone.0010708-Fink1">[21]</xref>
. Some matrices contained numerical character states that were not textually described. We annotated EQ statements for only those character states that were documented.</p>
<p>Higher-level taxa appeared in 35 of the 45 published data matrices. These taxa were expanded to the species level to represent the particular species examined in a publication. As mentioned previously, this was not always straightforward because some authors did not indicate which species belong to these higher-level taxa used in the matrix. For example, one author
<xref ref-type="bibr" rid="pone.0010708-dePinna3">[63]</xref>
categorized species by family in the Materials list, but additionally used subfamilies in the matrix. In this case, curation of species to the correct matrix subfamily (and thus to the correct phenotype descriptions) required personal communication with an expert taxonomist. Additionally, some authors (e.g.,
<xref ref-type="bibr" rid="pone.0010708-Mo1">[65]</xref>
) organized species in the Materials list by higher-level taxa proposed in their own study (and reflected in their matrices) instead of by currently recognized higher-level taxa. Again, curation of species to the correct matrix taxon, and thus to the correct character data, required personal communication with an expert taxonomist.</p>
</sec>
</sec>
<sec id="s3b">
<title>Ontology growth</title>
<p>As a result of literature curation, the ontologies that we used grew in number of new terms, synonyms, definitions, and relationships among terms. The TAO more than doubled its skeletal terms, from 253 in version 1
<xref ref-type="bibr" rid="pone.0010708-Dahdul1">[15]</xref>
to 644 skeletal terms of 2,662 total in the most recent version (March 2010). The TTO grew to 36,895 taxon terms, including 154 fossil taxa (from an initial 36,080 terms with no fossil taxa), 43,215 synonyms (from an initial 38,269), 30,865 species, 5,107 genera, and 551 families. Our curators contributed 16 terms, four synonyms, three name changes, and one relationship change to PATO. We added four terms (
<italic>parental care</italic>
,
<italic>oral incubation</italic>
,
<italic>adult foraging behavior</italic>
,
<italic>foraging by probing substrate</italic>
) to the Biological Process hierarchy of the Gene Ontology (GO-BP), and 41 terms and six synonyms to the Spatial Ontology. Museum codes were based on the CoF list
<xref ref-type="bibr" rid="pone.0010708-Fricke1">[79]</xref>
(479 entries) that was enhanced with 37 additional codes identified during the process of paper curation.</p>
</sec>
<sec id="s3c">
<title>Curation experiments and curator consistency</title>
<p>In the first curation experiment, only one of 10 characters was annotated identically among four curators. The reasons for this variability among curators included curation software bugs, difficult aspects of the ontologies (e.g., lack of appropriate quality terms from PATO), lack of standardized guidelines for unusual cases, and differing interpretations of the text descriptions. For the second experiment, curators were told to curate characters to a coarse level of granularity for quality. In this experiment, a greater proportion of characters were annotated correctly to the higher-level quality term (
<xref ref-type="fig" rid="pone-0010708-g003">Figure 3</xref>
) although only two of the 10 characters were annotated identically among curators for the more specific child term. The overall variability in annotation consistency resulted from different interpretations of shape and size descriptors, inexperience and unfamiliarity with the ontologies and software, difficulties in creating post-compositional terms, and lack of adequate terms in the ontologies, particularly for shape descriptors.</p>
</sec>
<sec id="s3d">
<title>Curation effort</title>
<p>The time required for curation of the chosen papers
<xref ref-type="bibr" rid="pone.0010708-Mayden1">[4]</xref>
–
<xref ref-type="bibr" rid="pone.0010708-Kailola1">[6]</xref>
,
<xref ref-type="bibr" rid="pone.0010708-Fink1">[21]</xref>
,
<xref ref-type="bibr" rid="pone.0010708-Albert1">[26]</xref>
–
<xref ref-type="bibr" rid="pone.0010708-DiDario1">[68]</xref>
to the level described herein was approximately 5 person-years. This included significant time investment by personnel in software development, testing and improvement, initiation of new ontologies, and development of curation standards and workflow.</p>
</sec>
</sec>
<sec id="s4">
<title>Discussion</title>
<p>Our experience in successfully transforming a large collection (10,512 character states) of legacy systematic character data into the ontology-based EQ syntax led us to recognize several distinct systematic character types with respect to the logical categorization enforced by ontologies. It also contributed to the growth and improvement of several domain and community ontologies, and it resulted in the development of standards and best practices for phenotype curation. Moreover, it offers a new view of morphological characters that is valuable for practicing systematists.</p>
<sec id="s4a">
<title>Systematic character types and application of EQ formalism</title>
<p>Systematists use the expressiveness and richness of natural language to describe precisely the morphological variation that they observe among species. These phenotype descriptions are represented in a somewhat formalized way as characters and character states in the systematic literature
<xref ref-type="bibr" rid="pone.0010708-Sereno1">[14]</xref>
. EQ formalism provides a rigorous yet flexible syntax for these data; it is, to some extent, a ‘natural fit’. As previously described (
<xref ref-type="fig" rid="pone-0010708-g001">Figure 1</xref>
), systematic characters typically consist of a short character header (e.g., maxilla shape) denoting the relevant structure(s) (entity: maxilla) and attribute (high-level quality: shape) that varies among taxa, followed by several character states that specify the value of the quality (round, rectangular, triangular, etc.). Many systematic characters, however, do not follow this format (see
<xref ref-type="bibr" rid="pone.0010708-Sereno1">[14]</xref>
for review), and a standardized logical framework, consistent with EQ formalism, has been recently proposed
<xref ref-type="bibr" rid="pone.0010708-Sereno1">[14]</xref>
. In systematic characters, entities and qualities can be found in the character description, in the character state description, or in both. Irrespective of this, the morphological descriptions of variants among species in the literature conform to the general formalism of EQ syntax and semantics. From the breadth of our curation work emerged standards and recommendations for deploying this formalism, and we relate these below and in our Guide to Character Annotation (
<xref ref-type="bibr" rid="pone.0010708-Phenoscape1">[69]</xref>
;
<xref ref-type="supplementary-material" rid="pone.0010708.s001">Text S1</xref>
).</p>
<p>Systematists typically represent only one aspect of a structure in a character state. An example from bird systematics involves variation in the shape of the external naris in stem rollers: “ovoid (0); triangular with a flat ventral margin (1)”
<xref ref-type="bibr" rid="pone.0010708-Clarke1">[80]</xref>
. Here each character state corresponds to a single phenotype: state 0, for example, corresponds to the EQ:
<monospace>E:external naris, Q:ovoid</monospace>
. Occasionally, however, authors represent observations in ways that may be interpreted as either monadic or relational. An example from fishes
<xref ref-type="bibr" rid="pone.0010708-Zanata1">[48]</xref>
is a character involving two elements of the anal fin described as: “Presence or absence of fusion of medial and proximal anal-fin radials: (0) absent; (1) present.” Rather than annotate this as a monadic character by adding a new term to the anatomy ontology, i.e. “fused medial and proximal anal-fin radials”, which is a complex entity not named in the literature, we annotated this as a relational character using the qualities
<italic>fused with</italic>
and
<italic>separated from</italic>
to describe the relationship between two separate entities (state 0:
<monospace>E:medial anal-fin radial, Q:separated from, RE:proximal anal-fin radial</monospace>
; state 1:
<monospace>E:medial anal-fin radial, Q:fused with, RE:proximal anal-fin radial</monospace>
). The advantage of representing this using two separate recognized and defined entities is that they can thereby be linked to other annotations of these entities.</p>
<p>Character states were generally translated into a single EQ statement, but not uncommonly we noted that multiple aspects of a structure or multiple structures are described within a single character state, requiring the annotation of multiple EQ statements. This may reflect an investigator's observation that the structures co-vary, and perhaps an assumption that they are non-independent and thus represent a single character state. We termed these ‘composite’ character states. For example, variation in the pectoral fin is described as follows
<xref ref-type="bibr" rid="pone.0010708-Albert1">[26]</xref>
: “Pectoral fin size. 0: pectoral fin large and pigmented; more than 43% head length; membrane infused with numerous small chromatophores; 1: pectoral fin small and unpigmented; less than 43% head length; membrane without chromatophores.” Here each character state corresponds to two phenotypes (e.g., state 0, EQ1:
<monospace>E:pectoral fin, Q:size∧increased_in_magnitude_relative_to (E:pectoral fin in_taxon X)</monospace>
and EQ2:
<monospace>E:pectoral fin, Q:pigmented</monospace>
). Dividing the distinct logical components (size and color in this case) into multiple EQ statements is necessary for reasoning with them independently using ontologies – and possibly, but not necessarily, for phylogenetic character construction. By recording separate EQ statements, one may query on independent logical qualities of a character state (e.g.,
<italic>size</italic>
) and expect to find similar annotations. However, if a systematist splits them into separate characters, the weight of potentially non-independent characters in the phylogenetic analysis increases. On the other hand, representing them as a single character may underrepresent them in the analysis.</p>
<p>Systematic characters are sometimes framed such that different character states involve different logical qualities (e.g.,
<italic>absent</italic>
and
<italic>shape</italic>
). For example, variation in the spermatophoral gland in brachiopods (a phylum of invertebrates) is described with three states
<xref ref-type="bibr" rid="pone.0010708-Santagata1">[81]</xref>
: “Absent, Simple, Composite”. Although some systematists have previously raised concerns about the logical structure of these characters, it does not pose a problem for phenotype annotation from a practical standpoint, e.g., the EQ statements for these character states are:
<monospace>E:spermatophoral gland, Q:absent; E:spermatophoral gland, Q:simple; E:spermatophoral gland, Q:composite</monospace>
(see
<xref ref-type="supplementary-material" rid="pone.0010708.s001">Text S1</xref>
for discussion of the semantics for
<italic>absent</italic>
and
<italic>present</italic>
). From the standpoint of reasoning with ontologies within a database, presence can be implied by an annotation to any quality term (e.g.,
<italic>tubular</italic>
or
<italic>lamellar</italic>
) other than
<italic>absent</italic>
.</p>
<p>The natural language used in the original description of systematic characters sometimes corresponds to a term in an ontology that is different from the author's intent; in other words, there is a mismatch between an author's free-text description and the literal match to an ontology term. For example, authors sometimes describe variability in the shape of a structure using terms that are not types of
<italic>shape</italic>
in the quality ontology. Variation in the pelvic bone, for example, is described as
<xref ref-type="bibr" rid="pone.0010708-Britto1">[29]</xref>
: “Shape of ischiac process: small or posteriorly elongate (0); falciform (1); falciform and strongly developed (2).” To represent “small or posteriorly elongate” using PATO qualities, the qualities
<italic>size∧decreased_in_magnitude_relative_to</italic>
(E in_taxon X) and
<italic>elongated</italic>
are applied. However, the parent of these two terms is
<italic>size</italic>
, not
<italic>shape</italic>
. A consequence of this mismatch between natural language and the ontology is that querying the resulting annotations for “pelvic bone” and “shape” will not return annotations corresponding to the author's state 0. Thus applying mutually exclusive terminology to annotate these aspects from a morphological description can be difficult and possibly result in misrepresentation of an author's intention. Many investigators, however, would recognize that many aspects of size variation also relate to shape and
<italic>vice versa</italic>
. Size and shape terms were a frequent source of inconsistency among curators. In cases such as the example above, we recommended that curators annotate coarsely to
<italic>shape</italic>
.</p>
<p>The process of dissociating the states of some (composite) characters into multiple, distinct EQ statements, and the states of other characters into EQ statements with different logical values, has implications for their potential use in phylogenetic analysis. Characters are the units of homology in phylogenetic analysis, and the alternative character states have been judged by the researcher to be homologues of one another. To the extent that they are atomized using EQ, they may lose their genealogical connection. On the other hand, the task of homologizing complex phenotypic characters across different studies and taxa has thus far proved difficult, if not impossible, for phylogeneticists, and it may be the case that EQ statements provide the broad initial grouping of characters and character states that facilitates subsequent broader-scope phylogenetic evaluation.</p>
</sec>
<sec id="s4b">
<title>Curation of taxa</title>
<p>Our finding that more than one quarter (729 of 2,682) of the species names used in the 47 curated publications were outdated was unexpected, given the recency of these publications (1981–2008). These fishes may present an unusual case, however, because two of the groups that we curated (catfishes and cypriniforms) have undergone extensive recent taxonomic revision. Given that some taxa will be revised more frequently than others, the rapid turnover in taxonomic names draws attention to the need for adaptable resources such as taxonomy ontologies (e.g., the TTO) that record the relationships among not only current but also synonymous taxonomic names, thus supporting comparisons across studies in the literature. Ontologies provide the capability to accommodate the needs of the specific literature or type of data under curation.</p>
<p>Phenotypes are recorded from observations on individual organisms in both model organism genetics and evolutionary biology. Curation of the evolutionary literature, however, presented a special challenge because it required distinguishing between author statements that were based on direct observation of specimens and generalizations to higher-level taxa. We discovered that authors represented species observations using one or more higher-level taxa in more than 75% of the published data matrices that we curated. Generalizing to a higher-level taxon from observations on a single or only a few exemplar species is in fact common practice in systematic studies of many taxonomic groups. Sometimes an author explicitly asserts in the corresponding text that a particular phenotype pertains to all species included in a higher level taxon, but other times using a higher-level taxon in a matrix is simply shorthand for reference to the species that were actually examined. To compare data across multiple studies, however, it is critical to interpret appropriately the meaning of the author's use of these higher-level taxa.</p>
</sec>
<sec id="s4c">
<title>Curation of phenotypes in the legacy literature: challenges and feasibility</title>
<p>The rich literature that documents the similarities and differences among taxa goes back several centuries, spans many languages, and at first appearance, seems almost insurmountably large to render computable using ontology-based curation methods. By initially focusing on large-scale treatments where phenotypic descriptions are most formalized, i.e., the phylogenetic studies, we reduced the number of papers for more than 8,000 species of ostariophysan fishes to approximately 50. This struck us as surprisingly few such papers; however, given that the phylogenetic approach has been mainstream only over the past 30+ years, and that morphological treatments of this sort may take an author 6–10 years to produce, the number may well be representative for other taxonomic groups. If this is the case, with annotation software and ontologies in place at the outset, we estimate that similar phenotype annotation projects can be done in possibly half the time (approximately 2.5 person years). Additionally, by incorporating semi-automated methods to extract character states from the literature and associate ontology terms, the time involved could be further reduced. A significant level of phenotypic data, however, remains in non-phylogenetic studies, e.g., species descriptions, and methods for efficient EQ curation of this literature remain a critical challenge.</p>
<p>New attempts to curate phenotypes from the legacy literature of other taxonomic groups (phylogenetic treatments or not) will require overcoming the initial hurdle of creating new ontologies or expanding existing ones for the inclusion of terms required to represent the diversity of organism features and taxonomy under consideration. In our experience, we found it efficient to build from existing ontologies where available. We used, for example, the existing PATO ontology for annotation of qualities, and we built the Teleost Anatomy Ontology (TAO) from the Zebrafish Anatomical Ontology
<xref ref-type="bibr" rid="pone.0010708-Dahdul1">[15]</xref>
. As we annotated the literature, we concurrently added required terms and relationships to these and other ontologies. In this way, ontology growth and development were driven by active curation of the literature. For example, in the course of curating evolutionary phenotypes for the fishes in this study, we more than doubled the number of primary terms (not including synonyms) for the skeletal system axis of the TAO. In contrast, no new terms were proposed for cell types, embryonic structures, or the immune system, because no evolutionary variants of these anatomical structures were documented in the literature we curated. As a consequence, ontologies might appear to be incomplete or missing basic terms, and may not provide the encyclopedic knowledge that some may expect of an anatomy ontology. Although terms, definitions, and relationships can be supplied at any time, the effort required for curators to break away from direct annotation and turn to ontology development is significant; 15–50% of an individual curator's time might be spent on ontology development. In particular, curation of publications covering taxa that have not been previously annotated requires the addition of new taxonomy and anatomy terms and is thus more time-consuming. We anticipate that future curation of the fish literature will be more time-efficient because of our significant refinement and enlargement of the core ontologies (TAO, TTO, PATO). For new efforts in different taxonomic domains, once the respective ontologies have been populated, the curation of additional publications will require less time. In summary, term addition to shared ontologies broadened their scope and provided greater utility to others in the community.</p>
<p>A significant general challenge for newly established curation projects is the consistent annotation of phenotypes among curators. In our experience, curator consistency is influenced by familiarity with the tools, ontologies, and syntax for creation of phenotypes. Consistency improved as curators gained familiarity with the ontologies, acquired experience using curation software and tools, became more aware of the developing annotation standards, and were restricted in their term choices. Documentation of annotation problems, examples, and standards, updated almost daily in the Guide to Character Annotation
<xref ref-type="bibr" rid="pone.0010708-Phenoscape1">[69]</xref>
was important in promoting these annotation standards. High-level oversight of the process, together with manual and automated consistency checks by the lead curator before the data were made public, was critical to maintaining consistency and data quality.</p>
<p>The results of our curation experiments and subsequent work with individual curators pinpointed several general problems that required improvement in curation procedures and software tools to increase curator consistency and efficiency. Importantly, we discovered that curators had a difficult time navigating large ontologies to determine whether an appropriate term was present. This is almost certainly a general problem in annotation of phenotype descriptions and not specific to systematic biology literature. The absence of the correct ontology term led to inconsistent use of existing terms by curators or a time-consuming change of focus to the process of term addition and definition. The solutions that we suggest and have at least partially implemented are generally useful for other phenotype efforts, and they are described below.</p>
<p>To make large ontologies easier to navigate, we implemented software restrictions that reduce the number of terms available for use in annotations. For example, if only the skeleton is being annotated, then a ‘slim’ version of the TAO containing only terms from the skeletal system might be made available. Restricting which terms are available is particularly important in large community ontologies (e.g., the Gene Ontology) that contain many terms not applicable to the particular data under curation. We implemented, for instance, a ‘slim’ version of the Relations Ontology, with only a small subset of relations (such as
<italic>part_of, towards</italic>
, etc.) available for use by our curators. We found that it is critical for relations to be restricted for use in post-composition. Curators, for example, improperly used
<italic>left_of</italic>
rather than
<italic>in_left_side_of</italic>
and
<italic>contained_in</italic>
rather than
<italic>located_in</italic>
. In the future we feel it will be useful to further restrict availability of particular terms and relations, depending on the literature under curation, so that for example, the relation
<italic>connected_to</italic>
is the only one available for use when annotating the relation of scales to the body of fishes. Thus a curator would be forced to annotate
<italic>scale connected_to head</italic>
versus (incorrectly)
<italic>scale part_of head</italic>
. Such restrictions are expected to decrease the effort required by a curator to find an appropriate term or relation, and increase significantly the consistency of curation and thus the logical value of the annotations.</p>
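<p>A restriction of this kind can be sketched as a simple lookup. The sketch below is illustrative only and does not reproduce the Phenex implementation: the relation names are taken from the text, but the contents of the slim and the context-specific rule are assumptions.</p>
<preformat>
```python
# Hypothetical slim: the only relations offered to curators.
# Relation names come from the text; the selection itself is an assumption.
SLIM_RELATIONS = {
    "part_of",
    "towards",
    "in_left_side_of",
    "located_in",
    "connected_to",
}

# Hypothetical context-specific restriction for a pair of entity types.
CONTEXT_RELATIONS = {("scale", "head"): {"connected_to"}}


def allowed_relations(subject, target):
    """Relations a curator may choose when relating subject to target."""
    return CONTEXT_RELATIONS.get((subject, target), SLIM_RELATIONS)


def check(subject, relation, target):
    """Return True if the post-composed expression uses a permitted relation."""
    return relation in allowed_relations(subject, target)


print(check("scale", "connected_to", "head"))  # True
print(check("scale", "part_of", "head"))       # False
print(check("fin", "left_of", "body"))         # False: left_of is not in the slim
```
</preformat>
<p>In an annotation tool, the same lookup could be used to populate the list of relations shown to the curator, so that disallowed choices are never offered rather than rejected after the fact.</p>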
<p>The second part of the curation problem is that after a curator determined that a term was truly missing from an ontology (versus in some part of the ontology that they may not have browsed), they then needed to request that a new term be added to the ontology. This was a significant interruption in the curation process, as curators turned their attention from the phenotype description to composing a new term definition. To help expedite the term addition process for curators, we provided easy links to term trackers on the Phenoscape wiki
<xref ref-type="bibr" rid="pone.0010708-Phenoscape1">[69]</xref>
for different ontologies. We also encouraged curators to provide basic definitions that could later be improved upon by community feedback on the term request mailing list
<xref ref-type="bibr" rid="pone.0010708-Dahdul1">[15]</xref>
.</p>
<p>Missing terms also led to curatorial inconsistency. The curation experiments showed that curators frequently could not find the appropriate fine-scale quality term in PATO, mainly due to the incomplete development of this ontology (i.e., the term was missing). This led to individual curators choosing different, fine-scale terms that approximated the author's intent, but did not fully or adequately represent it. Rather than expedite the term addition process here, we used the Phenex software to restrict term choices primarily to higher-level attribute qualities (
<xref ref-type="fig" rid="pone-0010708-g003">Figure 3</xref>
). Consistency and efficiency improved as a result. Additionally, term restriction had an educational and training value in that our curators learned to abstract quickly the essence of the varying quality from complex descriptions. The negatives of this approach are that the PATO ontology did not grow at the leaf-node level, at least initially, as a result of our work, and that queries across qualities cannot be made at a fine scale. That is, querying for all ‘elongate’ jaws, for example, would return jaws of all shapes because all descriptions of jaw shape variation are annotated to the high level term
<italic>shape</italic>
. The cost of further annotation refinement must be judged against the intended use of the annotations.</p>
<p>Fully curating complex phenotype descriptions using EQ syntax remains a significant challenge. This challenge extends beyond systematic biology to all curation efforts (e.g., Human Phenotype Ontology;
<ext-link ext-link-type="uri" xlink:href="http://www.human-phenotype-ontology.org">http://www.human-phenotype-ontology.org</ext-link>
) that seek fine-scale representation of phenotypes, including those that compare human genetic phenotypes to those of model organisms. This is because full or fine-scale curation of complex phenotype descriptions (e.g., the antero-dorsally projecting process of the posterior maxilla is located posterior to the laterally projecting and bent knob of the ethmoid in species X) requires elaborate post-compositional combinations of ordered sets of terms from multiple ontologies. All efforts to represent complex phenotypes will require multiple ontologies and post-composition, and they will thus experience the same general problems. Although the Phenex software supports such compositions
<xref ref-type="bibr" rid="pone.0010708-Balhoff1">[20]</xref>
, and although sophisticated and biologically relevant reasoning across these compositions is feasible
<xref ref-type="bibr" rid="pone.0010708-Mungall2">[77]</xref>
, the curatorial burden of accurate and consistent annotation at this scale is high. Our work to reduce the burden of curation of these phenotypes by developing standards, restricting ontology terms available for annotation, and making it easier to add new terms, represents a significant step forward.</p>
</sec>
<sec id="s4d">
<title>Conclusions</title>
<p>The benefits of using an ontology in communication in any discipline include standardization of terminology, explicit definitions of concepts, logical relations among concepts, and the creation of structured and precise representations of information that facilitate computability. From a practical standpoint, communities benefit because communication is clearer and less ambiguous. Using multiple ontologies to describe more complex concepts such as phenotypes can promote similar benefits at a broader level and to a broader community, promote comparisons of phenotypes across studies and taxonomic groups, and allow interoperability with different data types. Currently the multiple ways that investigators describe their observations make it difficult to combine or compare data across studies, and render the observations vulnerable to misinterpretation. Many of the issues we encountered in curation (different terminologies, noncomparable attributes among character states) could be avoided prospectively if systematists were provided access to data collection tools that link to community anatomy, quality, and taxonomic ontologies. Use of ontologies for complex phenotype descriptions has the potential to clarify the identity of structures under consideration, allow comparison of similar phenotypes, and facilitate the application of characters across studies and taxonomic groups. Moreover, use of a mapping to EQ syntax during the course of a study can generally promote higher levels of standardization.</p>
<p>Curated data that a computer can understand and reason with facilitate the aggregation and comparison of data on a scale that is unmanageable for individual researchers. The expressiveness, creativity, and precise descriptions possible with natural language, however, are not easily replaced, despite the promise and advantages of computational methods. The inherent human ability to describe and interpret complex phenotypes will always be an essential element in biological fields that involve comparisons of the visible phenotype. This ability, however, is complemented by computational tools such as ontologies, which promote clarity and communication among researchers and interoperability of data.</p>
</sec>
</sec>
<sec sec-type="supplementary-material" id="s5">
<title>Supporting Information</title>
<supplementary-material content-type="local-data" id="pone.0010708.s001">
<label>Text S1</label>
<caption>
<p>Phenoscape Guide to Character Annotation. We describe our standard practices for annotation of entities and qualities in the Guide to Character Annotation. Our online version describes more specialized cases and issues (
<ext-link ext-link-type="uri" xlink:href="https://www.phenoscape.org/wiki/Guide_to_Character_Annotation">https://www.phenoscape.org/wiki/Guide_to_Character_Annotation</ext-link>
).</p>
<p>(0.09 MB DOC)</p>
</caption>
<media xlink:href="pone.0010708.s001.doc" mimetype="application" mime-subtype="msword">
</media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<p>We thank the many taxonomic experts and other contributors for their work on the development of the ontologies, their curation efforts, and their participation in Phenoscape workshops (
<ext-link ext-link-type="uri" xlink:href="http://kb.phenoscape.org/contributors/">http://kb.phenoscape.org/contributors/</ext-link>
). J. Blake, M. Haendel, S. Lewis, C. Mungall, M. Ringwald, and N. Washington provided useful advice in the course of this work. We also thank the two anonymous reviewers for their helpful comments on the manuscript.</p>
</ack>
<fn-group>
<fn fn-type="conflict">
<p>
<bold>Competing Interests: </bold>
The authors have declared that no competing interests exist.</p>
</fn>
<fn fn-type="financial-disclosure">
<p>
<bold>Funding: </bold>
The authors thank the National Science Foundation (DBI 0641025), the National Institutes of Health (HG002659), and the National Evolutionary Synthesis Center (NSF EF-0423641) for their support of this work. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</p>
</fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="pone.0010708-Grande1">
<label>1</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Grande</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Bemis</surname>
<given-names>W</given-names>
</name>
</person-group>
<year>1998</year>
<article-title>A comprehensive phylogenetic study of amiid fishes (Amiidae) based on comparative skeletal anatomy. An empirical search for interconnected patterns of natural history.</article-title>
<source>Society of Vertebrate Paleontology Memoir 4: supplement to Journal of Vertebrate Paleontology</source>
<fpage>1</fpage>
<lpage>690</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Mouse1">
<label>2</label>
<mixed-citation publication-type="other">Mouse Genome Informatics. Available:
<ext-link ext-link-type="uri" xlink:href="http://www.informatics.jax.org/">http://www.informatics.jax.org/</ext-link>
</mixed-citation>
</ref>
<ref id="pone.0010708-Online1">
<label>3</label>
<mixed-citation publication-type="other">Online Mendelian Inheritance in Man. Available:
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/omim/">http://www.ncbi.nlm.nih.gov/omim/</ext-link>
</mixed-citation>
</ref>
<ref id="pone.0010708-Mayden1">
<label>4</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mayden</surname>
<given-names>RL</given-names>
</name>
</person-group>
<year>1989</year>
<article-title>Phylogenetic studies of North American minnows, with emphasis on the genus
<italic>Cyprinella</italic>
(Teleostei: Cypriniformes).</article-title>
<source>University of Kansas Museum of Natural History Miscellaneous Publications</source>
<volume>80</volume>
<fpage>1</fpage>
<lpage>189</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Royero1">
<label>5</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Royero</surname>
<given-names>R</given-names>
</name>
</person-group>
<year>1999</year>
<article-title>Studies on the systematics and phylogeny of the catfish family Auchenipteridae (Teleostei: Siluriformes).</article-title>
<publisher-loc>Bristol</publisher-loc>
<publisher-name>University of Bristol</publisher-name>
</mixed-citation>
</ref>
<ref id="pone.0010708-Kailola1">
<label>6</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kailola</surname>
<given-names>PJ</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>A phylogenetic exploration of the catfish family Ariidae (Otophysi; Siluriformes).</article-title>
<source>The Beagle, Records of the Museums and Art Galleries of the Northern Territory</source>
<volume>20</volume>
<fpage>87</fpage>
<lpage>166</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Smith1">
<label>7</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Smith</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Ashburner</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Rosse</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Bard</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Bug</surname>
<given-names>W</given-names>
</name>
<etal></etal>
</person-group>
<year>2007</year>
<article-title>The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration.</article-title>
<source>Nature Biotechnology</source>
<volume>25</volume>
<fpage>1251</fpage>
<lpage>1255</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Gruber1">
<label>8</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gruber</surname>
<given-names>T</given-names>
</name>
</person-group>
<year>1995</year>
<article-title>Toward principles for the design of ontologies used for knowledge sharing.</article-title>
<source>International Journal of Human-Computer Studies</source>
<volume>43</volume>
<fpage>907</fpage>
<lpage>928</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Gkoutos1">
<label>9</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gkoutos</surname>
<given-names>GV</given-names>
</name>
<name>
<surname>Green</surname>
<given-names>EC</given-names>
</name>
<name>
<surname>Mallon</surname>
<given-names>AM</given-names>
</name>
<name>
<surname>Hancock</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Davidson</surname>
<given-names>D</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Using ontologies to describe mouse phenotypes.</article-title>
<source>Genome Biology</source>
<volume>6</volume>
<fpage>R8</fpage>
<pub-id pub-id-type="pmid">15642100</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0010708-Sprague1">
<label>10</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sprague</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Bayraktaroglu</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Bradford</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Conlin</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Dunn</surname>
<given-names>N</given-names>
</name>
<etal></etal>
</person-group>
<year>2008</year>
<article-title>The Zebrafish Information Network: the zebrafish model organism database provides expanded support for genotypes and phenotypes.</article-title>
<source>Nucleic Acids Research</source>
<volume>36</volume>
<fpage>D768</fpage>
<lpage>D772</lpage>
<pub-id pub-id-type="pmid">17991680</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0010708-Mungall1">
<label>11</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Mungall</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Gkoutos</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Washington</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Lewis</surname>
<given-names>S</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Representing phenotypes in OWL.</article-title>
<source>OWL: Experiences and Directions (OWLED 2007)</source>
<publisher-loc>Innsbruck, Austria</publisher-loc>
</mixed-citation>
</ref>
<ref id="pone.0010708-Washington1">
<label>12</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Washington</surname>
<given-names>NL</given-names>
</name>
<name>
<surname>Haendel</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Mungall</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Ashburner</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Westerfield</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<year>2009</year>
<article-title>Linking human diseases to animal models using ontology-based phenotype annotation.</article-title>
<source>PLoS Biology</source>
<volume>7</volume>
<fpage>1</fpage>
<lpage>20</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Mik1">
<label>13</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mikó</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Deans</surname>
<given-names>AR</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>
<italic>Masner</italic>
, a new genus of Ceraphronidae (Hymenoptera, Ceraphronoidea) described using controlled vocabularies.</article-title>
<source>ZooKeys</source>
<volume>20</volume>
<fpage>127</fpage>
<lpage>153</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Sereno1">
<label>14</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sereno</surname>
<given-names>PC</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Logical basis for morphological characters in phylogenetics.</article-title>
<source>Cladistics</source>
<volume>23</volume>
<fpage>565</fpage>
<lpage>587</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Dahdul1">
<label>15</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Dahdul</surname>
<given-names>WM</given-names>
</name>
<name>
<surname>Lundberg</surname>
<given-names>JG</given-names>
</name>
<name>
<surname>Midford</surname>
<given-names>PE</given-names>
</name>
<name>
<surname>Balhoff</surname>
<given-names>JP</given-names>
</name>
<name>
<surname>Lapp</surname>
<given-names>H</given-names>
</name>
<etal></etal>
</person-group>
<year>2010</year>
<article-title>The Teleost Anatomy Ontology: Anatomical representation for the genomics age.</article-title>
<publisher-name>Systematic Biology Advance Access published online on March 29, 2010, doi:10.1093/sysbio/syq013</publisher-name>
</mixed-citation>
</ref>
<ref id="pone.0010708-Mabee1">
<label>16</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mabee</surname>
<given-names>PM</given-names>
</name>
<name>
<surname>Ashburner</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Cronk</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Gkoutos</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Haendel</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<year>2007</year>
<article-title>Phenotype ontologies: the bridge between genomics and evolution.</article-title>
<source>Trends in Ecology &amp; Evolution</source>
<volume>22</volume>
<fpage>345</fpage>
<lpage>350</lpage>
<pub-id pub-id-type="pmid">17416439</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0010708-Mabee2">
<label>17</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mabee</surname>
<given-names>PM</given-names>
</name>
<name>
<surname>Arratia</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Coburn</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Haendel</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hilton</surname>
<given-names>EJ</given-names>
</name>
<etal></etal>
</person-group>
<year>2007</year>
<article-title>Connecting evolutionary morphology to genomics using ontologies: A case study from Cypriniformes including zebrafish.</article-title>
<source>Journal of Experimental Zoology Part B: Molecular and Developmental Evolution</source>
<volume>308B</volume>
<fpage>655</fpage>
<lpage>668</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Muller1">
<label>18</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Muller</surname>
<given-names>HM</given-names>
</name>
<name>
<surname>Kenny</surname>
<given-names>EE</given-names>
</name>
<name>
<surname>Sternberg</surname>
<given-names>PW</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Textpresso: An ontology-based information retrieval and extraction system for biological literature.</article-title>
<source>PLoS Biology</source>
<volume>2</volume>
<fpage>1984</fpage>
<lpage>1998</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Lewis1">
<label>19</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lewis</surname>
<given-names>SE</given-names>
</name>
<name>
<surname>Searle</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Harris</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Gibson</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Iyer</surname>
<given-names>V</given-names>
</name>
<etal></etal>
</person-group>
<year>2002</year>
<article-title>Apollo: a sequence annotation editor.</article-title>
<source>Genome Biology</source>
<volume>3</volume>
<fpage>RESEARCH0082</fpage>
<pub-id pub-id-type="pmid">12537571</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0010708-Balhoff1">
<label>20</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Balhoff</surname>
<given-names>JP</given-names>
</name>
<name>
<surname>Dahdul</surname>
<given-names>WM</given-names>
</name>
<name>
<surname>Kothari</surname>
<given-names>CR</given-names>
</name>
<name>
<surname>Lapp</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Lundberg</surname>
<given-names>JG</given-names>
</name>
<etal></etal>
</person-group>
<year>2010</year>
<article-title>Phenex: Ontological annotation of phenotypic diversity.</article-title>
<publisher-name>PLoS ONE in press</publisher-name>
</mixed-citation>
</ref>
<ref id="pone.0010708-Fink1">
<label>21</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fink</surname>
<given-names>SV</given-names>
</name>
<name>
<surname>Fink</surname>
<given-names>WL</given-names>
</name>
</person-group>
<year>1981</year>
<article-title>Interrelationships of the Ostariophysan fishes (Teleostei).</article-title>
<source>Zoological Journal of the Linnean Society</source>
<volume>72</volume>
<fpage>297</fpage>
<lpage>353</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Open1">
<label>22</label>
<mixed-citation publication-type="other">Open Biological and Biomedical Ontologies Foundry. Available:
<ext-link ext-link-type="uri" xlink:href="http://www.obofoundry.org">http://www.obofoundry.org</ext-link>
</mixed-citation>
</ref>
<ref id="pone.0010708-Available1">
<label>23</label>
<mixed-citation publication-type="other">Available:
<ext-link ext-link-type="uri" xlink:href="http://phenoscape.svn.sourceforge.net/viewvc/phenoscape/trunk/vocab/fish_collection_abbreviation.obo">http://phenoscape.svn.sourceforge.net/viewvc/phenoscape/trunk/vocab/fish_collection_abbreviation.obo</ext-link>
</mixed-citation>
</ref>
<ref id="pone.0010708-Eschmeyer1">
<label>24</label>
<mixed-citation publication-type="other">
<person-group person-group-type="editor">
<name>
<surname>Eschmeyer</surname>
<given-names>WN</given-names>
</name>
</person-group>
<year>2010</year>
<comment>Catalog of Fishes electronic version (19 February 2010). Available:
<ext-link ext-link-type="uri" xlink:href="http://research.calacademy.org/ichthyology/catalog/fishcatmain.asp">http://research.calacademy.org/ichthyology/catalog/fishcatmain.asp</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0010708-Available2">
<label>25</label>
<mixed-citation publication-type="other">Available:
<ext-link ext-link-type="uri" xlink:href="http://www.nexml.org">http://www.nexml.org</ext-link>
</mixed-citation>
</ref>
<ref id="pone.0010708-Albert1">
<label>26</label>
<mixed-citation publication-type="other">
<person-group person-group-type="author">
<name>
<surname>Albert</surname>
<given-names>JS</given-names>
</name>
</person-group>
<year>2001</year>
<article-title>Species Diversity and Phylogenetic systematics of American Knifefishes (Gymnotiformes, Teleostei).</article-title>
<comment>Miscellaneous Publications, Museum of Zoology, University of Michigan no. 190: 140 pages</comment>
</mixed-citation>
</ref>
<ref id="pone.0010708-Armbruster1">
<label>27</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Armbruster</surname>
<given-names>JW</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Phylogenetic relationships of the suckermouth armoured catfishes (Loricariidae) with emphasis on the Hypostominae and the Ancistrinae.</article-title>
<source>Zoological Journal of the Linnean Society</source>
<volume>141</volume>
<fpage>1</fpage>
<lpage>80</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Bornbusch1">
<label>28</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bornbusch</surname>
<given-names>AH</given-names>
</name>
</person-group>
<year>1995</year>
<article-title>Phylogenetic relationships within the Eurasian catfish family Siluridae (Pisces: Siluriformes), with comments on generic validities and biogeography.</article-title>
<source>Zoological Journal of the Linnean Society</source>
<volume>115</volume>
<fpage>1</fpage>
<lpage>46</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Britto1">
<label>29</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Britto</surname>
<given-names>MR</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>Phylogeny of the subfamily Corydoradinae Hoedeman, 1952 (Siluriformes: Callichthyidae), with a definition of its genera.</article-title>
<source>Proceedings of the Academy of Natural Sciences, Philadelphia</source>
<volume>153</volume>
<fpage>119</fpage>
<lpage>154</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Chang1">
<label>30</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chang</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Maisey</surname>
<given-names>JG</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>Redescription of
<italic>Ellimma branneri</italic>
and
<italic>Diplomystus shengliensis</italic>
, and relationships of some basal Clupeomorphs.</article-title>
<source>American Museum Novitates</source>
<volume>3404</volume>
<fpage>1</fpage>
<lpage>35</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-dePinna1">
<label>31</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>de Pinna</surname>
<given-names>MCC</given-names>
</name>
</person-group>
<year>1996</year>
<article-title>A phylogenetic analysis of the Asian catfish families Sisoridae, Akysidae, and Amblycipitidae, with a hypothesis on the relationships of the neotropical Aspredinidae (Teleostei, Ostariophysi).</article-title>
<source>Fieldiana: Zoology (New Series)</source>
<volume>84</volume>
<fpage>i–iv + 1–83</fpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-dePinna2">
<label>32</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>de Pinna</surname>
<given-names>MCC</given-names>
</name>
<name>
<surname>Ferraris</surname>
<given-names>CJJ</given-names>
</name>
<name>
<surname>Vari</surname>
<given-names>RP</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>A phylogenetic study of the neotropical catfish family Cetopsidae (Osteichthyes, Ostariophysi, Siluriformes), with a new classification.</article-title>
<source>Zoological Journal of the Linnean Society</source>
<volume>150</volume>
<fpage>755</fpage>
<lpage>813</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Fang1">
<label>33</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fang</surname>
<given-names>F</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>Phylogenetic analysis of the Asian Cyprinid genus
<italic>Danio</italic>
(Teleostei, Cyprinidae).</article-title>
<source>Copeia</source>
<volume>2003</volume>
<fpage>714</fpage>
<lpage>728</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Grande2">
<label>34</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Grande</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Laten</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Lopez</surname>
<given-names>JA</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Phylogenetic relationships of extant esocid species (Teleostei: Salmoniformes) based on morphological and molecular characters.</article-title>
<source>Copeia</source>
<fpage>743</fpage>
<lpage>757</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Grande3">
<label>35</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Grande</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Poyato-Ariza</surname>
<given-names>FJ</given-names>
</name>
</person-group>
<year>1999</year>
<article-title>Phylogenetic relationships of fossil and Recent gonorynchiform fishes (Teleostei: Ostariophysi).</article-title>
<source>Zoological Journal of the Linnean Society</source>
<volume>125</volume>
<fpage>197</fpage>
<lpage>238</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Johnson1">
<label>36</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Johnson</surname>
<given-names>GD</given-names>
</name>
<name>
<surname>Patterson</surname>
<given-names>C</given-names>
</name>
</person-group>
<year>1993</year>
<article-title>Percomorph phylogeny: a survey of acanthomorphs and a new proposal.</article-title>
<source>Bulletin of Marine Science</source>
<volume>52</volume>
<fpage>554</fpage>
<lpage>626</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-PoyatoAriza1">
<label>37</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Poyato-Ariza</surname>
<given-names>FJ</given-names>
</name>
</person-group>
<year>1996</year>
<article-title>A revision of the ostariophysan fish family Chanidae, with special reference to the Mesozoic forms.</article-title>
<source>Palaeo Ichthyologica</source>
<volume>6</volume>
<fpage>1</fpage>
<lpage>52</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Sanger1">
<label>38</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sanger</surname>
<given-names>TJ</given-names>
</name>
<name>
<surname>McCune</surname>
<given-names>AR</given-names>
</name>
</person-group>
<year>2002</year>
<article-title>Comparative osteology of the
<italic>Danio</italic>
(Cyprinidae: Ostariophysi) axial skeleton with comments on
<italic>Danio</italic>
relationships based on molecules and morphology.</article-title>
<source>Zoological Journal of the Linnean Society</source>
<volume>135</volume>
<fpage>529</fpage>
<lpage>546</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Sawada1">
<label>39</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sawada</surname>
<given-names>Y</given-names>
</name>
</person-group>
<year>1982</year>
<article-title>Phylogeny and zoogeography of superfamily Cobitoidea (Cyprinoidei, Cypriniformes).</article-title>
<source>Mem Fac Fish Hokkaido Univ</source>
<volume>28</volume>
<fpage>65</fpage>
<lpage>223</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Schaefer1">
<label>40</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schaefer</surname>
<given-names>SA</given-names>
</name>
</person-group>
<year>1987</year>
<article-title>Osteology of
<italic>Hypostomus plecostomus</italic>
(Linnaeus), with a phylogenetic analysis of the loricariid subfamilies (Pisces: Siluriformes).</article-title>
<source>Contributions in Science, Los Angeles County Museum</source>
<fpage>1</fpage>
<lpage>31</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Schaefer2">
<label>41</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schaefer</surname>
<given-names>SA</given-names>
</name>
</person-group>
<year>1991</year>
<article-title>Phylogenetic analysis of the loricariid subfamily Hypoptopomatinae (Pisces: Siluroidei: Loricariidae), with comments on generic diagnoses and geographic distribution.</article-title>
<source>Zoological Journal of the Linnean Society</source>
<volume>102</volume>
<fpage>1</fpage>
<lpage>41</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Sidlauskas1">
<label>42</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sidlauskas</surname>
<given-names>BL</given-names>
</name>
<name>
<surname>Vari</surname>
<given-names>RP</given-names>
</name>
</person-group>
<year>2008</year>
<article-title>Phylogenetic relationships within the South American fish family Anostomidae (Teleostei, Ostariophysi, Characiformes).</article-title>
<source>Zoological Journal of the Linnean Society</source>
<volume>154</volume>
<fpage>70</fpage>
<lpage>210</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-ToledoPiza1">
<label>43</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Toledo-Piza</surname>
<given-names>M</given-names>
</name>
</person-group>
<year>2000</year>
<article-title>The Neotropical fish subfamily Cynodontinae (Teleostei: Ostariophysi: Characiformes): A phylogenetic study and revision of
<italic>Cynodon</italic>
and
<italic>Rhaphiodon</italic>
.</article-title>
<source>American Museum Novitates</source>
<volume>3286</volume>
<fpage>1</fpage>
<lpage>88</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-ToledoPiza2">
<label>44</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Toledo-Piza</surname>
<given-names>M</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Phylogenetic relationships among
<italic>Acestrorhynchus</italic>
species (Ostariophysi: Characiformes: Acestrorhynchidae).</article-title>
<source>Zoological Journal of the Linnean Society</source>
<volume>151</volume>
<fpage>691</fpage>
<lpage>757</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Vari1">
<label>45</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vari</surname>
<given-names>RP</given-names>
</name>
</person-group>
<year>1995</year>
<article-title>The Neotropical Fish Family Ctenoluciidae (Teleostei: Ostariophysi: Characiformes): Supra and Intrafamilial Phylogenetic Relationships, with a Revisionary Study.</article-title>
<source>Smithsonian Contributions to Zoology</source>
<volume>564</volume>
<fpage>1</fpage>
<lpage>97</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Vari2">
<label>46</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vari</surname>
<given-names>RP</given-names>
</name>
<name>
<surname>Harold</surname>
<given-names>AS</given-names>
</name>
</person-group>
<year>2001</year>
<article-title>Phylogenetic study of the neotropical fish genera
<italic>Creagrutus</italic>
Günther and
<italic>Piabina</italic>
Reinhardt (Teleostei: Ostariophysi: Characiformes), with a revision of the cis-Andean species.</article-title>
<source>Smithsonian Contributions to Zoology</source>
<volume>613</volume>
<fpage>v+1-239</fpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Vigliotta1">
<label>47</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vigliotta</surname>
<given-names>TR</given-names>
</name>
</person-group>
<year>2008</year>
<article-title>A phylogenetic study of the African catfish family Mochokidae (Osteichthyes, Ostariophysi, Siluriformes), with a key to its genera.</article-title>
<source>Proceedings of the Academy of Natural Sciences, Philadelphia</source>
<volume>157</volume>
<fpage>73</fpage>
<lpage>136</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Zanata1">
<label>48</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zanata</surname>
<given-names>AM</given-names>
</name>
<name>
<surname>Vari</surname>
<given-names>RP</given-names>
</name>
</person-group>
<year>2005</year>
<article-title>The family Alestidae (Ostariophysi, Characiformes): a phylogenetic analysis of a trans-Atlantic clade.</article-title>
<source>Zoological Journal of the Linnean Society</source>
<volume>145</volume>
<fpage>1</fpage>
<lpage>144</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Arratia1">
<label>49</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Arratia</surname>
<given-names>G</given-names>
</name>
</person-group>
<year>1999</year>
<article-title>The monophyly of Teleostei and stem-group teleosts. Consensus and disagreements.</article-title>
<person-group person-group-type="editor">
<name>
<surname>Arratia</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Schultze</surname>
<given-names>H-P</given-names>
</name>
</person-group>
<source>Mesozoic Fishes 2 - Systematics and Fossil Record</source>
<publisher-loc>München</publisher-loc>
<publisher-name>Verlag Dr. F. Pfeil</publisher-name>
<fpage>265</fpage>
<lpage>334</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Buckup1">
<label>50</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Buckup</surname>
<given-names>PA</given-names>
</name>
</person-group>
<year>1998</year>
<article-title>Relationships of the Characidiinae and phylogeny of characiform fishes (Teleostei: Ostariophysi).</article-title>
<person-group person-group-type="editor">
<name>
<surname>Malabarba</surname>
<given-names>LR</given-names>
</name>
<name>
<surname>Reis</surname>
<given-names>RE</given-names>
</name>
<name>
<surname>Vari</surname>
<given-names>RP</given-names>
</name>
<name>
<surname>Lucena</surname>
<given-names>ZMS</given-names>
</name>
<name>
<surname>Lucena</surname>
<given-names>CAS</given-names>
</name>
</person-group>
<source>Phylogeny and Classification of Neotropical Fishes</source>
<publisher-loc>Porto Alegre, Brazil</publisher-loc>
<publisher-name>EDIPUCRS</publisher-name>
<fpage>123</fpage>
<lpage>144</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Cavender1">
<label>51</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Cavender</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Coburn</surname>
<given-names>MM</given-names>
</name>
</person-group>
<year>1992</year>
<article-title>Phylogenetic relationships of North American Cyprinidae.</article-title>
<person-group person-group-type="editor">
<name>
<surname>Mayden</surname>
<given-names>RL</given-names>
</name>
</person-group>
<source>Systematics, Historical Ecology, and North American Freshwater Fishes</source>
<publisher-loc>Stanford</publisher-loc>
<publisher-name>Stanford University Press</publisher-name>
<fpage>293</fpage>
<lpage>327</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Coburn1">
<label>52</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Coburn</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Cavender</surname>
<given-names>TM</given-names>
</name>
</person-group>
<year>1992</year>
<article-title>Interrelationships of North American cyprinid fishes.</article-title>
<person-group person-group-type="editor">
<name>
<surname>Mayden</surname>
<given-names>RL</given-names>
</name>
</person-group>
<source>Systematics, Historical Ecology, and North American Freshwater Fishes</source>
<publisher-loc>Stanford, California</publisher-loc>
<publisher-name>Stanford University Press</publisher-name>
</mixed-citation>
</ref>
<ref id="pone.0010708-Grande4">
<label>53</label>
<mixed-citation publication-type="book">
<person-group person-group-type="editor">
<name>
<surname>Grande</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Grande</surname>
<given-names>L</given-names>
</name>
</person-group>
<year>2008</year>
<article-title>Reevaluation of the gonorynchiform genera
<italic>Ramallichthys</italic>
,
<italic>Judeichthys</italic>
and
<italic>Notogoneus</italic>
, with comments on the families Charitosomidae and Gonorynchidae.</article-title>
<publisher-loc>München, Germany</publisher-loc>
<publisher-name>Verlag Dr. Friedrich Pfeil</publisher-name>
<fpage>295</fpage>
<lpage>310</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Lucena1">
<label>54</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Lucena</surname>
<given-names>CAS</given-names>
</name>
<name>
<surname>Menezes</surname>
<given-names>NA</given-names>
</name>
</person-group>
<year>1998</year>
<article-title>A phylogenetic analysis of
<italic>Roestes</italic>
Günther and
<italic>Gilbertolus</italic>
Eigenmann with a hypothesis on the relationships of the Cynodontidae and Acestrorhynchidae (Teleostei, Ostariophysi, Characiformes).</article-title>
<person-group person-group-type="editor">
<name>
<surname>Malabarba</surname>
<given-names>LR</given-names>
</name>
<name>
<surname>Reis</surname>
<given-names>RE</given-names>
</name>
<name>
<surname>Vari</surname>
<given-names>RP</given-names>
</name>
<name>
<surname>Lucena</surname>
<given-names>ZMS</given-names>
</name>
<name>
<surname>Lucena</surname>
<given-names>CAS</given-names>
</name>
</person-group>
<source>Phylogeny and Classification of Neotropical Fishes</source>
<publisher-loc>Porto Alegre, Brazil</publisher-loc>
<publisher-name>EDIPUCRS</publisher-name>
<fpage>261</fpage>
<lpage>278</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Lundberg1">
<label>55</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Lundberg</surname>
<given-names>JG</given-names>
</name>
</person-group>
<year>1992</year>
<article-title>The phylogeny of ictalurid catfishes: A synthesis of recent work.</article-title>
<person-group person-group-type="editor">
<name>
<surname>Mayden</surname>
<given-names>RL</given-names>
</name>
</person-group>
<source>Systematics, Historical Ecology, and North American Freshwater Fishes</source>
<publisher-loc>Stanford</publisher-loc>
<publisher-name>Stanford University Press</publisher-name>
<fpage>392</fpage>
<lpage>420</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Malabarba1">
<label>56</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Malabarba</surname>
<given-names>LR</given-names>
</name>
</person-group>
<year>1998</year>
<article-title>Monophyly of the Cheirodontinae, characters and major clades (Ostariophysi: Characidae).</article-title>
<person-group person-group-type="editor">
<name>
<surname>Malabarba</surname>
<given-names>LR</given-names>
</name>
<name>
<surname>Reis</surname>
<given-names>RE</given-names>
</name>
<name>
<surname>Vari</surname>
<given-names>RP</given-names>
</name>
<name>
<surname>Lucena</surname>
<given-names>ZMS</given-names>
</name>
<name>
<surname>Lucena</surname>
<given-names>CAS</given-names>
</name>
</person-group>
<source>Phylogeny and Classification of Neotropical Fishes</source>
<publisher-loc>Porto Alegre, Brazil</publisher-loc>
<publisher-name>EDIPUCRS</publisher-name>
<fpage>193</fpage>
<lpage>234</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Smith2">
<label>57</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Smith</surname>
<given-names>GR</given-names>
</name>
</person-group>
<year>1992</year>
<article-title>Phylogeny and biogeography of the Catostomidae, freshwater fishes of North America and Asia.</article-title>
<person-group person-group-type="editor">
<name>
<surname>Mayden</surname>
<given-names>RL</given-names>
</name>
</person-group>
<source>Systematics, Historical Ecology, and North American Freshwater Fishes</source>
<publisher-loc>Stanford, CA</publisher-loc>
<publisher-name>Stanford University Press</publisher-name>
<fpage>778</fpage>
<lpage>826</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-SoaresPorto1">
<label>58</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Soares-Porto</surname>
<given-names>LM</given-names>
</name>
</person-group>
<year>1998</year>
<article-title>Monophyly and interrelationships of the Centromochlinae (Siluriformes: Auchenipteridae).</article-title>
<person-group person-group-type="editor">
<name>
<surname>Malabarba</surname>
<given-names>LR</given-names>
</name>
<name>
<surname>Reis</surname>
<given-names>RE</given-names>
</name>
<name>
<surname>Vari</surname>
<given-names>RP</given-names>
</name>
<name>
<surname>Lucena</surname>
<given-names>ZMS</given-names>
</name>
<name>
<surname>Lucena</surname>
<given-names>CAS</given-names>
</name>
</person-group>
<source>Phylogeny and Classification of Neotropical Fishes</source>
<publisher-loc>Porto Alegre</publisher-loc>
<publisher-name>EDIPUCRS</publisher-name>
<fpage>331</fpage>
<lpage>350</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Weitzman1">
<label>59</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Weitzman</surname>
<given-names>SH</given-names>
</name>
<name>
<surname>Menezes</surname>
<given-names>NA</given-names>
</name>
</person-group>
<year>1998</year>
<article-title>Relationships of the tribes and genera of the Glandulocaudinae (Ostariophysi: Characiformes: Characidae), with a description of a new genus.</article-title>
<person-group person-group-type="editor">
<name>
<surname>Malabarba</surname>
<given-names>LR</given-names>
</name>
<name>
<surname>Reis</surname>
<given-names>RE</given-names>
</name>
<name>
<surname>Vari</surname>
<given-names>RP</given-names>
</name>
<name>
<surname>Lucena</surname>
<given-names>ZMS</given-names>
</name>
<name>
<surname>Lucena</surname>
<given-names>CAS</given-names>
</name>
</person-group>
<source>Phylogeny and Classification of Neotropical Fishes</source>
<publisher-loc>Porto Alegre</publisher-loc>
<publisher-name>EDIPUCRS</publisher-name>
<fpage>171</fpage>
<lpage>192</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-ZaraguetaBagils1">
<label>60</label>
<mixed-citation publication-type="other">
<person-group person-group-type="author">
<name>
<surname>Zaragueta Bagils</surname>
<given-names>R</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Basal clupeomorphs and ellimmichthyiform phylogeny.</article-title>
<person-group person-group-type="editor">
<name>
<surname>Arratia</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Tintori</surname>
<given-names>A</given-names>
</name>
</person-group>
<fpage>391</fpage>
<lpage>404</lpage>
<comment>Mesozoic Fishes 3 - Systematics, Paleoenvironments and Biodiversity Munich: Verlag Dr. F. Pfeil</comment>
</mixed-citation>
</ref>
<ref id="pone.0010708-Bockmann1">
<label>61</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Bockmann</surname>
<given-names>FA</given-names>
</name>
</person-group>
<year>1998</year>
<article-title>Análise Filogenética da Família Heptapteridae (Teleostei, Ostariophysi, Siluriformes) e Redefinição de seus Gêneros.</article-title>
<publisher-loc>São Paulo</publisher-loc>
<publisher-name>Universidade de São Paulo</publisher-name>
</mixed-citation>
</ref>
<ref id="pone.0010708-Chen1">
<label>62</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>X-P</given-names>
</name>
</person-group>
<year>1994</year>
<article-title>Phylogenetic Studies of the Amblycipitid Catfishes (Teleostei, Siluriformes) with Species Accounts.</article-title>
<publisher-loc>Durham</publisher-loc>
<publisher-name>Duke University</publisher-name>
</mixed-citation>
</ref>
<ref id="pone.0010708-dePinna3">
<label>63</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>de Pinna</surname>
<given-names>MCC</given-names>
</name>
</person-group>
<year>1993</year>
<article-title>Higher-level Phylogeny of Siluriformes (Teleostei: Ostariophysi), with a New Classification of the Order.</article-title>
<publisher-loc>New York</publisher-loc>
<publisher-name>City University of New York</publisher-name>
</mixed-citation>
</ref>
<ref id="pone.0010708-Friel1">
<label>64</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Friel</surname>
<given-names>JP</given-names>
</name>
</person-group>
<year>1994</year>
<article-title>A Phylogenetic Study of the Neotropical Banjo Catfishes (Teleostei: Siluriformes: Aspredinidae).</article-title>
<publisher-loc>Durham</publisher-loc>
<publisher-name>Duke University</publisher-name>
</mixed-citation>
</ref>
<ref id="pone.0010708-Mo1">
<label>65</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Mo</surname>
<given-names>T</given-names>
</name>
</person-group>
<year>1991</year>
<article-title>Anatomy, Relationships and Systematics of the Bagridae (Teleostei: Siluroidei) with a Hypothesis of Siluroid Phylogeny.</article-title>
<publisher-loc>Koenigstein</publisher-loc>
<publisher-name>Koeltz</publisher-name>
</mixed-citation>
</ref>
<ref id="pone.0010708-Shibatta1">
<label>66</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Shibatta</surname>
<given-names>OA</given-names>
</name>
</person-group>
<year>1998</year>
<article-title>Sistemática e Evolução da Família Pseudopimelodidae (Ostariophysi, Siluriformes), com a Revisão Taxonômica do Gênero Pseudopimelodus.</article-title>
<publisher-loc>São Paulo</publisher-loc>
<publisher-name>Universidade Federal de São Carlos</publisher-name>
</mixed-citation>
</ref>
<ref id="pone.0010708-Siebert1">
<label>67</label>
<mixed-citation publication-type="other">
<person-group person-group-type="author">
<name>
<surname>Siebert</surname>
<given-names>DH</given-names>
</name>
</person-group>
<year>1987</year>
<source>Interrelationships among families of the order Cypriniformes (Teleostei) [Ph.D.]: City University of New York, New York, New York</source>
</mixed-citation>
</ref>
<ref id="pone.0010708-DiDario1">
<label>68</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Di Dario</surname>
<given-names>F</given-names>
</name>
</person-group>
<year>1999</year>
<article-title>Filogenia de Pristigasteroidea (Teleostei, Clupeomorpha).</article-title>
<publisher-loc>São Paulo</publisher-loc>
<publisher-name>Universidade de São Paulo</publisher-name>
<size units="page"></size>
</mixed-citation>
</ref>
<ref id="pone.0010708-Phenoscape1">
<label>69</label>
<mixed-citation publication-type="other">Phenoscape Guide to Character Annotation. Available:
<ext-link ext-link-type="uri" xlink:href="https://www.phenoscape.org/wiki/Guide_to_Character_Annotation">https://www.phenoscape.org/wiki/Guide_to_Character_Annotation</ext-link>
</mixed-citation>
</ref>
<ref id="pone.0010708-1">
<label>70</label>
<mixed-citation publication-type="book">
<article-title>Phenoscape curators mailing list.</article-title>
<publisher-loc>Available</publisher-loc>
<publisher-name>sourceforge.net/mailarchive/forum.php?forum_name=phenoscape-curators</publisher-name>
</mixed-citation>
</ref>
<ref id="pone.0010708-obodiscuss1">
<label>71</label>
<mixed-citation publication-type="other">obo-discuss mailing list archives. Available:
<ext-link ext-link-type="uri" xlink:href="http://sourceforge.net/mailarchive/forum.php?forum_name=obo-discuss">http://sourceforge.net/mailarchive/forum.php?forum_name=obo-discuss</ext-link>
</mixed-citation>
</ref>
<ref id="pone.0010708-obophenotype1">
<label>72</label>
<mixed-citation publication-type="other">obo-phenotype mailing list archives. Available:
<ext-link ext-link-type="uri" xlink:href="http://sourceforge.net/mailarchive/forum.php?forum_name=obo-phenotype">http://sourceforge.net/mailarchive/forum.php?forum_name=obo-phenotype</ext-link>
</mixed-citation>
</ref>
<ref id="pone.0010708-Maddison1">
<label>73</label>
<mixed-citation publication-type="other">
<person-group person-group-type="author">
<name>
<surname>Maddison</surname>
<given-names>WP</given-names>
</name>
<name>
<surname>Maddison</surname>
<given-names>DR</given-names>
</name>
</person-group>
<year>2009</year>
<comment>Mesquite: a modular system for evolutionary analysis. Version 2.6
<ext-link ext-link-type="uri" xlink:href="http://mesquiteproject.org">http://mesquiteproject.org</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0010708-Teleost1">
<label>74</label>
<mixed-citation publication-type="other">Teleost Taxonomy Ontology Term Request Tracker. Available:
<ext-link ext-link-type="uri" xlink:href="http://sourceforge.net/tracker/?atid=1046550&amp;group_id=76834&amp;func=browse">http://sourceforge.net/tracker/?atid=1046550&amp;group_id=76834&amp;func=browse</ext-link>
</mixed-citation>
</ref>
<ref id="pone.0010708-Teleost2">
<label>75</label>
<mixed-citation publication-type="other">Teleost Anatomy Ontology Term Request Tracker. Available:
<ext-link ext-link-type="uri" xlink:href="http://sourceforge.net/tracker/?group_id=76834&amp;atid=994764">http://sourceforge.net/tracker/?group_id=76834&amp;atid=994764</ext-link>
</mixed-citation>
</ref>
<ref id="pone.0010708-Teleostdiscuss1">
<label>76</label>
<mixed-citation publication-type="other">Teleost-discuss mailing list archives. Available:
<ext-link ext-link-type="uri" xlink:href="http://sourceforge.net/mailarchive/forum.php?forum_name=obo-teleost-discuss">http://sourceforge.net/mailarchive/forum.php?forum_name=obo-teleost-discuss</ext-link>
</mixed-citation>
</ref>
<ref id="pone.0010708-Mungall2">
<label>77</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mungall</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Gkoutos</surname>
<given-names>GV</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>CL</given-names>
</name>
<name>
<surname>Haendel</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Lewis</surname>
<given-names>SE</given-names>
</name>
<etal></etal>
</person-group>
<year>2010</year>
<article-title>Integrating phenotype ontologies across multiple species.</article-title>
<source>Genome Biology</source>
<volume>11</volume>
<fpage>R2</fpage>
<pub-id pub-id-type="pmid">20064205</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0010708-Evidence1">
<label>78</label>
<mixed-citation publication-type="other">Evidence Codes Ontology. Available:
<ext-link ext-link-type="uri" xlink:href="http://obofoundry.org/cgi-bin/detail.cgi?id=evidence_code">http://obofoundry.org/cgi-bin/detail.cgi?id=evidence_code</ext-link>
</mixed-citation>
</ref>
<ref id="pone.0010708-Fricke1">
<label>79</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Fricke</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Eschmeyer</surname>
<given-names>WN</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>A guide to Fish Collections in the Catalog of Fishes database.</article-title>
<publisher-name>On-line version of 15 January 2010</publisher-name>
</mixed-citation>
</ref>
<ref id="pone.0010708-Clarke1">
<label>80</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Clarke</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Ksepka</surname>
<given-names>DT</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>NA</given-names>
</name>
<name>
<surname>Norell</surname>
<given-names>MA</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Combined phylogenetic analysis of a new North American fossil species confirms widespread Eocene distribution for stem rollers (Aves, Coracii).</article-title>
<source>Zoological Journal of the Linnean Society</source>
<volume>157</volume>
<fpage>586</fpage>
<lpage>611</lpage>
</mixed-citation>
</ref>
<ref id="pone.0010708-Santagata1">
<label>81</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Santagata</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Cohen</surname>
<given-names>BL</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Phoronid phylogenetics (Brachiopoda; Phoronata): evidence from morphological cladistics, small and large subunit rDNA sequences, and mitochondrial <italic>cox1</italic>.</article-title>
<source>Zoological Journal of the Linnean Society</source>
<volume>157</volume>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000182 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000182 | SxmlIndent | more
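
Once a record has been extracted and indented with the commands above, its reference list can be read with any XML tool. The following is a minimal sketch in Python, not part of Dilib: it assumes a single `<ref>` element has been copied into a string, and uses an abridged copy of reference 77 as sample input. The function name `summarize_ref` and the constant `SAMPLE_REF` are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: an abridged copy of reference 77 from the record,
# kept in the same JATS layout (author list shortened for brevity).
SAMPLE_REF = """
<ref id="pone.0010708-Mungall2">
<label>77</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name><surname>Mungall</surname><given-names>CJ</given-names></name>
<name><surname>Gkoutos</surname><given-names>GV</given-names></name>
</person-group>
<year>2010</year>
<article-title>Integrating phenotype ontologies across multiple species.</article-title>
<source>Genome Biology</source>
</mixed-citation>
</ref>
"""


def summarize_ref(ref_xml):
    """Return the id, label, year, author surnames and source of one <ref>."""
    ref = ET.fromstring(ref_xml.strip())
    return {
        "id": ref.get("id"),
        "label": ref.findtext("label"),
        "year": ref.findtext("mixed-citation/year"),
        # iter() walks the whole subtree, so nesting depth does not matter.
        "surnames": [s.text for s in ref.iter("surname")],
        # findtext() returns None when the element is absent (e.g. non-journal refs).
        "source": ref.findtext("mixed-citation/source"),
    }


summary = summarize_ref(SAMPLE_REF)
print(summary)
```

Entries that carry an `xlink:href` attribute would additionally need the `xlink` namespace declared on an enclosing element before they can be parsed in isolation; the sample avoids this by using a reference without links.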

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:2873956
   |texte=   Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:20505755" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a OcrV1 

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024