MERS exploration server

A k-mer-based method for the identification of phenotype-associated genomic biomarkers and predicting phenotypes of sequenced bacteria

Internal identifier: 000F87 (Pmc/Curation); previous: 000F86; next: 000F88

Authors: Erki Aun [Estonia]; Age Brauer [Estonia]; Veljo Kisand [Estonia]; Tanel Tenson [Estonia]; Maido Remm [Estonia]

RBID : PMC:6211763

Abstract

We have developed an easy-to-use and memory-efficient method called PhenotypeSeeker that (a) identifies phenotype-specific k-mers, (b) generates a k-mer-based statistical model for predicting a given phenotype and (c) predicts the phenotype from the sequencing data of a given bacterial isolate. The method was validated on 167 Klebsiella pneumoniae isolates (virulence), 200 Pseudomonas aeruginosa isolates (ciprofloxacin resistance) and 459 Clostridium difficile isolates (azithromycin resistance). The phenotype prediction models trained from these datasets achieved an F1-measure of 0.88 on the K. pneumoniae test set, 0.88 on the P. aeruginosa test set and 0.97 on the C. difficile test set. The F1-measures were the same for assembled sequences and raw sequencing data; however, building the model from assembled genomes is significantly faster. On these datasets, model building on a mid-range Linux server takes approximately 3 to 5 hours per phenotype if assembled genomes are used and 10 hours per phenotype if raw sequencing data are used. Phenotype prediction from assembled genomes takes less than one second per isolate. Thus, PhenotypeSeeker should be well-suited for predicting phenotypes from large sequencing datasets. PhenotypeSeeker is implemented in the Python programming language, is open-source software and is available on GitHub (https://github.com/bioinfo-ut/PhenotypeSeeker/).
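
To make the abstract's terms concrete, the following is a minimal sketch of two ideas it mentions: representing each isolate by the presence or absence of k-mers, and scoring predictions with the F1-measure. It is not PhenotypeSeeker's implementation. The function names (`kmers`, `presence_matrix`, `f1_measure`) and the toy sequences are hypothetical, and the sketch ignores details a real pipeline would likely handle, such as reverse complements and k-mer filtering.

```python
def kmers(sequence, k):
    """Return the set of all k-length substrings of a DNA sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def presence_matrix(genomes, k):
    """Build a sorted k-mer vocabulary and one 0/1 row per genome.

    Each row marks which vocabulary k-mers occur in that genome. A table
    of this shape could serve as input to a statistical model.
    """
    per_genome = [kmers(g, k) for g in genomes]
    vocabulary = sorted(set().union(*per_genome))
    rows = [[1 if km in present else 0 for km in vocabulary]
            for present in per_genome]
    return vocabulary, rows


def f1_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision or recall is zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy usage with made-up sequences (not data from the study).
vocabulary, rows = presence_matrix(["ACGT", "CGTA"], k=3)
score = f1_measure([1, 1, 0, 0], [1, 0, 1, 0])
```

Because the F1-measure combines precision and recall, it is less easily inflated than plain accuracy when one phenotype class is much rarer than the other.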


DOI: 10.1371/journal.pcbi.1006434
PubMed: 30346947
PubMed Central: 6211763

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A
<italic>k</italic>
-mer-based method for the identification of phenotype-associated genomic biomarkers and predicting phenotypes of sequenced bacteria</title>
<author>
<name sortKey="Aun, Erki" sort="Aun, Erki" uniqKey="Aun E" first="Erki" last="Aun">Erki Aun</name>
<affiliation wicri:level="1">
<nlm:aff id="aff001">
<addr-line>Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia</addr-line>
</nlm:aff>
<country xml:lang="fr">Estonie</country>
<wicri:regionArea>Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Brauer, Age" sort="Brauer, Age" uniqKey="Brauer A" first="Age" last="Brauer">Age Brauer</name>
<affiliation wicri:level="1">
<nlm:aff id="aff001">
<addr-line>Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia</addr-line>
</nlm:aff>
<country xml:lang="fr">Estonie</country>
<wicri:regionArea>Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Kisand, Veljo" sort="Kisand, Veljo" uniqKey="Kisand V" first="Veljo" last="Kisand">Veljo Kisand</name>
<affiliation wicri:level="1">
<nlm:aff id="aff002">
<addr-line>Institute of Technology, University of Tartu, Tartu, Estonia</addr-line>
</nlm:aff>
<country xml:lang="fr">Estonie</country>
<wicri:regionArea>Institute of Technology, University of Tartu, Tartu</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Tenson, Tanel" sort="Tenson, Tanel" uniqKey="Tenson T" first="Tanel" last="Tenson">Tanel Tenson</name>
<affiliation wicri:level="1">
<nlm:aff id="aff002">
<addr-line>Institute of Technology, University of Tartu, Tartu, Estonia</addr-line>
</nlm:aff>
<country xml:lang="fr">Estonie</country>
<wicri:regionArea>Institute of Technology, University of Tartu, Tartu</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Remm, Maido" sort="Remm, Maido" uniqKey="Remm M" first="Maido" last="Remm">Maido Remm</name>
<affiliation wicri:level="1">
<nlm:aff id="aff001">
<addr-line>Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia</addr-line>
</nlm:aff>
<country xml:lang="fr">Estonie</country>
<wicri:regionArea>Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">30346947</idno>
<idno type="pmc">6211763</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6211763</idno>
<idno type="RBID">PMC:6211763</idno>
<idno type="doi">10.1371/journal.pcbi.1006434</idno>
<date when="2018">2018</date>
<idno type="wicri:Area/Pmc/Corpus">000F87</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000F87</idno>
<idno type="wicri:Area/Pmc/Curation">000F87</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000F87</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">A
<italic>k</italic>
-mer-based method for the identification of phenotype-associated genomic biomarkers and predicting phenotypes of sequenced bacteria</title>
<author>
<name sortKey="Aun, Erki" sort="Aun, Erki" uniqKey="Aun E" first="Erki" last="Aun">Erki Aun</name>
<affiliation wicri:level="1">
<nlm:aff id="aff001">
<addr-line>Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia</addr-line>
</nlm:aff>
<country xml:lang="fr">Estonie</country>
<wicri:regionArea>Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Brauer, Age" sort="Brauer, Age" uniqKey="Brauer A" first="Age" last="Brauer">Age Brauer</name>
<affiliation wicri:level="1">
<nlm:aff id="aff001">
<addr-line>Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia</addr-line>
</nlm:aff>
<country xml:lang="fr">Estonie</country>
<wicri:regionArea>Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Kisand, Veljo" sort="Kisand, Veljo" uniqKey="Kisand V" first="Veljo" last="Kisand">Veljo Kisand</name>
<affiliation wicri:level="1">
<nlm:aff id="aff002">
<addr-line>Institute of Technology, University of Tartu, Tartu, Estonia</addr-line>
</nlm:aff>
<country xml:lang="fr">Estonie</country>
<wicri:regionArea>Institute of Technology, University of Tartu, Tartu</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Tenson, Tanel" sort="Tenson, Tanel" uniqKey="Tenson T" first="Tanel" last="Tenson">Tanel Tenson</name>
<affiliation wicri:level="1">
<nlm:aff id="aff002">
<addr-line>Institute of Technology, University of Tartu, Tartu, Estonia</addr-line>
</nlm:aff>
<country xml:lang="fr">Estonie</country>
<wicri:regionArea>Institute of Technology, University of Tartu, Tartu</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Remm, Maido" sort="Remm, Maido" uniqKey="Remm M" first="Maido" last="Remm">Maido Remm</name>
<affiliation wicri:level="1">
<nlm:aff id="aff001">
<addr-line>Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia</addr-line>
</nlm:aff>
<country xml:lang="fr">Estonie</country>
<wicri:regionArea>Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">PLoS Computational Biology</title>
<idno type="ISSN">1553-734X</idno>
<idno type="eISSN">1553-7358</idno>
<imprint>
<date when="2018">2018</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>We have developed an easy-to-use and memory-efficient method called PhenotypeSeeker that (a) identifies phenotype-specific k-mers, (b) generates a
<italic>k</italic>
-mer-based statistical model for predicting a given phenotype and (c) predicts the phenotype from the sequencing data of a given bacterial isolate. The method was validated on 167
<italic>Klebsiella pneumoniae</italic>
isolates (virulence), 200
<italic>Pseudomonas aeruginosa</italic>
isolates (ciprofloxacin resistance) and 459
<italic>Clostridium difficile</italic>
isolates (azithromycin resistance). The phenotype prediction models trained from these datasets achieved an F1-measure of 0.88 on the
<italic>K</italic>
.
<italic>pneumoniae</italic>
test set, 0.88 on the
<italic>P</italic>
.
<italic>aeruginosa</italic>
test set and 0.97 on the
<italic>C</italic>
.
<italic>difficile</italic>
test set. The F1-measures were the same for assembled sequences and raw sequencing data; however, building the model from assembled genomes is significantly faster. On these datasets, model building on a mid-range Linux server takes approximately 3 to 5 hours per phenotype if assembled genomes are used and 10 hours per phenotype if raw sequencing data are used. Phenotype prediction from assembled genomes takes less than one second per isolate. Thus, PhenotypeSeeker should be well-suited for predicting phenotypes from large sequencing datasets. PhenotypeSeeker is implemented in the Python programming language, is open-source software and is available on GitHub (
<ext-link ext-link-type="uri" xlink:href="https://github.com/bioinfo-ut/PhenotypeSeeker/">https://github.com/bioinfo-ut/PhenotypeSeeker/</ext-link>
).</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Kisand, V" uniqKey="Kisand V">V Kisand</name>
</author>
<author>
<name sortKey="Lettieri, T" uniqKey="Lettieri T">T Lettieri</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Crofts, Ts" uniqKey="Crofts T">TS Crofts</name>
</author>
<author>
<name sortKey="Gasparrini, Aj" uniqKey="Gasparrini A">AJ Gasparrini</name>
</author>
<author>
<name sortKey="Dantas, G" uniqKey="Dantas G">G Dantas</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bakour, S" uniqKey="Bakour S">S Bakour</name>
</author>
<author>
<name sortKey="Sankar, Sa" uniqKey="Sankar S">SA Sankar</name>
</author>
<author>
<name sortKey="Rathored, J" uniqKey="Rathored J">J Rathored</name>
</author>
<author>
<name sortKey="Biagini, P" uniqKey="Biagini P">P Biagini</name>
</author>
<author>
<name sortKey="Raoult, D" uniqKey="Raoult D">D Raoult</name>
</author>
<author>
<name sortKey="Fournier, P E" uniqKey="Fournier P">P-E Fournier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wheeler, Ne" uniqKey="Wheeler N">NE Wheeler</name>
</author>
<author>
<name sortKey="Gardner, Pp" uniqKey="Gardner P">PP Gardner</name>
</author>
<author>
<name sortKey="Barquist, L" uniqKey="Barquist L">L Barquist</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, Y" uniqKey="Li Y">Y Li</name>
</author>
<author>
<name sortKey="Metcalf, Bj" uniqKey="Metcalf B">BJ Metcalf</name>
</author>
<author>
<name sortKey="Chochua, S" uniqKey="Chochua S">S Chochua</name>
</author>
<author>
<name sortKey="Li, Z" uniqKey="Li Z">Z Li</name>
</author>
<author>
<name sortKey="Gertz, Re" uniqKey="Gertz R">RE Gertz</name>
</author>
<author>
<name sortKey="Walker, H" uniqKey="Walker H">H Walker</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lees, Ja" uniqKey="Lees J">JA Lees</name>
</author>
<author>
<name sortKey="Vehkala, M" uniqKey="Vehkala M">M Vehkala</name>
</author>
<author>
<name sortKey="V Lim Ki, N" uniqKey="V Lim Ki N">N Välimäki</name>
</author>
<author>
<name sortKey="Harris, Sr" uniqKey="Harris S">SR Harris</name>
</author>
<author>
<name sortKey="Chewapreecha, C" uniqKey="Chewapreecha C">C Chewapreecha</name>
</author>
<author>
<name sortKey="Croucher, Nj" uniqKey="Croucher N">NJ Croucher</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nguyen, M" uniqKey="Nguyen M">M Nguyen</name>
</author>
<author>
<name sortKey="Brettin, T" uniqKey="Brettin T">T Brettin</name>
</author>
<author>
<name sortKey="Long, Sw" uniqKey="Long S">SW Long</name>
</author>
<author>
<name sortKey="Musser, Jm" uniqKey="Musser J">JM Musser</name>
</author>
<author>
<name sortKey="Olsen, Rj" uniqKey="Olsen R">RJ Olsen</name>
</author>
<author>
<name sortKey="Olson, R" uniqKey="Olson R">R Olson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Davis, Jj" uniqKey="Davis J">JJ Davis</name>
</author>
<author>
<name sortKey="Boisvert, S" uniqKey="Boisvert S">S Boisvert</name>
</author>
<author>
<name sortKey="Brettin, T" uniqKey="Brettin T">T Brettin</name>
</author>
<author>
<name sortKey="Kenyon, Rw" uniqKey="Kenyon R">RW Kenyon</name>
</author>
<author>
<name sortKey="Mao, C" uniqKey="Mao C">C Mao</name>
</author>
<author>
<name sortKey="Olson, R" uniqKey="Olson R">R Olson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Drouin, A" uniqKey="Drouin A">A Drouin</name>
</author>
<author>
<name sortKey="Giguere, S" uniqKey="Giguere S">S Giguère</name>
</author>
<author>
<name sortKey="Deraspe, M" uniqKey="Deraspe M">M Déraspe</name>
</author>
<author>
<name sortKey="Marchand, M" uniqKey="Marchand M">M Marchand</name>
</author>
<author>
<name sortKey="Tyers, M" uniqKey="Tyers M">M Tyers</name>
</author>
<author>
<name sortKey="Loo, Vg" uniqKey="Loo V">VG Loo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marinier, E" uniqKey="Marinier E">E Marinier</name>
</author>
<author>
<name sortKey="Zaheer, R" uniqKey="Zaheer R">R Zaheer</name>
</author>
<author>
<name sortKey="Berry, C" uniqKey="Berry C">C Berry</name>
</author>
<author>
<name sortKey="Weedmark, Ka" uniqKey="Weedmark K">KA Weedmark</name>
</author>
<author>
<name sortKey="Domaratzki, M" uniqKey="Domaratzki M">M Domaratzki</name>
</author>
<author>
<name sortKey="Mabon, P" uniqKey="Mabon P">P Mabon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kaplinski, L" uniqKey="Kaplinski L">L Kaplinski</name>
</author>
<author>
<name sortKey="Lepamets, M" uniqKey="Lepamets M">M Lepamets</name>
</author>
<author>
<name sortKey="Remm, M" uniqKey="Remm M">M Remm</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ondov, Bd" uniqKey="Ondov B">BD Ondov</name>
</author>
<author>
<name sortKey="Treangen, Tj" uniqKey="Treangen T">TJ Treangen</name>
</author>
<author>
<name sortKey="Melsted, P" uniqKey="Melsted P">P Melsted</name>
</author>
<author>
<name sortKey="Mallonee, Ab" uniqKey="Mallonee A">AB Mallonee</name>
</author>
<author>
<name sortKey="Bergman, Nh" uniqKey="Bergman N">NH Bergman</name>
</author>
<author>
<name sortKey="Koren, S" uniqKey="Koren S">S Koren</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gerstein, M" uniqKey="Gerstein M">M Gerstein</name>
</author>
<author>
<name sortKey="Sonnhammer, El" uniqKey="Sonnhammer E">EL Sonnhammer</name>
</author>
<author>
<name sortKey="Chothia, C" uniqKey="Chothia C">C Chothia</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pajuste, F D" uniqKey="Pajuste F">F-D Pajuste</name>
</author>
<author>
<name sortKey="Kaplinski, L" uniqKey="Kaplinski L">L Kaplinski</name>
</author>
<author>
<name sortKey="Mols, M" uniqKey="Mols M">M Möls</name>
</author>
<author>
<name sortKey="Puurand, T" uniqKey="Puurand T">T Puurand</name>
</author>
<author>
<name sortKey="Lepamets, M" uniqKey="Lepamets M">M Lepamets</name>
</author>
<author>
<name sortKey="Remm, M" uniqKey="Remm M">M Remm</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Barker, Kf" uniqKey="Barker K">KF Barker</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Susceptibilitytesting Ec On, A" uniqKey="Susceptibilitytesting Ec On A">A SusceptibilityTesting EC on</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fabrega, A" uniqKey="Fabrega A">A Fàbrega</name>
</author>
<author>
<name sortKey="Madurga, S" uniqKey="Madurga S">S Madurga</name>
</author>
<author>
<name sortKey="Giralt, E" uniqKey="Giralt E">E Giralt</name>
</author>
<author>
<name sortKey="Vila, J" uniqKey="Vila J">J Vila</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jalal, S" uniqKey="Jalal S">S Jalal</name>
</author>
<author>
<name sortKey="Wretlind, B" uniqKey="Wretlind B">B Wretlind</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kaminska, Kh" uniqKey="Kaminska K">KH Kaminska</name>
</author>
<author>
<name sortKey="Purta, E" uniqKey="Purta E">E Purta</name>
</author>
<author>
<name sortKey="Hansen, Lh" uniqKey="Hansen L">LH Hansen</name>
</author>
<author>
<name sortKey="Bujnicki, Jm" uniqKey="Bujnicki J">JM Bujnicki</name>
</author>
<author>
<name sortKey="Vester, B" uniqKey="Vester B">B Vester</name>
</author>
<author>
<name sortKey="Long, Ks" uniqKey="Long K">KS Long</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Carniel, E" uniqKey="Carniel E">E Carniel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chen, Yt" uniqKey="Chen Y">YT Chen</name>
</author>
<author>
<name sortKey="Chang, Hy" uniqKey="Chang H">HY Chang</name>
</author>
<author>
<name sortKey="Lai, Yc" uniqKey="Lai Y">YC Lai</name>
</author>
<author>
<name sortKey="Pan, Cc" uniqKey="Pan C">CC Pan</name>
</author>
<author>
<name sortKey="Tsai, Sf" uniqKey="Tsai S">SF Tsai</name>
</author>
<author>
<name sortKey="Peng, Hl" uniqKey="Peng H">HL Peng</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lagos, R" uniqKey="Lagos R">R Lagos</name>
</author>
<author>
<name sortKey="Baeza, M" uniqKey="Baeza M">M Baeza</name>
</author>
<author>
<name sortKey="Corsini, G" uniqKey="Corsini G">G Corsini</name>
</author>
<author>
<name sortKey="Hetz, C" uniqKey="Hetz C">C Hetz</name>
</author>
<author>
<name sortKey="Strahsburger, E" uniqKey="Strahsburger E">E Strahsburger</name>
</author>
<author>
<name sortKey="Castillo, Ja" uniqKey="Castillo J">JA Castillo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nassif, X" uniqKey="Nassif X">X Nassif</name>
</author>
<author>
<name sortKey="Sansonetti, Pj" uniqKey="Sansonetti P">PJ Sansonetti</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Putze, J" uniqKey="Putze J">J Putze</name>
</author>
<author>
<name sortKey="Hennequin, C" uniqKey="Hennequin C">C Hennequin</name>
</author>
<author>
<name sortKey="Nougayrede, Jp" uniqKey="Nougayrede J">JP Nougayrède</name>
</author>
<author>
<name sortKey="Zhang, W" uniqKey="Zhang W">W Zhang</name>
</author>
<author>
<name sortKey="Homburg, S" uniqKey="Homburg S">S Homburg</name>
</author>
<author>
<name sortKey="Karch, H" uniqKey="Karch H">H Karch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chou, Hc" uniqKey="Chou H">HC Chou</name>
</author>
<author>
<name sortKey="Lee, Cz" uniqKey="Lee C">CZ Lee</name>
</author>
<author>
<name sortKey="Ma, Lc" uniqKey="Ma L">LC Ma</name>
</author>
<author>
<name sortKey="Fang, Ct" uniqKey="Fang C">CT Fang</name>
</author>
<author>
<name sortKey="Chang, Sc" uniqKey="Chang S">SC Chang</name>
</author>
<author>
<name sortKey="Wang, Jt" uniqKey="Wang J">JT Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cheng, Hy" uniqKey="Cheng H">HY Cheng</name>
</author>
<author>
<name sortKey="Chen, Ys" uniqKey="Chen Y">YS Chen</name>
</author>
<author>
<name sortKey="Wu, Cy" uniqKey="Wu C">CY Wu</name>
</author>
<author>
<name sortKey="Chang, Hy" uniqKey="Chang H">HY Chang</name>
</author>
<author>
<name sortKey="Lai, Yc" uniqKey="Lai Y">YC Lai</name>
</author>
<author>
<name sortKey="Peng, Hl" uniqKey="Peng H">HL Peng</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lai, Y" uniqKey="Lai Y">Y Lai</name>
</author>
<author>
<name sortKey="Peng, H" uniqKey="Peng H">H Peng</name>
</author>
<author>
<name sortKey="Chang, H" uniqKey="Chang H">H Chang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ma, L C" uniqKey="Ma L">L-C Ma</name>
</author>
<author>
<name sortKey="Fang, C T" uniqKey="Fang C">C-T Fang</name>
</author>
<author>
<name sortKey="Lee, C Z" uniqKey="Lee C">C-Z Lee</name>
</author>
<author>
<name sortKey="Shun, C T" uniqKey="Shun C">C-T Shun</name>
</author>
<author>
<name sortKey="Wang, J T" uniqKey="Wang J">J-T Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lai, Yc" uniqKey="Lai Y">YC Lai</name>
</author>
<author>
<name sortKey="Lin, G T" uniqKey="Lin G">G-T Lin</name>
</author>
<author>
<name sortKey="Yang, S L" uniqKey="Yang S">S-L Yang</name>
</author>
<author>
<name sortKey="Chang, H Y" uniqKey="Chang H">H-Y Chang</name>
</author>
<author>
<name sortKey="Peng, H L" uniqKey="Peng H">H-L Peng</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bankevich, A" uniqKey="Bankevich A">A Bankevich</name>
</author>
<author>
<name sortKey="Nurk, S" uniqKey="Nurk S">S Nurk</name>
</author>
<author>
<name sortKey="Antipov, D" uniqKey="Antipov D">D Antipov</name>
</author>
<author>
<name sortKey="Gurevich, Aa" uniqKey="Gurevich A">AA Gurevich</name>
</author>
<author>
<name sortKey="Dvorkin, M" uniqKey="Dvorkin M">M Dvorkin</name>
</author>
<author>
<name sortKey="Kulikov, As" uniqKey="Kulikov A">AS Kulikov</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Holt, Ke" uniqKey="Holt K">KE Holt</name>
</author>
<author>
<name sortKey="Wertheim, H" uniqKey="Wertheim H">H Wertheim</name>
</author>
<author>
<name sortKey="Zadoks, Rn" uniqKey="Zadoks R">RN Zadoks</name>
</author>
<author>
<name sortKey="Baker, S" uniqKey="Baker S">S Baker</name>
</author>
<author>
<name sortKey="Whitehouse, Ca" uniqKey="Whitehouse C">CA Whitehouse</name>
</author>
<author>
<name sortKey="Dance, D" uniqKey="Dance D">D Dance</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Knight, R" uniqKey="Knight R">R Knight</name>
</author>
<author>
<name sortKey="Maxwell, P" uniqKey="Maxwell P">P Maxwell</name>
</author>
<author>
<name sortKey="Birmingham, A" uniqKey="Birmingham A">A Birmingham</name>
</author>
<author>
<name sortKey="Carnes, J" uniqKey="Carnes J">J Carnes</name>
</author>
<author>
<name sortKey="Caporaso, Jg" uniqKey="Caporaso J">JG Caporaso</name>
</author>
<author>
<name sortKey="Easton, Bc" uniqKey="Easton B">BC Easton</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Josh Pasek, A" uniqKey="Josh Pasek A">A Josh Pasek</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pedregosa, F" uniqKey="Pedregosa F">F Pedregosa</name>
</author>
<author>
<name sortKey="Varoquaux, G" uniqKey="Varoquaux G">G Varoquaux</name>
</author>
<author>
<name sortKey="Gramfort, A" uniqKey="Gramfort A">A Gramfort</name>
</author>
<author>
<name sortKey="Michel, V" uniqKey="Michel V">V Michel</name>
</author>
<author>
<name sortKey="Thirion, B" uniqKey="Thirion B">B Thirion</name>
</author>
<author>
<name sortKey="Grisel, O" uniqKey="Grisel O">O Grisel</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">PLoS Comput Biol</journal-id>
<journal-id journal-id-type="iso-abbrev">PLoS Comput. Biol</journal-id>
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="pmc">ploscomp</journal-id>
<journal-title-group>
<journal-title>PLoS Computational Biology</journal-title>
</journal-title-group>
<issn pub-type="ppub">1553-734X</issn>
<issn pub-type="epub">1553-7358</issn>
<publisher>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, CA USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">30346947</article-id>
<article-id pub-id-type="pmc">6211763</article-id>
<article-id pub-id-type="doi">10.1371/journal.pcbi.1006434</article-id>
<article-id pub-id-type="publisher-id">PCOMPBIOL-D-18-00544</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Microbiology</subject>
<subj-group>
<subject>Medical Microbiology</subject>
<subj-group>
<subject>Microbial Pathogens</subject>
<subj-group>
<subject>Bacterial Pathogens</subject>
<subj-group>
<subject>Pseudomonas Aeruginosa</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Medicine and Health Sciences</subject>
<subj-group>
<subject>Pathology and Laboratory Medicine</subject>
<subj-group>
<subject>Pathogens</subject>
<subj-group>
<subject>Microbial Pathogens</subject>
<subj-group>
<subject>Bacterial Pathogens</subject>
<subj-group>
<subject>Pseudomonas Aeruginosa</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Organisms</subject>
<subj-group>
<subject>Bacteria</subject>
<subj-group>
<subject>Pseudomonas</subject>
<subj-group>
<subject>Pseudomonas Aeruginosa</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Research and Analysis Methods</subject>
<subj-group>
<subject>Mathematical and Statistical Techniques</subject>
<subj-group>
<subject>Statistical Methods</subject>
<subj-group>
<subject>Forecasting</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Physical Sciences</subject>
<subj-group>
<subject>Mathematics</subject>
<subj-group>
<subject>Statistics</subject>
<subj-group>
<subject>Statistical Methods</subject>
<subj-group>
<subject>Forecasting</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Microbiology</subject>
<subj-group>
<subject>Microbial Control</subject>
<subj-group>
<subject>Antimicrobial Resistance</subject>
<subj-group>
<subject>Antibiotic Resistance</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Medicine and Health Sciences</subject>
<subj-group>
<subject>Pharmacology</subject>
<subj-group>
<subject>Antimicrobial Resistance</subject>
<subj-group>
<subject>Antibiotic Resistance</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Organisms</subject>
<subj-group>
<subject>Bacteria</subject>
<subj-group>
<subject>Gut Bacteria</subject>
<subj-group>
<subject>Clostridium Difficile</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Computational Biology</subject>
<subj-group>
<subject>Genome Analysis</subject>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Genetics</subject>
<subj-group>
<subject>Genomics</subject>
<subj-group>
<subject>Genome Analysis</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Organisms</subject>
<subj-group>
<subject>Bacteria</subject>
<subj-group>
<subject>Klebsiella</subject>
<subj-group>
<subject>Klebsiella Pneumoniae</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Microbiology</subject>
<subj-group>
<subject>Medical Microbiology</subject>
<subj-group>
<subject>Microbial Pathogens</subject>
<subj-group>
<subject>Bacterial Pathogens</subject>
<subj-group>
<subject>Klebsiella</subject>
<subj-group>
<subject>Klebsiella Pneumoniae</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Medicine and Health Sciences</subject>
<subj-group>
<subject>Pathology and Laboratory Medicine</subject>
<subj-group>
<subject>Pathogens</subject>
<subj-group>
<subject>Microbial Pathogens</subject>
<subj-group>
<subject>Bacterial Pathogens</subject>
<subj-group>
<subject>Klebsiella</subject>
<subj-group>
<subject>Klebsiella Pneumoniae</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Molecular Biology</subject>
<subj-group>
<subject>Molecular Biology Techniques</subject>
<subj-group>
<subject>Sequencing Techniques</subject>
<subj-group>
<subject>Genome Sequencing</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Research and Analysis Methods</subject>
<subj-group>
<subject>Molecular Biology Techniques</subject>
<subj-group>
<subject>Sequencing Techniques</subject>
<subj-group>
<subject>Genome Sequencing</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Genetics</subject>
<subj-group>
<subject>Gene Identification and Analysis</subject>
<subj-group>
<subject>Mutation Detection</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>A
<italic>k</italic>
-mer-based method for the identification of phenotype-associated genomic biomarkers and predicting phenotypes of sequenced bacteria</article-title>
<alt-title alt-title-type="running-head">A method to identify phenotype-specific
<italic>k</italic>
-mers and predict phenotypes</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<contrib-id authenticated="true" contrib-id-type="orcid">http://orcid.org/0000-0001-7446-3524</contrib-id>
<name>
<surname>Aun</surname>
<given-names>Erki</given-names>
</name>
<role content-type="http://credit.casrai.org/">Software</role>
<role content-type="http://credit.casrai.org/">Writing – original draft</role>
<xref ref-type="aff" rid="aff001">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="cor001">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Brauer</surname>
<given-names>Age</given-names>
</name>
<role content-type="http://credit.casrai.org/">Investigation</role>
<role content-type="http://credit.casrai.org/">Validation</role>
<role content-type="http://credit.casrai.org/">Writing – original draft</role>
<xref ref-type="aff" rid="aff001">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kisand</surname>
<given-names>Veljo</given-names>
</name>
<role content-type="http://credit.casrai.org/">Data curation</role>
<role content-type="http://credit.casrai.org/">Resources</role>
<role content-type="http://credit.casrai.org/">Writing – original draft</role>
<xref ref-type="aff" rid="aff002">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Tenson</surname>
<given-names>Tanel</given-names>
</name>
<role content-type="http://credit.casrai.org/">Data curation</role>
<role content-type="http://credit.casrai.org/">Resources</role>
<role content-type="http://credit.casrai.org/">Writing – original draft</role>
<xref ref-type="aff" rid="aff002">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<contrib-id authenticated="true" contrib-id-type="orcid">http://orcid.org/0000-0003-3966-8422</contrib-id>
<name>
<surname>Remm</surname>
<given-names>Maido</given-names>
</name>
<role content-type="http://credit.casrai.org/">Conceptualization</role>
<role content-type="http://credit.casrai.org/">Funding acquisition</role>
<role content-type="http://credit.casrai.org/">Methodology</role>
<role content-type="http://credit.casrai.org/">Project administration</role>
<role content-type="http://credit.casrai.org/">Supervision</role>
<role content-type="http://credit.casrai.org/">Writing – original draft</role>
<xref ref-type="aff" rid="aff001">
<sup>1</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff001">
<label>1</label>
<addr-line>Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia</addr-line>
</aff>
<aff id="aff002">
<label>2</label>
<addr-line>Institute of Technology, University of Tartu, Tartu, Estonia</addr-line>
</aff>
<contrib-group>
<contrib contrib-type="editor">
<name>
<surname>Ouzounis</surname>
<given-names>Christos A.</given-names>
</name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"></xref>
</contrib>
</contrib-group>
<aff id="edit1">
<addr-line>CPERI, GREECE</addr-line>
</aff>
<author-notes>
<fn fn-type="COI-statement" id="coi001">
<p>The authors have declared that no competing interests exist.</p>
</fn>
<corresp id="cor001">* E-mail:
<email>erki.aun@ut.ee</email>
</corresp>
</author-notes>
<pub-date pub-type="epub">
<day>22</day>
<month>10</month>
<year>2018</year>
</pub-date>
<pub-date pub-type="collection">
<month>10</month>
<year>2018</year>
</pub-date>
<volume>14</volume>
<issue>10</issue>
<elocation-id>e1006434</elocation-id>
<history>
<date date-type="received">
<day>9</day>
<month>4</month>
<year>2018</year>
</date>
<date date-type="accepted">
<day>15</day>
<month>8</month>
<year>2018</year>
</date>
</history>
<permissions>
<copyright-statement>© 2018 Aun et al</copyright-statement>
<copyright-year>2018</copyright-year>
<copyright-holder>Aun et al</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open access article distributed under the terms of the
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution License</ext-link>
, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="pcbi.1006434.pdf"></self-uri>
<abstract>
<p>We have developed an easy-to-use and memory-efficient method called PhenotypeSeeker that (a) identifies phenotype-specific k-mers, (b) generates a
<italic>k</italic>
-mer-based statistical model for predicting a given phenotype and (c) predicts the phenotype from the sequencing data of a given bacterial isolate. The method was validated on 167
<italic>Klebsiella pneumoniae</italic>
isolates (virulence), 200
<italic>Pseudomonas aeruginosa</italic>
isolates (ciprofloxacin resistance) and 459
<italic>Clostridium difficile</italic>
isolates (azithromycin resistance). The phenotype prediction models trained on these datasets obtained an F1-measure of 0.88 on the
<italic>K</italic>
.
<italic>pneumoniae</italic>
test set, 0.88 on the
<italic>P</italic>
.
<italic>aeruginosa</italic>
test set and 0.97 on the
<italic>C</italic>
.
<italic>difficile</italic>
test set. The F1-measures were the same for assembled sequences and raw sequencing data; however, building the model from assembled genomes is significantly faster. On these datasets, model building on a mid-range Linux server takes approximately 3 to 5 hours per phenotype if assembled genomes are used and 10 hours per phenotype if raw sequencing data are used. Phenotype prediction from assembled genomes takes less than one second per isolate. Thus, PhenotypeSeeker should be well suited for predicting phenotypes from large sequencing datasets. PhenotypeSeeker is implemented in the Python programming language, is open-source software and is available on GitHub (
<ext-link ext-link-type="uri" xlink:href="https://github.com/bioinfo-ut/PhenotypeSeeker/">https://github.com/bioinfo-ut/PhenotypeSeeker/</ext-link>
).</p>
</abstract>
<abstract abstract-type="summary">
<title>Author summary</title>
<p>Predicting phenotypic properties of bacterial isolates from their genomic sequences has numerous potential applications. A good example is the prediction of antimicrobial resistance and virulence phenotypes for use in medical diagnostics. We have developed a method that is able to predict phenotypes of interest from the genomic sequence of the isolate within seconds. The method uses a statistical model that can be trained automatically on isolates with known phenotypes. The method is implemented in the Python programming language and can be run on a low-end Linux server or on a laptop computer.</p>
</abstract>
<funding-group>
<award-group id="award001">
<funding-source>
<institution>Estonian Research Council</institution>
</funding-source>
<award-id>IUT2-22</award-id>
<principal-award-recipient>
<name>
<surname>Tenson</surname>
<given-names>Tanel</given-names>
</name>
</principal-award-recipient>
</award-group>
<award-group id="award002">
<funding-source>
<institution>Estonian Research Council</institution>
</funding-source>
<award-id>IUT34-11</award-id>
<principal-award-recipient>
<contrib-id authenticated="true" contrib-id-type="orcid">http://orcid.org/0000-0003-3966-8422</contrib-id>
<name>
<surname>Remm</surname>
<given-names>Maido</given-names>
</name>
</principal-award-recipient>
</award-group>
<award-group id="award003">
<funding-source>
<institution>EU ERDF</institution>
</funding-source>
<award-id>2014-2020.4.01.15-0012</award-id>
</award-group>
<award-group id="award004">
<funding-source>
<institution>EU ERDF</institution>
</funding-source>
<award-id>2014-2020.4.01.15-0013</award-id>
</award-group>
<funding-statement>This work was funded by institutional grants IUT34-11 (MR) and IUT2-22 (TT) from the Estonian Research Council (
<ext-link ext-link-type="uri" xlink:href="http://www.etag.ee/en/estonian-research-council/">http://www.etag.ee/en/estonian-research-council/</ext-link>
) and by grants No. 2014-2020.4.01.15-0012 (to the Estonian Centre of Excellence in Genomics and Translational Medicine) and No. 2014-2020.4.01.15-0013 (to the Estonian Centre of Excellence in Molecular Cell Engineering) from the European Regional Development Fund (
<ext-link ext-link-type="uri" xlink:href="http://ec.europa.eu/regional_policy/en/funding/erdf/">http://ec.europa.eu/regional_policy/en/funding/erdf/</ext-link>
). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</funding-statement>
</funding-group>
<counts>
<fig-count count="4"></fig-count>
<table-count count="2"></table-count>
<page-count count="17"></page-count>
</counts>
<custom-meta-group>
<custom-meta>
<meta-name>PLOS Publication Stage</meta-name>
<meta-value>vor-update-to-uncorrected-proof</meta-value>
</custom-meta>
<custom-meta>
<meta-name>Publication Update</meta-name>
<meta-value>2018-11-01</meta-value>
</custom-meta>
<custom-meta id="data-availability">
<meta-name>Data Availability</meta-name>
<meta-value>PhenotypeSeeker software is available at GitHub (
<ext-link ext-link-type="uri" xlink:href="https://github.com/bioinfo-ut/PhenotypeSeeker">https://github.com/bioinfo-ut/PhenotypeSeeker</ext-link>
). C. difficile genomes used for software validation are available from the European Nucleotide Archive [EMBL:PRJEB11776 (
<ext-link ext-link-type="uri" xlink:href="http://www.ebi.ac.uk/ena/data/view/PRJEB11776">http://www.ebi.ac.uk/ena/data/view/PRJEB11776</ext-link>
)]. The binary phenotypes of azithromycin resistance for these C. difficile genomes are from Drouin et al. 2016 (Predictive computational phenotyping and biomarker discovery using reference-free genome comparisons. BMC Genomics [Internet]. 17(1):754. Available from:
<ext-link ext-link-type="uri" xlink:href="http://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-2889-6">http://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-2889-6</ext-link>
). K. pneumoniae genomes used for software validation are available from the European Nucleotide Archive [EMBL:PRJEB2111 (
<ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/data/view/PRJEB2111">https://www.ebi.ac.uk/ena/data/view/PRJEB2111</ext-link>
)]. The binary phenotypes of infection status (infection/carriage) for these K. pneumoniae genomes are from Holt et al. 2015 (Genomic analysis of diversity, population structure, virulence, and antimicrobial resistance in Klebsiella pneumoniae, an urgent threat to public health. Proc Natl Acad Sci [Internet]. 112(27):E3574–81. Available from:
<ext-link ext-link-type="uri" xlink:href="http://www.pnas.org/lookup/doi/10.1073/pnas.1501049112">http://www.pnas.org/lookup/doi/10.1073/pnas.1501049112</ext-link>
). The P. aeruginosa dataset used for software validation is available from the NCBI's BioProject database [Accession: PRJNA244279 (
<ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA244279">https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA244279</ext-link>
)].</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
<notes>
<title>Data Availability</title>
<p>PhenotypeSeeker software is available at GitHub (
<ext-link ext-link-type="uri" xlink:href="https://github.com/bioinfo-ut/PhenotypeSeeker">https://github.com/bioinfo-ut/PhenotypeSeeker</ext-link>
). C. difficile genomes used for software validation are available from the European Nucleotide Archive [EMBL:PRJEB11776 (
<ext-link ext-link-type="uri" xlink:href="http://www.ebi.ac.uk/ena/data/view/PRJEB11776">http://www.ebi.ac.uk/ena/data/view/PRJEB11776</ext-link>
)]. The binary phenotypes of azithromycin resistance for these C. difficile genomes are from Drouin et al. 2016 (Predictive computational phenotyping and biomarker discovery using reference-free genome comparisons. BMC Genomics [Internet]. 17(1):754. Available from:
<ext-link ext-link-type="uri" xlink:href="http://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-2889-6">http://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-2889-6</ext-link>
). K. pneumoniae genomes used for software validation are available from the European Nucleotide Archive [EMBL:PRJEB2111 (
<ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/ena/data/view/PRJEB2111">https://www.ebi.ac.uk/ena/data/view/PRJEB2111</ext-link>
)]. The binary phenotypes of infection status (infection/carriage) for these K. pneumoniae genomes are from Holt et al. 2015 (Genomic analysis of diversity, population structure, virulence, and antimicrobial resistance in Klebsiella pneumoniae, an urgent threat to public health. Proc Natl Acad Sci [Internet]. 112(27):E3574–81. Available from:
<ext-link ext-link-type="uri" xlink:href="http://www.pnas.org/lookup/doi/10.1073/pnas.1501049112">http://www.pnas.org/lookup/doi/10.1073/pnas.1501049112</ext-link>
). The P. aeruginosa dataset used for software validation is available from the NCBI's BioProject database [Accession: PRJNA244279 (
<ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA244279">https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA244279</ext-link>
)].</p>
</notes>
</front>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F87 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000F87 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:6211763
   |texte=   A k-mer-based method for the identification of phenotype-associated genomic biomarkers and predicting phenotypes of sequenced bacteria
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:30346947" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021