MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

EnSVMB: Metagenomics Fragments Classification using Ensemble SVM and BLAST

Internal identifier: 000E05 (Main/Curation); previous: 000E04; next: 000E06

Authors: Yuan Jiang [People's Republic of China]; Jun Wang [People's Republic of China]; Dawen Xia [People's Republic of China]; Guoxian Yu [People's Republic of China]

RBID : PMC:5573435

Abstract

Metagenomics brings new discoveries and insights into the uncultured microbial world. One fundamental task in metagenomics analysis is to determine the taxonomy of raw sequence fragments. Modern sequencing technologies produce relatively short fragments and greatly increase the number of fragments, making taxonomic classification considerably more difficult than before. Fast and accurate techniques are therefore needed to classify large-scale fragments. We propose EnSVM (Ensemble Support Vector Machine) and its advanced variant EnSVMB (EnSVM with BLAST) to accurately classify fragments. EnSVM divides fragments into a large confident set and a small diffident set, based on whether the fragments get consistent or inconsistent predictions from linear SVMs trained with different k-mers. An empirical study shows that the sensitivity and specificity of EnSVM on the confident set are higher than 90% and 97%, respectively, but on the diffident set they are lower than 60% and 75%. To further improve performance on the diffident set, EnSVMB uses the best BLAST hits to reclassify fragments in that set. Experimental results show that EnSVM can efficiently and effectively divide fragments into confident and diffident sets, and that EnSVMB achieves higher accuracy and sensitivity and more true positives than related state-of-the-art methods, with specificity comparable to the best of them.
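
The confident/diffident split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, "consistent" is assumed here to mean unanimous agreement among the per-k classifiers, the SVM predictions and BLAST best hits are passed in as plain dictionaries rather than computed, and the majority-vote fallback for fragments without a BLAST hit is an assumption the abstract does not state.

```python
from collections import Counter


def kmer_profile(seq, k):
    """Normalized frequencies of the overlapping k-mers in seq.

    A profile like this is the kind of feature vector a linear SVM
    could be trained on for one value of k.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def split_by_consensus(predictions):
    """Split fragments into a confident and a diffident set.

    predictions maps a fragment id to the list of labels predicted by
    the classifiers trained with different k (one label per k).
    Assumption: a fragment is confident only if all labels agree.
    """
    confident, diffident = {}, []
    for frag_id, labels in predictions.items():
        if len(set(labels)) == 1:
            confident[frag_id] = labels[0]
        else:
            diffident.append(frag_id)
    return confident, diffident


def reclassify(predictions, diffident, best_hits):
    """Relabel diffident fragments from their best BLAST hit.

    best_hits maps a fragment id to the label of its best hit
    (hypothetical input, e.g. parsed from BLAST tabular output).
    Assumption: fragments without a hit fall back to a majority vote.
    """
    final = {}
    for frag_id in diffident:
        if frag_id in best_hits:
            final[frag_id] = best_hits[frag_id]
        else:
            final[frag_id] = Counter(predictions[frag_id]).most_common(1)[0][0]
    return final


if __name__ == "__main__":
    preds = {"f1": ["A", "A", "A"], "f2": ["A", "B", "A"], "f3": ["B", "C", "B"]}
    confident, diffident = split_by_consensus(preds)
    print(confident, diffident)
    print(reclassify(preds, diffident, {"f2": "B"}))
```

Keeping the split separate from the reclassification step mirrors the two-stage description: the relatively costly BLAST lookup is only needed for the fragments that end up in the diffident set.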


Url:
DOI: 10.1038/s41598-017-09947-y
PubMed: 28842700
PubMed Central: 5573435

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">EnSVMB: Metagenomics Fragments Classification using Ensemble SVM and BLAST</title>
<author>
<name sortKey="Jiang, Yuan" sort="Jiang, Yuan" uniqKey="Jiang Y" first="Yuan" last="Jiang">Yuan Jiang</name>
<affiliation wicri:level="1">
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="GRID">grid.263906.8</institution-id>
<institution></institution>
<institution>College of Computer and Information Science, Southwest University,</institution>
</institution-wrap>
Chongqing, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Chongqing</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Wang, Jun" sort="Wang, Jun" uniqKey="Wang J" first="Jun" last="Wang">Jun Wang</name>
<affiliation wicri:level="1">
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="GRID">grid.263906.8</institution-id>
<institution></institution>
<institution>College of Computer and Information Science, Southwest University,</institution>
</institution-wrap>
Chongqing, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Chongqing</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Xia, Dawen" sort="Xia, Dawen" uniqKey="Xia D" first="Dawen" last="Xia">Dawen Xia</name>
<affiliation wicri:level="1">
<nlm:aff id="Aff2">
<institution-wrap>
<institution-id institution-id-type="GRID">grid.443389.1</institution-id>
<institution></institution>
<institution>College of Data Science and Information Engineering, Guizhou Minzu University,</institution>
</institution-wrap>
Guiyang, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Guiyang</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="Aff3">
<institution-wrap>
<institution-id institution-id-type="GRID">grid.443389.1</institution-id>
<institution></institution>
<institution>College of National Culture and Cognitive Science, Guizhou Minzu University,</institution>
</institution-wrap>
Guiyang, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Guiyang</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Yu, Guoxian" sort="Yu, Guoxian" uniqKey="Yu G" first="Guoxian" last="Yu">Guoxian Yu</name>
<affiliation wicri:level="1">
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="GRID">grid.263906.8</institution-id>
<institution></institution>
<institution>College of Computer and Information Science, Southwest University,</institution>
</institution-wrap>
Chongqing, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Chongqing</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">28842700</idno>
<idno type="pmc">5573435</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5573435</idno>
<idno type="RBID">PMC:5573435</idno>
<idno type="doi">10.1038/s41598-017-09947-y</idno>
<date when="2017">2017</date>
<idno type="wicri:Area/Pmc/Corpus">000445</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000445</idno>
<idno type="wicri:Area/Pmc/Curation">000445</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000445</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000869</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000869</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:28842700</idno>
<idno type="wicri:Area/PubMed/Corpus">000B84</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000B84</idno>
<idno type="wicri:Area/PubMed/Curation">000B84</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000B84</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000D21</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">000D21</idno>
<idno type="wicri:Area/Ncbi/Merge">001B51</idno>
<idno type="wicri:Area/Ncbi/Curation">001B51</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">001B51</idno>
<idno type="wicri:Area/Main/Merge">000E08</idno>
<idno type="wicri:Area/Main/Curation">000E05</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">EnSVMB: Metagenomics Fragments Classification using Ensemble SVM and BLAST</title>
<author>
<name sortKey="Jiang, Yuan" sort="Jiang, Yuan" uniqKey="Jiang Y" first="Yuan" last="Jiang">Yuan Jiang</name>
<affiliation wicri:level="1">
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="GRID">grid.263906.8</institution-id>
<institution></institution>
<institution>College of Computer and Information Science, Southwest University,</institution>
</institution-wrap>
Chongqing, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Chongqing</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Wang, Jun" sort="Wang, Jun" uniqKey="Wang J" first="Jun" last="Wang">Jun Wang</name>
<affiliation wicri:level="1">
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="GRID">grid.263906.8</institution-id>
<institution></institution>
<institution>College of Computer and Information Science, Southwest University,</institution>
</institution-wrap>
Chongqing, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Chongqing</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Xia, Dawen" sort="Xia, Dawen" uniqKey="Xia D" first="Dawen" last="Xia">Dawen Xia</name>
<affiliation wicri:level="1">
<nlm:aff id="Aff2">
<institution-wrap>
<institution-id institution-id-type="GRID">grid.443389.1</institution-id>
<institution></institution>
<institution>College of Data Science and Information Engineering, Guizhou Minzu University,</institution>
</institution-wrap>
Guiyang, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Guiyang</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="Aff3">
<institution-wrap>
<institution-id institution-id-type="GRID">grid.443389.1</institution-id>
<institution></institution>
<institution>College of National Culture and Cognitive Science, Guizhou Minzu University,</institution>
</institution-wrap>
Guiyang, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Guiyang</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Yu, Guoxian" sort="Yu, Guoxian" uniqKey="Yu G" first="Guoxian" last="Yu">Guoxian Yu</name>
<affiliation wicri:level="1">
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="GRID">grid.263906.8</institution-id>
<institution></institution>
<institution>College of Computer and Information Science, Southwest University,</institution>
</institution-wrap>
Chongqing, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Chongqing</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Scientific Reports</title>
<idno type="eISSN">2045-2322</idno>
<imprint>
<date when="2017">2017</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Base Sequence (genetics)</term>
<term>Classification</term>
<term>Humans</term>
<term>Metagenomics</term>
<term>Sensitivity and Specificity</term>
<term>Sequence Analysis, DNA (methods)</term>
<term>Software</term>
<term>Support Vector Machine</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de séquence d'ADN (méthodes)</term>
<term>Classification</term>
<term>Humains</term>
<term>Logiciel</term>
<term>Machine à vecteur de support</term>
<term>Métagénomique</term>
<term>Sensibilité et spécificité</term>
<term>Séquence nucléotidique (génétique)</term>
</keywords>
<keywords scheme="MESH" qualifier="genetics" xml:lang="en">
<term>Base Sequence</term>
</keywords>
<keywords scheme="MESH" qualifier="génétique" xml:lang="fr">
<term>Séquence nucléotidique</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Classification</term>
<term>Humans</term>
<term>Metagenomics</term>
<term>Sensitivity and Specificity</term>
<term>Software</term>
<term>Support Vector Machine</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Classification</term>
<term>Humains</term>
<term>Logiciel</term>
<term>Machine à vecteur de support</term>
<term>Métagénomique</term>
<term>Sensibilité et spécificité</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p id="Par1">Metagenomics brings new discoveries and insights into the uncultured microbial world. One fundamental task in metagenomics analysis is to determine the taxonomy of raw sequence fragments. Modern sequencing technologies produce relatively short fragments and greatly increase the number of fragments, making taxonomic classification considerably more difficult than before. Fast and accurate techniques are therefore needed to classify large-scale fragments. We propose EnSVM (
<italic>En</italic>
semble
<italic>S</italic>
upport
<italic>V</italic>
ector
<italic>M</italic>
achine) and its advanced variant EnSVMB (
<italic>EnSVM</italic>
with
<italic>B</italic>
LAST) to accurately classify fragments. EnSVM divides fragments into a large confident set and a small diffident set, based on whether the fragments get consistent or inconsistent predictions from linear SVMs trained with different
<italic>k</italic>
-mers. An empirical study shows that the sensitivity and specificity of EnSVM on the confident set are higher than 90% and 97%, respectively, but on the diffident set they are lower than 60% and 75%. To further improve performance on the diffident set, EnSVMB uses the best BLAST hits to reclassify fragments in that set. Experimental results show that EnSVM can efficiently and effectively divide fragments into confident and diffident sets, and that EnSVMB achieves higher accuracy and sensitivity and more true positives than related state-of-the-art methods, with specificity comparable to the best of them.</p>
</div>
</front>
</TEI>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000E05 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Curation/biblio.hfd -nk 000E05 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Main
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:5573435
   |texte=   EnSVMB: Metagenomics Fragments Classification using Ensemble SVM and BLAST
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Curation/RBID.i   -Sk "pubmed:28842700" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021