MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A simple shortcut to unsupervised alignment-free phylogenetic genome groupings, even from unassembled sequencing reads.

Internal identifier: 001C11 (PubMed/Checkpoint); previous: 001C10; next: 001C12

Authors: Sebastian Maurer-Stroh [Singapore]; Vithiagaran Gunalan; Wing-Cheong Wong; Frank Eisenhaber

Source: Journal of bioinformatics and computational biology [1757-6334]; 2013; vol. 11, no. 6.

RBID: pubmed:24372034

Abstract

We propose an extension to alignment-free approaches that can produce reasonably accurate phylogenetic groupings starting from unaligned genomes, for example in as little as 1 min on a standard desktop computer for 25 bacterial genomes. A 6-fold speed-up and an 11-fold reduction in memory requirements compared to previous alignment-free methods are achieved by reducing the comparison space to a representative sample of k-mers of optimal length and with specific tag motifs. We applied this approach to the test case of fitting the enterohemorrhagic O104:H4 E. coli strain from the 2011 outbreak in Germany into the phylogenetic network of previously known E. coli-related strains, and extended the method to allow assigning any new strain to the correct phylogenetic group, even directly from unassembled short sequence reads from next-generation sequencing data. Hence, this approach is also useful for quickly identifying the most suitable reference genome for subsequent assembly steps.
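
The idea of restricting the comparison to k-mers that carry a tag motif can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the k-mer length, nor the tag motifs, nor the distance measure, so the values `k=4` and `tag="AC"`, the use of a Jaccard distance, and all function names below are assumptions chosen only for illustration. Reverse complements are ignored for brevity.

```python
from itertools import combinations


def tagged_kmers(seq, k, tag):
    """Return the set of k-mers of `seq` that start with the motif `tag`.

    Keeping only tagged k-mers samples the k-mer space instead of
    enumerating it fully (hypothetical choice of motif position: prefix).
    """
    seq = seq.upper()
    return {seq[i:i + k]
            for i in range(len(seq) - k + 1)
            if seq.startswith(tag, i)}


def jaccard_distance(a, b):
    """Jaccard distance between two k-mer sets (an assumed measure)."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def distance_matrix(genomes, k=4, tag="AC"):
    """Pairwise distances between genomes given as {name: sequence}."""
    profiles = {name: tagged_kmers(s, k, tag) for name, s in genomes.items()}
    return {(x, y): jaccard_distance(profiles[x], profiles[y])
            for x, y in combinations(sorted(profiles), 2)}


def assign_reads(reads, genomes, k=4, tag="AC"):
    """Pool tagged k-mers over unassembled reads and return the name of
    the closest reference genome."""
    pooled = set()
    for read in reads:
        pooled |= tagged_kmers(read, k, tag)
    profiles = {name: tagged_kmers(s, k, tag) for name, s in genomes.items()}
    return min(profiles, key=lambda n: jaccard_distance(pooled, profiles[n]))


# Toy sequences, invented for this example only.
GENOMES = {
    "A": "ACGTTGACCATGACGGTACTTAGACA",
    "B": "ACGTTGACCATGACGGTACTTAGACA",
    "C": "TTACAAACTGGGACTCCCACAGTTT",
}
READS_FROM_A = ["ACGTTGACCA", "TGACGGTACTT"]

if __name__ == "__main__":
    print(distance_matrix(GENOMES))
    print(assign_reads(READS_FROM_A, GENOMES))
```

The resulting pairwise distances could then be passed to any standard clustering or tree-building routine. Assigning reads to a group needs no assembly here because the k-mer profile is pooled directly over the reads.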

DOI: 10.1142/S0219720013430051
PubMed: 24372034


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A simple shortcut to unsupervised alignment-free phylogenetic genome groupings, even from unassembled sequencing reads.</title>
<author>
<name sortKey="Maurer Stroh, Sebastian" sort="Maurer Stroh, Sebastian" uniqKey="Maurer Stroh S" first="Sebastian" last="Maurer-Stroh">Sebastian Maurer-Stroh</name>
<affiliation wicri:level="1">
<nlm:affiliation>Bioinformatics Institute (BII), Agency for Science Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, Singapore , School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore 637551, Singapore.</nlm:affiliation>
<country xml:lang="fr">Singapour</country>
<wicri:regionArea>Bioinformatics Institute (BII), Agency for Science Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, Singapore , School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore 637551</wicri:regionArea>
<wicri:noRegion>Singapore 637551</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Gunalan, Vithiagaran" sort="Gunalan, Vithiagaran" uniqKey="Gunalan V" first="Vithiagaran" last="Gunalan">Vithiagaran Gunalan</name>
</author>
<author>
<name sortKey="Wong, Wing Cheong" sort="Wong, Wing Cheong" uniqKey="Wong W" first="Wing-Cheong" last="Wong">Wing-Cheong Wong</name>
</author>
<author>
<name sortKey="Eisenhaber, Frank" sort="Eisenhaber, Frank" uniqKey="Eisenhaber F" first="Frank" last="Eisenhaber">Frank Eisenhaber</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2013">2013</date>
<idno type="RBID">pubmed:24372034</idno>
<idno type="pmid">24372034</idno>
<idno type="doi">10.1142/S0219720013430051</idno>
<idno type="wicri:Area/PubMed/Corpus">001B01</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001B01</idno>
<idno type="wicri:Area/PubMed/Curation">001B01</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">001B01</idno>
<idno type="wicri:Area/PubMed/Checkpoint">001C11</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">001C11</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">A simple shortcut to unsupervised alignment-free phylogenetic genome groupings, even from unassembled sequencing reads.</title>
<author>
<name sortKey="Maurer Stroh, Sebastian" sort="Maurer Stroh, Sebastian" uniqKey="Maurer Stroh S" first="Sebastian" last="Maurer-Stroh">Sebastian Maurer-Stroh</name>
<affiliation wicri:level="1">
<nlm:affiliation>Bioinformatics Institute (BII), Agency for Science Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, Singapore , School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore 637551, Singapore.</nlm:affiliation>
<country xml:lang="fr">Singapour</country>
<wicri:regionArea>Bioinformatics Institute (BII), Agency for Science Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, Singapore , School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore 637551</wicri:regionArea>
<wicri:noRegion>Singapore 637551</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Gunalan, Vithiagaran" sort="Gunalan, Vithiagaran" uniqKey="Gunalan V" first="Vithiagaran" last="Gunalan">Vithiagaran Gunalan</name>
</author>
<author>
<name sortKey="Wong, Wing Cheong" sort="Wong, Wing Cheong" uniqKey="Wong W" first="Wing-Cheong" last="Wong">Wing-Cheong Wong</name>
</author>
<author>
<name sortKey="Eisenhaber, Frank" sort="Eisenhaber, Frank" uniqKey="Eisenhaber F" first="Frank" last="Eisenhaber">Frank Eisenhaber</name>
</author>
</analytic>
<series>
<title level="j">Journal of bioinformatics and computational biology</title>
<idno type="eISSN">1757-6334</idno>
<imprint>
<date when="2013" type="published">2013</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Enterohemorrhagic Escherichia coli (genetics)</term>
<term>Expressed Sequence Tags</term>
<term>Genome</term>
<term>Genome, Bacterial</term>
<term>Genomics (methods)</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Phylogeny</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Escherichia coli entérohémorrhagique (génétique)</term>
<term>Génome</term>
<term>Génome bactérien</term>
<term>Génomique (méthodes)</term>
<term>Phylogénie</term>
<term>Séquençage nucléotidique à haut débit</term>
<term>Étiquettes de séquences exprimées</term>
</keywords>
<keywords scheme="MESH" qualifier="genetics" xml:lang="en">
<term>Enterohemorrhagic Escherichia coli</term>
</keywords>
<keywords scheme="MESH" qualifier="génétique" xml:lang="fr">
<term>Escherichia coli entérohémorrhagique</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Genomics</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Expressed Sequence Tags</term>
<term>Genome</term>
<term>Genome, Bacterial</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Phylogeny</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Génome</term>
<term>Génome bactérien</term>
<term>Génomique</term>
<term>Phylogénie</term>
<term>Séquençage nucléotidique à haut débit</term>
<term>Étiquettes de séquences exprimées</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">We propose an extension to alignment-free approaches that can produce reasonably accurate phylogenetic groupings starting from unaligned genomes, for example, as fast as 1 min on a standard desktop computer for 25 bacterial genomes. A 6-fold speed-up and 11-fold reduction in memory requirements compared to previous alignment-free methods is achieved by reducing the comparison space to a representative sample of k-mers of optimal length and with specific tag motifs. This approach was applied to the test case of fitting the enterohemorrhagic O104:H4 E.coli strain from the 2011 outbreak in Germany into the phylogenetic network of previously known E.coli-related strains and extend the method to allow assigning any new strain to the correct phylogenetic group even directly from unassembled short sequence reads from next generation sequencing data. Hence, this approach is also useful to quickly identify the most suitable reference genome for subsequent assembly steps. </div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">24372034</PMID>
<DateCompleted>
<Year>2014</Year>
<Month>09</Month>
<Day>30</Day>
</DateCompleted>
<DateRevised>
<Year>2018</Year>
<Month>08</Month>
<Day>10</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1757-6334</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>11</Volume>
<Issue>6</Issue>
<PubDate>
<Year>2013</Year>
<Month>Dec</Month>
</PubDate>
</JournalIssue>
<Title>Journal of bioinformatics and computational biology</Title>
<ISOAbbreviation>J Bioinform Comput Biol</ISOAbbreviation>
</Journal>
<ArticleTitle>A simple shortcut to unsupervised alignment-free phylogenetic genome groupings, even from unassembled sequencing reads.</ArticleTitle>
<Pagination>
<MedlinePgn>1343005</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1142/S0219720013430051</ELocationID>
<Abstract>
<AbstractText>We propose an extension to alignment-free approaches that can produce reasonably accurate phylogenetic groupings starting from unaligned genomes, for example, as fast as 1 min on a standard desktop computer for 25 bacterial genomes. A 6-fold speed-up and 11-fold reduction in memory requirements compared to previous alignment-free methods is achieved by reducing the comparison space to a representative sample of k-mers of optimal length and with specific tag motifs. This approach was applied to the test case of fitting the enterohemorrhagic O104:H4 E.coli strain from the 2011 outbreak in Germany into the phylogenetic network of previously known E.coli-related strains and extend the method to allow assigning any new strain to the correct phylogenetic group even directly from unassembled short sequence reads from next generation sequencing data. Hence, this approach is also useful to quickly identify the most suitable reference genome for subsequent assembly steps. </AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Maurer-Stroh</LastName>
<ForeName>Sebastian</ForeName>
<Initials>S</Initials>
<AffiliationInfo>
<Affiliation>Bioinformatics Institute (BII), Agency for Science Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, Singapore 138671, Singapore , School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore 637551, Singapore.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Gunalan</LastName>
<ForeName>Vithiagaran</ForeName>
<Initials>V</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Wong</LastName>
<ForeName>Wing-Cheong</ForeName>
<Initials>WC</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Eisenhaber</LastName>
<ForeName>Frank</ForeName>
<Initials>F</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2013</Year>
<Month>12</Month>
<Day>02</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>Singapore</Country>
<MedlineTA>J Bioinform Comput Biol</MedlineTA>
<NlmUniqueID>101187344</NlmUniqueID>
<ISSNLinking>0219-7200</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D054324" MajorTopicYN="N">Enterohemorrhagic Escherichia coli</DescriptorName>
<QualifierName UI="Q000235" MajorTopicYN="N">genetics</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D020224" MajorTopicYN="N">Expressed Sequence Tags</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016678" MajorTopicYN="Y">Genome</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016680" MajorTopicYN="N">Genome, Bacterial</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D023281" MajorTopicYN="N">Genomics</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D059014" MajorTopicYN="Y">High-Throughput Nucleotide Sequencing</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D010802" MajorTopicYN="Y">Phylogeny</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="entrez">
<Year>2013</Year>
<Month>12</Month>
<Day>31</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2014</Year>
<Month>1</Month>
<Day>1</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2014</Year>
<Month>10</Month>
<Day>1</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">24372034</ArticleId>
<ArticleId IdType="doi">10.1142/S0219720013430051</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
<affiliations>
<list>
<country>
<li>Singapour</li>
</country>
</list>
<tree>
<noCountry>
<name sortKey="Eisenhaber, Frank" sort="Eisenhaber, Frank" uniqKey="Eisenhaber F" first="Frank" last="Eisenhaber">Frank Eisenhaber</name>
<name sortKey="Gunalan, Vithiagaran" sort="Gunalan, Vithiagaran" uniqKey="Gunalan V" first="Vithiagaran" last="Gunalan">Vithiagaran Gunalan</name>
<name sortKey="Wong, Wing Cheong" sort="Wong, Wing Cheong" uniqKey="Wong W" first="Wing-Cheong" last="Wong">Wing-Cheong Wong</name>
</noCountry>
<country name="Singapour">
<noRegion>
<name sortKey="Maurer Stroh, Sebastian" sort="Maurer Stroh, Sebastian" uniqKey="Maurer Stroh S" first="Sebastian" last="Maurer-Stroh">Sebastian Maurer-Stroh</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
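
As a rough illustration of how the identifiers in the record above could be read programmatically, the following Python sketch parses a fragment copied from the `<PubmedData>` element. It is only a sketch under stated assumptions: the function name `article_ids` is hypothetical, and only a fragment is parsed because the full listing uses the `wicri:` and `nlm:` prefixes without declaring them, so a namespace-aware parser would likely reject the record as shown.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the <PubmedData> element of the record above.
PUBMED_FRAGMENT = """<PubmedData>
<ArticleIdList>
<ArticleId IdType="pubmed">24372034</ArticleId>
<ArticleId IdType="doi">10.1142/S0219720013430051</ArticleId>
</ArticleIdList>
</PubmedData>"""


def article_ids(xml_text):
    """Map each ArticleId's IdType attribute to its text content."""
    root = ET.fromstring(xml_text)
    return {elem.get("IdType"): elem.text for elem in root.iter("ArticleId")}


if __name__ == "__main__":
    for id_type, value in article_ids(PUBMED_FRAGMENT).items():
        print(id_type, value)
```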

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001C11 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd -nk 001C11 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Checkpoint
   |type=    RBID
   |clé=     pubmed:24372034
   |texte=   A simple shortcut to unsupervised alignment-free phylogenetic genome groupings, even from unassembled sequencing reads.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/RBID.i   -Sk "pubmed:24372034" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021