MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

K-mer clustering algorithm using a MapReduce framework: application to the parallelization of the Inchworm module of Trinity.

Internal identifier: 000C50 (PubMed/Checkpoint); previous: 000C49; next: 000C51

Authors: Chang Sik Kim [United Kingdom]; Martyn D. Winn [United Kingdom]; Vipin Sachdeva [United States]; Kirk E. Jordan [United States]

RBID : pubmed:29100493

Abstract

De novo transcriptome assembly is an important technique for understanding gene expression in non-model organisms. Many de novo assemblers using the de Bruijn graph of a set of the RNA sequences rely on in-memory representation of this graph. However, current methods analyse the complete set of read-derived k-mer sequence at once, resulting in the need for computer hardware with large shared memory.
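
As a toy illustration of the terms used in the abstract, the sketch below extracts k-mers from reads and counts them, first in one table and then split across several shards so that no single table holds every k-mer. It is not the clustering algorithm of the paper: the reads, the function names, and the CRC32-based partitioning are all assumptions chosen for the example.

```python
import zlib
from collections import Counter
from itertools import chain


def kmers(read, k):
    """Map step: yield every length-k substring of one read."""
    return (read[i:i + k] for i in range(len(read) - k + 1))


def count_kmers(reads, k):
    """Single-table count: every k-mer of every read in one Counter."""
    return Counter(chain.from_iterable(kmers(r, k) for r in reads))


def partition(kmer, n_workers):
    """Assign a k-mer to a shard (hypothetical rule: CRC32 modulo shard count)."""
    return zlib.crc32(kmer.encode()) % n_workers


def count_partitioned(reads, k, n_workers):
    """Reduce step per shard: each shard only counts the k-mers routed to it."""
    shards = [Counter() for _ in range(n_workers)]
    for read in reads:
        for km in kmers(read, k):
            shards[partition(km, n_workers)][km] += 1
    return shards


reads = ["ACGTAC", "CGTACG"]  # hypothetical toy reads, not real data
counts = count_kmers(reads, 3)
shards = count_partitioned(reads, 3, n_workers=2)
```

Merging the shards reproduces the single-table counts, which is the property that lets the work be spread over several nodes with a smaller table on each.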

DOI: 10.1186/s12859-017-1881-8
PubMed: 29100493

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">K-mer clustering algorithm using a MapReduce framework: application to the parallelization of the Inchworm module of Trinity.</title>
<author>
<name sortKey="Kim, Chang Sik" sort="Kim, Chang Sik" uniqKey="Kim C" first="Chang Sik" last="Kim">Chang Sik Kim</name>
<affiliation wicri:level="1">
<nlm:affiliation>The Hartree Centre and Scientific Computing Department, STFC Daresbury Laboratory, Warrington, WA4 4AD, UK.</nlm:affiliation>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>The Hartree Centre and Scientific Computing Department, STFC Daresbury Laboratory, Warrington, WA4 4AD</wicri:regionArea>
<wicri:noRegion>WA4 4AD</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Winn, Martyn D" sort="Winn, Martyn D" uniqKey="Winn M" first="Martyn D" last="Winn">Martyn D. Winn</name>
<affiliation wicri:level="1">
<nlm:affiliation>The Hartree Centre and Scientific Computing Department, STFC Daresbury Laboratory, Warrington, WA4 4AD, UK. martyn.winn@stfc.ac.uk.</nlm:affiliation>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>The Hartree Centre and Scientific Computing Department, STFC Daresbury Laboratory, Warrington, WA4 4AD</wicri:regionArea>
<wicri:noRegion>WA4 4AD</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Sachdeva, Vipin" sort="Sachdeva, Vipin" uniqKey="Sachdeva V" first="Vipin" last="Sachdeva">Vipin Sachdeva</name>
<affiliation wicri:level="2">
<nlm:affiliation>Computational Science Center, IBM T.J. Watson Research, Cambridge, MA, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Computational Science Center, IBM T.J. Watson Research, Cambridge, MA</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Jordan, Kirk E" sort="Jordan, Kirk E" uniqKey="Jordan K" first="Kirk E" last="Jordan">Kirk E. Jordan</name>
<affiliation wicri:level="2">
<nlm:affiliation>Computational Science Center, IBM T.J. Watson Research, Cambridge, MA, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Computational Science Center, IBM T.J. Watson Research, Cambridge, MA</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2017">2017</date>
<idno type="RBID">pubmed:29100493</idno>
<idno type="pmid">29100493</idno>
<idno type="doi">10.1186/s12859-017-1881-8</idno>
<idno type="wicri:Area/PubMed/Corpus">000B01</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000B01</idno>
<idno type="wicri:Area/PubMed/Curation">000B01</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000B01</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000C50</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">000C50</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">K-mer clustering algorithm using a MapReduce framework: application to the parallelization of the Inchworm module of Trinity.</title>
<author>
<name sortKey="Kim, Chang Sik" sort="Kim, Chang Sik" uniqKey="Kim C" first="Chang Sik" last="Kim">Chang Sik Kim</name>
<affiliation wicri:level="1">
<nlm:affiliation>The Hartree Centre and Scientific Computing Department, STFC Daresbury Laboratory, Warrington, WA4 4AD, UK.</nlm:affiliation>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>The Hartree Centre and Scientific Computing Department, STFC Daresbury Laboratory, Warrington, WA4 4AD</wicri:regionArea>
<wicri:noRegion>WA4 4AD</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Winn, Martyn D" sort="Winn, Martyn D" uniqKey="Winn M" first="Martyn D" last="Winn">Martyn D. Winn</name>
<affiliation wicri:level="1">
<nlm:affiliation>The Hartree Centre and Scientific Computing Department, STFC Daresbury Laboratory, Warrington, WA4 4AD, UK. martyn.winn@stfc.ac.uk.</nlm:affiliation>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>The Hartree Centre and Scientific Computing Department, STFC Daresbury Laboratory, Warrington, WA4 4AD</wicri:regionArea>
<wicri:noRegion>WA4 4AD</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Sachdeva, Vipin" sort="Sachdeva, Vipin" uniqKey="Sachdeva V" first="Vipin" last="Sachdeva">Vipin Sachdeva</name>
<affiliation wicri:level="2">
<nlm:affiliation>Computational Science Center, IBM T.J. Watson Research, Cambridge, MA, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Computational Science Center, IBM T.J. Watson Research, Cambridge, MA</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Jordan, Kirk E" sort="Jordan, Kirk E" uniqKey="Jordan K" first="Kirk E" last="Jordan">Kirk E. Jordan</name>
<affiliation wicri:level="2">
<nlm:affiliation>Computational Science Center, IBM T.J. Watson Research, Cambridge, MA, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Computational Science Center, IBM T.J. Watson Research, Cambridge, MA</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2017" type="published">2017</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Cluster Analysis</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>RNA (chemistry)</term>
<term>RNA (genetics)</term>
<term>Sequence Analysis, RNA</term>
<term>Transcriptome</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>ARN (chimie)</term>
<term>ARN (génétique)</term>
<term>Algorithmes</term>
<term>Analyse de regroupements</term>
<term>Analyse de séquence d'ARN</term>
<term>Séquençage nucléotidique à haut débit</term>
<term>Transcriptome</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="chemistry" xml:lang="en">
<term>RNA</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="genetics" xml:lang="en">
<term>RNA</term>
</keywords>
<keywords scheme="MESH" qualifier="génétique" xml:lang="fr">
<term>ARN</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Cluster Analysis</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Sequence Analysis, RNA</term>
<term>Transcriptome</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>ARN</term>
<term>Algorithmes</term>
<term>Analyse de regroupements</term>
<term>Analyse de séquence d'ARN</term>
<term>Séquençage nucléotidique à haut débit</term>
<term>Transcriptome</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">De novo transcriptome assembly is an important technique for understanding gene expression in non-model organisms. Many de novo assemblers using the de Bruijn graph of a set of the RNA sequences rely on in-memory representation of this graph. However, current methods analyse the complete set of read-derived k-mer sequence at once, resulting in the need for computer hardware with large shared memory.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="MEDLINE" IndexingMethod="Curated" Owner="NLM">
<PMID Version="1">29100493</PMID>
<DateCompleted>
<Year>2018</Year>
<Month>02</Month>
<Day>06</Day>
</DateCompleted>
<DateRevised>
<Year>2018</Year>
<Month>12</Month>
<Day>02</Day>
</DateRevised>
<Article PubModel="Electronic">
<Journal>
<ISSN IssnType="Electronic">1471-2105</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>18</Volume>
<Issue>1</Issue>
<PubDate>
<Year>2017</Year>
<Month>Nov</Month>
<Day>03</Day>
</PubDate>
</JournalIssue>
<Title>BMC bioinformatics</Title>
<ISOAbbreviation>BMC Bioinformatics</ISOAbbreviation>
</Journal>
<ArticleTitle>K-mer clustering algorithm using a MapReduce framework: application to the parallelization of the Inchworm module of Trinity.</ArticleTitle>
<Pagination>
<MedlinePgn>467</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1186/s12859-017-1881-8</ELocationID>
<Abstract>
<AbstractText Label="BACKGROUND" NlmCategory="BACKGROUND">De novo transcriptome assembly is an important technique for understanding gene expression in non-model organisms. Many de novo assemblers using the de Bruijn graph of a set of the RNA sequences rely on in-memory representation of this graph. However, current methods analyse the complete set of read-derived k-mer sequence at once, resulting in the need for computer hardware with large shared memory.</AbstractText>
<AbstractText Label="RESULTS" NlmCategory="RESULTS">We introduce a novel approach that clusters k-mers as the first step. The clusters correspond to small sets of gene products, which can be processed quickly to give candidate transcripts. We implement the clustering step using the MapReduce approach for parallelising the analysis of large datasets, which enables the use of compute clusters. The computational task is distributed across the compute system using the industry-standard MPI protocol, and no specialised hardware is required. Using this approach, we have re-implemented the Inchworm module from the widely used Trinity pipeline, and tested the method in the context of the full Trinity pipeline. Validation tests on a range of real datasets show large reductions in the runtime and per-node memory requirements, when making use of a compute cluster.</AbstractText>
<AbstractText Label="CONCLUSIONS" NlmCategory="CONCLUSIONS">Our study shows that MapReduce-based clustering has great potential for distributing challenging sequencing problems, without loss of accuracy. Although we have focussed on the Trinity package, we propose that such clustering is a useful initial step for other assembly pipelines.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Kim</LastName>
<ForeName>Chang Sik</ForeName>
<Initials>CS</Initials>
<AffiliationInfo>
<Affiliation>The Hartree Centre and Scientific Computing Department, STFC Daresbury Laboratory, Warrington, WA4 4AD, UK.</Affiliation>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>Present address: Cancer Research UK Manchester Institute, The University of Manchester, M20 4BX, Manchester, UK.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Winn</LastName>
<ForeName>Martyn D</ForeName>
<Initials>MD</Initials>
<AffiliationInfo>
<Affiliation>The Hartree Centre and Scientific Computing Department, STFC Daresbury Laboratory, Warrington, WA4 4AD, UK. martyn.winn@stfc.ac.uk.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Sachdeva</LastName>
<ForeName>Vipin</ForeName>
<Initials>V</Initials>
<AffiliationInfo>
<Affiliation>Computational Science Center, IBM T.J. Watson Research, Cambridge, MA, USA.</Affiliation>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>Present address: Silicon Therapeutics, 300 A Street, Boston, MA, USA.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Jordan</LastName>
<ForeName>Kirk E</ForeName>
<Initials>KE</Initials>
<AffiliationInfo>
<Affiliation>Computational Science Center, IBM T.J. Watson Research, Cambridge, MA, USA.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2017</Year>
<Month>11</Month>
<Day>03</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>England</Country>
<MedlineTA>BMC Bioinformatics</MedlineTA>
<NlmUniqueID>100965194</NlmUniqueID>
<ISSNLinking>1471-2105</ISSNLinking>
</MedlineJournalInfo>
<ChemicalList>
<Chemical>
<RegistryNumber>63231-63-0</RegistryNumber>
<NameOfSubstance UI="D012313">RNA</NameOfSubstance>
</Chemical>
</ChemicalList>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D000465" MajorTopicYN="Y">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016000" MajorTopicYN="N">Cluster Analysis</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D059014" MajorTopicYN="N">High-Throughput Nucleotide Sequencing</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D012313" MajorTopicYN="N">RNA</DescriptorName>
<QualifierName UI="Q000737" MajorTopicYN="N">chemistry</QualifierName>
<QualifierName UI="Q000235" MajorTopicYN="N">genetics</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D017423" MajorTopicYN="N">Sequence Analysis, RNA</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D059467" MajorTopicYN="N">Transcriptome</DescriptorName>
</MeshHeading>
</MeshHeadingList>
<KeywordList Owner="NOTNLM">
<Keyword MajorTopicYN="N">De novo sequence assembly</Keyword>
<Keyword MajorTopicYN="N">MapReduce</Keyword>
<Keyword MajorTopicYN="N">RNA-Seq</Keyword>
<Keyword MajorTopicYN="N">Trinity</Keyword>
</KeywordList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="received">
<Year>2017</Year>
<Month>06</Month>
<Day>19</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted">
<Year>2017</Year>
<Month>10</Month>
<Day>26</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2017</Year>
<Month>11</Month>
<Day>5</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2017</Year>
<Month>11</Month>
<Day>5</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2018</Year>
<Month>2</Month>
<Day>7</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>epublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">29100493</ArticleId>
<ArticleId IdType="doi">10.1186/s12859-017-1881-8</ArticleId>
<ArticleId IdType="pii">10.1186/s12859-017-1881-8</ArticleId>
<ArticleId IdType="pmc">PMC5670514</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
<affiliations>
<list>
<country>
<li>Royaume-Uni</li>
<li>États-Unis</li>
</country>
<region>
<li>Massachusetts</li>
</region>
</list>
<tree>
<country name="Royaume-Uni">
<noRegion>
<name sortKey="Kim, Chang Sik" sort="Kim, Chang Sik" uniqKey="Kim C" first="Chang Sik" last="Kim">Chang Sik Kim</name>
</noRegion>
<name sortKey="Winn, Martyn D" sort="Winn, Martyn D" uniqKey="Winn M" first="Martyn D" last="Winn">Martyn D. Winn</name>
</country>
<country name="États-Unis">
<region name="Massachusetts">
<name sortKey="Sachdeva, Vipin" sort="Sachdeva, Vipin" uniqKey="Sachdeva V" first="Vipin" last="Sachdeva">Vipin Sachdeva</name>
</region>
<name sortKey="Jordan, Kirk E" sort="Jordan, Kirk E" uniqKey="Jordan K" first="Kirk E" last="Jordan">Kirk E. Jordan</name>
</country>
</tree>
</affiliations>
</record>
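
The record above can also be read programmatically. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` and a trimmed copy of the `<pubmed>` subtree; the `summarize` helper and the `FRAGMENT` constant are hypothetical names introduced here. The TEI part of the record uses `wicri:` and `nlm:` prefixes with no namespace declarations shown, so a strict XML parser may reject the full record unless those prefixes are declared first; the sketch therefore only handles the PubMed part.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <pubmed> subtree of the record, for illustration only.
FRAGMENT = """
<pubmed>
<MedlineCitation Status="MEDLINE" IndexingMethod="Curated" Owner="NLM">
<PMID Version="1">29100493</PMID>
<Article PubModel="Electronic">
<ArticleTitle>K-mer clustering algorithm using a MapReduce framework: application to the parallelization of the Inchworm module of Trinity.</ArticleTitle>
<ELocationID EIdType="doi" ValidYN="Y">10.1186/s12859-017-1881-8</ELocationID>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y"><LastName>Kim</LastName><ForeName>Chang Sik</ForeName></Author>
<Author ValidYN="Y"><LastName>Winn</LastName><ForeName>Martyn D</ForeName></Author>
</AuthorList>
</Article>
</MedlineCitation>
</pubmed>
"""


def summarize(xml_text):
    """Return PMID, title, DOI and author names from a <pubmed> element."""
    root = ET.fromstring(xml_text.strip())
    article = root.find("MedlineCitation/Article")
    return {
        "pmid": root.findtext("MedlineCitation/PMID"),
        "title": article.findtext("ArticleTitle"),
        # Select the ELocationID whose EIdType attribute is "doi".
        "doi": article.findtext("ELocationID[@EIdType='doi']"),
        "authors": [
            f"{a.findtext('ForeName')} {a.findtext('LastName')}"
            for a in article.iter("Author")
        ],
    }


info = summarize(FRAGMENT)
```

The same element paths apply to the full `<pubmed>` subtree, where `AuthorList` holds all four authors.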

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000C50 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd -nk 000C50 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Checkpoint
   |type=    RBID
   |clé=     pubmed:29100493
   |texte=   K-mer clustering algorithm using a MapReduce framework: application to the parallelization of the Inchworm module of Trinity.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/RBID.i   -Sk "pubmed:29100493" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021