Computational solutions to large-scale data management and analysis
Internal identifier: 000153 (Ncbi/Merge); previous: 000152; next: 000154
Authors: Eric E. Schadt [United States]; Michael D. Linderman [United States]; Jon Sorenson [United States]; Lawrence Lee [United States]; Garry P. Nolan [United States]
Source:
- Nature reviews. Genetics [1471-0056]; 2010.
Abstract
Today we can generate hundreds of gigabases of DNA and RNA sequencing data in a week for less than US$5,000. The astonishing rate of data generation by these low-cost, high-throughput technologies in genomics is being matched by that of other technologies, such as real-time imaging and mass spectrometry-based flow cytometry. Success in the life sciences will depend on our ability to properly interpret the large-scale, high-dimensional data sets that are generated by these technologies, which in turn requires us to adopt advances in informatics. Here we discuss how we can master the different types of computational environments that exist — such as cloud and heterogeneous computing — to successfully tackle our big data problems.
DOI: 10.1038/nrg2857
PubMed: 20717155
PubMed Central: 3124937
Links to previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: 000317
- to stream Pmc, to step Curation: 000317
- to stream Pmc, to step Checkpoint: 000590
Links to Exploration step
PMC:3124937
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Computational solutions to large-scale data management and analysis</title>
<author><name sortKey="Schadt, Eric E" sort="Schadt, Eric E" uniqKey="Schadt E" first="Eric E." last="Schadt">Eric E. Schadt</name>
<affiliation wicri:level="1"><nlm:aff id="A1">Pacific Biosciences, 1505 Adams Drive, Menlo Park, California 94025, USA.</nlm:aff>
<country xml:lang="fr" wicri:curation="lc">États-Unis</country>
<wicri:regionArea>Pacific Biosciences, 1505 Adams Drive, Menlo Park, California 94025</wicri:regionArea>
<wicri:noRegion>California 94025</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Linderman, Michael D" sort="Linderman, Michael D" uniqKey="Linderman M" first="Michael D." last="Linderman">Michael D. Linderman</name>
<affiliation wicri:level="1"><nlm:aff id="A2">Computer Systems Laboratory, 420 Via Palou Mall, Stanford, California 94305-4070, USA.</nlm:aff>
<country xml:lang="fr" wicri:curation="lc">États-Unis</country>
<wicri:regionArea>Computer Systems Laboratory, 420 Via Palou Mall, Stanford, California 94305-4070</wicri:regionArea>
<wicri:noRegion>California 94305-4070</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Sorenson, Jon" sort="Sorenson, Jon" uniqKey="Sorenson J" first="Jon" last="Sorenson">Jon Sorenson</name>
<affiliation wicri:level="1"><nlm:aff id="A1">Pacific Biosciences, 1505 Adams Drive, Menlo Park, California 94025, USA.</nlm:aff>
<country xml:lang="fr" wicri:curation="lc">États-Unis</country>
<wicri:regionArea>Pacific Biosciences, 1505 Adams Drive, Menlo Park, California 94025</wicri:regionArea>
<wicri:noRegion>California 94025</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Lee, Lawrence" sort="Lee, Lawrence" uniqKey="Lee L" first="Lawrence" last="Lee">Lawrence Lee</name>
<affiliation wicri:level="1"><nlm:aff id="A1">Pacific Biosciences, 1505 Adams Drive, Menlo Park, California 94025, USA.</nlm:aff>
<country xml:lang="fr" wicri:curation="lc">États-Unis</country>
<wicri:regionArea>Pacific Biosciences, 1505 Adams Drive, Menlo Park, California 94025</wicri:regionArea>
<wicri:noRegion>California 94025</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Nolan, Garry P" sort="Nolan, Garry P" uniqKey="Nolan G" first="Garry P." last="Nolan">Garry P. Nolan</name>
<affiliation wicri:level="2"><nlm:aff id="A3">Department of Microbiology and Immunology, Stanford University, 300 Pasteur Drive, Stanford, California 94305-5124 USA.</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName><region type="state">Californie</region>
</placeName>
<wicri:cityArea>Department of Microbiology and Immunology, Stanford University, 300 Pasteur Drive, Stanford</wicri:cityArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">20717155</idno>
<idno type="pmc">3124937</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3124937</idno>
<idno type="RBID">PMC:3124937</idno>
<idno type="doi">10.1038/nrg2857</idno>
<date when="2010">2010</date>
<idno type="wicri:Area/Pmc/Corpus">000317</idno>
<idno type="wicri:Area/Pmc/Curation">000317</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000590</idno>
<idno type="wicri:Area/Ncbi/Merge">000153</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Computational solutions to large-scale data management and analysis</title>
<author><name sortKey="Schadt, Eric E" sort="Schadt, Eric E" uniqKey="Schadt E" first="Eric E." last="Schadt">Eric E. Schadt</name>
<affiliation wicri:level="1"><nlm:aff id="A1">Pacific Biosciences, 1505 Adams Drive, Menlo Park, California 94025, USA.</nlm:aff>
<country xml:lang="fr" wicri:curation="lc">États-Unis</country>
<wicri:regionArea>Pacific Biosciences, 1505 Adams Drive, Menlo Park, California 94025</wicri:regionArea>
<wicri:noRegion>California 94025</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Linderman, Michael D" sort="Linderman, Michael D" uniqKey="Linderman M" first="Michael D." last="Linderman">Michael D. Linderman</name>
<affiliation wicri:level="1"><nlm:aff id="A2">Computer Systems Laboratory, 420 Via Palou Mall, Stanford, California 94305-4070, USA.</nlm:aff>
<country xml:lang="fr" wicri:curation="lc">États-Unis</country>
<wicri:regionArea>Computer Systems Laboratory, 420 Via Palou Mall, Stanford, California 94305-4070</wicri:regionArea>
<wicri:noRegion>California 94305-4070</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Sorenson, Jon" sort="Sorenson, Jon" uniqKey="Sorenson J" first="Jon" last="Sorenson">Jon Sorenson</name>
<affiliation wicri:level="1"><nlm:aff id="A1">Pacific Biosciences, 1505 Adams Drive, Menlo Park, California 94025, USA.</nlm:aff>
<country xml:lang="fr" wicri:curation="lc">États-Unis</country>
<wicri:regionArea>Pacific Biosciences, 1505 Adams Drive, Menlo Park, California 94025</wicri:regionArea>
<wicri:noRegion>California 94025</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Lee, Lawrence" sort="Lee, Lawrence" uniqKey="Lee L" first="Lawrence" last="Lee">Lawrence Lee</name>
<affiliation wicri:level="1"><nlm:aff id="A1">Pacific Biosciences, 1505 Adams Drive, Menlo Park, California 94025, USA.</nlm:aff>
<country xml:lang="fr" wicri:curation="lc">États-Unis</country>
<wicri:regionArea>Pacific Biosciences, 1505 Adams Drive, Menlo Park, California 94025</wicri:regionArea>
<wicri:noRegion>California 94025</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Nolan, Garry P" sort="Nolan, Garry P" uniqKey="Nolan G" first="Garry P." last="Nolan">Garry P. Nolan</name>
<affiliation wicri:level="2"><nlm:aff id="A3">Department of Microbiology and Immunology, Stanford University, 300 Pasteur Drive, Stanford, California 94305-5124 USA.</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<placeName><region type="state">Californie</region>
</placeName>
<wicri:cityArea>Department of Microbiology and Immunology, Stanford University, 300 Pasteur Drive, Stanford</wicri:cityArea>
</affiliation>
</author>
</analytic>
<series><title level="j">Nature reviews. Genetics</title>
<idno type="ISSN">1471-0056</idno>
<idno type="eISSN">1471-0064</idno>
<imprint><date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><p id="P1">Today we can generate hundreds of gigabases of DNA and RNA sequencing data in a week for less than US$5,000. The astonishing rate of data generation by these low-cost, high-throughput technologies in genomics is being matched by that of other technologies, such as real-time imaging and mass spectrometry-based flow cytometry. Success in the life sciences will depend on our ability to properly interpret the large-scale, high-dimensional data sets that are generated by these technologies, which in turn requires us to adopt advances in informatics. Here we discuss how we can master the different types of computational environments that exist — such as cloud and heterogeneous computing — to successfully tackle our big data problems.</p>
</div>
</front>
</TEI>
<pmc article-type="research-article" xml:lang="en"><pmc-comment>The publisher of this article does not allow downloading of the full text in XML form.</pmc-comment>
<pmc-dir>properties manuscript</pmc-dir>
<front><journal-meta><journal-id journal-id-type="nlm-journal-id">100962779</journal-id>
<journal-id journal-id-type="pubmed-jr-id">22269</journal-id>
<journal-id journal-id-type="nlm-ta">Nat Rev Genet</journal-id>
<journal-title-group><journal-title>Nature reviews. Genetics</journal-title>
</journal-title-group>
<issn pub-type="ppub">1471-0056</issn>
<issn pub-type="epub">1471-0064</issn>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">20717155</article-id>
<article-id pub-id-type="pmc">3124937</article-id>
<article-id pub-id-type="doi">10.1038/nrg2857</article-id>
<article-id pub-id-type="manuscript">NIHMS304947</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Article</subject>
</subj-group>
</article-categories>
<title-group><article-title>Computational solutions to large-scale data management and analysis</article-title>
</title-group>
<contrib-group><contrib contrib-type="author"><name><surname>Schadt</surname>
<given-names>Eric E.</given-names>
</name>
<xref ref-type="aff" rid="A1">*</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Linderman</surname>
<given-names>Michael D.</given-names>
</name>
<xref ref-type="aff" rid="A2">‡</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Sorenson</surname>
<given-names>Jon</given-names>
</name>
<xref ref-type="aff" rid="A1">*</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Lee</surname>
<given-names>Lawrence</given-names>
</name>
<xref ref-type="aff" rid="A1">*</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Nolan</surname>
<given-names>Garry P.</given-names>
</name>
<xref ref-type="aff" rid="A3">§</xref>
</contrib>
</contrib-group>
<aff id="A1"><label>*</label>
Pacific Biosciences, 1505 Adams Drive, Menlo Park, California 94025, USA.</aff>
<aff id="A2"><label>‡</label>
Computer Systems Laboratory, 420 Via Palou Mall, Stanford, California 94305-4070, USA.</aff>
<aff id="A3"><label>§</label>
Department of Microbiology and Immunology, Stanford University, 300 Pasteur Drive, Stanford, California 94305-5124 USA.</aff>
<author-notes><corresp id="cor1">Correspondence to E.E.S. <email>eschadt@pacificbiosciences.com</email>
</corresp>
</author-notes>
<pub-date pub-type="nihms-submitted"><day>20</day>
<month>6</month>
<year>2011</year>
</pub-date>
<pub-date pub-type="ppub"><month>9</month>
<year>2010</year>
</pub-date>
<pub-date pub-type="pmc-release"><day>28</day>
<month>6</month>
<year>2011</year>
</pub-date>
<volume>11</volume>
<issue>9</issue>
<fpage>647</fpage>
<lpage>657</lpage>
<permissions><copyright-statement>© 2010 Macmillan Publishers Limited. All rights reserved</copyright-statement>
<copyright-year>2010</copyright-year>
</permissions>
<abstract><p id="P1">Today we can generate hundreds of gigabases of DNA and RNA sequencing data in a week for less than US$5,000. The astonishing rate of data generation by these low-cost, high-throughput technologies in genomics is being matched by that of other technologies, such as real-time imaging and mass spectrometry-based flow cytometry. Success in the life sciences will depend on our ability to properly interpret the large-scale, high-dimensional data sets that are generated by these technologies, which in turn requires us to adopt advances in informatics. Here we discuss how we can master the different types of computational environments that exist — such as cloud and heterogeneous computing — to successfully tackle our big data problems.</p>
</abstract>
<funding-group><award-group><funding-source country="United States">National Cancer Institute : NCI</funding-source>
<award-id>R01 CA130826-04 || CA</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
</pmc>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Californie</li>
</region>
</list>
<tree><country name="États-Unis"><noRegion><name sortKey="Schadt, Eric E" sort="Schadt, Eric E" uniqKey="Schadt E" first="Eric E." last="Schadt">Eric E. Schadt</name>
</noRegion>
<name sortKey="Lee, Lawrence" sort="Lee, Lawrence" uniqKey="Lee L" first="Lawrence" last="Lee">Lawrence Lee</name>
<name sortKey="Linderman, Michael D" sort="Linderman, Michael D" uniqKey="Linderman M" first="Michael D." last="Linderman">Michael D. Linderman</name>
<name sortKey="Nolan, Garry P" sort="Nolan, Garry P" uniqKey="Nolan G" first="Garry P." last="Nolan">Garry P. Nolan</name>
<name sortKey="Sorenson, Jon" sort="Sorenson, Jon" uniqKey="Sorenson J" first="Jon" last="Sorenson">Jon Sorenson</name>
</country>
</tree>
</affiliations>
</record>
To work with this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Ncbi/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000153 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Ncbi/Merge/biblio.hfd -nk 000153 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= CyberinfraV1 |flux= Ncbi |étape= Merge |type= RBID |clé= PMC:3124937 |texte= Computational solutions to large-scale data management and analysis }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Ncbi/Merge/RBID.i -Sk "pubmed:20717155" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Ncbi/Merge/biblio.hfd \
       | NlmPubMed2Wicri -a CyberinfraV1
This area was generated with Dilib version V0.6.25.