Exploration server on relations between France and Australia

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

CSAM: Compressed SAM format.

Internal identifier: 001545 (PubMed/Corpus); previous: 001544; next: 001546

Authors: Rodrigo Cánovas; Alistair Moffat; Andrew Turpin

Source: Bioinformatics (Oxford, England) [eISSN 1367-4811]; 2016; 32(24): 3709-3716

RBID : pubmed:27540265

English descriptors

Data Compression (methods); Genome; Genomics; High-Throughput Nucleotide Sequencing; Software

Abstract

Next generation sequencing machines produce vast amounts of genomic data. For the data to be useful, it is essential that it can be stored and manipulated efficiently. This work responds to the combined challenge of compressing genomic data, while providing fast access to regions of interest, without necessitating decompression of whole files.

DOI: 10.1093/bioinformatics/btw543
PubMed: 27540265

Links to Exploration step

pubmed:27540265

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">CSAM: Compressed SAM format.</title>
<author>
<name sortKey="Canovas, Rodrigo" sort="Canovas, Rodrigo" uniqKey="Canovas R" first="Rodrigo" last="Cánovas">Rodrigo Cánovas</name>
<affiliation>
<nlm:affiliation>L.I.R.M.M. and Institut Biologie Computationnelle, Université de Montpellier, Montpellier Cedex 5, CNRS F-34392, France.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Moffat, Alistair" sort="Moffat, Alistair" uniqKey="Moffat A" first="Alistair" last="Moffat">Alistair Moffat</name>
<affiliation>
<nlm:affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Turpin, Andrew" sort="Turpin, Andrew" uniqKey="Turpin A" first="Andrew" last="Turpin">Andrew Turpin</name>
<affiliation>
<nlm:affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</nlm:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2016">2016</date>
<idno type="RBID">pubmed:27540265</idno>
<idno type="pmid">27540265</idno>
<idno type="doi">10.1093/bioinformatics/btw543</idno>
<idno type="wicri:Area/PubMed/Corpus">001545</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001545</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">CSAM: Compressed SAM format.</title>
<author>
<name sortKey="Canovas, Rodrigo" sort="Canovas, Rodrigo" uniqKey="Canovas R" first="Rodrigo" last="Cánovas">Rodrigo Cánovas</name>
<affiliation>
<nlm:affiliation>L.I.R.M.M. and Institut Biologie Computationnelle, Université de Montpellier, Montpellier Cedex 5, CNRS F-34392, France.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Moffat, Alistair" sort="Moffat, Alistair" uniqKey="Moffat A" first="Alistair" last="Moffat">Alistair Moffat</name>
<affiliation>
<nlm:affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Turpin, Andrew" sort="Turpin, Andrew" uniqKey="Turpin A" first="Andrew" last="Turpin">Andrew Turpin</name>
<affiliation>
<nlm:affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</nlm:affiliation>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Bioinformatics (Oxford, England)</title>
<idno type="eISSN">1367-4811</idno>
<imprint>
<date when="2016" type="published">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Data Compression (methods)</term>
<term>Genome</term>
<term>Genomics</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Data Compression</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Genome</term>
<term>Genomics</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Software</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Next generation sequencing machines produce vast amounts of genomic data. For the data to be useful, it is essential that it can be stored and manipulated efficiently. This work responds to the combined challenge of compressing genomic data, while providing fast access to regions of interest, without necessitating decompression of whole files.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">27540265</PMID>
<DateCreated>
<Year>2016</Year>
<Month>09</Month>
<Day>08</Day>
</DateCreated>
<DateCompleted>
<Year>2017</Year>
<Month>08</Month>
<Day>16</Day>
</DateCompleted>
<DateRevised>
<Year>2017</Year>
<Month>08</Month>
<Day>16</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1367-4811</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>32</Volume>
<Issue>24</Issue>
<PubDate>
<Year>2016</Year>
<Month>Dec</Month>
<Day>15</Day>
</PubDate>
</JournalIssue>
<Title>Bioinformatics (Oxford, England)</Title>
<ISOAbbreviation>Bioinformatics</ISOAbbreviation>
</Journal>
<ArticleTitle>CSAM: Compressed SAM format.</ArticleTitle>
<Pagination>
<MedlinePgn>3709-3716</MedlinePgn>
</Pagination>
<Abstract>
<AbstractText Label="MOTIVATION" NlmCategory="BACKGROUND">Next generation sequencing machines produce vast amounts of genomic data. For the data to be useful, it is essential that it can be stored and manipulated efficiently. This work responds to the combined challenge of compressing genomic data, while providing fast access to regions of interest, without necessitating decompression of whole files.</AbstractText>
<AbstractText Label="RESULTS" NlmCategory="RESULTS">We describe CSAM (Compressed SAM format), a compression approach offering lossless and lossy compression for SAM files. The structures and techniques proposed are suitable for representing SAM files, as well as supporting fast access to the compressed information. They generate more compact lossless representations than BAM, which is currently the preferred lossless compressed SAM-equivalent format; and are self-contained, that is, they do not depend on any external resources to compress or decompress SAM files.</AbstractText>
<AbstractText Label="AVAILABILITY AND IMPLEMENTATION" NlmCategory="METHODS">An implementation is available at https://github.com/rcanovas/libCSAM. Contact: canovas-ba@lirmm.fr. Supplementary information: supplementary data is available at Bioinformatics online.</AbstractText>
<CopyrightInformation>© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Cánovas</LastName>
<ForeName>Rodrigo</ForeName>
<Initials>R</Initials>
<AffiliationInfo>
<Affiliation>L.I.R.M.M. and Institut Biologie Computationnelle, Université de Montpellier, Montpellier Cedex 5, CNRS F-34392, France.</Affiliation>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Moffat</LastName>
<ForeName>Alistair</ForeName>
<Initials>A</Initials>
<AffiliationInfo>
<Affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Turpin</LastName>
<ForeName>Andrew</ForeName>
<Initials>A</Initials>
<AffiliationInfo>
<Affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2016</Year>
<Month>08</Month>
<Day>18</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>England</Country>
<MedlineTA>Bioinformatics</MedlineTA>
<NlmUniqueID>9808944</NlmUniqueID>
<ISSNLinking>1367-4803</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D044962" MajorTopicYN="N">Data Compression</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016678" MajorTopicYN="N">Genome</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D023281" MajorTopicYN="Y">Genomics</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D059014" MajorTopicYN="Y">High-Throughput Nucleotide Sequencing</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D012984" MajorTopicYN="Y">Software</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="received">
<Year>2016</Year>
<Month>05</Month>
<Day>18</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="revised">
<Year>2016</Year>
<Month>08</Month>
<Day>13</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted">
<Year>2016</Year>
<Month>08</Month>
<Day>15</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2016</Year>
<Month>8</Month>
<Day>20</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2017</Year>
<Month>8</Month>
<Day>17</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2016</Year>
<Month>8</Month>
<Day>20</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">27540265</ArticleId>
<ArticleId IdType="pii">btw543</ArticleId>
<ArticleId IdType="doi">10.1093/bioinformatics/btw543</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
</record>
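
As a rough illustration of how the `<pubmed>` part of the record above could be read programmatically, the following Python sketch extracts the title, authors, PMID and DOI using only the standard library. The embedded sample is a trimmed copy of the record, and the function name `summarize_citation` is hypothetical. The sketch parses only the `<pubmed>` element on purpose: the `<TEI>` part uses `nlm:` and `wicri:` prefixes that are not declared in the record as shown, so a strict XML parser would likely reject it unless those namespaces were declared first.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <pubmed> element from the record above (illustrative sample).
SAMPLE = """<pubmed>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">27540265</PMID>
<Article PubModel="Print-Electronic">
<ArticleTitle>CSAM: Compressed SAM format.</ArticleTitle>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y"><LastName>Cánovas</LastName><ForeName>Rodrigo</ForeName></Author>
<Author ValidYN="Y"><LastName>Moffat</LastName><ForeName>Alistair</ForeName></Author>
<Author ValidYN="Y"><LastName>Turpin</LastName><ForeName>Andrew</ForeName></Author>
</AuthorList>
</Article>
</MedlineCitation>
<PubmedData>
<ArticleIdList>
<ArticleId IdType="pubmed">27540265</ArticleId>
<ArticleId IdType="doi">10.1093/bioinformatics/btw543</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>"""


def summarize_citation(xml_text):
    """Return title, authors, PMID and DOI from a <pubmed> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    authors = [
        f"{a.findtext('ForeName')} {a.findtext('LastName')}"
        for a in root.iter("Author")
    ]
    # ArticleId elements are distinguished by their IdType attribute.
    ids = {e.get("IdType"): e.text for e in root.iter("ArticleId")}
    return {
        "title": root.findtext(".//ArticleTitle"),
        "authors": authors,
        "pmid": root.findtext(".//PMID"),
        "doi": ids.get("doi"),
    }


if __name__ == "__main__":
    print(summarize_citation(SAMPLE))
```

Keying the identifiers by their `IdType` attribute avoids relying on the order of the `<ArticleId>` elements, which the record does not guarantee.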

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Asie/explor/AustralieFrV1/Data/PubMed/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001545 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd -nk 001545 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Asie
   |area=    AustralieFrV1
   |flux=    PubMed
   |étape=   Corpus
   |type=    RBID
   |clé=     pubmed:27540265
   |texte=   CSAM: Compressed SAM format.
}}
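
If many such links need to be produced, the template call above could be assembled from its fields. The following Python sketch is only an illustration: the helper name `explor_link` and its argument names are hypothetical, while the template name and its parameter names (`wiki`, `area`, `flux`, `étape`, `type`, `clé`, `texte`) are copied from the example above.

```python
def explor_link(wiki, area, flux, etape, type_, cle, texte):
    """Render an {{Explor lien}} call (hypothetical helper; parameter names mirror the template)."""
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", etape),
        ("type", type_),
        ("clé", cle),
        ("texte", texte),
    ]
    lines = ["{{Explor lien"]
    for name, value in fields:
        # Pad "name=" to a fixed width so the values line up, as in the page's own example.
        lines.append(f"   |{(name + '=').ljust(9)}{value}")
    lines.append("}}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(explor_link(
        wiki="Wicri/Asie",
        area="AustralieFrV1",
        flux="PubMed",
        etape="Corpus",
        type_="RBID",
        cle="pubmed:27540265",
        texte="CSAM: Compressed SAM format.",
    ))
```

The sketch does not escape special characters, so a value containing a `|` would need extra handling before being used in a wiki template.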

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/RBID.i   -Sk "pubmed:27540265" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a AustralieFrV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Tue Dec 5 10:43:12 2017. Site generation: Tue Mar 5 14:07:20 2024