CSAM: Compressed SAM format.
Internal identifier: 001545 (PubMed/Corpus); previous: 001544; next: 001546
Authors: Rodrigo Cánovas; Alistair Moffat; Andrew Turpin
Source:
- Bioinformatics (Oxford, England) [1367-4811]; 2016.
English descriptors
- KwdEn :
- MESH :
- methods : Data Compression.
- Genome, Genomics, High-Throughput Nucleotide Sequencing, Software.
Abstract
Next generation sequencing machines produce vast amounts of genomic data. For the data to be useful, it is essential that it can be stored and manipulated efficiently. This work responds to the combined challenge of compressing genomic data, while providing fast access to regions of interest, without necessitating decompression of whole files.
DOI: 10.1093/bioinformatics/btw543
PubMed: 27540265
Links to Exploration step
pubmed:27540265
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">CSAM: Compressed SAM format.</title>
<author><name sortKey="Canovas, Rodrigo" sort="Canovas, Rodrigo" uniqKey="Canovas R" first="Rodrigo" last="Cánovas">Rodrigo Cánovas</name>
<affiliation><nlm:affiliation>L.I.R.M.M. and Institut Biologie Computationnelle, Université de Montpellier, Montpellier Cedex 5, CNRS F-34392, France.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Moffat, Alistair" sort="Moffat, Alistair" uniqKey="Moffat A" first="Alistair" last="Moffat">Alistair Moffat</name>
<affiliation><nlm:affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Turpin, Andrew" sort="Turpin, Andrew" uniqKey="Turpin A" first="Andrew" last="Turpin">Andrew Turpin</name>
<affiliation><nlm:affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</nlm:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2016">2016</date>
<idno type="RBID">pubmed:27540265</idno>
<idno type="pmid">27540265</idno>
<idno type="doi">10.1093/bioinformatics/btw543</idno>
<idno type="wicri:Area/PubMed/Corpus">001545</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001545</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">CSAM: Compressed SAM format.</title>
<author><name sortKey="Canovas, Rodrigo" sort="Canovas, Rodrigo" uniqKey="Canovas R" first="Rodrigo" last="Cánovas">Rodrigo Cánovas</name>
<affiliation><nlm:affiliation>L.I.R.M.M. and Institut Biologie Computationnelle, Université de Montpellier, Montpellier Cedex 5, CNRS F-34392, France.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Moffat, Alistair" sort="Moffat, Alistair" uniqKey="Moffat A" first="Alistair" last="Moffat">Alistair Moffat</name>
<affiliation><nlm:affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Turpin, Andrew" sort="Turpin, Andrew" uniqKey="Turpin A" first="Andrew" last="Turpin">Andrew Turpin</name>
<affiliation><nlm:affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</nlm:affiliation>
</affiliation>
</author>
</analytic>
<series><title level="j">Bioinformatics (Oxford, England)</title>
<idno type="eISSN">1367-4811</idno>
<imprint><date when="2016" type="published">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Data Compression (methods)</term>
<term>Genome</term>
<term>Genomics</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en"><term>Data Compression</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Genome</term>
<term>Genomics</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Software</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Next generation sequencing machines produce vast amounts of genomic data. For the data to be useful, it is essential that it can be stored and manipulated efficiently. This work responds to the combined challenge of compressing genomic data, while providing fast access to regions of interest, without necessitating decompression of whole files.</div>
</front>
</TEI>
<pubmed><MedlineCitation Status="MEDLINE" Owner="NLM"><PMID Version="1">27540265</PMID>
<DateCreated><Year>2016</Year>
<Month>09</Month>
<Day>08</Day>
</DateCreated>
<DateCompleted><Year>2017</Year>
<Month>08</Month>
<Day>16</Day>
</DateCompleted>
<DateRevised><Year>2017</Year>
<Month>08</Month>
<Day>16</Day>
</DateRevised>
<Article PubModel="Print-Electronic"><Journal><ISSN IssnType="Electronic">1367-4811</ISSN>
<JournalIssue CitedMedium="Internet"><Volume>32</Volume>
<Issue>24</Issue>
<PubDate><Year>2016</Year>
<Month>Dec</Month>
<Day>15</Day>
</PubDate>
</JournalIssue>
<Title>Bioinformatics (Oxford, England)</Title>
<ISOAbbreviation>Bioinformatics</ISOAbbreviation>
</Journal>
<ArticleTitle>CSAM: Compressed SAM format.</ArticleTitle>
<Pagination><MedlinePgn>3709-3716</MedlinePgn>
</Pagination>
<Abstract><AbstractText Label="MOTIVATION" NlmCategory="BACKGROUND">Next generation sequencing machines produce vast amounts of genomic data. For the data to be useful, it is essential that it can be stored and manipulated efficiently. This work responds to the combined challenge of compressing genomic data, while providing fast access to regions of interest, without necessitating decompression of whole files.</AbstractText>
<AbstractText Label="RESULTS" NlmCategory="RESULTS">We describe CSAM (Compressed SAM format), a compression approach offering lossless and lossy compression for SAM files. The structures and techniques proposed are suitable for representing SAM files, as well as supporting fast access to the compressed information. They generate more compact lossless representations than BAM, which is currently the preferred lossless compressed SAM-equivalent format; and are self-contained, that is, they do not depend on any external resources to compress or decompress SAM files.</AbstractText>
<AbstractText Label="AVAILABILITY AND IMPLEMENTATION" NlmCategory="METHODS">An implementation is available at https://github.com/rcanovas/libCSAM. Contact: canovas-ba@lirmm.fr. Supplementary information: Supplementary data is available at Bioinformatics online.</AbstractText>
<CopyrightInformation>© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y"><Author ValidYN="Y"><LastName>Cánovas</LastName>
<ForeName>Rodrigo</ForeName>
<Initials>R</Initials>
<AffiliationInfo><Affiliation>L.I.R.M.M. and Institut Biologie Computationnelle, Université de Montpellier, Montpellier Cedex 5, CNRS F-34392, France.</Affiliation>
</AffiliationInfo>
<AffiliationInfo><Affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Moffat</LastName>
<ForeName>Alistair</ForeName>
<Initials>A</Initials>
<AffiliationInfo><Affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Turpin</LastName>
<ForeName>Andrew</ForeName>
<Initials>A</Initials>
<AffiliationInfo><Affiliation>Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList><PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic"><Year>2016</Year>
<Month>08</Month>
<Day>18</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo><Country>England</Country>
<MedlineTA>Bioinformatics</MedlineTA>
<NlmUniqueID>9808944</NlmUniqueID>
<ISSNLinking>1367-4803</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList><MeshHeading><DescriptorName UI="D044962" MajorTopicYN="N">Data Compression</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D016678" MajorTopicYN="N">Genome</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D023281" MajorTopicYN="Y">Genomics</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D059014" MajorTopicYN="Y">High-Throughput Nucleotide Sequencing</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D012984" MajorTopicYN="Y">Software</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData><History><PubMedPubDate PubStatus="received"><Year>2016</Year>
<Month>05</Month>
<Day>18</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="revised"><Year>2016</Year>
<Month>08</Month>
<Day>13</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted"><Year>2016</Year>
<Month>08</Month>
<Day>15</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed"><Year>2016</Year>
<Month>8</Month>
<Day>20</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline"><Year>2017</Year>
<Month>8</Month>
<Day>17</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez"><Year>2016</Year>
<Month>8</Month>
<Day>20</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList><ArticleId IdType="pubmed">27540265</ArticleId>
<ArticleId IdType="pii">btw543</ArticleId>
<ArticleId IdType="doi">10.1093/bioinformatics/btw543</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Asie/explor/AustralieFrV1/Data/PubMed/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001545 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd -nk 001545 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Wicri/Asie |area= AustralieFrV1 |flux= PubMed |étape= Corpus |type= RBID |clé= pubmed:27540265 |texte= CSAM: Compressed SAM format. }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/RBID.i -Sk "pubmed:27540265" \
  | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd \
  | NlmPubMed2Wicri -a AustralieFrV1
This area was generated with Dilib version V0.6.33.