Zseq: An Approach for Preprocessing Next-Generation Sequencing Data
Internal identifier: 000B14 (Main/Exploration); previous: 000B13; next: 000B15
Authors: Abedalrhman Alkhateeb; Luis Rueda
Source:
- Journal of Computational Biology [ 1066-5277 ] ; 2017.
French descriptors
- KwdFr :
- MESH :
English descriptors
- KwdEn :
- MESH :
Abstract
Url:
DOI: 10.1089/cmb.2017.0021
PubMed: 28414515
PubMed Central: 5563921
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: 000D99
- to stream Pmc, to step Curation: 000D99
- to stream Pmc, to step Checkpoint: 000668
- to stream PubMed, to step Corpus: 000D17
- to stream PubMed, to step Curation: 000D17
- to stream PubMed, to step Checkpoint: 000A56
- to stream Ncbi, to step Merge: 001A15
- to stream Ncbi, to step Curation: 001A15
- to stream Ncbi, to step Checkpoint: 001A15
- to stream Main, to step Merge: 000B17
- to stream Main, to step Curation: 000B14
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Zseq: An Approach for Preprocessing Next-Generation Sequencing Data</title>
<author><name sortKey="Alkhateeb, Abedalrhman" sort="Alkhateeb, Abedalrhman" uniqKey="Alkhateeb A" first="Abedalrhman" last="Alkhateeb">Abedalrhman Alkhateeb</name>
</author>
<author><name sortKey="Rueda, Luis" sort="Rueda, Luis" uniqKey="Rueda L" first="Luis" last="Rueda">Luis Rueda</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">28414515</idno>
<idno type="pmc">5563921</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5563921</idno>
<idno type="RBID">PMC:5563921</idno>
<idno type="doi">10.1089/cmb.2017.0021</idno>
<date when="2017">2017</date>
<idno type="wicri:Area/Pmc/Corpus">000D99</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000D99</idno>
<idno type="wicri:Area/Pmc/Curation">000D99</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000D99</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000668</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000668</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:28414515</idno>
<idno type="wicri:Area/PubMed/Corpus">000D17</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000D17</idno>
<idno type="wicri:Area/PubMed/Curation">000D17</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000D17</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000A56</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">000A56</idno>
<idno type="wicri:Area/Ncbi/Merge">001A15</idno>
<idno type="wicri:Area/Ncbi/Curation">001A15</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">001A15</idno>
<idno type="wicri:doubleKey">1066-5277:2017:Alkhateeb A:zseq:an:approach</idno>
<idno type="wicri:Area/Main/Merge">000B17</idno>
<idno type="wicri:Area/Main/Curation">000B14</idno>
<idno type="wicri:Area/Main/Exploration">000B14</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Zseq: An Approach for Preprocessing Next-Generation Sequencing Data</title>
<author><name sortKey="Alkhateeb, Abedalrhman" sort="Alkhateeb, Abedalrhman" uniqKey="Alkhateeb A" first="Abedalrhman" last="Alkhateeb">Abedalrhman Alkhateeb</name>
</author>
<author><name sortKey="Rueda, Luis" sort="Rueda, Luis" uniqKey="Rueda L" first="Luis" last="Rueda">Luis Rueda</name>
</author>
</analytic>
<series><title level="j">Journal of Computational Biology</title>
<idno type="ISSN">1066-5277</idno>
<idno type="eISSN">1557-8666</idno>
<imprint><date when="2017">2017</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithms</term>
<term>Genome, Human</term>
<term>Genomics (methods)</term>
<term>High-Throughput Nucleotide Sequencing (methods)</term>
<term>Humans</term>
<term>Sequence Analysis, DNA (methods)</term>
<term>Software</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr"><term>Algorithmes</term>
<term>Analyse de séquence d'ADN ()</term>
<term>Génome humain</term>
<term>Génomique ()</term>
<term>Humains</term>
<term>Logiciel</term>
<term>Séquençage nucléotidique à haut débit ()</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en"><term>Genomics</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Algorithms</term>
<term>Genome, Human</term>
<term>Humans</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr"><term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Génome humain</term>
<term>Génomique</term>
<term>Humains</term>
<term>Logiciel</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><title>Abstract</title>
<p><bold>Next-generation sequencing technology generates a huge number of reads (short sequences), which contain a vast amount of genomic data. The sequencing process, however, comes with artifacts. Preprocessing of sequences is mandatory for further downstream analysis. We present Zseq, a linear method that identifies the most informative genomic sequences and reduces the number of biased sequences, sequence duplications, and ambiguous nucleotides. Zseq finds the complexity of the sequences by counting the number of unique</bold>
<italic>k</italic>
<bold>-mers in each sequence as its corresponding score and also takes into account other factors such as ambiguous nucleotides or high GC-content percentage in</bold>
<italic>k</italic>
<bold>-mers. Based on a</bold>
<italic>z</italic>
<bold>-score threshold, Zseq sweeps through the sequences again and filters out those with a z-score less than the user-defined threshold.</bold>
</p>
<p><bold>The Zseq algorithm provides a better mapping rate and significantly reduces the number of ambiguous bases in comparison with other methods. The filtered reads were evaluated by aligning them and assembling the transcripts using the reference genome as well as <italic>de novo</italic>
assembly. The assembled transcripts show a better discriminative ability to separate cancer and normal samples in comparison with another state-of-the-art method. Moreover, <italic>de novo</italic>
assembled transcripts from the reads filtered by Zseq have longer genomic sequences than those from other tested methods. A way of estimating the cutoff threshold using labeling rules is introduced, with promising results.</bold>
</p>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct><analytic><author><name sortKey="Altschul, S F" uniqKey="Altschul S">S.F. Altschul</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Brown, T" uniqKey="Brown T">T. Brown</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Cheadle, C" uniqKey="Cheadle C">C. Cheadle</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Chen, Y C" uniqKey="Chen Y">Y.-C. Chen</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Cortes, C" uniqKey="Cortes C">C. Cortes</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Grabherr, M G" uniqKey="Grabherr M">M.G. Grabherr</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ishiguro, H" uniqKey="Ishiguro H">H. Ishiguro</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kannan, K" uniqKey="Kannan K">K. Kannan</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kim, D" uniqKey="Kim D">D. Kim</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kim, J H" uniqKey="Kim J">J.H. Kim</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Lavezzo, E" uniqKey="Lavezzo E">E. Lavezzo</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Liu, H" uniqKey="Liu H">H. Liu</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Mackinnon, M J" uniqKey="Mackinnon M">M.J. Mackinnon</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Margulies, M" uniqKey="Margulies M">M. Margulies</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Morgulis, A" uniqKey="Morgulis A">A. Morgulis</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Pozzoli, U" uniqKey="Pozzoli U">U. Pozzoli</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Quail, M A" uniqKey="Quail M">M.A. Quail</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Schmieder, R" uniqKey="Schmieder R">R. Schmieder</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Traish, A M" uniqKey="Traish A">A.M. Traish</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Trapnell, C" uniqKey="Trapnell C">C. Trapnell</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Vogel, F" uniqKey="Vogel F">F. Vogel</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Waszak, S M" uniqKey="Waszak S">S.M. Waszak</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Wuitschick, J" uniqKey="Wuitschick J">J. Wuitschick</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Yakovchuk, P" uniqKey="Yakovchuk P">P. Yakovchuk</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Zhao, Z" uniqKey="Zhao Z">Z. Zhao</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<affiliations><list></list>
<tree><noCountry><name sortKey="Alkhateeb, Abedalrhman" sort="Alkhateeb, Abedalrhman" uniqKey="Alkhateeb A" first="Abedalrhman" last="Alkhateeb">Abedalrhman Alkhateeb</name>
<name sortKey="Rueda, Luis" sort="Rueda, Luis" uniqKey="Rueda L" first="Luis" last="Rueda">Luis Rueda</name>
</noCountry>
</tree>
</affiliations>
</record>
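The filtering idea described in the abstract above (score each read by its number of unique k-mers, then drop reads whose z-score falls below a user-defined threshold) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the default k, the default threshold, and the choice to simply skip k-mers containing 'N' are all assumptions made for the example, and the GC-content factor mentioned in the abstract is left out.

```python
from statistics import mean, pstdev


def unique_kmer_score(read: str, k: int = 4) -> int:
    """Count distinct k-mers in a read, skipping k-mers that contain 'N'.

    Skipping ambiguous k-mers is an illustrative assumption; the abstract
    does not say exactly how Zseq accounts for them.
    """
    read = read.upper()
    kmers = {read[i:i + k] for i in range(len(read) - k + 1)}
    return sum(1 for kmer in kmers if "N" not in kmer)


def zscore_filter(reads: list[str], k: int = 4, threshold: float = -1.0) -> list[str]:
    """Keep reads whose unique-k-mer z-score is at or above `threshold`."""
    if not reads:
        return []
    # First pass: score every read.
    scores = [unique_kmer_score(r, k) for r in reads]
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All reads have the same score, so none stands out as low complexity.
        return list(reads)
    # Second pass: keep reads whose z-score reaches the threshold.
    return [r for r, s in zip(reads, scores) if (s - mu) / sigma >= threshold]


reads = [
    "ACGTTGCAAGCTTAGC",  # varied sequence
    "AAAAAAAAAAAAAAAA",  # homopolymer, low complexity
    "ACGTNNNNNNNNACGT",  # many ambiguous bases
    "GATTACAGGCTTACCA",  # varied sequence
]
kept = zscore_filter(reads, k=4, threshold=-0.5)
print(kept)
```

With these toy reads, the homopolymer and the N-rich read score far below the other two and are dropped. Both passes are linear in the total read length for a fixed k, which matches the "linear method" wording of the abstract.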
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000B14 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000B14 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Main |étape= Exploration |type= RBID |clé= PMC:5563921 |texte= Zseq: An Approach for Preprocessing Next-Generation Sequencing Data }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i -Sk "pubmed:28414515" \
| HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \
| NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.