Serveur d'exploration MERS

Attention, ce site est en cours de développement !
Attention, site généré par des moyens informatiques à partir de corpus bruts.
Les informations ne sont donc pas validées.

Efficient Design of Compact Unstructured RNA Libraries Covering All k-mers.

Identifieur interne : 001321 ( PubMed/Curation ); précédent : 001320; suivant : 001322

Efficient Design of Compact Unstructured RNA Libraries Covering All k-mers.

Auteurs : Yaron Orenstein [États-Unis] ; Bonnie Berger [États-Unis]

Source :

RBID : pubmed:26713687

Abstract

Current microarray technologies to determine RNA structure or measure protein-RNA interactions rely on single-stranded, unstructured RNA probes on a chip covering together all k-mers. Since space on the array is limited, the problem is to efficiently design a compact library of unstructured ℓ-long RNA probes, where each k-mer is covered at least p times. Ray et al. designed such a library for specific values of k, ℓ, and p using ad-hoc rules. To our knowledge, there is no general method to date to solve this problem. Here, we address the problem of finding a minimum-size covering of all k-mers by ℓ-long sequences with the desired properties for any value of k, ℓ, and p. As we prove that the problem is NP-hard, we give two solutions: the first is a greedy algorithm with a logarithmic approximation ratio; the second, a heuristic greedy approach based on random walks in de Bruijn graphs. The heuristic algorithm works well in practice and produces a library of unstructured RNA probes that is only ∼1.1-times greater in size compared to the theoretical lower bound. We present results for typical values of k and probe lengths ℓ and show that our algorithm generates a library that is significantly smaller than the library of Ray et al.; moreover, we show that our algorithm outperforms naive methods. Our approach can be generalized and extended to generate RNA or DNA oligo libraries with other desired properties. The software is freely available online.

DOI: 10.1089/cmb.2015.0179
PubMed: 26713687

Links toward previous steps (curation, corpus...)


Links to Exploration step

pubmed:26713687

Le document en format XML

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Efficient Design of Compact Unstructured RNA Libraries Covering All k-mers.</title>
<author>
<name sortKey="Orenstein, Yaron" sort="Orenstein, Yaron" uniqKey="Orenstein Y" first="Yaron" last="Orenstein">Yaron Orenstein</name>
<affiliation wicri:level="2">
<nlm:affiliation>1 Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology , Cambridge, MA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
<wicri:cityArea>1 Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology , Cambridge</wicri:cityArea>
</affiliation>
</author>
<author>
<name sortKey="Berger, Bonnie" sort="Berger, Bonnie" uniqKey="Berger B" first="Bonnie" last="Berger">Bonnie Berger</name>
<affiliation wicri:level="2">
<nlm:affiliation>1 Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology , Cambridge, MA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
<wicri:cityArea>1 Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology , Cambridge</wicri:cityArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2016">2016</date>
<idno type="RBID">pubmed:26713687</idno>
<idno type="pmid">26713687</idno>
<idno type="doi">10.1089/cmb.2015.0179</idno>
<idno type="wicri:Area/PubMed/Corpus">001321</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001321</idno>
<idno type="wicri:Area/PubMed/Curation">001321</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">001321</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Efficient Design of Compact Unstructured RNA Libraries Covering All k-mers.</title>
<author>
<name sortKey="Orenstein, Yaron" sort="Orenstein, Yaron" uniqKey="Orenstein Y" first="Yaron" last="Orenstein">Yaron Orenstein</name>
<affiliation wicri:level="2">
<nlm:affiliation>1 Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology , Cambridge, MA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
<wicri:cityArea>1 Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology , Cambridge</wicri:cityArea>
</affiliation>
</author>
<author>
<name sortKey="Berger, Bonnie" sort="Berger, Bonnie" uniqKey="Berger B" first="Bonnie" last="Berger">Bonnie Berger</name>
<affiliation wicri:level="2">
<nlm:affiliation>1 Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology , Cambridge, MA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
<wicri:cityArea>1 Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology , Cambridge</wicri:cityArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Journal of computational biology : a journal of computational molecular cell biology</title>
<idno type="eISSN">1557-8666</idno>
<imprint>
<date when="2016" type="published">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Current microarray technologies to determine RNA structure or measure protein-RNA interactions rely on single-stranded, unstructured RNA probes on a chip covering together all k-mers. Since space on the array is limited, the problem is to efficiently design a compact library of unstructured ℓ-long RNA probes, where each k-mer is covered at least p times. Ray et al. designed such a library for specific values of k, ℓ, and p using ad-hoc rules. To our knowledge, there is no general method to date to solve this problem. Here, we address the problem of finding a minimum-size covering of all k-mers by ℓ-long sequences with the desired properties for any value of k, ℓ, and p. As we prove that the problem is NP-hard, we give two solutions: the first is a greedy algorithm with a logarithmic approximation ratio; the second, a heuristic greedy approach based on random walks in de Bruijn graphs. The heuristic algorithm works well in practice and produces a library of unstructured RNA probes that is only ∼1.1-times greater in size compared to the theoretical lower bound. We present results for typical values of k and probe lengths ℓ and show that our algorithm generates a library that is significantly smaller than the library of Ray et al.; moreover, we show that our algorithm outperforms naive methods. Our approach can be generalized and extended to generate RNA or DNA oligo libraries with other desired properties. The software is freely available online.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="PubMed-not-MEDLINE" Owner="NLM">
<PMID Version="1">26713687</PMID>
<DateRevised>
<Year>2019</Year>
<Month>11</Month>
<Day>20</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1557-8666</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>23</Volume>
<Issue>2</Issue>
<PubDate>
<Year>2016</Year>
<Month>Feb</Month>
</PubDate>
</JournalIssue>
<Title>Journal of computational biology : a journal of computational molecular cell biology</Title>
<ISOAbbreviation>J. Comput. Biol.</ISOAbbreviation>
</Journal>
<ArticleTitle>Efficient Design of Compact Unstructured RNA Libraries Covering All k-mers.</ArticleTitle>
<Pagination>
<MedlinePgn>67-79</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1089/cmb.2015.0179</ELocationID>
<Abstract>
<AbstractText>Current microarray technologies to determine RNA structure or measure protein-RNA interactions rely on single-stranded, unstructured RNA probes on a chip covering together all k-mers. Since space on the array is limited, the problem is to efficiently design a compact library of unstructured ℓ-long RNA probes, where each k-mer is covered at least p times. Ray et al. designed such a library for specific values of k, ℓ, and p using ad-hoc rules. To our knowledge, there is no general method to date to solve this problem. Here, we address the problem of finding a minimum-size covering of all k-mers by ℓ-long sequences with the desired properties for any value of k, ℓ, and p. As we prove that the problem is NP-hard, we give two solutions: the first is a greedy algorithm with a logarithmic approximation ratio; the second, a heuristic greedy approach based on random walks in de Bruijn graphs. The heuristic algorithm works well in practice and produces a library of unstructured RNA probes that is only ∼1.1-times greater in size compared to the theoretical lower bound. We present results for typical values of k and probe lengths ℓ and show that our algorithm generates a library that is significantly smaller than the library of Ray et al.; moreover, we show that our algorithm outperforms naive methods. Our approach can be generalized and extended to generate RNA or DNA oligo libraries with other desired properties. The software is freely available online.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Orenstein</LastName>
<ForeName>Yaron</ForeName>
<Initials>Y</Initials>
<AffiliationInfo>
<Affiliation>1 Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology , Cambridge, MA.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Berger</LastName>
<ForeName>Bonnie</ForeName>
<Initials>B</Initials>
<AffiliationInfo>
<Affiliation>1 Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology , Cambridge, MA.</Affiliation>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>2 Department of Mathematics, Massachusetts Institute of Technology , Cambridge, MA.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<GrantList CompleteYN="Y">
<Grant>
<GrantID>R01 GM081871</GrantID>
<Acronym>GM</Acronym>
<Agency>NIGMS NIH HHS</Agency>
<Country>United States</Country>
</Grant>
</GrantList>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2015</Year>
<Month>12</Month>
<Day>29</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>United States</Country>
<MedlineTA>J Comput Biol</MedlineTA>
<NlmUniqueID>9433358</NlmUniqueID>
<ISSNLinking>1066-5277</ISSNLinking>
</MedlineJournalInfo>
<KeywordList Owner="NOTNLM">
<Keyword MajorTopicYN="N">RNA secondary structure</Keyword>
<Keyword MajorTopicYN="N">de Bruijn graph</Keyword>
<Keyword MajorTopicYN="N">microarray library design</Keyword>
</KeywordList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="pubmed">
<Year>2015</Year>
<Month>12</Month>
<Day>30</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2015</Year>
<Month>12</Month>
<Day>30</Day>
<Hour>6</Hour>
<Minute>1</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2015</Year>
<Month>12</Month>
<Day>30</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">26713687</ArticleId>
<ArticleId IdType="doi">10.1089/cmb.2015.0179</ArticleId>
<ArticleId IdType="pmc">PMC4752187</ArticleId>
</ArticleIdList>
<ReferenceList>
<Reference>
<Citation>Nat Rev Genet. 2014 Dec;15(12 ):829-45</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25365966</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Biotechnol. 2006 Nov;24(11):1429-35</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">16998473</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Mol Cell. 2014 Jun 5;54(5):887-900</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">24837674</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Methods Mol Biol. 2015;1269:3-16</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25577369</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Algorithms Mol Biol. 2011 Nov 24;6:26</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22115189</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Rev Genet. 2014 Oct;15(10):689-701</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25112293</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nature. 2013 Jul 11;499(7457):172-7</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23846655</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Proc Natl Acad Sci U S A. 2011 Jun 14;108(24):10010-5</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21610164</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>J Comput Biol. 2008 Sep;15(7):655-65</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18651798</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Anal Chem. 2009 Nov 1;81(21):8949-56</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19874056</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Genet. 2007 Oct;39(10):1278-84</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17893677</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Rev Genet. 2011 Aug 18;12(9):641-55</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21850044</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Biotechnol. 2010 Sep;28(9):970-5</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20802496</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Biochemistry. 2006 Jan 17;45(2):581-93</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">16401087</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Rev Genet. 2012 Jan 18;13(2):77-83</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22251872</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2013 Jul 1;29(13):i71-9</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23813011</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Biol. 2014 Jan 31;15(1):401</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">24485348</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nucleic Acids Res. 2015 Jan;43(1):1-12</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25505162</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Rev Genet. 2013 May;14(5):333-46</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23594911</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2006 Feb 15;22(4):500-3</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">16357029</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Biotechnol. 2009 Jul;27(7):667-70</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19561594</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nature. 2010 Sep 2;467(7311):103-7</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20811459</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>EMBO Rep. 2005 Jan;6(1):33-8</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">15643449</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
</PubmedData>
</pubmed>
</record>

Pour manipuler ce document sous Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001321 | SxmlIndent | more

Ou

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd -nk 001321 | SxmlIndent | more

Pour mettre un lien sur cette page dans le réseau Wicri

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Curation
   |type=    RBID
   |clé=     pubmed:26713687
   |texte=   Efficient Design of Compact Unstructured RNA Libraries Covering All k-mers.
}}

Pour générer des pages wiki

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Curation/RBID.i   -Sk "pubmed:26713687" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021