MERS exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A comparative analysis of transcription factor binding models learned from PBM, HT-SELEX and ChIP data.

Internal identifier: 001A64 (PubMed/Curation); previous: 001A63; next: 001A65

Authors: Yaron Orenstein [Israel]; Ron Shamir

Source: Nucleic Acids Research, 2014, vol. 42, no. 8, e63

RBID : pubmed:24500199

Abstract

Understanding gene regulation is a key challenge in today's biology. The new technologies of protein-binding microarrays (PBMs) and high-throughput SELEX (HT-SELEX) allow measurement of the binding intensities of one transcription factor (TF) to numerous synthetic double-stranded DNA sequences in a single experiment. Recently, Jolma et al. reported the results of 547 HT-SELEX experiments covering human and mouse TFs. Because 162 of these TFs were also covered by PBM technology, for the first time, a large-scale comparison between implementations of these two in vitro technologies is possible. Here we assessed the similarities and differences between binding models, represented as position weight matrices, inferred from PBM and HT-SELEX, and also measured how well these models predict in vivo binding. Our results show that HT-SELEX- and PBM-derived models agree for most TFs. For some TFs, the HT-SELEX-derived models are longer versions of the PBM-derived models, whereas for other TFs, the HT-SELEX models match the secondary PBM-derived models. Remarkably, PBM-based 8-mer ranking is more accurate than that of HT-SELEX, but models derived from HT-SELEX predict in vivo binding better. In addition, we reveal several biases in HT-SELEX data including nucleotide frequency bias, enrichment of C-rich k-mers and oligos and underrepresentation of palindromes.

DOI: 10.1093/nar/gku117
PubMed: 24500199

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A comparative analysis of transcription factor binding models learned from PBM, HT-SELEX and ChIP data.</title>
<author>
<name sortKey="Orenstein, Yaron" sort="Orenstein, Yaron" uniqKey="Orenstein Y" first="Yaron" last="Orenstein">Yaron Orenstein</name>
<affiliation wicri:level="1">
<nlm:affiliation>Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv 69978, Israel.</nlm:affiliation>
<country xml:lang="fr">Israël</country>
<wicri:regionArea>Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv 69978</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Shamir, Ron" sort="Shamir, Ron" uniqKey="Shamir R" first="Ron" last="Shamir">Ron Shamir</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2014">2014</date>
<idno type="RBID">pubmed:24500199</idno>
<idno type="pmid">24500199</idno>
<idno type="doi">10.1093/nar/gku117</idno>
<idno type="wicri:Area/PubMed/Corpus">001A64</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001A64</idno>
<idno type="wicri:Area/PubMed/Curation">001A64</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">001A64</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">A comparative analysis of transcription factor binding models learned from PBM, HT-SELEX and ChIP data.</title>
<author>
<name sortKey="Orenstein, Yaron" sort="Orenstein, Yaron" uniqKey="Orenstein Y" first="Yaron" last="Orenstein">Yaron Orenstein</name>
<affiliation wicri:level="1">
<nlm:affiliation>Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv 69978, Israel.</nlm:affiliation>
<country xml:lang="fr">Israël</country>
<wicri:regionArea>Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv 69978</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Shamir, Ron" sort="Shamir, Ron" uniqKey="Shamir R" first="Ron" last="Shamir">Ron Shamir</name>
</author>
</analytic>
<series>
<title level="j">Nucleic acids research</title>
<idno type="eISSN">1362-4962</idno>
<imprint>
<date when="2014" type="published">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Animals</term>
<term>Binding Sites</term>
<term>Chromatin Immunoprecipitation</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Humans</term>
<term>Mice</term>
<term>Models, Biological</term>
<term>Oligonucleotides</term>
<term>Protein Array Analysis (methods)</term>
<term>Regulatory Elements, Transcriptional</term>
<term>SELEX Aptamer Technique (methods)</term>
<term>Sequence Analysis, DNA</term>
<term>Transcription Factors (metabolism)</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Analyse de séquence d'ADN</term>
<term>Analyse par réseau de protéines ()</term>
<term>Animaux</term>
<term>Facteurs de transcription (métabolisme)</term>
<term>Humains</term>
<term>Immunoprécipitation de la chromatine</term>
<term>Modèles biologiques</term>
<term>Oligonucléotides</term>
<term>Sites de fixation</term>
<term>Souris</term>
<term>Séquençage nucléotidique à haut débit</term>
<term>Technique SELEX ()</term>
<term>Éléments de régulation transcriptionnelle</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="metabolism" xml:lang="en">
<term>Transcription Factors</term>
</keywords>
<keywords scheme="MESH" type="chemical" xml:lang="en">
<term>Oligonucleotides</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Protein Array Analysis</term>
<term>SELEX Aptamer Technique</term>
</keywords>
<keywords scheme="MESH" qualifier="métabolisme" xml:lang="fr">
<term>Facteurs de transcription</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Animals</term>
<term>Binding Sites</term>
<term>Chromatin Immunoprecipitation</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Humans</term>
<term>Mice</term>
<term>Models, Biological</term>
<term>Regulatory Elements, Transcriptional</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Analyse de séquence d'ADN</term>
<term>Analyse par réseau de protéines</term>
<term>Animaux</term>
<term>Humains</term>
<term>Immunoprécipitation de la chromatine</term>
<term>Modèles biologiques</term>
<term>Oligonucléotides</term>
<term>Sites de fixation</term>
<term>Souris</term>
<term>Séquençage nucléotidique à haut débit</term>
<term>Technique SELEX</term>
<term>Éléments de régulation transcriptionnelle</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Understanding gene regulation is a key challenge in today's biology. The new technologies of protein-binding microarrays (PBMs) and high-throughput SELEX (HT-SELEX) allow measurement of the binding intensities of one transcription factor (TF) to numerous synthetic double-stranded DNA sequences in a single experiment. Recently, Jolma et al. reported the results of 547 HT-SELEX experiments covering human and mouse TFs. Because 162 of these TFs were also covered by PBM technology, for the first time, a large-scale comparison between implementations of these two in vitro technologies is possible. Here we assessed the similarities and differences between binding models, represented as position weight matrices, inferred from PBM and HT-SELEX, and also measured how well these models predict in vivo binding. Our results show that HT-SELEX- and PBM-derived models agree for most TFs. For some TFs, the HT-SELEX-derived models are longer versions of the PBM-derived models, whereas for other TFs, the HT-SELEX models match the secondary PBM-derived models. Remarkably, PBM-based 8-mer ranking is more accurate than that of HT-SELEX, but models derived from HT-SELEX predict in vivo binding better. In addition, we reveal several biases in HT-SELEX data including nucleotide frequency bias, enrichment of C-rich k-mers and oligos and underrepresentation of palindromes. </div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">24500199</PMID>
<DateCompleted>
<Year>2014</Year>
<Month>07</Month>
<Day>22</Day>
</DateCompleted>
<DateRevised>
<Year>2018</Year>
<Month>11</Month>
<Day>13</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1362-4962</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>42</Volume>
<Issue>8</Issue>
<PubDate>
<Year>2014</Year>
<Month>Apr</Month>
</PubDate>
</JournalIssue>
<Title>Nucleic acids research</Title>
<ISOAbbreviation>Nucleic Acids Res.</ISOAbbreviation>
</Journal>
<ArticleTitle>A comparative analysis of transcription factor binding models learned from PBM, HT-SELEX and ChIP data.</ArticleTitle>
<Pagination>
<MedlinePgn>e63</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1093/nar/gku117</ELocationID>
<Abstract>
<AbstractText>Understanding gene regulation is a key challenge in today's biology. The new technologies of protein-binding microarrays (PBMs) and high-throughput SELEX (HT-SELEX) allow measurement of the binding intensities of one transcription factor (TF) to numerous synthetic double-stranded DNA sequences in a single experiment. Recently, Jolma et al. reported the results of 547 HT-SELEX experiments covering human and mouse TFs. Because 162 of these TFs were also covered by PBM technology, for the first time, a large-scale comparison between implementations of these two in vitro technologies is possible. Here we assessed the similarities and differences between binding models, represented as position weight matrices, inferred from PBM and HT-SELEX, and also measured how well these models predict in vivo binding. Our results show that HT-SELEX- and PBM-derived models agree for most TFs. For some TFs, the HT-SELEX-derived models are longer versions of the PBM-derived models, whereas for other TFs, the HT-SELEX models match the secondary PBM-derived models. Remarkably, PBM-based 8-mer ranking is more accurate than that of HT-SELEX, but models derived from HT-SELEX predict in vivo binding better. In addition, we reveal several biases in HT-SELEX data including nucleotide frequency bias, enrichment of C-rich k-mers and oligos and underrepresentation of palindromes. </AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Orenstein</LastName>
<ForeName>Yaron</ForeName>
<Initials>Y</Initials>
<AffiliationInfo>
<Affiliation>Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv 69978, Israel.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Shamir</LastName>
<ForeName>Ron</ForeName>
<Initials>R</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D003160">Comparative Study</PublicationType>
<PublicationType UI="D016428">Journal Article</PublicationType>
<PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2014</Year>
<Month>02</Month>
<Day>05</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>England</Country>
<MedlineTA>Nucleic Acids Res</MedlineTA>
<NlmUniqueID>0411011</NlmUniqueID>
<ISSNLinking>0305-1048</ISSNLinking>
</MedlineJournalInfo>
<ChemicalList>
<Chemical>
<RegistryNumber>0</RegistryNumber>
<NameOfSubstance UI="D009841">Oligonucleotides</NameOfSubstance>
</Chemical>
<Chemical>
<RegistryNumber>0</RegistryNumber>
<NameOfSubstance UI="D014157">Transcription Factors</NameOfSubstance>
</Chemical>
</ChemicalList>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D000818" MajorTopicYN="N">Animals</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D001665" MajorTopicYN="N">Binding Sites</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D047369" MajorTopicYN="N">Chromatin Immunoprecipitation</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D059014" MajorTopicYN="N">High-Throughput Nucleotide Sequencing</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D006801" MajorTopicYN="N">Humans</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D051379" MajorTopicYN="N">Mice</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D008954" MajorTopicYN="N">Models, Biological</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D009841" MajorTopicYN="N">Oligonucleotides</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D040081" MajorTopicYN="N">Protein Array Analysis</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D050436" MajorTopicYN="Y">Regulatory Elements, Transcriptional</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D052156" MajorTopicYN="N">SELEX Aptamer Technique</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D017422" MajorTopicYN="N">Sequence Analysis, DNA</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D014157" MajorTopicYN="N">Transcription Factors</DescriptorName>
<QualifierName UI="Q000378" MajorTopicYN="Y">metabolism</QualifierName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="entrez">
<Year>2014</Year>
<Month>2</Month>
<Day>7</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2014</Year>
<Month>2</Month>
<Day>7</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2014</Year>
<Month>7</Month>
<Day>23</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">24500199</ArticleId>
<ArticleId IdType="pii">gku117</ArticleId>
<ArticleId IdType="doi">10.1093/nar/gku117</ArticleId>
<ArticleId IdType="pmc">PMC4005680</ArticleId>
</ArticleIdList>
<ReferenceList>
<Reference>
<Citation>Cell. 2011 Dec 9;147(6):1270-82</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22153072</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Biol. 2011;12(2):R18</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21338519</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nucleic Acids Res. 2012 Sep 1;40(17):e128</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22610855</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Curr Protoc Mol Biol. 2012 Oct;Chapter 21:Unit 21.24</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23026909</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>PLoS One. 2012;7(9):e46145</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23029415</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Cell. 2013 Jan 17;152(1-2):327-39</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23332764</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Biotechnol. 2013 Feb;31(2):126-34</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23354101</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Biophys J. 2013 Mar 5;104(5):1107-15</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23473494</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Cell Rep. 2013 Apr 25;3(4):1093-104</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23562153</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>J Comput Biol. 2013 May;20(5):375-82</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23464877</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2013 Jun 1;29(11):1390-8</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23559638</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nucleic Acids Res. 2014 Jan;42(Database issue):D148-55</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">24214955</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nucleic Acids Res. 2010 Jul;38(12):e131</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20395217</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2000 Jan;16(1):16-23</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">10812473</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Mol Biol Rep. 1994;20(2):97-107</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">7536299</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Biotechnol. 2006 Nov;24(11):1429-35</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">16998473</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Science. 2007 Jun 8;316(5830):1497-502</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17540862</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2007 Jul 1;23(13):i72-9</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17646348</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Curr Protoc Cell Biol. 2004 Sep;Chapter 17:Unit 17.7</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18228445</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Cell. 2008 Jun 27;133(7):1266-76</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18585359</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>J Cell Biochem. 2009 May 1;107(1):11-8</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19173299</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Science. 2009 Jun 26;324(5935):1720-3</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19443739</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Rev Genet. 2009 Oct;10(10):669-80</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19736561</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>PLoS Comput Biol. 2009 Dec;5(12):e1000590</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19997485</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2010 Jun;20(6):861-73</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20378718</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Annu Rev Biochem. 2010;79:233-69</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20334529</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>EMBO J. 2010 Jul 7;29(13):2147-60</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20517297</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Biotechnol. 2010 Sep;28(9):970-5</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20802496</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nucleic Acids Res. 2011 Jan;39(Database issue):D124-8</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21037262</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Biotechnol. 2011 Jun;29(6):480-3</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21654662</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2012 Sep;22(9):1813-31</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22955991</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
</PubmedData>
</pubmed>
</record>
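
As an illustration of how the record above can be read programmatically, here is a minimal Python sketch that extracts the PMID, journal, title and article identifiers from the `<pubmed>` part. It is an example under stated assumptions, not part of the Dilib toolchain: the function name `summarize` and the constant `FRAGMENT` are hypothetical, and the fragment is a shortened copy of the record. As displayed here, the TEI part uses the `wicri:` and `nlm:` prefixes without namespace declarations, so a strict XML parser would likely reject the full record; the sketch therefore parses only the `<pubmed>` element.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <pubmed> element from the record above
# (hypothetical constant name; only a few child elements are kept).
FRAGMENT = """
<pubmed>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">24500199</PMID>
<Article PubModel="Print-Electronic">
<Journal>
<Title>Nucleic acids research</Title>
</Journal>
<ArticleTitle>A comparative analysis of transcription factor binding models learned from PBM, HT-SELEX and ChIP data.</ArticleTitle>
</Article>
</MedlineCitation>
<PubmedData>
<ArticleIdList>
<ArticleId IdType="pubmed">24500199</ArticleId>
<ArticleId IdType="pii">gku117</ArticleId>
<ArticleId IdType="doi">10.1093/nar/gku117</ArticleId>
<ArticleId IdType="pmc">PMC4005680</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
"""


def summarize(xml_text):
    """Return PMID, journal, title and article IDs from a <pubmed> element."""
    root = ET.fromstring(xml_text.strip())
    # Use the full path so that the ArticleIdList elements nested inside
    # <Reference> entries (present in the full record) are not picked up.
    ids = {
        node.get("IdType"): node.text
        for node in root.findall("./PubmedData/ArticleIdList/ArticleId")
    }
    return {
        "pmid": root.findtext("./MedlineCitation/PMID"),
        "journal": root.findtext("./MedlineCitation/Article/Journal/Title"),
        "title": root.findtext("./MedlineCitation/Article/ArticleTitle"),
        "ids": ids,
    }


summary = summarize(FRAGMENT)
print(summary["pmid"], summary["ids"]["doi"])
```

The explicit path to `ArticleIdList` matters for the full record, where each cited reference carries its own `ArticleIdList`; a recursive search such as `.//ArticleId` would mix the cited PMIDs with the identifiers of the article itself.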

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001A64 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd -nk 001A64 | SxmlIndent | more

To link to this page from within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Curation
   |type=    RBID
   |clé=     pubmed:24500199
   |texte=   A comparative analysis of transcription factor binding models learned from PBM, HT-SELEX and ChIP data.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Curation/RBID.i   -Sk "pubmed:24500199" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021