
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">MEME: discovering and analyzing DNA and protein sequence motifs</title>
<author><name sortKey="Bailey, Timothy L" sort="Bailey, Timothy L" uniqKey="Bailey T" first="Timothy L." last="Bailey">Timothy L. Bailey</name>
</author>
<author><name sortKey="Williams, Nadya" sort="Williams, Nadya" uniqKey="Williams N" first="Nadya" last="Williams">Nadya Williams</name>
</author>
<author><name sortKey="Misleh, Chris" sort="Misleh, Chris" uniqKey="Misleh C" first="Chris" last="Misleh">Chris Misleh</name>
</author>
<author><name sortKey="Li, Wilfred W" sort="Li, Wilfred W" uniqKey="Li W" first="Wilfred W." last="Li">Wilfred W. Li</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">16845028</idno>
<idno type="pmc">1538909</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1538909</idno>
<idno type="RBID">PMC:1538909</idno>
<idno type="doi">10.1093/nar/gkl198</idno>
<date when="2006">2006</date>
<idno type="wicri:Area/Pmc/Corpus">000582</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">MEME: discovering and analyzing DNA and protein sequence motifs</title>
<author><name sortKey="Bailey, Timothy L" sort="Bailey, Timothy L" uniqKey="Bailey T" first="Timothy L." last="Bailey">Timothy L. Bailey</name>
</author>
<author><name sortKey="Williams, Nadya" sort="Williams, Nadya" uniqKey="Williams N" first="Nadya" last="Williams">Nadya Williams</name>
</author>
<author><name sortKey="Misleh, Chris" sort="Misleh, Chris" uniqKey="Misleh C" first="Chris" last="Misleh">Chris Misleh</name>
</author>
<author><name sortKey="Li, Wilfred W" sort="Li, Wilfred W" uniqKey="Li W" first="Wilfred W." last="Li">Wilfred W. Li</name>
</author>
</analytic>
<series><title level="j">Nucleic Acids Research</title>
<idno type="ISSN">0305-1048</idno>
<idno type="eISSN">1362-4962</idno>
<imprint><date when="2006">2006</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><p>MEME (Multiple EM for Motif Elicitation) is one of the most widely used tools for searching for novel ‘signals’ in sets of biological sequences. Applications include the discovery of new transcription factor binding sites and protein domains. MEME works by searching for repeated, ungapped sequence patterns that occur in the DNA or protein sequences provided by the user. Users can perform MEME searches via the web server hosted by the National Biomedical Computation Resource (<ext-link ext-link-type="uri" xlink:href="http://meme.nbcr.net">http://meme.nbcr.net</ext-link>
) and several mirror sites. Through the same web server, users can also access the Motif Alignment and Search Tool to search sequence databases for matches to motifs encoded in several popular formats. By clicking on buttons in the MEME output, users can compare the motifs discovered in their input sequences with databases of known motifs, search sequence databases for matches to the motifs and display the motifs in various formats. This article describes the freely accessible web server and its architecture, and discusses ways to use MEME effectively to find new sequence patterns in biological sequences and analyze their significance.</p>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article"><pmc-dir>properties open_access</pmc-dir>
  <front><journal-meta><journal-id journal-id-type="nlm-ta">Nucleic Acids Res</journal-id>
<journal-id journal-id-type="publisher-id">Nucleic Acids Research</journal-id>
<journal-title>Nucleic Acids Research</journal-title>
<issn pub-type="ppub">0305-1048</issn>
<issn pub-type="epub">1362-4962</issn>
<publisher><publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">16845028</article-id>
<article-id pub-id-type="pmc">1538909</article-id>
<article-id pub-id-type="doi">10.1093/nar/gkl198</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Article</subject>
</subj-group>
</article-categories>
<title-group><article-title>MEME: discovering and analyzing DNA and protein sequence motifs</article-title>
</title-group>
<contrib-group><contrib contrib-type="author"><name><surname>Bailey</surname>
<given-names>Timothy L.</given-names>
</name>
<xref ref-type="corresp" rid="cor1">*</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Williams</surname>
<given-names>Nadya</given-names>
</name>
<xref rid="au1" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Misleh</surname>
<given-names>Chris</given-names>
</name>
<xref rid="au1" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Li</surname>
<given-names>Wilfred W.</given-names>
</name>
<xref rid="au1" ref-type="aff">1</xref>
</contrib>
<aff><institution>Institute of Molecular Bioscience, The University of Queensland</institution>
<addr-line>St Lucia, QLD 4072, Australia</addr-line>
</aff>
<aff id="au1"><sup>1</sup>
<institution>SDSC, UCSD, La Jolla</institution>
<addr-line>CA, USA</addr-line>
</aff>
</contrib-group>
<author-notes><corresp id="cor1"><sup>*</sup>
To whom correspondence should be addressed. Tel: +61 7 3346 2614; Fax: +61 7 3346 2101; Email: <email>t.bailey@imb.uq.edu.au</email>
</corresp>
</author-notes>
<pmc-comment>For NAR: both ppub and collection dates generated for PMC processing 1/27/05 beck</pmc-comment>
      <pub-date pub-type="collection"><day>01</day>
<month>7</month>
<year>2006</year>
</pub-date>
<pub-date pub-type="ppub"><day>01</day>
<month>7</month>
<year>2006</year>
</pub-date>
<pub-date pub-type="epub"><day>14</day>
<month>7</month>
<year>2006</year>
</pub-date>
<volume>34</volume>
<issue>Web Server issue</issue>
<fpage>W369</fpage>
<lpage>W373</lpage>
<history><date date-type="received"><day>14</day>
<month>2</month>
<year>2006</year>
</date>
<date date-type="rev-recd"><day>21</day>
<month>3</month>
<year>2006</year>
</date>
<date date-type="accepted"><day>21</day>
<month>3</month>
<year>2006</year>
</date>
</history>
<copyright-statement>© The Author 2006. Published by Oxford University Press. All rights reserved</copyright-statement>
<copyright-year>2006</copyright-year>
<license license-type="openaccess"><p>The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions@oxfordjournals.org</p>
</license>
<abstract><p>MEME (Multiple EM for Motif Elicitation) is one of the most widely used tools for searching for novel ‘signals’ in sets of biological sequences. Applications include the discovery of new transcription factor binding sites and protein domains. MEME works by searching for repeated, ungapped sequence patterns that occur in the DNA or protein sequences provided by the user. Users can perform MEME searches via the web server hosted by the National Biomedical Computation Resource (<ext-link ext-link-type="uri" xlink:href="http://meme.nbcr.net">http://meme.nbcr.net</ext-link>
) and several mirror sites. Through the same web server, users can also access the Motif Alignment and Search Tool to search sequence databases for matches to motifs encoded in several popular formats. By clicking on buttons in the MEME output, users can compare the motifs discovered in their input sequences with databases of known motifs, search sequence databases for matches to the motifs and display the motifs in various formats. This article describes the freely accessible web server and its architecture, and discusses ways to use MEME effectively to find new sequence patterns in biological sequences and analyze their significance.</p>
</abstract>
</article-meta>
</front>
<body><sec><title>INTRODUCTION</title>
<p>The purpose of MEME (Multiple EM For Motif Elicitation) (rhymes with ‘team’) (<xref ref-type="bibr" rid="b1">1</xref>
,<xref ref-type="bibr" rid="b2">2</xref>
) is to allow users to discover signals (called ‘motifs’) in DNA or protein sequences. The user of MEME inputs a set of sequences believed to share some (unknown) sequence signal(s). For example, some or all of a set of promoters from co-expressed and/or orthologous genes may contain binding sites (the ‘signal’) for the same transcription factor (<xref ref-type="bibr" rid="b3">3</xref>
). Similarly, a set of proteins that interact with a single host protein may do so via similar domains (the ‘signal’) (<xref ref-type="bibr" rid="b4">4</xref>
). Both types of sequence signals can often be represented as motifs, that is, ungapped, approximate sequence patterns. Using a process akin to gapless, local, multiple sequence alignment, MEME searches for statistically significant motifs in the input sequence set. In this way, MEME can discover the binding sites for the shared transcription factor in the set of promoters or the common protein–protein binding domains in the set of proteins. MEME can also be used to discover motifs describing many other types of DNA or protein signals besides transcription factor binding sites and protein–protein interaction domains.</p>
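<p>As a rough illustration of what an ungapped, approximate sequence pattern looks like in practice, the following Python sketch turns a set of already aligned, equal-width sites into a position-specific probability matrix. It is only a sketch of the representation, not of the search that MEME performs; the function name, the pseudocount parameter and the example sites are assumptions made for illustration.</p>
<preformat>
```python
from typing import Dict, List


def position_probability_matrix(sites: List[str], alphabet: str = "ACGT",
                                pseudocount: float = 0.0) -> List[Dict[str, float]]:
    """Return one letter-probability dictionary per motif column.

    Hypothetical helper: `sites` are ungapped motif occurrences that are
    already aligned, so they must all have the same width.
    """
    if not sites:
        raise ValueError("need at least one site")
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("ungapped sites must all have the same width")
    # Denominator is the same for every column: site count plus pseudocounts.
    total = len(sites) + pseudocount * len(alphabet)
    matrix = []
    for pos in range(width):
        column = [s[pos].upper() for s in sites]
        matrix.append({a: (column.count(a) + pseudocount) / total
                       for a in alphabet})
    return matrix


# Example with four made-up DNA sites of width 4.
example = position_probability_matrix(["ACGT", "ACGA", "TCGT", "ACGT"])
```
</preformat>
<p>For the four hypothetical sites ACGT, ACGA, TCGT and ACGT, the first column of the matrix gives A a probability of 0.75 and T a probability of 0.25.</p>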
<p>To use MEME via the website, the user provides a set of sequences in the FASTA format by either uploading a file or by cut-and-paste. The only other required input is an email address where the results will be sent. (A planned future version will remove this requirement by providing temporary storage of the results on the web server for a preset period of time.) By default, MEME looks for up to three motifs, each of which may be present in some or all of the input sequences. MEME chooses the width and number of occurrences of each motif automatically in order to minimize the ‘<italic>E</italic>
-value’ of the motif—the probability of finding an equally well-conserved pattern in random sequences. By default, only motif widths between 6 and 50 are considered, but the user may change this as well as several other aspects of the search for motifs.</p>
<p>The MEME output is HTML and shows the motifs as local multiple alignments of (subsets of) the input sequences, as well as in several other formats (<xref ref-type="fig" rid="fig1">Figure 1</xref>
). ‘Block diagrams’ show the relative positions of the motifs in each of the input sequences. Buttons on the MEME HTML output allow one or all of the motifs to be forwarded for analysis by other web-based programs. Clicking on a button allows all of the motifs to be sent to the MAST web server where various sequence databases (or uploaded sequences) can be searched for sequences matching the motifs. This is useful in cases, for example, where the user would like to find whether the motif of interest is also present in other genes or genomes.</p>
<p>MAST is a web-based tool that can be used to search for sequences that match one or more motifs. It can be used to look for sequences that contain motifs found by MEME, by other motif discovery tools or that are taken from a motif database. The MAST website, reached via the same URL as the MEME website, provides numerous nucleotide and protein databases for searching. MAST queries may contain any number of motifs, and it scores each sequence in the selected database using all of the motifs. In the first example above, MAST can search DNA sequences for matches to the putative transcription factor binding site (TFBS) motifs found by MEME in a set of promoter sequences. MAST can search for matches in protein sequences to the putative protein–protein interaction motifs found in the second MEME example.</p>
<p>Users of MEME via the website or locally installed versions are asked to cite this article as well as the primary reference for MEME (<xref ref-type="bibr" rid="b5">5</xref>). Users of MAST are asked to cite this article and Ref. (<xref ref-type="bibr" rid="b6">6</xref>
).</p>
</sec>
<sec><title>MOTIF DISCOVERY STRATEGIES</title>
<p>Motif discovery can be viewed as a ‘needle in a haystack’ problem. The motif discovery algorithm is looking for a set of similar short sequences (the needle) in a set of much longer sequences (the haystack). The problem is easier when the motif instances are long and very similar to each other. It gets much harder when the motif instances are short and/or degenerate, or the input sequences are very long.</p>
<p>Discovering TFBS motifs in a set of DNA sequences (e.g. genomic regions upstream of genes) is difficult because binding sites tend to be short and degenerate, and because promoter regions are often hard to identify precisely. The problem tends to be worse in higher eukaryotes than in prokaryotes and yeast because TFBS in higher eukaryotes tend to be shorter and more variable (<xref ref-type="bibr" rid="b7">7</xref>
).</p>
<p>To successfully discover TFBS motifs with MEME, it is necessary to choose and prepare the input sequences carefully. Candidate sequences can be the promoters of genes believed to be co-regulated based on the evidence from expression microarray experiments, or sequences appearing to bind to a transcription factor based on chromatin immunoprecipitation experiments. The sequences should be as short as possible and contain as few ‘noise’ sequences (sequences not containing any motif) as possible. Ideally, the sequences should be <1000 bp long (<xref ref-type="bibr" rid="b8">8</xref>
). Including more than 40 motif-containing sequences generally does not improve TFBS motif discovery with MEME and similar algorithms (<xref ref-type="bibr" rid="b9">9</xref>
). If the sequences contain low-information segments that do not contain motifs of interest, it can be helpful to remove them using the DUST program (R. L. Tatusov and D. J. Lipman, unpublished NCBI/Toolkit), which is available for downloading at <ext-link ext-link-type="uri" xlink:href="http://blast.wustl.edu/pub/dust/">http://blast.wustl.edu/pub/dust/</ext-link>
. Repetitive DNA elements should also be removed from the sequences input to MEME using the RepeatMasker program (A. Smit, R. Hubley and P. Green, unpublished data), which can be accessed via the Web (<ext-link ext-link-type="uri" xlink:href="http://www.repeatmasker.org/cgi-bin/WEBRepeatMasker">http://www.repeatmasker.org/cgi-bin/WEBRepeatMasker</ext-link>
).</p>
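<p>The length and count guidelines above can be applied before sequences are submitted. The Python sketch below parses FASTA text, keeps sequences shorter than 1000 bp and retains at most 40 of them. The function names and the choice to keep the first 40 records are assumptions; which sequences to retain is a biological judgment, and the sketch does not perform DUST or RepeatMasker filtering.</p>
<preformat>
```python
from typing import List, Tuple

Record = Tuple[str, str]  # (header without the leading marker, sequence)


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA-formatted text into (header, sequence) pairs."""
    records: List[Record] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def select_input_sequences(records: List[Record], max_len: int = 1000,
                           max_seqs: int = 40) -> List[Record]:
    """Keep sequences strictly shorter than max_len, at most max_seqs of them.

    Hypothetical helper: the defaults follow the guidelines in the text;
    keeping the first max_seqs records is an arbitrary illustrative choice.
    """
    short = [(h, s) for h, s in records if max_len > len(s)]
    return short[:max_seqs]
```
</preformat>
<p>A sequence of exactly 1000 bp is dropped by this sketch, because the guideline asks for sequences shorter than 1000 bp.</p>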
<p>MEME is not suited to whole-genome TFBS motif discovery. Owing to their shortness and degeneracy, TFBS motifs become statistically ‘invisible’ in the context of a whole genome. The sensitivity of the search for TFBS motifs can be improved by using a ‘higher-order background sequence model’, but this option is currently available only to users who download the MEME source code and install it locally. Instructions for the installation are available at the MEME website (<ext-link ext-link-type="uri" xlink:href="http://meme.nbcr.net/meme/website/meme-download.html">http://meme.nbcr.net/meme/website/meme-download.html</ext-link>
) by clicking on ‘View MEME man page’; see the documentation for the ‘-bfile’ switch there.</p>
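<p>A higher-order background model can be thought of as a table of how often each short tuple of letters occurs in background sequence. The Python sketch below estimates such tuple frequencies, for tuples of length 1 up to order + 1, from a set of background sequences. The function name is hypothetical, and the exact file layout expected by the ‘-bfile’ switch is not reproduced here; it should be taken from the MEME man page.</p>
<preformat>
```python
from collections import Counter
from typing import Dict, Iterable


def tuple_frequencies(sequences: Iterable[str], order: int = 1) -> Dict[str, float]:
    """Relative frequency of every observed tuple of length 1..order+1.

    Hypothetical helper: frequencies are normalized separately for each
    tuple length, so all tuples of one length sum to 1.
    """
    seqs = [s.upper() for s in sequences]
    freqs: Dict[str, float] = {}
    for k in range(1, order + 2):
        counts: Counter = Counter()
        for s in seqs:
            # Slide a window of width k along the sequence.
            for i in range(len(s) - k + 1):
                counts[s[i:i + k]] += 1
        total = sum(counts.values())
        for tup, n in counts.items():
            freqs[tup] = n / total
    return freqs


# Example with two made-up background sequences and a first-order model.
background = tuple_frequencies(["ACGT", "AAAA"], order=1)
```
</preformat>
<p>Tuples that never occur in the background sequences are simply absent from the result; a real background model would normally be estimated from a much larger sequence set.</p>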
<p>Protein motifs are generally easier to discover owing to the larger size of the protein alphabet and the chemical similarity among groups of amino acids. This allows shorter motifs to be more statistically significant and makes it easier to distinguish functional motifs from statistical artifacts. To use MEME to discover protein motifs, the same basic guidelines apply as with DNA motifs: keep the sequences as short as possible and include as few sequences as possible that are unlikely to contain the motif. Low-complexity regions can be removed from the protein input sequences using the SEG program (<xref ref-type="bibr" rid="b10">10</xref>
).</p>
</sec>
<sec><title>ANALYZING MOTIFS USING THE MEME OUTPUT HYPERLINKS</title>
<p>The MEME HTML output contains buttons making it easy to analyze the motifs it discovers. By clicking on the button labeled ‘Compare PSPM to known motifs in JASPAR database’ following each motif, the DNA motif can be compared to each of the motifs in the JASPAR database (<xref ref-type="bibr" rid="b11">11</xref>
) of known TFBS motifs. Similarly, protein motifs may be compared with protein motifs in the BLOCKS database of protein motifs (<xref ref-type="bibr" rid="b12">12</xref>
) by clicking on the ‘submit BLOCK’ button following each motif on the MEME form. This takes the user to the ‘BLOCKS server’ where clicking on ‘LAMA’ will compare the motif with those in the BLOCKS database. The BLOCKS server also allows users to display protein motifs in many different ways, including LOGOS (<xref ref-type="bibr" rid="b13">13</xref>
) or phylogenetic trees, by clicking on the corresponding buttons on the BLOCKS server form. By clicking on one of the file output formats under Logos, the user is able to obtain a LOGOS diagram similar to that shown in <xref ref-type="fig" rid="fig2">Figure 2</xref>
.</p>
<p>To search sequences for matches to the motifs found by MEME, users can click on the ‘MAST’ button at the top of the MEME output form. This will take the user to the MAST website where they can select the database to search. Since MAST is sequence-oriented, TFBS motifs should only be used to search promoter regions. These are listed in the MAST database pull-down menu as ‘Upstream Sequence Databases’. Currently, only a few organisms are supported. However, users can upload their own database of promoter sequences for searching using MAST. Protein motifs can be used to search any of the sequence databases provided by the MAST website since MAST can search either protein or nucleotide databases with protein motifs. The MAST databases are updated weekly.</p>
</sec>
<sec><title>WEB SERVER AND USER SUPPORT</title>
<p>As of MEME version 3.5, the configuration and installation of MEME (including the web server) are significantly simplified by using Autoconf (<ext-link ext-link-type="uri" xlink:href="http://www.gnu.org/software/autoconf/autoconf.html">http://www.gnu.org/software/autoconf/autoconf.html</ext-link>
) and Automake (<ext-link ext-link-type="uri" xlink:href="http://www.gnu.org/software/automake/automake.html">http://www.gnu.org/software/automake/automake.html</ext-link>
) from the GNU Build System. An installation session for the MEME and MAST web server may be as simple as follows:</p>
<p><monospace>cd meme_3.5.2</monospace>
</p>
<p><monospace>./configure --prefix=$HOME/meme --with-url=http://www.nbcr.net/meme --enable-web</monospace>
</p>
<p><monospace>make</monospace>
</p>
<p><monospace>make test</monospace>
</p>
<p><monospace>make install</monospace>
</p>
<p>Supported platforms now include Linux, Solaris, MacOS X, Cygwin and Irix.</p>
<p>The MEME web server hosted by NBCR is queried by about 800 different users (based on unique email addresses) each month. Usage has been growing steadily since the service was first introduced in 1996. <xref ref-type="fig" rid="fig3">Figure 3</xref>
 shows usage growth at the NBCR server since 2000.</p>
<p>To meet the growing user demand and take advantage of the emerging grid-computing resources (<xref ref-type="bibr" rid="b14">14</xref>
), we have made MEME available for installation on Linux clusters using either the RPM package manager or Rocks. The RPM package manager is a tool for managing software installation on computers running many versions of the Linux operating system. Rocks (<ext-link ext-link-type="uri" xlink:href="http://www.rocksclusters.org">http://www.rocksclusters.org</ext-link>
) is a highly customized toolkit for computational biologists and engineers to build and maintain Linux clusters. The current NBCR MEME web server cluster is built using the MEME roll for Rocks and requires minimal maintenance effort.</p>
<p>MEME and MAST can be downloaded and installed free of charge by academic users via the website (<ext-link ext-link-type="uri" xlink:href="http://meme.nbcr.net/meme/website/meme-download.html">http://meme.nbcr.net/meme/website/meme-download.html</ext-link>
). Approximately 300 users download the MEME/MAST software each month. The MEME support team offers assistance to the MEME and MAST user community through the forum (<ext-link ext-link-type="uri" xlink:href="http://nbcr.net/forum/viewforum.php?f=5">http://nbcr.net/forum/viewforum.php?f=5</ext-link>
) or the mailing list (<email>meme@nbcr.net</email>
). Institutes interested in setting up MEME mirror sites are encouraged to contact us for any assistance.</p>
</sec>
<sec><title>FUTURE DIRECTIONS</title>
<p>To increase the sensitivity of MEME searches, we will add an option in the web server to let the user upload a background sequence model to MEME. We hope to add algorithms for removing low-complexity regions (SEG and DUST) and repeated elements (RepeatMasker) to the MEME website as a convenience to users. These services will also be exposed as web services and integrated using workflow tools developed by NBCR.</p>
<p>We have also planned to add buttons to the MEME output to allow TFBS motifs to be used in searching for <italic>cis</italic>
-regulatory modules via algorithms such as MCAST (<xref ref-type="bibr" rid="b15">15</xref>
). MCAST will be configured to be able to search the same DNA databases as MAST. In conjunction with this, we will add databases of upstream sequences for many additional organisms to the MAST/MCAST websites to facilitate the analysis of TFBS motifs discovered by using MEME.</p>
<p>NBCR has developed a set of tools built on top of the open source software that allows bioinformatics applications to be deployed as Web Services easily (S. Krishnan, B. Stearn, K. Bhatia, W. W. Li and P. Arzberger, manuscript submitted) and leverage the Cyberinfrastructure components transparently (<xref ref-type="bibr" rid="b14">14</xref>
). A prototype has been deployed using MEME as a scientific driver (<xref ref-type="bibr" rid="b16">16</xref>
) that offers the user a dynamic pool of distributed compute resources, a workflow management console and a friendly user interface. This portal will be deployed to the production web server in the future.</p>
</sec>
</body>
<back><ack><p>The authors acknowledge the NBCR award from NCRR, NIH P41 RR08605, for support of the MEME and MAST website. T.L.B. acknowledges a grant from NIH, R01 RR021692-01, for support of continuing development of MEME and related sequence analysis tools. T.L.B. also acknowledges the ARC Centre for Bioinformatics (ACB) (ARC CE0348221) for infrastructure support for the MEME mirror site at the ACB. Funding to pay the Open Access publication charges for this article was provided by the NIH.</p>
<p><italic>Conflict of interest statement</italic>
. None declared.</p>
</ack>
<ref-list><title>REFERENCES</title>
<ref id="b1"><label>1</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bailey</surname>
<given-names>T.L.</given-names>
</name>
<name><surname>Elkan</surname>
<given-names>C.</given-names>
</name>
</person-group>
<article-title>Unsupervised Learning of Multiple Motifs In Biopolymers Using EM</article-title>
<source>Mach. Learn.</source>
<year>1995</year>
<volume>21</volume>
<fpage>51</fpage>
<lpage>80</lpage>
</citation>
</ref>
<ref id="b2"><label>2</label>
<citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Bailey</surname>
<given-names>T.L.</given-names>
</name>
<name><surname>Elkan</surname>
<given-names>C.</given-names>
</name>
</person-group>
<person-group person-group-type="editor"><name><surname>Rawlings</surname>
<given-names>C.</given-names>
</name>
<name><surname>Clark</surname>
<given-names>D.</given-names>
</name>
<name><surname>Altman</surname>
<given-names>R.</given-names>
</name>
<name><surname>Hunter</surname>
<given-names>L.</given-names>
</name>
<name><surname>Lengauer</surname>
<given-names>T.</given-names>
</name>
<name><surname>Wodak</surname>
<given-names>S.</given-names>
</name>
</person-group>
<article-title>The value of prior knowledge in discovering motifs with MEME</article-title>
<year>1995</year>
<conf-name>Proceedings of the Third International Conference on Intelligent Systems for Molecular Biology, July</conf-name>
<publisher-loc>Menlo Park, CA</publisher-loc>
<publisher-name>AAAI Press</publisher-name>
<fpage>21</fpage>
<lpage>29</lpage>
</citation>
</ref>
<ref id="b3"><label>3</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lyons</surname>
<given-names>T.J.</given-names>
</name>
<name><surname>Gasch</surname>
<given-names>A.P.</given-names>
</name>
<name><surname>Gaither</surname>
<given-names>L.A.</given-names>
</name>
<name><surname>Botstein</surname>
<given-names>D.</given-names>
</name>
<name><surname>Brown</surname>
<given-names>P.O.</given-names>
</name>
<name><surname>Eide</surname>
<given-names>D.J.</given-names>
</name>
</person-group>
<article-title>Genome-wide characterization of the Zap1p zinc-responsive regulon in yeast</article-title>
<source>Proc. Natl Acad. Sci. USA</source>
<year>2000</year>
<volume>97</volume>
<fpage>7957</fpage>
<lpage>7962</lpage>
<pub-id pub-id-type="pmid">10884426</pub-id>
</citation>
</ref>
<ref id="b4"><label>4</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fang</surname>
<given-names>J.</given-names>
</name>
<name><surname>Haasl</surname>
<given-names>R.J.</given-names>
</name>
<name><surname>Dong</surname>
<given-names>Y.</given-names>
</name>
<name><surname>Lushington</surname>
<given-names>G.H.</given-names>
</name>
</person-group>
<article-title>Discover protein sequence signatures from protein-protein interaction data</article-title>
<source>BMC Bioinformatics</source>
<year>2005</year>
<volume>6</volume>
<fpage>1</fpage>
<lpage>8</lpage>
<pub-id pub-id-type="pmid">15631638</pub-id>
</citation>
</ref>
<ref id="b5"><label>5</label>
<citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Bailey</surname>
<given-names>T.L.</given-names>
</name>
<name><surname>Elkan</surname>
<given-names>C.</given-names>
</name>
</person-group>
<person-group person-group-type="editor"><name><surname>Altman</surname>
<given-names>R.B.</given-names>
</name>
<name><surname>Brutlag</surname>
<given-names>D.L.</given-names>
</name>
<name><surname>Karp</surname>
<given-names>P.D.</given-names>
</name>
<name><surname>Lathrop</surname>
<given-names>R.H.</given-names>
</name>
<name><surname>Searls</surname>
<given-names>D.B.</given-names>
</name>
</person-group>
<article-title>Fitting a mixture model by expectation maximization to discover motifs in biopolymers</article-title>
<year>1994</year>
<conf-name>Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, August</conf-name>
<publisher-loc>Menlo Park, CA</publisher-loc>
<publisher-name>AAAI Press</publisher-name>
<fpage>28</fpage>
<lpage>36</lpage>
</citation>
</ref>
<ref id="b6"><label>6</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bailey</surname>
<given-names>T.L.</given-names>
</name>
<name><surname>Gribskov</surname>
<given-names>M.</given-names>
</name>
</person-group>
<article-title>Combining evidence using <italic>P</italic>
-values: application to sequence homology searches</article-title>
<source>Bioinformatics</source>
<year>1998</year>
<volume>14</volume>
<fpage>48</fpage>
<lpage>54</lpage>
<pub-id pub-id-type="pmid">9520501</pub-id>
</citation>
</ref>
<ref id="b7"><label>7</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tompa</surname>
<given-names>M.</given-names>
</name>
<name><surname>Li</surname>
<given-names>N.</given-names>
</name>
<name><surname>Bailey</surname>
<given-names>T.L.</given-names>
</name>
<name><surname>Church</surname>
<given-names>G.M.</given-names>
</name>
<name><surname>De Moor</surname>
<given-names>B.</given-names>
</name>
<name><surname>Eskin</surname>
<given-names>E.</given-names>
</name>
<name><surname>Favorov</surname>
<given-names>A.V.</given-names>
</name>
<name><surname>Frith</surname>
<given-names>M.C.</given-names>
</name>
<name><surname>Fu</surname>
<given-names>Y.</given-names>
</name>
<name><surname>Kent</surname>
<given-names>W.J.</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Assessing Computational Tools for the Discovery of Transcription Factor Binding Sites</article-title>
<source>Nat. Biotechnol.</source>
<year>2005</year>
<volume>23</volume>
<fpage>137</fpage>
<lpage>147</lpage>
<pub-id pub-id-type="pmid">15637633</pub-id>
</citation>
</ref>
<ref id="b8"><label>8</label>
<citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Pevzner</surname>
<given-names>P.A.</given-names>
</name>
<name><surname>Sze</surname>
<given-names>S.H.</given-names>
</name>
</person-group>
<person-group person-group-type="editor"><name><surname>Bourne</surname>
<given-names>P.E.</given-names>
</name>
<name><surname>Gribskov</surname>
<given-names>M.</given-names>
</name>
<name><surname>Altman</surname>
<given-names>R.B.</given-names>
</name>
<name><surname>Jensen</surname>
<given-names>N.</given-names>
</name>
<name><surname>Hope</surname>
<given-names>D.</given-names>
</name>
<name><surname>Lengauer</surname>
<given-names>T.</given-names>
</name>
<name><surname>Mitchell</surname>
<given-names>J.C.</given-names>
</name>
<name><surname>Scheeff</surname>
<given-names>E.D.</given-names>
</name>
<name><surname>Smith</surname>
<given-names>C.</given-names>
</name>
<name><surname>Strande</surname>
<given-names>S.</given-names>
</name>
<name><surname>Weissig</surname>
<given-names>H.</given-names>
</name>
</person-group>
<article-title>Combinatorial approaches to finding subtle signals in DNA sequences</article-title>
<year>2000</year>
<conf-name>Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology, August</conf-name>
<publisher-loc>Menlo Park, CA</publisher-loc>
<publisher-name>AAAI Press</publisher-name>
<fpage>269</fpage>
<lpage>278</lpage>
</citation>
</ref>
<ref id="b9"><label>9</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname>
<given-names>J.</given-names>
</name>
<name><surname>Li</surname>
<given-names>B.</given-names>
</name>
<name><surname>Kihara</surname>
<given-names>D.</given-names>
</name>
</person-group>
<article-title>Limitations and potentials of current motif discovery algorithms</article-title>
<source>Nucleic Acids Res.</source>
<year>2005</year>
<volume>33</volume>
<fpage>4899</fpage>
<lpage>4913</lpage>
<pub-id pub-id-type="pmid">16284194</pub-id>
</citation>
</ref>
<ref id="b10"><label>10</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wootton</surname>
<given-names>J.C.</given-names>
</name>
<name><surname>Federhen</surname>
<given-names>S.</given-names>
</name>
</person-group>
<article-title>Analysis of compositionally biased regions in sequence databases</article-title>
<source>Methods Enzymol.</source>
<year>1996</year>
<volume>266</volume>
<fpage>554</fpage>
<lpage>571</lpage>
<pub-id pub-id-type="pmid">8743706</pub-id>
</citation>
</ref>
<ref id="b11"><label>11</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sandelin</surname>
<given-names>A.</given-names>
</name>
<name><surname>Alkema</surname>
<given-names>W.</given-names>
</name>
<name><surname>Engström</surname>
<given-names>P.</given-names>
</name>
<name><surname>Wasserman</surname>
<given-names>W.W.</given-names>
</name>
<name><surname>Lenhard</surname>
<given-names>B.</given-names>
</name>
</person-group>
<article-title>JASPAR: an open-access database for eukaryotic transcription factor binding profiles</article-title>
<source>Nucleic Acids Res.</source>
<year>2004</year>
<volume>32</volume>
<fpage>D91</fpage>
<lpage>D94</lpage>
<pub-id pub-id-type="pmid">14681366</pub-id>
</citation>
</ref>
<ref id="b12"><label>12</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Henikoff</surname>
<given-names>J.G.</given-names>
</name>
<name><surname>Pietrokovski</surname>
<given-names>S.</given-names>
</name>
<name><surname>Henikoff</surname>
<given-names>S.</given-names>
</name>
</person-group>
<article-title>Recent enhancements to the blocks database servers</article-title>
<source>Nucleic Acids Res.</source>
<year>1997</year>
<volume>25</volume>
<fpage>222</fpage>
<lpage>225</lpage>
<pub-id pub-id-type="pmid">9016540</pub-id>
</citation>
</ref>
<ref id="b13"><label>13</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schneider</surname>
<given-names>T.D.</given-names>
</name>
<name><surname>Stephens</surname>
<given-names>R.M.</given-names>
</name>
</person-group>
<article-title>Sequence logos: a new way to display consensus sequences</article-title>
<source>Nucleic Acids Res.</source>
<year>1990</year>
<volume>18</volume>
<fpage>6097</fpage>
<lpage>6100</lpage>
<pub-id pub-id-type="pmid">2172928</pub-id>
</citation>
</ref>
<ref id="b14"><label>14</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Foster</surname>
<given-names>I.</given-names>
</name>
<name><surname>Kesselman</surname>
<given-names>C.</given-names>
</name>
</person-group>
<source>The Grid 2: Blueprint for a New Computing Infrastructure</source>
<year>2004</year>
<edition>2nd edn</edition>
<publisher-loc>San Francisco, CA</publisher-loc>
<publisher-name>Morgan Kaufmann Publishers, Inc.</publisher-name>
</citation>
</ref>
<ref id="b15"><label>15</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bailey</surname>
<given-names>T.L.</given-names>
</name>
<name><surname>Noble</surname>
<given-names>W.S.</given-names>
</name>
</person-group>
<article-title>Searching for statistically significant regulatory modules</article-title>
<source>Bioinformatics</source>
<year>2003</year>
<volume>19</volume>
<issue>Suppl 2</issue>
<fpage>II16</fpage>
<lpage>II25</lpage>
<pub-id pub-id-type="pmid">14534166</pub-id>
</citation>
</ref>
<ref id="b16"><label>16</label>
<citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Li</surname>
<given-names>W.W.</given-names>
</name>
<name><surname>Krishnan</surname>
<given-names>S.</given-names>
</name>
<name><surname>Mueller</surname>
<given-names>K.</given-names>
</name>
<name><surname>Misleh</surname>
<given-names>C.</given-names>
</name>
<name><surname>Arzberger</surname>
<given-names>P.</given-names>
</name>
</person-group>
<person-group person-group-type="editor"><name><surname>Bu Sung</surname>
<given-names>F.L.</given-names>
</name>
<name><surname>Abramson</surname>
<given-names>D.</given-names>
</name>
<name><surname>Cai</surname>
<given-names>W.</given-names>
</name>
<name><surname>Graupner</surname>
<given-names>S.</given-names>
</name>
<name><surname>Jin</surname>
<given-names>H.</given-names>
</name>
<name><surname>Sloot</surname>
<given-names>P.</given-names>
</name>
</person-group>
<article-title>Building cyberinfrastructure for bioinformatics using service oriented architecture</article-title>
<year>2006</year>
<conf-name>Proceedings of the IEEE International Symposium on Cluster Computing and the Grid, May</conf-name>
<publisher-loc>USA</publisher-loc>
<publisher-name>IEEE Press</publisher-name>
<comment>(in press)</comment>
</citation>
</ref>
</ref-list>
<sec sec-type="display-objects"><title>Figures and Tables</title>
<fig id="fig1" position="float"><label>Figure 1</label>
<caption><p>Sample MEME output. This portion of a MEME HTML output form shows a protein motif that MEME has discovered in the input sequences. The sites identified as belonging to the motif are indicated, and above them are the ‘consensus’ of the motif and a color-coded bar graph showing the conservation of each position in the motif. Some of the hyperlinked buttons that allow the motif to be viewed and analyzed in other ways can be seen at the bottom of the screenshot.</p>
</caption>
<graphic xlink:href="gkl198f1"></graphic>
</fig>
<fig id="fig2" position="float"><label>Figure 2</label>
<caption><p>LOGO of protein motif. LOGOS are a visualization tool for motifs. The height of a letter indicates its relative frequency at the given position (<italic>x</italic>-axis) in the motif.</p>
</caption>
<graphic xlink:href="gkl198f2"></graphic>
</fig>
<fig id="fig3" position="float"><label>Figure 3</label>
<caption><p>Usage of MEME at the NBCR web server. The plot shows the number of different users submitting jobs to the NBCR MEME web server each month since December 2000. Usage figures for March 2006 include up to March 20 only.</p>
</caption>
<graphic xlink:href="gkl198f3"></graphic>
</fig>
</sec>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 0005829 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 0005829 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024

	Cyberinfrastructure exploration server
	Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
