Exploration server on the orange tree

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Internal identifier: 000958 ( Pmc/Corpus ); previous: 0009579; next: 0009590

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Identification of novel conserved peptide uORF homology groups in Arabidopsis and rice reveals ancient eukaryotic origin of select groups and preferential association with transcription factor-encoding genes</title>
<author>
<name sortKey="Hayden, Celine A" sort="Hayden, Celine A" uniqKey="Hayden C" first="Celine A" last="Hayden">Celine A. Hayden</name>
<affiliation>
<nlm:aff id="I1">Department of Plant Sciences, University of Arizona, Tucson, AZ 85721-0036, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Jorgensen, Richard A" sort="Jorgensen, Richard A" uniqKey="Jorgensen R" first="Richard A" last="Jorgensen">Richard A. Jorgensen</name>
<affiliation>
<nlm:aff id="I1">Department of Plant Sciences, University of Arizona, Tucson, AZ 85721-0036, USA</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">17663791</idno>
<idno type="pmc">2075485</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2075485</idno>
<idno type="RBID">PMC:2075485</idno>
<idno type="doi">10.1186/1741-7007-5-32</idno>
<date when="2007">2007</date>
<idno type="wicri:Area/Pmc/Corpus">000958</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Identification of novel conserved peptide uORF homology groups in Arabidopsis and rice reveals ancient eukaryotic origin of select groups and preferential association with transcription factor-encoding genes</title>
<author>
<name sortKey="Hayden, Celine A" sort="Hayden, Celine A" uniqKey="Hayden C" first="Celine A" last="Hayden">Celine A. Hayden</name>
<affiliation>
<nlm:aff id="I1">Department of Plant Sciences, University of Arizona, Tucson, AZ 85721-0036, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Jorgensen, Richard A" sort="Jorgensen, Richard A" uniqKey="Jorgensen R" first="Richard A" last="Jorgensen">Richard A. Jorgensen</name>
<affiliation>
<nlm:aff id="I1">Department of Plant Sciences, University of Arizona, Tucson, AZ 85721-0036, USA</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Biology</title>
<idno type="eISSN">1741-7007</idno>
<imprint>
<date when="2007">2007</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>Upstream open reading frames (uORFs) can mediate translational control over the largest, or major ORF (mORF) in response to starvation, polyamine concentrations, and sucrose concentrations. One plant uORF with conserved peptide sequences has been shown to exert this control in an amino acid sequence-dependent manner but generally it is not clear what kinds of genes are regulated, or how extensively this mechanism is invoked in a given genome.</p>
</sec>
<sec>
<title>Results</title>
<p>By comparing full-length cDNA sequences from Arabidopsis and rice we identified 26 distinct homology groups of conserved peptide uORFs, only three of which have been reported previously. Pairwise
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
analysis showed that purifying selection had acted on nearly all conserved peptide uORFs and their associated mORFs. Functions of predicted mORF proteins could be inferred for 16 homology groups and many of these proteins appear to have a regulatory function, including 6 transcription factors, 5 signal transduction factors, 3 developmental signal molecules, a homolog of translation initiation factor eIF5, and a RING finger protein. Transcription factors are clearly overrepresented in this data set when compared to the frequency calculated for the entire genome (p = 1.2 × 10
<sup>-7</sup>
). Duplicate gene pairs arising from a whole genome duplication (ohnologs) with a conserved uORF are much more likely to have been retained in Arabidopsis (
<italic>Arabidopsis thaliana</italic>
) than are ohnologs of other genes (39% vs 14% of ancestral genes, p = 5 × 10
<sup>-3</sup>
). Two uORF groups were found in animals, indicating an ancient origin of these putative regulatory elements.</p>
</sec>
<sec>
<title>Conclusion</title>
<p>Conservation of uORF amino acid sequence, association with homologous mORFs over long evolutionary time periods, preferential retention after whole genome duplications, and preferential association with mORFs coding for transcription factors suggest that the conserved peptide uORFs identified in this study are strong candidates for translational controllers of regulatory genes.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Biol</journal-id>
<journal-title>BMC Biology</journal-title>
<issn pub-type="epub">1741-7007</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">17663791</article-id>
<article-id pub-id-type="pmc">2075485</article-id>
<article-id pub-id-type="publisher-id">1741-7007-5-32</article-id>
<article-id pub-id-type="doi">10.1186/1741-7007-5-32</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Identification of novel conserved peptide uORF homology groups in Arabidopsis and rice reveals ancient eukaryotic origin of select groups and preferential association with transcription factor-encoding genes</article-title>
</title-group>
<contrib-group>
<contrib id="A1" contrib-type="author">
<name>
<surname>Hayden</surname>
<given-names>Celine A</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>chayden@email.arizona.edu</email>
</contrib>
<contrib id="A2" corresp="yes" contrib-type="author">
<name>
<surname>Jorgensen</surname>
<given-names>Richard A</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>raj@ag.arizona.edu</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
Department of Plant Sciences, University of Arizona, Tucson, AZ 85721-0036, USA</aff>
<pub-date pub-type="collection">
<year>2007</year>
</pub-date>
<pub-date pub-type="epub">
<day>30</day>
<month>7</month>
<year>2007</year>
</pub-date>
<volume>5</volume>
<fpage>32</fpage>
<lpage>32</lpage>
<ext-link ext-link-type="uri" xlink:href="http://www.biomedcentral.com/1741-7007/5/32"></ext-link>
<history>
<date date-type="received">
<day>22</day>
<month>1</month>
<year>2007</year>
</date>
<date date-type="accepted">
<day>30</day>
<month>7</month>
<year>2007</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2007 Hayden and Jorgensen; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2007</copyright-year>
<copyright-holder>Hayden and Jorgensen; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0"></ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</p>
</license>
</permissions>
<abstract>
<sec>
<title>Background</title>
<p>Upstream open reading frames (uORFs) can mediate translational control over the largest, or major ORF (mORF) in response to starvation, polyamine concentrations, and sucrose concentrations. One plant uORF with conserved peptide sequences has been shown to exert this control in an amino acid sequence-dependent manner but generally it is not clear what kinds of genes are regulated, or how extensively this mechanism is invoked in a given genome.</p>
</sec>
<sec>
<title>Results</title>
<p>By comparing full-length cDNA sequences from Arabidopsis and rice we identified 26 distinct homology groups of conserved peptide uORFs, only three of which have been reported previously. Pairwise
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
analysis showed that purifying selection had acted on nearly all conserved peptide uORFs and their associated mORFs. Functions of predicted mORF proteins could be inferred for 16 homology groups and many of these proteins appear to have a regulatory function, including 6 transcription factors, 5 signal transduction factors, 3 developmental signal molecules, a homolog of translation initiation factor eIF5, and a RING finger protein. Transcription factors are clearly overrepresented in this data set when compared to the frequency calculated for the entire genome (p = 1.2 × 10
<sup>-7</sup>
). Duplicate gene pairs arising from a whole genome duplication (ohnologs) with a conserved uORF are much more likely to have been retained in Arabidopsis (
<italic>Arabidopsis thaliana</italic>
) than are ohnologs of other genes (39% vs 14% of ancestral genes, p = 5 × 10
<sup>-3</sup>
). Two uORF groups were found in animals, indicating an ancient origin of these putative regulatory elements.</p>
</sec>
<sec>
<title>Conclusion</title>
<p>Conservation of uORF amino acid sequence, association with homologous mORFs over long evolutionary time periods, preferential retention after whole genome duplications, and preferential association with mORFs coding for transcription factors suggest that the conserved peptide uORFs identified in this study are strong candidates for translational controllers of regulatory genes.</p>
</sec>
</abstract>
</article-meta>
</front>
<body>
<sec>
<title>Background</title>
<p>Upstream open reading frames (uORFs) are small open reading frames found in the 5' UTR of a mature mRNA, and can mediate translational regulation of the largest, or major, ORF (mORF). Regulation by uORFs has been studied in several individual transcripts demonstrating the importance of uORFs in such processes as polyamine production [
<xref ref-type="bibr" rid="B1">1</xref>
], amino acid production [
<xref ref-type="bibr" rid="B2">2</xref>
,
<xref ref-type="bibr" rid="B3">3</xref>
], and sucrose response [
<xref ref-type="bibr" rid="B4">4</xref>
], but the biological effect of uORFs in the vast majority of transcripts of the genome is still unclear. Upstream start codons (uAUGs) occur in 20–30% of yeast, mammalian, and plant transcript 5' UTRs [
<xref ref-type="bibr" rid="B5">5</xref>
-
<xref ref-type="bibr" rid="B7">7</xref>
]; therefore, potentially thousands of genes may be regulated in this manner.</p>
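<p>As an illustration of the definition above, the following minimal Python sketch scans the 5' UTR of a transcript for ATG-initiated reading frames that end at an in-frame stop codon. The function name find_uorfs, the min_codons parameter, and the toy sequence mentioned below are assumptions made for illustration only; they are not taken from the study.</p>
<preformat>
```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(transcript, morf_start, min_codons=2):
    """Return (start, end) pairs for ORFs whose ATG lies in the 5' UTR.

    start is the 0-based index of the A in ATG; end is the index just
    past the stop codon. morf_start is the 0-based index of the mORF ATG.
    min_codons (an assumed, illustrative threshold) counts sense codons,
    including the ATG itself.
    """
    seq = transcript.upper().replace("U", "T")  # accept RNA or DNA letters
    uorfs = []
    # The whole ATG must sit upstream of the mORF start codon.
    for start in range(0, morf_start - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this upstream ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons >= min_codons:
                    uorfs.append((start, pos + 3))
                break  # the first in-frame stop ends this reading frame
    return uorfs
```
</preformat>
<p>For the toy transcript GGATGAAATAGCCATGGCTTAA with the mORF start codon at index 13, the sketch returns [(2, 11)], i.e. one uORF (ATG AAA TAG) in the 5' UTR. The stop codon is allowed to fall anywhere in the transcript, so uORFs that overlap the mORF would also be reported.</p>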
<p>The majority of characterized uORFs appear to act in an amino acid sequence-independent manner, regulating mORF translation by the uORF start codon nucleotide context, by the uORF length, or by the distance between the uORF stop codon and the mORF start codon, rather than by uORF-encoded peptides [
<xref ref-type="bibr" rid="B8">8</xref>
-
<xref ref-type="bibr" rid="B11">11</xref>
]. Some uORFs, however, do rely on peptide sequences to mediate translational regulation of the associated mORF, but few examples have been identified and characterized to date. In fungi and animals, a few genes have been shown to contain uORFs whose amino acid sequences are similar between two or more species [
<xref ref-type="bibr" rid="B12">12</xref>
-
<xref ref-type="bibr" rid="B17">17</xref>
], but only two cases,
<italic>CPA1 </italic>
[
<xref ref-type="bibr" rid="B3">3</xref>
] and
<italic>SAMDC1</italic>
/
<italic>AdoMetDC1 </italic>
[
<xref ref-type="bibr" rid="B18">18</xref>
], have demonstrated uORF sequence-dependent regulation. In plants two groups of genes, S-Adenosylmethionine decarboxylases (AdoMetDCs; EC 4.1.1.50) and group S basic region leucine zipper (bZIP) transcription factors, have been shown to contain uORFs with similar amino acids between monocots and dicots [
<xref ref-type="bibr" rid="B19">19</xref>
,
<xref ref-type="bibr" rid="B20">20</xref>
]. In the former group, mORF translational regulation is dependent on the sequence of the uORF peptide [
<xref ref-type="bibr" rid="B1">1</xref>
,
<xref ref-type="bibr" rid="B4">4</xref>
], and overexpression of the mORF in either group results in stunted or lethal phenotypes, suggesting that these genes play a critical role in growth and/or development. Indeed, AdoMetDC is required for the synthesis of polyamines, molecules that are implicated in essential plant functions such as cell division, embryogenesis, leaf, root, and flower development, and stress responses [
<xref ref-type="bibr" rid="B21">21</xref>
,
<xref ref-type="bibr" rid="B22">22</xref>
].</p>
<p>In general, it has been difficult to carry out genome-wide surveys of conserved peptide uORFs due to poor annotation of 5' UTRs. The availability of expressed sequence tags (ESTs) has improved exon and intron annotation of the genomic sequence, but they are relatively short and often do not predict the entire mRNA molecule, even when several ESTs overlap the same genomic region and can be assembled to predict one transcript. As there are very few introns in yeast transcripts, prediction of uORF conservation has been attempted in
<italic>S. cerevisiae </italic>
by analyzing genomic sequence upstream of predicted mORF start sites [
<xref ref-type="bibr" rid="B23">23</xref>
], but it is still not clear whether these uORFs are truly conserved (i.e., are under negative selection pressures), or are simply undergoing evolutionary drift. With the sequencing of the
<italic>Aspergillus nidulans </italic>
genome, comparison to
<italic>A. fumigatus </italic>
and
<italic>A. oryzae </italic>
has identified 38 uORFs with putatively conserved start and stop codon positions relative to the mORF, 14 of which are conserved in one of
<italic>Neurospora crassa, Fusarium graminearum</italic>
, or
<italic>Magnaporthe grisea </italic>
[
<xref ref-type="bibr" rid="B5">5</xref>
], but the authors did not comment on whether the uORF amino acid sequences are also conserved.</p>
<p>With the emergence of large plant full-length cDNA sequence collections [
<xref ref-type="bibr" rid="B24">24</xref>
-
<xref ref-type="bibr" rid="B26">26</xref>
], it is now possible to adopt a comparative genomics approach to determine the prevalence of conserved amino acid uORFs in the genome and the persistence of these elements throughout eukaryotic evolution. Because rice and Arabidopsis shared a common ancestor 140–200 million years ago (Mya) [
<xref ref-type="bibr" rid="B27">27</xref>
-
<xref ref-type="bibr" rid="B29">29</xref>
], sequence similarity retained over this amount of time provides good candidates for truly conserved peptide uORF sequences. In this study we have used
<italic>Oryza sativa </italic>
(rice) and
<italic>Arabidopsis thaliana </italic>
(Arabidopsis) full-length cDNA sequence collections to estimate the incidence of conserved peptide uORFs in the rice and Arabidopsis genomes, to determine the prevalence of uORFs within regulatory genes, and to compare evolutionary rates for uORFs versus mORFs. By examining more distantly related sequences, we posit an ancient origin for select uORFs and we provide evidence for one mechanism by which uORFs can arise within genes.</p>
</sec>
<sec>
<title>Results</title>
<sec>
<title>Identification of conserved peptide uORFs by comparison of rice and Arabidopsis transcripts</title>
<p>To identify conserved peptide uORFs, we developed "uORF-Finder", a Perl program that first compares the mORF amino acid sequence of each cDNA from one collection with the mORF sequences of another species' collection to identify putative mORF homologs, and then compares the uORFs in the 5' UTRs of the two paired sequences to identify uORFs with conserved amino acid sequences (see Methods). Comparison by uORF-Finder of a corrected set of 34000 full-length cDNA sequences from Arabidopsis with a similar set from rice resulted in the identification of conserved peptide uORFs in 44 Arabidopsis genes and 36 rice genes, which together comprise 19 homology groups based on uORF amino acid similarity (Tables
<xref ref-type="table" rid="T1">1</xref>
,
<xref ref-type="table" rid="T2">2</xref>
,
<xref ref-type="table" rid="T3">3</xref>
; Figures
<xref ref-type="fig" rid="F1">1</xref>
,
<xref ref-type="fig" rid="F2">2</xref>
,
<xref ref-type="fig" rid="F3">3</xref>
,
<xref ref-type="fig" rid="F4">4</xref>
,
<xref ref-type="fig" rid="F5">5</xref>
). All three of the homology groups that had been previously reported were identified by uORF-Finder [
<xref ref-type="bibr" rid="B1">1</xref>
,
<xref ref-type="bibr" rid="B4">4</xref>
]. The other 16 conserved uORFs have not been reported previously. Homologs of these 19 conserved uORF groups also exist in other angiosperm species (Figures
<xref ref-type="fig" rid="F1">1</xref>
,
<xref ref-type="fig" rid="F2">2</xref>
,
<xref ref-type="fig" rid="F3">3</xref>
,
<xref ref-type="fig" rid="F4">4</xref>
,
<xref ref-type="fig" rid="F5">5</xref>
).</p>
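<p>The two-step logic described above (pair cDNAs by mORF similarity, then compare the uORF peptides of each pair) can be sketched roughly as follows. This is not the uORF-Finder program itself, which was written in Perl and is described in Methods; the function names, the cutoff values, and the use of Python's difflib ratio as a crude stand-in for a proper sequence alignment score are all assumptions made for illustration.</p>
<preformat>
```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Crude stand-in for an alignment score: a ratio between 0 and 1."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def conserved_uorf_pairs(collection_a, collection_b,
                         morf_cutoff=0.6, uorf_cutoff=0.6):
    """Find uORF peptide pairs shared by putative mORF homologs.

    Each collection maps a cDNA id to a dict with the keys
    "morf" (mORF peptide) and "uorfs" (list of uORF peptides).
    Both cutoffs are arbitrary, illustrative thresholds.
    """
    hits = []
    for id_a, rec_a in collection_a.items():
        for id_b, rec_b in collection_b.items():
            # Step 1: keep only pairs whose mORF peptides look homologous.
            if similarity(rec_a["morf"], rec_b["morf"]) >= morf_cutoff:
                # Step 2: compare every uORF peptide of one cDNA
                # with every uORF peptide of its partner.
                for u_a in rec_a["uorfs"]:
                    for u_b in rec_b["uorfs"]:
                        if similarity(u_a, u_b) >= uorf_cutoff:
                            hits.append((id_a, id_b, u_a, u_b))
    return hits
```
</preformat>
<p>Restricting the uORF comparison to pairs that already share a similar mORF keeps the search focused on uORFs that have stayed associated with homologous genes, which is the property the comparison is meant to detect. A real analysis would replace the similarity function with a dedicated alignment tool.</p>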
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption>
<p>uORF homology groups and associated mORF molecular function and biological role</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<td align="center">Homology group</td>
<td align="center">mORF: known or probable molecular function/domain</td>
<td align="center">Known or inferred biological process</td>
<td align="center">Source</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" colspan="4">uORF conserved in Arabidopsis and rice</td>
</tr>
<tr>
<td align="center">1</td>
<td align="left">bZIP transcription factor</td>
<td align="left">Sucrose regulation</td>
<td align="left">[89]</td>
</tr>
<tr>
<td align="center">2</td>
<td align="left">bHLH transcription factor</td>
<td align="left">Transcriptional control</td>
<td align="left">[68]</td>
</tr>
<tr>
<td align="center">3</td>
<td align="left">AdoMetDC</td>
<td align="left">Polyamine biosynthesis: developmental regulation</td>
<td align="left">[1]</td>
</tr>
<tr>
<td align="center">4</td>
<td align="left">Unknown; plant-specific</td>
<td align="left">Unknown</td>
<td align="left">BLAST analysis</td>
</tr>
<tr>
<td align="center">5</td>
<td align="left">Ankyrin repeat protein</td>
<td align="left">Unknown</td>
<td align="left">Protein domain analysis*</td>
</tr>
<tr>
<td align="center">6</td>
<td align="left">Amine oxidase</td>
<td align="left">Unknown</td>
<td align="left">Protein domain analysis*</td>
</tr>
<tr>
<td align="center">7</td>
<td align="left">Putative translation initiation factor eIF5</td>
<td align="left">Start codon selection</td>
<td align="left">Protein domain analysis*</td>
</tr>
<tr>
<td align="center">8</td>
<td align="left">Similar to Mic-1</td>
<td align="left">Unknown</td>
<td align="left">BLAST analysis</td>
</tr>
<tr>
<td align="center">9</td>
<td align="left">Unknown, cysteine-rich</td>
<td align="left">Unknown (Possible novel zinc finger?)</td>
<td align="left">CX
<sub>4–7</sub>
CX
<sub>10</sub>
CX
<sub>2</sub>
HX
<sub>5 </sub>
tandem repeats</td>
</tr>
<tr>
<td align="center">10</td>
<td align="left">MAP kinase</td>
<td align="left">Signal transduction</td>
<td align="left">PlantsP database</td>
</tr>
<tr>
<td align="center">11</td>
<td align="left">Trehalose-6-phosphate phosphatase</td>
<td align="left">Trehalose metabolism: developmental regulation</td>
<td align="left">[90]</td>
</tr>
<tr>
<td align="center">12</td>
<td align="left">Unknown</td>
<td align="left">Systemically primed response to pathogens</td>
<td align="left">[91]</td>
</tr>
<tr>
<td align="center">13</td>
<td align="left">Phosphoethanolamine
<italic>N</italic>
-methyltransferase</td>
<td align="left">Phosphocholine biosynthesis</td>
<td align="left">[38]</td>
</tr>
<tr>
<td align="center">14</td>
<td align="left">HDZip class I transcription factor</td>
<td align="left">Transcriptional control; development</td>
<td align="left">[92,93]</td>
</tr>
<tr>
<td align="center">15</td>
<td align="left">bHLH transcription factor</td>
<td align="left">Transcriptional control; responsive to polyamine?</td>
<td align="left">[52,68]</td>
</tr>
<tr>
<td align="center">16</td>
<td align="left">MAP kinase</td>
<td align="left">Signal transduction</td>
<td align="left">PlantsP database [99]</td>
</tr>
<tr>
<td align="center">17</td>
<td align="left">Unknown</td>
<td align="left">Unknown</td>
<td></td>
</tr>
<tr>
<td align="center">18</td>
<td align="left">Transcription co-activator/repressor HsfB1</td>
<td align="left">Mediator of heat shock response</td>
<td align="left">[94,95]</td>
</tr>
<tr>
<td align="center">19</td>
<td align="left">SAUR protein</td>
<td align="left">Mediator of auxin response; calmodulin (CaM) binding</td>
<td align="left">IPR003676; [96]</td>
</tr>
<tr>
<td align="left" colspan="4">uORF conserved in Arabidopsis paralogs</td>
</tr>
<tr>
<td align="center">20</td>
<td align="left">Unknown</td>
<td align="left">Unknown</td>
<td></td>
</tr>
<tr>
<td align="center">21</td>
<td align="left">ERF/AP2 transcription factor</td>
<td align="left">Putative regulator of pathogen resistance</td>
<td align="left">[97,98]</td>
</tr>
<tr>
<td align="center">22</td>
<td align="left">Unknown</td>
<td align="left">Unknown</td>
<td></td>
</tr>
<tr>
<td align="center">23</td>
<td align="left">MAP kinase</td>
<td align="left">Signal transduction</td>
<td align="left">PlantsP database [99]</td>
</tr>
<tr>
<td align="center">24</td>
<td align="left">Unknown</td>
<td align="left">Unknown</td>
<td></td>
</tr>
<tr>
<td align="center">25</td>
<td align="left">Calcium response protein kinase</td>
<td align="left">Ca++/CaM-dependent signal transduction</td>
<td align="left">PlantsP database [100]</td>
</tr>
<tr>
<td align="center">26</td>
<td align="left">RING finger (C3HC4-type zinc finger)</td>
<td align="left">Ubiquitination; mediator of protein degradation</td>
<td align="left">Protein domain analysis*</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>bZIP, basic leucine zipper; bHLH, basic helix-loop-helix; AdoMetDC, S-Adenosylmethionine decarboxylase; Mic-1, colon cancer-associated protein macrophage-inhibitory cytokine 1; MAP kinase, mitogen activated protein kinase; HDZip, homeodomain leucine zipper; ERF/AP2, ethylene response factor/apetala2.</p>
<p>*As determined by InterProScan and NCBI conserved domains search.</p>
</table-wrap-foot>
</table-wrap>
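<p>The cysteine-rich repeat listed for homology group 9 in Table 1 is written in a compact motif notation. Assuming the usual reading of that notation (X stands for any residue and the subscript gives the number of residues), it can be expressed as a regular expression, as in the hedged Python sketch below. The function name count_tandem_repeats is hypothetical and is not part of the study.</p>
<preformat>
```python
import re

# Assumed reading of the Table 1 motif:
# C, 4 to 7 residues, C, 10 residues, C, 2 residues, H, 5 residues.
MOTIF = re.compile(r"C.{4,7}C.{10}C.{2}H.{5}")


def count_tandem_repeats(peptide):
    """Count non-overlapping motif matches, scanning left to right."""
    return len(MOTIF.findall(peptide))
```
</preformat>
<p>A synthetic peptide built from two back-to-back copies of a sequence that fits the pattern yields a count of 2, whereas a peptide without cysteines yields 0.</p>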
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption>
<p>Arabidopsis loci with conserved peptide uORFs identified from Arabidopsis-rice comparison</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<td align="center">Homology group</td>
<td align="center">Locus</td>
<td align="center">Gene Name</td>
<td align="center">mORF description</td>
<td align="center">Gene ontology molecular function</td>
<td align="center">Recent duplicate</td>
</tr>
</thead>
<tbody>
<tr>
<td align="center">1</td>
<td align="left">At2g18160.1</td>
<td align="left">
<italic>GBF5, AtbZIP2</italic>
</td>
<td align="left">Basic leucine zipper (bZIP)</td>
<td align="left">Transcription factor</td>
<td align="left">At4g34590</td>
</tr>
<tr>
<td align="center">1</td>
<td align="left">At4g34590.1</td>
<td align="left">
<italic>GBF6, ATB2, AtbZIP11</italic>
</td>
<td align="left">bZIP</td>
<td align="left">Transcription factor</td>
<td align="left">At2g18160</td>
</tr>
<tr>
<td align="center">1</td>
<td align="left">At3g62420.1
<sup>a</sup>
</td>
<td align="left">
<italic>AtbZIP53</italic>
</td>
<td align="left">bZIP</td>
<td align="left">Transcription factor</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">1</td>
<td align="left">At5g49450.1</td>
<td align="left">
<italic>AtbZIP1</italic>
</td>
<td align="left">bZIP</td>
<td align="left">Transcription factor</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">1</td>
<td align="left">At1g75390.1</td>
<td align="left">
<italic>AtbZIP44</italic>
</td>
<td align="left">bZIP</td>
<td align="left">Transcription factor</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">2</td>
<td align="left">At2g27230.1</td>
<td align="left">
<italic>AtBHLH156</italic>
<sup>b</sup>
</td>
<td align="left">Basic helix-loop-helix (bHLH)</td>
<td align="left">Transcription factor</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">2</td>
<td align="left">At2g31280.1</td>
<td align="left">
<italic>AtBHLH155</italic>
<sup>b</sup>
</td>
<td align="left">bHLH</td>
<td align="left">Transcription factor</td>
<td align="left">At1g06150</td>
</tr>
<tr>
<td align="center">2</td>
<td align="left">At1g06150.1</td>
<td></td>
<td align="left">bHLH</td>
<td align="left">Transcription factor</td>
<td align="left">At2g31280</td>
</tr>
<tr>
<td align="center">3</td>
<td align="left">At3g02470.1</td>
<td align="left">
<italic>AdoMetDC1</italic>
</td>
<td align="left">AdoMetDC</td>
<td align="left">AdoMetDC</td>
<td align="left">At5g15950</td>
</tr>
<tr>
<td align="center">3</td>
<td align="left">At5g15950.1</td>
<td align="left">
<italic>AdoMetDC2</italic>
</td>
<td align="left">AdoMetDC</td>
<td align="left">AdoMetDC</td>
<td align="left">At3g02470</td>
</tr>
<tr>
<td align="center">3</td>
<td align="left">At3g25570.1</td>
<td align="left">
<italic>AdoMetDC3</italic>
</td>
<td align="left">AdoMetDC</td>
<td align="left">AdoMetDC</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">4</td>
<td align="left">At4g25670.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">At5g52550</td>
</tr>
<tr>
<td align="center">4</td>
<td align="left">At4g25690.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">At5g52550
<sup>c</sup>
</td>
</tr>
<tr>
<td align="center">4</td>
<td align="left">At5g52550.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">At4g25670</td>
</tr>
<tr>
<td align="center">5</td>
<td align="left">At5g61230.1</td>
<td></td>
<td align="left">Ankyrin repeat</td>
<td align="left">Protein binding</td>
<td align="left">At5g07840</td>
</tr>
<tr>
<td align="center">5</td>
<td align="left">At5g07840.1</td>
<td></td>
<td align="left">Ankyrin repeat</td>
<td align="left">Protein binding</td>
<td align="left">At5g61230</td>
</tr>
<tr>
<td align="center">6</td>
<td align="left">At2g43020.1</td>
<td></td>
<td align="left">Amine oxidase</td>
<td align="left">Oxidoreductase</td>
<td align="left">At3g59050</td>
</tr>
<tr>
<td align="center">6</td>
<td align="left">At3g59050.1</td>
<td></td>
<td align="left">Amine oxidase</td>
<td align="left">Oxidoreductase</td>
<td align="left">At2g43020</td>
</tr>
<tr>
<td align="center">7</td>
<td align="left">At1g36730.1</td>
<td></td>
<td align="left">Putative eIF-5</td>
<td align="left">Translation initiation factor</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">8</td>
<td align="left">At3g12010.1
<sup>a</sup>
</td>
<td></td>
<td align="left">Similar to Mic-1</td>
<td align="left">Unknown</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">9</td>
<td align="left">At5g09670.1 and .2</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">At5g64550</td>
</tr>
<tr>
<td align="center">9</td>
<td align="left">At5g64550.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">At5g09670</td>
</tr>
<tr>
<td align="center">9</td>
<td align="left">At1g64140.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">10</td>
<td align="left">At5g45430.1</td>
<td align="left">
<italic>AtMPK23</italic>
<sup>d</sup>
</td>
<td align="left">MAP kinase, PPC family 4.5.1</td>
<td align="left">ATP binding, protein kinase</td>
<td align="left">At4g19110
<sup>e</sup>
</td>
</tr>
<tr>
<td align="center">10</td>
<td align="left">At4g19110.1</td>
<td align="left">
<italic>AtMPK22</italic>
<sup>d</sup>
</td>
<td align="left">MAP kinase, PPC family 4.5.1</td>
<td align="left">ATP binding, protein kinase</td>
<td align="left">At5g45430
<sup>e</sup>
</td>
</tr>
<tr>
<td align="center">11</td>
<td align="left">At4g12430.1</td>
<td></td>
<td align="left">TPPase</td>
<td align="left">Catalytic activity</td>
<td align="left">At4g22590</td>
</tr>
<tr>
<td align="center">11</td>
<td align="left">At4g22590.1</td>
<td></td>
<td align="left">TPPase</td>
<td align="left">Catalytic activity</td>
<td align="left">At4g12430</td>
</tr>
<tr>
<td align="center">12</td>
<td align="left">At1g70780.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">At1g23150</td>
</tr>
<tr>
<td align="center">12</td>
<td align="left">At1g23150.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">At1g70780</td>
</tr>
<tr>
<td align="center">13</td>
<td align="left">At3g18000.1</td>
<td align="left">
<italic>XPL1, NMT1, PEAMT1</italic>
</td>
<td align="left">Phosphoethanolamine
<italic>N</italic>
-methyltransferase</td>
<td align="left">Methyltransferase</td>
<td align="left">At1g48600</td>
</tr>
<tr>
<td align="center">13</td>
<td align="left">At1g48600.2</td>
<td align="left">
<italic>NMT2</italic>
</td>
<td align="left">Methyltransferase</td>
<td align="left">Methyltransferase</td>
<td align="left">At3g18000</td>
</tr>
<tr>
<td align="center">13</td>
<td align="left">At1g73600.1</td>
<td align="left">
<italic>NMT3</italic>
</td>
<td align="left">Methyltransferase</td>
<td align="left">Methyltransferase</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">14</td>
<td align="left">At3g01470.1</td>
<td align="left">
<italic>HAT5, HB-1, HD-ZIP-1, ATHB1</italic>
</td>
<td align="left">Homeobox</td>
<td align="left">DNA binding, transcription factor, transcriptional activator</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">15</td>
<td align="left">At1g29950.2</td>
<td align="left">
<italic>AtBHLH144</italic>
<sup>b</sup>
</td>
<td align="left">bHLH</td>
<td align="left">Transcription factor</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">15</td>
<td align="left">At5g50010.1</td>
<td align="left">
<italic>AtBHLH145</italic>
<sup>b</sup>
</td>
<td align="left">bHLH</td>
<td align="left">Transcription factor</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">15</td>
<td align="left">At5g64340.1</td>
<td align="left">
<italic>AtBHLH142</italic>
<sup>b</sup>
,
<italic>SAC51</italic>
</td>
<td align="left">bHLH</td>
<td align="left">Transcription factor</td>
<td align="left">At5g09460</td>
</tr>
<tr>
<td align="center">15</td>
<td align="left">At5g09460.1
<sup>a</sup>
</td>
<td align="left">
<italic>AtBHLH143</italic>
<sup>b</sup>
</td>
<td align="left">bHLH</td>
<td align="left">Transcription factor</td>
<td align="left">At5g64340</td>
</tr>
<tr>
<td align="center">16</td>
<td align="left">At3g51630.1</td>
<td align="left">
<italic>ZIK1, WNK5</italic>
</td>
<td align="left">MAP kinase, PPC family 4.1.5</td>
<td align="left">Protein kinase</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">17</td>
<td align="left">At1g58120.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">17</td>
<td align="left">At3g53400.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">17</td>
<td align="left">At5g03190.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">17</td>
<td align="left">At5g01710.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">18</td>
<td align="left">At4g36990.1</td>
<td align="left">
<italic>AT-HSFB1, ATHSF4</italic>
</td>
<td align="left">Heat shock factor</td>
<td align="left">Transcription factor</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">19</td>
<td align="left">At5g53590.1</td>
<td></td>
<td align="left">SAUR Auxin responsive</td>
<td align="left">Unknown</td>
<td align="left">Not found</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>AdoMetDC, S-Adenosylmethionine decarboxylase; PPC, PlantsP protein kinase classification; TPPase, Trehalose-6-phosphate phosphatase.</p>
<p>
<sup>a</sup>
uORF found upstream of annotated mORF-containing locus (within 2 kb).</p>
<p>
<sup>b</sup>
As designated by Bailey et al [68], nomenclature agreed upon by both Heim et al [69] and Toledo-Ortiz et al [67].</p>
<p>
<sup>c</sup>
At4g25670 and At4g25690 (tandem duplicates) have the same recent retained duplicate (not reported by Blanc and Wolfe).</p>
<p>
<sup>d</sup>
As designated by the PlantsP database [99].</p>
<p>
<sup>e</sup>
Not found in Blanc and Wolfe's initial analysis of ohnologs, but synteny and homology suggest they are retained recent duplicates.</p>
</table-wrap-foot>
</table-wrap>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption>
<p>Rice loci with conserved peptide uORFs identified from Arabidopsis-rice comparison</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<td align="center">Homology group</td>
<td align="center">Locus</td>
<td align="center">mORF description</td>
<td align="center">Gene ontology molecular function</td>
</tr>
</thead>
<tbody>
<tr>
<td align="center">1</td>
<td align="left">LOC_Os02g03960</td>
<td align="left">bZIP</td>
<td align="left">DNA binding, transcription factor</td>
</tr>
<tr>
<td align="center">1</td>
<td align="left">LOC_Os09g13570</td>
<td align="left">bZIP</td>
<td align="left">DNA binding, transcription factor</td>
</tr>
<tr>
<td align="center">1</td>
<td align="left">LOC_Os05g03860</td>
<td align="left">bZIP</td>
<td align="left">DNA binding, transcription factor</td>
</tr>
<tr>
<td align="center">1</td>
<td align="left">LOC_Os03g19370</td>
<td align="left">bZIP</td>
<td align="left">DNA binding, transcription factor</td>
</tr>
<tr>
<td align="center">1</td>
<td align="left">LOC_Os12g37410</td>
<td align="left">bZIP</td>
<td align="left">DNA binding, transcription factor</td>
</tr>
<tr>
<td align="center">2</td>
<td align="left">LOC_Os12g06330</td>
<td align="left">bHLH</td>
<td align="left">Transcription factor</td>
</tr>
<tr>
<td align="center">3</td>
<td align="left">LOC_Os02g39790</td>
<td align="left">AdoMetDC</td>
<td align="left">AdoMetDC activity</td>
</tr>
<tr>
<td align="center">3</td>
<td align="left">LOC_Os04g42090</td>
<td align="left">AdoMetDC</td>
<td align="left">AdoMetDC activity</td>
</tr>
<tr>
<td align="center">3</td>
<td align="left">LOC_Os09g25620</td>
<td align="left">AdoMetDC</td>
<td align="left">AdoMetDC activity</td>
</tr>
<tr>
<td align="center">4</td>
<td align="left">LOC_Os02g01360</td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
</tr>
<tr>
<td align="center">5</td>
<td align="left">LOC_Os02g01240, 133165–133284*</td>
<td align="left">Ankyrin repeat</td>
<td align="left">Protein binding, Acyl CoA binding</td>
</tr>
<tr>
<td align="center">6</td>
<td align="left">LOC_Os04g53190, 31234580–31234757*</td>
<td align="left">Amine oxidase</td>
<td align="left">Amine oxidase</td>
</tr>
<tr>
<td align="center">7</td>
<td align="left">LOC_Os09g15770</td>
<td align="left">IF2B and IF5 domains</td>
<td align="left">Translation initiation</td>
</tr>
<tr>
<td align="center">7</td>
<td align="left">LOC_Os06g48350</td>
<td align="left">IF2B and IF5 domains</td>
<td align="left">Translation initiation</td>
</tr>
<tr>
<td align="center">8</td>
<td align="left">LOC_Os10g26140</td>
<td align="left">Similar to Mic-1</td>
<td align="left">Unknown</td>
</tr>
<tr>
<td align="center">9</td>
<td align="left">LOC_Os04g38520</td>
<td align="left">Expressed transcript</td>
<td align="left">Transcription factor</td>
</tr>
<tr>
<td align="center">9</td>
<td align="left">LOC_Os02g36590, 22043438–22043536*</td>
<td align="left">Expressed transcript</td>
<td align="left">Transcription factor</td>
</tr>
<tr>
<td align="center">9</td>
<td align="left">LOC_Os01g43370</td>
<td align="left">Expressed transcript</td>
<td align="left">Transcription factor</td>
</tr>
<tr>
<td align="center">9</td>
<td align="left">LOC_Os02g15880, 8987945–8988028*</td>
<td align="left">Expressed transcript</td>
<td align="left">Transcription factor</td>
</tr>
<tr>
<td align="center">10</td>
<td align="left">LOC_Os06g02550</td>
<td align="left">Protein kinase</td>
<td align="left">Kinase activity</td>
</tr>
<tr>
<td align="center">10</td>
<td align="left">LOC_Os02g47220, 28767408–28767530*</td>
<td align="left">Protein kinase</td>
<td align="left">Kinase activity</td>
</tr>
<tr>
<td align="center">11</td>
<td align="left">LOC_Os02g44230</td>
<td align="left">TPPase</td>
<td align="left">Trehalose phosphatase</td>
</tr>
<tr>
<td align="center">11</td>
<td align="left">LOC_Os10g40550</td>
<td align="left">TPPase</td>
<td align="left">Trehalose phosphatase</td>
</tr>
<tr>
<td align="center">12</td>
<td align="left">LOC_Os02g21920</td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
</tr>
<tr>
<td align="center">13</td>
<td align="left">LOC_Os01g50030</td>
<td align="left">Methyltransferase</td>
<td align="left">Phosphoethanolamine
<italic>N</italic>
-methyltransferase activity</td>
</tr>
<tr>
<td align="center">13</td>
<td align="left">LOC_Os05g47540</td>
<td align="left">Methyltransferase</td>
<td align="left">Phosphoethanolamine
<italic>N</italic>
-methyltransferase activity</td>
</tr>
<tr>
<td align="center">14</td>
<td align="left">LOC_Os08g32080, 19755174–19755260*</td>
<td align="left">Homeobox</td>
<td align="left">DNA binding, transcription factor, protein binding</td>
</tr>
<tr>
<td align="center">15</td>
<td align="left">LOC_Os02g21090</td>
<td align="left">bHLH</td>
<td align="left">Transcription factor</td>
</tr>
<tr>
<td align="center">15</td>
<td align="left">LOC_Os01g43680, 25011025–25012089*</td>
<td align="left">bHLH</td>
<td align="left">Transcription factor</td>
</tr>
<tr>
<td align="center">15</td>
<td align="left">LOC_Os03g39432, 21870203–21870427* (LOC_Os03g39432 v.4 TIGR annotation)</td>
<td align="left">bHLH</td>
<td align="left">Transcription factor</td>
</tr>
<tr>
<td align="center">15</td>
<td align="left">LOC_Os03g27390</td>
<td align="left">bHLH</td>
<td align="left">Unknown</td>
</tr>
<tr>
<td align="center">16</td>
<td align="left">LOC_Os11g02300</td>
<td align="left">Protein kinase</td>
<td align="left">Protein kinase</td>
</tr>
<tr>
<td align="center">17</td>
<td align="left">LOC_Os07g42830, 25650516–25650623* (LOC_Os07g42834 v.4 TIGR annotation)</td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
</tr>
<tr>
<td align="center">17</td>
<td align="left">LOC_Os02g52300</td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
</tr>
<tr>
<td align="center">18</td>
<td align="left">LOC_Os09g28350 (LOC_Os09g28354 v.4 TIGR annotation)</td>
<td align="left">Heat shock factor</td>
<td align="left">DNA binding, transcription factor</td>
</tr>
<tr>
<td align="center">19</td>
<td align="left">LOC_Os10g36700 (LOC_Os10g36699 v.4 TIGR annotation)</td>
<td align="left">Auxin responsive</td>
<td align="left">Unknown</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>All locus identifiers based on version 3 TIGR pseudomolecule assembly except where noted. AdoMetDC, S-Adenosylmethionine decarboxylase; TPPase, Trehalose-6-phosphate phosphatase.</p>
<p>*Locus numbers indicate mORF location, and coordinates indicate uORF location in intergenic region on the same chromosome.</p>
</table-wrap-foot>
</table-wrap>
<fig position="float" id="F1">
<label>Figure 1</label>
<caption>
<p>
<bold>Alignments of plant uORF homology groups 1–4</bold>
. Plant sequences were aligned using ClustalW v. 1.82 and displayed using Jalview. See main text for abbreviated species names and GenBank accession numbers, cDNA clone numbers, or genome identifiers.</p>
</caption>
<graphic xlink:href="1741-7007-5-32-1"></graphic>
</fig>
<fig position="float" id="F2">
<label>Figure 2</label>
<caption>
<p>
<bold>Alignments of plant uORF homology groups 5–7</bold>
. Details as in Figure 1.</p>
</caption>
<graphic xlink:href="1741-7007-5-32-2"></graphic>
</fig>
<fig position="float" id="F3">
<label>Figure 3</label>
<caption>
<p>
<bold>Alignments of plant uORF homology groups 8–11</bold>
. Details as in Figure 1. Decimal places in the group number indicate multiple conserved uORFs in a given 5' UTR.</p>
</caption>
<graphic xlink:href="1741-7007-5-32-3"></graphic>
</fig>
<fig position="float" id="F4">
<label>Figure 4</label>
<caption>
<p>
<bold>Alignments of plant uORF homology groups 12–15.1</bold>
. Details as in Figure 1. Decimal places in the group number indicate multiple conserved uORFs in a given 5' UTR.</p>
</caption>
<graphic xlink:href="1741-7007-5-32-4"></graphic>
</fig>
<fig position="float" id="F5">
<label>Figure 5</label>
<caption>
<p>
<bold>Alignments of plant uORF homology groups 15.2–19</bold>
. Details as in Figure 1. Decimal places in the group number indicate multiple conserved uORFs in a given 5' UTR.</p>
</caption>
<graphic xlink:href="1741-7007-5-32-5"></graphic>
</fig>
</sec>
<sec>
<title>Comparison of Arabidopsis homologs detects additional conserved uORFs</title>
<p>uORFs that are too divergent to be detected in a rice-Arabidopsis comparison could conceivably be detected in ohnologs, which are homologous genes arising by whole-genome duplication (WGD) [
<xref ref-type="bibr" rid="B30">30</xref>
], and in paralogs, which are homologous genes arising from segmental or tandem duplication. A modified version of uORF-Finder compared each full-length cDNA to all other cDNAs in the same collection (see Methods) and identified seven additional conserved uORF homology groups (Tables
<xref ref-type="table" rid="T1">1</xref>
and
<xref ref-type="table" rid="T4">4</xref>
; Figures
<xref ref-type="fig" rid="F6">6</xref>
,
<xref ref-type="fig" rid="F7">7</xref>
,
<xref ref-type="fig" rid="F8">8</xref>
). Six of these groups are ohnolog pairs created by the most recent WGD (24–40 Mya) in an ancestor of Arabidopsis [
<xref ref-type="bibr" rid="B31">31</xref>
-
<xref ref-type="bibr" rid="B33">33</xref>
]. The seventh pair is not found in syntenic regions and is most likely a paralogous pair. It appears to have arisen at about the same time as the recent WGD event because its synonymous substitution frequency (
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
value) of 0.7 is similar to the median
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
of recent duplicate pairs (0.8) and is within their
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
range (0.4–1.6) [
<xref ref-type="bibr" rid="B32">32</xref>
]. The corresponding rice genes in four of the seven homology groups possess uORFs, but lack sufficient uORF sequence similarity to have been detected in the Arabidopsis-rice comparison (Figures
<xref ref-type="fig" rid="F6">6</xref>
,
<xref ref-type="fig" rid="F7">7</xref>
,
<xref ref-type="fig" rid="F8">8</xref>
).</p>
<fig position="float" id="F6">
<label>Figure 6</label>
<caption>
<p>
<bold>Alignments of plant uORF homology groups 20 and 21</bold>
. Details as in Figure 1. Groups with similarity in both the monocot and dicot lineages are shown as separate alignments and as a joint alignment.</p>
</caption>
<graphic xlink:href="1741-7007-5-32-6"></graphic>
</fig>
<fig position="float" id="F7">
<label>Figure 7</label>
<caption>
<p>
<bold>Alignments of plant uORF homology groups 22 and 23</bold>
. Details as in Figure 6.</p>
</caption>
<graphic xlink:href="1741-7007-5-32-7"></graphic>
</fig>
<fig position="float" id="F8">
<label>Figure 8</label>
<caption>
<p>
<bold>Alignments of plant uORF homology groups 24–26</bold>
. Details as in Figure 6.</p>
</caption>
<graphic xlink:href="1741-7007-5-32-8"></graphic>
</fig>
</sec>
<sec>
<title>Purifying selection maintains uORF amino acid sequences</title>
<p>Pairwise
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
tests for selection on amino acid sequences were applied to each uORF homology group and its associated mORFs to determine whether uORF amino acid sequences are under selective constraints similar to those on their associated mORFs. Both an approximate method (yn00) and a maximum likelihood method (codeml) were used to calculate mean pairwise
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
ratios for each group. A
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
ratio less than 1 implies that negative, or purifying, selection has acted on the amino acid sequence; a ratio equal to 1 suggests neutral drift; and a ratio greater than 1 indicates that positive selection has acted on the amino acid sequence. Conservation at the nucleotide level rather than the amino acid level constrains synonymous and nonsynonymous sites alike, and can therefore drive the
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
ratio toward 1. Analysis of all 26 homology groups showed that both uORFs and mORFs have generally been under mild to strong purifying selection since the divergence of each gene pair (Table
<xref ref-type="table" rid="T5">5</xref>
) and these low
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
ratios suggest that the conservation is at the amino acid level, not simply at the nucleotide level.</p>
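<p>The interpretation rule described above can be sketched in a few lines of Python. This is an illustration only: the function name classify_ka_ks and the tolerance tol around 1 are assumptions made for this sketch and are not part of the yn00 or codeml analyses, and a real test for selection would rely on a statistical comparison rather than a fixed cut-off. The three example values are the mean pairwise codeml ratios reported for the uORFs of groups 1, 7 and 13 in Table <xref ref-type="table" rid="T5">5</xref>.</p>
<preformat>
```python
def classify_ka_ks(ratio, tol=0.05):
    """Interpret a Ka/Ks ratio.

    tol is a hypothetical half-width for the "about 1" band; it is an
    assumption of this sketch, not a threshold used in the paper.
    """
    if ratio > 1 + tol:
        return "positive selection"
    if 1 - tol > ratio:
        return "purifying selection"
    return "neutral (drift)"


# Mean pairwise codeml Ka/Ks values for three uORF groups (Table 5).
uorf_codeml = {"1": 0.16, "7": 0.89, "13": 0.04}

labels = {group: classify_ka_ks(value) for group, value in uorf_codeml.items()}
for group, label in labels.items():
    print(f"group {group}: Ka/Ks = {uorf_codeml[group]:.2f} -> {label}")
```
</preformat>
<p>With the assumed tolerance, all three example groups fall in the purifying-selection band, including group 7 at 0.89.</p>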
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption>
<p>Arabidopsis loci with conserved peptide uORFs identified from Arabidopsis-Arabidopsis comparison</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<td align="center">Homology group</td>
<td align="center">Locus</td>
<td align="center">Gene name</td>
<td align="center">mORF description</td>
<td align="center">Gene ontology molecular function</td>
<td align="center">Recent duplicate</td>
</tr>
</thead>
<tbody>
<tr>
<td align="center">20</td>
<td align="left">At3g53670.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">At2g37480
<sup>a</sup>
</td>
</tr>
<tr>
<td align="center">20</td>
<td align="left">At2g37480.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">At3g53670
<sup>a</sup>
</td>
</tr>
<tr>
<td align="center">21</td>
<td align="left">At1g68550.1</td>
<td align="left">
<italic>AtERF#118</italic>
<sup>b</sup>
</td>
<td align="left">Group VI-L ERF/AP2 transcription factor</td>
<td align="left">Transcription factor</td>
<td align="left">At1g25470</td>
</tr>
<tr>
<td align="center">21</td>
<td align="left">At1g25470.1</td>
<td align="left">
<italic>AtERF#116</italic>
<sup>b</sup>
</td>
<td align="left">Group VI-L ERF/AP2 transcription factor</td>
<td align="left">Transcription factor</td>
<td align="left">At1g68550</td>
</tr>
<tr>
<td align="center">22</td>
<td align="left">At1g16860.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">At1g78880</td>
</tr>
<tr>
<td align="center">22</td>
<td align="left">At1g78880.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">At1g16860</td>
</tr>
<tr>
<td align="center">23</td>
<td align="left">At1g64630.1</td>
<td align="left">
<italic>ZIK10</italic>
</td>
<td align="left">MAP kinase, PPC family 4.1.5</td>
<td align="left">Protein kinase</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">23</td>
<td align="left">At5g41990.1</td>
<td align="left">
<italic>WNK8</italic>
/
<italic>ZIK6</italic>
</td>
<td align="left">MAP kinase, PPC family 4.1.5</td>
<td align="left">Protein kinase</td>
<td align="left">Not found</td>
</tr>
<tr>
<td align="center">24</td>
<td align="left">At3g22970.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">At4g14620</td>
</tr>
<tr>
<td align="center">24</td>
<td align="left">At4g14620.1</td>
<td></td>
<td align="left">Expressed transcript</td>
<td align="left">Unknown</td>
<td align="left">At3g22970</td>
</tr>
<tr>
<td align="center">25</td>
<td align="left">At3g45240.1
<sup>c</sup>
</td>
<td></td>
<td align="left">Calcium response kinase, PPC family 4.2.7</td>
<td align="left">ATP binding, protein kinase</td>
<td align="left">At5g60550</td>
</tr>
<tr>
<td align="center">25</td>
<td align="left">At5g60550.1</td>
<td></td>
<td align="left">Calcium response kinase, PPC family 4.2.7</td>
<td align="left">ATP binding, protein kinase</td>
<td align="left">At3g45240</td>
</tr>
<tr>
<td align="center">26</td>
<td align="left">At3g10910.1</td>
<td></td>
<td align="left">Zinc finger, C3HC4-type (RING finger)</td>
<td align="left">Protein binding, zinc ion binding</td>
<td align="left">At5g05280</td>
</tr>
<tr>
<td align="center">26</td>
<td align="left">At5g05280.1</td>
<td></td>
<td align="left">Zinc finger, C3HC4-type (RING finger)</td>
<td align="left">Protein binding, zinc ion binding</td>
<td align="left">At3g10910</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>ERF/AP2, Ethylene Response Factor/Apetala 2 transcription factor; PPC, PlantsP protein kinase classification.</p>
<p>
<sup>a</sup>
Blanc and Wolfe (2004) report that At2g37490 and At3g53670 are retained recent duplicates, but the At2g37490 locus has since been replaced by At2g37480.</p>
<p>
<sup>b</sup>
As defined by Nakano, et al [97] and previously characterized as part of subfamily B-6 by Sakuma, et al [100].</p>
<p>
<sup>c</sup>
uORF found upstream of annotated mORF-containing locus (within 2 kb).</p>
</table-wrap-foot>
</table-wrap>
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption>
<p>Mean pairwise
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
values for all pairwise combinations of a given homology group using two methods (yn00 and codeml).</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<td align="center">Homology group</td>
<td align="center" colspan="2">uORF</td>
<td align="center" colspan="2">mORF</td>
</tr>
<tr>
<td></td>
<td colspan="4">
<hr></hr>
</td>
</tr>
<tr>
<td></td>
<td align="center">yn00</td>
<td align="center">codeml</td>
<td align="center">yn00</td>
<td align="center">codeml</td>
</tr>
</thead>
<tbody>
<tr>
<td align="center">1</td>
<td align="center">0.20</td>
<td align="center">0.16</td>
<td align="center">0.22</td>
<td align="center">0.11</td>
</tr>
<tr>
<td align="center">2</td>
<td align="center">0.28</td>
<td align="center">0.15</td>
<td align="center">0.29</td>
<td align="center">0.19</td>
</tr>
<tr>
<td align="center">3</td>
<td align="center">0.13</td>
<td align="center">0.11</td>
<td align="center">0.15</td>
<td align="center">0.09</td>
</tr>
<tr>
<td align="center">4</td>
<td align="center">0.19</td>
<td align="center">0.18</td>
<td align="center">0.21</td>
<td align="center">0.22</td>
</tr>
<tr>
<td align="center">5</td>
<td align="center">0.06</td>
<td align="center">0.06</td>
<td align="center">0.06</td>
<td align="center">0.08</td>
</tr>
<tr>
<td align="center">6</td>
<td align="center">0.43</td>
<td align="center">0.01
<sup>a</sup>
</td>
<td align="center">0.10</td>
<td align="center">0.08</td>
</tr>
<tr>
<td align="center">7</td>
<td align="center">0.43</td>
<td align="center">0.89</td>
<td align="center">0.09</td>
<td align="center">0.05</td>
</tr>
<tr>
<td align="center">8</td>
<td align="center">0.14</td>
<td align="center">0.01
<sup>a</sup>
</td>
<td align="center">0.11</td>
<td align="center">0.09</td>
</tr>
<tr>
<td align="center">9</td>
<td align="center">0.19</td>
<td align="center">0.05</td>
<td align="center">0.20</td>
<td align="center">0.09</td>
</tr>
<tr>
<td align="center">10.1
<sup>b</sup>
</td>
<td align="center">0.69</td>
<td align="center">0.48</td>
<td align="center">0.10</td>
<td align="center">0.10</td>
</tr>
<tr>
<td align="center">10.2
<sup>b</sup>
</td>
<td align="center">0.70</td>
<td align="center">0.64</td>
<td></td>
<td></td>
</tr>
<tr>
<td align="center">11</td>
<td align="center">0.13</td>
<td align="center">0.09</td>
<td align="center">0.13</td>
<td align="center">0.09</td>
</tr>
<tr>
<td align="center">12</td>
<td align="center">0.25</td>
<td align="center">0.26</td>
<td align="center">0.15</td>
<td align="center">0.09</td>
</tr>
<tr>
<td align="center">13</td>
<td align="center">0.07</td>
<td align="center">0.04</td>
<td align="center">0.10</td>
<td align="center">0.09</td>
</tr>
<tr>
<td align="center">14</td>
<td align="center">0.17</td>
<td align="center">0.05</td>
<td align="center">0.14</td>
<td align="center">0.01
<sup>a</sup>
</td>
</tr>
<tr>
<td align="center">15.1
<sup>b</sup>
</td>
<td align="center">0.31</td>
<td align="center">0.17</td>
<td align="center">0.34</td>
<td align="center">0.21</td>
</tr>
<tr>
<td align="center">15.2
<sup>b</sup>
</td>
<td align="center">0.03</td>
<td align="center">0.07</td>
<td></td>
<td></td>
</tr>
<tr>
<td align="center">15.3
<sup>b</sup>
</td>
<td align="center">0.37</td>
<td align="center">0.16</td>
<td></td>
<td></td>
</tr>
<tr>
<td align="center">16</td>
<td align="center">0.30</td>
<td align="center">0.11</td>
<td align="center">0.10</td>
<td align="center">0.11</td>
</tr>
<tr>
<td align="center">17</td>
<td align="center">0.28</td>
<td align="center">0.24</td>
<td align="center">0.41</td>
<td align="center">0.11</td>
</tr>
<tr>
<td align="center">18</td>
<td align="center">0.26</td>
<td align="center">0.01
<sup>a</sup>
</td>
<td align="center">0.15</td>
<td align="center">0.01
<sup>a</sup>
</td>
</tr>
<tr>
<td align="center">19</td>
<td align="center">0.00
<sup>a</sup>
</td>
<td align="center">0.01
<sup>a</sup>
</td>
<td align="center">0.01
<sup>a</sup>
</td>
<td align="center">0.01
<sup>a</sup>
</td>
</tr>
<tr>
<td colspan="5">
<hr></hr>
</td>
</tr>
<tr>
<td align="center">20</td>
<td align="center">0.13</td>
<td align="center">0.17</td>
<td align="center">0.48</td>
<td align="center">0.39</td>
</tr>
<tr>
<td align="center">21</td>
<td align="center">0.47</td>
<td align="center">0.44</td>
<td align="center">0.11</td>
<td align="center">0.09</td>
</tr>
<tr>
<td align="center">22</td>
<td align="center">0.52</td>
<td align="center">0.16</td>
<td align="center">0.09</td>
<td align="center">0.09</td>
</tr>
<tr>
<td align="center">23</td>
<td align="center">0.57</td>
<td align="center">0.43</td>
<td align="center">0.23</td>
<td align="center">0.21</td>
</tr>
<tr>
<td align="center">24</td>
<td align="center">0.18</td>
<td align="center">0.20</td>
<td align="center">0.22</td>
<td align="center">0.20</td>
</tr>
<tr>
<td align="center">25</td>
<td align="center">0.53</td>
<td align="center">0.50</td>
<td align="center">0.16</td>
<td align="center">0.14</td>
</tr>
<tr>
<td align="center">26</td>
<td align="center">0.37</td>
<td align="center">0.28</td>
<td align="center">0.23</td>
<td align="center">0.22</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>
<sup>a</sup>
<italic>K</italic>
<sub>
<italic>a </italic>
</sub>
or
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
values too high to determine
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
ratio accurately.</p>
<p>
<sup>b</sup>
Decimal points after homology group numbers are used when multiple independent uORF peptides are conserved within a single transcript.</p>
</table-wrap-foot>
</table-wrap>
<p>One possible explanation for low
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
ratios in the putative uORFs is that the full-length cDNAs are incompletely spliced and that the uORF and mORF are normally fused in the mature transcript. To address this possibility, all GenBank Arabidopsis ESTs were screened for evidence of uORF-mORF translational fusions. With one exception, no ORFs were found to run continuously between the uORF and mORF. A fusion product (GenBank accession no.
<ext-link ext-link-type="gen" xlink:href="DR353698">DR353698</ext-link>
) was identified between the N-terminal and central regions of the uORF and the central and C-terminal regions of the mORF at locus At5g03190 (group 17). This putative uORF is nevertheless classified in Table
<xref ref-type="table" rid="T1">1</xref>
for two reasons. First, the four uORF C-terminal amino acids that are excluded from the fusion EST are perfectly conserved in monocot and dicot members, as is the position of the stop codon; this conservation is difficult to explain if the uORF is not translated. Second, the N-terminal portion of the mORF that is removed in the fusion EST is similar among three Arabidopsis loci of the same homology group, and the start codon position is also conserved in these three members. It is likely, therefore, that the fusion EST represents an alternatively spliced form of this transcript, but further characterization of this locus will be needed to support this conclusion. Most of the homology groups contain uORFs with conserved amino acid residues at the C-terminus and identical positioning of the uORF stop codon (Figures
<xref ref-type="fig" rid="F1">1</xref>
,
<xref ref-type="fig" rid="F2">2</xref>
,
<xref ref-type="fig" rid="F3">3</xref>
,
<xref ref-type="fig" rid="F4">4</xref>
,
<xref ref-type="fig" rid="F5">5</xref>
,
<xref ref-type="fig" rid="F6">6</xref>
,
<xref ref-type="fig" rid="F7">7</xref>
,
<xref ref-type="fig" rid="F8">8</xref>
). This suggests that the full-length cDNAs are fully spliced and that the predicted uORF sequences are not artifacts of incomplete splicing.</p>
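<p>The EST screen described above asks whether a reading frame opened at the uORF start codon runs into the mORF without an intervening stop codon. A minimal sketch of that check follows, assuming 0-based start positions on a single cDNA or EST sequence. The function name is_translational_fusion and the toy sequences are hypothetical; they illustrate the idea and are not taken from the authors' pipeline.</p>
<preformat>
```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_translational_fusion(seq, uorf_start, morf_start):
    """Return True if translation from uorf_start reaches morf_start
    in the same reading frame without meeting a stop codon."""
    if (morf_start - uorf_start) % 3 != 0:
        return False  # the two start codons lie in different frames
    codons = (seq[i:i + 3].upper() for i in range(uorf_start, morf_start, 3))
    return not any(codon in STOP_CODONS for codon in codons)


# Toy sequences, made up for illustration only.
separate = "ATGGCTTAAGGAATGAAA"      # ATG GCT TAA GGA | ATG AAA: stop before the mORF
fused = "ATGGCTGCAGGAATGAAA"         # ATG GCT GCA GGA | ATG AAA: no stop in between
out_of_frame = "ATGGCTGCAGGATGAAA"   # second ATG starts at index 11, a different frame

print(is_translational_fusion(separate, 0, 12))
print(is_translational_fusion(fused, 0, 12))
print(is_translational_fusion(out_of_frame, 0, 11))
```
</preformat>
<p>Only the second toy sequence is reported as a fusion; the first is interrupted by a stop codon and the third has its two start codons in different frames.</p>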
</sec>
<sec>
<title>Conserved features of uORF sequences</title>
<p>uORF lengths vary to differing degrees within and among homology groups, but in amino acid sequence alignments nearly all groups show considerable conservation of the position of the N-terminus and/or the C-terminus; that is, length variation is usually due to a variable region in the middle or at one end of the uORF (Table
<xref ref-type="table" rid="T6">6</xref>
; Figures
<xref ref-type="fig" rid="F1">1</xref>
,
<xref ref-type="fig" rid="F2">2</xref>
,
<xref ref-type="fig" rid="F3">3</xref>
,
<xref ref-type="fig" rid="F4">4</xref>
,
<xref ref-type="fig" rid="F5">5</xref>
,
<xref ref-type="fig" rid="F6">6</xref>
,
<xref ref-type="fig" rid="F7">7</xref>
,
<xref ref-type="fig" rid="F8">8</xref>
). The amino acid sequences of some uORFs possess potentially interesting features. Notably, some uORF groups possess regions rich in serine, threonine, and/or tyrosine, and others possess regions rich in lysine and/or arginine. Two homology groups are particularly noteworthy: group 8 uORFs specify peptides with a coiled coil-helix-coiled coil-helix (CHCH) domain (Pfam accession number PF06747; Figure
<xref ref-type="fig" rid="F9">9</xref>
), and group 13 uORFs encode peptides that are extremely serine/arginine-rich (Figure
<xref ref-type="fig" rid="F10">10</xref>
). Both of these unusual peptides are discussed in further detail below.</p>
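<p>Descriptions such as "KR-rich" or "serine/arginine-rich" can be made concrete as the fraction of positions occupied by a chosen set of residues. The sketch below assumes that simple definition; the windows used for the regional counts in the paper are not restated here, and the peptide shown is made up for illustration rather than taken from any uORF in the alignments.</p>
<preformat>
```python
def residue_fraction(peptide, residues):
    """Fraction of positions in peptide occupied by any of the
    one-letter amino acid codes listed in residues."""
    peptide = peptide.upper()
    if not peptide:
        return 0.0
    wanted = set(residues.upper())
    return sum(aa in wanted for aa in peptide) / len(peptide)


# Hypothetical 13-residue peptide, for illustration only.
toy_peptide = "MSRSRSRSPSRKL"

sr_percent = 100 * residue_fraction(toy_peptide, "SR")
kr_percent = 100 * residue_fraction(toy_peptide, "KR")
print(f"SR: {sr_percent:.0f}%  KR: {kr_percent:.0f}%")
```
</preformat>
<p>The same function can be applied to a sub-sequence, for example the last 15 residues of a peptide, to obtain a regional rather than an overall percentage.</p>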
<table-wrap position="float" id="T6">
<label>Table 6</label>
<caption>
<p>uORF features conserved between Arabidopsis and rice</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<td align="left">uORF homology group</td>
<td align="left">uORF length (amino acids)</td>
<td align="left" colspan="5">Conserved sequence features</td>
</tr>
<tr>
<td></td>
<td></td>
<td colspan="5">
<hr></hr>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td align="left">Length conserved at N- and C-Termini</td>
<td align="left">N-terminus (amino acids)</td>
<td align="left">Middle (amino acids)</td>
<td align="left">C-terminus (amino acids)</td>
<td align="left">Overall</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" colspan="7">uORF features conserved between Arabidopsis and rice</td>
</tr>
<tr>
<td align="left">1</td>
<td align="left">25–43</td>
<td align="left">C-terminus</td>
<td></td>
<td></td>
<td align="left">SY-rich: 5/14</td>
<td align="left">20–39% STY</td>
</tr>
<tr>
<td align="left">2</td>
<td align="left">34–39</td>
<td align="left">C-terminus</td>
<td></td>
<td></td>
<td align="left">KR-rich: 5–6/20</td>
<td align="left">18–24% KR</td>
</tr>
<tr>
<td align="left">3</td>
<td align="left">50–54</td>
<td align="left">N- and C-termini</td>
<td align="left">K-rich: 4/9</td>
<td align="left">S-rich: 5–6/6</td>
<td align="left">KR-rich: 4–5/16, SY-rich: 4–5/12</td>
<td align="left">17–23% KR, 22–29% SY</td>
</tr>
<tr>
<td align="left">4</td>
<td align="left">52–55</td>
<td align="left">N- and C-termini</td>
<td></td>
<td></td>
<td align="left">SY-rich: 7/30</td>
<td align="left">21–23% STY</td>
</tr>
<tr>
<td align="left">5</td>
<td align="left">38–41</td>
<td align="left">N- and C-termini</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">6</td>
<td align="left">55–68</td>
<td align="left">N-terminus</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">7</td>
<td align="left">57–105</td>
<td align="left">N-terminus</td>
<td align="left">STY-rich: 6–7/22</td>
<td align="left">KR-rich: 4/5–8</td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">8</td>
<td align="left">61–62</td>
<td align="left">N- and C-termini</td>
<td></td>
<td></td>
<td></td>
<td align="left">CHCH domain, 17% KR</td>
</tr>
<tr>
<td align="left">9</td>
<td align="left">17–33</td>
<td align="left">N-terminus</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">10.1</td>
<td align="left">41</td>
<td align="left">N- and C-termini</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">11</td>
<td align="left">24–44</td>
<td align="left">C-terminus</td>
<td></td>
<td></td>
<td align="left">KR-rich: 5–6/15</td>
<td align="left">25% KR</td>
</tr>
<tr>
<td align="left">12</td>
<td align="left">39–51</td>
<td align="left">N- and C-termini</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">13</td>
<td align="left">25</td>
<td align="left">N- and C-termini</td>
<td></td>
<td align="left">RS-rich: 10–12/18</td>
<td></td>
<td align="left">40–48% RS</td>
</tr>
<tr>
<td align="left">14</td>
<td align="left">29</td>
<td align="left">N-terminus</td>
<td></td>
<td align="left">ST-rich: 4–6/10</td>
<td></td>
<td align="left">14–32% STY</td>
</tr>
<tr>
<td align="left">15.1</td>
<td align="left">18–27</td>
<td align="left">N- and C-termini</td>
<td></td>
<td></td>
<td align="left">8/9 hydrophobic</td>
<td></td>
</tr>
<tr>
<td align="left">15.3</td>
<td align="left">43–54</td>
<td align="left">N- and C-termini</td>
<td></td>
<td></td>
<td align="left">13/14 completely conserved</td>
<td></td>
</tr>
<tr>
<td align="left">16</td>
<td align="left">40–62</td>
<td align="left">C-terminus</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">17</td>
<td align="left">36–45</td>
<td align="left">C-terminus</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">18</td>
<td align="left">36–38</td>
<td align="left">Neither</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">19</td>
<td align="left">30–34</td>
<td align="left">N-terminus</td>
<td></td>
<td></td>
<td></td>
<td align="left">29% ST</td>
</tr>
<tr>
<td align="left" colspan="7">uORF features conserved between Arabidopsis paralogs</td>
</tr>
<tr>
<td align="left">20</td>
<td align="left">41–43</td>
<td align="left">N- and C-termini</td>
<td align="left">STY-rich: 8–9/27</td>
<td></td>
<td></td>
<td align="left">23% STY</td>
</tr>
<tr>
<td align="left">21</td>
<td align="left">87–90</td>
<td align="left">N- and C-termini</td>
<td></td>
<td align="left">ST-rich: 11–12/17–22</td>
<td></td>
<td align="left">22–25% ST</td>
</tr>
<tr>
<td align="left">22</td>
<td align="left">25</td>
<td align="left">N- and C-termini</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">23</td>
<td align="left">69–71</td>
<td align="left">Neither</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">24</td>
<td align="left">31–34</td>
<td align="left">N- and C-termini</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">25</td>
<td align="left">25</td>
<td align="left">N- and C-termini</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">26</td>
<td align="left">22</td>
<td align="left">N-terminus</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</table-wrap>
<fig position="float" id="F9">
<label>Figure 9</label>
<caption>
<p>
<bold>Group 8 small ORF/uORF alignment and percent identity across various eukaryotes</bold>
. Sequences from representative eukaryotic species were aligned using Muscle and displayed by percent identity using Jalview. Arrowheads mark two intron positions that are conserved in all sequences except Mesvi (no genomic support), Dicdi (first but not second intron present), Ciosa (no introns), Caeel (no introns), Drome (no introns), and Neucr (first but not second intron present based on predicted mRNA). See main text for abbreviated species names and Genbank accession number, cDNA clone number, or genome identifier.</p>
</caption>
<graphic xlink:href="1741-7007-5-32-9"></graphic>
</fig>
<fig position="float" id="F10">
<label>Figure 10</label>
<caption>
<p>
<bold>Group 13 alignment and percent identity of (A) uORF and (B) mORF sequences</bold>
. Representative eukaryotic species were aligned using Muscle and displayed using Jalview. Panel A alignment is restricted to the first 50 amino acid positions, which excludes the full 92 amino acid uORF of
<italic>Cycas rumphii</italic>
. All other uORFs are shown in their entirety. Panel B alignment is restricted to the first 100 amino acid positions of the mORFs. See main text for abbreviated species names and Genbank accession number, cDNA clone number, or genome identifier.</p>
</caption>
<graphic xlink:href="1741-7007-5-32-10"></graphic>
</fig>
</sec>
<sec>
<title>Most genes with conserved uORFs appear to have regulatory functions</title>
<p>A total of 31% of mORFs encoded by conserved peptide uORF loci in Arabidopsis were predicted to encode transcription factors, as determined by GO molecular function terms (Tables
<xref ref-type="table" rid="T2">2</xref>
and
<xref ref-type="table" rid="T4">4</xref>
), whereas only 5.9% of all Arabidopsis loci are predicted to encode transcription factors [
<xref ref-type="bibr" rid="B34">34</xref>
]. Thus, genes predicted to encode transcription factors are significantly overrepresented (p = 1.2 × 10
<sup>-7</sup>
) among conserved peptide uORF loci. In each case, GO terms were validated by manual annotation of protein functions using domain predictions from NCBI Conserved Domain and InterProScan Database searches [
<xref ref-type="bibr" rid="B35">35</xref>
,
<xref ref-type="bibr" rid="B36">36</xref>
]. A variety of different types of transcription factors, including bZIP, Ethylene Response Factor/Apetala 2-like (ERF/AP2-like), basic helix-loop-helix (bHLH), and homeobox proteins, are represented among conserved peptide uORF loci with no demonstrable bias. No other GO terms were found to be significantly over- or under-represented in the uORF data set.</p>
<p>Biological functions could be inferred for 16 of the 26 uORF homology groups (Table
<xref ref-type="table" rid="T1">1</xref>
). Six groups encode transcription factor homologs and so are presumably involved in transcriptional control (1, 2, 14, 15, 18, and 21). Five groups are likely to be involved in signal transduction, including four protein kinases and a putative calmodulin-binding protein involved in auxin response (groups 10, 16, 19, 23, 25). Two groups are involved in the metabolism of small molecules that regulate plant development: polyamines (group 3) [
<xref ref-type="bibr" rid="B1">1</xref>
] and trehalose (group 11) [
<xref ref-type="bibr" rid="B37">37</xref>
]. One group (13) encodes the key enzyme in the biosynthesis of phosphocholine, which is an intermediate in biosynthesis of phosphatidylcholine and phosphatidic acid; phosphocholine levels influence levels of phosphatidic acid, an important physiological and developmental signal molecule [
<xref ref-type="bibr" rid="B38">38</xref>
-
<xref ref-type="bibr" rid="B40">40</xref>
]. Group 7 putatively encodes translation initiation factor eIF5, which influences start codon selection, and Group 26 encodes a RING finger protein, suggesting a role in targeted protein turnover by ubiquitination. Of the remaining 10 groups, 8 encode predicted proteins of unknown function, 1 encodes an ankyrin-repeat protein, and 1 encodes an amine oxidase. Thus, all but two families of conserved uORF genes whose functions are known or can be inferred potentially play a regulatory role in the biology of plants.</p>
</sec>
<sec>
<title>Genes with conserved uORFs were preferentially retained after whole genome duplication</title>
<p>Since the most recent WGD event in the Arabidopsis lineage, only 14% of the original gene pairs present in the ancestral tetraploid have been retained as a duplicate pair in the extant Arabidopsis genome, i.e., for the remaining 86% of ancestral gene pairs, one member has been lost [
<xref ref-type="bibr" rid="B32">32</xref>
]. Among 31 ancestral gene pairs that possessed conserved uORFs at the time immediately following the genome duplication, 12 (39%) pairs have been retained in the present Arabidopsis genome (Table
<xref ref-type="table" rid="T2">2</xref>
), which is significantly higher than the genome-wide average (p = 0.0005). The conserved uORF was retained in both copies of each of the twelve retained duplicate pairs. Retention of these 12 uORFs in both paralogs suggests that they act
<italic>in cis</italic>
, consistent with the expectation that uORFs typically control translation of downstream mORFs on the same RNA molecule [
<xref ref-type="bibr" rid="B41">41</xref>
].</p>
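<p>As a rough check on the retention comparison above, the following minimal sketch computes a one-sided exact binomial tail probability for 12 retained pairs out of 31 against a background retention rate of 14%. The choice of a binomial test is an assumption made here for illustration; this section does not state which test produced p = 0.0005, so the printed value need not match it exactly. The function and variable names are hypothetical.</p>
<preformat>
```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, prob: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, prob)."""
    return sum(
        comb(trials, k) * prob**k * (1.0 - prob) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Figures quoted in the text: 12 of 31 ancestral pairs retained,
# compared with a genome-wide retention rate of 14%.
retained_pairs = 12
ancestral_pairs = 31
genome_wide_rate = 0.14

observed_rate = retained_pairs / ancestral_pairs
# Assumption: a one-sided exact binomial test; the paper's own test may differ.
p_value = binomial_upper_tail(retained_pairs, ancestral_pairs, genome_wide_rate)

print(f"observed retention: {observed_rate:.1%}")
print(f"one-sided binomial p-value: {p_value:.2g}")
```
</preformat>
<p>The same function could be applied to other count-versus-background comparisons, provided that the background rate is treated as known and the loci are treated as independent.</p>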
<p>The overrepresentation of transcription factors among conserved uORF loci could be due, in part, to preferential retention of recently duplicated transcription factor genes (22.7% retention of transcription factor duplicates vs. 14.4% retention genome-wide) [
<xref ref-type="bibr" rid="B32">32</xref>
], but this alone does not account for the high frequency of predicted transcription factors among the uORF loci. When duplicate history bias is removed by calculating GO term frequencies of the pre-genome-duplication set of loci, transcription factors are still overrepresented (11/31 loci, or 35%).</p>
</sec>
<sec>
<title>Conserved angiosperm uORF peptide sequences in primitive plants and other eukaryotes</title>
<p>To determine whether any of the 19 uORF homology groups conserved between rice and Arabidopsis might also be present in other eukaryotes, we searched for uORF sequences in all Genbank eukaryotic ESTs. Amino acid sequences similar to four homology groups (3, 8, 13, and 15) were detected in non-angiosperms. Group 15 was found only as distantly as a fern (
<italic>Adiantum</italic>
); group 3 was found as far from angiosperms as a green alga (
<italic>Ulva</italic>
); group 13 was found in an animal (
<italic>Xenopus tropicalis</italic>
); and group 8 uORF sequence was found in primitive plants, animals, fungi, and a slime mold (Figures
<xref ref-type="fig" rid="F9">9</xref>
and
<xref ref-type="fig" rid="F10">10</xref>
). Another algal sequence (
<italic>Chlamydomonas</italic>
) from the Genbank non-redundant database was identified as belonging to group 3 (Genbank:
<ext-link ext-link-type="gen" xlink:href="AJ841703">AJ841703</ext-link>
). The group 13 uORF homolog found in a
<italic>X. tropicalis </italic>
EST was also found in a genomic contig sequence [
<xref ref-type="bibr" rid="B42">42</xref>
] in which the uORF homolog is flanked by genes that are more similar to animal sequences than to any known plant sequences. Thus, this group 13 uORF homolog most likely exists in the
<italic>Xenopus </italic>
genome rather than being an EST library contaminant.</p>
<p>Sequences similar to group 8 Arabidopsis and rice uORFs were found in most eukaryotes, but transcript sequence following the uORF varied among the different lineages. All land plant uORFs were associated with macrophage inhibitory cytokine-1-like (Mic1-like) mORF sequences while the mORFs downstream of the group 8 uORF homologs in nematodes and arthropods code for an unknown protein and a putative mannosyl transferase, respectively (Figure
<xref ref-type="fig" rid="F11">11</xref>
). Available EST sequences for each of the group 8 uORF homologs in mammals, fungi, algae, and slime mold end shortly after the conserved peptide uORF, suggesting that in these eukaryotes the uORF homolog is not associated with a mORF and is simply a short ORF. This is further supported by more than 10 human ESTs that end at the same position and include a polyA sequence. In the sea squirt lineage a putative mORF is present in the EST sequences, but a full-length cDNA sequence will be needed to further investigate this possibility.</p>
<fig position="float" id="F11">
<label>Figure 11</label>
<caption>
<p>
<bold>Diagrammatic representation of Group 8 features among eukaryotes</bold>
. Light grey boxes represent small ORFs/uORFs, four perfectly conserved cysteine residues are shown as 'C', and numbers within triangles represent the number of amino acids between the immediately preceding cysteine and an intron. Brackets surrounding fungal introns represent the variable nature of the intron position and/or presence. White boxes show mORFs directly downstream of the uORFs in a given lineage. Presence of a polyA tail is likely to occur in vertebrates (pA; see Results). Question marks indicate mORFs could be present, but insufficient EST sequence is available to infer this feature reliably.</p>
</caption>
<graphic xlink:href="1741-7007-5-32-11"></graphic>
</fig>
<p>Although there is variability in the sequences found downstream of group 8 uORFs, three features of these uORF homologs are relatively well conserved: the length of the predicted uORF, the relative positions of four cysteine codons, and the positions of two introns (Figure
<xref ref-type="fig" rid="F9">9</xref>
). The length of the uORF peptide ranges from 51 amino acids in
<italic>Haemonchus </italic>
(nematode), to 74 amino acids in humans, and length is even more highly conserved within each of the land plant, arthropod, nematode, fungal, and vertebrate lineages (59–62, 65–69, 51–68, 54–66, and 69–74 amino acids, respectively). Four cysteine residues consistently align in all eukaryotes, with nine amino acids separating the first and second cysteine residues, as well as the third and fourth cysteine residues, whereas 11–15 residues separate the second and third cysteines. Two intron positions are perfectly conserved among the land plants, vertebrates, and at least one member of the fungal lineage. The first intron lies between the third and fourth amino acids following the first conserved cysteine position, and the second intron lies between the fourth and fifth amino acids following the fourth conserved cysteine position (Figure
<xref ref-type="fig" rid="F11">11</xref>
). The first and/or second intron positions are present in
<italic>Dictyostelium</italic>
, algae, and some fungi, but are absent in nematodes, arthropods, and sea squirts.</p>
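<p>The cysteine spacing described above (nine residues between the first and second cysteines, 11–15 between the second and third, and nine between the third and fourth) can be written as a simple pattern. The sketch below is illustrative only: the regular expression, the function name, and both test peptides are assumptions introduced here, and the peptides are made-up strings rather than real group 8 sequences.</p>
<preformat>
```python
import re

# Spacing described in the text: C-x9-C, then 11-15 residues, then C-x9-C.
CYS_SPACING = re.compile(r"C.{9}C.{11,15}C.{9}C")


def find_cys_spacing(peptide: str):
    """Return (start, end) of the first match of the spacing pattern, or None."""
    match = CYS_SPACING.search(peptide.upper())
    return match.span() if match else None


# Made-up peptides for illustration only; neither is a real group 8 sequence.
toy_match = "MS" + "C" + "A" * 9 + "C" + "G" * 12 + "C" + "L" * 9 + "C" + "KK"
# Same layout, but only eight residues between the first two cysteines.
toy_no_match = "MS" + "C" + "A" * 8 + "C" + "G" * 12 + "C" + "L" * 9 + "C" + "KK"

print(find_cys_spacing(toy_match))
print(find_cys_spacing(toy_no_match))
```
</preformat>
<p>A pattern of this kind only screens for spacing; it does not test alignment quality or intron position, which the analysis above also relies on.</p>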
<p>The four cysteines are part of a putative coiled coil-helix, coiled coil-helix (CHCH) domain (Pfam accession number PF06747), also found in three small yeast proteins, Cox17p, Cox19p, and Mrp10p. Cox17p and Cox19p are required for assembly of functional cytochrome oxidase and Mrp10p is homologous to a nuclear-encoded mitochondrial ribosomal protein. A hypothetical human gene, CHCH domain 7 (
<italic>CHCHD7</italic>
), is also similar to the group 8 uORF, as determined by BLAST similarity searches.</p>
</sec>
<sec>
<title>Phylogenetic relationships among group 8-like ORFs</title>
<p>Fungal, animal, and plant representatives of each CHCH-containing ORF were identified using a BLAST search, and their evolutionary relationships were inferred using a Bayesian phylogenetic analysis (Figure
<xref ref-type="fig" rid="F12">12</xref>
; Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
). Animal Mrp10p-like (Genbank:
<ext-link ext-link-type="gen" xlink:href="BC075310">BC075310</ext-link>
,
<ext-link ext-link-type="gen" xlink:href="DR155443">DR155443</ext-link>
and
<ext-link ext-link-type="gen" xlink:href="BX935835">BX935835</ext-link>
), Debaryomyces group 8-like (Genbank:
<ext-link ext-link-type="gen" xlink:href="NC_006045">NC_006045</ext-link>
), and Dictyostelium Cox19p-like (Genbank:
<ext-link ext-link-type="gen" xlink:href="XM_631387">XM_631387</ext-link>
) sequences were more divergent than other sequences, causing long branch attraction [
<xref ref-type="bibr" rid="B43">43</xref>
]. Thus, these sequences were removed from the analysis to prevent tree topology distortion. Five distinct clades were observed, which we refer to as Cox17p-like, Cox19p-like, Mrp10p-like, CHCHD7-like, and uORF group 8-like (Figure
<xref ref-type="fig" rid="F12">12</xref>
). All clades but one (Mrp10p-like) contain representatives from fungi, animals, and plants and are strongly supported, showing branch order probabilities greater than 0.8, which suggests that these sequences emerged in a common eukaryotic ancestor and have since diverged in the three lineages. Mrp10p-like sequences do not strongly group independently of other branches (P = 0.57), which could be due to highly divergent amino acid sequence represented by relatively long branches. The tree shows that the group 8-like proteins are a distinct clade from other CHCH domain proteins (P = 1.0), and that CHCHD7-like proteins are more closely related to group 8-like members than to other CHCH-containing proteins (P = 0.94). The tree topology also indicates that Cox17p-like and Cox19p-like genes are more closely related to each other than to other CHCH proteins (P = 0.97).</p>
<fig position="float" id="F12">
<label>Figure 12</label>
<caption>
<p>
<bold>Phylogenetic tree depicting CHCH domain-containing genes and alignment</bold>
. Unrooted phylogenetic tree generated using MrBayes 3.0. See main text for abbreviated species names and Genbank accession number, cDNA clone number, or genome identifier.</p>
</caption>
<graphic xlink:href="1741-7007-5-32-12"></graphic>
</fig>
<p>A separate phylogenetic analysis of the 46 group 8-like sequences shows that most cluster into five taxonomic groups (plants and green algae, arthropods, nematodes, vertebrates, and fungi) with strong branch support (0.85–1.00) in all but the fungal lineage (0.58; Figure
<xref ref-type="fig" rid="F13">13</xref>
). Sea squirt sequences group with one of two
<italic>Branchiostoma </italic>
sequences with weak branch support (0.53).
<italic>Dictyostelium</italic>
, sea urchin (
<italic>Strongylocentrotus</italic>
), and one further
<italic>Branchiostoma </italic>
sequence do not group with any of these with weak support (0.53). Sea squirt,
<italic>Branchiostoma</italic>
, and sea urchin sequences should be more similar to other deuterostomes (which include the vertebrate lineage) than to other organisms, but the short group 8-like sequence alignment could prevent resolution of correct evolutionary relationships of some groups (Additional file
<xref ref-type="supplementary-material" rid="S2">2</xref>
). Despite weakly supported branches, there is strong support for independent clustering of the arthropods, nematodes, vertebrates and plants, as expected.</p>
<fig position="float" id="F13">
<label>Figure 13</label>
<caption>
<p>
<bold>Phylogenetic tree depicting group 8 small ORFs/uORFs and alignment</bold>
. Unrooted phylogenetic tree generated using MrBayes 3.0. See main text for abbreviated species names and Genbank accession number, cDNA clone number, or genome identifier.</p>
</caption>
<graphic xlink:href="1741-7007-5-32-13"></graphic>
</fig>
<p>Although two
<italic>Branchiostoma </italic>
group 8-like sequences (Brafl1 and 2) suggest that there has been a duplication event within this lineage, there is no evidence for maintenance of ancient group 8-like gene duplications occurring within the plant, vertebrate, nematode, arthropod, or fungal lineages. In Arabidopsis both the recent and ancient duplicates from two WGD events have been lost from the genome. Only the
<italic>Mesostigma </italic>
genome contains two group 8-like transcripts. Their short branch lengths indicate that this duplication occurred relatively recently and it is possible that insufficient time has passed for loss of the second copy.</p>
</sec>
</sec>
<sec>
<title>Discussion and conclusion</title>
<p>Comparative analysis by uORF-Finder of 5' UTRs in full-length cDNAs from two distantly related plant species, rice and Arabidopsis, identified conserved peptide uORFs in 58 Arabidopsis loci that comprised 26 uORF homology groups and in 36 rice loci that comprised 19 homology groups, increasing the number of known conserved uORF homology groups from two to 26 and providing useful, new information for investigations of regulatory biology. Because full-length cDNAs derived from both Arabidopsis and rice represent only a fraction of all nuclear genes, not all conserved uORFs are expected to be detected by this approach. Extrapolation to the whole Arabidopsis genome suggests that it possesses approximately 61 to 102 genes with conserved peptide uORFs that are also conserved in the rice genome (see Methods for calculation). An additional 24 conserved peptide uORF genes are predicted among Arabidopsis loci with retained duplicates from the most recent WGD event. In all, there are likely to be approximately 99–140 genes, or 0.38–0.53% of all protein-coding genes, with conserved peptide uORFs in the Arabidopsis genome. Because short conserved uORFs (shorter than 20 amino acids) would not have been detected by uORF-Finder, this is a conservative estimate.</p>
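<p>To make the 20 amino acid lower limit concrete, the following minimal sketch lists ATG-initiated ORFs that lie wholly within a 5' UTR and meet a minimum length. It is not the uORF-Finder algorithm, which also compares sequences between species. The function name, the restriction to ORFs that terminate inside the UTR, and the toy sequence are all assumptions made for illustration.</p>
<preformat>
```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr: str, min_codons: int = 20):
    """List (start, end, n_codons) for ATG-initiated ORFs lying wholly inside `utr`.

    `end` is the index just past the stop codon; `n_codons` counts the ATG
    but not the stop. Only the longest ORF per stop codon is reported.
    """
    utr = utr.upper().replace("U", "T")
    orfs = []
    seen_stops = set()
    for start in range(len(utr) - 2):
        if utr[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(utr) - 2, 3):
            if utr[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if pos not in seen_stops and n_codons >= min_codons:
                    orfs.append((start, pos + 3, n_codons))
                seen_stops.add(pos)
                break
    return orfs


# Made-up UTR for illustration: 2 nt leader, ATG, 21 GCT codons, stop, 2 nt tail.
toy_utr = "GG" + "ATG" + "GCT" * 21 + "TAA" + "CC"
print(find_uorfs(toy_utr))
print(find_uorfs(toy_utr, min_codons=25))
```
</preformat>
<p>Raising or lowering the hypothetical <italic>min_codons</italic> parameter shows how a length cutoff alone changes the number of candidate uORFs before any conservation filter is applied.</p>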
<p>To find additional conserved uORFs, more extensive collections of full-length cDNA sequences will need to be developed and/or 5' UTRs predicted from genomic sequence will be required. As full-length cDNA sequence resources become available for other plant species, such as maize [
<xref ref-type="bibr" rid="B44">44</xref>
] and poplar [
<xref ref-type="bibr" rid="B45">45</xref>
], it should be possible to identify additional conserved uORFs that might be specific to taxonomic groups, such as monocotyledons or dicotyledons. Similarly, analysis of ancient tetraploidy events in species such as poplar and maize might be able to identify uORFs conserved between retained duplicates.</p>
<sec>
<title>Conserved uORF genes are regulatory genes</title>
<p>Based on the study of a few hundred genes, it has been suggested that uORFs are usually associated with mORFs that encode proteins that regulate cell growth [
<xref ref-type="bibr" rid="B41">41</xref>
,
<xref ref-type="bibr" rid="B46">46</xref>
], but a genome-wide study of upstream AUGs (uAUGs) found no correlation of uAUG-containing transcripts with any particular gene ontology (GO) molecular function term in mammalian transcripts [
<xref ref-type="bibr" rid="B6">6</xref>
]. These observations did not differentiate between sequence-dependent and sequence-independent uORFs. Our analysis shows that genes encoding transcription factors are overrepresented among genes predicted to encode conserved peptide uORFs, representing almost one third of the 58 Arabidopsis loci as compared to 6% of all genes. Moreover, nearly all genes whose function can be reasonably inferred appear to play some regulatory role in the biology of plants.</p>
</sec>
<sec>
<title>Do conserved peptide uORFs mediate feedback translational regulation by small regulatory molecules?</title>
<p>Certain eukaryotic conserved peptide uORFs are known to control translation of a downstream mORF in response to a metabolic product such as arginine or polyamines [
<xref ref-type="bibr" rid="B4">4</xref>
,
<xref ref-type="bibr" rid="B14">14</xref>
,
<xref ref-type="bibr" rid="B47">47</xref>
]. In the case of the fungal arginine-regulated carbamoyl-phosphate synthase subunit, a uORF codes for the arginine attenuator peptide that responds to increased arginine concentrations by causing ribosomes to stall near the 3' end of the uORF, interfering with ribosome scanning and translation of the downstream mORF [
<xref ref-type="bibr" rid="B14">14</xref>
]. A similar mechanism has been elucidated for the regulation of AdoMetDC in which the uORF peptide interferes with the termination of uORF translation in a polyamine-dependent manner [
<xref ref-type="bibr" rid="B48">48</xref>
,
<xref ref-type="bibr" rid="B49">49</xref>
]. In plants, sucrose is a signaling molecule that controls not only the transcription of many genes, but also translation of a class of bZIP transcription factors via their conserved uORF, suggesting the possibility of sucrose interaction with a uORF-encoded peptide to regulate translation downstream [
<xref ref-type="bibr" rid="B4">4</xref>
].</p>
<p>Our analysis identified not only these previously known examples of genes involved in pathways exhibiting small molecule feedback in a uORF sequence-dependent manner, but several additional genes that might also act via this mechanism. One is the conserved group 13 uORFs, which are present in genes that encode phosphoethanolamine
<italic>N</italic>
-methyltransferase (PEAMT/NMT), the key enzyme in phosphocholine (PCho) biosynthesis. Recently,
<italic>NMT1 </italic>
has been shown to contain a uORF that differentially affects translation of the mORF in response to exogenously added choline [
<xref ref-type="bibr" rid="B50">50</xref>
]. This effect is observed when the uORF start codon is abolished but it remains to be determined whether the response to choline is uORF sequence-specific. Intriguingly, the group 13 uORF peptide is rich in arginine and serine (40–48% in Arabidopsis and rice genes; Table
<xref ref-type="table" rid="T6">6</xref>
). A variety of arginine-rich peptides 15–20 amino acids long with 5 or more arginines bind to specific RNA sequences [
<xref ref-type="bibr" rid="B51">51</xref>
]. The predicted group 13 uORF peptide has 5–7 arginines in a 16–17 amino acid region, well within this range, suggesting the possibility that it might bind to a specific RNA sequence, perhaps in
<italic>PEAMT</italic>
/
<italic>NMT </italic>
transcripts. The fact that the group 13 uORF peptide was also found in Xenopus suggests that its regulatory role is widespread in eukaryotes.</p>
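<p>The composition figures used in this argument (percent arginine plus serine, and the number of arginines within a 16–17 residue region) can be computed directly from a peptide sequence. The sketch below is illustrative: the function names, the default window of 17 residues, and the threshold of five arginines are assumptions taken from the ranges quoted above, and the 25-residue peptide is a made-up string, not a real group 13 uORF.</p>
<preformat>
```python
def residue_fraction(peptide: str, residues: str) -> float:
    """Fraction of positions in `peptide` occupied by any of `residues`."""
    peptide = peptide.upper()
    return sum(peptide.count(r) for r in residues.upper()) / len(peptide)


def arginine_rich_windows(peptide: str, window: int = 17, min_arg: int = 5):
    """Return (start, window_sequence, arginine_count) for each qualifying window."""
    peptide = peptide.upper()
    hits = []
    for start in range(len(peptide) - window + 1):
        segment = peptide[start:start + window]
        n_arg = segment.count("R")
        if n_arg >= min_arg:
            hits.append((start, segment, n_arg))
    return hits


# Made-up 25-residue peptide for illustration only.
toy_peptide = "MASRSRSSRRSRSSRLGAVEKLTPW"

rs_fraction = residue_fraction(toy_peptide, "RS")
hits = arginine_rich_windows(toy_peptide)

print(f"R+S fraction: {rs_fraction:.0%}")
for start, segment, n_arg in hits:
    print(start, segment, n_arg)
```
</preformat>
<p>Passing "STY" or "KR" as the residue set gives the other composition measures listed in Table 6 in the same way.</p>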
<p>Another example is homology group 11, whose mORFs are predicted to encode trehalose-6-phosphate phosphatase (TPPase); trehalose-6-phosphate is postulated to regulate sugar metabolism in plants [
<xref ref-type="bibr" rid="B37">37</xref>
]. In summary, sucrose, polyamines, phosphatidic acid, and trehalose-6-phosphate are possible regulators of translation of downstream mORFs through interaction with conserved uORFs. Also interesting in this light are group 19, which specifies an auxin-induced calmodulin-binding homolog, and group 15, which encodes a bHLH transcription factor that is believed to be subject to translational control through its conserved uORF by spermine synthase [
<xref ref-type="bibr" rid="B52">52</xref>
]. Spermine is a polyamine signal molecule necessary for normal plant growth and defense responses.</p>
<p>As mentioned, six conserved uORF families specify transcription factors, one of which is regulated by the small signaling molecule sucrose. In plants, transcription factors often act quantitatively to control target gene expression proportionate to transcription factor concentration [
<xref ref-type="bibr" rid="B53">53</xref>
]. Therefore, it is interesting to consider the possibility that translational control of transcription factor protein levels could be mediated by interaction of a conserved uORF peptide with a metabolite. This might be an effective means for quantitatively modulating the levels of expression of a pathway or network of downstream genes, for instance, in response to changing physiological or environmental conditions. This logic can equally be applied to other key control proteins and their uORFs.</p>
</sec>
<sec>
<title>How is translational control mediated by conserved peptide uORFs?</title>
<p>If conserved uORF peptides can regulate mORF levels in response to small molecules, they are clearly analogous to RNA sensors and riboswitches that sense small molecules and regulate transcript translation accordingly [
<xref ref-type="bibr" rid="B28">28</xref>
,
<xref ref-type="bibr" rid="B54">54</xref>
]. It is interesting to think of conserved peptide uORFs too as sensors of cellular, physiological, or developmental conditions. Although the role of conserved uORFs as 'sensors' of cellular metabolites has been clearly established in the cases of polyamine, sucrose, and arginine concentration, it is still not clear how uORF peptides gauge cellular conditions. uORF peptides could affect mORF translation by interacting directly with the ribosomal complex, by associating with other proteins that influence the translational machinery, and/or by stabilizing or destabilizing RNA secondary structures in the 5' UTR that impede or promote mORF translation. Given the variety of uORF peptides represented in the 26 homology groups, each of these possibilities could occur one or more times.</p>
<p>Notably, the uORFs of nine homology groups are rich in serine, threonine, and/or tyrosine. These amino acids are potential targets for phosphorylation that conceivably could promote or inhibit ribosome stalling or initiation at downstream mORFs. As mentioned above, lysine/arginine-rich motifs could function in RNA binding [
<xref ref-type="bibr" rid="B51">51</xref>
].</p>
</sec>
<sec>
<title>Effect of nonsense-mediated decay on uORF transcripts</title>
<p>Because uORFs create a premature termination codon (PTC), the nonsense-mediated decay (NMD) system might target uORF transcripts for degradation. Yoine et al. [
<xref ref-type="bibr" rid="B55">55</xref>
] carried out a microarray analysis of plants mutant in the
<italic>UPF1 </italic>
ortholog, which is required for NMD.</p>
<p>Among the 75 genes that Yoine et al. identified as accumulating transcripts at more than twice the level in the
<italic>upf1 </italic>
mutant as in wild type Arabidopsis, we found representatives of seven uORF homology groups (1, 7, 10, 12, 13, 15, and 17), suggesting that these uORF transcripts are susceptible to nonsense-mediated decay. The uORFs in these groups might work in a manner analogous to the uORF arginine attenuator protein (AAP) in the fungal CPA1 transcript. The CPA1 transcript exclusively exhibits increased levels of degradation via NMD when the AAP inhibits translation termination in response to high levels of arginine, ultimately decreasing translation using a two-pronged approach [
<xref ref-type="bibr" rid="B56">56</xref>
]. Similarly, the above-identified plant uORFs could intensify translational inhibition of their associated mORFs by both blocking the ribosome physically and inducing the NMD pathway.</p>
</sec>
<sec>
<title>Evolutionary emergence of uORFs and a 'transcriptional fusion' model</title>
<p>Very little is known about how uORFs arise. In the extant rice and Arabidopsis genomes, sequences homologous to uORFs identified by uORF-Finder were observed only in 5' UTRs and never as part of another mORF, within 3' UTRs, within introns, or in non-transcribed regions. Possible origins of 5' UTR ORFs include (a) fragmentation of mORF sequences, (b) creation of an AUG or alternate start codon by random mutation within the 5' UTR and subsequent selection for the peptide sequence, and (c) relocation of other ORF sequences within the genome to the 5' UTR or upstream region of a given gene and subsequent transcriptional fusion of the two ORFs.</p>
<p>Transcriptional fusions occur in an estimated 2% of adjacently transcribed mRNAs in the human genome [
<xref ref-type="bibr" rid="B57">57</xref>
]. The evolutionary history of uORF homology group 8 suggests a stable transcriptional fusion model leading to uORF emergence in plants, arthropods and nematodes. Group 8 uORFs are associated with three independent mORFs in the land plant, arthropod and nematode lineages, while the vertebrate, slime mold, algal, and fungal small ORFs that are orthologous to group 8 uORFs do not seem to be associated with mORFs. Given the phylogenetic relationships among these species [
<xref ref-type="bibr" rid="B58">58</xref>
], the most parsimonious explanation for the evolutionary origin of group 8 uORFs is that they originated as a small ORF transcribed independently of a mORF. Subsequently, this small ORF gene was displaced via genome rearrangements or transposition events to regions upstream of three independent large ORFs, resulting in transcriptional fusions of the two previously independent transcripts. The uORFs and mORFs in the plant, nematode, and arthropod lineages have remained associated within the same transcript for 300–500 My; these transcriptional fusion events therefore seem to be stable and perhaps biologically advantageous. Evidence for other uORF emergence models, such as mORF fragmentation or
<italic>de novo </italic>
creation, will require further analysis of closely related organisms.</p>
</sec>
<sec>
<title>Potential dual role for uORF proteins</title>
<p>uORFs can regulate specific mORF protein expression
<italic>in trans </italic>
when the
<italic>cis </italic>
uORF is intact [
<xref ref-type="bibr" rid="B59">59</xref>
,
<xref ref-type="bibr" rid="B60">60</xref>
] but it is still unclear whether uORF proteins can play additional roles in the cell. Small proteins, similar in length to uORFs, play a role in plant development and could also be involved in plant defense [
<xref ref-type="bibr" rid="B61">61</xref>
,
<xref ref-type="bibr" rid="B62">62</xref>
]. Potentially, uORFs could affect such processes independently of their role as a translational regulator. Homology group 8 uORFs are largely conserved in length, sequence, and intron position across most eukaryotes, but in fungi, algae, slime mold, and vertebrates, the associated mORF seems to be absent. The absence of the mORF and strong conservation of the uORF amino acid sequence over one billion years in these eukaryotes indicate that, in plants, this protein could act both as a regulator of mORF expression and as a
<italic>trans </italic>
acting factor in the cell.</p>
<p>Group 13 uORFs contain peptides similar to RS motifs found in SR proteins. SR proteins are a family of proteins required for alternative and constitutive pre-mRNA splicing [
<xref ref-type="bibr" rid="B63">63</xref>
,
<xref ref-type="bibr" rid="B64">64</xref>
]. A subset of these proteins, shuttling SR proteins, have not only been implicated in splicing but have also been shown to stimulate translation of a reporter gene when fused to the same transcript [
<xref ref-type="bibr" rid="B65">65</xref>
], analogous to a uORF-mORF associated pair. It is possible, then, that group 13 uORF proteins could also play a dual role, as a translational regulator and
<italic>trans </italic>
factor.</p>
<p>Similarly, some uORFs in mammalian genomes might adopt these dual roles and further characterization of conserved mammalian uORFs [
<xref ref-type="bibr" rid="B66">66</xref>
] could resolve a dual role model.</p>
</sec>
<sec>
<title>Applications</title>
<p>
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
analyses suggest that conserved peptide uORFs are under mild to strong negative selection and might therefore be useful for resolving orthology and paralogy of specific gene pairs. For example, phylogenetic studies have sometimes failed to identify all members within a uORF homology group when only considering the mORF sequence (e.g. homology group 2). Although the bHLH transcription factor domain occurs in the mORF of all three group 2 members, none were identified in the original studies, and only two of the three members have been included in the latest description of Arabidopsis bHLH family members [
<xref ref-type="bibr" rid="B67">67</xref>
-
<xref ref-type="bibr" rid="B69">69</xref>
].</p>
<p>Further characterization of conserved peptide uORFs and their functional mechanisms might also provide useful tools for creating inducible or repressible expression vectors in plants. AdoMetDC1, bZIP11, and PEAMT/NMT1 protein levels are regulated by conserved uORFs in a metabolite-dependent manner (polyamine, sucrose, and choline, respectively) and other conserved uORFs might also regulate mORF translation in response to cellular compounds, such as TPPases. If this is the case, further functional characterization of conserved peptide uORFs could provide the tools necessary to build constructs that are quickly inducible or repressible at the translational level under various conditions.</p>
</sec>
</sec>
<sec sec-type="methods">
<title>Methods</title>
<sec>
<title>Identifying conserved uORFs in rice and Arabidopsis</title>
<p>Corrected RIKEN and Genoscope
<italic>Arabidopsis thaliana </italic>
ecotype Columbia and NIAS, FAIS and RIKEN
<italic>Oryza sativa </italic>
spp.
<italic>japonica </italic>
cv Nipponbare full-length cDNA collections were used for all analyses [
<xref ref-type="bibr" rid="B70">70</xref>
]. A cDNA's major ORF (mORF) was defined as the longest ORF starting with an AUG, the sequence upstream of this AUG was designated the 5' UTR, and upstream ORFs (uORFs) were any ORFs found in the 5' UTR starting with an AUG. All ORFs were identified using getorf [
<xref ref-type="bibr" rid="B71">71</xref>
]. Arabidopsis mORFs were aligned to rice cDNAs using tBLASTn with an E-value cutoff = 1e-5 [
<xref ref-type="bibr" rid="B72">72</xref>
,
<xref ref-type="bibr" rid="B73">73</xref>
] to find putative homologs. Rice cDNAs with hits below this threshold were paired with their respective Arabidopsis transcript, 5' UTR sequences extracted from both, uORFs determined using getorf, and all combinations of rice and Arabidopsis uORF peptide pairs aligned using needle [
<xref ref-type="bibr" rid="B71">71</xref>
]. The reciprocal analysis was also performed, starting with rice full-length cDNA sequences and comparing them to Arabidopsis transcript sequences. All uORFs longer than 100 amino acids were excluded from this analysis.</p>
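<p>The ORF definitions above can be sketched in code. The following Python fragment is a minimal illustration under stated assumptions, not the getorf-based pipeline used in the study: the function names atg_orfs and split_transcript are hypothetical, the cDNA is written in the DNA alphabet (ATG for AUG), and an ORF is assumed to require an in-frame stop codon within the sequence being searched.</p>
<preformat>
```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def atg_orfs(seq):
    """Return (start, end) pairs for ATG-initiated ORFs in seq.

    end is the index just past the stop codon. Only the first (longest)
    ATG is reported for each stop codon. Assumption: an ORF must end
    with an in-frame stop codon inside seq.
    """
    seq = seq.upper()
    orfs = []
    used_stops = set()
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if pos not in used_stops:
                    used_stops.add(pos)
                    orfs.append((start, pos + 3))
                break
    return orfs


def split_transcript(cdna, max_uorf_aa=100):
    """Split a cDNA into mORF, 5' UTR and uORFs using the definitions in the text."""
    orfs = atg_orfs(cdna)
    if not orfs:
        return None
    # mORF: the longest ORF that starts with an ATG.
    morf = max(orfs, key=lambda orf: orf[1] - orf[0])
    # 5' UTR: everything upstream of the mORF start codon.
    utr5 = cdna[:morf[0]]
    # uORFs: ATG-initiated ORFs in the 5' UTR, at most max_uorf_aa residues long
    # (the stop codon is not counted as a residue).
    uorfs = [
        orf for orf in atg_orfs(utr5)
        if max_uorf_aa >= (orf[1] - orf[0]) // 3 - 1
    ]
    return {"morf": morf, "utr5": utr5, "uorfs": uorfs}
```
</preformat>
<p>In this sketch, uORFs that overlap the mORF start codon are not reported, because only the extracted 5' UTR is searched; whether the original pipeline treated such cases differently is not stated in the text.</p>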
<p>All pairs with scores >50 were kept and examined manually against existing Arabidopsis transcript annotations (TAIR and TIGR) and existing ESTs to determine whether aligned peptides fall within a probable 5' UTR. To validate the putative uORFs, the first 100 amino acids of the Arabidopsis mORF were aligned to Genbank plant ESTs using tBLASTn (E-value = 1e-10, limit: Viridiplantae [orgn] NOT Arabidopsis [orgn], complexity filter off), and all retrieved plant uORF sequences were aligned to rice and Arabidopsis uORFs using ClustalW [
<xref ref-type="bibr" rid="B74">74</xref>
], manually adjusted, and visualized using Jalview [
<xref ref-type="bibr" rid="B75">75</xref>
] (Figures
<xref ref-type="fig" rid="F1">1</xref>
,
<xref ref-type="fig" rid="F2">2</xref>
,
<xref ref-type="fig" rid="F3">3</xref>
,
<xref ref-type="fig" rid="F4">4</xref>
,
<xref ref-type="fig" rid="F5">5</xref>
,
<xref ref-type="fig" rid="F6">6</xref>
,
<xref ref-type="fig" rid="F7">7</xref>
,
<xref ref-type="fig" rid="F8">8</xref>
). There were two exceptions to this procedure. Because the uORFs in group 10 are 400–600 bp upstream of the mORF AUG, only the first 25 mORF amino acids were used to search Genbank plant ESTs (first 25 amino acids are very highly conserved). Secondly, high identity was limited to the 3' end of mORFs in group 17, therefore the Arabidopsis transcript's terminal 50 amino acids were aligned to Genbank non-EST plant sequences. Support for a conserved uORF was found in the
<italic>Medicago truncatula </italic>
and
<italic>Lotus corniculatus </italic>
genomic sequences.</p>
<p>To test whether uORFs appear upstream of non-homologous genes, Arabidopsis uORF sequences were aligned to the entire Arabidopsis genome (version 5) [
<xref ref-type="bibr" rid="B76">76</xref>
] using tBLASTn (E-value = 10). Predicted conserved uORFs were found to lie upstream of the annotated gene instead of in the annotated 5' UTR in approximately 10% of Arabidopsis and 25% of rice genes (Tables
<xref ref-type="table" rid="T2">2</xref>
,
<xref ref-type="table" rid="T3">3</xref>
,
<xref ref-type="table" rid="T4">4</xref>
). The discrepancies with the accepted annotations, found at TAIR [
<xref ref-type="bibr" rid="B76">76</xref>
] and TIGR [
<xref ref-type="bibr" rid="B77">77</xref>
], respectively, demonstrate the benefit of using full-length cDNA sequences for this analysis.</p>
<p>To determine whether sequences similar to these conserved uORFs reside elsewhere in the rice and Arabidopsis genomes, uORF amino acid sequences were aligned with sequences translated from the genome sequence using tBLASTn [
<xref ref-type="bibr" rid="B73">73</xref>
]. Sequences similar to these uORFs were found within 5' UTRs of homologous mORF loci, and were absent from non-homologous transcripts, intronic regions, and intergenic regions with only one exception, Arabidopsis
<italic>NMT3 </italic>
(AGI locus identifier At1g73600). The annotated mORF for
<italic>NMT3 </italic>
[
<xref ref-type="bibr" rid="B78">78</xref>
] is not covered by any available full-length cDNA and has no EST support at its 5' end. Thus, we annotated
<italic>NMT3 </italic>
by comparison with its paralog,
<italic>NMT1 </italic>
(At3g18000) [
<xref ref-type="bibr" rid="B33">33</xref>
].
<italic>NMT3 </italic>
possesses sequences similar to the
<italic>NMT1 </italic>
uORF, as well as sequences similar to the
<italic>NMT1 </italic>
mORF, but the TAIR annotation fuses these into a single ORF. However,
<italic>NMT3 </italic>
possesses potential splice sites that would produce transcripts with uORF and mORF sequences similar to those in NMT1. The
<italic>NMT3 </italic>
uORF predicted by one alternative splice model is the same length as, and is 72% identical to, the
<italic>NMT1 </italic>
uORF amino acid sequence (Group 13 in Figure
<xref ref-type="fig" rid="F4">4</xref>
).</p>
<p>The TAIR website was used to assign locus numbers for each Arabidopsis transcript and the TIGR website for rice locus numbers. The Arabidopsis locus numbers were then used to search for retained duplicates from the recent and ancient whole genome duplications as defined on the Arabidopsis Paralogon website [
<xref ref-type="bibr" rid="B33">33</xref>
].</p>
</sec>
<sec>
<title>Calculating
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s</italic>
</sub>
</title>
<p>For homology groups 1–19,
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
values for homologous rice and Arabidopsis mORFs and uORFs were determined using pairwise_kaks.PLS (version 1.7) [
<xref ref-type="bibr" rid="B79">79</xref>
]. Both the approximate method (option -kaks yn00) and the maximum likelihood method (-kaks codeml) were used. Any
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
values resulting from a
<italic>K</italic>
<sub>
<italic>a </italic>
</sub>
or
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
value >10 were excluded from the analysis, as these values result in inaccurate predictions of
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
[
<xref ref-type="bibr" rid="B80">80</xref>
,
<xref ref-type="bibr" rid="B81">81</xref>
]. The
<italic>K</italic>
<sub>
<italic>a</italic>
</sub>
/
<italic>K</italic>
<sub>
<italic>s </italic>
</sub>
values for homology groups 20–26 were determined with the same approach using Arabidopsis sequences only.</p>
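<p>The exclusion rule described above can be illustrated with a short Python sketch. It is not part of pairwise_kaks.PLS; the function names usable_kaks and filter_pairs are hypothetical, and the rejection of a zero Ks value is an added assumption to avoid division by zero.</p>
<preformat>
```python
def usable_kaks(ka, ks, limit=10.0):
    """Return Ka/Ks, or None when the estimate should be discarded.

    Estimates are discarded when Ka or Ks exceeds limit (10 in the text).
    Assumption: Ks == 0 is also rejected, to avoid dividing by zero.
    """
    if ka > limit or ks > limit or ks == 0:
        return None
    return ka / ks


def filter_pairs(pairs, limit=10.0):
    """Keep (name, Ka/Ks) for every (name, ka, ks) tuple that passes the filter."""
    kept = []
    for name, ka, ks in pairs:
        ratio = usable_kaks(ka, ks, limit)
        if ratio is not None:
            kept.append((name, ratio))
    return kept
```
</preformat>
<p>A ratio below 1 from such a filter is conventionally read as evidence of negative (purifying) selection, which is the interpretation applied to the conserved uORFs in this study.</p>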
</sec>
<sec>
<title>GO molecular function terms</title>
<p>GO molecular function terms [
<xref ref-type="bibr" rid="B82">82</xref>
] were retrieved from TAIR Locus History pages [
<xref ref-type="bibr" rid="B76">76</xref>
]. GO terms for all Arabidopsis loci were downloaded from the TAIR website and used to compare genome-wide GO molecular function term frequencies to those found in the conserved uORF-containing loci. Statistically significant differences were detected using the Exact Binomial test as described in the R program package [
<xref ref-type="bibr" rid="B83">83</xref>
]. This analysis was also carried out by GeneMerge, a program that incorporates a Bonferroni corrected
<italic>P</italic>
-value [
<xref ref-type="bibr" rid="B84">84</xref>
].</p>
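<p>For readers who want to see the statistics spelled out, the following Python sketch shows one way to compute a two-sided exact binomial P-value and a Bonferroni correction. It is an illustration only: the function names are hypothetical, the two-sided convention (summing all outcomes no more probable than the observed one) is assumed rather than taken from the text, and it does not reproduce the R or GeneMerge code used in the study.</p>
<preformat>
```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def exact_binomial_two_sided(k, n, p):
    """Two-sided exact binomial P-value.

    k: loci in the uORF set annotated with a GO term
    n: size of the uORF set
    p: genome-wide frequency of that GO term
    Sums the probabilities of all outcomes that are no more likely
    than the observed one (small tolerance for floating-point ties).
    """
    cutoff = binom_pmf(k, n, p) * (1 + 1e-7)
    total = 0.0
    for i in range(n + 1):
        prob = binom_pmf(i, n, p)
        if cutoff >= prob:
            total += prob
    return min(1.0, total)


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P-value for n_tests independent comparisons."""
    return min(1.0, p_value * n_tests)
```
</preformat>
<p>Here n_tests would be the number of GO terms tested; any counts passed to these functions in practice would come from the TAIR annotations and are not given here.</p>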
</sec>
<sec>
<title>Identification of Arabidopsis ohnologs and paralogs with conserved uORF</title>
<p>Conserved uORFs were found in Arabidopsis duplicates in much the same way as conserved uORFs were found between rice and Arabidopsis. uORFs and mORFs were defined in the same way, and mORF sequences were aligned to the entire Arabidopsis full-length cDNA collection using BLASTp (E-value cutoff = 1e-5) to detect transcripts deriving from a duplicated locus. mORFs aligning with ≥ 99% identity were discarded, and uORFs of all remaining pairs were aligned using needle and validated as above.</p>
</sec>
<sec>
<title>Generation of phylogenetic trees</title>
<p>Sequences similar to Cox17p, Cox19p, Mrp10, CHCHD7, and uORF homology group 8 (as determined by tBLASTn and analyzed for conservation of the CHCH motif) were aligned using Muscle [
<xref ref-type="bibr" rid="B85">85</xref>
], trimmed of non-informative sites, and analyzed using MrBayes v. 3.0 [
<xref ref-type="bibr" rid="B86">86</xref>
] (rates = gamma, aamodel = mixed, ngen = 2000000). Phylogenetic trees were visualized using PHYLIP's DRAWTREE program v. 3.65 [
<xref ref-type="bibr" rid="B87">87</xref>
].</p>
<p>Sequences similar to uORF homology group 8 were aligned, edited, and analyzed in the same manner, with one exception: ngen = 3000000.</p>
</sec>
<sec>
<title>Estimate of conserved peptide uORF prevalence</title>
<sec>
<title>Number of Arabidopsis-rice loci</title>
<p>There is an average of 2.23 full-length cDNAs per uORF locus identified (excluding loci identified by BLAST alignment), which suggests that 15200 Arabidopsis genes are represented in the cDNA collections (34000 cDNAs/2.23 cDNAs per locus), representing approximately 60% of all Arabidopsis genes (assuming 26000 genes) [
<xref ref-type="bibr" rid="B88">88</xref>
]. In addition, Kikuchi et al [
<xref ref-type="bibr" rid="B25">25</xref>
] report that the 28000 rice full-length cDNA sequences represent 20000 transcription units (TUs) and that 64% of these (12800) have a homolog in Arabidopsis. Assuming that 60–100% of these homologs are represented in the Arabidopsis cDNA collections, the estimated number of Arabidopsis homologs screened for uORF conservation is 7800–13000. Only 80% of Arabidopsis genes also have a homolog in rice (~21000) [
<xref ref-type="bibr" rid="B25">25</xref>
], therefore the uORF-Finder program has identified 37–62% of all conserved upstream ORFs (7800/21000 to 13000/21000) when comparing rice and Arabidopsis full-length cDNAs. Therefore, there should be 61–102 loci that contain conserved uORFs: 38 loci found by uORF-Finder, 6 additional loci found by aligning known uORF sequences with the Arabidopsis genome using BLAST, and 17–58 presently unidentified loci. Using both uORF-Finder and BLAST algorithms we estimate that between 43% and 72% of conserved peptide uORFs between monocots and dicots have been identified.</p>
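<p>The arithmetic behind this estimate can be retraced with a short Python sketch. The function names are hypothetical, the default values are the rounded figures quoted in the text, and truncation to whole loci is an assumption chosen because it reproduces the quoted range of 61–102 loci.</p>
<preformat>
```python
def cdna_gene_coverage(cdnas=34000, cdnas_per_locus=2.23, genes=26000):
    """Genes represented in the cDNA collections, and their share of all genes."""
    represented = cdnas / cdnas_per_locus      # quoted as about 15200
    return represented, represented / genes    # share quoted as about 60%


def rice_arabidopsis_estimate(found_by_finder=38, found_by_blast=6,
                              screened=(7800, 13000), homologs=21000):
    """Retrace the rice-Arabidopsis prevalence estimate from the text."""
    # Fraction of Arabidopsis genes with a rice homolog that were screened.
    frac_low = screened[0] / homologs     # about 0.37
    frac_high = screened[1] / homologs    # about 0.62
    # Scale the uORF-Finder count up to all homologs (truncated to whole loci).
    total_low = int(found_by_finder / frac_high)
    total_high = int(found_by_finder / frac_low)
    identified = found_by_finder + found_by_blast
    return {
        "screened_fraction": (frac_low, frac_high),
        "total_loci": (total_low, total_high),
        "unidentified": (total_low - identified, total_high - identified),
        "identified_fraction": (identified / total_high, identified / total_low),
    }
```
</preformat>
<p>With the default values, the sketch returns 61–102 total loci and 17–58 unidentified loci, and identified fractions of roughly 0.43 and 0.72, in line with the figures given above.</p>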
</sec>
<sec>
<title>Number of Arabidopsis-Arabidopsis loci</title>
<p>A total of 60% of Arabidopsis genes are represented in the full-length cDNA collections used for this study. Therefore, the probability of selecting two loci that have conserved peptide uORFs from the pool of known sequences is 0.6*0.6 = 0.36. This translates to a total of 38 loci that have conserved uORFs using an Arabidopsis-Arabidopsis comparison (14 identified (36%), and 24 unidentified).</p>
</sec>
<sec>
<title>Total loci</title>
<p>We therefore predict that there are between 99 and 140 loci in the Arabidopsis genome that contain conserved peptide uORFs, 41–58% of which have been identified.</p>
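<p>The Arabidopsis-Arabidopsis estimate and the combined total can be retraced in the same way. This Python sketch is illustrative only: the function name total_estimate is hypothetical, the defaults are the figures quoted in the text (61–102 rice-Arabidopsis loci with 44 identified, 14 identified duplicate loci, 60% cDNA coverage), and truncation to whole loci is an assumption that matches the quoted value of 38.</p>
<preformat>
```python
def total_estimate(rice_total=(61, 102), rice_identified=44,
                   dup_identified=14, cdna_coverage=0.6):
    """Retrace the duplicate-based estimate and the genome-wide total."""
    # Probability that both members of a duplicated pair are in the collections.
    pair_probability = cdna_coverage * cdna_coverage    # 0.36
    # Scale up the identified loci; truncated, matching the 38 loci in the text.
    dup_total = int(dup_identified / pair_probability)
    total = (rice_total[0] + dup_total, rice_total[1] + dup_total)
    identified = rice_identified + dup_identified
    return {
        "dup_total": dup_total,
        "dup_unidentified": dup_total - dup_identified,
        "total_loci": total,
        "identified_fraction": (identified / total[1], identified / total[0]),
    }
```
</preformat>
<p>With the defaults, the sketch gives 38 duplicate-derived loci (24 unidentified) and a total of 99–140 loci, of which 58 are identified; the corresponding fractions are about 0.41 and 0.59, which the text reports as 41–58%.</p>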
</sec>
</sec>
</sec>
<sec>
<title>Abbreviations</title>
<sec>
<title>Species name abbreviations for Figures
<xref ref-type="fig" rid="F1">1</xref>
,
<xref ref-type="fig" rid="F2">2</xref>
,
<xref ref-type="fig" rid="F3">3</xref>
,
<xref ref-type="fig" rid="F4">4</xref>
,
<xref ref-type="fig" rid="F5">5</xref>
,
<xref ref-type="fig" rid="F6">6</xref>
,
<xref ref-type="fig" rid="F7">7</xref>
,
<xref ref-type="fig" rid="F8">8</xref>
,
<xref ref-type="fig" rid="F9">9</xref>
,
<xref ref-type="fig" rid="F10">10</xref>
,
<xref ref-type="fig" rid="F12">12</xref>
, and
<xref ref-type="fig" rid="F13">13</xref>
</title>
<p>Acypi,
<italic>Acyrthosiphon pisum</italic>
; Adica,
<italic>Adiantum capillus-veneris</italic>
; Ajeca,
<italic>Ajellomyces capsulatus</italic>
; Allce,
<italic>Allium cepa</italic>
; Anoga,
<italic>Anopheles gambiae</italic>
; Apime,
<italic>Apis mellifera</italic>
; Arath,
<italic>Arabidopsis thaliana</italic>
; Ascsu,
<italic>Ascaris suum</italic>
; Aspof,
<italic>Asparagus officinalis</italic>
; Betvu,
<italic>Beta vulgaris</italic>
; Bosta,
<italic>Bos taurus</italic>
; Brafl,
<italic>Branchiostoma floridae</italic>
; Brana,
<italic>Brassica napus</italic>
; Brugy,
<italic>Bruguiera gymnorhiza</italic>
; Caeel,
<italic>Caenorhabditis elegans</italic>
; Cicli,
<italic>Cicindela litorea</italic>
; Ciosa,
<italic>Ciona savignyi</italic>
; Citja,
<italic>Citrus jambhiri</italic>
; Citpa,
<italic>Citrus paradisi</italic>
; Citsi,
<italic>Citrus sinensis</italic>
; Chlre,
<italic>Chlamydomonas reinhardtii</italic>
; Cocpo,
<italic>Coccidioides posadasii</italic>
; Cryne,
<italic>Cryptococcus neoformans</italic>
; Cycru,
<italic>Cycas rumphii</italic>
; Danre,
<italic>Danio rerio</italic>
; Debha,
<italic>Debaryomyces hansenii</italic>
; Dicdi,
<italic>Dictyostelium discoideum</italic>
; Drome,
<italic>Drosophila melanogaster</italic>
; Drops,
<italic>Drosophila pseudoobscura</italic>
; Erate,
<italic>Eragrostis tef</italic>
; Escca,
<italic>Eschscholzia californica</italic>
; Eupes,
<italic>Euphorbia esula</italic>
; Eupti,
<italic>Euphorbia tirucalli</italic>
; Galga,
<italic>Gallus gallus</italic>
; Gibze,
<italic>Gibberella zeae</italic>
; Glomo,
<italic>Glossina morsitans</italic>
; Glyma,
<italic>Glycine max</italic>
; Glyso,
<italic>Glycine soja</italic>
; Gosar,
<italic>Gossypium arboreum</italic>
; Goshi,
<italic>Gossypium hirsutum</italic>
; Gosra,
<italic>Gossypium raimondii</italic>
; Haeco,
<italic>Haemonchus contortus</italic>
; Helan,
<italic>Helianthus annuus</italic>
; Hetgl,
<italic>Heterodera glycines</italic>
; Hevbr,
<italic>Hevea brasiliensis</italic>
; Homsa,
<italic>Homo sapiens </italic>
; Horvu,
<italic>Hordeum vulgare</italic>
; Iponi,
<italic>Ipomoea nil</italic>
; Jugre,
<italic>Juglans regia</italic>
; Lacsa,
<italic>Lactuca sativa</italic>
; Lacse,
<italic>Lactuca serriola</italic>
; Linus,
<italic>Linum usitatissimum</italic>
; Locmi,
<italic>Locusta migratoria</italic>
; Lyces,
<italic>Lycopersicon esculentum</italic>
; Maldo,
<italic>Malus domestica</italic>
; Medtr,
<italic>Medicago truncatula</italic>
; Mescr,
<italic>Mesembryanthemum crystallinum</italic>
; Mesvi,
<italic>Mesostigma viride</italic>
; Molte,
<italic>Molgula tectiformis</italic>
; Musmu,
<italic>Mus musculus</italic>
; Neucr,
<italic>Neurospora crassa</italic>
; Nicbe,
<italic>Nicotiana benthamiana</italic>
; Oncmy,
<italic>Oncorhynchus mykiss</italic>
; Oryla,
<italic>Oryzias latipes</italic>
; Orysa,
<italic>Oryza sativa</italic>
; Parbr,
<italic>Paracoccidioides brasiliensis</italic>
; Pethy,
<italic>Petunia hybrida</italic>
; Phaac,
<italic>Phaseolus acutifolius</italic>
; Phaco,
<italic>Phaseolus coccineus</italic>
; Phypa,
<italic>Physcomitrella patens</italic>
; Pontr,
<italic>Poncirus trifoliata</italic>
; Popde,
<italic>Populus deltoides</italic>
; Popca,
<italic>Populus canadensis</italic>
; Popeu,
<italic>Populus euphratica</italic>
; Poptd,
<italic>Populus trichocarpa </italic>
×
<italic>Populus deltoides</italic>
; Poptt,
<italic>Populus tremula </italic>
×
<italic>Populus tremuloides</italic>
; Prupe,
<italic>Prunus persica</italic>
; Sacce,
<italic>Saccharomyces cerevisiae</italic>
; Sachc,
<italic>Saccharum </italic>
hybrid cultivar; Sacof,
<italic>Saccharum officinarum</italic>
; Schma,
<italic>Schistosoma mansoni</italic>
; Schpo,
<italic>Schizosaccharomyces pombe</italic>
; Selmo,
<italic>Selaginella moellendorffii</italic>
; Soltu,
<italic>Solanum tuberosum</italic>
; Sorbi,
<italic>Sorghum bicolor</italic>
; Strpu,
<italic>Strongylocentrotus purpuratus</italic>
; Strra,
<italic>Strongyloides ratti</italic>
; Styhu,
<italic>Stylosanthes humilis</italic>
; Tetni,
<italic>Tetraodon nigroviridis</italic>
; Theha,
<italic>Thellungiella halophila</italic>
; Torru,
<italic>Tortula ruralis</italic>
; Triae,
<italic>Triticum aestivum</italic>
; Trisp,
<italic>Trichinella spiralis</italic>
; Ulvli,
<italic>Ulva linza</italic>
; Ustma,
<italic>Ustilago maydis</italic>
; Vitsh,
<italic>Vitis shuttleworthii</italic>
; Vitvi,
<italic>Vitis vinifera</italic>
; Welma,
<italic>Welwitschia mirabilis</italic>
; Xenla,
<italic>Xenopus laevis</italic>
; Xentr,
<italic>Xenopus tropicalis</italic>
; Yarli,
<italic>Yarrowia lipolytica</italic>
; Zeama,
<italic>Zea mays</italic>
.</p>
</sec>
<sec>
<title>Abbreviated species names and Genbank accession number, cDNA clone number, or genome identifier</title>
<sec>
<title>Figures
<xref ref-type="fig" rid="F1">1</xref>
,
<xref ref-type="fig" rid="F2">2</xref>
,
<xref ref-type="fig" rid="F3">3</xref>
,
<xref ref-type="fig" rid="F4">4</xref>
,
<xref ref-type="fig" rid="F5">5</xref>
,
<xref ref-type="fig" rid="F6">6</xref>
,
<xref ref-type="fig" rid="F7">7</xref>
,
<xref ref-type="fig" rid="F8">8</xref>
</title>
<p>Group 1: Arath1 (CNS0ABWH); Arath2 (CNS09Y87); Arath3 (CNS0A364); Arath4 (CNS0A728); Arath5 (RAFL11-10-D10); Orysa1 (AK070887); Orysa2 (AK065180); Orysa3 (AK064903); Orysa4 (AK109929); Orysa5 (LOC_Os12g37410).</p>
<p>Group 2: Arath1 (At2g31280); Arath2 (At1g06150); Arath3 (RAFL04-15-e03); Lacsa (BQ869454); Lyces (AW621910); Medtr (BF643643); Orysa (AK074015.1).</p>
<p>Group 3: Arath1 (CNS0A7A6); Arath2 (RAFL04-16-A04); Arath3 (RAFL09-22-L13); Cycru (CB092297); Orysa1 (AK072162); Orysa2 (AK100397); Orysa3 (AK070259); Selmo (DN838497); Torru (CN201012); Ulvli (AJ892634).</p>
<p>Group 4: Arath1 (RAFL09-11-P17); Arath2 (RAFL09-63-H05); Arath3 (RAFL06-76-P19); Brana (CD823274); Goshi (AI730427); Gosra (CO113165); Medtr (AW689516); Orysa (AK060830); Poptt (BU896557); Prupe (BU045695).</p>
<p>Group 5: Arath1 (RAFL05-05-C03); Arath2 (CNS0A9PN); Gosra (CO130855); Hevbr (CB376393); Lacse (BU011020); Orysa (AK103103); Phaac (BU791117); Triae (BJ233459).</p>
<p>Group 6: Arath1 (RAFL05-17-I08); Arath2 (CNS0A6ZP); Aspof (CV291431); Glyma (BM143067); Gosar (BG442153); Orysa (AK064902); Pontr (CD576165); Triae (CK161649); Vitvi (CB980452).</p>
<p>Group 7: Arath (RAFL09-25-N17); Brana (CD836460); Mescr (BM301482); Nicbe (CK290710); Orysa1 (AK067685); Orysa2 (LOC_Os06g48350); Triae (CV066319).</p>
<p>Group 8: Arath (RAFL07-08-P17); Chlre (BE121764); Mesvi1 (DN255332); Mesvi2 (DN261354); Orysa (AK072620); Phypa (BJ174896); Popca (CX178804); Popeu (AJ776458); Sachc (CF573523); Triae (CA499582).</p>
<p>Group 9: Allce (CF443194); Arath1 (RAFL07-09-G06); Arath2 (RAFL09-23-F23); Arath3 (At1g64140); Gosra (CO081490); Orysa1 (AK101398); Orysa2 (AK105763); Orysa3 (AK068099); Orysa4 (AK099577).</p>
<p>Group 10: Arath1 (RAFL07-11-O11); Arath2 (RAFL09-17-I10); Brana (CN732239); Orysa1 (AK069526); Orysa2 (AK100056); Poptt (BI131713); Sorbi (CN139168); Theha (BE758596).</p>
<p>Group 11: Arath1 (RAFL07-14-D12); Arath2 (CNS0A404); Glyma (CA783255); Jugre (CV197923); Medtr (AW691064); Orysa1 (AK103391); Orysa2 (AK069361); Soltu (BQ113418).</p>
<p>Group 12: Arath1 (RAFL07-18-F03); Arath2 (CNS0AB39); Brana (CD812479); Citse (CN185367); Jugre (CV196770); Orysa (AK060405); Popde (CK319714); Triae (BQ752938); Zeama (CD433782).</p>
<p>Group 13: Arath1 (RAFL08-10-M03); Arath2 (At1g48600.2); Arath3 (At1g73600); Cycru (CB093136); Gosra (CO080661); Iponi (BJ562806); Linus (CA483285); Medtr (AW587372); Orysa1 (LOC_Os05g47540); Orysa2 (AK102037); Phypa (BJ204269); Xenla (CA792398); Xentr (CX412233); Zeama (AY103779).</p>
<p>Group 14: Allce (CF450799); Arath (RAFL09-10-M04); Medtr (AW267817); Nicbe (CK295530); Orysa (AK101569); Soltu (CK258175); Zeama (CO519993).</p>
<p>Group 15: Adica (BP914226); Arath1 (CNS0ADY7); Arath2 (RAFL08-17-G21); Arath3 (RAFL04-17-N21); Arath4 (RAFL16-69-M04); Citpa (DN959636); Gosra (CO125506); Maldo (CV082382); Medtr (CX528608); Orysa1 (AK102703); Orysa2 (AK101749); Orysa3 (AK071582); Orysa4 (AK065674); Sacof (CA154823); Vitvi (CB001711); Welma (DT579937).</p>
<p>Group 16: Arath (CNS0A4RC); Medtr (AW693231); Orysa1 (AK071885); Orysa2 (AK067447).</p>
<p>Group 17: Arath1 (RAFL09-25-E19); Arath2 (At5g03190); Arath3 (RAFL19-67-G09); Arath4 (At5g01710); Gosra (CO108440); Lyces (AW738430); Medtr1 (BQ149694); Medtr2 (AC144517); Orysa1 (AK69088); Orysa2 (AK070250); Sacof (CA191644).</p>
<p>Group 18: Arath (RAFL08-18-B11); Gosra (CO115325); Nicbe (CK286574); Orysa (AK061433).</p>
<p>Group 19: Arath (CNS09ZXM); Eupes (DV113097); Helan (AJ541596); Medtr (BI309364); Orysa (AK068270); Triae (CD927685); Vitvi (CB918939); Zeama (DV166198).</p>
<p>Group 20: Arath1 (RAFL04-17-G13); Arath2 (CNS0A8YX); Brana (CD835762); Brugy (BP941533); Gosar (BF274209); Maldo (CN940921); Medtr (BE316669); Styhu (L36823).</p>
<p>Group 21: Allce (CF450138); Arath1 (RAFL07-08-G04); Arath2 (RAFL21-49-G19); Betvu (BQ594525); Brana (CD835573); Erate (DN481483); Escca (CD481239); Eupti (BP958766); Gosra (CO074819); Glyma (BU761432); Horvu (AV834976); Lacse (BQ998418); Maldo (CV881926); Medtr (CA991201); Orysa (AK100575); Popca (CX182168).</p>
<p>Group 22: Arath1 (RAFL07-11-D20); Arath2 (RAFL11-03-J07); Brana (CD836422); Horvu (CA023398); Orysa (CK041713); Sacof (CA242575); Triae (BJ247925); Zeama (CO458204).</p>
<p>Group 23: Arath1 (RAFL07-11-L03); Arath2 (RAFL09-07-L11); Citsi (CV720092); Glyma (BI892512).</p>
<p>Group 24: Arath1 (RAFL07-14-J09); Arath2 (CNS0A44P); Brana (CD828343); Glyma (BI471587); Horvu (BQ471053); Orysa (AK119634); Sacof (CA118382); Sorbi (CB928687); Triae (CA483985); Zeama (CO520078).</p>
<p>Group 25: Arath1 (RAFL09-94-P19); Arath2 (CNS0A6N0); Brana (CD835519); Citsi (CN191447); Escca (CD481312); Glyma (BE805986); Phaco (CA913939); Soltu (DN940765); Vitsh (CV098492).</p>
<p>Group 26: Arath1 (CNS0A7NI); Arath2 (CNS0A1F5); Citja (CO912573); Pethy (CV298852); Poptd (CN521002); Prupe (BU045483).</p>
</sec>
<sec>
<title>Figure
<xref ref-type="fig" rid="F9">9</xref>
</title>
<p>Arath (RAFL07-08-P17); Caeel (U10402); Ciosa (BW577210); Danre (CO350578); Dicdi (AU072562); Drome (AI297387); Homsa (BU541024); Mesvi (DN255332); Neucr (BX284746); Orysa (AK072620); Phypa (BJ174896); Strpu (CX079489); Ustma (CF644197).</p>
</sec>
<sec>
<title>Figure
<xref ref-type="fig" rid="F10">10</xref>
</title>
<p>Arath1 (RAFL08-10-M03); Arath2 (At1g48600.2); Arath3 (At1g73600); Cycru (CB093136); Gosra (CO080661); Iponi (BJ562806); Linus (CA483285); Medtr (AW587372); Orysa1 (LOC_Os05g47540); Orysa2 (AK102037); Phypa (BJ204269); Xenla (CA792398); Xentr (CX412233); Zeama (AY103779).</p>
</sec>
<sec>
<title>Figure
<xref ref-type="fig" rid="F12">12</xref>
</title>
<p>Acypi (CV847404); Ajeca (CV605785); Anoga1 (BX617953), Anoga2 (XM_552406); Apime (NW_622706); Arath1 (BP562704), Arath2 (AY065264), Arath3 (RAFL07-08-P17), Arath4 (NM_179521), Arath5 (NM_112400); Ascsu (BM964977); Bosta (CO877216); Brafl1 (BW786058), Brafl2 (BW840607); Caeel (U10402); Chlre1 (BE121764), Chlre2 (AF280543); Cicli (CV156944); Ciosa (BW577210); Cocpo (CO006101); Cryne (XM_572394); Danre (CO350578); Debha (NC_006045); Dicdi1 (AU072562), Dicdi2 (XM_631387); Drome1 (AI297387), Drome2 (AY102691); Drops (DR121964); Erate (DN481021); Galga1 (BX935835), Galga2 (CR407540); Gibze (BI750032); Glomo (BX557417); Glyso (BG045953); Haeco (CA956938); Hetgl (CB299856); Homsa1 (DR155443), Homsa2 (CR607136), Homsa3 (BU541024), Homsa4 (AY957566), Homsa5 (NM_005694); Horvu (BF628344); Locmi1 (CO854527), Locmi2 (CO825844); Mesvi1 (DN255332), Mesvi2 (DN261354); Molte (CJ368011); Musmu1 (BC030366), Musmu2 (AK010111); Neucr (BX284746); Oncmy (BX081024); Oryla (BJ737531); Orysa1 (XM_482456), Orysa2 (AK072620), Orysa3 (AK120143), Orysa4 (XM_468245); Parbr (CA581923); Phypa1 (BJ966696), Phypa2 (BJ174896); Popca (CX178804); Popeu (AJ776458); Sacce1 (NC_001136), Sacce2 (AY692601), Sacce3 (NC_001144), Sacce4 (NC_001144); Sachc (CF573523); Schma (CD081475); Schpo1 (NM_001019463), Schpo2 (NM_001022867), Schpo3 (NM_001022571); Sorbi (CD423660); Strpu (CX079489); Strra (BI323578); Tetni1 (CR709012), Tetni2 (CNS0G27U); Triae (CA499582); Trisp (BQ693345); Ustma1 (CF644197), Ustma2 (XM_754796); Xenla1 (BI477811), Xenla2 (BC084847); Xentr1 (BC075310), Xentr2 (CN119217); Yarli (XM_500713).</p>
</sec>
<sec>
<title>Figure
<xref ref-type="fig" rid="F13">13</xref>
</title>
<p>Acypi (CV847404); Ajeca (CV605785); Anoga (BX617953); Apime (NW_622706); Arath (RAFL07-08-P17); Ascsu (BM964977); Bosta (CO877216); Brafl1 (BW840607), Brafl2 (BW786058); Caeel (U10402); Chlre (BE121764); Cicli (CV156944); Ciosa (BW577210); Danre (CO350578); Debha (NC_006045); Dicdi (AU072562); Drome (AI297387); Drops (DR121964); Galga (CR407540); Gibze (BI750032); Glomo (BX557417); Haeco (CA956938); Hetgl (CB299856); Homsa (BU541024); Locmi (CO825844); Mesvi1 (DN255332), Mesvi2 (DN261354); Molte (CJ368011); Musmu (AK010111); Neucr (BX284746); Oncmy (BX081024); Oryla (BJ737531); Orysa (AK072620); Phypa (BJ174896); Popca (CX178804); Popeu (AJ776458); Sachc (CF573523); Schma (CD081475); Sorbi (CD423660); Strpu (CX079489); Strra (BI323578); Tetni (CR709012); Triae (CA499582); Trisp1 (BQ693345), Trisp2 (BQ692350); Ustma (CF644197); Xenla (BI477811); Xentr (CN119217).</p>
</sec>
</sec>
</sec>
<sec>
<title>Authors' contributions</title>
<p>Both CAH and RAJ designed and implemented the analyses for the present study. CAH drafted the manuscript and RAJ provided critical comments. Both authors have read and approved the final manuscript.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<supplementary-material content-type="local-data" id="S1">
<caption>
<title>Additional file 1</title>
<p>Alignment used to generate Figure
<xref ref-type="fig" rid="F12">12</xref>
</p>
</caption>
<media xlink:href="1741-7007-5-32-S1.doc" mimetype="application" mime-subtype="msword">
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S2">
<caption>
<title>Additional file 2</title>
<p>Alignment used to generate Figure
<xref ref-type="fig" rid="F13">13</xref>
</p>
</caption>
<media xlink:href="1741-7007-5-32-S2.doc" mimetype="application" mime-subtype="msword">
</media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<sec>
<title>Acknowledgements</title>
<p>We thank Dr Yadegari for suggesting this project. We would also like to recognize N Merchant and S Miller at the Biotechnology Computing Facility, as well as T Wheeler in the Department of Computer Science for invaluable programming suggestions. This research was supported by a University of Arizona NSF IGERT Genomics Initiative fellowship (DGE-0114420) to CAH and the NSF Plant Genome Program Grant No. DBI-0421679 (RAJ).</p>
</sec>
</ack>
<ref-list>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hanfrey</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Franceschetti</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Mayer</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Illingworth</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Michael</surname>
<given-names>AJ</given-names>
</name>
</person-group>
<article-title>Abrogation of upstream open reading frame-mediated translational control of a plant
<italic>S</italic>
-adenosylmethionine decarboxylase results in polyamine disruption and growth perturbations</article-title>
<source>J Biol Chem</source>
<year>2002</year>
<volume>277</volume>
<fpage>44131</fpage>
<lpage>44139</lpage>
<pub-id pub-id-type="pmid">12205086</pub-id>
<pub-id pub-id-type="doi">10.1074/jbc.M206161200</pub-id>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hinnebusch</surname>
<given-names>AG</given-names>
</name>
</person-group>
<article-title>Translational regulation of yeast
<italic>GCN4</italic>
</article-title>
<source>J Biol Chem</source>
<year>1997</year>
<volume>272</volume>
<fpage>21661</fpage>
<lpage>21664</lpage>
<pub-id pub-id-type="pmid">9268289</pub-id>
<pub-id pub-id-type="doi">10.1074/jbc.272.35.21661</pub-id>
</citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Werner</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Feller</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Messenguy</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Pierard</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>The leader peptide of yeast gene
<italic>CPA1 </italic>
is essential for the translational repression of its expression</article-title>
<source>Cell</source>
<year>1987</year>
<volume>49</volume>
<fpage>805</fpage>
<lpage>813</lpage>
<pub-id pub-id-type="pmid">3555844</pub-id>
<pub-id pub-id-type="doi">10.1016/0092-8674(87)90618-0</pub-id>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wiese</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Elzinga</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Wobbes</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Smeekens</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>A conserved upstream open reading frame mediates sucrose-induced repression of translation</article-title>
<source>Plant Cell</source>
<year>2004</year>
<volume>16</volume>
<fpage>1717</fpage>
<lpage>1729</lpage>
<pub-id pub-id-type="pmid">15208401</pub-id>
<pub-id pub-id-type="doi">10.1105/tpc.019349</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Galagan</surname>
<given-names>JE</given-names>
</name>
<name>
<surname>Calvo</surname>
<given-names>SE</given-names>
</name>
<name>
<surname>Cuomo</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>LJ</given-names>
</name>
<name>
<surname>Wortman</surname>
<given-names>JR</given-names>
</name>
<name>
<surname>Batzoglou</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>SI</given-names>
</name>
<name>
<surname>Basturkmen</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Spevak</surname>
<given-names>CC</given-names>
</name>
<name>
<surname>Clutterbuck</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Sequencing of
<italic>Aspergillus nidulans </italic>
and comparative analysis with
<italic>A. fumigatus </italic>
and
<italic>A. oryzae</italic>
</article-title>
<source>Nature</source>
<year>2005</year>
<volume>438</volume>
<fpage>1105</fpage>
<lpage>1115</lpage>
<pub-id pub-id-type="pmid">16372000</pub-id>
<pub-id pub-id-type="doi">10.1038/nature04341</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Churbanov</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rogozin</surname>
<given-names>IB</given-names>
</name>
<name>
<surname>Babenko</surname>
<given-names>VN</given-names>
</name>
<name>
<surname>Ali</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Koonin</surname>
<given-names>EV</given-names>
</name>
</person-group>
<article-title>Evolutionary conservation suggests a regulatory function of AUG triplets in 5'-UTRs of eukaryotic genes</article-title>
<source>Nucleic Acids Res</source>
<year>2005</year>
<volume>33</volume>
<fpage>5512</fpage>
<lpage>5520</lpage>
<pub-id pub-id-type="pmid">16186132</pub-id>
<pub-id pub-id-type="doi">10.1093/nar/gki847</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kawaguchi</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Bailey-Serres</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>mRNA sequence features that contribute to translational regulation in Arabidopsis</article-title>
<source>Nucleic Acids Res</source>
<year>2005</year>
<volume>33</volume>
<fpage>955</fpage>
<lpage>965</lpage>
<pub-id pub-id-type="pmid">15716313</pub-id>
<pub-id pub-id-type="doi">10.1093/nar/gki240</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Futterer</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Hohn</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Role of an upstream open reading frame in the translation of polycistronic mRNAs in plant cells</article-title>
<source>Nucleic Acids Res</source>
<year>1992</year>
<volume>20</volume>
<fpage>3851</fpage>
<lpage>3857</lpage>
<pub-id pub-id-type="pmid">1508670</pub-id>
<pub-id pub-id-type="doi">10.1093/nar/20.15.3851</pub-id>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kozak</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Effects of intercistronic length on the efficiency of reinitiation by eucaryotic ribosomes</article-title>
<source>Mol Cell Biol</source>
<year>1987</year>
<volume>7</volume>
<fpage>3438</fpage>
<lpage>3445</lpage>
<pub-id pub-id-type="pmid">3683388</pub-id>
</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kozak</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Pushing the limits of the scanning mechanism for initiation of translation</article-title>
<source>Gene</source>
<year>2002</year>
<volume>299</volume>
<fpage>1</fpage>
<lpage>34</lpage>
<pub-id pub-id-type="pmid">12459250</pub-id>
<pub-id pub-id-type="doi">10.1016/S0378-1119(02)01056-9</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Luukkonen</surname>
<given-names>BG</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Schwartz</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Efficiency of reinitiation of translation on human immunodeficiency virus type 1 mRNAs is determined by the length of the upstream open reading frame and by intercistronic distance</article-title>
<source>J Virol</source>
<year>1995</year>
<volume>69</volume>
<fpage>4086</fpage>
<lpage>4094</lpage>
<pub-id pub-id-type="pmid">7769666</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gopfert</surname>
<given-names>U</given-names>
</name>
<name>
<surname>Kullmann</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hengst</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Cell cycle-dependent translation of p27 involves a responsive element in its 5'-UTR that overlaps with a uORF</article-title>
<source>Hum Mol Genet</source>
<year>2003</year>
<volume>12</volume>
<fpage>1767</fpage>
<lpage>1779</lpage>
<pub-id pub-id-type="pmid">12837699</pub-id>
<pub-id pub-id-type="doi">10.1093/hmg/ddg177</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gray</surname>
<given-names>TA</given-names>
</name>
<name>
<surname>Saitoh</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Nicholls</surname>
<given-names>RD</given-names>
</name>
</person-group>
<article-title>An imprinted, mammalian bicistronic transcript encodes two independent proteins</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>1999</year>
<volume>96</volume>
<fpage>5616</fpage>
<lpage>5621</lpage>
<pub-id pub-id-type="pmid">10318933</pub-id>
<pub-id pub-id-type="doi">10.1073/pnas.96.10.5616</pub-id>
</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fang</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Sachs</surname>
<given-names>MS</given-names>
</name>
</person-group>
<article-title>Evolutionarily conserved features of the arginine attenuator peptide provide the necessary requirements for its function in translational regulation</article-title>
<source>J Biol Chem</source>
<year>2000</year>
<volume>275</volume>
<fpage>26710</fpage>
<lpage>26719</lpage>
<pub-id pub-id-type="pmid">10818103</pub-id>
<pub-id pub-id-type="doi">10.1074/jbc.275.12.8945</pub-id>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jin</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Turcott</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Englehardt</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Mize</surname>
<given-names>GJ</given-names>
</name>
<name>
<surname>Morris</surname>
<given-names>DR</given-names>
</name>
</person-group>
<article-title>The two upstream open reading frames of oncogene
<italic>mdm2 </italic>
have different translational regulatory properties</article-title>
<source>J Biol Chem</source>
<year>2003</year>
<volume>278</volume>
<fpage>25716</fpage>
<lpage>25721</lpage>
<pub-id pub-id-type="pmid">12730202</pub-id>
<pub-id pub-id-type="doi">10.1074/jbc.M300316200</pub-id>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>EH</given-names>
</name>
<name>
<surname>Couture</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Harvey</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Garneau</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Pelletier</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>An upstream open reading frame impedes translation of the huntingtin gene</article-title>
<source>Nucleic Acids Res</source>
<year>2002</year>
<volume>30</volume>
<fpage>5110</fpage>
<lpage>5119</lpage>
<pub-id pub-id-type="pmid">12466534</pub-id>
<pub-id pub-id-type="doi">10.1093/nar/gkf664</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lincoln</surname>
<given-names>AJ</given-names>
</name>
<name>
<surname>Monczak</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Williams</surname>
<given-names>SC</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>PF</given-names>
</name>
</person-group>
<article-title>Inhibition of CCAAT/enhancer-binding protein alpha and beta translation by upstream open reading frames</article-title>
<source>J Biol Chem</source>
<year>1998</year>
<volume>273</volume>
<fpage>9552</fpage>
<lpage>9560</lpage>
<pub-id pub-id-type="pmid">9545285</pub-id>
<pub-id pub-id-type="doi">10.1074/jbc.273.16.9552</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hill</surname>
<given-names>JR</given-names>
</name>
<name>
<surname>Morris</surname>
<given-names>DR</given-names>
</name>
</person-group>
<article-title>Cell-specific translational regulation of
<italic>S</italic>
-adenosylmethionine decarboxylase mRNA</article-title>
<source>J Biol Chem</source>
<year>1993</year>
<volume>268</volume>
<fpage>726</fpage>
<lpage>731</lpage>
<pub-id pub-id-type="pmid">8416975</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>SH</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>KY</given-names>
</name>
</person-group>
<article-title>Characterization and expression of two members of the
<italic>S</italic>
-adenosylmethionine decarboxylase gene family in carnation flower</article-title>
<source>Plant Mol Biol</source>
<year>1997</year>
<volume>34</volume>
<fpage>371</fpage>
<lpage>382</lpage>
<pub-id pub-id-type="pmid">9225849</pub-id>
<pub-id pub-id-type="doi">10.1023/A:1005811229988</pub-id>
</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Martinez-Garcia</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>Moyano</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Alcocer</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>Two bZIP proteins from
<italic>Antirrhinum </italic>
flowers preferentially bind a hybrid C-box/G-box motif and help to define a new sub-family of bZIP transcription factors</article-title>
<source>Plant J</source>
<year>1998</year>
<volume>13</volume>
<fpage>489</fpage>
<lpage>505</lpage>
<pub-id pub-id-type="pmid">9680995</pub-id>
<pub-id pub-id-type="doi">10.1046/j.1365-313X.1998.00050.x</pub-id>
</citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Evans</surname>
<given-names>PT</given-names>
</name>
<name>
<surname>Malmberg</surname>
<given-names>RL</given-names>
</name>
</person-group>
<article-title>Do polyamines have roles in plant development?</article-title>
<source>Annu Rev Plant Phys</source>
<year>1989</year>
<volume>40</volume>
<fpage>235</fpage>
<lpage>269</lpage>
</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Walden</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Cordeiro</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Tiburcio</surname>
<given-names>AF</given-names>
</name>
</person-group>
<article-title>Polyamines: small molecules triggering pathways in plant growth and development</article-title>
<source>Plant Physiol</source>
<year>1997</year>
<volume>113</volume>
<fpage>1009</fpage>
<lpage>1013</lpage>
<pub-id pub-id-type="pmid">9112764</pub-id>
<pub-id pub-id-type="doi">10.1104/pp.113.4.1009</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Dietrich</surname>
<given-names>FS</given-names>
</name>
</person-group>
<article-title>Identification and characterization of upstream open reading frames (uORF) in the 5' untranslated regions (UTR) of genes in
<italic>Saccharomyces cerevisiae</italic>
</article-title>
<source>Curr Genet</source>
<year>2005</year>
<volume>48</volume>
<fpage>77</fpage>
<lpage>87</lpage>
<pub-id pub-id-type="pmid">16012843</pub-id>
<pub-id pub-id-type="doi">10.1007/s00294-005-0001-x</pub-id>
</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Castelli</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Aury</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Jaillon</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Wincker</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Clepet</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Menard</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Cruaud</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Quetier</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Scarpelli</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Schachter</surname>
<given-names>V</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Whole genome sequence comparisons and "full-length" cDNA sequences: a combined approach to evaluate and improve Arabidopsis genome annotation</article-title>
<source>Genome Res</source>
<year>2004</year>
<volume>14</volume>
<fpage>406</fpage>
<lpage>413</lpage>
<pub-id pub-id-type="pmid">14993207</pub-id>
<pub-id pub-id-type="doi">10.1101/gr.1515604</pub-id>
</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kikuchi</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Satoh</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Nagata</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Kawagashira</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Doi</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Kishimoto</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Yazaki</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Ishikawa</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Yamada</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Ooka</surname>
<given-names>H</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Collection, mapping, and annotation of over 28,000 cDNA clones from
<italic>japonica </italic>
rice</article-title>
<source>Science</source>
<year>2003</year>
<volume>301</volume>
<fpage>376</fpage>
<lpage>379</lpage>
<pub-id pub-id-type="pmid">12869764</pub-id>
<pub-id pub-id-type="doi">10.1126/science.1081288</pub-id>
</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Seki</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Narusaka</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kamiya</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Ishida</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Satou</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Sakurai</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Nakajima</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Enju</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Akiyama</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Oono</surname>
<given-names>Y</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Functional annotation of a full-length
<italic>Arabidopsis </italic>
cDNA collection</article-title>
<source>Science</source>
<year>2002</year>
<volume>296</volume>
<fpage>141</fpage>
<lpage>145</lpage>
<pub-id pub-id-type="pmid">11910074</pub-id>
<pub-id pub-id-type="doi">10.1126/science.1071006</pub-id>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wolfe</surname>
<given-names>KH</given-names>
</name>
<name>
<surname>Gouy</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>YW</given-names>
</name>
<name>
<surname>Sharp</surname>
<given-names>PM</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>WH</given-names>
</name>
</person-group>
<article-title>Date of the monocot-dicot divergence estimated from chloroplast DNA sequence data</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>1989</year>
<volume>86</volume>
<fpage>6201</fpage>
<lpage>6205</lpage>
<pub-id pub-id-type="pmid">2762323</pub-id>
<pub-id pub-id-type="doi">10.1073/pnas.86.16.6201</pub-id>
</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sanderson</surname>
<given-names>MJ</given-names>
</name>
</person-group>
<article-title>A nonparametric approach to estimating divergence times in the absence of rate constancy</article-title>
<source>Mol Biol Evol</source>
<year>1997</year>
<volume>14</volume>
<fpage>1218</fpage>
<lpage>1231</lpage>
</citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chaw</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>CC</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>HL</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>WH</given-names>
</name>
</person-group>
<article-title>Dating the monocot-dicot divergence and the origin of core eudicots using whole chloroplast genomes</article-title>
<source>J Mol Evol</source>
<year>2004</year>
<volume>58</volume>
<fpage>424</fpage>
<lpage>441</lpage>
<pub-id pub-id-type="pmid">15114421</pub-id>
<pub-id pub-id-type="doi">10.1007/s00239-003-2564-9</pub-id>
</citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wolfe</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>Robustness – it's not where you think it is</article-title>
<source>Nat Genet</source>
<year>2000</year>
<volume>25</volume>
<fpage>3</fpage>
<lpage>4</lpage>
<pub-id pub-id-type="pmid">10802639</pub-id>
<pub-id pub-id-type="doi">10.1038/75560</pub-id>
</citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schranz</surname>
<given-names>ME</given-names>
</name>
<name>
<surname>Mitchell-Olds</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Independent ancient polyploidy events in the sister families Brassicaceae and Cleomaceae</article-title>
<source>Plant Cell</source>
<year>2006</year>
<volume>18</volume>
<fpage>1152</fpage>
<lpage>1165</lpage>
<pub-id pub-id-type="pmid">16617098</pub-id>
<pub-id pub-id-type="doi">10.1105/tpc.106.041111</pub-id>
</citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Blanc</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Wolfe</surname>
<given-names>KH</given-names>
</name>
</person-group>
<article-title>Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution</article-title>
<source>Plant Cell</source>
<year>2004</year>
<volume>16</volume>
<fpage>1679</fpage>
<lpage>1691</lpage>
<pub-id pub-id-type="pmid">15208398</pub-id>
<pub-id pub-id-type="doi">10.1105/tpc.021410</pub-id>
</citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Blanc</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Hokamp</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Wolfe</surname>
<given-names>KH</given-names>
</name>
</person-group>
<article-title>A recent polyploidy superimposed on older large-scale duplications in the
<italic>Arabidopsis </italic>
genome</article-title>
<source>Genome Res</source>
<year>2003</year>
<volume>13</volume>
<fpage>137</fpage>
<lpage>144</lpage>
<pub-id pub-id-type="pmid">12566392</pub-id>
<pub-id pub-id-type="doi">10.1101/gr.751803</pub-id>
</citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Riechmann</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Heard</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Reuber</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Keddie</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Adam</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Pineda</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Ratcliffe</surname>
<given-names>OJ</given-names>
</name>
<name>
<surname>Samaha</surname>
<given-names>RR</given-names>
</name>
<etal></etal>
</person-group>
<article-title>
<italic>Arabidopsis </italic>
transcription factors: genome-wide comparative analysis among eukaryotes</article-title>
<source>Science</source>
<year>2000</year>
<volume>290</volume>
<fpage>2105</fpage>
<lpage>2110</lpage>
<pub-id pub-id-type="pmid">11118137</pub-id>
<pub-id pub-id-type="doi">10.1126/science.290.5499.2105</pub-id>
</citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marchler-Bauer</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Anderson</surname>
<given-names>JB</given-names>
</name>
<name>
<surname>Cherukuri</surname>
<given-names>PF</given-names>
</name>
<name>
<surname>DeWeese-Scott</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Geer</surname>
<given-names>LY</given-names>
</name>
<name>
<surname>Gwadz</surname>
<given-names>M</given-names>
</name>
<name>
<surname>He</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Hurwitz</surname>
<given-names>DI</given-names>
</name>
<name>
<surname>Jackson</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Ke</surname>
<given-names>Z</given-names>
</name>
<etal></etal>
</person-group>
<article-title>CDD: a Conserved Domain Database for protein classification</article-title>
<source>Nucleic Acids Res</source>
<year>2005</year>
<volume>33</volume>
<fpage>D192</fpage>
<lpage>D196</lpage>
<pub-id pub-id-type="pmid">15608175</pub-id>
<pub-id pub-id-type="doi">10.1093/nar/gki069</pub-id>
</citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zdobnov</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Apweiler</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>InterProScan – an integration platform for the signature-recognition methods in InterPro</article-title>
<source>Bioinformatics</source>
<year>2001</year>
<volume>17</volume>
<fpage>847</fpage>
<lpage>848</lpage>
<pub-id pub-id-type="pmid">11590104</pub-id>
<pub-id pub-id-type="doi">10.1093/bioinformatics/17.9.847</pub-id>
</citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eastmond</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Graham</surname>
<given-names>IA</given-names>
</name>
</person-group>
<article-title>Trehalose metabolism: a regulatory role for trehalose-6-phosphate?</article-title>
<source>Curr Opin Plant Biol</source>
<year>2003</year>
<volume>6</volume>
<fpage>231</fpage>
<lpage>235</lpage>
<pub-id pub-id-type="pmid">12753972</pub-id>
<pub-id pub-id-type="doi">10.1016/S1369-5266(03)00037-2</pub-id>
</citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cruz-Ramirez</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Lopez-Bucio</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Ramirez-Pimentel</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Zurita-Silva</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Sanchez-Calderon</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Ramirez-Chavez</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Gonzalez-Ortega</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Herrera-Estrella</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>The
<italic>xipotl </italic>
mutant of Arabidopsis reveals a critical role for phospholipid metabolism in root system development and epidermal cell integrity</article-title>
<source>Plant Cell</source>
<year>2004</year>
<volume>16</volume>
<fpage>2020</fpage>
<lpage>2034</lpage>
<pub-id pub-id-type="pmid">15295103</pub-id>
<pub-id pub-id-type="doi">10.1105/tpc.103.018648</pub-id>
</citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mou</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Dai</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Han</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Ouyang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Bao</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Silencing of phosphoethanolamine
<italic>N</italic>
-methyltransferase results in temperature-sensitive male sterility and salt hypersensitivity in Arabidopsis</article-title>
<source>Plant Cell</source>
<year>2002</year>
<volume>14</volume>
<fpage>2031</fpage>
<lpage>2043</lpage>
<pub-id pub-id-type="pmid">12215503</pub-id>
<pub-id pub-id-type="doi">10.1105/tpc.001701</pub-id>
</citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>X</given-names>
</name>
</person-group>
<article-title>Regulatory functions of phospholipase D and phosphatidic acid in plant growth, development, and stress responses</article-title>
<source>Plant Physiol</source>
<year>2005</year>
<volume>139</volume>
<fpage>566</fpage>
<lpage>573</lpage>
<pub-id pub-id-type="pmid">16219918</pub-id>
<pub-id pub-id-type="doi">10.1104/pp.105.068809</pub-id>
</citation>
</ref>
<ref id="B41">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Geballe</surname>
<given-names>AP</given-names>
</name>
<name>
<surname>Sachs</surname>
<given-names>MS</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
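<name>
<surname>Sonenberg</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Hershey</surname>
<given-names>JWB</given-names>
</name>
<name>
<surname>Mathews</surname>
<given-names>MB</given-names>
</name>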
</person-group>
<article-title>Translational control by upstream open reading frames</article-title>
<source>Translational control of gene expression</source>
<year>2000</year>
<publisher-name>Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press</publisher-name>
<fpage>595</fpage>
<lpage>614</lpage>
</citation>
</ref>
<ref id="B42">
<citation citation-type="other">
<article-title>UCSC
<italic>X. tropicalis </italic>
BLAT search</article-title>
<ext-link ext-link-type="uri" xlink:href="http://genome.ucsc.edu/cgi-bin/hgBlat"></ext-link>
</citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Felsenstein</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Cases in which parsimony or compatibility methods will be positively misleading</article-title>
<source>Syst Zool</source>
<year>1978</year>
<volume>27</volume>
<fpage>401</fpage>
<lpage>410</lpage>
<pub-id pub-id-type="doi">10.2307/2412923</pub-id>
</citation>
</ref>
<ref id="B44">
<citation citation-type="other">
<article-title>The Maize full length cDNA Project</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.maizecdna.org"></ext-link>
</citation>
</ref>
<ref id="B45">
<citation citation-type="other">
<article-title>RIKEN Poplar full-length cDNA clones</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.brc.riken.jp/lab/epd/Eng/catalog/poplar.shtml"></ext-link>
</citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kozak</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs</article-title>
<source>Nucleic Acids Res</source>
<year>1987</year>
<volume>15</volume>
<fpage>8125</fpage>
<lpage>8148</lpage>
<pub-id pub-id-type="pmid">3313277</pub-id>
<pub-id pub-id-type="doi">10.1093/nar/15.20.8125</pub-id>
</citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hanfrey</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Elliott</surname>
<given-names>KA</given-names>
</name>
<name>
<surname>Franceschetti</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Mayer</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Illingworth</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Michael</surname>
<given-names>AJ</given-names>
</name>
</person-group>
<article-title>A dual upstream open reading frame-based autoregulatory circuit controlling polyamine-responsive translation</article-title>
<source>J Biol Chem</source>
<year>2005</year>
<volume>280</volume>
<fpage>39229</fpage>
<lpage>39237</lpage>
<pub-id pub-id-type="pmid">16176926</pub-id>
<pub-id pub-id-type="doi">10.1074/jbc.M509340200</pub-id>
</citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Law</surname>
<given-names>GL</given-names>
</name>
<name>
<surname>Raney</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Heusner</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Morris</surname>
<given-names>DR</given-names>
</name>
</person-group>
<article-title>Polyamine regulation of ribosome pausing at the upstream open reading frame of
<italic>S</italic>
-adenosylmethionine decarboxylase</article-title>
<source>J Biol Chem</source>
<year>2001</year>
<volume>276</volume>
<fpage>38036</fpage>
<lpage>38043</lpage>
<pub-id pub-id-type="pmid">11489903</pub-id>
</citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Raney</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Law</surname>
<given-names>GL</given-names>
</name>
<name>
<surname>Mize</surname>
<given-names>GJ</given-names>
</name>
<name>
<surname>Morris</surname>
<given-names>DR</given-names>
</name>
</person-group>
<article-title>Regulated translation termination at the upstream open reading frame in
<italic>S</italic>
-adenosylmethionine decarboxylase mRNA</article-title>
<source>J Biol Chem</source>
<year>2002</year>
<volume>277</volume>
<fpage>5988</fpage>
<lpage>5994</lpage>
<pub-id pub-id-type="pmid">11741992</pub-id>
<pub-id pub-id-type="doi">10.1074/jbc.M108375200</pub-id>
</citation>
</ref>
<ref id="B50">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tabuchi</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Okada</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Azuma</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Nanmori</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Yasuda</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Posttranscriptional regulation by the upstream open reading frame of the phosphoethanolamine
<italic>N</italic>
-methyltransferase gene</article-title>
<source>Biosci Biotechnol Biochem</source>
<year>2006</year>
<volume>70</volume>
<fpage>2330</fpage>
<lpage>2334</lpage>
<pub-id pub-id-type="pmid">16960350</pub-id>
<pub-id pub-id-type="doi">10.1271/bbb.60309</pub-id>
</citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bayer</surname>
<given-names>TS</given-names>
</name>
<name>
<surname>Booth</surname>
<given-names>LN</given-names>
</name>
<name>
<surname>Knudsen</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Ellington</surname>
<given-names>AD</given-names>
</name>
</person-group>
<article-title>Arginine-rich motifs present multiple interfaces for specific binding by RNA</article-title>
<source>RNA</source>
<year>2005</year>
<volume>11</volume>
<fpage>1848</fpage>
<lpage>1857</lpage>
<pub-id pub-id-type="pmid">16314457</pub-id>
<pub-id pub-id-type="doi">10.1261/rna.2167605</pub-id>
</citation>
</ref>
<ref id="B52">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Imai</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hanzawa</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Komura</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Yamamoto</surname>
<given-names>KT</given-names>
</name>
<name>
<surname>Komeda</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Takahashi</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>The dwarf phenotype of the
<italic>Arabidopsis acl5 </italic>
mutant is suppressed by a mutation in an upstream ORF of a bHLH gene</article-title>
<source>Development</source>
<year>2006</year>
<volume>133</volume>
<fpage>3575</fpage>
<lpage>3585</lpage>
<pub-id pub-id-type="pmid">16936072</pub-id>
<pub-id pub-id-type="doi">10.1242/dev.02535</pub-id>
</citation>
</ref>
<ref id="B53">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hollick</surname>
<given-names>JB</given-names>
</name>
<name>
<surname>Patterson</surname>
<given-names>GI</given-names>
</name>
<name>
<surname>Asmundsson</surname>
<given-names>IM</given-names>
</name>
<name>
<surname>Chandler</surname>
<given-names>VL</given-names>
</name>
</person-group>
<article-title>Paramutation alters regulatory control of the maize pl locus</article-title>
<source>Genetics</source>
<year>2000</year>
<volume>154</volume>
<fpage>1827</fpage>
<lpage>1838</lpage>
<pub-id pub-id-type="pmid">10747073</pub-id>
</citation>
</ref>
<ref id="B54">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lai</surname>
<given-names>EC</given-names>
</name>
</person-group>
<article-title>RNA sensors and riboswitches: self-regulating messages</article-title>
<source>Curr Biol</source>
<year>2003</year>
<volume>13</volume>
<fpage>R285</fpage>
<lpage>R291</lpage>
<pub-id pub-id-type="pmid">12676109</pub-id>
<pub-id pub-id-type="doi">10.1016/S0960-9822(03)00203-3</pub-id>
</citation>
</ref>
<ref id="B55">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yoine</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ohto</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Onai</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Mita</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Nakamura</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>The
<italic>lba1 </italic>
mutation of UPF1 RNA helicase involved in nonsense-mediated mRNA decay causes pleiotropic phenotypic changes and altered sugar signalling in Arabidopsis</article-title>
<source>Plant J</source>
<year>2006</year>
<volume>47</volume>
<fpage>49</fpage>
<lpage>62</lpage>
<pub-id pub-id-type="pmid">16740149</pub-id>
<pub-id pub-id-type="doi">10.1111/j.1365-313X.2006.02771.x</pub-id>
</citation>
</ref>
<ref id="B56">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gaba</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Jacobson</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Sachs</surname>
<given-names>MS</given-names>
</name>
</person-group>
<article-title>Ribosome occupancy of the yeast
<italic>CPA1 </italic>
upstream open reading frame termination codon modulates nonsense-mediated mRNA decay</article-title>
<source>Mol Cell</source>
<year>2005</year>
<volume>20</volume>
<fpage>449</fpage>
<lpage>460</lpage>
<pub-id pub-id-type="pmid">16285926</pub-id>
<pub-id pub-id-type="doi">10.1016/j.molcel.2005.09.019</pub-id>
</citation>
</ref>
<ref id="B57">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Akiva</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Toporik</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Edelheit</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Peretz</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Diber</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Shemesh</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Novik</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Sorek</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Transcription-mediated gene fusion in the human genome</article-title>
<source>Genome Res</source>
<year>2006</year>
<volume>16</volume>
<fpage>30</fpage>
<lpage>36</lpage>
<pub-id pub-id-type="pmid">16344562</pub-id>
<pub-id pub-id-type="doi">10.1101/gr.4137606</pub-id>
</citation>
</ref>
<ref id="B58">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Douzery</surname>
<given-names>EJ</given-names>
</name>
<name>
<surname>Snell</surname>
<given-names>EA</given-names>
</name>
<name>
<surname>Bapteste</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Delsuc</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Philippe</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>The timing of eukaryotic evolution: does a relaxed molecular clock reconcile proteins and fossils?</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2004</year>
<volume>101</volume>
<fpage>15386</fpage>
<lpage>15391</lpage>
<pub-id pub-id-type="pmid">15494441</pub-id>
<pub-id pub-id-type="doi">10.1073/pnas.0403984101</pub-id>
</citation>
</ref>
<ref id="B59">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Parola</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Kobilka</surname>
<given-names>BK</given-names>
</name>
</person-group>
<article-title>The peptide product of a 5' leader cistron in the beta 2 adrenergic receptor mRNA inhibits receptor synthesis</article-title>
<source>J Biol Chem</source>
<year>1994</year>
<volume>269</volume>
<fpage>4497</fpage>
<lpage>4505</lpage>
<pub-id pub-id-type="pmid">8308019</pub-id>
</citation>
</ref>
<ref id="B60">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pendleton</surname>
<given-names>LC</given-names>
</name>
<name>
<surname>Goodwin</surname>
<given-names>BL</given-names>
</name>
<name>
<surname>Solomonson</surname>
<given-names>LP</given-names>
</name>
<name>
<surname>Eichler</surname>
<given-names>DC</given-names>
</name>
</person-group>
<article-title>Regulation of endothelial argininosuccinate synthase expression and NO production by an upstream open reading frame</article-title>
<source>J Biol Chem</source>
<year>2005</year>
<volume>280</volume>
<fpage>24252</fpage>
<lpage>24260</lpage>
<pub-id pub-id-type="pmid">15851478</pub-id>
<pub-id pub-id-type="doi">10.1074/jbc.M500106200</pub-id>
</citation>
</ref>
<ref id="B61">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marton</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Cordts</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Broadhvest</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Dresselhaus</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Micropylar pollen tube guidance by egg apparatus 1 of maize</article-title>
<source>Science</source>
<year>2005</year>
<volume>307</volume>
<fpage>573</fpage>
<lpage>576</lpage>
<pub-id pub-id-type="pmid">15681383</pub-id>
<pub-id pub-id-type="doi">10.1126/science.1104954</pub-id>
</citation>
</ref>
<ref id="B62">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wen</surname>
<given-names>JQ</given-names>
</name>
<name>
<surname>Lease</surname>
<given-names>KA</given-names>
</name>
<name>
<surname>Walker</surname>
<given-names>JC</given-names>
</name>
</person-group>
<article-title>DVL, a novel class of small polypeptides: overexpression alters
<italic>Arabidopsis </italic>
development</article-title>
<source>Plant J</source>
<year>2004</year>
<volume>37</volume>
<fpage>668</fpage>
<lpage>677</lpage>
<pub-id pub-id-type="pmid">14871303</pub-id>
<pub-id pub-id-type="doi">10.1111/j.1365-313X.2003.01994.x</pub-id>
</citation>
</ref>
<ref id="B63">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Graveley</surname>
<given-names>BR</given-names>
</name>
</person-group>
<article-title>Sorting out the complexity of SR protein functions</article-title>
<source>RNA</source>
<year>2000</year>
<volume>6</volume>
<fpage>1197</fpage>
<lpage>1211</lpage>
<pub-id pub-id-type="pmid">10999598</pub-id>
<pub-id pub-id-type="doi">10.1017/S1355838200000960</pub-id>
</citation>
</ref>
<ref id="B64">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fu</surname>
<given-names>XD</given-names>
</name>
</person-group>
<article-title>The superfamily of arginine/serine-rich splicing factors</article-title>
<source>RNA</source>
<year>1995</year>
<volume>1</volume>
<fpage>663</fpage>
<lpage>680</lpage>
<pub-id pub-id-type="pmid">7585252</pub-id>
</citation>
</ref>
<ref id="B65">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sanford</surname>
<given-names>JR</given-names>
</name>
<name>
<surname>Gray</surname>
<given-names>NK</given-names>
</name>
<name>
<surname>Beckmann</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Caceres</surname>
<given-names>JF</given-names>
</name>
</person-group>
<article-title>A novel role for shuttling SR proteins in mRNA translation</article-title>
<source>Genes Dev</source>
<year>2004</year>
<volume>18</volume>
<fpage>755</fpage>
<lpage>768</lpage>
<pub-id pub-id-type="pmid">15082528</pub-id>
<pub-id pub-id-type="doi">10.1101/gad.286404</pub-id>
</citation>
</ref>
<ref id="B66">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Crowe</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>XQ</given-names>
</name>
<name>
<surname>Rothnagel</surname>
<given-names>JA</given-names>
</name>
</person-group>
<article-title>Evidence for conservation and selection of upstream open reading frames suggests probable encoding of bioactive peptides</article-title>
<source>BMC Genomics</source>
<year>2006</year>
<volume>7</volume>
<fpage>16</fpage>
<pub-id pub-id-type="pmid">16438715</pub-id>
<pub-id pub-id-type="doi">10.1186/1471-2164-7-16</pub-id>
</citation>
</ref>
<ref id="B67">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Toledo-Ortiz</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Huq</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Quail</surname>
<given-names>PH</given-names>
</name>
</person-group>
<article-title>The Arabidopsis basic/helix-loop-helix transcription factor family</article-title>
<source>Plant Cell</source>
<year>2003</year>
<volume>15</volume>
<fpage>1749</fpage>
<lpage>1770</lpage>
<pub-id pub-id-type="pmid">12897250</pub-id>
<pub-id pub-id-type="doi">10.1105/tpc.013839</pub-id>
</citation>
</ref>
<ref id="B68">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bailey</surname>
<given-names>PC</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Toledo-Ortiz</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Quail</surname>
<given-names>PH</given-names>
</name>
<name>
<surname>Huq</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Heim</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Jakoby</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Werber</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Weisshaar</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>Update on the basic helix-loop-helix transcription factor gene family in Arabidopsis thaliana</article-title>
<source>Plant Cell</source>
<year>2003</year>
<volume>15</volume>
<fpage>2497</fpage>
<lpage>2502</lpage>
<pub-id pub-id-type="pmid">14600211</pub-id>
<pub-id pub-id-type="doi">10.1105/tpc.151140</pub-id>
</citation>
</ref>
<ref id="B69">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Heim</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Jakoby</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Werber</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Weisshaar</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Bailey</surname>
<given-names>PC</given-names>
</name>
</person-group>
<article-title>The basic helix-loop-helix transcription factor family in plants: a genome-wide study of protein structure and functional diversity</article-title>
<source>Mol Biol Evol</source>
<year>2003</year>
<volume>20</volume>
<fpage>735</fpage>
<lpage>747</lpage>
<pub-id pub-id-type="pmid">12679534</pub-id>
<pub-id pub-id-type="doi">10.1093/molbev/msg088</pub-id>
</citation>
</ref>
<ref id="B70">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hayden</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Wheeler</surname>
<given-names>TJ</given-names>
</name>
<name>
<surname>Jorgensen</surname>
<given-names>RA</given-names>
</name>
</person-group>
<article-title>Evaluating and improving cDNA sequence quality with cQC</article-title>
<source>Bioinformatics</source>
<year>2005</year>
<volume>21</volume>
<fpage>4414</fpage>
<lpage>4415</lpage>
<pub-id pub-id-type="pmid">16234324</pub-id>
<pub-id pub-id-type="doi">10.1093/bioinformatics/bti709</pub-id>
</citation>
</ref>
<ref id="B71">
<citation citation-type="other">
<article-title>EMBOSS-European Molecular Biology Open Software Suite</article-title>
<ext-link ext-link-type="uri" xlink:href="http://emboss.sourceforge.net"></ext-link>
</citation>
</ref>
<ref id="B72">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gish</surname>
<given-names>W</given-names>
</name>
<name>
<surname>States</surname>
<given-names>DJ</given-names>
</name>
</person-group>
<article-title>Identification of protein coding regions by database similarity search</article-title>
<source>Nat Genet</source>
<year>1993</year>
<volume>3</volume>
<fpage>266</fpage>
<lpage>272</lpage>
<pub-id pub-id-type="pmid">8485583</pub-id>
<pub-id pub-id-type="doi">10.1038/ng0393-266</pub-id>
</citation>
</ref>
<ref id="B73">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Altschul</surname>
<given-names>SF</given-names>
</name>
<name>
<surname>Gish</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Myers</surname>
<given-names>EW</given-names>
</name>
<name>
<surname>Lipman</surname>
<given-names>DJ</given-names>
</name>
</person-group>
<article-title>Basic local alignment search tool</article-title>
<source>J Mol Biol</source>
<year>1990</year>
<volume>215</volume>
<fpage>403</fpage>
<lpage>410</lpage>
<pub-id pub-id-type="pmid">2231712</pub-id>
</citation>
</ref>
<ref id="B74">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Thompson</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Higgins</surname>
<given-names>DG</given-names>
</name>
<name>
<surname>Gibson</surname>
<given-names>TJ</given-names>
</name>
</person-group>
<article-title>CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice</article-title>
<source>Nucleic Acids Res</source>
<year>1994</year>
<volume>22</volume>
<fpage>4673</fpage>
<lpage>4680</lpage>
<pub-id pub-id-type="pmid">7984417</pub-id>
<pub-id pub-id-type="doi">10.1093/nar/22.22.4673</pub-id>
</citation>
</ref>
<ref id="B75">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Clamp</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Cuff</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Searle</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Barton</surname>
<given-names>GJ</given-names>
</name>
</person-group>
<article-title>The Jalview Java Alignment Editor</article-title>
<source>Bioinformatics</source>
<year>2004</year>
<volume>20</volume>
<fpage>426</fpage>
<lpage>427</lpage>
<pub-id pub-id-type="pmid">14960472</pub-id>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btg430</pub-id>
</citation>
</ref>
<ref id="B76">
<citation citation-type="other">
<article-title>The Arabidopsis Information Resource</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.arabidopsis.org"></ext-link>
</citation>
</ref>
<ref id="B77">
<citation citation-type="other">
<article-title>TIGR Rice Genome Annotation Project – web BLAST server</article-title>
<ext-link ext-link-type="uri" xlink:href="http://tigrblast.tigr.org/euk-blast/index.cgi?project=osa1"></ext-link>
</citation>
</ref>
<ref id="B78">
<citation citation-type="other">
<article-title>TAIR locus At1g73600</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.arabidopsis.org/servlets/TairObject?id=29540&amp;type=locus"></ext-link>
</citation>
</ref>
<ref id="B79">
<citation citation-type="other">
<article-title>Pairwise KaKs Perl script</article-title>
<ext-link ext-link-type="uri" xlink:href="http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/scripts/utilities/pairwise_kaks.PLS?cvsroot=bioperl&amp;rev=HEAD"></ext-link>
</citation>
</ref>
<ref id="B80">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Z</given-names>
</name>
</person-group>
<article-title>Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution</article-title>
<source>Mol Biol Evol</source>
<year>1998</year>
<volume>15</volume>
<fpage>568</fpage>
<lpage>573</lpage>
<pub-id pub-id-type="pmid">9580986</pub-id>
</citation>
</ref>
<ref id="B81">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Anisimova</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Bielawski</surname>
<given-names>JP</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Z</given-names>
</name>
</person-group>
<article-title>Accuracy and power of the likelihood ratio test in detecting adaptive molecular evolution</article-title>
<source>Mol Biol Evol</source>
<year>2001</year>
<volume>18</volume>
<fpage>1585</fpage>
<lpage>1592</lpage>
<pub-id pub-id-type="pmid">11470850</pub-id>
</citation>
</ref>
<ref id="B82">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ashburner</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ball</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Blake</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Botstein</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Butler</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Cherry</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Davis</surname>
<given-names>AP</given-names>
</name>
<name>
<surname>Dolinski</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Dwight</surname>
<given-names>SS</given-names>
</name>
<name>
<surname>Eppig</surname>
<given-names>JT</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Gene ontology: tool for the unification of biology. The Gene Ontology Consortium</article-title>
<source>Nat Genet</source>
<year>2000</year>
<volume>25</volume>
<fpage>25</fpage>
<lpage>29</lpage>
<pub-id pub-id-type="pmid">10802651</pub-id>
<pub-id pub-id-type="doi">10.1038/75556</pub-id>
</citation>
</ref>
<ref id="B83">
<citation citation-type="other">
<article-title>The R Project for statistical computing</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.r-project.org"></ext-link>
</citation>
</ref>
<ref id="B84">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Castillo-Davis</surname>
<given-names>CI</given-names>
</name>
<name>
<surname>Hartl</surname>
<given-names>DL</given-names>
</name>
</person-group>
<article-title>GeneMerge – post-genomic analysis, data mining, and hypothesis testing</article-title>
<source>Bioinformatics</source>
<year>2003</year>
<volume>19</volume>
<fpage>891</fpage>
<lpage>892</lpage>
<pub-id pub-id-type="pmid">12724301</pub-id>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btg114</pub-id>
</citation>
</ref>
<ref id="B85">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Edgar</surname>
<given-names>RC</given-names>
</name>
</person-group>
<article-title>MUSCLE: multiple sequence alignment with high accuracy and high throughput</article-title>
<source>Nucleic Acids Res</source>
<year>2004</year>
<volume>32</volume>
<fpage>1792</fpage>
<lpage>1797</lpage>
<pub-id pub-id-type="pmid">15034147</pub-id>
<pub-id pub-id-type="doi">10.1093/nar/gkh340</pub-id>
</citation>
</ref>
<ref id="B86">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ronquist</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Huelsenbeck</surname>
<given-names>JP</given-names>
</name>
</person-group>
<article-title>MrBayes 3: Bayesian phylogenetic inference under mixed models</article-title>
<source>Bioinformatics</source>
<year>2003</year>
<volume>19</volume>
<fpage>1572</fpage>
<lpage>1574</lpage>
<pub-id pub-id-type="pmid">12912839</pub-id>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btg180</pub-id>
</citation>
</ref>
<ref id="B87">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Felsenstein</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>PHYLIP-Phylogeny Inference Package (Version 3.2)</article-title>
<source>Cladistics</source>
<year>1989</year>
<volume>5</volume>
<fpage>164</fpage>
<lpage>166</lpage>
</citation>
</ref>
<ref id="B88">
<citation citation-type="journal">
<person-group person-group-type="author">
<collab>Arabidopsis Genome Initiative</collab>
</person-group>
<article-title>Analysis of the genome sequence of the flowering plant
<italic>Arabidopsis thaliana</italic>
</article-title>
<source>Nature</source>
<year>2000</year>
<volume>408</volume>
<fpage>796</fpage>
<lpage>815</lpage>
<pub-id pub-id-type="pmid">11130711</pub-id>
<pub-id pub-id-type="doi">10.1038/35048692</pub-id>
</citation>
</ref>
<ref id="B89">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rook</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Gerrits</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Kortstee</surname>
<given-names>A</given-names>
</name>
<name>
<surname>van Kampen</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Borrias</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Weisbeek</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Smeekens</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Sucrose-specific signalling represses translation of the
<italic>Arabidopsis ATB2 </italic>
bZIP transcription factor gene</article-title>
<source>Plant J</source>
<year>1998</year>
<volume>15</volume>
<fpage>253</fpage>
<lpage>263</lpage>
<pub-id pub-id-type="pmid">9721683</pub-id>
<pub-id pub-id-type="doi">10.1046/j.1365-313X.1998.00205.x</pub-id>
</citation>
</ref>
<ref id="B90">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Satoh-Nagasawa</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Nagasawa</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Malcomber</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Sakai</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Jackson</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>A trehalose metabolic enzyme controls inflorescence architecture in maize</article-title>
<source>Nature</source>
<year>2006</year>
<volume>441</volume>
<fpage>227</fpage>
<lpage>230</lpage>
<pub-id pub-id-type="pmid">16688177</pub-id>
<pub-id pub-id-type="doi">10.1038/nature04725</pub-id>
</citation>
</ref>
<ref id="B91">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Verhagen</surname>
<given-names>BW</given-names>
</name>
<name>
<surname>Glazebrook</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>HS</given-names>
</name>
<name>
<surname>van Loon</surname>
<given-names>LC</given-names>
</name>
<name>
<surname>Pieterse</surname>
<given-names>CM</given-names>
</name>
</person-group>
<article-title>The transcriptome of rhizobacteria-induced systemic resistance in Arabidopsis</article-title>
<source>Mol Plant Microbe Interact</source>
<year>2004</year>
<volume>17</volume>
<fpage>895</fpage>
<lpage>908</lpage>
<pub-id pub-id-type="pmid">15305611</pub-id>
<pub-id pub-id-type="doi">10.1094/MPMI.2004.17.8.895</pub-id>
</citation>
</ref>
<ref id="B92">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Henriksson</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Olsson</surname>
<given-names>AS</given-names>
</name>
<name>
<surname>Johannesson</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Johansson</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Hanson</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Engstrom</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Soderman</surname>
<given-names>E</given-names>
</name>
</person-group>
<article-title>Homeodomain leucine zipper class I genes in Arabidopsis. Expression patterns and phylogenetic relationships</article-title>
<source>Plant Physiol</source>
<year>2005</year>
<volume>139</volume>
<fpage>509</fpage>
<lpage>518</lpage>
<pub-id pub-id-type="pmid">16055682</pub-id>
<pub-id pub-id-type="doi">10.1104/pp.105.063461</pub-id>
</citation>
</ref>
<ref id="B93">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aoyama</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Dong</surname>
<given-names>CH</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Carabelli</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Sessa</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Ruberti</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Morelli</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Chua</surname>
<given-names>NH</given-names>
</name>
</person-group>
<article-title>Ectopic expression of the Arabidopsis transcriptional activator Athb-1 alters leaf cell fate in tobacco</article-title>
<source>Plant Cell</source>
<year>1995</year>
<volume>7</volume>
<fpage>1773</fpage>
<lpage>1785</lpage>
<pub-id pub-id-type="pmid">8535134</pub-id>
<pub-id pub-id-type="doi">10.1105/tpc.7.11.1773</pub-id>
</citation>
</ref>
<ref id="B94">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bharti</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Von Koskull-Doring</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Bharti</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Tintschl-Korbitzer</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Treuter</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Nover</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Tomato heat stress transcription factor HsfB1 represents a novel type of general transcription coactivator with a histone-like motif interacting with the plant CREB binding protein ortholog HAC1</article-title>
<source>Plant Cell</source>
<year>2004</year>
<volume>16</volume>
<fpage>1521</fpage>
<lpage>1535</lpage>
<pub-id pub-id-type="pmid">15131252</pub-id>
<pub-id pub-id-type="doi">10.1105/tpc.019927</pub-id>
</citation>
</ref>
<ref id="B95">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nover</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Scharf</surname>
<given-names>KD</given-names>
</name>
<name>
<surname>Gagliardi</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Vergne</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Czarnecka-Verner</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Gurley</surname>
<given-names>WB</given-names>
</name>
</person-group>
<article-title>The Hsf world: classification and properties of plant heat stress transcription factors</article-title>
<source>Cell Stress Chaperones</source>
<year>1996</year>
<volume>1</volume>
<fpage>215</fpage>
<lpage>223</lpage>
<pub-id pub-id-type="pmid">9222607</pub-id>
<pub-id pub-id-type="doi">10.1379/1466-1268(1996)001&lt;0215:THWCAP&gt;2.3.CO;2</pub-id>
</citation>
</ref>
<ref id="B96">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Poovaiah</surname>
<given-names>BW</given-names>
</name>
</person-group>
<article-title>Molecular and biochemical evidence for the involvement of calcium/calmodulin in auxin action</article-title>
<source>J Biol Chem</source>
<year>2000</year>
<volume>275</volume>
<fpage>3137</fpage>
<lpage>3143</lpage>
<pub-id pub-id-type="pmid">10652297</pub-id>
<pub-id pub-id-type="doi">10.1074/jbc.275.5.3137</pub-id>
</citation>
</ref>
<ref id="B97">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nakano</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Suzuki</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Fujimura</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Shinshi</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>Genome-wide analysis of the ERF gene family in Arabidopsis and rice</article-title>
<source>Plant Physiol</source>
<year>2006</year>
<volume>140</volume>
<fpage>411</fpage>
<lpage>432</lpage>
<pub-id pub-id-type="pmid">16407444</pub-id>
<pub-id pub-id-type="doi">10.1104/pp.105.073783</pub-id>
</citation>
</ref>
<ref id="B98">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gutterson</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Reuber</surname>
<given-names>TL</given-names>
</name>
</person-group>
<article-title>Regulation of disease resistance pathways by AP2/ERF transcription factors</article-title>
<source>Curr Opin Plant Biol</source>
<year>2004</year>
<volume>7</volume>
<fpage>465</fpage>
<lpage>471</lpage>
<pub-id pub-id-type="pmid">15231271</pub-id>
<pub-id pub-id-type="doi">10.1016/j.pbi.2004.04.007</pub-id>
</citation>
</ref>
<ref id="B99">
<citation citation-type="other">
<article-title>PlantsP kinase classification</article-title>
<ext-link ext-link-type="uri" xlink:href="http://plantsp.genomics.purdue.edu/html/families.html"></ext-link>
</citation>
</ref>
<ref id="B100">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sakuma</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Dubouzet</surname>
<given-names>JG</given-names>
</name>
<name>
<surname>Abe</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Shinozaki</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Yamaguchi-Shinozaki</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>DNA-binding specificity of the ERF/AP2 domain of
<italic>Arabidopsis </italic>
DREBs, transcription factors involved in dehydration- and cold-inducible gene expression</article-title>
<source>Biochem Biophys Res Commun</source>
<year>2002</year>
<volume>290</volume>
<fpage>998</fpage>
<lpage>1009</lpage>
<pub-id pub-id-type="pmid">11798174</pub-id>
<pub-id pub-id-type="doi">10.1006/bbrc.2001.6299</pub-id>
</citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Bois/explor/OrangerV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000958  | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000958  | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Wicri/Bois
   |area=    OrangerV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

This area was generated with Dilib version V0.6.25.
Data generation: Sat Dec 3 17:11:04 2016. Site generation: Wed Mar 6 18:18:32 2024