Exploration server on relations between France and Australia

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Identification of a dinucleotide signature that discriminates coding from non-coding long RNAs.

Internal identifier: 003489 (PubMed/Corpus); previous: 003488; next: 003490

Authors: Damien Ulveling; Marcel E. Dinger; Claire Francastel; Florent Hubé

Source: Frontiers in genetics [1664-8021]; 2014

RBID : pubmed:25250049

Abstract

To date, the main criterion by which long ncRNAs (lncRNAs) are discriminated from mRNAs is based on the capacity of the transcripts to encode a protein. However, it becomes important to identify non-ORF-based sequence characteristics that can be used to parse between ncRNAs and mRNAs. In this study, we first established an extremely selective workflow to define a highly refined database of lncRNAs which was used for comparison with mRNAs. Then using this highly selective collection of lncRNAs, we found the CG dinucleotide frequencies were clearly distinct. In addition, we showed that the bias in CG dinucleotide frequency was conserved in human and mouse genomes. We propose that this sequence feature will serve as a useful classifier in transcript classification pipelines. We also suggest that our refined database of "bona fide" lncRNAs will be valuable for the discovery of other sequence characteristics distinct to lncRNAs.
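
As an illustration of the sequence feature discussed in the abstract, the following is a minimal sketch of how a CG dinucleotide frequency could be computed for a single transcript. The abstract does not describe the authors' exact procedure, so the choices here are assumptions: overlapping dinucleotides are counted, the count is normalized by the total number of unambiguous dinucleotides, and the function names `dinucleotide_frequencies` and `cg_frequency` are hypothetical.

```python
from collections import Counter


def dinucleotide_frequencies(seq: str) -> dict[str, float]:
    """Return the relative frequency of each overlapping dinucleotide.

    Assumption: RNA input is accepted by mapping U to T, and any pair
    containing an ambiguous base (for example N) is ignored.
    """
    seq = seq.upper().replace("U", "T")
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if set(p) <= set("ACGT")]
    total = len(pairs)
    if total == 0:
        return {}
    counts = Counter(pairs)
    return {pair: n / total for pair, n in counts.items()}


def cg_frequency(seq: str) -> float:
    """Return the CG dinucleotide frequency, or 0.0 if CG never occurs."""
    return dinucleotide_frequencies(seq).get("CG", 0.0)


if __name__ == "__main__":
    # Toy sequence, not taken from the study.
    print(cg_frequency("ACGCGT"))
```

Comparing the distribution of this value across a set of mRNAs and a set of lncRNAs would be one way to explore the kind of bias the abstract reports; the study itself may have used a different normalization.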

DOI: 10.3389/fgene.2014.00316
PubMed: 25250049

Links to Exploration step

pubmed:25250049

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Identification of a dinucleotide signature that discriminates coding from non-coding long RNAs.</title>
<author>
<name sortKey="Ulveling, Damien" sort="Ulveling, Damien" uniqKey="Ulveling D" first="Damien" last="Ulveling">Damien Ulveling</name>
<affiliation>
<nlm:affiliation>CNRS UMR7216, Epigenetics and Cell Fate, Université Paris Diderot, Sorbonne Paris Cité Paris, France.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Dinger, Marcel E" sort="Dinger, Marcel E" uniqKey="Dinger M" first="Marcel E" last="Dinger">Marcel E. Dinger</name>
<affiliation>
<nlm:affiliation>The University of Queensland Diamantina Institute, The University of Queensland Brisbane, QLD, Australia.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Francastel, Claire" sort="Francastel, Claire" uniqKey="Francastel C" first="Claire" last="Francastel">Claire Francastel</name>
<affiliation>
<nlm:affiliation>CNRS UMR7216, Epigenetics and Cell Fate, Université Paris Diderot, Sorbonne Paris Cité Paris, France.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Hube, Florent" sort="Hube, Florent" uniqKey="Hube F" first="Florent" last="Hubé">Florent Hubé</name>
<affiliation>
<nlm:affiliation>CNRS UMR7216, Epigenetics and Cell Fate, Université Paris Diderot, Sorbonne Paris Cité Paris, France.</nlm:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2014">2014</date>
<idno type="RBID">pubmed:25250049</idno>
<idno type="pmid">25250049</idno>
<idno type="doi">10.3389/fgene.2014.00316</idno>
<idno type="wicri:Area/PubMed/Corpus">003489</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">003489</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Identification of a dinucleotide signature that discriminates coding from non-coding long RNAs.</title>
<author>
<name sortKey="Ulveling, Damien" sort="Ulveling, Damien" uniqKey="Ulveling D" first="Damien" last="Ulveling">Damien Ulveling</name>
<affiliation>
<nlm:affiliation>CNRS UMR7216, Epigenetics and Cell Fate, Université Paris Diderot, Sorbonne Paris Cité Paris, France.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Dinger, Marcel E" sort="Dinger, Marcel E" uniqKey="Dinger M" first="Marcel E" last="Dinger">Marcel E. Dinger</name>
<affiliation>
<nlm:affiliation>The University of Queensland Diamantina Institute, The University of Queensland Brisbane, QLD, Australia.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Francastel, Claire" sort="Francastel, Claire" uniqKey="Francastel C" first="Claire" last="Francastel">Claire Francastel</name>
<affiliation>
<nlm:affiliation>CNRS UMR7216, Epigenetics and Cell Fate, Université Paris Diderot, Sorbonne Paris Cité Paris, France.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Hube, Florent" sort="Hube, Florent" uniqKey="Hube F" first="Florent" last="Hubé">Florent Hubé</name>
<affiliation>
<nlm:affiliation>CNRS UMR7216, Epigenetics and Cell Fate, Université Paris Diderot, Sorbonne Paris Cité Paris, France.</nlm:affiliation>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Frontiers in genetics</title>
<idno type="ISSN">1664-8021</idno>
<imprint>
<date when="2014" type="published">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">To date, the main criterion by which long ncRNAs (lncRNAs) are discriminated from mRNAs is based on the capacity of the transcripts to encode a protein. However, it becomes important to identify non-ORF-based sequence characteristics that can be used to parse between ncRNAs and mRNAs. In this study, we first established an extremely selective workflow to define a highly refined database of lncRNAs which was used for comparison with mRNAs. Then using this highly selective collection of lncRNAs, we found the CG dinucleotide frequencies were clearly distinct. In addition, we showed that the bias in CG dinucleotide frequency was conserved in human and mouse genomes. We propose that this sequence feature will serve as a useful classifier in transcript classification pipelines. We also suggest that our refined database of "bona fide" lncRNAs will be valuable for the discovery of other sequence characteristics distinct to lncRNAs.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="PubMed-not-MEDLINE" Owner="NLM">
<PMID Version="1">25250049</PMID>
<DateCreated>
<Year>2014</Year>
<Month>09</Month>
<Day>24</Day>
</DateCreated>
<DateCompleted>
<Year>2014</Year>
<Month>09</Month>
<Day>24</Day>
</DateCompleted>
<DateRevised>
<Year>2017</Year>
<Month>02</Month>
<Day>20</Day>
</DateRevised>
<Article PubModel="Electronic-eCollection">
<Journal>
<ISSN IssnType="Print">1664-8021</ISSN>
<JournalIssue CitedMedium="Print">
<Volume>5</Volume>
<PubDate>
<Year>2014</Year>
</PubDate>
</JournalIssue>
<Title>Frontiers in genetics</Title>
<ISOAbbreviation>Front Genet</ISOAbbreviation>
</Journal>
<ArticleTitle>Identification of a dinucleotide signature that discriminates coding from non-coding long RNAs.</ArticleTitle>
<Pagination>
<MedlinePgn>316</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.3389/fgene.2014.00316</ELocationID>
<Abstract>
<AbstractText>To date, the main criterion by which long ncRNAs (lncRNAs) are discriminated from mRNAs is based on the capacity of the transcripts to encode a protein. However, it becomes important to identify non-ORF-based sequence characteristics that can be used to parse between ncRNAs and mRNAs. In this study, we first established an extremely selective workflow to define a highly refined database of lncRNAs which was used for comparison with mRNAs. Then using this highly selective collection of lncRNAs, we found the CG dinucleotide frequencies were clearly distinct. In addition, we showed that the bias in CG dinucleotide frequency was conserved in human and mouse genomes. We propose that this sequence feature will serve as a useful classifier in transcript classification pipelines. We also suggest that our refined database of "bona fide" lncRNAs will be valuable for the discovery of other sequence characteristics distinct to lncRNAs.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Ulveling</LastName>
<ForeName>Damien</ForeName>
<Initials>D</Initials>
<AffiliationInfo>
<Affiliation>CNRS UMR7216, Epigenetics and Cell Fate, Université Paris Diderot, Sorbonne Paris Cité Paris, France.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Dinger</LastName>
<ForeName>Marcel E</ForeName>
<Initials>ME</Initials>
<AffiliationInfo>
<Affiliation>The University of Queensland Diamantina Institute, The University of Queensland Brisbane, QLD, Australia.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Francastel</LastName>
<ForeName>Claire</ForeName>
<Initials>C</Initials>
<AffiliationInfo>
<Affiliation>CNRS UMR7216, Epigenetics and Cell Fate, Université Paris Diderot, Sorbonne Paris Cité Paris, France.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Hubé</LastName>
<ForeName>Florent</ForeName>
<Initials>F</Initials>
<AffiliationInfo>
<Affiliation>CNRS UMR7216, Epigenetics and Cell Fate, Université Paris Diderot, Sorbonne Paris Cité Paris, France.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2014</Year>
<Month>09</Month>
<Day>09</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>Switzerland</Country>
<MedlineTA>Front Genet</MedlineTA>
<NlmUniqueID>101560621</NlmUniqueID>
<ISSNLinking>1664-8021</ISSNLinking>
</MedlineJournalInfo>
<CommentsCorrectionsList>
<CommentsCorrections RefType="Cites">
<RefSource>Mol Biol Evol. 1987 Jul;4(4):395-405</RefSource>
<PMID Version="1">3447014</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Semin Cell Dev Biol. 2011 Jun;22(4):366-76</RefSource>
<PMID Version="1">21256239</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Biochimie. 2011 Nov;93(11):2024-7</RefSource>
<PMID Version="1">21729736</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nat Rev Genet. 2009 Feb;10(2):94-108</RefSource>
<PMID Version="1">19148191</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nucleic Acids Res. 2005 Jan 1;33(Database issue):D125-30</RefSource>
<PMID Version="1">15608161</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Genes Dev. 2009 Jul 1;23(13):1494-504</RefSource>
<PMID Version="1">19571179</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Biochim Biophys Acta. 1963 Apr 30;68:653-6</RefSource>
<PMID Version="1">13975159</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nucleic Acids Res. 2005 Jan 1;33(Database issue):D121-4</RefSource>
<PMID Version="1">15608160</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Front Genet. 2012 Apr 23;3:60</RefSource>
<PMID Version="1">22536205</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nucleic Acids Res. 2004 Jan 1;32(Database issue):D109-11</RefSource>
<PMID Version="1">14681370</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>J Pathol. 2011 Jan;223(2):102-15</RefSource>
<PMID Version="1">21125669</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Curr Opin Microbiol. 1998 Oct;1(5):598-610</RefSource>
<PMID Version="1">10066522</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Science. 1965 Mar 19;147(3664):1462-5</RefSource>
<PMID Version="1">14263761</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nucleic Acids Res. 2011 Jan;39(Database issue):D32-7</RefSource>
<PMID Version="1">21071399</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nucleic Acids Res. 2003 Jan 1;31(1):439-41</RefSource>
<PMID Version="1">12520045</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Bioinformatics. 2005 Feb 15;21(4):545-7</RefSource>
<PMID Version="1">15374859</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nucleic Acids Res. 2007 Jul;35(Web Server issue):W345-9</RefSource>
<PMID Version="1">17631615</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nucleic Acids Res. 2011 Jan;39(Database issue):D514-9</RefSource>
<PMID Version="1">20929869</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nucleic Acids Res. 2011 Jan;39(2):513-25</RefSource>
<PMID Version="1">20855289</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nature. 2011 Jan 6;469(7328):97-101</RefSource>
<PMID Version="1">21085120</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Biochimie. 2011 Apr;93(4):633-44</RefSource>
<PMID Version="1">21111023</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>EMBO Rep. 2001 Nov;2(11):986-91</RefSource>
<PMID Version="1">11713189</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Cell Mol Life Sci. 2006 Aug;63(16):1813-8</RefSource>
<PMID Version="1">16794784</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nucleic Acids Res. 2008 Jan;36(Database issue):D445-8</RefSource>
<PMID Version="1">17984084</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>FEBS Lett. 2011 Jun 6;585(11):1600-16</RefSource>
<PMID Version="1">21557942</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Am J Hum Genet. 2009 Mar;84(3):316-27</RefSource>
<PMID Version="1">19232555</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nephrology (Carlton). 2010 Sep;15(6):599-608</RefSource>
<PMID Version="1">20883280</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nucleic Acids Res. 2004 Jan 1;32(Database issue):D101-3</RefSource>
<PMID Version="1">14681368</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>DNA Cell Biol. 2006 Jul;25(7):418-28</RefSource>
<PMID Version="1">16848684</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Genome Res. 2001 Apr;11(4):540-6</RefSource>
<PMID Version="1">11282969</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>PLoS Genet. 2008 Jan;4(1):e22</RefSource>
<PMID Version="1">18225959</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>PLoS Comput Biol. 2008 Nov;4(11):e1000176</RefSource>
<PMID Version="1">19043537</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Science. 2007 Jun 8;316(5830):1484-8</RefSource>
<PMID Version="1">17510325</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Mol Cell. 2009 Aug 28;35(4):467-78</RefSource>
<PMID Version="1">19716791</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Genome Biol. 2009;10(11):R124</RefSource>
<PMID Version="1">19895688</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Biochimie. 2011 Nov;93(11):2013-8</RefSource>
<PMID Version="1">21802485</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nucleic Acids Res. 1998 Jan 1;26(1):148-53</RefSource>
<PMID Version="1">9399820</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nucleic Acids Res. 2011 Jan;39(Database issue):D146-51</RefSource>
<PMID Version="1">21112873</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Bioessays. 2003 Oct;25(10):930-9</RefSource>
<PMID Version="1">14505360</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Nat Rev Genet. 2009 Mar;10(3):155-9</RefSource>
<PMID Version="1">19188922</PMID>
</CommentsCorrections>
</CommentsCorrectionsList>
<OtherID Source="NLM">PMC4158813</OtherID>
<KeywordList Owner="NOTNLM">
<Keyword MajorTopicYN="N">CG dinucleotide</Keyword>
<Keyword MajorTopicYN="N">database</Keyword>
<Keyword MajorTopicYN="N">exon</Keyword>
<Keyword MajorTopicYN="N">intron</Keyword>
<Keyword MajorTopicYN="N">mRNA</Keyword>
<Keyword MajorTopicYN="N">ncRNA</Keyword>
<Keyword MajorTopicYN="N">pseudogene</Keyword>
<Keyword MajorTopicYN="N">sequence bias</Keyword>
</KeywordList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="received">
<Year>2014</Year>
<Month>06</Month>
<Day>26</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted">
<Year>2014</Year>
<Month>08</Month>
<Day>22</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2014</Year>
<Month>9</Month>
<Day>25</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2014</Year>
<Month>9</Month>
<Day>25</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2014</Year>
<Month>9</Month>
<Day>25</Day>
<Hour>6</Hour>
<Minute>1</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>epublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">25250049</ArticleId>
<ArticleId IdType="doi">10.3389/fgene.2014.00316</ArticleId>
<ArticleId IdType="pmc">PMC4158813</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
</record>
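
The PubMed part of the record above can be read with a standard XML parser. The following is a minimal sketch, assuming only the `<pubmed>` element is parsed: the TEI part uses the `nlm:` and `wicri:` prefixes without declaring them in the record shown, so a namespace-aware parser would likely reject the full `<record>` as is. The `SAMPLE` string is an abridged copy of the record, and the function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the <pubmed> element from the record above.
SAMPLE = """<pubmed>
<MedlineCitation Status="PubMed-not-MEDLINE" Owner="NLM">
<PMID Version="1">25250049</PMID>
<Article PubModel="Electronic-eCollection">
<Journal><Title>Frontiers in genetics</Title></Journal>
<ArticleTitle>Identification of a dinucleotide signature that discriminates coding from non-coding long RNAs.</ArticleTitle>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y"><LastName>Ulveling</LastName><ForeName>Damien</ForeName></Author>
<Author ValidYN="Y"><LastName>Dinger</LastName><ForeName>Marcel E</ForeName></Author>
</AuthorList>
</Article>
<CommentsCorrectionsList>
<CommentsCorrections RefType="Cites">
<RefSource>Front Genet. 2012 Apr 23;3:60</RefSource>
<PMID Version="1">22536205</PMID>
</CommentsCorrections>
</CommentsCorrectionsList>
</MedlineCitation>
</pubmed>"""


def summarize(pubmed_xml: str) -> dict:
    """Extract a few bibliographic fields from a <pubmed> element."""
    root = ET.fromstring(pubmed_xml)
    citation = root.find("MedlineCitation")
    authors = [
        f"{a.findtext('ForeName')} {a.findtext('LastName')}"
        for a in citation.iterfind("Article/AuthorList/Author")
    ]
    # Only the PMIDs of cited works; the article's own PMID is a direct child.
    cited = [
        c.findtext("PMID")
        for c in citation.iterfind(
            "CommentsCorrectionsList/CommentsCorrections[@RefType='Cites']"
        )
    ]
    return {
        "pmid": citation.findtext("PMID"),
        "title": citation.findtext("Article/ArticleTitle"),
        "journal": citation.findtext("Article/Journal/Title"),
        "authors": authors,
        "cited_pmids": cited,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```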

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Asie/explor/AustralieFrV1/Data/PubMed/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 003489 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd -nk 003489 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Asie
   |area=    AustralieFrV1
   |flux=    PubMed
   |étape=   Corpus
   |type=    RBID
   |clé=     pubmed:25250049
   |texte=   Identification of a dinucleotide signature that discriminates coding from non-coding long RNAs.
}}
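
When many records need such a link, the template call above can be produced programmatically. The following is a minimal sketch, assuming the template accepts its named parameters in any spacing (the column alignment in the example above is taken to be cosmetic); the function name `explor_link` is hypothetical and is not part of Dilib or Wicri.

```python
def explor_link(params: dict[str, str]) -> str:
    """Render an {{Explor lien}} template call from named parameters.

    Parameter names are passed through unchanged, so the French keys used
    by the template (flux, étape, clé, texte) must be supplied as is.
    """
    lines = ["{{Explor lien"]
    for key, value in params.items():
        lines.append(f"   |{key}= {value}")
    lines.append("}}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Values copied from the template call shown above.
    print(explor_link({
        "wiki": "Wicri/Asie",
        "area": "AustralieFrV1",
        "flux": "PubMed",
        "étape": "Corpus",
        "type": "RBID",
        "clé": "pubmed:25250049",
        "texte": "Identification of a dinucleotide signature that "
                 "discriminates coding from non-coding long RNAs.",
    }))
```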

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/RBID.i   -Sk "pubmed:25250049" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a AustralieFrV1 

This area was generated with Dilib version V0.6.33.
Data generation: Tue Dec 5 10:43:12 2017. Site generation: Tue Mar 5 14:07:20 2024