Covid exploration server (26 March)

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A high-throughput approach to profile RNA structure

Internal identifier: 000967 (Pmc/Corpus); previous: 000966; next: 000968

Authors: Riccardo Delli Ponti; Stefanie Marti; Alexandros Armaos; Gian Gaetano Tartaglia

RBID : PMC:5389523

Abstract

Here we introduce the Computational Recognition of Secondary Structure (CROSS) method to calculate the structural profile of an RNA sequence (single- or double-stranded state) at single-nucleotide resolution and without sequence length restrictions. We trained CROSS using data from high-throughput experiments such as Selective 2′-Hydroxyl Acylation analyzed by Primer Extension (SHAPE; Mouse and HIV transcriptomes) and Parallel Analysis of RNA Structure (PARS; Human and Yeast transcriptomes) as well as high-quality NMR/X-ray structures (PDB database). The algorithm uses primary structure information alone to predict experimental structural profiles with >80% accuracy, showing high performance on large RNAs such as Xist (17 900 nucleotides; Area Under the ROC Curve AUC of 0.75 on dimethyl sulfate (DMS) experiments). We integrated CROSS in thermodynamics-based methods to predict secondary structure and observed an increase in their predictive power by up to 30%.


Url:
DOI: 10.1093/nar/gkw1094
PubMed: 27899588
PubMed Central: 5389523

Links to Exploration step

PMC:5389523

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A high-throughput approach to profile RNA structure</title>
<author>
<name sortKey="Delli Ponti, Riccardo" sort="Delli Ponti, Riccardo" uniqKey="Delli Ponti R" first="Riccardo" last="Delli Ponti">Riccardo Delli Ponti</name>
<affiliation>
<nlm:aff id="AFF1">Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF2">Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Marti, Stefanie" sort="Marti, Stefanie" uniqKey="Marti S" first="Stefanie" last="Marti">Stefanie Marti</name>
<affiliation>
<nlm:aff id="AFF1">Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF2">Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Armaos, Alexandros" sort="Armaos, Alexandros" uniqKey="Armaos A" first="Alexandros" last="Armaos">Alexandros Armaos</name>
<affiliation>
<nlm:aff id="AFF1">Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF2">Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tartaglia, Gian Gaetano" sort="Tartaglia, Gian Gaetano" uniqKey="Tartaglia G" first="Gian Gaetano" last="Tartaglia">Gian Gaetano Tartaglia</name>
<affiliation>
<nlm:aff id="AFF1">Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF2">Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF3">Institució Catalana de Recerca i Estudis Avançats (ICREA), 23 Passeig Lluís Companys, 08010 Barcelona, Spain</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">27899588</idno>
<idno type="pmc">5389523</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5389523</idno>
<idno type="RBID">PMC:5389523</idno>
<idno type="doi">10.1093/nar/gkw1094</idno>
<date when="2016">2016</date>
<idno type="wicri:Area/Pmc/Corpus">000967</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000967</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">A high-throughput approach to profile RNA structure</title>
<author>
<name sortKey="Delli Ponti, Riccardo" sort="Delli Ponti, Riccardo" uniqKey="Delli Ponti R" first="Riccardo" last="Delli Ponti">Riccardo Delli Ponti</name>
<affiliation>
<nlm:aff id="AFF1">Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF2">Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Marti, Stefanie" sort="Marti, Stefanie" uniqKey="Marti S" first="Stefanie" last="Marti">Stefanie Marti</name>
<affiliation>
<nlm:aff id="AFF1">Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF2">Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Armaos, Alexandros" sort="Armaos, Alexandros" uniqKey="Armaos A" first="Alexandros" last="Armaos">Alexandros Armaos</name>
<affiliation>
<nlm:aff id="AFF1">Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF2">Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tartaglia, Gian Gaetano" sort="Tartaglia, Gian Gaetano" uniqKey="Tartaglia G" first="Gian Gaetano" last="Tartaglia">Gian Gaetano Tartaglia</name>
<affiliation>
<nlm:aff id="AFF1">Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF2">Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF3">Institució Catalana de Recerca i Estudis Avançats (ICREA), 23 Passeig Lluís Companys, 08010 Barcelona, Spain</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Nucleic Acids Research</title>
<idno type="ISSN">0305-1048</idno>
<idno type="eISSN">1362-4962</idno>
<imprint>
<date when="2016">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<title>Abstract</title>
<p>Here we introduce the Computational Recognition of Secondary Structure (CROSS) method to calculate the structural profile of an RNA sequence (single- or double-stranded state) at single-nucleotide resolution and without sequence length restrictions. We trained CROSS using data from high-throughput experiments such as Selective 2′-Hydroxyl Acylation analyzed by Primer Extension (SHAPE; Mouse and HIV transcriptomes) and Parallel Analysis of RNA Structure (PARS; Human and Yeast transcriptomes) as well as high-quality NMR/X-ray structures (PDB database). The algorithm uses primary structure information alone to predict experimental structural profiles with >80% accuracy, showing high performance on large RNAs such as
<italic>Xist</italic>
(17 900 nucleotides; Area Under the ROC Curve AUC of 0.75 on dimethyl sulfate (DMS) experiments). We integrated CROSS in thermodynamics-based methods to predict secondary structure and observed an increase in their predictive power by up to 30%.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Mortimer, S A" uniqKey="Mortimer S">S.A. Mortimer</name>
</author>
<author>
<name sortKey="Kidwell, M A" uniqKey="Kidwell M">M.A. Kidwell</name>
</author>
<author>
<name sortKey="Doudna, J A" uniqKey="Doudna J">J.A. Doudna</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tartaglia, G G" uniqKey="Tartaglia G">G.G. Tartaglia</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kertesz, M" uniqKey="Kertesz M">M. Kertesz</name>
</author>
<author>
<name sortKey="Wan, Y" uniqKey="Wan Y">Y. Wan</name>
</author>
<author>
<name sortKey="Mazor, E" uniqKey="Mazor E">E. Mazor</name>
</author>
<author>
<name sortKey="Rinn, J L" uniqKey="Rinn J">J.L. Rinn</name>
</author>
<author>
<name sortKey="Nutter, R C" uniqKey="Nutter R">R.C. Nutter</name>
</author>
<author>
<name sortKey="Chang, H Y" uniqKey="Chang H">H.Y. Chang</name>
</author>
<author>
<name sortKey="Segal, E" uniqKey="Segal E">E. Segal</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wan, Y" uniqKey="Wan Y">Y. Wan</name>
</author>
<author>
<name sortKey="Qu, K" uniqKey="Qu K">K. Qu</name>
</author>
<author>
<name sortKey="Zhang, Q C" uniqKey="Zhang Q">Q.C. Zhang</name>
</author>
<author>
<name sortKey="Flynn, R A" uniqKey="Flynn R">R.A. Flynn</name>
</author>
<author>
<name sortKey="Manor, O" uniqKey="Manor O">O. Manor</name>
</author>
<author>
<name sortKey="Ouyang, Z" uniqKey="Ouyang Z">Z. Ouyang</name>
</author>
<author>
<name sortKey="Zhang, J" uniqKey="Zhang J">J. Zhang</name>
</author>
<author>
<name sortKey="Spitale, R C" uniqKey="Spitale R">R.C. Spitale</name>
</author>
<author>
<name sortKey="Snyder, M P" uniqKey="Snyder M">M.P. Snyder</name>
</author>
<author>
<name sortKey="Segal, E" uniqKey="Segal E">E. Segal</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Spitale, R C" uniqKey="Spitale R">R.C. Spitale</name>
</author>
<author>
<name sortKey="Flynn, R A" uniqKey="Flynn R">R.A. Flynn</name>
</author>
<author>
<name sortKey="Zhang, Q C" uniqKey="Zhang Q">Q.C. Zhang</name>
</author>
<author>
<name sortKey="Crisalli, P" uniqKey="Crisalli P">P. Crisalli</name>
</author>
<author>
<name sortKey="Lee, B" uniqKey="Lee B">B. Lee</name>
</author>
<author>
<name sortKey="Jung, J W" uniqKey="Jung J">J.-W. Jung</name>
</author>
<author>
<name sortKey="Kuchelmeister, H Y" uniqKey="Kuchelmeister H">H.Y. Kuchelmeister</name>
</author>
<author>
<name sortKey="Batista, P J" uniqKey="Batista P">P.J. Batista</name>
</author>
<author>
<name sortKey="Torre, E A" uniqKey="Torre E">E.A. Torre</name>
</author>
<author>
<name sortKey="Kool, E T" uniqKey="Kool E">E.T. Kool</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wilkinson, K A" uniqKey="Wilkinson K">K.A. Wilkinson</name>
</author>
<author>
<name sortKey="Merino, E J" uniqKey="Merino E">E.J. Merino</name>
</author>
<author>
<name sortKey="Weeks, K M" uniqKey="Weeks K">K.M. Weeks</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cordero, P" uniqKey="Cordero P">P. Cordero</name>
</author>
<author>
<name sortKey="Kladwang, W" uniqKey="Kladwang W">W. Kladwang</name>
</author>
<author>
<name sortKey="Vanlang, C C" uniqKey="Vanlang C">C.C. VanLang</name>
</author>
<author>
<name sortKey="Das, R" uniqKey="Das R">R. Das</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rouskin, S" uniqKey="Rouskin S">S. Rouskin</name>
</author>
<author>
<name sortKey="Zubradt, M" uniqKey="Zubradt M">M. Zubradt</name>
</author>
<author>
<name sortKey="Washietl, S" uniqKey="Washietl S">S. Washietl</name>
</author>
<author>
<name sortKey="Kellis, M" uniqKey="Kellis M">M. Kellis</name>
</author>
<author>
<name sortKey="Weissman, J S" uniqKey="Weissman J">J.S. Weissman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wells, S E" uniqKey="Wells S">S.E. Wells</name>
</author>
<author>
<name sortKey="Hughes, J M" uniqKey="Hughes J">J.M. Hughes</name>
</author>
<author>
<name sortKey="Igel, A H" uniqKey="Igel A">A.H. Igel</name>
</author>
<author>
<name sortKey="Ares, M" uniqKey="Ares M">M. Ares</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Watts, J M" uniqKey="Watts J">J.M. Watts</name>
</author>
<author>
<name sortKey="Dang, K K" uniqKey="Dang K">K.K. Dang</name>
</author>
<author>
<name sortKey="Gorelick, R J" uniqKey="Gorelick R">R.J. Gorelick</name>
</author>
<author>
<name sortKey="Leonard, C W" uniqKey="Leonard C">C.W. Leonard</name>
</author>
<author>
<name sortKey="Bess, J W" uniqKey="Bess J">J.W. Bess</name>
</author>
<author>
<name sortKey="Swanstrom, R" uniqKey="Swanstrom R">R. Swanstrom</name>
</author>
<author>
<name sortKey="Burch, C L" uniqKey="Burch C">C.L. Burch</name>
</author>
<author>
<name sortKey="Weeks, K M" uniqKey="Weeks K">K.M. Weeks</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Andronescu, M" uniqKey="Andronescu M">M. Andronescu</name>
</author>
<author>
<name sortKey="Bereg, V" uniqKey="Bereg V">V. Bereg</name>
</author>
<author>
<name sortKey="Hoos, H H" uniqKey="Hoos H">H.H. Hoos</name>
</author>
<author>
<name sortKey="Condon, A" uniqKey="Condon A">A. Condon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Deigan, K E" uniqKey="Deigan K">K.E. Deigan</name>
</author>
<author>
<name sortKey="Li, T W" uniqKey="Li T">T.W. Li</name>
</author>
<author>
<name sortKey="Mathews, D H" uniqKey="Mathews D">D.H. Mathews</name>
</author>
<author>
<name sortKey="Weeks, K M" uniqKey="Weeks K">K.M. Weeks</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bellucci, M" uniqKey="Bellucci M">M. Bellucci</name>
</author>
<author>
<name sortKey="Agostini, F" uniqKey="Agostini F">F. Agostini</name>
</author>
<author>
<name sortKey="Masin, M" uniqKey="Masin M">M. Masin</name>
</author>
<author>
<name sortKey="Tartaglia, G G" uniqKey="Tartaglia G">G.G. Tartaglia</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Novikova, I V" uniqKey="Novikova I">I.V. Novikova</name>
</author>
<author>
<name sortKey="Hennelly, S P" uniqKey="Hennelly S">S.P. Hennelly</name>
</author>
<author>
<name sortKey="Sanbonmatsu, K Y" uniqKey="Sanbonmatsu K">K.Y. Sanbonmatsu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lorenz, R" uniqKey="Lorenz R">R. Lorenz</name>
</author>
<author>
<name sortKey="Luntzer, D" uniqKey="Luntzer D">D. Luntzer</name>
</author>
<author>
<name sortKey="Hofacker, I L" uniqKey="Hofacker I">I.L. Hofacker</name>
</author>
<author>
<name sortKey="Stadler, P F" uniqKey="Stadler P">P.F. Stadler</name>
</author>
<author>
<name sortKey="Wolfinger, M T" uniqKey="Wolfinger M">M.T. Wolfinger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fang, R" uniqKey="Fang R">R. Fang</name>
</author>
<author>
<name sortKey="Moss, W N" uniqKey="Moss W">W.N. Moss</name>
</author>
<author>
<name sortKey="Rutenberg Schoenberg, M" uniqKey="Rutenberg Schoenberg M">M. Rutenberg-Schoenberg</name>
</author>
<author>
<name sortKey="Simon, M D" uniqKey="Simon M">M.D. Simon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mathews, D H" uniqKey="Mathews D">D.H. Mathews</name>
</author>
<author>
<name sortKey="Sabina, J" uniqKey="Sabina J">J. Sabina</name>
</author>
<author>
<name sortKey="Zuker, M" uniqKey="Zuker M">M. Zuker</name>
</author>
<author>
<name sortKey="Turner, D H" uniqKey="Turner D">D.H. Turner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Reuter, J S" uniqKey="Reuter J">J.S. Reuter</name>
</author>
<author>
<name sortKey="Mathews, D H" uniqKey="Mathews D">D.H. Mathews</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bailey, T L" uniqKey="Bailey T">T.L. Bailey</name>
</author>
<author>
<name sortKey="Johnson, J" uniqKey="Johnson J">J. Johnson</name>
</author>
<author>
<name sortKey="Grant, C E" uniqKey="Grant C">C.E. Grant</name>
</author>
<author>
<name sortKey="Noble, W S" uniqKey="Noble W">W.S. Noble</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Alipanahi, B" uniqKey="Alipanahi B">B. Alipanahi</name>
</author>
<author>
<name sortKey="Delong, A" uniqKey="Delong A">A. Delong</name>
</author>
<author>
<name sortKey="Weirauch, M T" uniqKey="Weirauch M">M.T. Weirauch</name>
</author>
<author>
<name sortKey="Frey, B J" uniqKey="Frey B">B.J. Frey</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wu, Y" uniqKey="Wu Y">Y. Wu</name>
</author>
<author>
<name sortKey="Shi, B" uniqKey="Shi B">B. Shi</name>
</author>
<author>
<name sortKey="Ding, X" uniqKey="Ding X">X. Ding</name>
</author>
<author>
<name sortKey="Liu, T" uniqKey="Liu T">T. Liu</name>
</author>
<author>
<name sortKey="Hu, X" uniqKey="Hu X">X. Hu</name>
</author>
<author>
<name sortKey="Yip, K Y" uniqKey="Yip K">K.Y. Yip</name>
</author>
<author>
<name sortKey="Yang, Z R" uniqKey="Yang Z">Z.R. Yang</name>
</author>
<author>
<name sortKey="Mathews, D H" uniqKey="Mathews D">D.H. Mathews</name>
</author>
<author>
<name sortKey="Lu, Z J" uniqKey="Lu Z">Z.J. Lu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lange, S J" uniqKey="Lange S">S.J. Lange</name>
</author>
<author>
<name sortKey="Maticzka, D" uniqKey="Maticzka D">D. Maticzka</name>
</author>
<author>
<name sortKey="Mohl, M" uniqKey="Mohl M">M. Möhl</name>
</author>
<author>
<name sortKey="Gagnon, J N" uniqKey="Gagnon J">J.N. Gagnon</name>
</author>
<author>
<name sortKey="Brown, C M" uniqKey="Brown C">C.M. Brown</name>
</author>
<author>
<name sortKey="Backofen, R" uniqKey="Backofen R">R. Backofen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ulitsky, I" uniqKey="Ulitsky I">I. Ulitsky</name>
</author>
<author>
<name sortKey="Bartel, D P" uniqKey="Bartel D">D.P. Bartel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nesterova, T B" uniqKey="Nesterova T">T.B. Nesterova</name>
</author>
<author>
<name sortKey="Slobodyanyuk, S Y" uniqKey="Slobodyanyuk S">S.Y. Slobodyanyuk</name>
</author>
<author>
<name sortKey="Elisaphenko, E A" uniqKey="Elisaphenko E">E.A. Elisaphenko</name>
</author>
<author>
<name sortKey="Shevchenko, A I" uniqKey="Shevchenko A">A.I. Shevchenko</name>
</author>
<author>
<name sortKey="Johnston, C" uniqKey="Johnston C">C. Johnston</name>
</author>
<author>
<name sortKey="Pavlova, M E" uniqKey="Pavlova M">M.E. Pavlova</name>
</author>
<author>
<name sortKey="Rogozin, I B" uniqKey="Rogozin I">I.B. Rogozin</name>
</author>
<author>
<name sortKey="Kolesnikov, N N" uniqKey="Kolesnikov N">N.N. Kolesnikov</name>
</author>
<author>
<name sortKey="Brockdorff, N" uniqKey="Brockdorff N">N. Brockdorff</name>
</author>
<author>
<name sortKey="Zakian, S M" uniqKey="Zakian S">S.M. Zakian</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wan, Y" uniqKey="Wan Y">Y. Wan</name>
</author>
<author>
<name sortKey="Qu, K" uniqKey="Qu K">K. Qu</name>
</author>
<author>
<name sortKey="Ouyang, Z" uniqKey="Ouyang Z">Z. Ouyang</name>
</author>
<author>
<name sortKey="Kertesz, M" uniqKey="Kertesz M">M. Kertesz</name>
</author>
<author>
<name sortKey="Li, J" uniqKey="Li J">J. Li</name>
</author>
<author>
<name sortKey="Tibshirani, R" uniqKey="Tibshirani R">R. Tibshirani</name>
</author>
<author>
<name sortKey="Makino, D L" uniqKey="Makino D">D.L. Makino</name>
</author>
<author>
<name sortKey="Nutter, R C" uniqKey="Nutter R">R.C. Nutter</name>
</author>
<author>
<name sortKey="Segal, E" uniqKey="Segal E">E. Segal</name>
</author>
<author>
<name sortKey="Chang, H Y" uniqKey="Chang H">H.Y. Chang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rinn, J L" uniqKey="Rinn J">J.L. Rinn</name>
</author>
<author>
<name sortKey="Chang, H Y" uniqKey="Chang H">H.Y. Chang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gsponer, J" uniqKey="Gsponer J">J. Gsponer</name>
</author>
<author>
<name sortKey="Babu, M M" uniqKey="Babu M">M.M. Babu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gruber, A R" uniqKey="Gruber A">A.R. Gruber</name>
</author>
<author>
<name sortKey="Lorenz, R" uniqKey="Lorenz R">R. Lorenz</name>
</author>
<author>
<name sortKey="Bernhart, S H" uniqKey="Bernhart S">S.H. Bernhart</name>
</author>
<author>
<name sortKey="Neubock, R" uniqKey="Neubock R">R. Neubock</name>
</author>
<author>
<name sortKey="Hofacker, I L" uniqKey="Hofacker I">I.L. Hofacker</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Agostini, F" uniqKey="Agostini F">F. Agostini</name>
</author>
<author>
<name sortKey="Zanzoni, A" uniqKey="Zanzoni A">A. Zanzoni</name>
</author>
<author>
<name sortKey="Klus, P" uniqKey="Klus P">P. Klus</name>
</author>
<author>
<name sortKey="Marchese, D" uniqKey="Marchese D">D. Marchese</name>
</author>
<author>
<name sortKey="Cirillo, D" uniqKey="Cirillo D">D. Cirillo</name>
</author>
<author>
<name sortKey="Tartaglia, G G" uniqKey="Tartaglia G">G.G. Tartaglia</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Nucleic Acids Res</journal-id>
<journal-id journal-id-type="iso-abbrev">Nucleic Acids Res</journal-id>
<journal-id journal-id-type="publisher-id">nar</journal-id>
<journal-title-group>
<journal-title>Nucleic Acids Research</journal-title>
</journal-title-group>
<issn pub-type="ppub">0305-1048</issn>
<issn pub-type="epub">1362-4962</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">27899588</article-id>
<article-id pub-id-type="pmc">5389523</article-id>
<article-id pub-id-type="doi">10.1093/nar/gkw1094</article-id>
<article-id pub-id-type="publisher-id">gkw1094</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Methods Online</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>A high-throughput approach to profile RNA structure</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Delli Ponti</surname>
<given-names>Riccardo</given-names>
</name>
<xref ref-type="aff" rid="AFF1">1</xref>
<xref ref-type="aff" rid="AFF2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Marti</surname>
<given-names>Stefanie</given-names>
</name>
<xref ref-type="aff" rid="AFF1">1</xref>
<xref ref-type="aff" rid="AFF2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Armaos</surname>
<given-names>Alexandros</given-names>
</name>
<xref ref-type="aff" rid="AFF1">1</xref>
<xref ref-type="aff" rid="AFF2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Tartaglia</surname>
<given-names>Gian Gaetano</given-names>
</name>
<pmc-comment>gian.tartaglia@crg.es</pmc-comment>
<xref ref-type="aff" rid="AFF1">1</xref>
<xref ref-type="aff" rid="AFF2">2</xref>
<xref ref-type="aff" rid="AFF3">3</xref>
<xref ref-type="corresp" rid="COR1"></xref>
</contrib>
</contrib-group>
<aff id="AFF1">
<label>1</label>
Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003 Barcelona, Spain</aff>
<aff id="AFF2">
<label>2</label>
Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain</aff>
<aff id="AFF3">
<label>3</label>
Institució Catalana de Recerca i Estudis Avançats (ICREA), 23 Passeig Lluís Companys, 08010 Barcelona, Spain</aff>
<author-notes>
<corresp id="COR1">
<label>*</label>
To whom correspondence should be addressed. Tel: +34 93 316 01 16; Fax: +34 93 396 99 83; Email:
<email>gian.tartaglia@crg.es</email>
</corresp>
</author-notes>
<pub-date pub-type="ppub">
<day>17</day>
<month>3</month>
<year>2017</year>
</pub-date>
<pub-date pub-type="epub" iso-8601-date="2016-11-29">
<day>29</day>
<month>11</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>29</day>
<month>11</month>
<year>2016</year>
</pub-date>
<pmc-comment> PMC Release delay is 0 months and 0 days and was based on the . </pmc-comment>
<volume>45</volume>
<issue>5</issue>
<fpage>e35</fpage>
<lpage>e35</lpage>
<history>
<date date-type="accepted">
<day>28</day>
<month>10</month>
<year>2016</year>
</date>
<date date-type="rev-recd">
<day>05</day>
<month>10</month>
<year>2016</year>
</date>
<date date-type="received">
<day>02</day>
<month>8</month>
<year>2016</year>
</date>
</history>
<permissions>
<copyright-statement>© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.</copyright-statement>
<copyright-year>2017</copyright-year>
<license license-type="cc-by-nc" xlink:href="http://creativecommons.org/licenses/by-nc/4.0/">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<uri xlink:href="http://creativecommons.org/licenses/by-nc/4.0/">http://creativecommons.org/licenses/by-nc/4.0/</uri>
), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact
<email>journals.permissions@oup.com</email>
</license-p>
</license>
</permissions>
<self-uri xlink:href="gkw1094.pdf"></self-uri>
<abstract>
<title>Abstract</title>
<p>Here we introduce the Computational Recognition of Secondary Structure (CROSS) method to calculate the structural profile of an RNA sequence (single- or double-stranded state) at single-nucleotide resolution and without sequence length restrictions. We trained CROSS using data from high-throughput experiments such as Selective 2′-Hydroxyl Acylation analyzed by Primer Extension (SHAPE; Mouse and HIV transcriptomes) and Parallel Analysis of RNA Structure (PARS; Human and Yeast transcriptomes) as well as high-quality NMR/X-ray structures (PDB database). The algorithm uses primary structure information alone to predict experimental structural profiles with >80% accuracy, showing high performance on large RNAs such as
<italic>Xist</italic>
(17 900 nucleotides; Area Under the ROC Curve AUC of 0.75 on dimethyl sulfate (DMS) experiments). We integrated CROSS in thermodynamics-based methods to predict secondary structure and observed an increase in their predictive power by up to 30%.</p>
</abstract>
<counts>
<page-count count="8"></page-count>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="SEC1">
<title>INTRODUCTION</title>
<p>The structure of an RNA determines its interactions and functions (
<xref rid="B1" ref-type="bibr">1</xref>
,
<xref rid="B2" ref-type="bibr">2</xref>
). RNA structure can be studied using low-throughput techniques such as nuclear magnetic resonance (NMR) and X-ray crystallography. More recent approaches exploit biochemical reactions for high-throughput profiling of RNA structure: Parallel Analysis of RNA Structure (PARS) distinguishes double- and single-stranded regions using the catalytic activity of two enzymes, RNase V1 (which cuts double-stranded regions) and S1 nuclease (which cuts single-stranded regions) (
<xref rid="B3" ref-type="bibr">3</xref>
,
<xref rid="B4" ref-type="bibr">4</xref>
), while Selective 2′-Hydroxyl Acylation analyzed by Primer Extension (SHAPE) (
<xref rid="B5" ref-type="bibr">5</xref>
,
<xref rid="B6" ref-type="bibr">6</xref>
) employs highly reactive chemical probes such as 1M6, NMIA (SHAPE) and NAI-N
<sub>3</sub>
(icSHAPE) to characterize RNA backbone flexibility. Another technique based on dimethyl sulfate (DMS) (
<xref rid="B7" ref-type="bibr">7</xref>
) is often used for
<italic>in vivo</italic>
probing of transcriptomes (
<xref rid="B8" ref-type="bibr">8</xref>
,
<xref rid="B9" ref-type="bibr">9</xref>
). DMS experiments are of high quality due to the smaller size of the (CH
<sub>3</sub>
O)
<sub>2</sub>
SO
<sub>2</sub>
probe, yet have low coverage, since the alkylating agent reacts only with adenine and cytosine.</p>
<p>Transcriptomic studies require intense experimental work that could be substantially reduced by using computational approaches. We built Computational Recognition of Secondary Structure (CROSS) to perform high-throughput predictions of transcript structure using the information contained in RNA sequences. The algorithm predicts the structural profile (single- and double-stranded state) of a transcript at single-nucleotide resolution using sequence information only and without sequence length restrictions.</p>
<p>We trained CROSS on data from high-throughput [PARS: yeast and human transcriptomes (
<xref rid="B3" ref-type="bibr">3</xref>
,
<xref rid="B4" ref-type="bibr">4</xref>
) and icSHAPE: mouse transcriptome (
<xref rid="B5" ref-type="bibr">5</xref>
)] and low-throughput [SHAPE: HIV RNA (
<xref rid="B10" ref-type="bibr">10</xref>
)] experiments as well as high-quality NMR/X-ray structures (
<xref rid="B11" ref-type="bibr">11</xref>
). We did not use DMS experiments because they do not provide information on the structural state of all the nucleotides (
<xref rid="B1" ref-type="bibr">1</xref>
,
<xref rid="B5" ref-type="bibr">5</xref>
). Each of the five models reflects the specificities of the experimental technique used to generate the data. Since each approach has practical limitations and a different range of applicability, we also evaluated different methods to integrate the five models into a single algorithm,
<italic>Global Score</italic>
, to provide a
<italic>consensus</italic>
prediction.</p>
<p>The core of CROSS is an artificial neural network yielding a propensity score ranging from −1 (bottom values; single-stranded RNA) to 1 (top values; double-stranded RNA). CROSS was designed to investigate large-scale data sets and to provide information that can be integrated in methods for prediction of RNA secondary structure (
<xref rid="B12" ref-type="bibr">12</xref>
) as well as interactions with other molecules (
<xref rid="B13" ref-type="bibr">13</xref>
).</p>
</sec>
<sec sec-type="materials|methods" id="SEC2">
<title>MATERIALS AND METHODS</title>
<sec id="SEC2-1">
<title>CROSS architecture</title>
<p>We trained CROSS models using an artificial neural network with one hidden layer and two adaptive weight matrices
<inline-formula>
<tex-math id="M3">$\omega_k^i$</tex-math>
</inline-formula>
and
<inline-formula>
<tex-math id="M4">$\Omega^k$</tex-math>
</inline-formula>
that are optimized using backpropagation.</p>
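<p>The architecture described above can be sketched as a forward pass. This is a minimal illustration, not the authors' implementation: the hidden layer follows the tanh rule of Equation (1), while the tanh output unit (chosen so that the score stays between −1 and 1), the absence of bias terms, the hidden-layer size and the helper names <monospace>forward</monospace> and <monospace>random_weights</monospace> are all assumptions made for the example.</p>
<preformat>
```python
import math
import random


def forward(features, w_hidden, w_out):
    """Propagate one input vector through a single hidden layer."""
    # Hidden activations: tanh of the weighted sum of the inputs (Equation 1).
    hidden = [math.tanh(sum(w * f for w, f in zip(row, features))) for row in w_hidden]
    # Assumption: a tanh output unit, which keeps the score in [-1, 1].
    return math.tanh(sum(w * h for w, h in zip(w_out, hidden)))


def random_weights(n_inputs, n_hidden, seed=0):
    """Hypothetical initialiser; trained weights would come from backpropagation."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
    return w_hidden, w_out


# 52 inputs = 13 nucleotides x 4 components; 8 hidden nodes is an arbitrary choice.
w_hidden, w_out = random_weights(n_inputs=52, n_hidden=8)
score = forward([0] * 52, w_hidden, w_out)
```
</preformat>
<p>Here the two weight structures play the roles of the two adaptive matrices named in the text; in practice both would be fitted by backpropagation against the experimental profiles rather than drawn at random.</p>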
<p>In our approach, we use the 4-mer notation to represent each nucleotide: A = (1, 0, 0, 0), C = (0, 1, 0, 0), G = (0, 0, 1, 0) and U = (0, 0, 0, 1). The input of our method (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
:
<italic>Data sets</italic>
) is the vector
<inline-formula>
<tex-math id="M5">$F_i$</tex-math>
</inline-formula>
encoding the information on fragments of fixed length (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
:
<italic>Selection of the optimal window</italic>
). The input information required to predict the structural state of a specific nucleotide was extracted using a sliding window spanning the 6 preceding and 6 following residues (i.e. 13 nucleotides; longer fragments do not substantially improve the method;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
:
<italic>Selection of the optimal window</italic>
;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
).</p>
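<p>The encoding and windowing described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name <italic>encode_windows</italic>, the zero-padding at the sequence ends and the conversion of T to U are assumptions, since the text does not specify how terminal nucleotides or DNA-style input are handled.</p>
<preformat>
```python
# One-hot code from the text: A, C, G, U map to the four unit vectors.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "U": (0, 0, 0, 1),
}
FLANK = 6  # residues on each side of the central nucleotide (13-nt window)


def encode_windows(sequence, flank=FLANK):
    """Return one flat input vector F_i per nucleotide of `sequence`.

    Assumption: positions beyond the sequence ends (and unknown symbols)
    are zero-padded; the source does not say how they are handled.
    """
    pad = (0, 0, 0, 0)
    seq = sequence.upper().replace("T", "U")  # assumption: accept DNA letters
    vectors = []
    for center in range(len(seq)):
        features = []
        for pos in range(center - flank, center + flank + 1):
            inside = pos >= 0 and len(seq) > pos
            features.extend(ONE_HOT.get(seq[pos], pad) if inside else pad)
        vectors.append(features)
    return vectors


windows = encode_windows("GGGAAACUUCGGUUUCCC")
print(len(windows), len(windows[0]))  # one vector per nucleotide, 13 * 4 components each
```
</preformat>
<p>Each nucleotide yields one vector of 13 × 4 = 52 components, so the cost of building the input grows linearly with the sequence length.</p>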
<p>This input
<inline-formula>
<tex-math id="M6">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} }{}${F_i}$\end{document}</tex-math>
</inline-formula>
is propagated to the hidden layer of
<inline-formula>
<tex-math id="M7">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} }{}$k$\end{document}</tex-math>
</inline-formula>
nodes as
<disp-formula id="M1">
<label>(1)</label>
<tex-math id="M8">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} }{}\begin{equation*}{h_{\rm k}} = tanh(\omega _{\rm k}^i F_{i})\end{equation*}\end{document}</tex-math>
</disp-formula>
where
<inline-formula>
<tex-math id="M9">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} }{}$tanh( x )$\end{document}</tex-math>
</inline-formula>
is the hyperbolic tangent of
<inline-formula>
<tex-math id="M10">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} }{}$x$\end{document}</tex-math>
</inline-formula>
and the sum follows Einstein's notation.</p>
<p>The score
<inline-formula>
<tex-math id="M11">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} }{}${\rm{\Pi }}$\end{document}</tex-math>
</inline-formula>
of the nucleotide in the center of the window is then given by
<disp-formula id="M2">
<label>(2)</label>
<tex-math id="M12">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} }{}\begin{equation*}{\rm{\Pi }} = tanh({{\rm{\Omega }}^k}{h_{\rm k}})\end{equation*}\end{document}</tex-math>
</disp-formula>
where the contributions
<inline-formula>
<tex-math id="M13">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} }{}${h_{\rm k}}$\end{document}</tex-math>
</inline-formula>
of the hidden layer are weighted by
<inline-formula>
<tex-math id="M14">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} }{}${{\rm{\Omega }}^k}$\end{document}</tex-math>
</inline-formula>
.</p>
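<p>Equations (1) and (2) amount to a two-step forward pass, sketched below under stated assumptions. The function name <italic>cross_forward</italic> is hypothetical, and the random weights are placeholders standing in for the trained parameters, which are not given in the text. No bias terms are included because the equations as written contain none.</p>
<preformat>
```python
import math
import random


def cross_forward(features, omega, big_omega):
    """Score one window: h_k = tanh(omega_k^i F_i), Pi = tanh(Omega^k h_k).

    `features` is the input vector F_i, `omega` is a list of k weight rows
    (one per hidden node) and `big_omega` is a list of k output weights.
    The repeated index is summed over, as in Einstein's notation.
    """
    hidden = [math.tanh(sum(w * f for w, f in zip(row, features))) for row in omega]
    return math.tanh(sum(W * h for W, h in zip(big_omega, hidden)))


# Placeholder weights: random values stand in for the trained parameters.
random.seed(0)
n_inputs, n_hidden = 52, 20  # 13 nucleotides x 4 components; k = 20
omega = [[random.uniform(-0.1, 0.1) for _ in range(n_inputs)] for _ in range(n_hidden)]
big_omega = [random.uniform(-0.1, 0.1) for _ in range(n_hidden)]

# Build one valid input: exactly one active component per window position.
window = [0.0] * n_inputs
for slot in range(13):
    window[4 * slot + random.randrange(4)] = 1.0

score = cross_forward(window, omega, big_omega)
print(score)  # a propensity strictly between -1 and 1
```
</preformat>
<p>Because the output passes through a hyperbolic tangent, the score is bounded between −1 (single-stranded) and 1 (double-stranded), as stated for the CROSS propensity.</p>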
<p>To avoid over-fitting when optimizing
<inline-formula>
<tex-math id="M15">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} }{}$\omega _{\rm k}^i$\end{document}</tex-math>
</inline-formula>
and
<inline-formula>
<tex-math id="M16">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} }{}${{\rm{\Omega }}^k}$\end{document}</tex-math>
</inline-formula>
, we varied the number of nodes proportionally to the size of the training set and performed a 5-fold cross-validation at each optimization. For
<inline-formula>
<tex-math id="M17">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} }{}$k\ = \ 20$\end{document}</tex-math>
</inline-formula>
we obtain the performances reported in the
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
.</p>
</sec>
<sec id="SEC2-2">
<title>Consensus models</title>
<p>Since one technique might not be sufficient to capture structural properties of long transcripts (
<xref rid="B14" ref-type="bibr">14</xref>
), we evaluated different approaches to combine the five CROSS models (PARS-Human, PARS-Yeast, SHAPE-HIV, icSHAPE-Mouse and NMR/X-ray) into a
<italic>consensus</italic>
prediction. To this aim, we measured the performances of the models on an independent test set of 67 NMR/X-ray structures (
<xref rid="B15" ref-type="bibr">15</xref>
) for which SHAPE data are available (17 145 fragments in total), evaluating precision (PPV) and Area Under the ROC Curve (AUC;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
). Consistent with the type of information contained in the training set, we observed the best performances for the NMR/X-ray model (PPV: 0.69; AUC: 0.64) followed by SHAPE-HIV (PPV: 0.64; AUC: 0.63). Comparing the scores of the five models, we did not find strong correlations (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
), except for PARS-Human and PARS-Yeast (Pearson's correlation = 0.50), which were trained on data obtained with the same experimental technique (Figure
<xref ref-type="fig" rid="F1">1</xref>
;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
).</p>
<fig id="F1" orientation="portrait" position="float">
<label>Figure 1.</label>
<caption>
<p>(
<bold>A</bold>
) To build the Computational Recognition of Secondary Structure (CROSS) models we used experimental data from four transcriptome-wide studies (
<italic>H. sapiens, M. musculus, S. cerevisiae</italic>
and HIV-1) as well as NMR/X-ray structures (
<xref rid="B3" ref-type="bibr">3</xref>
,
<xref rid="B5" ref-type="bibr">5</xref>
,
<xref rid="B4" ref-type="bibr">4</xref>
,
<xref rid="B11" ref-type="bibr">11</xref>
). Each model was trained on one data set (PARS-Human, PARS-Yeast, icSHAPE-Mouse, SHAPE-HIV and NMR/X-ray) and tested on the others. (
<bold>B</bold>
) Performances increase from low- (median) to high-confidence (top and bottom 5%) values of the CROSS scores distribution (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
). The plot illustrates the performances of the icSHAPE-Mouse model tested on the SHAPE-HIV data set. (
<bold>C</bold>
) High-confidence predictions: the arrows connect the training and testing sets along with the relative accuracies (cross-validations on training sets are marked with circular arrows). We used the same number of nucleotides with high (double-stranded) and low (single-stranded) propensity scores for comparison with experimental data. Negligible overlap exists between training and testing sets (Jaccard index < 0.002 between each pair of sets analyzed;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
).</p>
</caption>
<graphic xlink:href="gkw1094fig1"></graphic>
</fig>
</sec>
<sec id="SEC2-3">
<title>
<italic>Z-Score</italic>
</title>
<p>We combined the five CROSS models into a
<italic>Z-Score</italic>
variable. For each nucleotide in the sequence, the
<italic>Z-Score</italic>
is computed using the mean of the individual scores and the associated standard deviation: the double-stranded propensity is proportional to the
<italic>Z-Score</italic>
. We used this method to predict the structural profile of the
<italic>Xist</italic>
non-coding RNA (17 900 nt) and found an AUC of 0.75 on data from DMS experiments (
<xref rid="B16" ref-type="bibr">16</xref>
).</p>
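<p>One possible reading of the <italic>Z-Score</italic> consensus is sketched below. The exact formula is not given in the text, so the ratio of the mean to the sample standard deviation across models is an assumption made here for illustration only; the function name <italic>consensus_z</italic> and the toy profiles are likewise hypothetical.</p>
<preformat>
```python
import statistics


def consensus_z(model_scores, eps=1e-9):
    """Combine per-model profiles into one value per nucleotide.

    `model_scores` is a list of profiles (one list of scores per model),
    all of the same length. Assumption: Z = mean / standard deviation of
    the model scores at each position; the source does not give the formula.
    """
    z_profile = []
    for column in zip(*model_scores):
        mean = statistics.mean(column)
        spread = statistics.stdev(column)
        z_profile.append(mean / max(spread, eps))  # eps guards identical scores
    return z_profile


# Toy example: five models scoring three nucleotides.
profiles = [
    [0.8, -0.5, 0.1],
    [0.6, -0.7, -0.2],
    [0.9, -0.4, 0.3],
    [0.7, -0.6, 0.0],
    [0.5, -0.8, -0.1],
]
z = consensus_z(profiles)
print(z)
```
</preformat>
<p>Under this assumption, positions where the models agree on a positive score receive a large positive value (double-stranded), agreement on a negative score gives a large negative value, and disagreement pulls the value towards zero.</p>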
</sec>
<sec id="SEC2-4">
<title>
<italic>Global Score</italic>
</title>
<p>We employed the scores of the five CROSS models to train a single classifier. The training set comprised 43 sequences (11 670 fragments) and the test set was composed of 24 transcripts (5 475 fragments; not in the training set of any of the CROSS models) (
<xref rid="B15" ref-type="bibr">15</xref>
). Among the classifiers tested, the support vector machine with a radial basis function kernel showed the best performance (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
).</p>
<p>The
<italic>Z-Score</italic>
and
<italic>Global Score</italic>
predictions show a correlation of 0.85 (0.97 with a smoothing window of 200 nt) when applied to the
<italic>Xist</italic>
non-coding RNA (17 900 nt) (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
). The high correlation indicates that the five models are assigned similar weights by
<italic>Global Score</italic>
and thus have similar performances. Since CROSS
<italic>Z-Score</italic>
and
<italic>Global Score</italic>
are correlated, we only provide
<italic>Global Score</italic>
on our webserver.</p>
</sec>
<sec id="SEC2-5">
<title>RNAstructure</title>
<p>We used
<italic>RNAstructure</italic>
with the
<italic>Fold</italic>
module and the
<italic>minimum free energy</italic>
flag to predict the best RNA secondary structure of each RNA sequence (
<xref rid="B17" ref-type="bibr">17</xref>
,
<xref rid="B18" ref-type="bibr">18</xref>
). To mimic experimental constraints in the
<italic>RNAstructure</italic>
algorithm, CROSS Global scores were normalized to lie in the range of SHAPE reactivities: first the scores were multiplied by −1, then linearly mapped to [0,1]. Scores >0.65 were then assigned a SHAPE reactivity of 1; scores <0.35 were assigned a reactivity of 0; scores >0.35 and <0.65 were linearly mapped to (0,1). We used the
<italic>Partition</italic>
and
<italic>Probability Plot</italic>
(with
<italic>-text</italic>
flag) modules of
<italic>RNAstructure</italic>
to compute the AUC based on the probabilities (
<xref rid="B17" ref-type="bibr">17</xref>
,
<xref rid="B18" ref-type="bibr">18</xref>
). We employed the package 
<italic>Scorer</italic>
 to calculate the positive predictive values (PPVs) and true positive rates (TPRs) for the specific structures.</p>
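<p>The conversion of <italic>Global Score</italic> values into SHAPE-like reactivities can be sketched as follows. The thresholds (0.35 and 0.65) and the sign flip come from the text; the use of a min-max rescaling over the transcript for the first linear map is an assumption, as is the function name <italic>to_pseudo_shape</italic>.</p>
<preformat>
```python
def to_pseudo_shape(global_scores, low=0.35, high=0.65):
    """Convert CROSS Global Scores into SHAPE-like reactivities in [0, 1].

    Assumption: the first linear map is a min-max rescaling over the
    transcript; the source only states that scores are mapped to [0, 1].
    """
    flipped = [-s for s in global_scores]  # high reactivity = single-stranded
    lo, hi = min(flipped), max(flipped)
    span = (hi - lo) or 1.0  # avoid division by zero for flat profiles
    reactivities = []
    for value in flipped:
        unit = (value - lo) / span
        if unit > high:
            reactivities.append(1.0)
        elif low > unit:
            reactivities.append(0.0)
        else:
            # Stretch the intermediate band [low, high] onto [0, 1].
            reactivities.append((unit - low) / (high - low))
    return reactivities


demo = to_pseudo_shape([1.0, 0.5, 0.0, -0.5, -1.0])
print(demo)
```
</preformat>
<p>Strongly double-stranded predictions therefore become unreactive positions, strongly single-stranded predictions become fully reactive ones, and only the intermediate band carries graded values.</p>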
</sec>
<sec id="SEC2-6">
<title>Sequence patterns</title>
<p>We used DREME from the MEME suite (
<ext-link ext-link-type="uri" xlink:href="http://meme-suite.org/doc/dreme.html">http://meme-suite.org/doc/dreme.html</ext-link>
) to search for patterns in the positive and negative fragment sets (
<xref rid="B19" ref-type="bibr">19</xref>
). The flag
<italic>-n</italic>
was used to specify a negative data set for comparison during the motif search.</p>
</sec>
</sec>
<sec id="SEC3">
<title>RESULTS AND DISCUSSION</title>
<sec id="SEC3-1">
<title>The CROSS algorithm</title>
<p>CROSS predicts the structural profile of an RNA sequence at single-nucleotide resolution and without sequence length restrictions. The algorithm is an artificial neural network with one hidden layer and two adaptive weight matrices to predict the structural state of a nucleotide considering its flanking residues (Materials and Methods:
<italic>CROSS architecture;</italic>
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
:
<italic>Selection of the optimal window</italic>
). We built five independent models using data from SHAPE (Mouse and HIV transcriptomes (
<xref rid="B5" ref-type="bibr">5</xref>
,
<xref rid="B10" ref-type="bibr">10</xref>
): icSHAPE-Mouse and SHAPE-HIV) and PARS experiments (Human and Yeast transcriptomes (
<xref rid="B3" ref-type="bibr">3</xref>
,
<xref rid="B4" ref-type="bibr">4</xref>
): PARS-Human and PARS-Yeast) as well as data from NMR/X-ray studies (PDB database: NMR/X-ray) (Figure
<xref ref-type="fig" rid="F1">1A</xref>
). The training of each model was carried out on strong-signal sequences (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
:
<italic>Data sets</italic>
) with the central nucleotide in either single-stranded (negative cases) or double-stranded (positive cases) configuration. Each model was then tested on all the other data sets. Negligible overlap exists between training and testing sets (Jaccard index < 0.002 between each pair of sets analyzed;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
).</p>
<p>From low- (top and bottom 50% of the CROSS score distribution) to high-confidence (top and bottom 5%) predictions, we observed an increase in the accuracies of our models, which indicates good ability to capture strong-signal regions. For instance, the accuracy of the icSHAPE-Mouse model applied to the SHAPE-HIV data set improves from 0.63 (low-confidence) to 0.81 (high-confidence; Figure
<xref ref-type="fig" rid="F1">1B</xref>
), and the same trend is found with respect to other data sets (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
). High-confidence predictions of PARS-Human (training fragments: 26 444; testing fragments: 77 476) and icSHAPE-Mouse (711 480 training fragments, 35 516 testing fragments) models on all the other sets reach accuracies of 0.77 and 0.76, PPVs of 0.80 and 0.77, TPRs of 0.76 and 0.76 and true negative rates of 0.79 and 0.77, respectively (Figure
<xref ref-type="fig" rid="F1">1C</xref>
;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
). As for PARS-Yeast, the accuracy, PPV and TPR are 0.64, 0.68 and 0.64, respectively. The model trained on NMR/X-ray data (29 428 training fragments, 77 176 testing fragments) shows an accuracy of 0.76, a PPV of 0.73 and a TPR of 0.79. SHAPE-HIV (fragments: 6 474 for training, 410 578 for testing) has an average accuracy, PPV and TPR of 0.66, 0.64 and 0.69.</p>
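<p>The evaluation on high-confidence predictions can be illustrated with the sketch below, which keeps only the top and bottom fraction of the score distribution and computes accuracy, PPV, TPR and true negative rate. The function names and the toy data are hypothetical; taking the same number of nucleotides from each tail follows the description given for Figure 1C.</p>
<preformat>
```python
def confusion_metrics(predicted, observed):
    """Accuracy, PPV, TPR and TNR for boolean double-stranded calls."""
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)
    tn = sum(1 for p, o in pairs if not p and not o)
    fp = sum(1 for p, o in pairs if p and not o)
    fn = sum(1 for p, o in pairs if not p and o)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "tpr": tp / (tp + fn) if tp + fn else float("nan"),
        "tnr": tn / (tn + fp) if tn + fp else float("nan"),
    }


def high_confidence_metrics(scores, observed, fraction=0.05):
    """Evaluate only the top and bottom `fraction` of the score distribution.

    Top-ranked nucleotides are called double-stranded, bottom-ranked ones
    single-stranded, so both classes contribute the same number of calls.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = max(1, int(len(scores) * fraction))
    bottom, top = order[:n], order[-n:]
    predicted = [False] * n + [True] * n
    truth = [observed[i] for i in bottom + top]
    return confusion_metrics(predicted, truth)


# Toy data: True marks an experimentally double-stranded nucleotide.
scores = [-0.9, -0.8, -0.1, 0.0, 0.1, 0.2, 0.7, 0.95]
observed = [False, False, True, False, True, False, True, True]
print(high_confidence_metrics(scores, observed, fraction=0.25))
print(high_confidence_metrics(scores, observed, fraction=0.50))
```
</preformat>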
<p>We observed comparable cross-validation performances on the PARS data sets (AUC of 0.89 for PARS-Yeast applied to PARS-Human, and 0.90 for PARS-Human applied to PARS-Yeast), even though the experiments were carried out in different organisms and with slightly modified protocols, confirming the high quality of our predictions (Figures
<xref ref-type="fig" rid="F2">2</xref>
and 
<xref ref-type="fig" rid="F3">3</xref>
).</p>
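<p>For reference, the area under the ROC curve used throughout can be computed from ranks alone. The sketch below uses the standard Mann-Whitney formulation with average ranks for ties; it is a generic illustration with a hypothetical function name, not the code used in the study.</p>
<preformat>
```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank statistic.

    `labels` are True for double-stranded (positive) nucleotides.
    Tied scores receive their average rank.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while len(order) > start:
        end = start
        while len(order) > end + 1 and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    n_pos = sum(1 for flag in labels if flag)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, flag in zip(ranks, labels) if flag)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


print(roc_auc([0.1, 0.4, 0.35, 0.8], [False, False, True, True]))  # 0.75
```
</preformat>
<p>An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of single- and double-stranded nucleotides.</p>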
<fig id="F2" orientation="portrait" position="float">
<label>Figure 2.</label>
<caption>
<p>Receiver Operating Characteristic (ROC) curves reveal significant cross-validation performances on the complete data sets of PARS-Human (left panel; area under the ROC curve (AUC) = 0.90) and PARS-Yeast (right panel; AUC = 0.89).</p>
</caption>
<graphic xlink:href="gkw1094fig2"></graphic>
</fig>
<fig id="F3" orientation="portrait" position="float">
<label>Figure 3.</label>
<caption>
<p>Performances on complete data sets. Testing performances with AUC > 0.70 are highlighted in bold (intra-set 5-fold cross-validations are in grey).</p>
</caption>
<graphic xlink:href="gkw1094fig3"></graphic>
</fig>
<p>From low- (top and bottom 50% of the PARS score distribution) to high-confidence (top and bottom 1%) experimental values, we found a consistent increase in the performances of all models (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
), thus providing strong evidence for the reliability of CROSS predictions. For instance, the SHAPE-HIV model predicts the whole PARS-Human data set with an AUC of 0.70 and the top and bottom 1% of the scores with an AUC of 0.80 (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
). We note that negligible overlap exists between yeast and human fragment sets (overlap: 0.001%; Jaccard index: 0.001;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
), which indicates that our method is not biased by specific sequences. On the same sets, approaches based on thermodynamic principles (
<xref rid="B15" ref-type="bibr">15</xref>
,
<xref rid="B18" ref-type="bibr">18</xref>
) show lower performances (Yeast: accuracies in the range 0.72–0.74, Human: accuracies in the range 0.67–0.69) than CROSS (Yeast: 0.80 accuracy using PARS-Human model; Human: 0.81 accuracy using PARS-Yeast model;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
), indicating that our method is particularly useful for predictions on high-throughput data sets.</p>
</sec>
<sec id="SEC3-2">
<title>The HIV-1 case: correlation between
<italic>in silico</italic>
and
<italic>in vitro</italic>
data</title>
<p>The model built on PARS-Human is able to predict SHAPE-HIV data with an AUC of 0.75 (Figure
<xref ref-type="fig" rid="F4">4A</xref>
and 
<xref ref-type="fig" rid="F4">B</xref>
). Increasing the confidence threshold of SHAPE data (from >0.5 for single-stranded, <0.2 for double-stranded to >1 for single-stranded and <0.1 for double-stranded) improves CROSS performances to an AUC of 0.80 (Figure
<xref ref-type="fig" rid="F4">4B</xref>
). We compared experimental and predicted values on fragments of 200 nucleotides, reporting an average correlation of 0.60 (peak of 0.86 in the region 3 800–4 000) for the HIV transcriptome (Figure
<xref ref-type="fig" rid="F4">4A</xref>
and 
<xref ref-type="fig" rid="F4">C</xref>
).</p>
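<p>The region-wise comparison between predicted and experimental profiles can be sketched as below, using 200-nucleotide regions and the 7-nucleotide smoothing window mentioned in the caption of Figure 4C. Several choices are assumptions made for illustration: non-overlapping regions, dropping a trailing partial region, a centered moving average that shrinks at the ends, and both profiles being oriented the same way before comparison.</p>
<preformat>
```python
import math


def moving_average(values, window=7):
    """Centered moving average; the window shrinks at the sequence ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def windowed_correlations(predicted, measured, region=200, smooth=7):
    """Correlation per consecutive `region`-nt block after smoothing.

    Assumption: blocks are non-overlapping and a trailing partial block
    is dropped; the source does not specify either choice.
    """
    p, m = moving_average(predicted, smooth), moving_average(measured, smooth)
    return [
        pearson(p[s:s + region], m[s:s + region])
        for s in range(0, len(p) - region + 1, region)
    ]


# Toy profiles: the "measured" signal is a linear transform of the prediction.
predicted = [math.sin(i / 5) for i in range(400)]
measured = [2 * v + 1 for v in predicted]
print(windowed_correlations(predicted, measured))
```
</preformat>
<p>Averaging the returned values gives a single summary figure comparable to the average correlation reported for the HIV transcriptome.</p>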
<fig id="F4" orientation="portrait" position="float">
<label>Figure 4.</label>
<caption>
<p>(
<bold>A</bold>
) Example of the secondary structure profile of the HIV transcriptome (nucleotides 3 800–4 000) calculated with the PARS-Human model. CROSS predictions show a correlation of 0.86 with the experimental Selective 2΄-Hydroxyl Acylation analyzed by Primer Extension (SHAPE) profile. (
<bold>B</bold>
) ROC curves of the PARS-Human model applied to SHAPE-HIV data. The performance increases when a high-confidence threshold (>1 for single-stranded; <0.1 for double-stranded) is applied to the SHAPE experimental data. (
<bold>C</bold>
) Pearson correlations between experimental and predicted data for the HIV transcriptome calculated on 200-nucleotide regions using a smoothing window of 7 nucleotides. The average correlation is 0.60.</p>
</caption>
<graphic xlink:href="gkw1094fig4"></graphic>
</fig>
</sec>
<sec id="SEC3-3">
<title>Recognition of complex patterns</title>
<p>CROSS is able to identify sequence patterns that cannot be captured by a position weight matrix approach. We searched the positive and negative fragment sets extracted from icSHAPE-Mouse and PARS-Human data for sequence patterns (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
) using DREME (Materials and Methods:
<italic>Sequence patterns</italic>
) (
<xref rid="B19" ref-type="bibr">19</xref>
). In icSHAPE-Mouse sequences, the G/GC/ACGU/GC motif occurs with frequencies of 63% and 43% in the positive (556 645 fragments) and negative (355 740 fragments) sets (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
), indicating poor discrimination. As for PARS-Human, the top motif in the positive fragment set, GCU/GC/AG/G (71% frequency), is also non-specific (frequency of 47% in the negative set). This observation indicates that the neural network approach is particularly suited to identifying complex patterns in biological sequences, which is key to discovering trends in large data sets (
<xref rid="B20" ref-type="bibr">20</xref>
). We also note that CROSS models are sensitive to single point mutations: the signal drops progressively upon insertion of random mutations in the original sequences (PARS-Yeast;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
). As expected, mutations in the central position of the fragment produce the most dramatic reduction in the predictive power of the method (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
).</p>
</sec>
<sec id="SEC3-4">
<title>The consensus model
<italic>Global Score</italic>
</title>
<p>The consensus model
<italic>Global Score</italic>
was trained and tested on independent sets of NMR/X-ray structures (11 670 training fragments, 5 475 testing fragments;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
:
<italic>Data sets;</italic>
Materials and Methods:
<italic>Consensus models</italic>
) (
<xref rid="B15" ref-type="bibr">15</xref>
,
<xref rid="B21" ref-type="bibr">21</xref>
). In the testing phase, single- and double-stranded nucleotides were recognized with an AUC of 0.72 and a PPV of 0.74. Comparing the structures with experimental SHAPE data, we observed similar performances (AUC of 0.76 and PPV of 0.76 on the same data set; Figure
<xref ref-type="fig" rid="F5">5</xref>
). As PARS-Yeast and PARS-Human models show a 0.5 correlation (
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
), we decided to train the method without PARS-Yeast or PARS-Human. The procedure reduces
<italic>Global Score</italic>
performances (without PARS-Yeast: AUC from 0.72 to 0.65, PPV from 0.74 to 0.68; without PARS-Human: AUC from 0.72 to 0.66, PPV from 0.74 to 0.65), which indicates that the two models should be used together.</p>
<fig id="F5" orientation="portrait" position="float">
<label>Figure 5.</label>
<caption>
<p>Performance comparison of SHAPE data and CROSS predictions (
<italic>Global Score</italic>
) on sequences with available structural information derived from NMR/X-ray data (
<xref rid="B21" ref-type="bibr">21</xref>
). The performances are calculated on 24 RNAs that were not employed for training (
<xref rid="B15" ref-type="bibr">15</xref>
). The precision is measured at Youden's cut-off.</p>
</caption>
<graphic xlink:href="gkw1094fig5"></graphic>
</fig>
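<p>Precision at Youden's cut-off, as used in Figure 5, can be obtained by scanning candidate thresholds and keeping the one that maximizes J = TPR + TNR − 1. The sketch below is a generic illustration with a hypothetical function name; calling a nucleotide double-stranded when its score is at or above the threshold is an assumption about the convention.</p>
<preformat>
```python
def youden_cutoff(scores, labels):
    """Return (threshold, ppv) maximizing J = TPR + TNR - 1.

    A nucleotide is called double-stranded when its score is at or above
    the threshold; candidate thresholds are the observed score values.
    """
    n_pos = sum(1 for flag in labels if flag)
    n_neg = len(labels) - n_pos
    best = None
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, flag in zip(scores, labels) if s >= threshold and flag)
        fp = sum(1 for s, flag in zip(scores, labels) if s >= threshold and not flag)
        j = tp / n_pos + (n_neg - fp) / n_neg - 1
        if best is None or j > best[0]:
            best = (j, threshold, tp / (tp + fp))
    return best[1], best[2]


threshold, ppv = youden_cutoff([0.1, 0.4, 0.35, 0.8], [False, False, True, True])
print(threshold, ppv)
```
</preformat>
<p>When several thresholds give the same J, this sketch keeps the lowest one; other tie-breaking rules are equally defensible.</p>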
</sec>
<sec id="SEC3-5">
<title>
<italic>Global Score</italic>
as experimental constraint for thermodynamic approaches</title>
<p>As previously done with experimental SHAPE data, we used
<italic>Global Score</italic>
as a constraint in
<italic>RNAstructure</italic>
(
<xref rid="B12" ref-type="bibr">12</xref>
). On the test set (
<xref rid="B15" ref-type="bibr">15</xref>
),
<italic>Global Score</italic>
increases the PPV of
<italic>RNAstructure</italic>
from 0.68 to 0.72, with remarkable improvements in 13 cases (from 0.44 to 0.72;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
; Figure
<xref ref-type="fig" rid="F6">6A</xref>
and 
<xref ref-type="fig" rid="F6">C</xref>
;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
:
<italic>RNAstructure</italic>
), and decreases the PPV in three cases for which real SHAPE data does not improve performances (PPV: Group II intron
<italic>O. iheyensis</italic>
: 0.97 with
<italic>RNAstructure</italic>
versus 0.84 with SHAPE data; HIV-1 5΄ pseudoknot domain: 0.62 versus 0.55; SARS coronavirus pseudoknot: 0.90 versus 0.75). To assess to what extent
<italic>Global Score</italic>
improves
<italic>RNAstructure</italic>
(
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
), we randomized the
<italic>Global Score</italic>
input and observed an overall PPV decrease to 0.64. Moreover, using the partition function computed with
<italic>RNAstructure</italic>
, we calculated the AUC for each structure with and without CROSS constraints and observed an improvement from 0.81 to 0.86 when CROSS is integrated in the algorithm (Figure
<xref ref-type="fig" rid="F6">6B</xref>
). On the test set (
<xref rid="B15" ref-type="bibr">15</xref>
), we found a similar trend using
<italic>RNAfold</italic>
(
<xref rid="B15" ref-type="bibr">15</xref>
) (the PPV increases from 0.67 to 0.70 using
<italic>Global Score</italic>
and the AUC remains at 0.85).</p>
<fig id="F6" orientation="portrait" position="float">
<label>Figure 6.</label>
<caption>
<p>(
<bold>A</bold>
and 
<bold>B</bold>
) Performances of
<italic>RNAstructure</italic>
(
<xref rid="B12" ref-type="bibr">12</xref>
,
<xref rid="B18" ref-type="bibr">18</xref>
) without constraints and using either CROSS
<italic>Global Score</italic>
predictions or SHAPE data. Precision (PPV) and area under the ROC curve (AUC) increase when CROSS predictions are employed as constraints in
<italic>RNAstructure</italic>
, indicating a global improvement of predictive power. Randomizing the CROSS signal (in the range of SHAPE data) decreases the performances of
<italic>RNAstructure</italic>
. (
<bold>C</bold>
) Prediction of the structure of the P546 domain bI3 group I intron using CROSS predictions as constraints to
<italic>RNAstructure</italic>
: Sensitivity (TPR) (without CROSS: TPR = 0.43; with CROSS: TPR = 0.63) and precision (without CROSS: PPV = 0.44; with CROSS: PPV = 0.71) are reported.</p>
</caption>
<graphic xlink:href="gkw1094fig6"></graphic>
</fig>
</sec>
<sec id="SEC3-6">
<title>The
<italic>Xist</italic>
case and comparison with DMS experiments</title>
<p>Due to the complexity of the configuration space, the structural profile of sequences >1 000–1 500 nucleotides is extremely difficult to predict with thermodynamic approaches (
<xref rid="B22" ref-type="bibr">22</xref>
), which makes CROSS a valid alternative to study long non-coding RNAs (
<xref rid="B23" ref-type="bibr">23</xref>
). To illustrate CROSS performances on large RNAs, we predicted the structural profile of murine
<italic>Xist</italic>
non-coding RNA (17 900 nt) using the
<italic>consensus</italic>
of our five models (Materials and Methods:
<italic>Consensus models</italic>
; Figure
<xref ref-type="fig" rid="F7">7A</xref>
).
<italic>Xist</italic>
was analyzed using DMS probing (
<xref rid="B16" ref-type="bibr">16</xref>
), an independent technique not used in the training of CROSS (the transcript was not present in any training set of CROSS). Using the top and bottom 10% of the experimental DMS data on
<italic>Xist</italic>
profile (3 580 fragments after removing regions with unreliable scores in Rep B), the Z-Score shows an AUC of 0.75 (Figure
<xref ref-type="fig" rid="F7">7A</xref>
, right corner). In agreement with DMS experiments (
<xref rid="B16" ref-type="bibr">16</xref>
), CROSS identifies the structural elements associated with repetitive regions Rep A, B and F and resolves their internal structures with correlations of 0.35, 0.46 and 0.75, respectively (see
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
). Although the method slightly overestimates the structural content of Rep E, it is able to accurately predict its profile (correlation of 0.63,
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
). While the sequences of Rep A and B are conserved across species and show a high degree of structural content, the 3΄ region of
<italic>Xist</italic>
is variable (
<xref rid="B24" ref-type="bibr">24</xref>
) and predicted by CROSS to be more single-stranded.</p>
<fig id="F7" orientation="portrait" position="float">
<label>Figure 7.</label>
<caption>
<p>(
<bold>A</bold>
) CROSS
<italic>Z-Score consensus</italic>
prediction of the secondary structure profile of murine
<italic>Xist</italic>
long non-coding RNA (a 200 nt window is employed for smoothing). Structured regions, in correspondence to known repetitive domains (Rep A, B and F), are highlighted and the correlations with dimethyl sulfate (DMS) data (
<xref rid="B16" ref-type="bibr">16</xref>
) are reported on top. A detailed view of the CROSS and DMS profiles for Rep A, B, F and E is provided in
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
. We note that Rep B contains regions with insufficient sequencing data to determine DMS reactivity (
<xref rid="B16" ref-type="bibr">16</xref>
) that were excluded from the analysis. Our predictions indicate lower structural content at 3΄, in line with previous reports indicating poor sequence conservation (only Rep A and B are highly conserved) (
<xref rid="B24" ref-type="bibr">24</xref>
). The ROC curve of the
<italic>Z-Score</italic>
predictions on high-confidence DMS data (10% top and bottom nucleotides, 3 580 fragments) is reported in the corner (AUC 0.75). (
<bold>B</bold>
) Predictions of human coding DNA sequences (CDSs), untranslated regions (UTRs) and long-intergenic non-coding RNA (lincRNA) (ENSEMBL version 82; total number of transcripts: 50 000; 14 000 lincRNA isoforms). We predict that the UTRs are more structured than the CDSs, in agreement with previous studies (
<italic>P</italic>
-value < 2.2e-16, Kolmogorov–Smirnov) (
<xref rid="B1" ref-type="bibr">1</xref>
). For each set we show a known example [the APP 5΄ UTR is more structured, as shown in previous studies (
<xref rid="B27" ref-type="bibr">27</xref>
); SERPINE3 has a structured CDS in agreement with PARS data (
<xref rid="B4" ref-type="bibr">4</xref>
);
<italic>Xist</italic>
structural content is in agreement with DMS data (
<xref rid="B16" ref-type="bibr">16</xref>
)].</p>
</caption>
<graphic xlink:href="gkw1094fig7"></graphic>
</fig>
</sec>
<sec id="SEC3-7">
<title>Structural differences in human CDS, UTRs and lincRNAs</title>
<p>We employed CROSS to analyze the structural differences between human coding DNA sequences (CDSs), untranslated regions (UTRs) (total of 217 000 non-redundant sequences each with 3΄ and 5΄ UTRs; ENSEMBL 82) and long intergenic non-coding transcripts (14 000 non-redundant sequences; ENSEMBL 82;
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
; Figure
<xref ref-type="fig" rid="F7">7B</xref>
). In agreement with previous evidence (
<xref rid="B1" ref-type="bibr">1</xref>
), we predict that UTRs are more structured than CDSs (
<italic>P</italic>
-value < 2.2e-16; Kolmogorov–Smirnov). Long intergenic non-coding transcripts (see
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
:
<italic>Long intergenic non-coding RNAs</italic>
) are found to be less structured, as reported in other studies (
<xref rid="B25" ref-type="bibr">25</xref>
) (
<italic>P</italic>
-value < 2.2e-16; Kolmogorov–Smirnov). Indeed, long non-coding RNAs have complex regulatory abilities and their conformation could be more flexible and less structured, allowing a wide range of interactions (
<xref rid="B26" ref-type="bibr">26</xref>
). In agreement with previous data (
<xref rid="B27" ref-type="bibr">27</xref>
), we also observe that the 5΄ UTR of the amyloid precursor protein APP transcript is highly structured (>65% double-stranded). Similarly, the mRNA of serpin peptidase inhibitor SERPINE3 is predicted to be highly structured (>55%), as reported in PARS screenings (
<xref rid="B4" ref-type="bibr">4</xref>
). We predict that 45% of
<italic>Xist</italic>
is structured in domains, as revealed by DMS profiling (
<xref rid="B16" ref-type="bibr">16</xref>
).</p>
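<p>The Kolmogorov–Smirnov comparisons above rest on the maximum distance between two empirical cumulative distributions. The sketch below computes only that statistic, not the associated <italic>P</italic>-value, and assumes that each transcript is summarized by a single number such as its predicted fraction of double-stranded nucleotides; that summary, the function name and the toy values are assumptions for illustration.</p>
<preformat>
```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D = max |F_a(x) - F_b(x)|."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    for x in sorted(set(a + b)):
        # Advance each pointer past all values not exceeding x.
        while len(a) > i and x >= a[i]:
            i += 1
        while len(b) > j and x >= b[j]:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Hypothetical per-transcript structural content for two classes.
utr_content = [0.62, 0.58, 0.66, 0.71, 0.60]
cds_content = [0.48, 0.52, 0.55, 0.50, 0.59]
print(ks_statistic(utr_content, cds_content))
```
</preformat>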
</sec>
</sec>
<sec sec-type="conclusions" id="SEC4">
<title>CONCLUSION</title>
<p>The study of large transcripts requires intense experimental work that could be substantially reduced by using computational approaches to characterize their structural features (
<xref rid="B16" ref-type="bibr">16</xref>
). Methods based on thermodynamic principles (
<xref rid="B18" ref-type="bibr">18</xref>
,
<xref rid="B28" ref-type="bibr">28</xref>
) can be employed for RNAs shorter than 1 000–1 500 nucleotides but do not work for larger molecules because of the complexity of the configuration space (
<xref rid="B22" ref-type="bibr">22</xref>
). Our approach instead relies on local sequence properties of RNAs, which is key to fast high-throughput profiling because the computational load scales linearly with sequence length. CROSS therefore predicts structural profiles without sequence length restrictions.</p>
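<p>The linear scaling that follows from using only local sequence properties can be illustrated with a minimal sketch. It is not the CROSS model: the score used here (GC fraction of a window centred on each nucleotide), the function name window_profile and the half_width parameter are all assumptions chosen for illustration.</p>

```python
def window_profile(sequence, half_width=6):
    """Toy per-nucleotide profile from a local window (illustrative only).

    The score is the GC fraction of the window centred on each position.
    This is a stand-in for a trained model, NOT the CROSS scoring function.
    """
    seq = sequence.upper()
    n = len(seq)
    # Prefix sums let every window be evaluated in constant time,
    # so the whole profile costs a number of steps proportional to n.
    prefix = [0]
    for base in seq:
        prefix.append(prefix[-1] + (1 if base in ("G", "C") else 0))
    profile = []
    for i in range(n):
        start = max(0, i - half_width)        # window is clipped at the 5' end
        end = min(n, i + half_width + 1)      # and at the 3' end
        profile.append((prefix[end] - prefix[start]) / (end - start))
    return profile


if __name__ == "__main__":
    # One score per nucleotide, whatever the length of the input.
    print(window_profile("GGGAAACCCUUU", half_width=2))
```

<p>Because each position only looks at a fixed-size neighbourhood, doubling the sequence length doubles the work, which is the property that removes the length restriction.</p>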
<p>We built CROSS using data from SHAPE (
<xref rid="B5" ref-type="bibr">5</xref>
,
<xref rid="B6" ref-type="bibr">6</xref>
) and PARS (
<xref rid="B3" ref-type="bibr">3</xref>
,
<xref rid="B4" ref-type="bibr">4</xref>
) studies as well as NMR/X-ray experiments. Models based on PARS and icSHAPE experiments show the highest predictive power, with an average accuracy of 0.77 and 0.76 and a positive predictive value (PPV) of 0.80 and 0.77, respectively. The different algorithms can be used independently or combined to obtain insights into the secondary structure of a transcript. Since each technique has its specificities and biases, combining multiple approaches is recommended to achieve a better understanding of structural properties (
<xref rid="B14" ref-type="bibr">14</xref>
).</p>
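<p>For readers less familiar with the two metrics quoted above, the sketch below computes accuracy and PPV from binary calls using their standard definitions. The function name accuracy_and_ppv and the encoding (1 for double-stranded, 0 for single-stranded) are assumptions made for illustration; the exact evaluation protocol used for CROSS may differ.</p>

```python
def accuracy_and_ppv(predicted, observed):
    """Return (accuracy, PPV) for two equal-length lists of 0/1 labels.

    Assumed encoding: 1 = double-stranded, 0 = single-stranded.
    accuracy = (TP + TN) / all positions
    PPV      = TP / (TP + FP)
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("at least one position is required")
    tp = sum(1 for p, o in zip(predicted, observed) if p == 1 and o == 1)
    tn = sum(1 for p, o in zip(predicted, observed) if p == 0 and o == 0)
    fp = sum(1 for p, o in zip(predicted, observed) if p == 1 and o == 0)
    accuracy = (tp + tn) / len(predicted)
    # PPV is undefined when nothing is predicted as double-stranded.
    ppv = tp / (tp + fp) if (tp + fp) != 0 else float("nan")
    return accuracy, ppv


if __name__ == "__main__":
    acc, ppv = accuracy_and_ppv([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
    print(acc, ppv)
```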
<p>On high-throughput experimental data sets CROSS outperforms
<italic>RNAstructure</italic>
(
<xref rid="B18" ref-type="bibr">18</xref>
) and
<italic>RNAfold</italic>
(
<xref rid="B15" ref-type="bibr">15</xref>
) (CROSS: accuracy of 0.80 for PARS-Yeast and 0.81 for PARS-Human;
<italic>RNAstructure</italic>
and
<italic>RNAfold</italic>
: 0.72–0.74 for PARS-Yeast, 0.67–0.69 for PARS-Human). Yet, previous studies indicate that thermodynamic methods have a higher predictive power when the information derived from SHAPE experiments is integrated (
<xref rid="B12" ref-type="bibr">12</xref>
). Comparing SHAPE experiments and CROSS predictions on RNA molecules for which NMR/X-ray data are available (
<xref rid="B15" ref-type="bibr">15</xref>
), we found similar performances with an average precision of 0.74 (CROSS) and 0.76 (SHAPE), and an area under the receiver operating characteristic curve (AUC) of 0.72 (CROSS) and 0.76 (SHAPE). Thus, CROSS can be considered an
<italic>in silico</italic>
alternative to SHAPE experiments (
<xref rid="B5" ref-type="bibr">5</xref>
,
<xref rid="B6" ref-type="bibr">6</xref>
) and its integration in
<italic>RNAstructure</italic>
(
<xref rid="B17" ref-type="bibr">17</xref>
,
<xref rid="B18" ref-type="bibr">18</xref>
) shows performances (PPV: 0.72; AUC: 0.85) that are comparable to those achieved using real SHAPE data (PPV: 0.80, AUC: 0.88).</p>
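<p>The AUC values reported here can be read as the probability that a truly double-stranded nucleotide receives a higher score than a truly single-stranded one. A minimal sketch of that pairwise definition follows; the function name roc_auc and the 0/1 label encoding are assumptions for illustration, and the quadratic loop is chosen for clarity rather than speed.</p>

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from its pairwise (rank) definition.

    scores: per-nucleotide propensities (higher = more likely double-stranded)
    labels: assumed encoding, 1 = double-stranded, 0 = single-stranded
    Ties between a positive and a negative count as half a win.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

<p>A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.</p>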
<p>Since CROSS is fast (less than 2 min to profile a transcript of 20 000 nucleotides), it can be used for high-throughput predictions of RNA secondary structure. We used CROSS to profile sequences taken from CDSs and untranslated regions (UTRs) of more than 200 000 isoforms (calculated in less than 72 h), reporting a structural content that is compatible with the literature (
<xref rid="B1" ref-type="bibr">1</xref>
). We also studied the structural profile of
<italic>Xist</italic>
and identified specific regions in agreement with DMS experiments [correlations of 0.63 and 0.75 for Rep E and Rep F (
<xref rid="B16" ref-type="bibr">16</xref>
)].</p>
<p>Our predictions of structural features will facilitate the design of experimental studies on long transcripts by revealing the structural state of their regions. The calculations can be employed to shed light on the evolution of RNA molecules and on their interactions with other molecules. Our approach can also be exploited to improve the predictive power of algorithms such as
<italic>cat</italic>
RAPID, which computes the interaction propensity of protein and RNA molecules (
<xref rid="B29" ref-type="bibr">29</xref>
). We envisage that the combination of CROSS with thermodynamics-based approaches will be the key ingredient to improve predictions of RNA structure.</p>
</sec>
<sec id="SEC5">
<title>AVAILABILITY</title>
<p>CROSS is freely available at
<ext-link ext-link-type="uri" xlink:href="http://service.tartaglialab.com/new_submission/cross">http://service.tartaglialab.com/new_submission/cross</ext-link>
.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<supplementary-material content-type="local-data" id="sup1">
<label>Supplementary Data</label>
<media xlink:href="gkw1094_supplementary_data.zip">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<title>ACKNOWLEDGEMENTS</title>
<p>The authors thank Philipp Germann, Irene Julca, Davide Cirillo and the other members of our group for useful comments.</p>
</ack>
<sec id="SEC6">
<title>SUPPLEMENTARY DATA</title>
<p>
<xref ref-type="supplementary-material" rid="sup1">Supplementary Data</xref>
are available at NAR Online.</p>
</sec>
<sec id="SEC7">
<title>FUNDING</title>
<p>The research leading to these results has received funding from the European Union Seventh Framework Programme [FP7/2007-2013]; European Research Council [RIBOMYLOME_309545 to GGT]; Spanish Ministry of Economy and Competitiveness [BFU2014-55054-P to GGT]; AGAUR [2014 SGR 00685 to GGT]; Spanish Ministry of Economy and Competitiveness, European Research Development Fund ERDF, ‘Centro de Excelencia Severo Ochoa 2013-2017’ [SEV-2012-0208]. Funding for open access charge: European Research Council [RIBOMYLOME_309545 to GGT]; Spanish Ministry of Economy and Competitiveness [BFU2014-55054-P to GGT]. The authors also thank the CRG fellowship to SM.</p>
<p>
<italic>Conflict of interest statement</italic>
. None declared.</p>
</sec>
<ref-list>
<title>REFERENCES</title>
<ref id="B1">
<label>1.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Mortimer</surname>
<given-names>S.A.</given-names>
</name>
,
<name name-style="western">
<surname>Kidwell</surname>
<given-names>M.A.</given-names>
</name>
,
<name name-style="western">
<surname>Doudna</surname>
<given-names>J.A.</given-names>
</name>
</person-group>
<article-title>Insights into RNA structure and function from genome-wide studies</article-title>
.
<source>Nat. Rev. Genet.</source>
<year>2014</year>
;
<volume>15</volume>
:
<fpage>469</fpage>
<lpage>479</lpage>
.
<pub-id pub-id-type="pmid">24821474</pub-id>
</mixed-citation>
</ref>
<ref id="B2">
<label>2.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Tartaglia</surname>
<given-names>G.G.</given-names>
</name>
</person-group>
<article-title>The grand challenge of characterizing ribonucleoprotein networks</article-title>
.
<source>Front. Mol. Biosci</source>
.
<year>2016</year>
;
<volume>3</volume>
,
<comment>doi:10.3389/fmolb.2016.00024</comment>
.</mixed-citation>
</ref>
<ref id="B3">
<label>3.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Kertesz</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Wan</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Mazor</surname>
<given-names>E.</given-names>
</name>
,
<name name-style="western">
<surname>Rinn</surname>
<given-names>J.L.</given-names>
</name>
,
<name name-style="western">
<surname>Nutter</surname>
<given-names>R.C.</given-names>
</name>
,
<name name-style="western">
<surname>Chang</surname>
<given-names>H.Y.</given-names>
</name>
,
<name name-style="western">
<surname>Segal</surname>
<given-names>E.</given-names>
</name>
</person-group>
<article-title>Genome-wide measurement of RNA secondary structure in yeast</article-title>
.
<source>Nature</source>
.
<year>2010</year>
;
<volume>467</volume>
:
<fpage>103</fpage>
<lpage>107</lpage>
.
<pub-id pub-id-type="pmid">20811459</pub-id>
</mixed-citation>
</ref>
<ref id="B4">
<label>4.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Wan</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Qu</surname>
<given-names>K.</given-names>
</name>
,
<name name-style="western">
<surname>Zhang</surname>
<given-names>Q.C.</given-names>
</name>
,
<name name-style="western">
<surname>Flynn</surname>
<given-names>R.A.</given-names>
</name>
,
<name name-style="western">
<surname>Manor</surname>
<given-names>O.</given-names>
</name>
,
<name name-style="western">
<surname>Ouyang</surname>
<given-names>Z.</given-names>
</name>
,
<name name-style="western">
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
,
<name name-style="western">
<surname>Spitale</surname>
<given-names>R.C.</given-names>
</name>
,
<name name-style="western">
<surname>Snyder</surname>
<given-names>M.P.</given-names>
</name>
,
<name name-style="western">
<surname>Segal</surname>
<given-names>E.</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Landscape and variation of RNA secondary structure across the human transcriptome</article-title>
.
<source>Nature</source>
.
<year>2014</year>
;
<volume>505</volume>
:
<fpage>706</fpage>
<lpage>709</lpage>
.
<pub-id pub-id-type="pmid">24476892</pub-id>
</mixed-citation>
</ref>
<ref id="B5">
<label>5.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Spitale</surname>
<given-names>R.C.</given-names>
</name>
,
<name name-style="western">
<surname>Flynn</surname>
<given-names>R.A.</given-names>
</name>
,
<name name-style="western">
<surname>Zhang</surname>
<given-names>Q.C.</given-names>
</name>
,
<name name-style="western">
<surname>Crisalli</surname>
<given-names>P.</given-names>
</name>
,
<name name-style="western">
<surname>Lee</surname>
<given-names>B.</given-names>
</name>
,
<name name-style="western">
<surname>Jung</surname>
<given-names>J.-W.</given-names>
</name>
,
<name name-style="western">
<surname>Kuchelmeister</surname>
<given-names>H.Y.</given-names>
</name>
,
<name name-style="western">
<surname>Batista</surname>
<given-names>P.J.</given-names>
</name>
,
<name name-style="western">
<surname>Torre</surname>
<given-names>E.A.</given-names>
</name>
,
<name name-style="western">
<surname>Kool</surname>
<given-names>E.T.</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Structural imprints in vivo decode RNA regulatory mechanisms</article-title>
.
<source>Nature</source>
.
<year>2015</year>
;
<volume>519</volume>
:
<fpage>486</fpage>
<lpage>490</lpage>
.
<pub-id pub-id-type="pmid">25799993</pub-id>
</mixed-citation>
</ref>
<ref id="B6">
<label>6.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Wilkinson</surname>
<given-names>K.A.</given-names>
</name>
,
<name name-style="western">
<surname>Merino</surname>
<given-names>E.J.</given-names>
</name>
,
<name name-style="western">
<surname>Weeks</surname>
<given-names>K.M.</given-names>
</name>
</person-group>
<article-title>Selective 2’-hydroxyl acylation analyzed by primer extension (SHAPE): quantitative RNA structure analysis at single nucleotide resolution</article-title>
.
<source>Nat. Protoc.</source>
<year>2006</year>
;
<volume>1</volume>
:
<fpage>1610</fpage>
<lpage>1616</lpage>
.
<pub-id pub-id-type="pmid">17406453</pub-id>
</mixed-citation>
</ref>
<ref id="B7">
<label>7.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Cordero</surname>
<given-names>P.</given-names>
</name>
,
<name name-style="western">
<surname>Kladwang</surname>
<given-names>W.</given-names>
</name>
,
<name name-style="western">
<surname>VanLang</surname>
<given-names>C.C.</given-names>
</name>
,
<name name-style="western">
<surname>Das</surname>
<given-names>R.</given-names>
</name>
</person-group>
<article-title>Quantitative dimethyl sulfate mapping for automated RNA secondary structure inference</article-title>
.
<source>Biochemistry</source>
.
<year>2012</year>
;
<volume>51</volume>
:
<fpage>7037</fpage>
<lpage>7039</lpage>
.
<pub-id pub-id-type="pmid">22913637</pub-id>
</mixed-citation>
</ref>
<ref id="B8">
<label>8.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Rouskin</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Zubradt</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Washietl</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Kellis</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Weissman</surname>
<given-names>J.S.</given-names>
</name>
</person-group>
<article-title>Genome-wide probing of RNA structure reveals active unfolding of mRNA structures in vivo</article-title>
.
<source>Nature</source>
.
<year>2014</year>
;
<volume>505</volume>
:
<fpage>701</fpage>
<lpage>705</lpage>
.
<pub-id pub-id-type="pmid">24336214</pub-id>
</mixed-citation>
</ref>
<ref id="B9">
<label>9.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Wells</surname>
<given-names>S.E.</given-names>
</name>
,
<name name-style="western">
<surname>Hughes</surname>
<given-names>J.M.</given-names>
</name>
,
<name name-style="western">
<surname>Igel</surname>
<given-names>A.H.</given-names>
</name>
,
<name name-style="western">
<surname>Ares</surname>
<given-names>M.</given-names>
</name>
</person-group>
<article-title>Use of dimethyl sulfate to probe RNA structure in vivo</article-title>
.
<source>Methods Enzymol.</source>
<year>2000</year>
;
<volume>318</volume>
:
<fpage>479</fpage>
<lpage>493</lpage>
.
<pub-id pub-id-type="pmid">10890007</pub-id>
</mixed-citation>
</ref>
<ref id="B10">
<label>10.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Watts</surname>
<given-names>J.M.</given-names>
</name>
,
<name name-style="western">
<surname>Dang</surname>
<given-names>K.K.</given-names>
</name>
,
<name name-style="western">
<surname>Gorelick</surname>
<given-names>R.J.</given-names>
</name>
,
<name name-style="western">
<surname>Leonard</surname>
<given-names>C.W.</given-names>
</name>
,
<name name-style="western">
<surname>Bess</surname>
<given-names>J.W.</given-names>
</name>
,
<name name-style="western">
<surname>Swanstrom</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Burch</surname>
<given-names>C.L.</given-names>
</name>
,
<name name-style="western">
<surname>Weeks</surname>
<given-names>K.M.</given-names>
</name>
</person-group>
<article-title>Architecture and secondary structure of an entire HIV-1 RNA genome</article-title>
.
<source>Nature</source>
.
<year>2009</year>
;
<volume>460</volume>
:
<fpage>711</fpage>
<lpage>716</lpage>
.
<pub-id pub-id-type="pmid">19661910</pub-id>
</mixed-citation>
</ref>
<ref id="B11">
<label>11.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Andronescu</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Bereg</surname>
<given-names>V.</given-names>
</name>
,
<name name-style="western">
<surname>Hoos</surname>
<given-names>H.H.</given-names>
</name>
,
<name name-style="western">
<surname>Condon</surname>
<given-names>A.</given-names>
</name>
</person-group>
<article-title>RNA STRAND: the RNA secondary structure and statistical analysis database</article-title>
.
<source>BMC Bioinformatics</source>
.
<year>2008</year>
;
<volume>9</volume>
:
<fpage>340</fpage>
<lpage>349</lpage>
.
<pub-id pub-id-type="pmid">18700982</pub-id>
</mixed-citation>
</ref>
<ref id="B12">
<label>12.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Deigan</surname>
<given-names>K.E.</given-names>
</name>
,
<name name-style="western">
<surname>Li</surname>
<given-names>T.W.</given-names>
</name>
,
<name name-style="western">
<surname>Mathews</surname>
<given-names>D.H.</given-names>
</name>
,
<name name-style="western">
<surname>Weeks</surname>
<given-names>K.M.</given-names>
</name>
</person-group>
<article-title>Accurate SHAPE-directed RNA structure determination</article-title>
.
<source>Proc. Natl. Acad. Sci. U.S.A.</source>
<year>2009</year>
;
<volume>106</volume>
:
<fpage>97</fpage>
<lpage>102</lpage>
.
<pub-id pub-id-type="pmid">19109441</pub-id>
</mixed-citation>
</ref>
<ref id="B13">
<label>13.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Bellucci</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Agostini</surname>
<given-names>F.</given-names>
</name>
,
<name name-style="western">
<surname>Masin</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Tartaglia</surname>
<given-names>G.G.</given-names>
</name>
</person-group>
<article-title>Predicting protein associations with long noncoding RNAs</article-title>
.
<source>Nat. Methods</source>
.
<year>2011</year>
;
<volume>8</volume>
:
<fpage>444</fpage>
<lpage>445</lpage>
.
<pub-id pub-id-type="pmid">21623348</pub-id>
</mixed-citation>
</ref>
<ref id="B14">
<label>14.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Novikova</surname>
<given-names>I.V.</given-names>
</name>
,
<name name-style="western">
<surname>Hennelly</surname>
<given-names>S.P.</given-names>
</name>
,
<name name-style="western">
<surname>Sanbonmatsu</surname>
<given-names>K.Y.</given-names>
</name>
</person-group>
<article-title>Structural architecture of the human long non-coding RNA, steroid receptor RNA activator</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2012</year>
;
<volume>40</volume>
:
<fpage>5034</fpage>
<lpage>5051</lpage>
.
<pub-id pub-id-type="pmid">22362738</pub-id>
</mixed-citation>
</ref>
<ref id="B15">
<label>15.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Lorenz</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Luntzer</surname>
<given-names>D.</given-names>
</name>
,
<name name-style="western">
<surname>Hofacker</surname>
<given-names>I.L.</given-names>
</name>
,
<name name-style="western">
<surname>Stadler</surname>
<given-names>P.F.</given-names>
</name>
,
<name name-style="western">
<surname>Wolfinger</surname>
<given-names>M.T.</given-names>
</name>
</person-group>
<article-title>SHAPE directed RNA folding</article-title>
.
<source>Bioinformatics</source>
.
<year>2016</year>
;
<volume>32</volume>
:
<fpage>145</fpage>
<lpage>147</lpage>
.
<pub-id pub-id-type="pmid">26353838</pub-id>
</mixed-citation>
</ref>
<ref id="B16">
<label>16.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Fang</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Moss</surname>
<given-names>W.N.</given-names>
</name>
,
<name name-style="western">
<surname>Rutenberg-Schoenberg</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Simon</surname>
<given-names>M.D.</given-names>
</name>
</person-group>
<article-title>Probing Xist RNA structure in cells using targeted structure-seq</article-title>
.
<source>PLoS Genet.</source>
<year>2015</year>
;
<volume>11</volume>
:
<fpage>e1005668</fpage>
.
<pub-id pub-id-type="pmid">26646615</pub-id>
</mixed-citation>
</ref>
<ref id="B17">
<label>17.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Mathews</surname>
<given-names>D.H.</given-names>
</name>
,
<name name-style="western">
<surname>Sabina</surname>
<given-names>J.</given-names>
</name>
,
<name name-style="western">
<surname>Zuker</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Turner</surname>
<given-names>D.H.</given-names>
</name>
</person-group>
<article-title>Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure</article-title>
.
<source>J. Mol. Biol.</source>
<year>1999</year>
;
<volume>288</volume>
:
<fpage>911</fpage>
<lpage>940</lpage>
.
<pub-id pub-id-type="pmid">10329189</pub-id>
</mixed-citation>
</ref>
<ref id="B18">
<label>18.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Reuter</surname>
<given-names>J.S.</given-names>
</name>
,
<name name-style="western">
<surname>Mathews</surname>
<given-names>D.H.</given-names>
</name>
</person-group>
<article-title>RNAstructure: software for RNA secondary structure prediction and analysis</article-title>
.
<source>BMC Bioinformatics</source>
.
<year>2010</year>
;
<volume>11</volume>
:
<fpage>129</fpage>
<lpage>138</lpage>
.
<pub-id pub-id-type="pmid">20230624</pub-id>
</mixed-citation>
</ref>
<ref id="B19">
<label>19.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Bailey</surname>
<given-names>T.L.</given-names>
</name>
,
<name name-style="western">
<surname>Johnson</surname>
<given-names>J.</given-names>
</name>
,
<name name-style="western">
<surname>Grant</surname>
<given-names>C.E.</given-names>
</name>
,
<name name-style="western">
<surname>Noble</surname>
<given-names>W.S.</given-names>
</name>
</person-group>
<article-title>The MEME Suite</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2015</year>
;
<volume>43</volume>
:
<fpage>W39</fpage>
<lpage>W49</lpage>
.
<pub-id pub-id-type="pmid">25953851</pub-id>
</mixed-citation>
</ref>
<ref id="B20">
<label>20.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Alipanahi</surname>
<given-names>B.</given-names>
</name>
,
<name name-style="western">
<surname>Delong</surname>
<given-names>A.</given-names>
</name>
,
<name name-style="western">
<surname>Weirauch</surname>
<given-names>M.T.</given-names>
</name>
,
<name name-style="western">
<surname>Frey</surname>
<given-names>B.J.</given-names>
</name>
</person-group>
<article-title>Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning</article-title>
.
<source>Nat. Biotech.</source>
<year>2015</year>
;
<volume>33</volume>
:
<fpage>831</fpage>
<lpage>838</lpage>
.</mixed-citation>
</ref>
<ref id="B21">
<label>21.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Wu</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Shi</surname>
<given-names>B.</given-names>
</name>
,
<name name-style="western">
<surname>Ding</surname>
<given-names>X.</given-names>
</name>
,
<name name-style="western">
<surname>Liu</surname>
<given-names>T.</given-names>
</name>
,
<name name-style="western">
<surname>Hu</surname>
<given-names>X.</given-names>
</name>
,
<name name-style="western">
<surname>Yip</surname>
<given-names>K.Y.</given-names>
</name>
,
<name name-style="western">
<surname>Yang</surname>
<given-names>Z.R.</given-names>
</name>
,
<name name-style="western">
<surname>Mathews</surname>
<given-names>D.H.</given-names>
</name>
,
<name name-style="western">
<surname>Lu</surname>
<given-names>Z.J.</given-names>
</name>
</person-group>
<article-title>Improved prediction of RNA secondary structure by integrating the free energy model with restraints derived from experimental probing data</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2015</year>
;
<volume>43</volume>
:
<fpage>7247</fpage>
<lpage>7259</lpage>
.
<pub-id pub-id-type="pmid">26170232</pub-id>
</mixed-citation>
</ref>
<ref id="B22">
<label>22.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Lange</surname>
<given-names>S.J.</given-names>
</name>
,
<name name-style="western">
<surname>Maticzka</surname>
<given-names>D.</given-names>
</name>
,
<name name-style="western">
<surname>Möhl</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Gagnon</surname>
<given-names>J.N.</given-names>
</name>
,
<name name-style="western">
<surname>Brown</surname>
<given-names>C.M.</given-names>
</name>
,
<name name-style="western">
<surname>Backofen</surname>
<given-names>R.</given-names>
</name>
</person-group>
<article-title>Global or local? Predicting secondary structure and accessibility in mRNAs</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2012</year>
;
<volume>40</volume>
:
<fpage>5215</fpage>
<lpage>5226</lpage>
.
<pub-id pub-id-type="pmid">22373926</pub-id>
</mixed-citation>
</ref>
<ref id="B23">
<label>23.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Ulitsky</surname>
<given-names>I.</given-names>
</name>
,
<name name-style="western">
<surname>Bartel</surname>
<given-names>D.P.</given-names>
</name>
</person-group>
<article-title>lincRNAs: genomics, evolution, and mechanisms</article-title>
.
<source>Cell</source>
.
<year>2013</year>
;
<volume>154</volume>
:
<fpage>26</fpage>
<lpage>46</lpage>
.
<pub-id pub-id-type="pmid">23827673</pub-id>
</mixed-citation>
</ref>
<ref id="B24">
<label>24.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Nesterova</surname>
<given-names>T.B.</given-names>
</name>
,
<name name-style="western">
<surname>Slobodyanyuk</surname>
<given-names>S.Y.</given-names>
</name>
,
<name name-style="western">
<surname>Elisaphenko</surname>
<given-names>E.A.</given-names>
</name>
,
<name name-style="western">
<surname>Shevchenko</surname>
<given-names>A.I.</given-names>
</name>
,
<name name-style="western">
<surname>Johnston</surname>
<given-names>C.</given-names>
</name>
,
<name name-style="western">
<surname>Pavlova</surname>
<given-names>M.E.</given-names>
</name>
,
<name name-style="western">
<surname>Rogozin</surname>
<given-names>I.B.</given-names>
</name>
,
<name name-style="western">
<surname>Kolesnikov</surname>
<given-names>N.N.</given-names>
</name>
,
<name name-style="western">
<surname>Brockdorff</surname>
<given-names>N.</given-names>
</name>
,
<name name-style="western">
<surname>Zakian</surname>
<given-names>S.M.</given-names>
</name>
</person-group>
<article-title>Characterization of the genomic Xist locus in rodents reveals conservation of overall gene structure and tandem repeats but rapid evolution of unique sequence</article-title>
.
<source>Genome Res.</source>
<year>2001</year>
;
<volume>11</volume>
:
<fpage>833</fpage>
<lpage>849</lpage>
.
<pub-id pub-id-type="pmid">11337478</pub-id>
</mixed-citation>
</ref>
<ref id="B25">
<label>25.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Wan</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Qu</surname>
<given-names>K.</given-names>
</name>
,
<name name-style="western">
<surname>Ouyang</surname>
<given-names>Z.</given-names>
</name>
,
<name name-style="western">
<surname>Kertesz</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Li</surname>
<given-names>J.</given-names>
</name>
,
<name name-style="western">
<surname>Tibshirani</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Makino</surname>
<given-names>D.L.</given-names>
</name>
,
<name name-style="western">
<surname>Nutter</surname>
<given-names>R.C.</given-names>
</name>
,
<name name-style="western">
<surname>Segal</surname>
<given-names>E.</given-names>
</name>
,
<name name-style="western">
<surname>Chang</surname>
<given-names>H.Y.</given-names>
</name>
</person-group>
<article-title>Genome-wide measurement of RNA folding energies</article-title>
.
<source>Mol. Cell</source>
.
<year>2012</year>
;
<volume>48</volume>
:
<fpage>169</fpage>
<lpage>181</lpage>
.
<pub-id pub-id-type="pmid">22981864</pub-id>
</mixed-citation>
</ref>
<ref id="B26">
<label>26.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Rinn</surname>
<given-names>J.L.</given-names>
</name>
,
<name name-style="western">
<surname>Chang</surname>
<given-names>H.Y.</given-names>
</name>
</person-group>
<article-title>Genome regulation by long noncoding RNAs</article-title>
.
<source>Annu. Rev. Biochem.</source>
<year>2012</year>
;
<volume>81</volume>
:
<fpage>145</fpage>
<lpage>166</lpage>
.
<pub-id pub-id-type="pmid">22663078</pub-id>
</mixed-citation>
</ref>
<ref id="B27">
<label>27.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Gsponer</surname>
<given-names>J.</given-names>
</name>
,
<name name-style="western">
<surname>Babu</surname>
<given-names>M.M.</given-names>
</name>
</person-group>
<article-title>Cellular strategies for regulating functional and nonfunctional protein aggregation</article-title>
.
<source>Cell Rep.</source>
<year>2012</year>
;
<volume>2</volume>
:
<fpage>1425</fpage>
<lpage>1437</lpage>
.
<pub-id pub-id-type="pmid">23168257</pub-id>
</mixed-citation>
</ref>
<ref id="B28">
<label>28.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Gruber</surname>
<given-names>A.R.</given-names>
</name>
,
<name name-style="western">
<surname>Lorenz</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Bernhart</surname>
<given-names>S.H.</given-names>
</name>
,
<name name-style="western">
<surname>Neubock</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Hofacker</surname>
<given-names>I.L.</given-names>
</name>
</person-group>
<article-title>The Vienna RNA Websuite</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2008</year>
;
<volume>36</volume>
:
<fpage>W70</fpage>
<lpage>W74</lpage>
.
<pub-id pub-id-type="pmid">18424795</pub-id>
</mixed-citation>
</ref>
<ref id="B29">
<label>29.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Agostini</surname>
<given-names>F.</given-names>
</name>
,
<name name-style="western">
<surname>Zanzoni</surname>
<given-names>A.</given-names>
</name>
,
<name name-style="western">
<surname>Klus</surname>
<given-names>P.</given-names>
</name>
,
<name name-style="western">
<surname>Marchese</surname>
<given-names>D.</given-names>
</name>
,
<name name-style="western">
<surname>Cirillo</surname>
<given-names>D.</given-names>
</name>
,
<name name-style="western">
<surname>Tartaglia</surname>
<given-names>G.G.</given-names>
</name>
</person-group>
<article-title>catRAPID omics: a web server for large-scale prediction of protein-RNA interactions</article-title>
.
<source>Bioinformatics</source>
.
<year>2013</year>
;
<volume>29</volume>
:
<fpage>2928</fpage>
<lpage>2930</lpage>
.
<pub-id pub-id-type="pmid">23975767</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Sante/explor/CovidV2/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000967 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000967 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Sante
   |area=    CovidV2
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:5389523
   |texte=   A high-throughput approach to profile RNA structure
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:27899588" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a CovidV2 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Sat Mar 28 17:51:24 2020. Site generation: Sun Jan 31 15:35:48 2021