Covid exploration server (26 March)


A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures

Internal identifier: 000192 (Pmc/Corpus); previous: 000191; next: 000193

Authors: Hosna Jabbari; Anne Condon

Source: BMC Bioinformatics, vol. 15, article 147 (2014)

RBID : PMC:4064103

Abstract

Background

Improving accuracy and efficiency of computational methods that predict pseudoknotted RNA secondary structures is an ongoing challenge. Existing methods based on free energy minimization tend to be very slow and are limited in the types of pseudoknots that they can predict. Incorporating known structural information can improve prediction accuracy; however, there are not many methods for prediction of pseudoknotted structures that can incorporate structural information as input. There is even less understanding of the relative robustness of these methods with respect to partial information.

Results

We present a new method, Iterative HFold, for pseudoknotted RNA secondary structure prediction. Iterative HFold takes as input a pseudoknot-free structure, and produces a possibly pseudoknotted structure whose energy is at least as low as that of any (density-2) pseudoknotted structure containing the input structure. Iterative HFold leverages strengths of earlier methods, namely the fast running time of HFold, a method that is based on the hierarchical folding hypothesis, and the energy parameters of HotKnots V2.0.
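To make the input and output of the method concrete: a structure is pseudoknot-free when no two of its base pairs cross, and the output structure is expected to contain the input structure. The following is a minimal sketch of those two properties, assuming a structure is represented as a set of (i, j) base pairs with i < j and each position paired at most once. The function names are illustrative only and are not part of Iterative HFold.

```python
from itertools import combinations


def is_pseudoknot_free(pairs):
    """Return True if no two base pairs (i, j) and (k, l) cross.

    Two pairs cross when i < k < j < l, i.e. the second pair opens
    inside the first and closes outside it.
    """
    # sorted() orders pairs by their opening position, so i < k below.
    for (i, j), (k, l) in combinations(sorted(pairs), 2):
        if i < k < j < l:
            return False
    return True


def contains_structure(output_pairs, input_pairs):
    """Return True if every input base pair is kept in the output."""
    return set(input_pairs) <= set(output_pairs)


# Hypothetical example: a nested (pseudoknot-free) input structure ...
input_structure = {(1, 10), (2, 9)}
# ... and an output that adds a pair crossing (1, 10) and (2, 9).
output_structure = input_structure | {(5, 15)}

print(is_pseudoknot_free(input_structure))
print(is_pseudoknot_free(output_structure))
print(contains_structure(output_structure, input_structure))
```

The pairwise check is quadratic in the number of base pairs, which is adequate for illustration; it says nothing about how Iterative HFold itself searches for low-energy structures.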

Our experimental evaluation on a large data set shows that Iterative HFold is robust with respect to partial information, with average accuracy on pseudoknotted structures steadily increasing from roughly 54% to 79% as the user provides up to 40% of the input structure.
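The abstract does not say how accuracy is computed. One common choice for base-pair prediction is the F-measure, the harmonic mean of sensitivity and positive predictive value; the sketch below assumes that measure and the same set-of-pairs representation, and the function name is hypothetical.

```python
def f_measure(predicted_pairs, reference_pairs):
    """Harmonic mean of sensitivity and positive predictive value (PPV).

    Assumed accuracy measure for illustration; the abstract does not
    state which measure the authors report.
    """
    predicted = set(predicted_pairs)
    reference = set(reference_pairs)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    sensitivity = true_positives / len(reference)  # correct pairs / reference pairs
    ppv = true_positives / len(predicted)          # correct pairs / predicted pairs
    return 2 * sensitivity * ppv / (sensitivity + ppv)


# Hypothetical example: two of three reference pairs are predicted correctly.
reference = {(1, 10), (2, 9), (5, 15)}
predicted = {(1, 10), (2, 9), (4, 16)}
print(f_measure(predicted, reference))
```

Averaging this value over a data set gives a single figure per method, which is the kind of number the percentages above would correspond to under this assumption.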

Iterative HFold is much faster than HotKnots V2.0, while having comparable accuracy. Iterative HFold also has significantly better accuracy than IPknot on our HK-PK and IP-pk168 data sets.

Conclusions

Iterative HFold is a robust method for prediction of pseudoknotted RNA secondary structures, whose accuracy with more than 5% information about true pseudoknot-free structures is better than that of IPknot, and with about 35% information about true pseudoknot-free structures compares well with that of HotKnots V2.0 while being significantly faster. Iterative HFold and all data used in this work are freely available at http://www.cs.ubc.ca/~hjabbari/software.php.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-147) contains supplementary material, which is available to authorized users.


URL: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4064103
DOI: 10.1186/1471-2105-15-147
PubMed: 24884954
PubMed Central: 4064103


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures</title>
<author>
<name sortKey="Jabbari, Hosna" sort="Jabbari, Hosna" uniqKey="Jabbari H" first="Hosna" last="Jabbari">Hosna Jabbari</name>
<affiliation>
<nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Condon, Anne" sort="Condon, Anne" uniqKey="Condon A" first="Anne" last="Condon">Anne Condon</name>
<affiliation>
<nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">24884954</idno>
<idno type="pmc">4064103</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4064103</idno>
<idno type="RBID">PMC:4064103</idno>
<idno type="doi">10.1186/1471-2105-15-147</idno>
<date when="2014">2014</date>
<idno type="wicri:Area/Pmc/Corpus">000192</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000192</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures</title>
<author>
<name sortKey="Jabbari, Hosna" sort="Jabbari, Hosna" uniqKey="Jabbari H" first="Hosna" last="Jabbari">Hosna Jabbari</name>
<affiliation>
<nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Condon, Anne" sort="Condon, Anne" uniqKey="Condon A" first="Anne" last="Condon">Anne Condon</name>
<affiliation>
<nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>Improving accuracy and efficiency of computational methods that predict pseudoknotted RNA secondary structures is an ongoing challenge. Existing methods based on free energy minimization tend to be very slow and are limited in the types of pseudoknots that they can predict. Incorporating known structural information can improve prediction accuracy; however, there are not many methods for prediction of pseudoknotted structures that can incorporate structural information as input. There is even less understanding of the relative robustness of these methods with respect to partial information.</p>
</sec>
<sec>
<title>Results</title>
<p>We present a new method, Iterative HFold, for pseudoknotted RNA secondary structure prediction. Iterative HFold takes as input a pseudoknot-free structure, and produces a possibly pseudoknotted structure whose energy is at least as low as that of any (density-2) pseudoknotted structure containing the input structure. Iterative HFold leverages strengths of earlier methods, namely the fast running time of HFold, a method that is based on the hierarchical folding hypothesis, and the energy parameters of HotKnots V2.0.</p>
<p>Our experimental evaluation on a large data set shows that Iterative HFold is robust with respect to partial information, with average accuracy on pseudoknotted structures steadily increasing from roughly 54% to 79% as the user provides up to 40% of the input structure.</p>
<p>Iterative HFold is much faster than HotKnots V2.0, while having comparable accuracy. Iterative HFold also has significantly better accuracy than IPknot on our HK-PK and IP-pk168 data sets.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>Iterative HFold is a robust method for prediction of pseudoknotted RNA secondary structures, whose accuracy with more than 5% information about true pseudoknot-free structures is better than that of IPknot, and with about 35% information about true pseudoknot-free structures compares well with that of HotKnots V2.0 while being significantly faster. Iterative HFold and all data used in this work are freely available at <ext-link ext-link-type="uri" xlink:href="http://www.cs.ubc.ca/~hjabbari/software.php">http://www.cs.ubc.ca/~hjabbari/software.php</ext-link>.</p>
</sec>
<sec>
<title>Electronic supplementary material</title>
<p>The online version of this article (doi:10.1186/1471-2105-15-147) contains supplementary material, which is available to authorized users.</p>
</sec>
</div>
</front>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-id journal-id-type="iso-abbrev">BMC Bioinformatics</journal-id>
<journal-title-group>
<journal-title>BMC Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2105</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
<publisher-loc>London</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">24884954</article-id>
<article-id pub-id-type="pmc">4064103</article-id>
<article-id pub-id-type="publisher-id">6443</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-15-147</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Jabbari</surname>
<given-names>Hosna</given-names>
</name>
<address>
<email>hjabbari@cs.ubc.ca</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Condon</surname>
<given-names>Anne</given-names>
</name>
<address>
<email>condon@cs.ubc.ca</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="GRID">grid.17091.3e</institution-id>
<institution-id institution-id-type="ISNI">0000000122889830</institution-id>
<institution>Department of Computer Science,</institution>
<institution>University of British Columbia,</institution>
</institution-wrap>
2366 Main Mall, Vancouver, Canada</aff>
</contrib-group>
<pub-date pub-type="epub">
<day>18</day>
<month>5</month>
<year>2014</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>18</day>
<month>5</month>
<year>2014</year>
</pub-date>
<pub-date pub-type="collection">
<year>2014</year>
</pub-date>
<volume>15</volume>
<elocation-id>147</elocation-id>
<history>
<date date-type="received">
<day>7</day>
<month>1</month>
<year>2014</year>
</date>
<date date-type="accepted">
<day>8</day>
<month>5</month>
<year>2014</year>
</date>
</history>
<permissions>
<copyright-statement>© Jabbari and Condon; licensee BioMed Central Ltd. 2014</copyright-statement>
<license license-type="OpenAccess">
<license-p>This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/publicdomain/zero/1.0/">http://creativecommons.org/publicdomain/zero/1.0/</ext-link>) applies to the data made available in this article, unless otherwise stated.</license-p>
</license>
</permissions>
<abstract id="Abs1">
<sec>
<title>Background</title>
<p>Improving accuracy and efficiency of computational methods that predict pseudoknotted RNA secondary structures is an ongoing challenge. Existing methods based on free energy minimization tend to be very slow and are limited in the types of pseudoknots that they can predict. Incorporating known structural information can improve prediction accuracy; however, there are not many methods for prediction of pseudoknotted structures that can incorporate structural information as input. There is even less understanding of the relative robustness of these methods with respect to partial information.</p>
</sec>
<sec>
<title>Results</title>
<p>We present a new method, Iterative HFold, for pseudoknotted RNA secondary structure prediction. Iterative HFold takes as input a pseudoknot-free structure, and produces a possibly pseudoknotted structure whose energy is at least as low as that of any (density-2) pseudoknotted structure containing the input structure. Iterative HFold leverages strengths of earlier methods, namely the fast running time of HFold, a method that is based on the hierarchical folding hypothesis, and the energy parameters of HotKnots V2.0.</p>
<p>Our experimental evaluation on a large data set shows that Iterative HFold is robust with respect to partial information, with average accuracy on pseudoknotted structures steadily increasing from roughly 54% to 79% as the user provides up to 40% of the input structure.</p>
<p>Iterative HFold is much faster than HotKnots V2.0, while having comparable accuracy. Iterative HFold also has significantly better accuracy than IPknot on our HK-PK and IP-pk168 data sets.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>Iterative HFold is a robust method for prediction of pseudoknotted RNA secondary structures, whose accuracy with more than 5% information about true pseudoknot-free structures is better than that of IPknot, and with about 35% information about true pseudoknot-free structures compares well with that of HotKnots V2.0 while being significantly faster. Iterative HFold and all data used in this work are freely available at
<ext-link ext-link-type="uri" xlink:href="http://www.cs.ubc.ca/~hjabbari/software.php">http://www.cs.ubc.ca/~hjabbari/software.php</ext-link>
.</p>
</sec>
<sec>
<title>Electronic supplementary material</title>
<p>The online version of this article (doi:10.1186/1471-2105-15-147) contains supplementary material, which is available to authorized users.</p>
</sec>
</abstract>
<kwd-group xml:lang="en">
<title>Keywords</title>
<kwd>RNA</kwd>
<kwd>Secondary structure prediction</kwd>
<kwd>Pseudoknot</kwd>
<kwd>Hierarchical folding</kwd>
<kwd>Minimum free energy</kwd>
</kwd-group>
<custom-meta-group>
<custom-meta>
<meta-name>issue-copyright-statement</meta-name>
<meta-value>© The Author(s) 2014</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body>
<sec id="Sec1">
<title>Background</title>
<p>RNA molecules are crucial in different levels of cellular function, ranging from translation and regulation of genes to coding for proteins [
<xref ref-type="bibr" rid="CR1">1</xref>
–
<xref ref-type="bibr" rid="CR6">6</xref>
]. Understanding the structure of an RNA molecule is important in inferring its function [
<xref ref-type="bibr" rid="CR7">7</xref>
–
<xref ref-type="bibr" rid="CR10">10</xref>
]. Since experimental methods for determining RNA structure, such as X-ray crystallography and NMR, are time-consuming, expensive and in some cases infeasible, computational methods for prediction of RNA structure are valuable.</p>
<p>Currently, computational RNA structure prediction methods mainly focus on predicting RNA secondary structure, that is, the set of base pairs that form when RNA molecules fold. When multiple homologous (evolutionarily related) RNA sequences are available, the secondary structure of the sequences can be predicted using multiple sequence alignment and comparative sequence analysis [
<xref ref-type="bibr" rid="CR11">11</xref>
–
<xref ref-type="bibr" rid="CR22">22</xref>
]. Alternative approaches, which can be used to predict secondary structure of a single sequence, are based on thermodynamic parameters derived in part from experimental data [
<xref ref-type="bibr" rid="CR23">23</xref>
]. While thermodynamics-based approaches can be less accurate than comparative-based algorithms, thermodynamics-based approaches are applicable in cases of novel RNAs such as the many RNAs of unknown function recently reported by the ENCODE consortium [
<xref ref-type="bibr" rid="CR24">24</xref>
]. Thermodynamics-based approaches can also be easier to apply to prediction of the structure of interacting RNA molecules, for example, in gene knockdown studies.</p>
<p>Many computational thermodynamics-based methods find the structures with the minimum free energy (MFE) from the set of all possible structures, when each structure feature is assigned a free energy value and the energy of a structure is calculated as the sum of the features’ energies. There has been significant success in prediction of
<italic>pseudoknot-free</italic>
secondary structures (structures with no crossing base pairs) [
<xref ref-type="bibr" rid="CR23">23</xref>
,
<xref ref-type="bibr" rid="CR25">25</xref>
,
<xref ref-type="bibr" rid="CR26">26</xref>
]. While many small RNA secondary structures are pseudoknot-free, many biologically important RNA molecules, both in the cell [
<xref ref-type="bibr" rid="CR27">27</xref>
,
<xref ref-type="bibr" rid="CR28">28</xref>
], and in viral RNA [
<xref ref-type="bibr" rid="CR29">29</xref>
] are found to be pseudoknotted.</p>
<p>Since finding the MFE pseudoknotted secondary structure is NP-hard [
<xref ref-type="bibr" rid="CR30">30</xref>
–
<xref ref-type="bibr" rid="CR32">32</xref>
], polynomial time MFE-based methods for prediction of pseudoknotted secondary structures predict a restricted class of pseudoknotted structures [
<xref ref-type="bibr" rid="CR33">33</xref>
–
<xref ref-type="bibr" rid="CR35">35</xref>
]. These methods trade off run-time complexity and the generality of the class of structures they can predict. For example, the most general algorithm of Rivas and Eddy [
<xref ref-type="bibr" rid="CR33">33</xref>
], whose running time is
<italic>Θ</italic>
(
<italic>n</italic>
<sup>6</sup>
) on inputs of length
<italic>n</italic>
, is not practical for RNA sequences of length more than 100 nucleotides. This has been the main reason for development of heuristic methods for prediction of pseudoknotted structures [
<xref ref-type="bibr" rid="CR36">36</xref>
–
<xref ref-type="bibr" rid="CR41">41</xref>
]. Although heuristic methods may not find the MFE structure, they usually run faster than the MFE-based methods that handle the same class of structures. For example, HotKnots V2.0 [
<xref ref-type="bibr" rid="CR36">36</xref>
,
<xref ref-type="bibr" rid="CR41">41</xref>
] is a heuristic approach that uses carefully trained energy parameters, is guided by energy minimization and can handle kissing hairpin structures. However, HotKnots is still slow on long sequences.</p>
<p>Other methods for prediction of pseudoknotted structures, such as the IPknot method of Sato et al. [
<xref ref-type="bibr" rid="CR42">42</xref>
], are motivated by the finding of Mathews [
<xref ref-type="bibr" rid="CR43">43</xref>
] that base pairs with high base pairing probabilities in the thermodynamic ensemble are more likely to be in the known structure. In a comprehensive comparison performed by Puton et al. [
<xref ref-type="bibr" rid="CR44">44</xref>
] on the performance of publicly available non-comparative RNA secondary structure prediction methods that can handle pseudoknotted structures, IPknot ranks first for general length RNA sequences.</p>
<p>Incorporating known structural information can improve the accuracy of structure prediction. For example, Mathews et al. [
<xref ref-type="bibr" rid="CR45">45</xref>
] used SHAPE reactivity data to improve the prediction accuracy from 26.3% to 86.8% for 5S rRNA of E. coli. Roughly, the larger the SHAPE reactivity value for a given nucleotide, the more likely it is that the nucleotide is unpaired in the structure. However, limited SHAPE reactivity data is available, and the data does not unambiguously determine whether a base is paired or not or, if it is paired, to what other nucleotide. Deigan et al. [
<xref ref-type="bibr" rid="CR46">46</xref>
] created pseudo energy terms from SHAPE reactivity data, as a means of integrating such data into prediction software. They reported prediction accuracy of 96% to 100% for three moderate-sized RNAs (&lt;200 nucleotides) and for 16S rRNA (1500 nucleotides). ShapeKnots [
<xref ref-type="bibr" rid="CR47">47</xref>
] is a new method for incorporating SHAPE reactivity data for pseudoknotted structures that incorporates the pseudo energy terms into a heuristic method similar to that of Ren et al. [
<xref ref-type="bibr" rid="CR41">41</xref>
].</p>
<p>We previously presented HFold [
<xref ref-type="bibr" rid="CR48">48</xref>
], an approach for prediction of pseudoknotted structures, motivated by two goals, namely to avoid the high running time complexity of other methods for pseudoknotted secondary structure prediction and to leverage the
<italic>hierarchical folding hypothesis</italic>
. This hypothesis posits that an RNA molecule first folds into a pseudoknot-free structure; then additional base pairs are added that may form pseudoknots with the first structure so as to lower the structure’s free energy [
<xref ref-type="bibr" rid="CR49">49</xref>
]. Given a pseudoknot-free structure as input, HFold predicts a possibly pseudoknotted structure from a broad class that contains the given input structure and, relative to that constraint, has minimum free energy. HFold’s running time is
<italic>O</italic>
(
<italic>n</italic>
<sup>3</sup>
), significantly faster than other methods for predicting pseudoknotted structures. Several experts have provided evidence for, and support, the hierarchical folding hypothesis [
<xref ref-type="bibr" rid="CR49">49</xref>
–
<xref ref-type="bibr" rid="CR52">52</xref>
]. The class of structures that HFold can handle, density-2 structures, is quite general and includes many important pseudoknots including H-type pseudoknots, kissing hairpins and infinite chains of interleaved bands, with arbitrary nested (pseudoknotted) substructures. (Roughly, a structure is density-2 if no base is enclosed by more than two overlapping pseudoknotted stems.)</p>
<p>Another advantage of HFold over heuristic methods such as HotKnots or ShapeKnots is that unlike these methods, HFold minimizes the free energy of the possibly pseudoknotted output structure relative to the given input structure. Therefore HFold’s method of adding pseudoknotted stems is better motivated energetically than that of HotKnots or ShapeKnots.</p>
<p>While HFold is fast, our earlier implementation of HFold had its own shortcomings. First, due to a high pseudoknot initiation penalty in its underlying energy model, many of its predicted structures did not have pseudoknots. In addition, the low band penalty (i.e., the penalty for addition of pseudoknotted stems or bands) in its energy model encouraged addition of pseudoknotted stems whenever a pseudoknot was predicted. Second, if the first structure input to HFold contains base pairs that are not in the true pseudoknot-free structure for the given RNA sequence, or is not the complete pseudoknot-free structure (i.e., it does not include all the base pairs in the pseudoknot-free structure), HFold is often unable to predict the known pseudoknotted structure as output.</p>
<p>To summarize, existing methods for prediction of pseudoknotted structures suffer from one or both of the following shortcomings: 1) slow running time, or 2) poor prediction accuracy. Moreover there is limited opportunity for the user to provide structural information, or constraints, that can guide prediction. In cases of a prediction method that incorporates user-defined constraints, it is also useful to understand the degree to which the method’s accuracy persists as the input information degrades. We use the term
<italic>robustness with respect to partial information</italic>
or
<italic>robustness</italic>
to refer to this property of a method. (We note that in our definition of robustness we do not mean robust with respect to noise.) To the best of our knowledge, the concept of robustness in secondary structure prediction methods has not been studied before.</p>
<p>In this work we present a new method that addresses these shortcomings. Our method, Iterative HFold, takes a pseudoknot-free input structure and produces a possibly pseudoknotted structure whose energy is at least as low as that of any (density-2) pseudoknotted structure containing the input structure. Iterative HFold incorporates four different methods and reports as its final structure the structure with the lowest energy among all structures produced by these methods. While one of its underlying methods, HFold, strictly adheres to the hierarchical folding hypothesis, the other three use iterations to extend or remove the base pairs of the input structure, with the goal of finding a structure that has lower energy than the structure found by HFold. Thus, unlike HFold, Iterative HFold is able to modify the input structure (while the class of structures handled by both methods is the same). This is valuable since 1) computationally produced structures may not be completely accurate and 2) while the hierarchical folding hypothesis is a useful guiding principle, there is evidence that allowing for disruption of some base pairs in the initially formed pseudoknot-free secondary structure can improve prediction [
<xref ref-type="bibr" rid="CR53">53</xref>
,
<xref ref-type="bibr" rid="CR54">54</xref>
].</p>
<p>All of Iterative HFold’s underlying methods use the energy model of HotKnots V2.0 DP09 [
<xref ref-type="bibr" rid="CR36">36</xref>
]; with this model, HFold obtained predictions with higher accuracy than those obtained with our earlier implementation of HFold. One of Iterative HFold’s underlying methods is HFold-PKonly, which given the input structure only adds pseudoknotted base pairs. HFold-PKonly is especially useful for cases when the user has either complete information about the true pseudoknot-free structure or wants to check whether a single stem of the input structure can be part of a pseudoknot since, if the input structure only has the specific stem in question, the output structure of HFold-PKonly will determine if the given stem can be part of a pseudoknot.</p>
<p>Based on our experiments on our HK-PK and HK-PK-free data sets, which include 88 pseudoknotted structures and 337 pseudoknot-free structures respectively, ranging in length from 10 to 400 nucleotides, a single run of Iterative HFold takes no more than 9 seconds and 62 MB of memory. In contrast, one of the best heuristic methods, HotKnots V2.0, takes 1.7 hours and 91 GB of memory for a sequence with 400 nucleotides. Therefore our method is practical for prediction of long RNA structures. The bootstrap 95% percentile confidence interval for Iterative HFold’s average accuracy on the pseudoknotted structures of the HK-PK data set, (72.83%, 83.37%), is significantly higher than that of IPknot, (54.56%, 66.25%), and is comparable to that of HotKnots V2.0, (73.60%, 83.35%); these are two of the best prediction methods available. Iterative HFold’s accuracy is significantly higher than that of IPknot and HotKnots on our IP-pk168 data set. Iterative HFold also has higher accuracy than HFold even when just partial information about the true pseudoknot-free structure is provided, so it is more robust than HFold. Specifically, Iterative HFold’s average accuracy on pseudoknotted structures steadily increases from roughly 54% to 79% as the user provides up to 40% of the input structure, and continues to improve, more modestly, when further structural information is provided.</p>
</sec>
<sec id="Sec2">
<title>Methods</title>
<p>We represent an RNA molecule by a sequence,
<italic>S</italic>
, of its four bases, Adenine (A), Cytosine (C), Guanine (G) and Uracil (U). We denote the length of the RNA molecule by
<italic>n</italic>
and refer to each base by its index
<italic>i</italic>
, 1≤
<italic>i</italic>
≤
<italic>n</italic>
.</p>
<p>When an RNA molecule folds, bonds may form between canonical pairs of bases (
<italic>A</italic>
-
<italic>U</italic>
,
<italic>C</italic>
-
<italic>G</italic>
, and
<italic>G</italic>
-
<italic>U</italic>
) (see Figure
<xref rid="Fig1" ref-type="fig">1</xref>
). Throughout this work, we consider only cases where each base may pair at most with one other base, and represent base pairing between
<italic>i</italic>
and
<italic>j</italic>
by
<italic>i</italic>
.
<italic>j</italic>
. We define a
<italic>secondary structure</italic>
,
<italic>R</italic>
, as a set of pairs
<italic>i</italic>
.
<italic>j</italic>
, 1≤
<italic>i</italic>
&lt;
<italic>j</italic>
≤
<italic>n</italic>
;
<italic>i</italic>
.
<italic>j</italic>
and
<italic>k</italic>
.
<italic>j</italic>
can belong to the same set if and only if
<italic>i</italic>
=
<italic>k</italic>
.
<fig id="Fig1">
<label>Figure 1</label>
<caption>
<p>
<bold>Pseudoknotted and pseudoknot-free secondary structures.</bold>
Examples of loops and canonical base pairs in a pseudoknotted and a pseudoknot-free secondary structure. The blue base pairs belong to the
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
structure and the green base pairs belong to the
<italic>G</italic>
<sub>
<italic>small</italic>
</sub>
structure, as defined in Section ‘Definition of
<italic>G</italic>
<sub>
<italic>b</italic>
<italic>i</italic>
<italic>g</italic>
</sub>
and
<italic>G</italic>
<sub>
<italic>s</italic>
<italic>m</italic>
<italic>a</italic>
<italic>l</italic>
<italic>l</italic>
</sub>
’. This figure was produced using the VARNA software [
<xref ref-type="bibr" rid="CR55">55</xref>
].</p>
</caption>
<graphic xlink:href="12859_2014_Article_6443_Fig1_HTML" id="d29e563"></graphic>
</fig>
</p>
<p>If
<italic>i</italic>
.
<italic>j</italic>
and
<italic>k</italic>
.
<italic>l</italic>
are two base pairs of a secondary structure,
<italic>R</italic>
, and 1≤
<italic>i</italic>
<
&lt;
<italic>k</italic>
&lt;
<
<italic>l</italic>
≤
<italic>n</italic>
, we say
<italic>i</italic>
.
<italic>j</italic>
crosses
<italic>k</italic>
.
<italic>l</italic>
. We refer to a secondary structure with crossing base pairs as a
<italic>pseudoknotted secondary structure</italic>
and a secondary structure with no crossing base pairs as a
<italic>pseudoknot-free secondary structure</italic>
(see Figure
<xref rid="Fig1" ref-type="fig">1</xref>
). Figure
<xref rid="Fig1" ref-type="fig">1</xref>
shows different kinds of loops in a secondary structure. We refer the readers to Jabbari et al. [
<xref ref-type="bibr" rid="CR48">48</xref>
] or Rastegari et al. [
<xref ref-type="bibr" rid="CR56">56</xref>
] for precise definition and illustration of terms used in the figure.</p>
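<p>The definitions above translate directly into a short check. The following Python sketch is illustrative only and is not part of Iterative HFold; the function names are hypothetical, and base pairs are assumed to be given as 1-based index tuples (i, j).</p>

```python
from itertools import combinations


def crosses(p, q):
    """Return True if base pairs p = i.j and q = k.l cross."""
    # Normalise each pair so the smaller index comes first, then order
    # the two pairs so that i is at most k.
    (i, j), (k, l) = sorted([tuple(sorted(p)), tuple(sorted(q))])
    # The pairs cross when k lies strictly inside i.j and l strictly outside.
    return l > j > k > i


def is_valid_structure(pairs):
    """Each base may pair with at most one other base."""
    used = [base for pair in pairs for base in pair]
    return len(used) == len(set(used))


def is_pseudoknotted(pairs):
    """A structure is pseudoknotted if any two of its base pairs cross."""
    return any(crosses(p, q) for p, q in combinations(pairs, 2))
```

<p>For example, under these assumptions the pairs 1.10 and 5.15 cross, whereas the nested pairs 1.10 and 2.9 do not.</p>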
<sec id="Sec3">
<title>Energy model</title>
<p>Many computational methods for predicting the secondary structure of an RNA (or DNA) molecule are based on models of the free energy of loops [
<xref ref-type="bibr" rid="CR23">23</xref>
,
<xref ref-type="bibr" rid="CR25">25</xref>
,
<xref ref-type="bibr" rid="CR26">26</xref>
,
<xref ref-type="bibr" rid="CR33">33</xref>
–
<xref ref-type="bibr" rid="CR36">36</xref>
,
<xref ref-type="bibr" rid="CR48">48</xref>
]. Table
<xref rid="Tab1" ref-type="table">1</xref>
summarizes the energy constants and functions used in our energy model for pseudoknotted structures. The values of these energy parameters are those of the DP09 parameter set of Andronescu et al. [
<xref ref-type="bibr" rid="CR36">36</xref>
], used by the HotKnots V2.0 prediction software.
<table-wrap id="Tab1">
<label>Table 1</label>
<caption>
<p>
<bold>Energy parameters</bold>
</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Name</th>
<th align="center">Description</th>
<th align="center">Value (
<italic>kcal/mol</italic>
)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">
<italic>P</italic>
<sub>
<italic>s</italic>
</sub>
</td>
<td align="center">Exterior pseudoloop</td>
<td align="center">−1.38</td>
</tr>
<tr>
<td align="left"></td>
<td align="center">initiation penalty</td>
<td align="center"></td>
</tr>
<tr>
<td align="left">
<italic>P</italic>
<sub>
<italic>s</italic>
<italic>m</italic>
</sub>
</td>
<td align="center">Penalty for introducing pseudoknot</td>
<td align="center">10.07</td>
</tr>
<tr>
<td align="left"></td>
<td align="center">inside a multiloop</td>
<td align="center"></td>
</tr>
<tr>
<td align="left">
<italic>P</italic>
<sub>
<italic>s</italic>
<italic>p</italic>
</sub>
</td>
<td align="center">Penalty for introducing pseudoknot</td>
<td align="center">15.00</td>
</tr>
<tr>
<td align="left"></td>
<td align="center">inside a pseudoloop</td>
<td align="center"></td>
</tr>
<tr>
<td align="left">
<italic>P</italic>
<sub>
<italic>b</italic>
</sub>
</td>
<td align="center">Band penalty</td>
<td align="center">2.46</td>
</tr>
<tr>
<td align="left">
<italic>P</italic>
<sub>
<italic>u</italic>
<italic>p</italic>
</sub>
</td>
<td align="center">Penalty for unpaired base</td>
<td align="center">0.06</td>
</tr>
<tr>
<td align="left"></td>
<td align="center">in a pseudoloop</td>
<td align="center"></td>
</tr>
<tr>
<td align="left">
<italic>P</italic>
<sub>
<italic>p</italic>
<italic>s</italic>
</sub>
</td>
<td align="center">Penalty for closed subregion</td>
<td align="center">0.96</td>
</tr>
<tr>
<td align="left"></td>
<td align="center">inside a pseudoloop</td>
<td align="center"></td>
</tr>
<tr>
<td align="left">
<italic>e</italic>
<sub>
<italic>H</italic>
</sub>
(
<italic>i</italic>
,
<italic>j</italic>
)</td>
<td align="center">Energy of a hairpin loop closed by
<italic>i</italic>
.
<italic>j</italic>
</td>
<td align="center"></td>
</tr>
<tr>
<td align="left">
<italic>e</italic>
<sub>
<italic>S</italic>
</sub>
(
<italic>i</italic>
,
<italic>j</italic>
)</td>
<td align="center">Energy of stacked pair closed by
<italic>i</italic>
.
<italic>j</italic>
</td>
<td align="center"></td>
</tr>
<tr>
<td align="left">
<italic>e</italic>
<sub>
<italic>s</italic>
<italic>t</italic>
<italic>P</italic>
</sub>
(
<italic>i</italic>
,
<italic>j</italic>
)</td>
<td align="center">Energy of stacked pair that</td>
<td align="center">0.89×
<italic>e</italic>
<sub>
<italic>S</italic>
</sub>
(
<italic>i</italic>
,
<italic>j</italic>
)</td>
</tr>
<tr>
<td align="left"></td>
<td align="center">spans a band</td>
<td align="center"></td>
</tr>
<tr>
<td align="left">
<italic>e</italic>
<sub>
<italic>i</italic>
<italic>n</italic>
<italic>t</italic>
</sub>
(
<italic>i</italic>
,
<italic>r</italic>
,
<italic>r</italic>
<sup>′</sup>
,
<italic>j</italic>
)</td>
<td align="center">Energy of a pseudoknot-free</td>
<td align="center"></td>
</tr>
<tr>
<td align="left"></td>
<td align="center">internal loop</td>
<td align="center"></td>
</tr>
<tr>
<td align="left">
<italic>e</italic>
<sub>
<italic>i</italic>
<italic>n</italic>
<italic>t</italic>
<italic>P</italic>
</sub>
(
<italic>i</italic>
,
<italic>r</italic>
,
<italic>r</italic>
<sup>′</sup>
,
<italic>j</italic>
)</td>
<td align="center">Energy of internal loop</td>
<td align="center">0.74×
<italic>e</italic>
<sub>
<italic>i</italic>
<italic>n</italic>
<italic>t</italic>
</sub>
(
<italic>i</italic>
,
<italic>r</italic>
,
<italic>r</italic>
<sup>′</sup>
,
<italic>j</italic>
)</td>
</tr>
<tr>
<td align="left"></td>
<td align="center">that spans a band</td>
<td align="center"></td>
</tr>
<tr>
<td align="left">
<italic>a</italic>
</td>
<td align="center">Multiloop initiation penalty</td>
<td align="center">3.39</td>
</tr>
<tr>
<td align="left">
<italic>b</italic>
</td>
<td align="center">Multiloop base pair penalty</td>
<td align="center">0.03</td>
</tr>
<tr>
<td align="left">
<italic>c</italic>
</td>
<td align="center">Penalty for unpaired base</td>
<td align="center">0.02</td>
</tr>
<tr>
<td align="left"></td>
<td align="center">in a multiloop</td>
<td align="center"></td>
</tr>
<tr>
<td align="left">
<italic>a</italic>
<sup>′</sup>
</td>
<td align="center">Penalty for introducing a multiloop</td>
<td align="center">3.41</td>
</tr>
<tr>
<td align="left"></td>
<td align="center">that spans a band</td>
<td align="center"></td>
</tr>
<tr>
<td align="left">
<italic>b</italic>
<sup>′</sup>
</td>
<td align="center">Base pair penalty for a multiloop</td>
<td align="center">0.56</td>
</tr>
<tr>
<td align="left"></td>
<td align="center">that spans a band</td>
<td align="center"></td>
</tr>
<tr>
<td align="left">
<italic>c</italic>
<sup>′</sup>
</td>
<td align="center">Penalty for unpaired base in a multiloop</td>
<td align="center">0.12</td>
</tr>
<tr>
<td align="left"></td>
<td align="center">that spans a band</td>
<td align="center"></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>This table provides the names, description and values of the energy parameters and functions that we used in our methods. The names and definitions are the same as in our original HFold [
<xref ref-type="bibr" rid="CR48">48</xref>
], and the values were updated based on the work of Andronescu et al. [
<xref ref-type="bibr" rid="CR36">36</xref>
]. These parameters were derived for a temperature of 37°C and 1 M salt concentration.</p>
</table-wrap-foot>
</table-wrap>
</p>
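<p>As a rough illustration of how the constants of Table 1 might be held in code, the Python sketch below stores the listed values in a dictionary and applies the two scaling factors for loops that span a band. The dictionary keys and function names are hypothetical, and the pseudoknot-free loop energies passed to the functions are assumed to come from an existing energy model that is not shown here.</p>

```python
# Constants copied from Table 1 (kcal/mol); the key names are illustrative.
DP09 = {
    "P_s": -1.38,      # exterior pseudoloop initiation penalty
    "P_sm": 10.07,     # pseudoknot inside a multiloop
    "P_sp": 15.00,     # pseudoknot inside a pseudoloop
    "P_b": 2.46,       # band penalty
    "P_up": 0.06,      # unpaired base in a pseudoloop
    "P_ps": 0.96,      # closed subregion inside a pseudoloop
    "a": 3.39,         # multiloop initiation penalty
    "b": 0.03,         # multiloop base pair penalty
    "c": 0.02,         # unpaired base in a multiloop
    "a_prime": 3.41,   # multiloop that spans a band: initiation
    "b_prime": 0.56,   # multiloop that spans a band: base pair
    "c_prime": 0.12,   # multiloop that spans a band: unpaired base
}

STACK_BAND_SCALE = 0.89     # e_stP(i, j) = 0.89 x e_S(i, j)
INTERNAL_BAND_SCALE = 0.74  # e_intP(i, r, r', j) = 0.74 x e_int(i, r, r', j)


def band_stacked_pair(e_s):
    """Energy of a stacked pair that spans a band, given its pseudoknot-free energy."""
    return STACK_BAND_SCALE * e_s


def band_internal_loop(e_int):
    """Energy of an internal loop that spans a band, given its pseudoknot-free energy."""
    return INTERNAL_BAND_SCALE * e_int
```

<p>For instance, a stacked pair with a hypothetical pseudoknot-free energy of −2.0 kcal/mol would contribute about −1.78 kcal/mol when it spans a band.</p>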
</sec>
<sec id="Sec4">
<title>Data sets</title>
<p>We use three data sets to analyze performance of our algorithms. Our first data set is the
<italic>test</italic>
data set of Andronescu et al. [
<xref ref-type="bibr" rid="CR36">36</xref>
], that contains 446 distinct RNA sequences and their reference structures, of which 348 are pseudoknot-free and 98 are pseudoknotted. This set has four structures that are not in the class of structures our methods can handle (i.e., have densities higher than 2 [
<xref ref-type="bibr" rid="CR48">48</xref>
]). Since the number of such structures is too small to be useful in an experimental analysis, we removed them from our set of pseudoknotted structures, resulting in a set of size 442.</p>
<p>There are eight cases in this data set for which the original sequence and structure were shortened to accommodate restrictions in length. We removed them from our data set, resulting in a set of size 425. From now on we use “HK-PK” to refer to the pseudoknotted structures in this set (with 88 structures) and “HK-PK-free” to refer to the pseudoknot-free structures in this set (with 337 structures). RNA sequences in HK-PK and HK-PK-free have length between 10 and 400 nucleotides.</p>
<p>Our second data set is the
<italic>pk168</italic>
data set of Sato et al. [
<xref ref-type="bibr" rid="CR42">42</xref>
]. This set contains 168 pseudoknotted structures from 16 categories of pseudoknots. The sequences in this set have at most 85
<italic>%</italic>
similarity and have length of at most 140 nucleotides. We refer to this data set as “IP-pk168”.</p>
<p>Our third data set is the
<italic>test</italic>
data set of Sperschneider et al. [
<xref ref-type="bibr" rid="CR57">57</xref>
]. This set contains 16 pseudoknotted structures with strong experimental support. RNA sequences in this set have length between 34 and 363 nucleotides. We refer to this data set as “DK-pk16”.</p>
</sec>
<sec id="Sec5">
<title>Definition of
<italic>G</italic>
<sub>
<italic>b</italic>
<italic>i</italic>
<italic>g</italic>
</sub>
and
<italic>G</italic>
<sub>
<italic>s</italic>
<italic>m</italic>
<italic>a</italic>
<italic>l</italic>
<italic>l</italic>
</sub>
</title>
<p>To test the robustness of our methods on a given RNA sequence, we need to provide partial information about the true pseudoknot-free structure as input structure for that sequence. To obtain the true pseudoknot-free structure,
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
, we remove the minimum number of pseudoknotted base pairs from the reference structure to make the reference structure pseudoknot-free. If the reference structure is pseudoknot-free, then
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
is the same as the reference structure itself. We call the removed base pairs from the reference structure
<italic>G</italic>
<sub>
<italic>small</italic>
</sub>
. Blue base pairs in Figure
<xref rid="Fig1" ref-type="fig">1</xref>
represent base pairs of the
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
structure and green base pairs represent the
<italic>G</italic>
<sub>
<italic>small</italic>
</sub>
structure.</p>
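<p>The text does not state how the minimum set of base pairs to remove is found. One standard option, assumed here purely for illustration, is a dynamic program that selects the largest pseudoknot-free subset of the reference base pairs; the remaining pairs then form <italic>G</italic><sub><italic>small</italic></sub>. The function name and the input format (1-based index tuples and the sequence length n) are hypothetical, and when several subsets of maximum size exist this sketch returns just one of them.</p>

```python
def split_reference(pairs, n):
    """Split a reference structure into (g_big, g_small).

    pairs: iterable of base pairs (i, j), 1-based; n: sequence length.
    g_big is a largest pseudoknot-free subset, g_small the removed pairs.
    """
    pairs = [(min(p), max(p)) for p in pairs]
    partner = dict(pairs)  # left end of each pair mapped to its right end

    # best[i][j] = size of the largest pseudoknot-free subset of the pairs
    # that lie entirely inside the interval [i, j]; empty intervals stay 0.
    best = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(n, 0, -1):
        for j in range(i, n + 1):
            best[i][j] = best[i + 1][j]  # drop the pair starting at i, if any
            p = partner.get(i)
            if p is not None and j >= p:
                keep = 1 + best[i + 1][p - 1] + best[p + 1][j]
                best[i][j] = max(best[i][j], keep)

    # Trace back through the table to recover one optimal subset.
    g_big, stack = set(), [(1, n)]
    while stack:
        i, j = stack.pop()
        if i > j:
            continue
        p = partner.get(i)
        if (p is not None and j >= p
                and best[i][j] == 1 + best[i + 1][p - 1] + best[p + 1][j]):
            g_big.add((i, p))
            stack.extend([(i + 1, p - 1), (p + 1, j)])
        else:
            stack.append((i + 1, j))

    return g_big, set(pairs) - g_big
```

<p>On a small H-type example with the pairs 1.10, 2.9, 3.8, 5.15 and 6.14, this sketch keeps the three nested pairs and removes 5.15 and 6.14. For a pseudoknot-free reference structure the removed set is empty.</p>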
</sec>
<sec id="Sec6">
<title>Accuracy measures</title>
<p>Following common practice [
<xref ref-type="bibr" rid="CR36">36</xref>
,
<xref ref-type="bibr" rid="CR58">58</xref>
], we measure the accuracy of a predicted RNA secondary structure relative to a reference secondary structure by
<italic>F</italic>
-measure, which is the harmonic mean of
<italic>sensitivity</italic>
and
<italic>positive predictive value</italic>
(
<italic>PPV</italic>
). We define these values as follows:
<disp-formula id="Equ1">
<alternatives>
<mml:math id="M1">
<mml:mrow>
<mml:mspace width="-12.0pt"></mml:mspace>
<mml:mtext>Sensitivity</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="normal">Number</mml:mi>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">of</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">correctly</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">predicted</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">base</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">pairs</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="normal">Number</mml:mi>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">of</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">base</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">pairs</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">in</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">the</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">reference</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">structure</mml:mtext>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<graphic xlink:href="12859_2014_Article_6443_Equa_HTML.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
<disp-formula id="Equ2">
<alternatives>
<mml:math id="M2">
<mml:mrow>
<mml:mtext>PPV</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="normal">Number</mml:mi>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">of</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">correctly</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">predicted</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">base</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">pairs</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="normal">Number</mml:mi>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">of</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">predicted</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">base</mml:mtext>
<mml:mspace width="2.0pt"></mml:mspace>
<mml:mtext mathvariant="italic">pairs</mml:mtext>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<graphic xlink:href="12859_2014_Article_6443_Equb_HTML.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
<p>and
<disp-formula id="Equ3">
<alternatives>
<mml:math id="M3">
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mstyle mathvariant="normal">
<mml:mo>-</mml:mo>
<mml:mtext mathvariant="italic">measure</mml:mtext>
</mml:mstyle>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mtext>sensitivity</mml:mtext>
<mml:mo>×</mml:mo>
<mml:mtext>PPV</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mtext>sensitivity</mml:mtext>
<mml:mo>+</mml:mo>
<mml:mtext>PPV</mml:mtext>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<graphic xlink:href="12859_2014_Article_6443_Equc_HTML.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
<p>We also define these values as 0 when their denominators are 0. When a prediction agrees with the reference structure, the value of
<italic>F</italic>
-measure is equal to 1 (as are the values of sensitivity and PPV). When the value of sensitivity or PPV is equal to 0, the predicted structure does not have any base pairs in common with the reference structure.</p>
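<p>The three measures can be illustrated with a minimal Python sketch. It assumes that a structure is represented as a collection of (i, j) base pair tuples and that sensitivity is the number of correctly predicted base pairs divided by the number of base pairs in the reference structure; the function name and the example structures are illustrative only and are not part of our software.</p>
<preformat>
```python
def accuracy(reference, predicted):
    """Return (sensitivity, PPV, F-measure) for two collections of base pairs.

    Each base pair is an (i, j) tuple of sequence positions.
    A measure whose denominator is 0 is defined to be 0.
    """
    ref = set(reference)
    pred = set(predicted)
    correct = len(ref.intersection(pred))  # correctly predicted base pairs

    sensitivity = correct / len(ref) if ref else 0.0
    ppv = correct / len(pred) if pred else 0.0
    total = sensitivity + ppv
    f_measure = 2 * sensitivity * ppv / total if total > 0 else 0.0
    return sensitivity, ppv, f_measure


# Illustrative (made-up) structures: two of three predicted pairs are correct.
reference = [(1, 20), (2, 19), (3, 18), (8, 30)]
predicted = [(1, 20), (2, 19), (4, 17)]
print(accuracy(reference, predicted))
```
</preformat>
<p>In this toy case sensitivity is 2/4, PPV is 2/3 and the F-measure is their harmonic mean, 4/7.</p>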
</sec>
<sec id="Sec7">
<title>Bootstrap percentile confidence intervals</title>
<p>To formally assess how the measured prediction accuracy of a method depends on the given set of RNAs, we use bootstrap confidence intervals, a well-known statistical resampling technique [
<xref ref-type="bibr" rid="CR59">59</xref>
,
<xref ref-type="bibr" rid="CR60">60</xref>
]. Following the recent work of Aghaeepour and Hoos [
<xref ref-type="bibr" rid="CR61">61</xref>
] and Hajiaghayi et al. [
<xref ref-type="bibr" rid="CR58">58</xref>
] we calculate the bootstrap 95% percentile confidence interval of average
<italic>F</italic>
-measure as follows. For each vector
<italic>f</italic>
of
<italic>F</italic>
-measures (where, for example,
<italic>f</italic>
may be the F-measures of predictions obtained by Iterative HFold on pseudoknotted structures) we first take 10
<sup>4</sup>
resamples with replacement, where the resamples have the same length as the original sample vector
<italic>f</italic>
(|
<italic>f</italic>
|), and then calculate their average
<italic>F</italic>
-measures. These 10
<sup>4</sup>
calculated average
<italic>F</italic>
-measures represent the bootstrap distribution for the vector
<italic>f</italic>
. We then report the 2.5th and 97.5th percentiles of this distribution (i.e., the bootstrap distribution of the 10
<sup>4</sup>
average
<italic>F</italic>
-measures calculated above) as the lower and upper bounds of the confidence interval respectively, and call it the bootstrap 95% percentile confidence interval. By reporting the bootstrap 95% percentile confidence interval for average
<italic>F</italic>
-measure of a method,
<italic>A</italic>
, on a data set,
<italic>D</italic>
, we say that we are 95% confident that the average
<italic>F</italic>
-measure of method
<italic>A</italic>
on data set
<italic>D</italic>
is in the reported interval. All calculations are performed using the “boot” package of the R statistics software environment [
<xref ref-type="bibr" rid="CR62">62</xref>
].</p>
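<p>The resampling procedure can be sketched in Python as follows. This is only an illustration of the steps described above, not the R “boot” code used for the reported results; the function name, the fixed random seed and the simple rank-based choice of percentiles (without interpolation) are assumptions of the sketch, so its endpoints may differ slightly from those computed by “boot”.</p>
<preformat>
```python
import random


def bootstrap_percentile_ci(f, resamples=10**4, level=0.95, seed=0):
    """Bootstrap percentile confidence interval for the mean of the vector f."""
    rng = random.Random(seed)
    n = len(f)
    means = []
    for _ in range(resamples):
        sample = rng.choices(f, k=n)  # resample with replacement, same length as f
        means.append(sum(sample) / n)
    means.sort()                      # the bootstrap distribution of the mean

    tail = (1.0 - level) / 2.0        # 0.025 for a 95% interval
    lower = means[int(tail * (resamples - 1))]
    upper = means[int((1.0 - tail) * (resamples - 1))]
    return lower, upper
```
</preformat>
<p>Calling the function on a vector of F-measures returns the lower and upper bounds of the interval as a pair.</p>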
</sec>
<sec id="Sec8">
<title>Permutation test</title>
<p>Following the recent work of Hajiaghayi et al. [
<xref ref-type="bibr" rid="CR58">58</xref>
], we use a two-sided permutation test to assess the statistical significance of the observed performance differences between two methods. The test proceeds as follows, given a data set and two structure prediction procedures,
<italic>A</italic>
and
<italic>B</italic>
. First, we calculate the difference
<italic>m</italic>
<italic>e</italic>
<italic>a</italic>
<italic>n</italic>
(
<italic>f</italic>
<sub>
<italic>A</italic>
</sub>
)−
<italic>m</italic>
<italic>e</italic>
<italic>a</italic>
<italic>n</italic>
(
<italic>f</italic>
<sub>
<italic>B</italic>
</sub>
) in means between sets of
<italic>F</italic>
-measure values obtained by
<italic>A</italic>
and
<italic>B</italic>
. Then we combine the two sets
<italic>f</italic>
<sub>
<italic>A</italic>
</sub>
and
<italic>f</italic>
<sub>
<italic>B</italic>
</sub>
and record the difference in sample means for 10
<sup>4</sup>
randomly chosen ways of choosing two sets with the same size as |
<italic>f</italic>
<sub>
<italic>A</italic>
</sub>
| and |
<italic>f</italic>
<sub>
<italic>B</italic>
</sub>
| from the combined set. The
<italic>p</italic>
-value is the proportion of the sampled permutations in which the absolute difference in sample means was greater than or equal to the absolute difference between the means of the original sets
<italic>f</italic>
<sub>
<italic>A</italic>
</sub>
and
<italic>f</italic>
<sub>
<italic>B</italic>
</sub>
. Then, if the
<italic>p</italic>
-value of this test is less than the 5% significance level, we reject the null hypothesis that methods
<italic>A</italic>
and
<italic>B</italic>
have equal accuracy and thus accept the alternative hypothesis that the difference in accuracy of method
<italic>A</italic>
and
<italic>B</italic>
is significant. Otherwise, we cannot reject the null hypothesis. All calculations are performed using the “perm” package of the R statistics software environment.</p>
</sec>
<sec id="Sec9">
<title>Iterative HFold</title>
<p>We provide a high-level description of our Iterative HFold algorithm.</p>
<p>Pseudocode of our Iterative HFold algorithm is available in Additional file
<xref rid="MOESM1" ref-type="media">1</xref>
. The algorithm builds on two simpler methods, the first being our original HFold algorithm [
<xref ref-type="bibr" rid="CR48">48</xref>
]:
<bold>
<italic>HFold:</italic>
</bold>
Given an RNA sequence,
<italic>S</italic>
, and a pseudoknot-free input structure,
<italic>G</italic>
, find a pseudoknot-free structure,
<italic>G</italic>
<sup>′</sup>
such that
<italic>G</italic>
∪
<italic>G</italic>
<sup>′</sup>
is the lowest energy structure that contains
<italic>G</italic>
. We note that
<italic>G</italic>
∪
<italic>G</italic>
<sup>′</sup>
might not be pseudoknotted.</p>
<p>The second method on which Iterative HFold builds, called HFold-PKonly, is similar to HFold except that
<italic>G</italic>
<sup>′</sup>
may only contain base pairs that cross base pairs in
<italic>G</italic>
. The prediction provided by HFold-PKonly can be useful in cases where HFold does not produce a pseudoknotted structure.
<bold>
<italic>HFold-PKonly:</italic>
</bold>
Given an RNA sequence,
<italic>S</italic>
, and a pseudoknot-free input structure,
<italic>G</italic>
, find a pseudoknot-free structure,
<italic>G</italic>
<sup>′</sup>
such that every base pair in
<italic>G</italic>
<sup>′</sup>
crosses some base pair of
<italic>G</italic>
and such that
<italic>G</italic>
∪
<italic>G</italic>
<sup>′</sup>
is the lowest energy structure that contains
<italic>G</italic>
among all such
<italic>G</italic>
<sup>′</sup>
s. Note that
<italic>G</italic>
<sup>′</sup>
may contain no base pairs.</p>
<p>Iterative HFold also uses the SimFold RNA secondary structure prediction method [
<xref ref-type="bibr" rid="CR63">63</xref>
], which predicts the minimum free energy pseudoknot-free secondary structure for a given RNA sequence. SimFold uses a dynamic programming method similar to Zuker’s MFold method [
<xref ref-type="bibr" rid="CR64">64</xref>
]. In this work we used the HotKnots energy parameters when running SimFold. In addition to an RNA sequence,
<italic>S</italic>
, SimFold can also take a pseudoknot-free secondary structure,
<italic>G</italic>
, as input and predict the MFE pseudoknot-free secondary structure that contains all base pairs of
<italic>G</italic>
.</p>
<p>Iterative HFold is distinguished from the above three methods, namely HFold, HFold-PKonly and SimFold, in two important ways. First, the output of the HFold, HFold-PKonly and SimFold methods must contain the given pseudoknot-free input structure,
<italic>G</italic>
, whereas Iterative HFold may modify the input structure. This can be useful when the given input structure is not a high-accuracy estimate of
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
, the true pseudoknot-free substructure of the reference structure. Second, while HFold and HFold-PKonly can add base pairs that cross those in
<italic>G</italic>
, they cannot add base pairs that cross each other, and neither can SimFold. In contrast, Iterative HFold can add base pairs that cross each other. This is particularly useful when the input structure contains limited information about
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
, and so it is necessary both to predict base pairs in
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
and in
<italic>G</italic>
<sub>
<italic>small</italic>
</sub>
in order to get a good prediction.</p>
<p>Iterative HFold consists of four different iterative methods. Following the description of each method, we motivate why we chose to include it as part of our overall algorithm. Iterative HFold takes as input both an RNA sequence,
<italic>S</italic>
and a pseudoknot-free secondary structure,
<italic>G</italic>
; later we show that structure
<italic>G</italic>
can be produced by computational methods, for example, HotKnots hotspots or SimFold suboptimal structures, when only the sequence
<italic>S</italic>
is initially available.
<bold>
<italic>Iterative HFold:</italic>
</bold>
Given an RNA sequence,
<italic>S</italic>
, and a pseudoknot-free input structure,
<italic>G</italic>
, run the following four methods and pick the structure with the lowest free energy among these four as the output structure.</p>
<p>Iterative HFold runs in
<italic>O</italic>
(
<italic>n</italic>
<sup>3</sup>
) time, because it runs four methods sequentially, each of which takes
<italic>O</italic>
(
<italic>n</italic>
<sup>3</sup>
) time.</p>
<p>
<italic>Method 1:</italic>
Run HFold on
<italic>S</italic>
and
<italic>G</italic>
, and store the resulting
<italic>G</italic>
∪
<italic>G</italic>
<sup>′</sup>
.</p>
<p>
<italic>Motivation:</italic>
This is the core HFold method, motivated by the hierarchical folding hypothesis.</p>
<p>
<italic>Method 2:</italic>
First run HFold-PKonly on
<italic>S</italic>
and
<italic>G</italic>
. If HFold-PKonly results in a structure
<italic>G</italic>
∪
<italic>G</italic>
<sup>′</sup>
such that
<italic>G</italic>
<sup>′</sup>
is not the empty structure, then run HFold with sequence
<italic>S</italic>
and structure
<italic>G</italic>
<sup>′</sup>
, and store the result. Otherwise, simply store
<italic>G</italic>
as the result. See the following example. (We note that running HFold with
<italic>S</italic>
and
<italic>G</italic>
<sup>′</sup>
results in a structure
<italic>G</italic>
<sup>′</sup>
∪
<italic>G</italic>
<sup>′′</sup>
, where it may be the case that
<italic>G</italic>
<sup>′′</sup>
⊉
<italic>G</italic>
(i.e.,
<italic>G</italic>
may not be part of the result of method 2).)</p>
<p>
<italic>Motivation:</italic>
When input structure
<italic>G</italic>
does not agree with the reference
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
structure, it may still be the case that HFold-PKonly finds the pseudoknotted structure
<italic>G</italic>
<sub>
<italic>small</italic>
</sub>
(or a good approximation to
<italic>G</italic>
<sub>
<italic>small</italic>
</sub>
). A call to HFold with input
<italic>G</italic>
<sub>
<italic>small</italic>
</sub>
may then find a better approximation to
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
.</p>
<p>
<bold>Example 1</bold>
: Example of results of
<italic>method 1</italic>
and
<italic>method 2</italic>
of Iterative HFold.
<disp-formula id="Equ4">
<graphic xlink:href="12859_2014_Article_6443_Equd_HTML.gif" position="anchor"></graphic>
</disp-formula>
</p>
<p>In this example, method 2 of Iterative HFold outperforms method 1: although both HFold and HFold-PKonly produce the same result on sequence
<italic>S</italic>
and input structure
<italic>G</italic>
, namely the structure
<inline-formula id="IEq1">
<alternatives>
<mml:math id="M4">
<mml:mi>G</mml:mi>
<mml:mo>∪</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow></mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msup>
</mml:math>
<inline-graphic xlink:href="12859_2014_Article_6443_IEq1_HTML.gif"></inline-graphic>
</alternatives>
</inline-formula>
, the additional iteration in method 2, in which HFold is run with
<italic>S</italic>
and
<inline-formula id="IEq2">
<alternatives>
<mml:math id="M5">
<mml:msup>
<mml:mrow>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow></mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msup>
</mml:math>
<inline-graphic xlink:href="12859_2014_Article_6443_IEq2_HTML.gif"></inline-graphic>
</alternatives>
</inline-formula>
, finds a structure with lower energy than that of
<inline-formula id="IEq3">
<alternatives>
<mml:math id="M6">
<mml:mi>G</mml:mi>
<mml:mo>∪</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow></mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msup>
</mml:math>
<inline-graphic xlink:href="12859_2014_Article_6443_IEq3_HTML.gif"></inline-graphic>
</alternatives>
</inline-formula>
.</p>
<p>
<italic>Method 3:</italic>
First run
<italic>SimFold</italic>
on
<italic>S</italic>
and
<italic>G</italic>
to obtain result
<italic>G</italic>
<sup>′</sup>
—a pseudoknot-free structure that contains
<italic>G</italic>
. Then let
<italic>G</italic>
<sub>
<italic>updated</italic>
</sub>
be the secondary structure of
<italic>S</italic>
containing the relaxed stems of
<italic>G</italic>
<sup>′</sup>
that include the base pairs of
<italic>G</italic>
. By a
<italic>relaxed stem</italic>
, we mean a secondary structure containing stacked base pairs, bulges of size 1 and internal loops of size at most 3 (i.e., the symmetric 1×1 loop or the asymmetric 1×2 and 2×1 loops, but no other loop types; this is motivated by common practice [
<xref ref-type="bibr" rid="CR65">65</xref>
]). Then run
<italic>method 2</italic>
on
<italic>S</italic>
and
<italic>G</italic>
<sub>
<italic>updated</italic>
</sub>
, and store the result. See Example 2.</p>
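<p>The construction of the updated input structure can be sketched in Python as follows. The sketch assumes 1-based base pairs (i, j) with i smaller than j, a pseudoknot-free structure, and the reading that a relaxed stem is kept when it includes at least one base pair of the input structure; all function names are illustrative and do not correspond to our implementation.</p>
<preformat>
```python
# Unpaired bases allowed on the two sides between consecutive pairs of a stem:
# stacking, bulges of size 1, and 1x1, 1x2 or 2x1 internal loops.
ALLOWED_LOOPS = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2), (2, 1)]


def relaxed_stems(pairs):
    """Split a pseudoknot-free set of base pairs (i, j) into relaxed stems."""
    partner = {}
    for i, j in pairs:
        partner[i], partner[j] = j, i

    def inner_pair(i, j):
        # Look for a pair directly inside (i, j), separated by an allowed loop.
        for a, b in ALLOWED_LOOPS:
            k, m = i + 1 + a, j - 1 - b
            gap = list(range(i + 1, k)) + list(range(m + 1, j))
            if m > k and partner.get(k) == m and not any(g in partner for g in gap):
                return (k, m)
        return None

    stems, seen = [], set()
    for pair in sorted(pairs):  # the outermost pair of each stem comes first
        if pair in seen:
            continue
        stem = []
        while pair is not None:
            stem.append(pair)
            seen.add(pair)
            pair = inner_pair(*pair)
        stems.append(stem)
    return stems


def g_updated(g, g_prime):
    """Keep the relaxed stems of g_prime that include a base pair of g."""
    g = set(g)
    kept = [stem for stem in relaxed_stems(g_prime) if g.intersection(stem)]
    return sorted(pair for stem in kept for pair in stem)
```
</preformat>
<p>For instance, if the input structure is the single pair (3, 28) and the SimFold result also contains (1, 30), (2, 29), (5, 26), (10, 20) and (11, 19), the sketch keeps the first four pairs, which form one relaxed stem, and drops the separate stem formed by (10, 20) and (11, 19).</p>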
<p>
<italic>Motivation:</italic>
This method can work well when the given input structure has a small number of base pairs from
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
, because
<italic>G</italic>
<sub>
<italic>updated</italic>
</sub>
contains stems that include these base pairs, but avoids “overcrowding” with further base pairs that might prevent HFold-PKonly from finding pseudoknotted stems.</p>
<p>
<bold>Example 2:</bold>
Example of result of
<italic>method 3</italic>
compared to all four methods of Iterative HFold.
<disp-formula id="Equ5">
<graphic xlink:href="12859_2014_Article_6443_Eque_HTML.gif" position="anchor"></graphic>
</disp-formula>
</p>
<p>In this example, method 3 of Iterative HFold outperforms the other methods. Because the input structure
<italic>G</italic>
consists of just one base pair, method 1 (HFold) outputs a pseudoknot-free structure containing
<italic>G</italic>
. The outputs of methods 2 and 4 are both pseudoknotted but do not contain the base pair of the input structure
<italic>G</italic>
. In contrast, method 3 first adds base pairs to
<italic>G</italic>
, resulting in the pseudoknot-free structure
<italic>G</italic>
<sub>
<italic>updated</italic>
</sub>
, and then adds additional pseudoknotted base pairs via method 2.</p>
<p>
<italic>Method 4:</italic>
Let
<italic>S</italic>
<sub>1</sub>
be the subsequence of
<italic>S</italic>
obtained by removing bases that are external unpaired bases with respect to input structure
<italic>G</italic>
. Run
<italic>SimFold</italic>
on
<italic>S</italic>
<sub>1</sub>
and
<italic>G</italic>
(with base indices renumbered to agree with
<italic>S</italic>
<sub>1</sub>
), to obtain pseudoknot-free structure
<italic>G</italic>
<sup>′</sup>
. Then continue exactly as in method 3. See Example 3.</p>
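<p>The first step of method 4 can be sketched in Python as follows. The sketch assumes that an external unpaired base is an unpaired base that is not enclosed by any base pair of the input structure, and that positions are 1-based; the function name and the example sequence are illustrative only.</p>
<preformat>
```python
def remove_external_unpaired(sequence, g):
    """Drop bases that are unpaired and not enclosed by any base pair of g.

    Positions are 1-based. Returns the shortened sequence S1 together with
    the base pairs of g renumbered to agree with S1.
    """
    n = len(sequence)
    enclosed = [False] * (n + 1)
    for i, j in g:
        for pos in range(i, j + 1):
            enclosed[pos] = True  # paired, or inside the region closed by (i, j)

    kept = [pos for pos in range(1, n + 1) if enclosed[pos]]
    new_index = {old: new for new, old in enumerate(kept, start=1)}
    s1 = "".join(sequence[pos - 1] for pos in kept)
    g1 = sorted((new_index[i], new_index[j]) for i, j in g)
    return s1, g1


# Made-up example: two hairpins separated and flanked by unpaired bases.
s = "AAGGGAAACCCAAGGAAACCAA"
g = [(3, 11), (4, 10), (5, 9), (14, 20), (15, 19)]
print(remove_external_unpaired(s, g))
```
</preformat>
<p>In this example the six bases at positions 1, 2, 12, 13, 21 and 22 are removed, and the remaining base pairs are renumbered accordingly.</p>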
<p>
<italic>Motivation:</italic>
This method is very similar to method 3, but further constrains
<italic>G</italic>
<sup>′</sup>
since the base pairs in
<italic>G</italic>
<sup>′</sup>
cannot involve bases that are removed from
<italic>S</italic>
to obtain
<italic>S</italic>
<sub>1</sub>
. This potentially increases the possibilities for pseudoknotted base pairs to be added by method 2.</p>
<p>
<bold>Example 3:</bold>
Example of result of
<italic>method 4</italic>
compared to all four methods of Iterative HFold.
<disp-formula id="Equ6">
<graphic xlink:href="12859_2014_Article_6443_Equf_HTML.gif" position="anchor"></graphic>
</disp-formula>
</p>
<p>In this example, method 4 of Iterative HFold outperforms the other methods. The input structure
<italic>G</italic>
has a high energy value and neither method 1 (HFold) nor method 2 (HFold-PKonly) can expand the pseudoknot-free structure to add the pseudoknotted stem. Also, by adding too many pseudoknot-free base pairs, method 3 fails to find the pseudoknotted base pairs. Thus, method 4 performs better than methods 1, 2 and 3.</p>
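<p>The way the methods are combined can be summarised by the following Python sketch. It is an outline under stated assumptions, not our implementation: hfold, hfold_pkonly and energy are hypothetical callables standing in for the actual programs, the first two are assumed to return only the base pairs they add to their input structure, and structures are represented as sets of (i, j) base pairs.</p>
<preformat>
```python
def method1(s, g, hfold):
    """Method 1: the input structure together with the pairs added by HFold."""
    return set(g) | set(hfold(s, g))


def method2(s, g, hfold, hfold_pkonly):
    """Method 2: HFold-PKonly, followed by HFold on the pairs it added."""
    g_prime = set(hfold_pkonly(s, g))  # pairs crossing pairs of g; may be empty
    if not g_prime:
        return set(g)                  # nothing was added: store g itself
    # Note that g itself is not forced to be part of this result.
    return g_prime | set(hfold(s, g_prime))


def iterative_hfold(s, g, candidates, energy):
    """Run every candidate method on (s, g); return the lowest-energy result.

    candidates: callables taking (s, g) and returning a structure
    energy:     callable taking (s, structure) and returning its free energy
    """
    results = [method(s, g) for method in candidates]
    return min(results, key=lambda structure: energy(s, structure))
```
</preformat>
<p>Methods 3 and 4 would be supplied as two further entries of candidates, each wrapping a SimFold call and the construction of the updated input structure before calling method2.</p>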
</sec>
<sec id="Sec10">
<title>Experimental settings</title>
<p>In this section we explain details of our computational experiments.</p>
<sec id="Sec11">
<title>Robustness test</title>
<p>One of our goals is to understand the degree to which our methods are
<italic>robust with respect to partial information</italic>
, that is, provide a reliable prediction even when limited information about the true pseudoknot-free structure,
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
, is available. For this purpose we generate subset structures of the corresponding
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
, for each RNA sequence in the HK-PK and HK-PK-free data sets. For each
<italic>α</italic>
, 0.05≤
<italic>α</italic>
≤0.95 with 0.05 steps, we choose each base pair of
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
structure with probability
<italic>α</italic>
. We also generate 1
<italic>%</italic>
information and 99
<italic>%</italic>
information about the
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
structure (i.e.,
<italic>α</italic>
=0.01 and
<italic>α</italic>
=0.99). We repeat this step 100 times to generate 100 substructures of
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
for each value of
<italic>α</italic>
for each RNA sequence in our data sets. We then run our methods on all 100 substructures for each RNA sequence in our data sets and
<italic>α</italic>
value and calculate the bootstrap 95
<italic>%</italic>
percentile confidence interval for average F-measure of these 100 cases as the accuracy interval for each method and each RNA sequence and
<italic>α</italic>
value in our data set.</p>
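<p>The generation of partial input structures can be sketched in Python as follows. The function names and the fixed random seed are assumptions of the sketch; a structure is represented as a list of (i, j) base pairs.</p>
<preformat>
```python
import random


def sample_substructure(g_big, alpha, rng):
    """Keep each base pair of g_big independently with probability alpha."""
    return [pair for pair in g_big if alpha > rng.random()]


def robustness_inputs(g_big, repeats=100, seed=0):
    """Return a dict mapping each alpha to `repeats` sampled substructures."""
    # 0.05, 0.10, ..., 0.95 plus the two extreme settings 0.01 and 0.99.
    alphas = [0.01] + [round(0.05 * k, 2) for k in range(1, 20)] + [0.99]
    rng = random.Random(seed)
    return {alpha: [sample_substructure(g_big, alpha, rng) for _ in range(repeats)]
            for alpha in alphas}
```
</preformat>
<p>Each sampled substructure would then be given to a prediction method together with the RNA sequence, and the resulting 100 F-measures per sequence and per value of alpha summarised by a bootstrap confidence interval.</p>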
<p>We also compare our methods when the true pseudoknot-free structure,
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
is provided.</p>
</sec>
<sec id="Sec12">
<title>Accuracy comparison tests</title>
<p>We compare the accuracy of HFold, HFold-PKonly and Iterative HFold with each other on different input structures, and with other methods, namely SimFold [
<xref ref-type="bibr" rid="CR63">63</xref>
], HotKnots V2.0 [
<xref ref-type="bibr" rid="CR36">36</xref>
,
<xref ref-type="bibr" rid="CR41">41</xref>
] and IPknot [
<xref ref-type="bibr" rid="CR42">42</xref>
]. We first describe the latter two methods and the settings we choose for our experiments. We then describe the ways in which we choose input structures for HFold and its variants.</p>
<sec id="Sec13">
<title>HotKnots</title>
<p>HotKnots is a heuristic program that, given an RNA sequence, first finds about 20 lowest energy stems (from the set of all stems for the given RNA sequence), called
<italic>hotspots</italic>
. Then keeping all these stems, it adds other non-overlapping low energy stems to the stems found in the first step, so as to minimize the energy of the overall structure, eventually producing up to 20 output structures. In our experiments, we choose the structure with the lowest energy value among the 20 output structures as the final structure predicted by HotKnots. When reporting prediction accuracy for HotKnots, we report the bootstrap 95
<italic>%</italic>
percentile confidence interval for the average F-measure of the lowest energy structure for all RNA sequences in our data set.</p>
</sec>
<sec id="Sec14">
<title>IPknot</title>
<p>IPknot is a secondary structure prediction method based on Maximum Expected Accuracy (MEA) of the base pairs. In addition to the RNA sequence, IPknot takes several parameters as input. Below, we briefly describe each of these parameters and its settings.</p>
<p>
<list list-type="bullet">
<list-item>
<p>level: If structure
<italic>G</italic>
can be decomposed into
<italic>k</italic>
disjoint pseudoknot-free structures,
<italic>G</italic>
<sub>1</sub>
,
<italic>G</italic>
<sub>2</sub>
,…,
<italic>G</italic>
<sub>
<italic>k</italic>
</sub>
, such that every base pair in
<italic>G</italic>
<sub>
<italic>i</italic>
</sub>
crosses the base pairs of
<italic>G</italic>
<sub>
<italic>j</italic>
</sub>
, 1≤
<italic>i</italic>
&lt;
<italic>j</italic>
≤
<italic>k</italic>
, Sato et al. say that structure
<italic>G</italic>
has
<italic>k</italic>
levels. For example, a pseudoknot-free structure has level 1, and an H-type pseudoknot has level 2. In another example, when representing the secondary structure in dot bracket format, the number of different brackets used to represent the structure is the level of the structure. IPknot can handle structures up to level 3.</p>
</list-item>
<list-item>
<p>scoring model: The energy model used to produce posterior probabilities for each base pair is called “scoring model”. IPknot has 3 different scoring models, namely “CONTRAfold”, “McCaskill” and “NUPACK”.</p>
</list-item>
<list-item>
<p>refining parameters: The procedure of recalculating the base pair probabilities based on the original prediction results is referred to as “refining parameters”.</p>
</list-item>
<list-item>
<p>base pair weights for each level: Positive numbers representing the rate of true base pairs in each level.</p>
</list-item>
</list>
</p>
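<p>The dot-bracket view of the level can be illustrated with a short Python sketch. It assumes that only round, square and curly brackets are used and that the structure is written with as few bracket types as possible; the function name is illustrative and is not part of IPknot.</p>
<preformat>
```python
def structure_level(dot_bracket):
    """Count the distinct bracket types used in a dot-bracket string."""
    closers = {")": "(", "]": "[", "}": "{"}
    openers = set(closers.values())
    # Map every bracket to its opening symbol and count the distinct symbols.
    return len({closers.get(ch, ch) for ch in dot_bracket
                if ch in openers or ch in closers})


print(structure_level("((((....))))"))          # pseudoknot-free structure
print(structure_level("((((..[[[..))))..]]]"))  # H-type pseudoknot
```
</preformat>
<p>The two calls print 1 and 2, matching the levels given above for a pseudoknot-free structure and an H-type pseudoknot.</p>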
<p>We run IPknot using the provided source code and the default parameters for scoring model and level (i.e., scoring model = McCaskill and level =2). The default values provided for base pair weights are not the same on the IPknot website (i.e.,
<italic>γ</italic>
<sub>1</sub>
=2 and
<italic>γ</italic>
<sub>2</sub>
=16), its source code (i.e., for some cases
<italic>γ</italic>
<sub>1</sub>
=2 and
<italic>γ</italic>
<sub>2</sub>
=4 and for others
<italic>γ</italic>
<sub>1</sub>
=1 and
<italic>γ</italic>
<sub>2</sub>
=1) and the provided Perl script (i.e.,
<italic>γ</italic>
<sub>1</sub>
=4 and
<italic>γ</italic>
<sub>2</sub>
=8). We run IPknot with all of these values with and without refinement and provide IPknot’s bootstrap 95
<italic>%</italic>
confidence intervals for average F-measures for all of our data sets as a table in the Additional file
<xref rid="MOESM2" ref-type="media">2</xref>
. Based on its performance we present IPknot’s results with default settings (i.e., no refinement, scoring model = McCaskill and level =2) and
<italic>γ</italic>
<sub>1</sub>
=4 and
<italic>γ</italic>
<sub>2</sub>
=8, for comparison with other methods.</p>
</sec>
<sec id="Sec15">
<title>Different versions of HFold</title>
<p>We compare the average accuracy of HFold, HFold-PKonly and Iterative HFold with different input structures.</p>
<p>To determine which input structures are good to use when
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
is not known, we compare two different options. Since HFold (HFold-PKonly and Iterative HFold) cannot accept pseudoknotted input structures we use the following methods to produce pseudoknot-free input structures to HFold (HFold-PKonly and Iterative HFold). First, we use HotKnots hotspots [
<xref ref-type="bibr" rid="CR36">36</xref>
], i.e., the 20 lowest energy pseudoknot-free stems produced in the first phase of HotKnots. We choose the lowest free energy structure predicted by each of our methods as their final prediction given these hotspots. Second, we use SimFold’s MFE structure [
<xref ref-type="bibr" rid="CR63">63</xref>
], where the energy parameters of SimFold are changed to match those of HotKnots V2.0.</p>
</sec>
</sec>
<sec id="Sec16">
<title>Running time</title>
<p>We ran all methods on the same platform (MacBook Pro, OS X 10.5.8, with a 2.53 GHz Intel Core 2 Duo processor and 4 GB of 1067 MHz DDR3 RAM). We use the
<italic>time</italic>
command to measure the running time of our methods on each sequence, and record the wall clock time.</p>
</sec>
<sec id="Sec17">
<title>Memory usage</title>
<p>To find the memory usage of the programs, we use the Valgrind package [
<xref ref-type="bibr" rid="CR66">66</xref>
] and record the total heap usage as the memory usage of each program. IPknot and HotKnots are written entirely in C, so we can find their memory usage directly by running Valgrind. The Iterative HFold program, however, is a Perl script that runs a few C programs (HFold, HFold-PKonly and SimFold) sequentially. We therefore find the memory usage of each C component using Valgrind and take the maximum as the memory usage of Iterative HFold.</p>
</sec>
</sec>
</sec>
<sec id="Sec18">
<title>Results</title>
<p>As mentioned in Section ‘Background’, in the literature on hierarchical folding, there are reports of counter examples to the hierarchical folding hypothesis where bases that are initially part of the pseudoknot-free structure for a molecule later change as the pseudoknot forms. This motivates a comparison of HFold versus Iterative HFold, in order to see how a method that sticks strictly with the hypothesis (i.e., HFold) compares with a method that allows for some base changes (i.e., Iterative HFold). In Section ‘Robustness comparison’, we compare the robustness of HFold and Iterative HFold with respect to partial information; that is, the degree to which they provide accurate predictions as a function of how much information about
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
, the true pseudoknot-free secondary structure, is provided as input. Then in Section ‘Accuracy comparison of different versions of HFold’ we compare HFold, HFold-PKonly and Iterative HFold when a (possibly inaccurate) computational prediction of
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
is provided as input. In Section ‘Accuracy comparison with existing methods’ we compare Iterative HFold—the method that performs best overall in Sections ‘Robustness comparison’ and ‘Accuracy comparison of different versions of HFold’ —with existing methods for pseudoknotted secondary structure prediction. Sections ‘Running time comparison’ and ‘Memory consumption comparison’ report on the running time and memory usage of our methods.</p>
<sec id="Sec19">
<title>Robustness comparison</title>
<p>One of our goals is to learn how accurate each of our methods is when only partial information about
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
is available (see Section ‘Robustness test’ for experimental settings). Figure
<xref rid="Fig2" ref-type="fig">2</xref>
shows the results of this robustness evaluation, for pseudoknotted structures (Figure
<xref rid="Fig2" ref-type="fig">2</xref>
A), pseudoknot-free structures (Figure
<xref rid="Fig2" ref-type="fig">2</xref>
B) and the overall results (Figure
<xref rid="Fig2" ref-type="fig">2</xref>
C). Since HFold-PKonly cannot add pseudoknot-free base pairs to the given input structure, we do not compare its performance here with HFold and Iterative HFold. However we provide detailed performance of all versions of HFold including HFold-PKonly in Additional file
<xref rid="MOESM3" ref-type="media">3</xref>
.
<fig id="Fig2">
<label>Figure 2</label>
<caption>
<p>
<bold>Comparison of robustness of HFold and Iterative HFold.</bold>
Robustness results for pseudoknotted structures of the HK-PK data set
<bold>(</bold>
2
<bold>A)</bold>
, pseudoknot-free structures of the HK-PK-free data set
<bold>(</bold>
2
<bold>B)</bold>
and all structures
<bold>(</bold>
2
<bold>C)</bold>
. The X axes show the available information about
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
structure in percentage format, and the Y axes show bootstrap 95% percentile confidence intervals for average F-measure. Dashed lines show the borders of the bootstrap 95% percentile for average F-measure and solid lines show the average F-measure itself.</p>
</caption>
<graphic xlink:href="12859_2014_Article_6443_Fig2_HTML" id="d29e2559"></graphic>
</fig>
</p>
<p>As shown in Figure
<xref rid="Fig2" ref-type="fig">2</xref>
A, which pertains to pseudoknotted structures of the HK-PK data set, when provided with ≈1
<italic>%</italic>
of the
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
structure as input, Iterative HFold’s bootstrap 95
<italic>%</italic>
percentile confidence interval of average F-measure is higher than that of HFold. Iterative HFold continues to be significantly superior to HFold until approximately 90
<italic>%</italic>
of
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
is available, after which HFold is more accurate. Iterative HFold is most successful when little information about
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
is known because it can add both pseudoknot-free and pseudoknotted base pairs. In particular, using methods 3 and 4 (see Section ‘Iterative HFold’) Iterative HFold first finds a low energy pseudoknot-free structure that includes the given input structure (by extending the stems of the given structure), and then adds pseudoknotted base pairs to further lower the energy of the overall structure. However, when the vast majority of base pairs of
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
are provided as input, HFold dominates as it keeps the base pairs of the input structure, thereby often adding base pairs of
<italic>G</italic>
<sub>
<italic>small</italic>
</sub>
. When 100
<italic>%</italic>
of
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
is provided as input, HFold’s bootstrap 95
<italic>%</italic>
percentile confidence interval is (85.74
<italic>%</italic>
,91.87
<italic>%</italic>
), compared with (79.36
<italic>%</italic>
,87.41
<italic>%</italic>
) for Iterative HFold. As shown in Figure
<xref rid="Fig2" ref-type="fig">2</xref>
A, Iterative HFold’s average accuracy on pseudoknotted structures steadily increases from about 54% to 79% as the user provides 1% to 40% of the input structure. This improvement in accuracy slows down but still persists when further structural information is provided. If we compare the slope of the curve for Iterative HFold’s average accuracy to that of HFold in Figure
<xref rid="Fig2" ref-type="fig">2</xref>
A, we can see that HFold’s slope is steeper than that of Iterative HFold, making Iterative HFold more robust than HFold.</p>
<p>For pseudoknot-free structures of the HK-PK-free data set, as shown in Figure
<xref rid="Fig2" ref-type="fig">2</xref>
B, HFold performs better than Iterative HFold. Even with 1
<italic>%</italic>
information about
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
, HFold attains a bootstrap 95% percentile confidence interval of (79.76
<italic>%</italic>
,84.32
<italic>%</italic>
), compared with (79.14
<italic>%</italic>
,83.84
<italic>%</italic>
) for Iterative HFold with the same inputs. Roughly speaking, HFold succeeds on pseudoknot-free structures because it often adds base pairs that do not cross those provided as part of the input, and that are thus likely to be in
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
.</p>
<p>When 100% of
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
is provided as input, the overall bootstrap 95
<italic>%</italic>
confidence interval for HFold is (96.11
<italic>%</italic>
,97.24
<italic>%</italic>
) compared with (93.85
<italic>%</italic>
,96.07
<italic>%</italic>
) for Iterative HFold.</p>
</sec>
<sec id="Sec20">
<title>Accuracy comparison of different versions of HFold</title>
<p>Often, partial information about
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
is not available; this is the case for many RNAs of unknown function reported by the ENCODE consortium [
<xref ref-type="bibr" rid="CR24">24</xref>
]. Therefore, we next compare the quality of results obtained by HFold, HFold-PKonly and Iterative HFold when given a pseudoknot-free input,
<italic>G</italic>
that is predicted by existing computational methods. One way to produce an input structure is to use an MFE pseudoknot-free structure prediction method, such as MFold. We chose SimFold as it is an implementation of MFold and, because of its energy parameters, gives more accurate predictions than MFold. Of course, when comparative information is available, the user can supply it to Iterative HFold as a structural constraint, in the form of a pseudoknot-free structure, and expect a better prediction result. Here we compare two methods for predicting
<italic>G</italic>
, namely SimFold and the
<italic>hotspots</italic>
produced by HotKnots V2.0. Table
<xref rid="Tab2" ref-type="table">2</xref>
reports the bootstrap 95
<italic>%</italic>
percentile confidence intervals of average F-measures. The accuracy of HFold-PKonly is significantly worse than that of HFold and Iterative HFold, both with the output of SimFold, and with the HotKnots hotspots as input, so we do not discuss HFold-PKonly further.
<table-wrap id="Tab2">
<label>Table 2</label>
<caption>
<p>
<bold>Comparison of bootstrap 95% percentile confidence interval of average F-measure of different versions of HFold when given SimFold structure as input vs. when given HotKnots hotspots structures as input</bold>
</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center">Input</th>
<th align="center" colspan="3">Hotspots</th>
<th align="center" colspan="3">SimFold (MFE)</th>
</tr>
<tr>
<th align="left"></th>
<th align="center">PKonly</th>
<th align="center">HFold</th>
<th align="center">Iter. HFold</th>
<th align="center">PKonly</th>
<th align="center">HFold</th>
<th align="center">Iter. HFold</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">HK-PK</td>
<td align="center">(55.54, 71.06)</td>
<td align="center">(73.35, 83.53)</td>
<td align="center">(72.83, 83.37)</td>
<td align="center">(50.57, 63.53)</td>
<td align="center">(50.69, 63.54)</td>
<td align="center">(51.42, 64.39)</td>
</tr>
<tr>
<td align="center">HK-PK-free</td>
<td align="center">(31.37, 38.52)</td>
<td align="center">(75.53, 80.79)</td>
<td align="center">(74.93, 80.26)</td>
<td align="center">(78.42, 83.21)</td>
<td align="center">(78.33, 83.27)</td>
<td align="center">(78.31, 83.17)</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
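<p>The intervals in Table 2 are bootstrap percentile confidence intervals of an average F-measure. The following is a minimal sketch of that general technique, not the authors' implementation: the function name, the number of resamples and the fixed seed are assumptions made for illustration, and the exact procedure used in this work is the one described in Section ‘Accuracy comparison tests’.</p>
<preformat>
```python
import random


def bootstrap_percentile_ci(values, n_resamples=10000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of values.

    values holds one F-measure per RNA sequence. n_resamples and seed are
    illustrative defaults, not settings taken from the paper.
    """
    if not values:
        raise ValueError("values must be non-empty")
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Resample n values with replacement and record the resample mean.
        resample = rng.choices(values, k=n)
        means.append(sum(resample) / n)
    means.sort()
    # Cut off (1 - level) / 2 of the resampled means in each tail.
    tail = (1.0 - level) / 2.0
    lo_index = int(tail * n_resamples)
    hi_index = int((1.0 - tail) * n_resamples) - 1
    return means[lo_index], means[hi_index]
```
</preformat>
<p>Called on the per-sequence F-measures of one method on one data set, the function returns a (lower, upper) pair comparable in form to one cell of Table 2.</p>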
<p>For pseudoknotted structures, using HotKnots hotspots as input is far superior to using SimFold as input, for both HFold and Iterative HFold. This appears to be because MFE structures predicted by SimFold tend to have more base pairs than the true pseudoknot-free structure,
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
, so that HFold and Iterative HFold are unlikely to add pseudoknotted base pairs to the input structure. For pseudoknot-free structures, using SimFold as input is somewhat better than using HotKnots hotspots, but the permutation test indicates that the difference is not significant.</p>
<p>On pseudoknotted structures, the confidence intervals for HFold and Iterative HFold with HotKnots hotspots as input are (73.35%, 83.53%) and (72.83%, 83.37%), respectively, and on pseudoknot-free structures they are (75.53%, 80.79%) and (74.93%, 80.26%), respectively. Again, based on the permutation test, the differences between the results of HFold and Iterative HFold on pseudoknotted and on pseudoknot-free structures are not significant. Similarly, the permutation test shows that the difference in prediction accuracy between HFold and Iterative HFold on SimFold input is not significant.</p>
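<p>Significance in this section is assessed with a permutation test. As a hedged illustration of the idea, the sketch below implements a two-sided paired permutation (sign-flip) test on the difference in mean F-measure between two methods scored on the same sequences. The pairing, the number of permutations and the function name are assumptions; the test actually used in this work is specified in Section ‘Accuracy comparison tests’ and may differ in detail.</p>
<preformat>
```python
import random


def paired_permutation_test(scores_a, scores_b, n_permutations=10000, seed=0):
    """Estimated two-sided p-value for the difference in mean score.

    scores_a[i] and scores_b[i] are the F-measures of two methods on the
    same RNA sequence. This is an illustrative variant, not necessarily
    the exact test used in the paper.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both methods must be scored on the same sequences")
    if not scores_a:
        raise ValueError("at least one sequence is required")
    rng = random.Random(seed)
    n = len(scores_a)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs)) / n
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the two method labels are exchangeable
        # for each sequence, so every per-sequence difference may flip sign.
        permuted = sum(d if rng.random() >= 0.5 else -d for d in diffs)
        if abs(permuted) / n >= observed:
            extreme += 1
    # The add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)
```
</preformat>
<p>A difference would be called significant when the returned value falls below a chosen threshold such as 0.05; that threshold is likewise an assumption here.</p>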
</sec>
<sec id="Sec21">
<title>Accuracy comparison with existing methods</title>
<p>For comparisons with other methods already in the literature, we choose to use our Iterative HFold method with HotKnots hotspots as input structure, based on its overall good accuracies in Section ‘Accuracy comparison of different versions of HFold’. We compare this method with two of the best-performing methods [
<xref ref-type="bibr" rid="CR44">44</xref>
] for prediction of pseudoknotted structures, namely HotKnots V2.0 [
<xref ref-type="bibr" rid="CR36">36</xref>
], an MFE-based heuristic method, and IPknot [
<xref ref-type="bibr" rid="CR42">42</xref>
], a method that is based on maximum expected accuracy. (CompaRNA, prepared by Puton et al. [
<xref ref-type="bibr" rid="CR44">44</xref>
], is a website for continuous comparison of RNA secondary structure prediction methods on both the PDB and RNA STRAND data sets. We chose IPknot because it was the best-performing non-comparative pseudoknot prediction method that can handle long RNA sequences, based on the ranking on that website as of March 25, 2014. We also noticed that Puton et al. used HotKnots V1 for their comparison, and not the more recent and better-performing HotKnots V2.0. Therefore we chose to include HotKnots in our comparisons as well. Since the focus of this paper is on prediction of pseudoknotted structures, we do not compare our results with those of Co-Fold [
<xref ref-type="bibr" rid="CR26">26</xref>
] or other methods for prediction of pseudoknot-free structures.)</p>
<p>Table
<xref rid="Tab3" ref-type="table">3</xref>
presents the bootstrap 95% percentile confidence interval of average F-measure for Iterative HFold with hotspots as input, HotKnots V2.0, SimFold and IPknot with default setting (see Section ‘Accuracy comparison tests’) on the HK-PK and HK-PK-free data sets. For pseudoknotted structures, our permutation tests show that the difference in accuracy of Iterative HFold and HotKnots is not significant. However, the superior accuracy of Iterative HFold compared with SimFold and IPknot is significant. For pseudoknot-free structures, the difference in accuracy between IPknot, Iterative HFold, HotKnots and SimFold is not significant.
<table-wrap id="Tab3">
<label>Table 3</label>
<caption>
<p>
<bold>Comparison of bootstrap 95% percentile confidence interval of average F-measure with existing methods</bold>
</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center">Input</th>
<th align="center">Iter. HFold</th>
<th align="center">HotKnots</th>
<th align="center">SimFold</th>
<th align="center">IPknot</th>
</tr>
<tr>
<th align="left"></th>
<th align="center">(hotspots)</th>
<th align="left"></th>
<th align="left"></th>
<th align="center">(default)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">HK-PK</td>
<td align="center">(72.83, 83.37)</td>
<td align="center">(73.60, 83.35)</td>
<td align="center">(45.34, 57.73)</td>
<td align="center">(54.56, 66.25)</td>
</tr>
<tr>
<td align="center">HK-PK-free</td>
<td align="center">(74.93, 80.26)</td>
<td align="center">(76.74, 81.95)</td>
<td align="center">(78.78, 83.55)</td>
<td align="center">(77.31, 81.79)</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
<p>Table
<xref rid="Tab4" ref-type="table">4</xref>
presents the bootstrap 95
<italic>%</italic>
percentile confidence interval for average F-measure for Iterative HFold (with hotspots as input), HotKnots and IPknot (with default setting) on the IP-pk168 and DK-pk16 data sets. Our permutation tests show that the difference in accuracy of Iterative HFold, HotKnots and IPknot on the DK-pk16 data set is not significant. However, the superior accuracy of Iterative HFold compared with HotKnots and IPknot on the IP-pk168 data set is significant.
<table-wrap id="Tab4">
<label>Table 4</label>
<caption>
<p>
<bold>Comparison of bootstrap 95% percentile confidence interval of average F-measure with existing methods on the DK-pk16 and the IP-pk168 data sets</bold>
</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center">Input</th>
<th align="center">Iter. HFold</th>
<th align="center">HotKnots</th>
<th align="center">IPknot</th>
</tr>
<tr>
<th align="left"></th>
<th align="center">(hotspots)</th>
<th align="left"></th>
<th align="center">(default)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">DK-pk16</td>
<td align="center">(68.05, 81.85)</td>
<td align="center">(69.11, 83.81)</td>
<td align="center">(65.42, 75.81)</td>
</tr>
<tr>
<td align="center">IP-pk168</td>
<td align="center">(72.65, 79.86)</td>
<td align="center">(65.51, 72.96)</td>
<td align="center">(58.20, 66.09)</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
</sec>
<sec id="Sec22">
<title>Running time comparison</title>
<p>Since our interest is in the prediction of pseudoknotted structures, we report running times only on the pseudoknotted structures of our HK-PK data set. Figure
<xref rid="Fig3" ref-type="fig">3</xref>
presents the running times of Iterative HFold and HotKnots in a log-log plot (logarithms are base 10). The X axis shows log(
<italic>time</italic>
) for HotKnots and the Y axis shows log(
<italic>time</italic>
) for Iterative HFold on the HK-PK data set. RNA sequences in this data set are between 26 and 400 bases long. HFold runs significantly faster than HotKnots and finishes in under 1.5 seconds even for the longest RNA sequence in our data set (400 bases). HotKnots is faster than Iterative HFold on sequences of up to 47 bases; on longer sequences, Iterative HFold is faster. Iterative HFold runs in less than 8.3 seconds for all RNA sequences in this data set, whereas HotKnots runs for over 6000 seconds (about 1.7 hours) on the longest RNA sequence. The running times of both HFold and Iterative HFold grow with sequence length, whereas HotKnots’ running time is not directly correlated with RNA length. For example, HotKnots runs for 1665.94 seconds on one RNA sequence of length 195 (ASE-00360), but for only 203.12 seconds on another RNA sequence of the same length (ASE-00131).
<fig id="Fig3">
<label>Figure 3</label>
<caption>
<p>
<bold>Time comparison.</bold>
Comparison of running times of Iterative HFold and HotKnots in a log plot. The X axis shows log<sub>10</sub>(
<italic>time</italic>
) for HotKnots data points and the Y axis shows log<sub>10</sub>(
<italic>time</italic>
) for Iterative HFold. Time is measured in seconds.</p>
</caption>
<graphic xlink:href="12859_2014_Article_6443_Fig3_HTML" id="d29e3076"></graphic>
</fig>
</p>
<p>IPknot is significantly faster than both HFold and Iterative HFold. For all sequences in this data set, IPknot produces output in less than 0.8 seconds. For detailed information about performance of each method see Additional file
<xref rid="MOESM4" ref-type="media">4</xref>
.</p>
</sec>
<sec id="Sec23">
<title>Memory consumption comparison</title>
<p>Here we present the memory consumption of HFold, Iterative HFold and HotKnots on our HK-PK pseudoknotted structures. Since HotKnots predicts and keeps about 20 structures in memory, its memory consumption can vary significantly from one sequence to another and is not predictable. Up to 47 bases, HotKnots sometimes uses less memory than HFold or Iterative HFold, but for RNA sequences with 47 bases or more, HotKnots uses much more memory than HFold and Iterative HFold. Iterative HFold’s memory usage is very similar to HFold’s and grows very slowly with the length of the RNA sequence: it starts at 48.69 MB for RNA sequences of length 26 and increases to 61.33 MB for the longest RNA sequence in this data set (400 bases long). HotKnots, however, uses as little as 16.53 MB for an RNA of length 30 bases (LP-PK1) and as much as 93419 MB for the longest RNA sequence in this data set.</p>
<p>IPknot uses much less memory than all other methods. For the longest RNA sequence in this data set, IPknot uses less than 5.5 MB of memory, compared with 61.33 MB for HFold and Iterative HFold and 93419 MB for HotKnots. For detailed information about the memory usage of each method see Additional file
<xref rid="MOESM4" ref-type="media">4</xref>
.</p>
</sec>
</sec>
<sec id="Sec24">
<title>Discussion</title>
<p>In Section ‘Comparison with Hotknots and IPknot’ we provide more insight into the differences between, and the merits of, Iterative HFold, HotKnots and IPknot. Then in Section ‘Comparison with ShapeKnots’ we compare the accuracy of Iterative HFold with that of ShapeKnots, a method that incorporates SHAPE reactivity data to predict RNA pseudoknotted secondary structure. In Section ‘Iterative HFold with SimFold’s suboptimal structures’ we compare the accuracy of Iterative HFold with two kinds of input: HotKnots hotspots and SimFold’s suboptimal structures. Section ‘Energy model’ provides more insight into the energy model used in this work.</p>
<sec id="Sec25">
<title>Comparison with Hotknots and IPknot</title>
<p>Comparing the accuracy of Iterative HFold and HotKnots V2.0 on HK-PK, HK-PK-free, DK-pk16 and IP-pk168, we found that the difference in their accuracies is not significant on the HK-PK, HK-PK-free and DK-pk16 data sets when Iterative HFold is provided with HotKnots hotspots as input. Based on our results on the HK-PK data set, with only about 15% information about the true pseudoknot-free structures, Iterative HFold’s 95% percentile confidence interval is (65.08%, 73.36%) (data shown in Additional file
<xref rid="MOESM3" ref-type="media">3</xref>
). If the user has about 35% information about the true pseudoknot-free structure, Iterative HFold’s accuracy is comparable with that of HotKnots, i.e., (74.18%, 82.30%) vs. (73.60%, 83.35%). However, Iterative HFold’s accuracy (with hotspots as input) is significantly better than that of HotKnots on the IP-pk168 data set. One advantage of Iterative HFold over HotKnots is that Iterative HFold adds base pairs so as to lower the energy of the given structure, whereas HotKnots adds stems in a way that does not take into account the energy of stems added in previous steps.</p>
<p>When reporting the time and memory consumption of Iterative HFold and HotKnots on the HK-PK data set, we did not include the time and memory required to produce the input structures for Iterative HFold. Since we run HotKnots V2.0 only partially to produce hotspots, this step does not take as long as a full HotKnots run and does not consume as much memory. For example, for the 400-nucleotide RNA sequence in our data set (A.tum.RNaseP), it takes only 0.5 seconds and 4 MB of memory to produce the hotspots. (The time required to get the hotspots for all RNA sequences in this data set is provided in Additional file
<xref rid="MOESM4" ref-type="media">4</xref>
.) We also note that since calculating hotspots and running Iterative HFold are done sequentially, the memory consumption is calculated as the maximum of the two, so the memory consumption of Iterative HFold for this sequence is still the same even including the memory needed for calculating hotspots. As we can see based on this example, even including time and memory requirements of calculating hotspots, Iterative HFold is still faster than HotKnots and uses less memory.</p>
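<p>The accounting described above, in which running times of sequential stages add while peak memory is the maximum over stages, can be written out as a short worked example. The helper name is an assumption introduced for illustration; the figures are the hotspot cost and the upper bounds quoted in the text for the 400-nucleotide sequence.</p>
<preformat>
```python
def pipeline_cost(stages):
    """Total time and peak memory of stages that run one after another.

    stages is a list of (seconds, megabytes) pairs, one per stage.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    # Times add up because the stages run sequentially.
    total_seconds = sum(seconds for seconds, _ in stages)
    # Memory does not add up: only one stage is resident at a time.
    peak_megabytes = max(megabytes for _, megabytes in stages)
    return total_seconds, peak_megabytes


# Hotspot generation (0.5 s, 4 MB) followed by Iterative HFold, using the
# bounds reported for the longest sequence (under 8.3 s, 61.33 MB).
total_seconds, peak_megabytes = pipeline_cost([(0.5, 4.0), (8.3, 61.33)])
print(total_seconds, peak_megabytes)
```
</preformat>
<p>Under these figures the combined pipeline stays below 9 seconds and at 61.33 MB, against more than 6000 seconds and 93419 MB reported for HotKnots on the same sequence.</p>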
<p>We also compared Iterative HFold with IPknot [
<xref ref-type="bibr" rid="CR42">42</xref>
]. While IPknot is faster than Iterative HFold and uses less memory, we found that on the HK-PK and IP-pk168 data sets Iterative HFold provides significantly more accurate predictions of pseudoknotted structures than IPknot. Based on our results on the HK-PK data set, Iterative HFold with more than 5% information about the true pseudoknot-free structure performs better than IPknot with default settings (data shown in Additional file
<xref rid="MOESM3" ref-type="media">3</xref>
). We note that Sato et al. [
<xref ref-type="bibr" rid="CR42">42</xref>
] find the performance of IPknot with predictions using “NUPACK” superior to all other versions of IPknot, but since this model can be used only for RNA sequences shorter than 80 nucleotides, we did not compare our results with this version of IPknot. Among the versions of IPknot we tested, we found that all settings except
<italic>γ</italic>
<sub>1</sub>
=1 and
<italic>γ</italic>
<sub>2</sub>
=1 produce similar confidence intervals for all but the HK-PK-free data sets, for which
<italic>γ</italic>
<sub>1</sub>
=4 and
<italic>γ</italic>
<sub>2</sub>
=8 produces the best result (data shown in Additional file
<xref rid="MOESM2" ref-type="media">2</xref>
). While running parameter refinement with one iteration improved the confidence intervals in the HK-PK data set, it did not result in any improvement in accuracy in the rest of our data sets as in many cases IPknot failed to produce results. We note that our results on the IP-pk168 data set with different weight parameters perfectly match the results of Sato et al. [
<xref ref-type="bibr" rid="CR42">42</xref>
].</p>
<p>A disadvantage of IPknot relative to Iterative HFold is that, being an MEA-based method, IPknot does not report the free energy of the predicted structure. Also, to get the best prediction, the user needs to provide some guidance as to what type of structure to predict for the given sequence, e.g., whether pseudoknot-free or pseudoknotted.</p>
</sec>
<sec id="Sec26">
<title>Comparison with ShapeKnots</title>
<p>Similar to HotKnots, the ShapeKnots method of Hajdin et al. [
<xref ref-type="bibr" rid="CR47">47</xref>
] is a heuristic algorithm for prediction of pseudoknotted structures. This method incorporates SHAPE reactivity data as a pseudo energy term into the prediction method. SHAPE reactivity data are available for only a limited number of RNA sequences, so we cannot compare Iterative HFold with ShapeKnots on our data set. Therefore, we use the data set of Hajdin et al. to compare the two methods. Their data set comprises 18 RNA sequences in the training set and 6 RNA sequences in the test set. We run Iterative HFold with hotspots for each RNA sequence and choose the lowest energy structure as the final output of our program. For ShapeKnots, we use the sensitivity and positive predictive values reported in the work of Hajdin et al. [
<xref ref-type="bibr" rid="CR47">47</xref>
] to compare with Iterative HFold. Table
<xref rid="Tab5" ref-type="table">5</xref>
shows the results of this comparison. In all but one sequence of the test set, Iterative HFold obtains higher accuracy than ShapeKnots. The exception is the HIV-1 5’ pseudoknot domain; Hajdin et al. note that the accepted structure of HIV-1 5’ pseudoknot domain is based on a SHAPE directed prediction and thus an accuracy comparison between ShapeKnots and Iterative HFold may be biased towards ShapeKnots. In the training set, however, Iterative HFold does not perform as well as ShapeKnots. This might be because parameters of ShapeKnots were tuned on the training set to achieve the highest possible accuracy. Since both the training and test data sets are small, we cannot make more general statements about the significance of the differences in accuracy between the two methods.
<table-wrap id="Tab5">
<label>Table 5</label>
<caption>
<p>
<bold>Comparison of Iterative HFold F-measure with ShapeKnots on SHAPE data</bold>
</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center">Training set</th>
<th align="center">Len</th>
<th align="center">PK</th>
<th align="center" colspan="3">Iter. HFold</th>
<th align="center" colspan="3">ShapeKnots</th>
</tr>
<tr>
<th align="left"></th>
<th align="left"></th>
<th align="left"></th>
<th align="center">sen</th>
<th align="center">ppv</th>
<th align="center">F</th>
<th align="center">sen</th>
<th align="center">ppv</th>
<th align="center">F</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">Pre-Q1 riboswitch, B. subtilis</td>
<td align="center">34</td>
<td align="center">1</td>
<td align="center">62.5</td>
<td align="center">100</td>
<td align="center">76.9</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">100</td>
</tr>
<tr>
<td align="center">Telomerase pseudoknot, human</td>
<td align="center">47</td>
<td align="center">1</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">100</td>
</tr>
<tr>
<td align="center">tRNA(asp), yeast</td>
<td align="center">75</td>
<td align="center">0</td>
<td align="center">81.0</td>
<td align="center">100</td>
<td align="center">89.5</td>
<td align="center">95.2</td>
<td align="center">95.2</td>
<td align="center">95.2</td>
</tr>
<tr>
<td align="center">TPP riboswitch, E. coli</td>
<td align="center">79</td>
<td align="center">0</td>
<td align="center">46.5</td>
<td align="center">47.6</td>
<td align="center">47.1</td>
<td align="center">95.4</td>
<td align="center">87.5</td>
<td align="center">91.3</td>
</tr>
<tr>
<td align="center">SARS corona virus pseudoknot</td>
<td align="center">82</td>
<td align="center">1</td>
<td align="center">69.2</td>
<td align="center">86.3</td>
<td align="center">69.2</td>
<td align="center">84.6</td>
<td align="center">88.0</td>
<td align="center">86.3</td>
</tr>
<tr>
<td align="center">cyclic-di-GMP riboswitch, V. cholerae</td>
<td align="center">97</td>
<td align="center">0</td>
<td align="center">85.5</td>
<td align="center">81.0</td>
<td align="center">83.2</td>
<td align="center">89.3</td>
<td align="center">86.2</td>
<td align="center">87.7</td>
</tr>
<tr>
<td align="center">SAM I riboswitch, T. tengcongenis</td>
<td align="center">118</td>
<td align="center">1</td>
<td align="center">79.5</td>
<td align="center">91.2</td>
<td align="center">84.9</td>
<td align="center">92.3</td>
<td align="center">97.3</td>
<td align="center">94.7</td>
</tr>
<tr>
<td align="center">M-Box riboswitch, B. subtilis</td>
<td align="center">154</td>
<td align="center">0</td>
<td align="center">87.5</td>
<td align="center">91.3</td>
<td align="center">89.4</td>
<td align="center">87.5</td>
<td align="center">91.3</td>
<td align="center">89.3</td>
</tr>
<tr>
<td align="center">P546 domain, bI3 group I intron</td>
<td align="center">155</td>
<td align="center">0</td>
<td align="center">55.4</td>
<td align="center">57.4</td>
<td align="center">56.4</td>
<td align="center">94.6</td>
<td align="center">96.4</td>
<td align="center">95.5</td>
</tr>
<tr>
<td align="center">Lysine riboswitch, T. maritima</td>
<td align="center">174</td>
<td align="center">1</td>
<td align="center">85.7</td>
<td align="center">94.7</td>
<td align="center">90.0</td>
<td align="center">87.3</td>
<td align="center">88.7</td>
<td align="center">88.0</td>
</tr>
<tr>
<td align="center">Group I intron, Azoarcus sp.</td>
<td align="center">214</td>
<td align="center">1</td>
<td align="center">52.4</td>
<td align="center">54.1</td>
<td align="center">53.2</td>
<td align="center">92.1</td>
<td align="center">95.1</td>
<td align="center">93.5</td>
</tr>
<tr>
<td align="center">Signal recognition particle RNA, human</td>
<td align="center">301</td>
<td align="center">0</td>
<td align="center">70.0</td>
<td align="center">73.7</td>
<td align="center">71.8</td>
<td align="center">55.0</td>
<td align="center">53.9</td>
<td align="center">54.4</td>
</tr>
<tr>
<td align="center">Hepatitis C virus IRES domain</td>
<td align="center">336</td>
<td align="center">1</td>
<td align="center">71.2</td>
<td align="center">74.0</td>
<td align="center">72.5</td>
<td align="center">92.3</td>
<td align="center">96.0</td>
<td align="center">94.1</td>
</tr>
<tr>
<td align="center">RNase P, B. subtilis</td>
<td align="center">405</td>
<td align="center">1</td>
<td align="center">55.7</td>
<td align="center">59.3</td>
<td align="center">57.4</td>
<td align="center">75.6</td>
<td align="center">79.8</td>
<td align="center">77.7</td>
</tr>
<tr>
<td align="center">Group II intron, O. iheyensis</td>
<td align="center">412</td>
<td align="center">1</td>
<td align="center">87.9</td>
<td align="center">95.9</td>
<td align="center">91.7</td>
<td align="center">93.2</td>
<td align="center">97.6</td>
<td align="center">95.3</td>
</tr>
<tr>
<td align="center">Group I intron, T. thermophila</td>
<td align="center">425</td>
<td align="center">1</td>
<td align="center">83.2</td>
<td align="center">85.2</td>
<td align="center">84.2</td>
<td align="center">93.9</td>
<td align="center">91.2</td>
<td align="center">92.5</td>
</tr>
<tr>
<td align="center">5’ domain of 23S rRNA, E. coli</td>
<td align="center">511</td>
<td align="center">0</td>
<td align="center">84.0</td>
<td align="center">72.5</td>
<td align="center">77.8</td>
<td align="center">92.4</td>
<td align="center">76.4</td>
<td align="center">83.6</td>
</tr>
<tr>
<td align="center">5’ domain of 16S rRNA, E. coli</td>
<td align="center">530</td>
<td align="center">0</td>
<td align="center">73.6</td>
<td align="center">69.0</td>
<td align="center">71.2</td>
<td align="center">89.9</td>
<td align="center">80.6</td>
<td align="center">84.9</td>
</tr>
<tr>
<td align="center">
<bold>Test set</bold>
</td>
<td align="center">
<bold>Len</bold>
</td>
<td align="center">
<bold>PK</bold>
</td>
<td align="center" colspan="3">
<bold>Iter. HFold</bold>
</td>
<td align="center" colspan="3">
<bold>ShapeKnots</bold>
</td>
</tr>
<tr>
<td align="center"></td>
<td align="center"></td>
<td align="center"></td>
<td align="center">
<bold>sen</bold>
</td>
<td align="center">
<bold>ppv</bold>
</td>
<td align="center">
<bold>F</bold>
</td>
<td align="center">
<bold>sen</bold>
</td>
<td align="center">
<bold>ppv</bold>
</td>
<td align="center">
<bold>F</bold>
</td>
</tr>
<tr>
<td align="center">Fluoride riboswitch, P. syringae</td>
<td align="center">66</td>
<td align="center">1</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">93.7</td>
<td align="center">93.7</td>
<td align="center">93.7</td>
</tr>
<tr>
<td align="center">Adenine riboswitch, V. vulnificus</td>
<td align="center">71</td>
<td align="center">0</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">100</td>
</tr>
<tr>
<td align="center">tRNA(phe), E. coli</td>
<td align="center">76</td>
<td align="center">0</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">84.0</td>
<td align="center">91.3</td>
</tr>
<tr>
<td align="center">5S rRNA, E. coli</td>
<td align="center">120</td>
<td align="center">0</td>
<td align="center">91.4</td>
<td align="center">91.4</td>
<td align="center">91.4</td>
<td align="center">85.7</td>
<td align="center">76.9</td>
<td align="center">81.1</td>
</tr>
<tr>
<td align="center">5’ domain of 16S rRNA, H. volcanii</td>
<td align="center">473</td>
<td align="center">0</td>
<td align="center">90.3</td>
<td align="center">82.3</td>
<td align="center">86.1</td>
<td align="center">89.6</td>
<td align="center">82.7</td>
<td align="center">86.0</td>
</tr>
<tr>
<td align="center">HIV-1 5’ pseudoknot domain</td>
<td align="center">500</td>
<td align="center">1</td>
<td align="center">45.4</td>
<td align="center">50.4</td>
<td align="center">47.7</td>
<td align="center">100</td>
<td align="center">100</td>
<td align="center">100</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
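<p>The three columns reported per method in Table 5 are related: under the standard definition, which we assume applies here, the F-measure is the harmonic mean of sensitivity (sen) and positive predictive value (ppv). The sketch below shows that relation and how all three values can be obtained from sets of base pairs. The function names and the representation of a structure as a set of (i, j) index pairs are assumptions made for illustration.</p>
<preformat>
```python
def f_measure(sensitivity, ppv):
    """Harmonic mean of sensitivity and positive predictive value."""
    if sensitivity + ppv == 0:
        return 0.0
    return 2.0 * sensitivity * ppv / (sensitivity + ppv)


def accuracy(reference_pairs, predicted_pairs):
    """Return (sen, ppv, F) in percent for two sets of (i, j) base pairs."""
    reference = set(reference_pairs)
    predicted = set(predicted_pairs)
    correct = len(reference.intersection(predicted))
    # Sensitivity: fraction of reference base pairs that were predicted.
    sen = 100.0 * correct / len(reference) if reference else 0.0
    # PPV: fraction of predicted base pairs that are in the reference.
    ppv = 100.0 * correct / len(predicted) if predicted else 0.0
    return sen, ppv, f_measure(sen, ppv)


# Iterative HFold on tRNA(asp), yeast: sen 81.0 and ppv 100 in Table 5.
print(round(f_measure(81.0, 100.0), 1))  # prints 89.5
```
</preformat>
<p>The printed value agrees with the F column of that row; small discrepancies in other rows can arise because the table reports sen and ppv already rounded to one decimal place.</p>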
</sec>
<sec id="Sec27">
<title>Iterative HFold with SimFold’s suboptimal structures</title>
<p>To further investigate which input structures are good to use when
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
is not known, we use the first 50 suboptimal structures produced by SimFold (including the MFE structure). Then for each RNA sequence we run our methods on all 50 suboptimal structures and choose the one with the lowest free energy as the final result for that RNA sequence. With this approach, the bootstrap 95% percentile confidence interval of average F-measure of HFold and Iterative HFold is (61.80%, 80.63%) and (67.70%, 79.57%) respectively for pseudoknotted structures and (77.17%, 82.35%) and (76.27%, 81.46%) respectively for pseudoknot-free structures. The permutation test indicates that the difference between these results and the corresponding results when input structures are hotspots is not significant. We also test the significance of results of HFold with first 50 suboptimal structures versus Iterative HFold with the same input structures and Iterative HFold with hotspots for both pseudoknotted and pseudoknot-free structures. Although the bootstrap 95% percentile confidence intervals for average F-measures seem different, the permutation test indicates that the difference is not significant. Similarly, results of Iterative HFold with the first 50 suboptimal structures are not significantly better or worse than the result of HFold with hotspots as input structures for both pseudoknotted and pseudoknot-free structures.</p>
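<p>The selection rule used above, running the method once per candidate input structure and keeping the output with the lowest free energy, can be sketched as follows. The callable <italic>predict</italic> is hypothetical: it stands in for one run of HFold or Iterative HFold on a given input structure and is not part of any real interface.</p>
<preformat>
```python
def best_by_energy(candidates, predict):
    """Return the lowest-energy prediction over all candidate inputs.

    predict is a hypothetical callable that maps one input structure to a
    (structure, free_energy) pair; it is a placeholder for a run of the
    folding program, not a real API.
    """
    results = [predict(candidate) for candidate in candidates]
    if not results:
        raise ValueError("at least one candidate input structure is required")
    # Lower free energy is better; ties keep the earliest candidate.
    return min(results, key=lambda result: result[1])
```
</preformat>
<p>With the first 50 SimFold suboptimal structures as candidates, the returned structure is the one whose F-measure enters the confidence intervals quoted in this section.</p>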
</sec>
<sec id="Sec28">
<title>Energy model</title>
<p>In this paper we use the HotKnots V2.0 DP09 [
<xref ref-type="bibr" rid="CR36">36</xref>
] energy parameters in our implementation of Iterative HFold. To investigate the degree to which the energy model may be causing mis-predictions by HotKnots V2.0 or Iterative HFold, we considered the degree to which the maximum accuracy structures produced by these methods, i.e., the structure with highest F-measure, is better than the minimum free energy structures. Table
<xref rid="Tab6" ref-type="table">6</xref>
presents the difference in bootstrap 95% percentile confidence intervals of average F-measure. If we choose the maximum accuracy structure among the 20 output structures predicted by HotKnots for each RNA sequence, the bootstrap 95% percentile confidence interval of average F-measure of HotKnots increases to (84.50%, 91.48%) for pseudoknotted structures of the HK-PK data set (vs. (73.60%, 83.35%) when choosing the lowest energy structure) and to (88.32%, 91.08%) for pseudoknot-free structures of the HK-PK-free data set (vs. (76.74%, 81.95%) when choosing the lowest energy structure).
<table-wrap id="Tab6">
<label>Table 6</label>
<caption>
<p>
<bold>Comparison of bootstrap 95% percentile confidence interval of average F-measure between the minimum energy structures and the maximum accuracy structures of the HK-PK and the HK-PK-free data sets</bold>
</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center">Input structures</th>
<th align="center">Min energy</th>
<th align="center">Max accuracy</th>
<th align="center">Permutation test</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">Iter. HFold - hotspots PKed</td>
<td align="center">(72.83, 83.37)</td>
<td align="center">(78.56, 87.05)</td>
<td align="center">Not significant</td>
</tr>
<tr>
<td align="center">Iter. HFold - hotspots PK-free</td>
<td align="center">(74.93, 80.26)</td>
<td align="center">(87.70, 90.57)</td>
<td align="center">Significant</td>
</tr>
<tr>
<td align="center">Iter. HFold - 50 suboptimals PKed</td>
<td align="center">(67.70, 79.57)</td>
<td align="center">(80.41, 88.14)</td>
<td align="center">Significant</td>
</tr>
<tr>
<td align="center">Iter. HFold - 50 suboptimals PK-free</td>
<td align="center">(76.27, 81.46)</td>
<td align="center">(90.05, 93.00)</td>
<td align="center">Significant</td>
</tr>
<tr>
<td align="center">HotKnots PKed</td>
<td align="center">(73.60, 83.35)</td>
<td align="center">(84.50, 91.48)</td>
<td align="center">Significant</td>
</tr>
<tr>
<td align="center">HotKnots PK-free</td>
<td align="center">(76.74, 81.95)</td>
<td align="center">(88.32, 91.08)</td>
<td align="center">Significant</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
<p>Similarly, if we compare the maximum accuracy structure output by Iterative HFold with the minimum free energy structure, whether Iterative HFold is given HotKnots hotspots or the first 50 suboptimal structures as input, the bootstrap 95% percentile confidence intervals of average F-measure also improve (see Table
<xref rid="Tab6" ref-type="table">6</xref>
). The improvement is significant in all but one case, namely Iterative HFold with hotspots as input on pseudoknotted structures of the HK-PK data set. We conclude that improvements to the energy parameter values for pseudoknotted structures may further improve the accuracy of both HotKnots and Iterative HFold.</p>
</sec>
</sec>
<sec id="Sec29">
<title>Conclusions</title>
<p>In this work we present Iterative HFold, a fast and robust iterative algorithm that matches the accuracy of the best existing pseudoknot prediction methods. Iterative HFold is significantly more accurate than IPknot while matching the accuracy of HotKnots on the HK-PK data set. Iterative HFold is superior to both IPknot and HotKnots on the IP-pk168 data set. Moreover both Iterative HFold and IPknot use less memory and run much faster than HotKnots on long sequences.</p>
<p>Iterative HFold also shows a lower rate of accuracy deterioration than HFold as information about the true pseudoknot-free structure is lost, so it is more robust than HFold. This is particularly helpful when the given input structure may be unreliable or when only limited information about the true pseudoknot-free structure is available. Iterative HFold is also more accurate than ShapeKnots [
<xref ref-type="bibr" rid="CR47">47</xref>
] on the test set of Hajdin et al. [
<xref ref-type="bibr" rid="CR47">47</xref>
].</p>
<p>In this work, we compared two different ways to generate pseudoknot-free input structures for Iterative HFold, namely the first 50 suboptimal structures produced by SimFold, and HotKnots hotspots. On the HK-PK and HK-PK-free data sets, the accuracy of Iterative HFold does not differ significantly between these two kinds of input. An alternative approach that may be worth exploring in future work would be to use the most highly probable base pairs, as calculated using the partition function [
<xref ref-type="bibr" rid="CR43">43</xref>
]. Even better may be to calculate base pair probabilities for base pairs of pseudoknotted RNA structures; however, this requires
<italic>Θ</italic>
(
<italic>n</italic>
<sup>5</sup>
) time. Since HFold finds the minimum free energy structure in
<italic>O</italic>
(
<italic>n</italic>
<sup>3</sup>
) time, conditional on the given input structure, we are currently investigating ways to develop an
<italic>O</italic>
(
<italic>n</italic>
<sup>3</sup>
)-time partition function version of HFold that can produce pseudoknotted base pair probabilities that are conditional on the given input structure.</p>
<p>Comparing the accuracy of the minimum free energy structures with that of the maximum accuracy structures in this work, we found that, on average, the minimum free energy structure has a significantly poorer F-measure than the maximum accuracy structure. This suggests that an improved energy model for pseudoknotted structure prediction may improve the accuracy of prediction algorithms for pseudoknotted structures.</p>
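<p>As a reminder of what is being compared, the sketch below computes an F-measure as the harmonic mean of sensitivity and positive predictive value over base pairs. It is a minimal illustration and not the evaluation code used in this work: the function name, the representation of a structure as a set of (i, j) index pairs, and the example structures are all assumptions.</p>
<preformat>
```python
def f_measure(predicted, reference):
    """Harmonic mean of sensitivity and PPV over base pairs.

    Both arguments are iterables of (i, j) base-pair index tuples
    (a hypothetical representation chosen for this sketch).
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted.intersection(reference))
    if true_positives == 0:
        return 0.0  # also covers empty predicted or reference structures
    sensitivity = true_positives / len(reference)  # fraction of true pairs found
    ppv = true_positives / len(predicted)  # fraction of predicted pairs that are true
    return 2 * sensitivity * ppv / (sensitivity + ppv)


# Made-up example: 3 of 4 reference pairs are recovered, with 1 false positive.
reference = {(1, 20), (2, 19), (3, 18), (8, 30)}
predicted = {(1, 20), (2, 19), (3, 18), (5, 15)}
print(f_measure(predicted, reference))  # 0.75
```
</preformat>
<p>In this sketch, a maximum accuracy structure would be the candidate with the highest such score against the reference, whereas the minimum free energy structure is selected without reference to it.</p>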
<p>Another direction for future work is to use Iterative HFold for structure prediction of two interacting RNA molecules. Iterative HFold may be well suited for this purpose because, given input structures for each individual input molecule, it allows for modification of these input structures as it explores potential base pairing interactions between the two molecules.</p>
</sec>
<sec id="Sec30" sec-type="data-availability">
<title>Data and software availability</title>
<p>Iterative HFold and all data used in this work are freely available at
<ext-link ext-link-type="uri" xlink:href="http://www.cs.ubc.ca/~hjabbari/software.php">http://www.cs.ubc.ca/~hjabbari/software.php</ext-link>
.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Electronic supplementary material</title>
<sec id="Sec31">
<p>
<supplementary-material content-type="local-data" id="MOESM1">
<media xlink:href="12859_2014_6443_MOESM1_ESM.pdf">
<caption>
<p>Additional file 1: Pseudocode. We provide pseudocode of our Iterative HFold algorithm in this section. (PDF 56 KB)</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM2">
<media xlink:href="12859_2014_6443_MOESM2_ESM.pdf">
<caption>
<p>Additional file 2: IPknot Performance. Table 1 provides the bootstrap 95% confidence intervals for average F-measure of IPknot on different data sets and different weight parameters. The energy model in all these experiments is set to McCaskill and level is set to 2 (both default values). (PDF 29 KB)</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM3">
<media xlink:href="12859_2014_6443_MOESM3_ESM.pdf">
<caption>
<p>Additional file 3:
<bold>Robustness Comparison and Correlation to False Positives.</bold>
Tables 1 and 2 provide complete data presenting the robustness comparison of HFold-PKonly, HFold, and Iterative HFold, when provided with different percentages of
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
information of HK-PK and HK-PK-free data sets. Note that the reported interval in each case is the bootstrap 95% confidence interval for F-measure of the 100 structures with 1≤
<italic>α</italic>
≤99 percent information about the
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
structure. The 100% information entry is the bootstrap 95% confidence interval for F-measure when the input structure is
<italic>G</italic>
<sub>
<italic>big</italic>
</sub>
. Table 3 provides the Pearson correlation coefficients of HFold and Iterative HFold with HotKnots hotspots, SimFold MFE and SimFold first 50 suboptimal structures. Here FP represents the false positive rate and F represents the F-measure as the accuracy measure. (PDF 49 KB)</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM4">
<media xlink:href="12859_2014_6443_MOESM4_ESM.pdf">
<caption>
<p>Additional file 4: Time and Memory Comparison. Tables 1 and 2 provide complete data presenting the running time comparison of HFold, Iterative HFold, HotKnots V2.0 and IPknot on the HK-PK data set. Timing is presented in seconds. Tables 3 and 4 provide complete data presenting the memory (total heap usage) comparison of HFold, Iterative HFold, HotKnots V2.0 and IPknot on the HK-PK data set. Memory usage is presented in megabytes. (PDF 44 KB)</p>
</caption>
</media>
</supplementary-material>
</p>
</sec>
</sec>
</body>
<back>
<fn-group>
<fn>
<p>
<bold>Competing interests</bold>
</p>
<p>The authors declare that they have no competing interests.</p>
</fn>
<fn>
<p>
<bold>Authors’ contributions</bold>
</p>
<p>HJ designed and implemented the algorithms, acquired, analyzed and interpreted the data and drafted the manuscript. AC supervised the research, participated in analysis of data and in revising the manuscript. Both authors read and approved the final manuscript.</p>
</fn>
</fn-group>
<ack>
<title>Acknowledgements</title>
<p>The authors thank the reviewers for their very constructive suggestions. The authors also thank and acknowledge Dr. Holger Hoos for his helpful comments and early discussion about the paper. This research was funded by a grant from the Natural Sciences and Engineering Research Council of Canada (NSERC).</p>
</ack>
<ref-list id="Bib1">
<title>References</title>
<ref id="CR1">
<label>1.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hale</surname>
<given-names>BJ</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>C-X</given-names>
</name>
<name>
<surname>Ross</surname>
<given-names>JW</given-names>
</name>
</person-group>
<article-title>
<bold>Small RNA regulation of reproductive function</bold>
</article-title>
<source>Mol Reprod Dev</source>
<year>2014</year>
<volume>81</volume>
<issue>2</issue>
<fpage>148</fpage>
<lpage>159</lpage>
<pub-id pub-id-type="doi">10.1002/mrd.22272</pub-id>
<pub-id pub-id-type="pmid">24167089</pub-id>
</element-citation>
</ref>
<ref id="CR2">
<label>2.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deryusheva</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Gall</surname>
<given-names>JG</given-names>
</name>
</person-group>
<article-title>
<bold>Novel small Cajal-body-specific RNAs identified in Drosophila: probing guide RNA function</bold>
</article-title>
<source>RNA</source>
<year>2013</year>
<volume>19</volume>
<issue>12</issue>
<fpage>1802</fpage>
<lpage>1814</lpage>
<pub-id pub-id-type="doi">10.1261/rna.042028.113</pub-id>
<pub-id pub-id-type="pmid">24149844</pub-id>
</element-citation>
</ref>
<ref id="CR3">
<label>3.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Holt</surname>
<given-names>CE</given-names>
</name>
<name>
<surname>Schuman</surname>
<given-names>EM</given-names>
</name>
</person-group>
<article-title>
<bold>The central dogma decentralized: New perspectives on RNA function and local translation in neurons</bold>
</article-title>
<source>Neuron</source>
<year>2013</year>
<volume>80</volume>
<issue>3</issue>
<fpage>648</fpage>
<lpage>657</lpage>
<pub-id pub-id-type="doi">10.1016/j.neuron.2013.10.036</pub-id>
<pub-id pub-id-type="pmid">24183017</pub-id>
</element-citation>
</ref>
<ref id="CR4">
<label>4.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mattick</surname>
<given-names>JS</given-names>
</name>
<name>
<surname>Makunin</surname>
<given-names>IV</given-names>
</name>
</person-group>
<article-title>
<bold>Non-coding RNA</bold>
</article-title>
<source>Hum Mol Genet</source>
<year>2006</year>
<volume>15</volume>
<issue>suppl 1</issue>
<fpage>17</fpage>
<lpage>29</lpage>
<pub-id pub-id-type="doi">10.1093/hmg/ddl046</pub-id>
</element-citation>
</ref>
<ref id="CR5">
<label>5.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Carninci</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Kasukawa</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Katayama</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Gough</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Frith</surname>
<given-names>MC</given-names>
</name>
<name>
<surname>Maeda</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Oyama</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Ravasi</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Lenhard</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Wells</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Kodzius</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Shimokawa</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Bajic</surname>
<given-names>VB</given-names>
</name>
<name>
<surname>Brenner</surname>
<given-names>SE</given-names>
</name>
<name>
<surname>Batalov</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Forrest</surname>
<given-names>ARR</given-names>
</name>
<name>
<surname>Zavolan</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Davis</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Wilming</surname>
<given-names>LG</given-names>
</name>
<name>
<surname>Aidinis</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Allen</surname>
<given-names>JE</given-names>
</name>
<name>
<surname>Ambesi-Impiombato</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Apweiler</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Aturaliya</surname>
<given-names>RN</given-names>
</name>
<name>
<surname>Bailey</surname>
<given-names>TL</given-names>
</name>
<name>
<surname>Bansal</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Baxter</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Beisel</surname>
<given-names>KW</given-names>
</name>
<name>
<surname>Bersano</surname>
<given-names>T</given-names>
</name>
<collab>The FANTOM Consortium</collab>
<etal></etal>
</person-group>
<article-title>
<bold>The transcriptional landscape of the mammalian genome</bold>
</article-title>
<source>Science</source>
<year>2005</year>
<volume>309</volume>
<issue>5740</issue>
<fpage>1559</fpage>
<lpage>1563</lpage>
<pub-id pub-id-type="doi">10.1126/science.1112014</pub-id>
<pub-id pub-id-type="pmid">16141072</pub-id>
</element-citation>
</ref>
<ref id="CR6">
<label>6.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dennis</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>
<bold>The brave new world of RNA</bold>
</article-title>
<source>Nature</source>
<year>2002</year>
<volume>418</volume>
<issue>6894</issue>
<fpage>122</fpage>
<lpage>124</lpage>
<pub-id pub-id-type="doi">10.1038/418122a</pub-id>
<pub-id pub-id-type="pmid">12110860</pub-id>
</element-citation>
</ref>
<ref id="CR7">
<label>7.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Varma</surname>
<given-names>S</given-names>
</name>
<name>
<surname>SantaLucia</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Cunningham</surname>
<given-names>PR</given-names>
</name>
</person-group>
<article-title>
<bold>In vivo determination of RNA structure-function relationships: analysis of the 790 loop in ribosomal RNA</bold>
</article-title>
<source>J Mol Biol</source>
<year>1997</year>
<volume>269</volume>
<issue>5</issue>
<fpage>732</fpage>
<lpage>743</lpage>
<pub-id pub-id-type="doi">10.1006/jmbi.1997.1092</pub-id>
<pub-id pub-id-type="pmid">9223637</pub-id>
</element-citation>
</ref>
<ref id="CR8">
<label>8.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Abdi</surname>
<given-names>NM</given-names>
</name>
<name>
<surname>Fredrick</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>
<bold>Contribution of 16S rRNA nucleotides forming the 30S subunit A and P sites to translation in Escherichia coli</bold>
</article-title>
<source>RNA</source>
<year>2005</year>
<volume>11</volume>
<issue>11</issue>
<fpage>1624</fpage>
<lpage>1632</lpage>
<pub-id pub-id-type="doi">10.1261/rna.2118105</pub-id>
<pub-id pub-id-type="pmid">16177132</pub-id>
</element-citation>
</ref>
<ref id="CR9">
<label>9.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Saraiya</surname>
<given-names>AA</given-names>
</name>
<name>
<surname>Lamichhane</surname>
<given-names>TN</given-names>
</name>
<name>
<surname>Chow</surname>
<given-names>CS</given-names>
</name>
<name>
<surname>SantaLucia</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Cunningham</surname>
<given-names>PR</given-names>
</name>
</person-group>
<article-title>
<bold>Identification and role of functionally important motifs in the 970 loop of Escherichia coli 16S ribosomal RNA</bold>
</article-title>
<source>J Mol Biol</source>
<year>2008</year>
<volume>376</volume>
<issue>3</issue>
<fpage>645</fpage>
<lpage>657</lpage>
<pub-id pub-id-type="doi">10.1016/j.jmb.2007.11.102</pub-id>
<pub-id pub-id-type="pmid">18177894</pub-id>
</element-citation>
</ref>
<ref id="CR10">
<label>10.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Calidas</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Lyon</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Culver</surname>
<given-names>GM</given-names>
</name>
</person-group>
<article-title>
<bold>The N-terminal extension of S12 influences small ribosomal subunit assembly in Escherichia coli</bold>
</article-title>
<source>RNA</source>
<year>2014</year>
</element-citation>
</ref>
<ref id="CR11">
<label>11.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sato</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Kato</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Akutsu</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Asai</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Sakakibara</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>
<bold>DAFS: simultaneous aligning and folding of RNA sequences via dual decomposition</bold>
</article-title>
<source>Bioinformatics</source>
<year>2012</year>
<volume>28</volume>
<issue>24</issue>
<fpage>3218</fpage>
<lpage>3224</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/bts612</pub-id>
<pub-id pub-id-type="pmid">23060618</pub-id>
</element-citation>
</ref>
<ref id="CR12">
<label>12.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hamada</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Sato</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Asai</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>
<bold>Improving the accuracy of predicting secondary structure for aligned RNA sequences</bold>
</article-title>
<source>Nucleic Acids Res</source>
<year>2011</year>
<volume>39</volume>
<issue>2</issue>
<fpage>393</fpage>
<lpage>402</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkq792</pub-id>
<pub-id pub-id-type="pmid">20843778</pub-id>
</element-citation>
</ref>
<ref id="CR13">
<label>13.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hamada</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Yamada</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Sato</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Frith</surname>
<given-names>MC</given-names>
</name>
<name>
<surname>Asai</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>
<bold>CentroidHomfold-LAST: accurate prediction of RNA secondary structure using automatically collected homologous sequences</bold>
</article-title>
<source>Nucleic Acids Res</source>
<year>2011</year>
<volume>39</volume>
<issue>suppl 2</issue>
<fpage>100</fpage>
<lpage>106</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkr290</pub-id>
</element-citation>
</ref>
<ref id="CR14">
<label>14.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Mathews</surname>
<given-names>DH</given-names>
</name>
</person-group>
<article-title>
<bold>Multilign: an algorithm to predict secondary structures conserved in multiple RNA sequences</bold>
</article-title>
<source>Bioinformatics</source>
<year>2011</year>
<volume>27</volume>
<issue>5</issue>
<fpage>626</fpage>
<lpage>632</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btq726</pub-id>
<pub-id pub-id-type="pmid">21193521</pub-id>
</element-citation>
</ref>
<ref id="CR15">
<label>15.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wiebe</surname>
<given-names>NJP</given-names>
</name>
<name>
<surname>Meyer</surname>
<given-names>IM</given-names>
</name>
</person-group>
<article-title>
<bold>Transat - a method for detecting the conserved helices of functional RNA structures, including transient, pseudo-knotted and alternative structures</bold>
</article-title>
<source>PLoS Comput Biol</source>
<year>2010</year>
<volume>6</volume>
<issue>6</issue>
<fpage>1000823</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.1000823</pub-id>
</element-citation>
</ref>
<ref id="CR16">
<label>16.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bernhart</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Hofacker</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Will</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Gruber</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Stadler</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>
<bold>RNAalifold: improved consensus structure prediction for RNA alignments</bold>
</article-title>
<source>BMC Bioinformatics</source>
<year>2008</year>
<volume>9</volume>
<issue>1</issue>
<fpage>474</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-9-474</pub-id>
<pub-id pub-id-type="pmid">19014431</pub-id>
</element-citation>
</ref>
<ref id="CR17">
<label>17.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Meyer</surname>
<given-names>IM</given-names>
</name>
<name>
<surname>Miklós</surname>
<given-names>I</given-names>
</name>
</person-group>
<article-title>
<bold>SimulFold: simultaneously inferring RNA structures including pseudoknots, alignments, and trees using a Bayesian MCMC framework</bold>
</article-title>
<source>PLoS Comput Biol</source>
<year>2007</year>
<volume>3</volume>
<issue>8</issue>
<fpage>149</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.0030149</pub-id>
</element-citation>
</ref>
<ref id="CR18">
<label>18.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pedersen</surname>
<given-names>JS</given-names>
</name>
<name>
<surname>Bejerano</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Siepel</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rosenbloom</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Lindblad-Toh</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Lander</surname>
<given-names>ES</given-names>
</name>
<name>
<surname>Kent</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Haussler</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>
<bold>Identification and classification of conserved RNA secondary structures in the human genome</bold>
</article-title>
<source>PLoS Comput Biol</source>
<year>2006</year>
<volume>2</volume>
<issue>4</issue>
<fpage>33</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.0020033</pub-id>
</element-citation>
</ref>
<ref id="CR19">
<label>19.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Griffiths-Jones</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Moxon</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Marshall</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Khanna</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Eddy</surname>
<given-names>SR</given-names>
</name>
<name>
<surname>Bateman</surname>
<given-names>A</given-names>
</name>
</person-group>
<source>
<bold>Rfam: annotating non-coding RNAs in complete genomes</bold>
</source>
<year>2005</year>
</element-citation>
</ref>
<ref id="CR20">
<label>20.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Touzet</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Perriquet</surname>
<given-names>O</given-names>
</name>
</person-group>
<source>
<bold>CARNAC: folding families of related RNAs</bold>
</source>
<year>2004</year>
</element-citation>
</ref>
<ref id="CR21">
<label>21.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Knudsen</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Hein</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>
<bold>RNA secondary structure prediction using stochastic context-free grammars and evolutionary history</bold>
</article-title>
<source>Bioinformatics</source>
<year>1999</year>
<volume>15</volume>
<issue>6</issue>
<fpage>446</fpage>
<lpage>454</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/15.6.446</pub-id>
<pub-id pub-id-type="pmid">10383470</pub-id>
</element-citation>
</ref>
<ref id="CR22">
<label>22.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Durbin</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Eddy</surname>
<given-names>SR</given-names>
</name>
<name>
<surname>Krogh</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Mitchison</surname>
<given-names>G</given-names>
</name>
</person-group>
<source>Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids</source>
<year>1998</year>
<publisher-loc>Cambridge</publisher-loc>
<publisher-name>Cambridge University Press</publisher-name>
</element-citation>
</ref>
<ref id="CR23">
<label>23.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mathews</surname>
<given-names>DH</given-names>
</name>
<name>
<surname>Sabina</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Zuker</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Turner</surname>
<given-names>DH</given-names>
</name>
</person-group>
<article-title>
<bold>Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure</bold>
</article-title>
<source>J Mol Biol</source>
<year>1999</year>
<volume>288</volume>
<issue>5</issue>
<fpage>911</fpage>
<lpage>940</lpage>
<pub-id pub-id-type="doi">10.1006/jmbi.1999.2700</pub-id>
<pub-id pub-id-type="pmid">10329189</pub-id>
</element-citation>
</ref>
<ref id="CR24">
<label>24.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<collab>The ENCODE Project Consortium</collab>
</person-group>
<article-title>
<bold>Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project</bold>
</article-title>
<source>Nature</source>
<year>2007</year>
<volume>447</volume>
<issue>7146</issue>
<fpage>799</fpage>
<lpage>816</lpage>
<pub-id pub-id-type="doi">10.1038/nature05874</pub-id>
<pub-id pub-id-type="pmid">17571346</pub-id>
</element-citation>
</ref>
<ref id="CR25">
<label>25.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hofacker</surname>
<given-names>IL</given-names>
</name>
<name>
<surname>Fontana</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Stadler</surname>
<given-names>PF</given-names>
</name>
<name>
<surname>Bonhoeffer</surname>
<given-names>LS</given-names>
</name>
<name>
<surname>Tacker</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Schuster</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>
<bold>Fast folding and comparison of RNA secondary structures</bold>
</article-title>
<source>Monatshefte für Chemie / Chem Monthly</source>
<year>1994</year>
<volume>125</volume>
<issue>2</issue>
<fpage>167</fpage>
<lpage>188</lpage>
<pub-id pub-id-type="doi">10.1007/BF00818163</pub-id>
</element-citation>
</ref>
<ref id="CR26">
<label>26.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Proctor</surname>
<given-names>JR</given-names>
</name>
<name>
<surname>Meyer</surname>
<given-names>IM</given-names>
</name>
</person-group>
<article-title>
<bold>CoFold: an RNA secondary structure prediction method that takes co-transcriptional folding into account</bold>
</article-title>
<source>Nucleic Acids Res</source>
<year>2013</year>
<volume>41</volume>
<issue>9</issue>
<fpage>102</fpage>
<pub-id pub-id-type="doi">10.1093/nar/gkt174</pub-id>
</element-citation>
</ref>
<ref id="CR27">
<label>27.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Staple</surname>
<given-names>DW</given-names>
</name>
<name>
<surname>Butcher</surname>
<given-names>SE</given-names>
</name>
</person-group>
<article-title>
<bold>Pseudoknots: RNA structures with diverse functions</bold>
</article-title>
<source>PLoS Biol</source>
<year>2005</year>
<volume>3</volume>
<issue>6</issue>
<fpage>e213</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pbio.0030213</pub-id>
<pub-id pub-id-type="pmid">15941360</pub-id>
</element-citation>
</ref>
<ref id="CR28">
<label>28.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>van Batenburg</surname>
<given-names>FH</given-names>
</name>
<name>
<surname>Gultyaev</surname>
<given-names>AP</given-names>
</name>
<name>
<surname>Pleij</surname>
<given-names>CW</given-names>
</name>
</person-group>
<article-title>
<bold>Pseudobase: structural information on RNA pseudoknots</bold>
</article-title>
<source>Nucleic Acids Res</source>
<year>2001</year>
<volume>29</volume>
<issue>1</issue>
<fpage>194</fpage>
<lpage>195</lpage>
<pub-id pub-id-type="doi">10.1093/nar/29.1.194</pub-id>
<pub-id pub-id-type="pmid">11125088</pub-id>
</element-citation>
</ref>
<ref id="CR29">
<label>29.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deiman</surname>
<given-names>BALM</given-names>
</name>
<name>
<surname>Pleij</surname>
<given-names>CWA</given-names>
</name>
</person-group>
<article-title>
<bold>Pseudoknots: A vital feature in viral RNA</bold>
</article-title>
<source>Semin Virol</source>
<year>1997</year>
<volume>8</volume>
<issue>3</issue>
<fpage>166</fpage>
<lpage>175</lpage>
<pub-id pub-id-type="doi">10.1006/smvy.1997.0119</pub-id>
</element-citation>
</ref>
<ref id="CR30">
<label>30.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Akutsu</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>
<bold>Dynamic programming algorithms for RNA secondary structure prediction with pseudoknots</bold>
</article-title>
<source>Disc App Math</source>
<year>2000</year>
<volume>104</volume>
<issue>1–3</issue>
<fpage>45</fpage>
<lpage>62</lpage>
<pub-id pub-id-type="doi">10.1016/S0166-218X(00)00186-4</pub-id>
</element-citation>
</ref>
<ref id="CR31">
<label>31.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Lyngsø</surname>
<given-names>RB</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Díaz</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Karhumäki</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lepistö</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Sannella</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>
<bold>Complexity of pseudoknot prediction in simple models</bold>
</article-title>
<source>ICALP. Automata, Languages and Programming. Lecture Notes in Computer Science, vol. 3142</source>
<year>2004</year>
<publisher-loc>Heidelberg</publisher-loc>
<publisher-name>Springer Berlin</publisher-name>
<fpage>919</fpage>
<lpage>931</lpage>
</element-citation>
</ref>
<ref id="CR32">
<label>32.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pedersen</surname>
<given-names>CN</given-names>
</name>
<name>
<surname>Lyngsø</surname>
<given-names>RB</given-names>
</name>
</person-group>
<article-title>
<bold>RNA pseudoknot prediction in energy-based models</bold>
</article-title>
<source>J Comput Biol</source>
<year>2000</year>
<volume>7</volume>
<issue>3–4</issue>
<fpage>409</fpage>
<lpage>427</lpage>
<pub-id pub-id-type="pmid">11108471</pub-id>
</element-citation>
</ref>
<ref id="CR33">
<label>33.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rivas</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Eddy</surname>
<given-names>SR</given-names>
</name>
</person-group>
<article-title>
<bold>A dynamic programming algorithm for RNA structure prediction including pseudoknots</bold>
</article-title>
<source>J Mol Biol</source>
<year>1999</year>
<volume>285</volume>
<issue>5</issue>
<fpage>2053</fpage>
<lpage>2068</lpage>
<pub-id pub-id-type="doi">10.1006/jmbi.1998.2436</pub-id>
<pub-id pub-id-type="pmid">9925784</pub-id>
</element-citation>
</ref>
<ref id="CR34">
<label>34.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dirks</surname>
<given-names>RM</given-names>
</name>
<name>
<surname>Pierce</surname>
<given-names>NA</given-names>
</name>
</person-group>
<article-title>
<bold>A partition function algorithm for nucleic acid secondary structure including pseudoknots</bold>
</article-title>
<source>J Comput Chem</source>
<year>2003</year>
<volume>24</volume>
<issue>13</issue>
<fpage>1664</fpage>
<lpage>1677</lpage>
<pub-id pub-id-type="doi">10.1002/jcc.10296</pub-id>
<pub-id pub-id-type="pmid">12926009</pub-id>
</element-citation>
</ref>
<ref id="CR35">
<label>35.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Reeder</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Giegerich</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>
<bold>Design, implementation and evaluation of a practical pseudoknot folding algorithm based on thermodynamics</bold>
</article-title>
<source>BMC Bioinformatics</source>
<year>2004</year>
<volume>5</volume>
<fpage>104</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-5-104</pub-id>
<pub-id pub-id-type="pmid">15294028</pub-id>
</element-citation>
</ref>
<ref id="CR36">
<label>36.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Andronescu</surname>
<given-names>MS</given-names>
</name>
<name>
<surname>Pop</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Condon</surname>
<given-names>AE</given-names>
</name>
</person-group>
<article-title>
<bold>Improved free energy parameters for RNA pseudoknotted secondary structure prediction</bold>
</article-title>
<source>RNA</source>
<year>2010</year>
<volume>16</volume>
<issue>1</issue>
<fpage>26</fpage>
<lpage>42</lpage>
<pub-id pub-id-type="doi">10.1261/rna.1689910</pub-id>
<pub-id pub-id-type="pmid">19933322</pub-id>
</element-citation>
</ref>
<ref id="CR37">
<label>37.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sperschneider</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Datta</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Wise</surname>
<given-names>MJ</given-names>
</name>
</person-group>
<article-title>
<bold>Heuristic RNA pseudoknot prediction including intramolecular kissing hairpins</bold>
</article-title>
<source>RNA</source>
<year>2011</year>
<volume>17</volume>
<issue>1</issue>
<fpage>27</fpage>
<lpage>38</lpage>
<pub-id pub-id-type="doi">10.1261/rna.2394511</pub-id>
<pub-id pub-id-type="pmid">21098139</pub-id>
</element-citation>
</ref>
<ref id="CR38">
<label>38.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sperschneider</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Datta</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>
<bold>DotKnot: pseudoknot prediction using the probability dot plot under a refined energy model</bold>
</article-title>
<source>Nucleic Acids Res</source>
<year>2010</year>
<volume>38</volume>
<issue>7</issue>
<fpage>e103</fpage>
<pub-id pub-id-type="doi">10.1093/nar/gkq021</pub-id>
</element-citation>
</ref>
<ref id="CR39">
<label>39.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sperschneider</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Datta</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>
<bold>KnotSeeker: Heuristic pseudoknot detection in long RNA sequences</bold>
</article-title>
<source>RNA</source>
<year>2008</year>
<volume>14</volume>
<issue>4</issue>
<fpage>630</fpage>
<lpage>640</lpage>
<pub-id pub-id-type="doi">10.1261/rna.968808</pub-id>
<pub-id pub-id-type="pmid">18314500</pub-id>
</element-citation>
</ref>
<ref id="CR40">
<label>40.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>C-H</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>CL</given-names>
</name>
<name>
<surname>Chiu</surname>
<given-names>H-T</given-names>
</name>
</person-group>
<article-title>
<bold>A heuristic approach for detecting RNA h-type pseudoknots</bold>
</article-title>
<source>Bioinformatics</source>
<year>2005</year>
<volume>21</volume>
<issue>17</issue>
<fpage>3501</fpage>
<lpage>3508</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/bti568</pub-id>
<pub-id pub-id-type="pmid">15994188</pub-id>
</element-citation>
</ref>
<ref id="CR41">
<label>41.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ren</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Rastegari</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Condon</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hoos</surname>
<given-names>HH</given-names>
</name>
</person-group>
<article-title>
<bold>HotKnots: heuristic prediction of RNA secondary structures including pseudoknots</bold>
</article-title>
<source>RNA</source>
<year>2005</year>
<volume>11</volume>
<issue>10</issue>
<fpage>1494</fpage>
<lpage>1504</lpage>
<pub-id pub-id-type="doi">10.1261/rna.7284905</pub-id>
<pub-id pub-id-type="pmid">16199760</pub-id>
</element-citation>
</ref>
<ref id="CR42">
<label>42.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sato</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Kato</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Hamada</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Akutsu</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Asai</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>
<bold>IPknot: fast and accurate prediction of RNA secondary structures with pseudoknots using integer programming</bold>
</article-title>
<source>Bioinformatics</source>
<year>2011</year>
<volume>27</volume>
<issue>13</issue>
<fpage>i85</fpage>
<lpage>i93</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btr215</pub-id>
</element-citation>
</ref>
<ref id="CR43">
<label>43.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mathews</surname>
<given-names>DH</given-names>
</name>
</person-group>
<article-title>
<bold>Using an RNA secondary structure partition function to determine confidence in base pairs predicted by free energy minimization</bold>
</article-title>
<source>RNA</source>
<year>2004</year>
<volume>10</volume>
<issue>8</issue>
<fpage>1178</fpage>
<lpage>1190</lpage>
<pub-id pub-id-type="doi">10.1261/rna.7650904</pub-id>
<pub-id pub-id-type="pmid">15272118</pub-id>
</element-citation>
</ref>
<ref id="CR44">
<label>44.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Puton</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Kozlowski</surname>
<given-names>LP</given-names>
</name>
<name>
<surname>Rother</surname>
<given-names>KM</given-names>
</name>
<name>
<surname>Bujnicki</surname>
<given-names>JM</given-names>
</name>
</person-group>
<article-title>
<bold>CompaRNA: a server for continuous benchmarking of automated methods for RNA secondary structure prediction</bold>
</article-title>
<source>Nucleic Acids Res</source>
<year>2013</year>
<volume>41</volume>
<issue>7</issue>
<fpage>4307</fpage>
<lpage>4323</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkt101</pub-id>
<pub-id pub-id-type="pmid">23435231</pub-id>
</element-citation>
</ref>
<ref id="CR45">
<label>45.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mathews</surname>
<given-names>DH</given-names>
</name>
<name>
<surname>Disney</surname>
<given-names>MD</given-names>
</name>
<name>
<surname>Childs</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Schroeder</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Zuker</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Turner</surname>
<given-names>DH</given-names>
</name>
</person-group>
<article-title>
<bold>Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure</bold>
</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2004</year>
<volume>101</volume>
<issue>19</issue>
<fpage>7287</fpage>
<lpage>7292</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0401799101</pub-id>
<pub-id pub-id-type="pmid">15123812</pub-id>
</element-citation>
</ref>
<ref id="CR46">
<label>46.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deigan</surname>
<given-names>KE</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>TW</given-names>
</name>
<name>
<surname>Mathews</surname>
<given-names>DH</given-names>
</name>
<name>
<surname>Weeks</surname>
<given-names>KM</given-names>
</name>
</person-group>
<article-title>
<bold>Accurate SHAPE-directed RNA structure determination</bold>
</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2009</year>
<volume>106</volume>
<issue>1</issue>
<fpage>97</fpage>
<lpage>102</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0806929106</pub-id>
<pub-id pub-id-type="pmid">19109441</pub-id>
</element-citation>
</ref>
<ref id="CR47">
<label>47.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hajdin</surname>
<given-names>CE</given-names>
</name>
<name>
<surname>Bellaousov</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Huggins</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Leonard</surname>
<given-names>CW</given-names>
</name>
<name>
<surname>Mathews</surname>
<given-names>DH</given-names>
</name>
<name>
<surname>Weeks</surname>
<given-names>KM</given-names>
</name>
</person-group>
<article-title>
<bold>Accurate SHAPE-directed RNA secondary structure modeling, including pseudoknots</bold>
</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2013</year>
<volume>110</volume>
<issue>14</issue>
<fpage>5498</fpage>
<lpage>5503</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.1219988110</pub-id>
<pub-id pub-id-type="pmid">23503844</pub-id>
</element-citation>
</ref>
<ref id="CR48">
<label>48.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jabbari</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Condon</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>
<bold>Novel and efficient RNA secondary structure prediction using hierarchical folding</bold>
</article-title>
<source>J Comput Biol</source>
<year>2008</year>
<volume>15</volume>
<issue>2</issue>
<fpage>139</fpage>
<lpage>163</lpage>
<pub-id pub-id-type="doi">10.1089/cmb.2007.0198</pub-id>
<pub-id pub-id-type="pmid">18312147</pub-id>
</element-citation>
</ref>
<ref id="CR49">
<label>49.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tinoco</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Bustamante</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>
<bold>How RNA folds</bold>
</article-title>
<source>J Mol Biol</source>
<year>1999</year>
<volume>293</volume>
<issue>2</issue>
<fpage>271</fpage>
<lpage>281</lpage>
<pub-id pub-id-type="doi">10.1006/jmbi.1999.3001</pub-id>
<pub-id pub-id-type="pmid">10550208</pub-id>
</element-citation>
</ref>
<ref id="CR50">
<label>50.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Mathews</surname>
<given-names>DH</given-names>
</name>
</person-group>
<article-title>
<bold>Predicting RNA secondary structure by free energy minimization</bold>
</article-title>
<source>Theor Chem Acc: Theory, Computation, and Modeling (Theoretica Chimica Acta)</source>
<year>2006</year>
<fpage>1</fpage>
<lpage>9</lpage>
</element-citation>
</ref>
<ref id="CR51">
<label>51.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cho</surname>
<given-names>SS</given-names>
</name>
<name>
<surname>Pincus</surname>
<given-names>DL</given-names>
</name>
<name>
<surname>Thirumalai</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>
<bold>Assembly mechanisms of RNA pseudoknots are determined by the stabilities of constituent secondary structures</bold>
</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2009</year>
<volume>106</volume>
<issue>41</issue>
<fpage>17349</fpage>
<lpage>17354</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0906625106</pub-id>
<pub-id pub-id-type="pmid">19805055</pub-id>
</element-citation>
</ref>
<ref id="CR52">
<label>52.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bailor</surname>
<given-names>MH</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Al-Hashimi</surname>
<given-names>HM</given-names>
</name>
</person-group>
<article-title>
<bold>Topology links RNA secondary structure with global conformation, dynamics, and adaptation</bold>
</article-title>
<source>Science</source>
<year>2010</year>
<volume>327</volume>
<issue>5962</issue>
<fpage>202</fpage>
<lpage>206</lpage>
<pub-id pub-id-type="doi">10.1126/science.1181085</pub-id>
<pub-id pub-id-type="pmid">20056889</pub-id>
</element-citation>
</ref>
<ref id="CR53">
<label>53.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wilkinson</surname>
<given-names>KA</given-names>
</name>
<name>
<surname>Merino</surname>
<given-names>EJ</given-names>
</name>
<name>
<surname>Weeks</surname>
<given-names>KM</given-names>
</name>
</person-group>
<article-title>
<bold>RNA SHAPE chemistry reveals nonhierarchical interactions dominate equilibrium structural transitions in tRNAasp transcripts</bold>
</article-title>
<source>J Am Chem Soc</source>
<year>2005</year>
<volume>127</volume>
<issue>13</issue>
<fpage>4659</fpage>
<lpage>4667</lpage>
<pub-id pub-id-type="doi">10.1021/ja0436749</pub-id>
<pub-id pub-id-type="pmid">15796531</pub-id>
</element-citation>
</ref>
<ref id="CR54">
<label>54.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ding</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Sharma</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Chalasani</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Demidov</surname>
<given-names>VV</given-names>
</name>
<name>
<surname>Broude</surname>
<given-names>NE</given-names>
</name>
<name>
<surname>Dokholyan</surname>
<given-names>NV</given-names>
</name>
</person-group>
<article-title>
<bold>Ab initio RNA folding by discrete molecular dynamics: From structure prediction to folding mechanisms</bold>
</article-title>
<source>RNA</source>
<year>2008</year>
<volume>14</volume>
<issue>6</issue>
<fpage>1164</fpage>
<lpage>1173</lpage>
<pub-id pub-id-type="doi">10.1261/rna.894608</pub-id>
<pub-id pub-id-type="pmid">18456842</pub-id>
</element-citation>
</ref>
<ref id="CR55">
<label>55.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Darty</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Denise</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Ponty</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>
<bold>VARNA: interactive drawing and editing of the RNA secondary structure</bold>
</article-title>
<source>Bioinformatics</source>
<year>2009</year>
<volume>25</volume>
<issue>15</issue>
<fpage>1974</fpage>
<lpage>1975</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btp250</pub-id>
<pub-id pub-id-type="pmid">19398448</pub-id>
</element-citation>
</ref>
<ref id="CR56">
<label>56.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rastegari</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Condon</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>
<bold>Parsing nucleic acid pseudoknotted secondary structure: algorithm and applications</bold>
</article-title>
<source>J Comput Biol</source>
<year>2007</year>
<volume>14</volume>
<issue>1</issue>
<fpage>16</fpage>
<lpage>32</lpage>
<pub-id pub-id-type="doi">10.1089/cmb.2006.0108</pub-id>
<pub-id pub-id-type="pmid">17381343</pub-id>
</element-citation>
</ref>
<ref id="CR57">
<label>57.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sperschneider</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Datta</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Wise</surname>
<given-names>MJ</given-names>
</name>
</person-group>
<article-title>
<bold>Predicting pseudoknotted structures across two RNA sequences</bold>
</article-title>
<source>Bioinformatics</source>
<year>2012</year>
<volume>28</volume>
<issue>23</issue>
<fpage>3058</fpage>
<lpage>3065</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/bts575</pub-id>
<pub-id pub-id-type="pmid">23044552</pub-id>
</element-citation>
</ref>
<ref id="CR58">
<label>58.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hajiaghayi</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Condon</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hoos</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>
<bold>Analysis of energy-based algorithms for RNA secondary structure prediction</bold>
</article-title>
<source>BMC Bioinformatics</source>
<year>2012</year>
<volume>13</volume>
<issue>1</issue>
<fpage>22</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-13-22</pub-id>
<pub-id pub-id-type="pmid">22296803</pub-id>
</element-citation>
</ref>
<ref id="CR59">
<label>59.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Varian</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>
<bold>Bootstrap tutorial</bold>
</article-title>
<source>Mathematica J</source>
<year>2005</year>
<volume>9</volume>
<issue>4</issue>
<fpage>768</fpage>
<lpage>775</lpage>
</element-citation>
</ref>
<ref id="CR60">
<label>60.</label>
<mixed-citation publication-type="other">Hesterberg T, Monaghan S, Moore DS, Clipson A, Epstein R: Bootstrap methods and permutation tests. The practice of business statistics. Edited by: Farace P, Ward T, Swearengin D, Donnellan B. Chap. 18, New York: W. H. Freeman and Company.</mixed-citation>
</ref>
<ref id="CR61">
<label>61.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aghaeepour</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Hoos</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>
<bold>Ensemble-based prediction of RNA secondary structures</bold>
</article-title>
<source>BMC Bioinformatics</source>
<year>2013</year>
<volume>14</volume>
<issue>1</issue>
<fpage>139</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-14-139</pub-id>
<pub-id pub-id-type="pmid">23617269</pub-id>
</element-citation>
</ref>
<ref id="CR62">
<label>62.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<collab>R Core Team</collab>
</person-group>
<source>R: A Language and Environment for Statistical Computing</source>
<year>2013</year>
<publisher-loc>Vienna, Austria</publisher-loc>
<publisher-name>R Foundation for Statistical Computing</publisher-name>
</element-citation>
</ref>
<ref id="CR63">
<label>63.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Andronescu</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Chuan</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Condon</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>
<bold>Secondary structure prediction of interacting RNA molecules</bold>
</article-title>
<source>J Mol Biol</source>
<year>2005</year>
<volume>345</volume>
<issue>5</issue>
<fpage>987</fpage>
<lpage>1001</lpage>
<pub-id pub-id-type="doi">10.1016/j.jmb.2004.10.082</pub-id>
<pub-id pub-id-type="pmid">15644199</pub-id>
</element-citation>
</ref>
<ref id="CR64">
<label>64.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zuker</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>
<bold>Mfold web server for nucleic acid folding and hybridization prediction</bold>
</article-title>
<source>Nucleic Acids Res</source>
<year>2003</year>
<volume>31</volume>
<fpage>3406</fpage>
<lpage>3415</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkg595</pub-id>
<pub-id pub-id-type="pmid">12824337</pub-id>
</element-citation>
</ref>
<ref id="CR65">
<label>65.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bellaousov</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Mathews</surname>
<given-names>DH</given-names>
</name>
</person-group>
<article-title>
<bold>ProbKnot: fast prediction of RNA secondary structure including pseudoknots</bold>
</article-title>
<source>RNA</source>
<year>2010</year>
<volume>16</volume>
<issue>10</issue>
<fpage>1870</fpage>
<lpage>1880</lpage>
<pub-id pub-id-type="doi">10.1261/rna.2125310</pub-id>
<pub-id pub-id-type="pmid">20699301</pub-id>
</element-citation>
</ref>
<ref id="CR66">
<label>66.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nethercote</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Seward</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>
<bold>Valgrind: a framework for heavyweight dynamic binary instrumentation</bold>
</article-title>
<source>SIGPLAN Not</source>
<year>2007</year>
<volume>42</volume>
<issue>6</issue>
<fpage>89</fpage>
<lpage>100</lpage>
<pub-id pub-id-type="doi">10.1145/1273442.1250746</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Sante/explor/CovidV2/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000192 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000192 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Sante
   |area=    CovidV2
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:4064103
   |texte=   A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:24884954" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a CovidV2 

This area was generated with Dilib version V0.6.33.
Data generation: Sat Mar 28 17:51:24 2020. Site generation: Sun Jan 31 15:35:48 2021