Exploration server on telematics

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Internal identifier: 000419 (Pmc/Corpus); previous: 0004189; next: 0004200


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Tandem repeats discovery service (TReaDS) applied to finding novel cis-acting factors in repeat expansion diseases</title>
<author>
<name sortKey="Pellegrini, Marco" sort="Pellegrini, Marco" uniqKey="Pellegrini M" first="Marco" last="Pellegrini">Marco Pellegrini</name>
<affiliation>
<nlm:aff id="I1">Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche, Pisa I-56124, Italy</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Renda, Maria Elena" sort="Renda, Maria Elena" uniqKey="Renda M" first="Maria Elena" last="Renda">Maria Elena Renda</name>
<affiliation>
<nlm:aff id="I1">Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche, Pisa I-56124, Italy</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Vecchio, Alessio" sort="Vecchio, Alessio" uniqKey="Vecchio A" first="Alessio" last="Vecchio">Alessio Vecchio</name>
<affiliation>
<nlm:aff id="I2">Dipartimento di Ingegneria dell'Informazione, Università di Pisa, Pisa I-56122, Italy</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">22536970</idno>
<idno type="pmc">3303744</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3303744</idno>
<idno type="RBID">PMC:3303744</idno>
<idno type="doi">10.1186/1471-2105-13-S4-S3</idno>
<date when="2012">2012</date>
<idno type="wicri:Area/Pmc/Corpus">000419</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000419</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Tandem repeats discovery service (TReaDS) applied to finding novel cis-acting factors in repeat expansion diseases</title>
<author>
<name sortKey="Pellegrini, Marco" sort="Pellegrini, Marco" uniqKey="Pellegrini M" first="Marco" last="Pellegrini">Marco Pellegrini</name>
<affiliation>
<nlm:aff id="I1">Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche, Pisa I-56124, Italy</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Renda, Maria Elena" sort="Renda, Maria Elena" uniqKey="Renda M" first="Maria Elena" last="Renda">Maria Elena Renda</name>
<affiliation>
<nlm:aff id="I1">Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche, Pisa I-56124, Italy</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Vecchio, Alessio" sort="Vecchio, Alessio" uniqKey="Vecchio A" first="Alessio" last="Vecchio">Alessio Vecchio</name>
<affiliation>
<nlm:aff id="I2">Dipartimento di Ingegneria dell'Informazione, Università di Pisa, Pisa I-56122, Italy</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2012">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>Tandem repeats are multiple duplications of substrings in the DNA that occur contiguously, or at a short distance, and may involve some mutations (such as substitutions, insertions, and deletions). Tandem repeats have also been extensively studied for their association with the class of repeat expansion diseases (mostly affecting the nervous system). Comparative studies of the output of different tools for finding tandem repeats have highlighted significant differences among the sets of detected tandem repeats, and many authors have pointed out how critical the right choice of parameters is.</p>
</sec>
<sec>
<title>Results</title>
<p>In this paper we present
<italic>TReaDS - Tandem Repeats Discovery Service</italic>
, a
<italic>tandem repeat meta search engine</italic>
.
<italic>TReaDS </italic>
forwards user requests to several state-of-the-art tools for finding tandem repeats and merges their output into a single report, providing a global, synthetic, and comparative view of the results. In particular,
<italic>TReaDS </italic>
allows the user to (
<italic>i</italic>
) simultaneously run different algorithms on the same data set, (
<italic>ii</italic>
) choose for each algorithm a different setting of parameters, and (
<italic>iii</italic>
) obtain a report that can be downloaded for further offline investigation. We used
<italic>TReaDS </italic>
to investigate sequences associated with repeat expansion diseases.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>By using the tool
<italic>TReaDS </italic>
we discovered that, for 27 of the 29 currently known repeat expansion diseases,
<italic>long fuzzy tandem repeats </italic>
cover the expansion loci. Tests with control sets confirm the specificity of this association. This finding suggests that long fuzzy tandem repeats may be a new class of cis-acting elements involved in the mechanisms leading to expansion instability.</p>
<p>We strongly believe that biologists will be interested in a tool that not only lets them run multiple search algorithms at the same time, with the same effort required to use just one of them, but also eases the burden of comparing and merging the results, thus expanding our ability to detect important phenomena related to tandem repeats.</p>
</sec>
</div>
</front>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-id journal-id-type="iso-abbrev">BMC Bioinformatics</journal-id>
<journal-title-group>
<journal-title>BMC Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2105</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">22536970</article-id>
<article-id pub-id-type="pmc">3303744</article-id>
<article-id pub-id-type="publisher-id">1471-2105-13-S4-S3</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-13-S4-S3</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Tandem repeats discovery service (TReaDS) applied to finding novel cis-acting factors in repeat expansion diseases</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" id="A1">
<name>
<surname>Pellegrini</surname>
<given-names>Marco</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>marco.pellegrini@iit.cnr.it</email>
</contrib>
<contrib contrib-type="author" corresp="yes" id="A2">
<name>
<surname>Renda</surname>
<given-names>Maria Elena</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>elena.renda@iit.cnr.it</email>
</contrib>
<contrib contrib-type="author" id="A3">
<name>
<surname>Vecchio</surname>
<given-names>Alessio</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>a.vecchio@iet.unipi.it</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche, Pisa I-56124, Italy</aff>
<aff id="I2">
<label>2</label>
Dipartimento di Ingegneria dell'Informazione, Università di Pisa, Pisa I-56122, Italy</aff>
<pub-date pub-type="collection">
<year>2012</year>
</pub-date>
<pub-date pub-type="epub">
<day>28</day>
<month>3</month>
<year>2012</year>
</pub-date>
<volume>13</volume>
<issue>Suppl 4</issue>
<supplement>
<named-content content-type="supplement-title">Italian Society of Bioinformatics (BITS): Annual Meeting 2011</named-content>
<named-content content-type="supplement-editor">Paolo Romano and Manuela Helmer-Citterich</named-content>
</supplement>
<fpage>S3</fpage>
<lpage>S3</lpage>
<permissions>
<copyright-statement>Copyright ©2012 Pellegrini et al.; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2012</copyright-year>
<copyright-holder>Pellegrini et al.; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<license-p>This is an open access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.biomedcentral.com/bmcbioinformatics/supplements/13/S4/S3"></self-uri>
<abstract>
<sec>
<title>Background</title>
<p>Tandem repeats are multiple duplications of substrings in the DNA that occur contiguously, or at a short distance, and may involve some mutations (such as substitutions, insertions, and deletions). Tandem repeats have also been studied extensively for their association with the class of repeat expansion diseases (mostly affecting the nervous system). Comparative studies on the output of different tools for finding tandem repeats have highlighted significant differences among the sets of detected tandem repeats, and many authors have pointed out how critical the right choice of parameters is.</p>
</sec>
<sec>
<title>Results</title>
<p>In this paper we present
<italic>TReaDS - Tandem Repeats Discovery Service</italic>
, a
<italic>tandem repeat meta search engine</italic>
.
<italic>TReaDS </italic>
forwards user requests to several state-of-the-art tools for finding tandem repeats and merges their output into a single report, providing a global, synthetic, and comparative view of the results. In particular,
<italic>TReaDS </italic>
allows the user to (
<italic>i</italic>
) simultaneously run different algorithms on the same data set, (
<italic>ii</italic>
) choose for each algorithm a different setting of parameters, and (
<italic>iii</italic>
) obtain a report that can be downloaded for further, off-line, investigations. We used
<italic>TReaDS </italic>
to investigate sequences associated with repeat expansion diseases.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>Using
<italic>TReaDS </italic>
we find that, for 27 of the 29 currently known repeat expansion diseases,
<italic>long fuzzy tandem repeats </italic>
cover the expansion loci. Tests with control sets confirm the specificity of this association. This finding suggests that long fuzzy tandem repeats may be a new class of cis-acting elements involved in the mechanisms leading to expansion instability.</p>
<p>We strongly believe that biologists will be interested in a tool that not only lets them run multiple search algorithms at the same time, with the same effort needed to use just one of the systems, but also eases the burden of comparing and merging the results, thus expanding our ability to detect important phenomena related to tandem repeats.</p>
</sec>
</abstract>
<conference>
<conf-date>20-22 June 2011</conf-date>
<conf-name>Eighth Annual Meeting of the Italian Society of Bioinformatics (BITS)</conf-name>
<conf-loc>Pisa, Italy</conf-loc>
</conference>
</article-meta>
</front>
<body>
<sec>
<title>Background</title>
<sec>
<title>Overview on repeat expansion diseases</title>
<p>At present 29 diseases are classified as
<italic>repeat expansion diseases </italic>
(RE) [
<xref ref-type="bibr" rid="B1">1</xref>
-
<xref ref-type="bibr" rid="B3">3</xref>
], and the number is growing. These are mostly neurodegenerative and neuromuscular disorders, including Huntington disease (HD), Kennedy disease (SBMA), and several types of spinocerebellar ataxia (SCA). Since until recently all known cases involved the repetition of a 3-nucleotide motif, this class was also denoted
<italic>trinucleotide repeat </italic>
(TNR)
<italic>expansion disease</italic>
. However, cases of repeating units with 4, 5 and 12 nucleotides have been discovered, so we speak more generally of repeat expansion diseases. Recent surveys devoted to DNA repeats [
<xref ref-type="bibr" rid="B4">4</xref>
] include extended discussions of repeat expansion disorders, while surveys specific to repeat expansion diseases can be found in [
<xref ref-type="bibr" rid="B5">5</xref>
-
<xref ref-type="bibr" rid="B7">7</xref>
].</p>
<p>The locus of expansion can be located in various regions of the resident gene: in the coding sequences, in the 5'-untranslated region (5'-UTR), in the 3'-untranslated region (3'-UTR), in introns and in promoter regions. Two main questions are related to the study of these diseases from a genetic point of view: (a) which mechanisms or conditions lead to the repeat expansion? and (b) how do repeat expansions result in diseases?</p>
<p>Only a small fraction of all the tandem repeats found in the human genome expand and result in a disease. Researchers have therefore tried to identify which unusual structural features favor such expansion, and have found that a propensity to form hairpins (or other structures, such as quadruplex-like structures, H-DNA and sticky DNA) is a key mechanism leading to expansion. Several studies have also tried to identify cis-regulating elements that favor the onset of these structural features and of the expansion. Our study falls in this category and proposes
<italic>long fuzzy tandem repeats </italic>
as a novel cis-regulating element for repeat expansion, thus contributing to investigating question (a).</p>
</sec>
<sec>
<title>Cis-acting factors for TNR instability</title>
<p>Several papers tackle the problem of determining cis-acting factors associated with loci of TNR instability. In particular, one well-studied factor is the proximity and orientation of DNA replication initiation regions (IR) with respect to the TNR instability locus [
<xref ref-type="bibr" rid="B8">8</xref>
,
<xref ref-type="bibr" rid="B9">9</xref>
]. In [
<xref ref-type="bibr" rid="B8">8</xref>
] the position of the DNA replication initiation region for three TNR disease loci (HD, SCA7, and SBMA) is analyzed, and a correlation pattern is proposed. The role of the regions flanking the expansion locus (EL) has also been analyzed in the literature. For example, close proximity of the TNR locus to CpG-rich regions has been noticed in some cases (10 diseases) [
<xref ref-type="bibr" rid="B10">10</xref>
]. The presence of a CTCF transcription factor binding site (TFBS) has been discovered in the flanking region for SCA7 [
<xref ref-type="bibr" rid="B11">11</xref>
]. An association between HD and a haplogroup (with SNPs not necessarily in the flanking sequences) is described in [
<xref ref-type="bibr" rid="B12">12</xref>
]. Note that such studies identify cis-acting factors relevant only for a few RE diseases.</p>
<p>Fuzzy tandem repeats (FTRs) have recently been proposed as a new genomic feature worthy of study [
<xref ref-type="bibr" rid="B13">13</xref>
,
<xref ref-type="bibr" rid="B14">14</xref>
]. Informally, FTRs are tandem repeats with high divergence (30-40%) between the repeating units and the consensus motif. To the best of our knowledge, the hypothesis that fuzzy TRs can act as cis-elements for human diseases has not yet been explored in the literature. Interestingly, we have found FTRs in almost all the RE diseases, independently of the specific repeating motif, coding/non-coding characterization, etc. Thus FTRs may be seen as a "generic" cis-acting factor that may in particular cases interact with other cis-acting factors specific to the single protein/disease.</p>
<p>Analysis of TNR instability has also been conducted in other model species, e.g.
<italic>Saccharomyces cerevisiae </italic>
[
<xref ref-type="bibr" rid="B15">15</xref>
] and
<italic>Escherichia coli </italic>
[
<xref ref-type="bibr" rid="B16">16</xref>
].</p>
</sec>
<sec>
<title>Role of hairpins</title>
<p>In several cases it has been noticed that the TNR RNA coding sequences tend to form hairpin structures [
<xref ref-type="bibr" rid="B17">17</xref>
-
<xref ref-type="bibr" rid="B19">19</xref>
] or RNA-DNA hybrids such as R-loops [
<xref ref-type="bibr" rid="B20">20</xref>
]. This is relevant in particular for TNRs located in the transcribed sections of DNA. These results on hairpins are obtained via experiments
<italic>in vitro</italic>
, usually involving a relatively short repeating sequence (a trinucleotide unit repeated 16 or 17 times) and a promoter sequence. In these experiments the role of the native flanking regions is factored out or in some cases different (non-native) flanking sequences are used. Evidence of hairpin formation with the natural flanking sequence for SCA3, SCA6 and Dentatorubropallidoluysian atrophy (DRPLA) is reported in [
<xref ref-type="bibr" rid="B21">21</xref>
]. Note, however, that although hairpin formation is an important mechanism for explaining trinucleotide instability, one cannot infer the presence of an FTR just from the tendency to form hairpin (or other) RNA structures in vitro. The relationship between FTRs and hairpin formation is at the moment unclear and is an open area for future research; at this stage we are interested in establishing FTRs as a potential cis-regulatory element, rather than in exploring the precise mechanisms of action.</p>
</sec>
<sec>
<title>PolyQ repeats</title>
<p>For the subfamily of nine polyQ repeat diseases the corresponding polyglutamine peptide has been studied in some detail [
<xref ref-type="bibr" rid="B22">22</xref>
,
<xref ref-type="bibr" rid="B23">23</xref>
]. Such studies are important for determining the toxicity mechanism of the mutant proteins; however, they explain the onset of the disease only after the expansion at the DNA locus has occurred. In particular, the pathogenic length of the polyQ chain is a specific trait of each disease. A list of such diseases is reported in Table
<xref ref-type="table" rid="T1">1</xref>
.</p>
<table-wrap id="T1" position="float">
<label>Table 1</label>
<caption>
<p>Table of polyglutamine diseases.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Disease code</th>
<th align="left">Disease name</th>
<th align="left">Gene code</th>
<th align="left">Normal repeats</th>
<th align="left">Pathogenic repeats</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">DRPLA</td>
<td align="left">Dentatorubropallidoluysian atrophy</td>
<td align="left">ATN1</td>
<td align="left">6 - 35</td>
<td align="left">49 - 88</td>
</tr>
<tr>
<td align="left">HD</td>
<td align="left">Huntington's disease</td>
<td align="left">HTT (Huntingtin)</td>
<td align="left">10 - 35</td>
<td align="left">35+</td>
</tr>
<tr>
<td align="left">SBMA</td>
<td align="left">Kennedy disease (Spinobulbar muscular atrophy)</td>
<td align="left">HS-AR</td>
<td align="left">9 - 36</td>
<td align="left">38 - 62</td>
</tr>
<tr>
<td align="left">SCA1</td>
<td align="left">Spinocerebellar ataxia Type 1</td>
<td align="left">ATXN1</td>
<td align="left">6 - 35</td>
<td align="left">49 - 88</td>
</tr>
<tr>
<td align="left">SCA2</td>
<td align="left">Spinocerebellar ataxia Type 2</td>
<td align="left">ATXN2</td>
<td align="left">14 - 32</td>
<td align="left">33 - 77</td>
</tr>
<tr>
<td align="left">SCA6</td>
<td align="left">Spinocerebellar ataxia Type 6</td>
<td align="left">CACNA1A</td>
<td align="left">4 - 18</td>
<td align="left">21 - 30</td>
</tr>
<tr>
<td align="left">SCA7</td>
<td align="left">Spinocerebellar ataxia Type 7</td>
<td align="left">ATXN7</td>
<td align="left">7 - 17</td>
<td align="left">38 - 120</td>
</tr>
<tr>
<td align="left">SCA17</td>
<td align="left">Spinocerebellar ataxia Type 17</td>
<td align="left">TBP</td>
<td align="left">25 - 42</td>
<td align="left">47 - 63</td>
</tr>
<tr>
<td align="left">SCA3</td>
<td align="left">Machado-Joseph disease (Spinocerebellar ataxia Type 3)</td>
<td align="left">ATXN3</td>
<td align="left">12 - 40</td>
<td align="left">55 - 86</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Table of polyglutamine diseases. The table reports: disease code and full name, associated gene, ranges of healthy and pathogenic repeat numbers.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>PolyA repeats</title>
<p>A second class of repeat expansion diseases involves repetitions of the imperfect GCN triplets that encode the amino acid alanine. Such REs are characterized by relatively low copy numbers (both in the normal and expanded states). In addition, the expanded polyA repeats are stable in both somatic and intergenerational transmission, unlike polyQ repeat expansions. A list of such diseases is reported in Table
<xref ref-type="table" rid="T2">2</xref>
.</p>
<table-wrap id="T2" position="float">
<label>Table 2</label>
<caption>
<p>Table of polyalanine diseases.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Disease code</th>
<th align="left">Disease name</th>
<th align="left">Gene code</th>
<th align="left">Normal repeats</th>
<th align="left">Pathogenic repeats</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">BPES</td>
<td align="left">Blepharophimosis-ptosis-epicanthus inversus syndrome</td>
<td align="left">FOXL2</td>
<td align="left">14</td>
<td align="left">19-24</td>
</tr>
<tr>
<td align="left">HPE5</td>
<td align="left">Holoprosencephaly 5</td>
<td align="left">ZIC2</td>
<td align="left">15</td>
<td align="left">25</td>
</tr>
<tr>
<td align="left">CCHS</td>
<td align="left">Congenital failure of autonomic control</td>
<td align="left">PHOX2B</td>
<td align="left">20</td>
<td align="left">25-33</td>
</tr>
<tr>
<td align="left">ISSX</td>
<td align="left">X-linked infantile spasm syndrome</td>
<td align="left">ARX</td>
<td align="left">16</td>
<td align="left">27</td>
</tr>
<tr>
<td align="left">MRGH</td>
<td align="left">X-linked mental retardation with isolated growth hormone deficiency</td>
<td align="left">SOX3</td>
<td align="left">15</td>
<td align="left">22-26</td>
</tr>
<tr>
<td align="left">CCD</td>
<td align="left">Cleidocranial dysplasia</td>
<td align="left">RUNX2</td>
<td align="left">17</td>
<td align="left">27</td>
</tr>
<tr>
<td align="left">HFGS</td>
<td align="left">Hand-foot-genital syndrome</td>
<td align="left">HOXA13</td>
<td align="left">18</td>
<td align="left">24-26</td>
</tr>
<tr>
<td align="left">SPD1</td>
<td align="left">Synpolydactyly 1</td>
<td align="left">HOXD13</td>
<td align="left">15</td>
<td align="left">22-29</td>
</tr>
<tr>
<td align="left">OPMD</td>
<td align="left">Oculopharyngeal muscular dystrophy</td>
<td align="left">PABPN1</td>
<td align="left">10</td>
<td align="left">11-17</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Table of polyalanine diseases. The table reports: disease code and full name, associated gene, ranges of healthy and pathogenic repeat numbers.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>Non-polyQ and non-polyA repeats</title>
<p>Non-polyQ and non-polyA expanding repeats may have motifs of length 3, 4, 5, and 12. They may be located in several sections of the gene sequence. A list of such diseases is reported in Table
<xref ref-type="table" rid="T3">3</xref>
.</p>
<table-wrap id="T3" position="float">
<label>Table 3</label>
<caption>
<p>Table of non-polyglutamine, non-polyalanine diseases.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Disease code</th>
<th align="left">Disease name</th>
<th align="left">Gene</th>
<th align="left">Motif</th>
<th align="left">Location</th>
<th align="left">Normal repeats</th>
<th align="left">Pathogenic repeats</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">FRAXA</td>
<td align="left">Fragile X syndrome</td>
<td align="left">FMR1</td>
<td align="left">CGG</td>
<td align="left">5'-UTR</td>
<td align="left">6 - 53</td>
<td align="left">230+</td>
</tr>
<tr>
<td align="left">FXTAS</td>
<td align="left">Fragile X-associated tremor/ataxia syndrome</td>
<td align="left">FMR1</td>
<td align="left">CGG</td>
<td align="left">5'-UTR</td>
<td align="left">6 - 53</td>
<td align="left">55-200</td>
</tr>
<tr>
<td align="left">FRAXE</td>
<td align="left">Fragile XE mental retardation</td>
<td align="left">AFF2</td>
<td align="left">GCC</td>
<td align="left">5'-UTR</td>
<td align="left">6 - 35</td>
<td align="left">200+</td>
</tr>
<tr>
<td align="left">FRDA</td>
<td align="left">Friedreich's ataxia</td>
<td align="left">FXN</td>
<td align="left">GAA</td>
<td align="left">Intr.</td>
<td align="left">7 - 34</td>
<td align="left">100+</td>
</tr>
<tr>
<td align="left">DM1</td>
<td align="left">Myotonic dystrophy type 1</td>
<td align="left">DMPK</td>
<td align="left">CTG</td>
<td align="left">3'-UTR</td>
<td align="left">5 - 37</td>
<td align="left">50+</td>
</tr>
<tr>
<td align="left">DM2</td>
<td align="left">Myotonic dystrophy type 2</td>
<td align="left">ZNF9</td>
<td align="left">CCTG</td>
<td align="left">Intr.</td>
<td align="left">27-</td>
<td align="left">75+</td>
</tr>
<tr>
<td align="left">SCA10</td>
<td align="left">Spinocerebellar ataxia Type 10</td>
<td align="left">ATXN10</td>
<td align="left">ATTCT</td>
<td align="left">Intr.</td>
<td align="left">10-29</td>
<td align="left">280+</td>
</tr>
<tr>
<td align="left">SCA12</td>
<td align="left">Spinocerebellar ataxia Type 12</td>
<td align="left">PPP2R2B</td>
<td align="left">CAG</td>
<td align="left">5'-UTR</td>
<td align="left">7 - 28</td>
<td align="left">66 - 78</td>
</tr>
<tr>
<td align="left">EPM1</td>
<td align="left">Progressive myoclonus epilepsy</td>
<td align="left">CSTB</td>
<td align="left">(
<italic>C</italic>
)
<sub>4</sub>
<italic>G</italic>
(
<italic>C</italic>
)
<sub>4</sub>
<italic>GCG</italic>
</td>
<td align="left">Prom.</td>
<td align="left">2-3</td>
<td align="left">60+</td>
</tr>
<tr>
<td align="left">HDL-2</td>
<td align="left">Huntington disease-like 2</td>
<td align="left">JPH3</td>
<td align="left">CAG/CTG</td>
<td align="left">3'-UTR</td>
<td align="left">66-</td>
<td align="left">66+</td>
</tr>
<tr>
<td align="left">SCA8</td>
<td align="left">Spinocerebellar ataxia Type 8</td>
<td align="left">ATXN8OS</td>
<td align="left">CTG</td>
<td align="left">3'-UTR</td>
<td align="left">16 - 37</td>
<td align="left">110 - 250</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Table of non-polyglutamine and non-polyalanine diseases. The table reports: disease code and full name, associated gene, repeating unit, genic region, ranges of healthy and pathogenic repeat numbers.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>Fuzzy tandem repeats as potential cis-regulatory elements in repeat expansion disorders</title>
<p>In [
<xref ref-type="bibr" rid="B14">14</xref>
] we noticed that the locus associated with the unstable trinucleotide repeat in the Frataxin protein mRNA coding sequence (whose abnormal expansion is the cause of Friedreich's ataxia) was included in a much longer fuzzy TR, detected using the proposed TRStalker system.</p>
<p>The present research originated from the hypothesis that this fact (a long fuzzy TR covering the unstable locus) could be observed in a large number of trinucleotide repeat disorders. Consequently, FTRs could be exposed as a novel cis-regulatory element not yet studied in the literature.</p>
<p>We employ the tool
<italic>TReaDS </italic>
in order to quickly collect and organize the output of several TR finding algorithms into a single, easy-to-read report in support of this hypothesis.</p>
</sec>
<sec>
<title>Tools for finding tandem repeats</title>
<p>Tandem repeats (TRs) of different forms (satellites, microsatellites, minisatellites) have been studied extensively because of their role in several biological processes. In fact, TRs are privileged targets in activities such as fingerprinting or tracing the evolution of populations [
<xref ref-type="bibr" rid="B24">24</xref>
,
<xref ref-type="bibr" rid="B25">25</xref>
]; several diseases, disorders and addictive behaviors are linked to specific TR loci [
<xref ref-type="bibr" rid="B26">26</xref>
]; the role of TRs has also been studied within coding regions [
<xref ref-type="bibr" rid="B27">27</xref>
] and in relation to gene functions [
<xref ref-type="bibr" rid="B28">28</xref>
].</p>
<p>The scope and depth of the research on TRs have been boosted by the availability of efficient non-trivial algorithms for finding TRs, even when mutations occur with non-negligible probability. Tandem Repeat Finder (TRF) [
<xref ref-type="bibr" rid="B29">29</xref>
], CRISPRFinder [
<xref ref-type="bibr" rid="B30">30</xref>
], mreps [
<xref ref-type="bibr" rid="B31">31</xref>
], Reputer [
<xref ref-type="bibr" rid="B32">32</xref>
], Approximate Tandem Repeat Hunter (ATRHunter) [
<xref ref-type="bibr" rid="B33">33</xref>
], TandemSWAN [
<xref ref-type="bibr" rid="B13">13</xref>
], and Tread [
<xref ref-type="bibr" rid="B34">34</xref>
] are some examples of currently operational systems that can be accessed via a web interface.</p>
<p>Comparative studies [
<xref ref-type="bibr" rid="B13">13</xref>
,
<xref ref-type="bibr" rid="B35">35</xref>
], for the case of short TRs with high percentages of substitutions, report significant differences among the sets of TRs that can be detected by using different tools. Moreover, in [
<xref ref-type="bibr" rid="B35">35</xref>
] the critical importance of the choice of parameters is highlighted. Thus, biologists could benefit greatly from a tool that lets them query multiple systems simultaneously and obtain a global, comparative and synthetic view of the results, with the same effort needed to use just one of the systems.</p>
<p>In this paper we present
<italic>TReaDS - Tandem Repeats Discovery Service</italic>
, a
<italic>TRs meta search engine </italic>
that forwards user requests to different tandem repeat finding services and aggregates the results. In more detail,
<italic>TReaDS </italic>
allows the user to (
<italic>i</italic>
) simultaneously run different algorithms on the same data set, (
<italic>ii</italic>
) choose manually, for each algorithm, a different parameter setting, or express her/his request in a simple and concise way (exact or approximate, short or long TRs), delegating to
<italic>TReaDS </italic>
the burden of choosing the right parameters for all the systems, and (
<italic>iii</italic>
) get back a report that can be downloaded for further, off-line, investigations.</p>
<p>
<italic>TReaDS </italic>
is currently interfaced with five services based on different algorithmic principles and techniques, so using them jointly is likely to lead to increased precision. In order to improve the quality of service
<italic>TReaDS </italic>
offers to its users, we plan to add to
<italic>TReaDS </italic>
other existing systems, as well as new ones as they become available.</p>
</sec>
</sec>
<sec sec-type="methods">
<title>Methods</title>
<p>
<italic>TReaDS </italic>
is a web application developed entirely with Java-based technologies. In particular, a pool of Servlets handles user requests (file upload, parameter settings, search) and collects the results generated by the systems involved in the query.
<italic>TReaDS </italic>
merges the results received from the external services and produces the final report with the support of the publicly available JasperReports libraries [
<xref ref-type="bibr" rid="B36">36</xref>
]. On the client side there is no special requirement: just a standard browser and a viewer (suitable for the report format selected by the user).</p>
<p>
<italic>TReaDS </italic>
has the proper structure of a meta search engine, with options for changing the set of parameters of each algorithm, and for choosing the output format. The publicly available web tools for finding tandem repeats currently supported by
<italic>TReaDS </italic>
are: ATRHunter [
<xref ref-type="bibr" rid="B37">37</xref>
], mreps [
<xref ref-type="bibr" rid="B38">38</xref>
], TandemSWAN [
<xref ref-type="bibr" rid="B39">39</xref>
], and TRF [
<xref ref-type="bibr" rid="B40">40</xref>
].
<italic>TReaDS </italic>
is interfaced with the versions of these tools available on-line. Note that binary versions of these systems can also be downloaded and, in some cases, there are small differences between the web-based and the downloadable versions, especially in the number of parameters that can be customized. Furthermore
<italic>TReaDS </italic>
supports TRStalker [
<xref ref-type="bibr" rid="B14">14</xref>
], an algorithm developed by our team aimed at finding long fuzzy TRs under weighted edit distance.</p>
<sec>
<title>TReaDS input/output</title>
<p>The main page of
<italic>TReaDS </italic>
is essentially composed of four sections: (1)
<italic>Algorithms</italic>
, (2)
<italic>Parameter Settings</italic>
, (3)
<italic>Report</italic>
, and (4)
<italic>Sequence </italic>
(see Figure
<xref ref-type="fig" rid="F1">1</xref>
).</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption>
<p>
<bold>TReaDS: main page of the graphical user interface</bold>
. The main page of the graphical user interface allows setting the input parameters. This page has sub-sections for: algorithm selection, parameters of the tandem repeats to be reported, style of the output report, and input sequence.</p>
</caption>
<graphic xlink:href="1471-2105-13-S4-S3-1"></graphic>
</fig>
<p>In the
<bold>Algorithms </bold>
section it is possible to choose any combination of the supported systems.</p>
<p>In the
<bold>Parameter Settings </bold>
section
<italic>TReaDS </italic>
provides two ways to set the parameters for the chosen systems: (
<italic>i</italic>
) the
<italic>simple mode</italic>
, where it is possible to specify the kind of TRs to look for, by setting the minimum and maximum motif length, the minimum exponent (i.e. the number of repetitions), and the maximum percentages of allowed substitutions and in/dels (insertions and deletions); (
<italic>ii</italic>
) the
<italic>advanced mode</italic>
, where the user can run each system with manually selected parameters for fine-grained control over the settings.</p>
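<p>The simple mode can be pictured as one shared set of constraints applied to every reported TR. The following is a minimal sketch in Python, not the TReaDS implementation (which is Java-based and whose mapping onto each tool's own parameters is not described here); the class and field names are hypothetical, and the example values mirror those reported in Table 4.</p>
<preformat>
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TandemRepeat:
    # Hypothetical record for one reported TR; field names are illustrative.
    motif_length: int
    exponent: float          # number of repetitions
    substitution_pct: float  # observed substitutions, in percent
    indel_pct: float         # observed insertions/deletions, in percent


@dataclass(frozen=True)
class SimpleModeQuery:
    # The five simple-mode constraints named in the text.
    min_motif_length: int
    max_motif_length: Optional[int]  # None stands for "unlimited"
    min_exponent: float
    max_substitution_pct: float
    max_indel_pct: float

    def accepts(self, tr: TandemRepeat) -> bool:
        """Return True when the TR satisfies every constraint."""
        if self.min_motif_length > tr.motif_length:
            return False
        if self.max_motif_length is not None and tr.motif_length > self.max_motif_length:
            return False
        if self.min_exponent > tr.exponent:
            return False
        if tr.substitution_pct > self.max_substitution_pct:
            return False
        return self.max_indel_pct >= tr.indel_pct


# Example query: motif length at least 10, no upper bound, at least
# 2 repetitions, at most 20% substitutions and 20% indels.
query = SimpleModeQuery(10, None, 2, 20.0, 20.0)
```
</preformat>
<p>With this query, a TR with a 12-nt motif repeated 3.5 times with 15% substitutions and 5% indels is accepted, whereas a perfect trinucleotide repeat is rejected because its motif is shorter than 10 nt.</p>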
<p>In the
<bold>Report </bold>
section the user:</p>
<p>1. decides whether the final report should include a graphical visualization of the TRs found;</p>
<p>2. chooses whether the input sequence (or a part of it) is included in the final report;</p>
<p>3. sets the length of the
<italic>flanking sequence</italic>
; and</p>
<p>4. chooses the final report format among the available ones: HTML, Excel, PDF, and RTF.</p>
<p>In the
<bold>Sequence </bold>
section it is possible to submit a sequence as a file, or to paste it into a text area; furthermore, the user can choose whether the whole sequence or just a part of it is analyzed.
<italic>TReaDS </italic>
takes as input a genomic sequence in either FASTA or plain text format. The size limit for an input sequence corresponds to the present limit of ATRHunter: 2 Mbp.</p>
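<p>As an illustration of this input handling, the sketch below (in Python, and only an assumption about how such a step could look, not the TReaDS code) accepts either FASTA or plain text, drops any FASTA header lines, and enforces the 2 Mbp limit. The function name is hypothetical, and a single-record input is assumed.</p>
<preformat>
```python
MAX_LENGTH_BP = 2_000_000  # the 2 Mbp limit quoted in the text


def read_sequence(text: str) -> str:
    """Return the bases from FASTA or plain-text input as one upper-case string."""
    lines = [line.strip() for line in text.splitlines()]
    # FASTA header lines start with '>'; plain-text input simply has none.
    bases = "".join(line for line in lines if line and not line.startswith(">"))
    bases = bases.upper()
    if len(bases) > MAX_LENGTH_BP:
        raise ValueError("sequence exceeds the 2 Mbp input limit")
    return bases
```
</preformat>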
<p>The user can decide to wait on-line for the results or to receive them via email by providing a valid email address.</p>
<p>Once the responses coming from the TR finding services have been received,
<italic>TReaDS </italic>
merges the results and produces a report containing the following sub-reports:</p>
<p>
<bold>Sequence sub-report</bold>
. The sequence sub-report contains the sequence, if requested, and some information such as length and distribution of the different bases (see Figure
<xref ref-type="fig" rid="F2">2</xref>
).</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption>
<p>
<bold>TReaDS: example of the sequence sub-report</bold>
. The sequence sub-report of the output report gives basic statistics on the nucleotide frequencies of the input sequence and the total length of the clusters of tandem repeats reported.</p>
</caption>
<graphic xlink:href="1471-2105-13-S4-S3-2"></graphic>
</fig>
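<p>The basic statistics of the sequence sub-report (length and distribution of the bases) can be computed in a few lines. The Python sketch below is illustrative only; the function name and the shape of the returned dictionary are assumptions, not the format used by TReaDS.</p>
<preformat>
```python
from collections import Counter


def base_statistics(sequence: str) -> dict:
    """Return the length, per-base counts and per-base fractions of a sequence."""
    counts = Counter(sequence.upper())
    total = len(sequence)
    # Fractions are left empty for an empty sequence to avoid dividing by zero.
    fractions = {base: counts[base] / total for base in sorted(counts)} if total else {}
    return {"length": total, "counts": dict(counts), "fractions": fractions}
```
</preformat>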
<p>
<bold>Summary sub-report</bold>
. The summary sub-report contains, for each system involved in the query, the algorithm name, the number of TRs found, whether the connection has been successful (if not, the type of error encountered is reported), and the response time. A chart comparing the systems is also provided (the comparison is simply based on the number of TRs found) (see Figure
<xref ref-type="fig" rid="F3">3</xref>
).</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption>
<p>
<bold>TReaDS: example of the summary sub-report</bold>
. The summary sub-report of the output report gives the termination codes for each algorithm and basic statistics on the number of tandem repeats detected by each algorithm.</p>
</caption>
<graphic xlink:href="1471-2105-13-S4-S3-3"></graphic>
</fig>
<p>
<bold>Algorithm sub-reports</bold>
. There is one algorithm sub-report for each system included in the search process (see, for instance, Figure
<xref ref-type="fig" rid="F2">2</xref>
). It contains the details of the parameters used and the list of the TRs found by the specific algorithm, including their initial position, length, number of repetitions, and consensus. In the case of
<italic>advanced mode </italic>
search the parameters are those the user set for the given algorithm, while in the case of
<italic>simple mode </italic>
search the global parameters given as input are reported (see Figure
<xref ref-type="fig" rid="F4">4</xref>
).</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption>
<p>
<bold>TReaDS: example of an algorithm sub-report</bold>
. The algorithm sub-report of the output report lists separately the tandem repeats found by each algorithm and their basic features.</p>
</caption>
<graphic xlink:href="1471-2105-13-S4-S3-4"></graphic>
</fig>
<p>
<bold>Clusters sub-report</bold>
.
<italic>TReaDS </italic>
merges the results of all algorithms to give a global view of them by identifying overlapping TRs. Two TRs overlap if they share one or more positions in the sequence. The overlap relation itself is not transitive, but its transitive closure is an equivalence relation, which allows us to partition the TRs found into groups that we call
<italic>clusters</italic>
. Such clusters are reported in the
<italic>clusters sub-report </italic>
(see Figure
<xref ref-type="fig" rid="F5">5</xref>
). Graphically, a cluster covers a contiguous segment of the input sequence without gaps. The report contains a list of all
<italic>clusters </italic>
found. For each cluster the following information is included: flanking sequence (if requested), starting and ending positions of the covered segment, list of TRs that form the cluster, and some details for each TR (starting and ending position, length, number of repetitions, consensus). If the user has chosen to include the images in the final report, it is also possible to view each cluster in a graphical form (see Figure
<xref ref-type="fig" rid="F6">6</xref>
).</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption>
<p>
<bold>Example of a cluster returned by TReaDS</bold>
. The clusters sub-report of the output report lists all tandem repeats organized in clusters of overlapping tandem repeats. For each cluster, its beginning and end positions and its constituent tandem repeats are reported.</p>
</caption>
<graphic xlink:href="1471-2105-13-S4-S3-5"></graphic>
</fig>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption>
<p>
<bold>A cluster returned in graphical format</bold>
. Example of a cluster returned by
<italic>TReaDS</italic>
: the cluster is shown in graphical form, with the original sequence underlined at the position of each TR found, in a different color for each queried algorithm that returned the TR.</p>
</caption>
<graphic xlink:href="1471-2105-13-S4-S3-6"></graphic>
</fig>
</sec>
</sec>
<sec>
<title>Results</title>
<sec>
<title>Experimental methodology</title>
<p>The relevant sequences have been downloaded from PubMed (see NCBI codes in Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
) and the position of the expansion locus has been identified from the relevant literature for the target disease. Sequences of up to 10000 nt have been analyzed in full. For longer sequences, a sub-sequence spanning -5000 to +5000 nt around the expansion locus has been analyzed. The tool
<italic>TReaDS </italic>
has been configured with 5 algorithms; the parameter settings are reported in table
<xref ref-type="table" rid="T4">4</xref>
.</p>
<table-wrap id="T4" position="float">
<label>Table 4</label>
<caption>
<p>Parameters for TReaDS used in the experiments.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Parameter name</th>
<th align="left">Value</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Minimum motif length</td>
<td align="left">10</td>
</tr>
<tr>
<td align="left">Maximum motif length</td>
<td align="left">unlimited</td>
</tr>
<tr>
<td align="left">Minimum repeat number</td>
<td align="left">2</td>
</tr>
<tr>
<td align="left">Maximum substitution</td>
<td align="left">20%</td>
</tr>
<tr>
<td align="left">Maximum indel</td>
<td align="left">20%</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Parameters for
<italic>TReaDS </italic>
used in the experiments.</p>
</table-wrap-foot>
</table-wrap>
<p>First, we run
<italic>TReaDS </italic>
and inspect its output to identify the longest TR covering the expansion locus. In a second phase, for each analyzed sequence, the algorithm that found a covering FTR is tuned to look for a better fuzzy TR (one with a longer motif and a lower error level), while minimizing the measure of the union of fuzzy TRs of the same type in that sequence.</p>
<p>In most cases a single covering FTR has been found. In one case (SCA10) two partially overlapping FTRs cover the expansion locus. The FTRs found have a copy number roughly between 2 and 3 in most cases. In principle, an FTR containing an EL may arise from a large self-overlap of the EL segment within the FTR, so we need to show that such self-overlap does not influence our data. A simple consideration based on the ratio of the lengths of the FTR and EL segments implies that no self-overlap can occur when the ratio is greater than or equal to 2. For a ratio of 1.8, the overlap can be at most of the order of 10% of the length of the EL.</p>
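<p>One possible reading of this length-ratio argument can be sketched as follows. The sketch assumes, as a simplification that is not stated in the text, that the FTR consists of exactly two copies of its motif, so that its period is half the FTR length; an EL shifted by one period then overlaps itself only by the amount by which the EL is longer than the period. The function name is hypothetical.</p>
<preformat>
```python
def max_self_overlap_fraction(ftr_to_el_ratio: float) -> float:
    """Upper bound on the fraction of the EL overlapping its shifted copy.

    Assumption (not from the paper): the FTR has exactly two copies, so
    period = ftr_len / 2 = (ratio / 2) * el_len. The EL shifted by one
    period overlaps itself by el_len - period whenever that is positive.
    """
    period_in_el_units = ftr_to_el_ratio / 2.0
    return max(0.0, 1.0 - period_in_el_units)


for ratio in (2.0, 1.8):
    print(ratio, round(max_self_overlap_fraction(ratio), 3))
# 2.0 0.0
# 1.8 0.1
```
</preformat>
<p>Under this assumption the bound is zero for any ratio of 2 or more and about 10% of the EL length for a ratio of 1.8, which agrees with the two figures quoted above; a different copy number would give a different bound.</p>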
<p>We also measure the total length of the regions of the sequence covered by FTRs of the same type (same motif length or longer, and same percentage of error) as the one identified as covering the expansion locus. The ratio of this length to the length of the sequence gives a conservative estimate of the probability that a randomly chosen position in the sequence is covered by an FTR of the type considered. This probability is quite small for almost all of the sequences; its average over all the sequences associated with repeat expansion diseases is 0.12.</p>
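<p>The covered length is the measure of the union of the FTR intervals, so that positions covered by several FTRs are counted once. A minimal sketch of that computation is given below; the function names, the inclusive (start, end) convention and the example intervals are assumptions made for illustration and are not taken from the experiments.</p>
<preformat>
```python
def covered_length(intervals):
    """Number of positions covered by the union of inclusive intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # The previous merged run is finished: add its length.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlaps the current run: extend it if needed.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def coverage_fraction(intervals, seq_len):
    """Estimated probability that a random position lies inside an FTR."""
    return covered_length(intervals) / seq_len


# Hypothetical FTR positions on a sequence of 1000 nt.
ftrs = [(1, 100), (51, 150), (301, 350)]
print(covered_length(ftrs), coverage_fraction(ftrs, 1000))
# 200 0.2
```
</preformat>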
</sec>
<sec>
<title>Experiments with repeat expansion sequences</title>
<p>The list of the major diseases due to repeat expansion is taken from [
<xref ref-type="bibr" rid="B2">2</xref>
,
<xref ref-type="bibr" rid="B3">3</xref>
].</p>
<p>An important subfamily is that of the polyglutamine (polyQ) diseases, so called because the repeated triplet motif is the codon CAG, in a coding region, which encodes the amino acid glutamine (Q) (see table
<xref ref-type="table" rid="T1">1</xref>
and
<xref ref-type="table" rid="T5">5</xref>
). A second subfamily is that of the polyalanine (polyA) expansion diseases, where the expanding motif is formed by GCN triplets (see table
<xref ref-type="table" rid="T2">2</xref>
and
<xref ref-type="table" rid="T6">6</xref>
). Other diseases are classified as non-polyQ and non-polyA and are listed in tables
<xref ref-type="table" rid="T3">3</xref>
and
<xref ref-type="table" rid="T7">7</xref>
.</p>
<table-wrap id="T5" position="float">
<label>Table 5</label>
<caption>
<p>Table of fuzzy tandem repeats for polyQ TR.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Gene code</th>
<th align="left">Seq length</th>
<th align="left">Cover</th>
<th align="left">TR-beg</th>
<th align="left">TR-end</th>
<th align="left">FTR-beg</th>
<th align="left">FTR-end</th>
<th align="left">FTR/TR</th>
<th align="left">Cover/Length</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">ATN1</td>
<td align="left">4367</td>
<td align="left">206</td>
<td align="left">1687</td>
<td align="left">1743</td>
<td align="left">1646</td>
<td align="left">1751</td>
<td align="left">1.875</td>
<td align="left">0.047</td>
</tr>
<tr>
<td align="left">HTT</td>
<td align="left">13481</td>
<td align="left">196</td>
<td align="left">197</td>
<td align="left">265</td>
<td align="left">196</td>
<td align="left">367</td>
<td align="left">2.514</td>
<td align="left">0.014</td>
</tr>
<tr>
<td align="left">HS-AR</td>
<td align="left">4314</td>
<td align="left">377</td>
<td align="left">1286</td>
<td align="left">1354</td>
<td align="left">1224</td>
<td align="left">1391</td>
<td align="left">2.455</td>
<td align="left">0.087</td>
</tr>
<tr>
<td align="left">ATXN1</td>
<td align="left">10636</td>
<td align="left">4237</td>
<td align="left">1560</td>
<td align="left">1646</td>
<td align="left">1500</td>
<td align="left">1718</td>
<td align="left">2.534</td>
<td align="left">0.398</td>
</tr>
<tr>
<td align="left">ATXN2</td>
<td align="left">4712</td>
<td align="left">401</td>
<td align="left">658</td>
<td align="left">726</td>
<td align="left">629</td>
<td align="left">748</td>
<td align="left">1.75</td>
<td align="left">0.085</td>
</tr>
<tr>
<td align="left">CACNA1A</td>
<td align="left">8641</td>
<td align="left">579</td>
<td align="left">7186</td>
<td align="left">7224</td>
<td align="left">7160</td>
<td align="left">7425</td>
<td align="left">6.973</td>
<td align="left">0.067</td>
</tr>
<tr>
<td align="left">ATXN7</td>
<td align="left">7242</td>
<td align="left">443</td>
<td align="left">641</td>
<td align="left">670</td>
<td align="left">576</td>
<td align="left">743</td>
<td align="left">5.758</td>
<td align="left">0.061</td>
</tr>
<tr>
<td align="left">TBP</td>
<td align="left">1921</td>
<td align="left">305</td>
<td align="left">451</td>
<td align="left">564</td>
<td align="left">389</td>
<td align="left">636</td>
<td align="left">2.185</td>
<td align="left">0.158</td>
</tr>
<tr>
<td align="left">ATXN3</td>
<td align="left">10000(*)</td>
<td align="left">-</td>
<td align="left">943</td>
<td align="left">984</td>
<td align="left">-</td>
<td align="left">-</td>
<td align="left">-</td>
<td align="left">-</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Table of fuzzy tandem repeats for polyQ TR. The table reports: gene code, sequence length, length of the region covered by FTRs, TR expansion begin and TR expansion end, FTR begin and FTR end, ratio of FTR length over TR length, ratio of region covered by FTRs over total sequence length. (*) indicates that a subsequence has been analyzed.</p>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T6" position="float">
<label>Table 6</label>
<caption>
<p>Table of covering fuzzy tandem repeats for polyalanine TR.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Gene code</th>
<th align="left">Seq length</th>
<th align="left">Cover</th>
<th align="left">TR-beg</th>
<th align="left">TR-end</th>
<th align="left">FTR-beg</th>
<th align="left">FTR-end</th>
<th align="left">FTR/TR</th>
<th align="left">Cover/Length</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">FOXL2</td>
<td align="left">9900</td>
<td align="left">5000</td>
<td align="left">6079</td>
<td align="left">6115</td>
<td align="left">6082</td>
<td align="left">6258</td>
<td align="left">4.888</td>
<td align="left">0.505</td>
</tr>
<tr>
<td align="left">ZIC2</td>
<td align="left">11701</td>
<td align="left">677</td>
<td align="left">8385</td>
<td align="left">8429</td>
<td align="left">8304</td>
<td align="left">8534</td>
<td align="left">5.227</td>
<td align="left">0.057</td>
</tr>
<tr>
<td align="left">PHOX2B</td>
<td align="left">11889</td>
<td align="left">187</td>
<td align="left">7940</td>
<td align="left">7993</td>
<td align="left">7830</td>
<td align="left">8015</td>
<td align="left">3.490</td>
<td align="left">0.015</td>
</tr>
<tr>
<td align="left">ARX</td>
<td align="left">19255</td>
<td align="left">1039</td>
<td align="left">7252</td>
<td align="left">7299</td>
<td align="left">7199</td>
<td align="left">7424</td>
<td align="left">4.787</td>
<td align="left">0.053</td>
</tr>
<tr>
<td align="left">SOX3</td>
<td align="left">9074</td>
<td align="left">1114</td>
<td align="left">5700</td>
<td align="left">5744</td>
<td align="left">5584</td>
<td align="left">6191</td>
<td align="left">13.795</td>
<td align="left">0.122</td>
</tr>
<tr>
<td align="left">RUNX2</td>
<td align="left">10000 (*)</td>
<td align="left">766</td>
<td align="left">99435</td>
<td align="left">99485</td>
<td align="left">99310</td>
<td align="left">99488</td>
<td align="left">3.560</td>
<td align="left">0.076</td>
</tr>
<tr>
<td align="left">HOXA13</td>
<td align="left">10227</td>
<td align="left">658</td>
<td align="left">5375</td>
<td align="left">5428</td>
<td align="left">5328</td>
<td align="left">5493</td>
<td align="left">3.113</td>
<td align="left">0.064</td>
</tr>
<tr>
<td align="left">HOXD13</td>
<td align="left">10135</td>
<td align="left">279</td>
<td align="left">5256</td>
<td align="left">5300</td>
<td align="left">5025</td>
<td align="left">5304</td>
<td align="left">6.340</td>
<td align="left">0.027</td>
</tr>
<tr>
<td align="left">PABPN1</td>
<td align="left">12976</td>
<td align="left">1337</td>
<td align="left">6286</td>
<td align="left">6303</td>
<td align="left">6278</td>
<td align="left">6405</td>
<td align="left">7.470</td>
<td align="left">0.103</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Table of covering fuzzy tandem repeats for polyalanine TR. The table reports: gene code, sequence length, length of the region covered by FTRs, TR expansion begin and TR expansion end, FTR begin and FTR end, ratio of FTR length over TR length, ratio of region covered by FTRs over total sequence length. (*) indicates that a subsequence has been analyzed.</p>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T7" position="float">
<label>Table 7</label>
<caption>
<p>Table of covering fuzzy tandem repeats for non-polyglutamine and non-polyalanine TR.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Gene code</th>
<th align="left">Seq length</th>
<th align="left">Cover</th>
<th align="left">TR-beg</th>
<th align="left">TR-end</th>
<th align="left">FTR-beg</th>
<th align="left">FTR-end</th>
<th align="left">FTR/TR</th>
<th align="left">Cover/Length</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">FMR1</td>
<td align="left">46137</td>
<td align="left">3415</td>
<td align="left">5061</td>
<td align="left">5171</td>
<td align="left">4983</td>
<td align="left">5168</td>
<td align="left">1.681</td>
<td align="left">0.074</td>
</tr>
<tr>
<td align="left">AFF2</td>
<td align="left">16800</td>
<td align="left">595</td>
<td align="left">5021</td>
<td align="left">5062</td>
<td align="left">4958</td>
<td align="left">5429</td>
<td align="left">11.487</td>
<td align="left">0.035</td>
</tr>
<tr>
<td align="left">FXN</td>
<td align="left">2465</td>
<td align="left">723</td>
<td align="left">2185</td>
<td align="left">2212</td>
<td align="left">2036</td>
<td align="left">2414</td>
<td align="left">14.000</td>
<td align="left">0.293</td>
</tr>
<tr>
<td align="left">DMPK</td>
<td align="left">2465</td>
<td align="left">273</td>
<td align="left">2304</td>
<td align="left">2363</td>
<td align="left">2213</td>
<td align="left">2365</td>
<td align="left">2.576</td>
<td align="left">0.110</td>
</tr>
<tr>
<td align="left">ZNF9</td>
<td align="left">23153</td>
<td align="left">5462</td>
<td align="left">16312</td>
<td align="left">16387</td>
<td align="left">16264</td>
<td align="left">17088</td>
<td align="left">10.986</td>
<td align="left">0.235</td>
</tr>
<tr>
<td align="left">ATXN10(**)</td>
<td align="left">50000(*)</td>
<td align="left">1301</td>
<td align="left">128559</td>
<td align="left">128628</td>
<td align="left">28543</td>
<td align="left">28654</td>
<td align="left">1.608</td>
<td align="left">0.026</td>
</tr>
<tr>
<td align="left">PPP2R2B</td>
<td align="left">5120</td>
<td align="left">703</td>
<td align="left">2088</td>
<td align="left">2366</td>
<td align="left">1842</td>
<td align="left">2363</td>
<td align="left">1.874</td>
<td align="left">0.137</td>
</tr>
<tr>
<td align="left">CSTB</td>
<td align="left">9429</td>
<td align="left">3098</td>
<td align="left">4899</td>
<td align="left">4935</td>
<td align="left">4472</td>
<td align="left">5019</td>
<td align="left">15.194</td>
<td align="left">0.328</td>
</tr>
<tr>
<td align="left">JPH3</td>
<td align="left">10000(*)</td>
<td align="left">807</td>
<td align="left">35581</td>
<td align="left">35746</td>
<td align="left">35476</td>
<td align="left">35755</td>
<td align="left">1.690</td>
<td align="left">0.080</td>
</tr>
<tr>
<td align="left">ATXN8OS</td>
<td align="left">39541</td>
<td align="left">-</td>
<td align="left">37142</td>
<td align="left">37216</td>
<td align="left">-</td>
<td align="left">-</td>
<td align="left">-</td>
<td align="left">-</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Table of covering fuzzy tandem repeats for non-polyglutamine and non-polyalanine TR. The table reports: gene code, sequence length, length of the region covered by FTRs, TR expansion begin and TR expansion end, FTR begin and FTR end, ratio of FTR length over TR length, ratio of region covered by FTRs over total sequence length. (*) indicates that a subsequence has been analyzed. (**) indicates two overlapping FTRs.</p>
</table-wrap-foot>
</table-wrap>
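<p>The two derived columns of Tables 5, 6 and 7 can be recomputed from the other columns. The sketch below assumes that FTR/TR is the FTR span (FTR-end minus FTR-beg) divided by the TR span (TR-end minus TR-beg), that Cover/Length is Cover divided by Seq length, and that values are truncated rather than rounded to three decimals; these conventions are inferred from the tabulated values and are not stated in the text. The function names are hypothetical, and the rows are copied from Table 5.</p>
<preformat>
```python
import math


def truncate3(value):
    """Truncate a non-negative value to three decimals."""
    return math.floor(value * 1000) / 1000


def ftr_to_tr_ratio(tr_beg, tr_end, ftr_beg, ftr_end):
    """Ratio of the FTR span to the TR (expansion locus) span."""
    return (ftr_end - ftr_beg) / (tr_end - tr_beg)


# gene: (seq_len, cover, tr_beg, tr_end, ftr_beg, ftr_end), from Table 5
rows = {
    "ATN1": (4367, 206, 1687, 1743, 1646, 1751),
    "HTT": (13481, 196, 197, 265, 196, 367),
    "TBP": (1921, 305, 451, 564, 389, 636),
}

for gene, (seq_len, cover, tr_beg, tr_end, ftr_beg, ftr_end) in rows.items():
    ratio = truncate3(ftr_to_tr_ratio(tr_beg, tr_end, ftr_beg, ftr_end))
    fraction = truncate3(cover / seq_len)
    print(gene, ratio, fraction)
# ATN1 1.875 0.047
# HTT 2.514 0.014
# TBP 2.185 0.158
```
</preformat>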
</sec>
<sec>
<title>Specificity of fuzzy tandem repeats for genes with CAG-encoded polyglutamine</title>
<p>In order to test the specificity of the association of covering fuzzy TRs with repeat expansion loci, we have analyzed a sample of genes with long CAG-encoded polyglutamine tracts (more than 6 repeating units). We have chosen this subclass since it has been extensively studied in the literature. The statistics for this type of repeat have been collected in [
<xref ref-type="bibr" rid="B6">6</xref>
], which lists 148 sequences in ORF regions (out of a total of 718), and in [
<xref ref-type="bibr" rid="B41">41</xref>
] listing 64 polyQ genes. We have examined the first 25 entries of the list in [
<xref ref-type="bibr" rid="B6">6</xref>
] having CAG repeats in ORF regions. Entries no longer present in the NCBI Nucleotide database have been replaced by the newer version of the same gene when possible; entries for the same gene have been merged. Thus we have examined a total of 17 sequences, listed in tables
<xref ref-type="table" rid="T8">8</xref>
and
<xref ref-type="table" rid="T9">9</xref>
.</p>
<table-wrap id="T8" position="float">
<label>Table 8</label>
<caption>
<p>Table of covering fuzzy tandem repeats for a sample of CAG-encoded polyglutamine tracts that have been investigated for possible connections to pathologies.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Gene code</th>
<th align="left">Tri-repeat position</th>
<th align="left">FTR position</th>
<th align="left">References</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">RAI1</td>
<td align="left">1300+39</td>
<td align="left">1290-1368</td>
<td align="left">[
<xref ref-type="bibr" rid="B41">41</xref>
,
<xref ref-type="bibr" rid="B42">42</xref>
]</td>
</tr>
<tr>
<td align="left">DACH1</td>
<td align="left">846+42</td>
<td align="left">830-926</td>
<td align="left">[
<xref ref-type="bibr" rid="B43">43</xref>
,
<xref ref-type="bibr" rid="B44">44</xref>
]</td>
</tr>
<tr>
<td align="left">(TNRC3) MAML3</td>
<td align="left">2220+36,</td>
<td align="left">2187-2292,</td>
<td align="left">[
<xref ref-type="bibr" rid="B41">41</xref>
,
<xref ref-type="bibr" rid="B45">45</xref>
]</td>
</tr>
<tr>
<td></td>
<td align="left">2667+24,</td>
<td align="left">2628-2698,</td>
<td></td>
</tr>
<tr>
<td></td>
<td align="left">3030+24</td>
<td align="left">2960-3053</td>
<td></td>
</tr>
<tr>
<td align="left">NRG2</td>
<td align="left">302+18, 329+24</td>
<td align="left">227-401</td>
<td align="left">[
<xref ref-type="bibr" rid="B46">46</xref>
,
<xref ref-type="bibr" rid="B52">52</xref>
]</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Table of covering fuzzy tandem repeats for a sample of CAG-encoded polyglutamine tracts that have been investigated for possible connections to pathologies. The table reports: gene code, location of the polyQ locus, location of the fuzzy TR if existing, relevant references.</p>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T9" position="float">
<label>Table 9</label>
<caption>
<p>Table of covering fuzzy tandem repeats for a sample of CAG-encoded polyglutamine tracts.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Gene code</th>
<th align="left">Tri-repeat position</th>
<th align="left">FTR position</th>
<th align="left">References</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">NFAT5</td>
<td align="left">1497+18</td>
<td align="left">-</td>
<td align="left">[
<xref ref-type="bibr" rid="B41">41</xref>
]</td>
</tr>
<tr>
<td align="left">vascular endothelial cadherin 2</td>
<td align="left">4739+18</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">PRDM8</td>
<td align="left">1865+18</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">PRDM10</td>
<td align="left">3327+27</td>
<td align="left">-</td>
<td align="left">[
<xref ref-type="bibr" rid="B41">41</xref>
]</td>
</tr>
<tr>
<td align="left">ATBF1-A</td>
<td align="left">10262+21</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">USP7</td>
<td align="left">208+21</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">IRS1</td>
<td align="left">2049+18</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">(ATBF1) ZFHX3</td>
<td align="left">10262+21</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">FBX11</td>
<td align="left">90+21</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">PCQAP</td>
<td align="left">611+18,
<break></break>
711+18,
<break></break>
831+36</td>
<td align="left">607-652, 712-869</td>
<td align="left">[
<xref ref-type="bibr" rid="B41">41</xref>
]</td>
</tr>
<tr>
<td align="left">(DRIL2) ARID3B</td>
<td align="left">214+24</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">POU3F2</td>
<td align="left">594+18</td>
<td align="left">516-618</td>
<td align="left">[
<xref ref-type="bibr" rid="B41">41</xref>
]</td>
</tr>
<tr>
<td align="left">PALM2-AKAP2</td>
<td align="left">1738+18</td>
<td align="left">-</td>
<td></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Table of covering fuzzy tandem repeats for a sample of CAG-encoded polyglutamine tracts. The table reports: gene code, location of the polyQ locus, location of the fuzzy TR if existing, relevant references.</p>
</table-wrap-foot>
</table-wrap>
<p>Four sequences have been investigated in the literature for their potential role in diseases (table
<xref ref-type="table" rid="T8">8</xref>
).</p>
<p>Polymorphism of the CAG repeat in protein RAI1 has been found to influence the age of onset in patients affected by spinocerebellar ataxia type 2 (SCA2) [
<xref ref-type="bibr" rid="B42">42</xref>
]. Data shown in [
<xref ref-type="bibr" rid="B43">43</xref>
,
<xref ref-type="bibr" rid="B44">44</xref>
] indicate a genetic linkage of the chromosomal region containing the gene DACH with many developmental disorders affecting limbs, kidneys, eyes, and ears, although specific causality and mechanisms still need to be elucidated. The gene MAML3 is shortlisted in [
<xref ref-type="bibr" rid="B45">45</xref>
] for further study in disease associations, based on comparing the conservation patterns among human, mouse and rat genomes. The human neuregulin-2 (NRG2) gene has been evaluated for a possible association with the Charcot-Marie-Tooth disease [
<xref ref-type="bibr" rid="B46">46</xref>
]. Since the pathogenic status of these repeats is still unclear, we exclude them from further analysis.</p>
<p>For the remaining 13 sequences (table
<xref ref-type="table" rid="T9">9</xref>
) we have found evidence of a covering fuzzy TR in 2 cases (15%).</p>
</sec>
<sec>
<title>Specificity of fuzzy tandem repeats for genes with pathological SNPs</title>
<p>In this section we explore the specificity of FTRs covering mutation loci linked to pathological conditions. We explore two different types of mutations; the first is single nucleotide substitutions (SNPs). The database dbSNP (Human Build 135) [
<xref ref-type="bibr" rid="B47">47</xref>
,
<xref ref-type="bibr" rid="B48">48</xref>
] lists, as of today, 1835 records of pathogenic SNPs for Homo sapiens sequences. We have selected a sample (see Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
) of such sequences and analyzed them using
<italic>TReaDS</italic>
. Results reported in table
<xref ref-type="table" rid="T10">10</xref>
show that out of 43 pathogenic SNPs in 14 sequences, only 2 are covered by a long FTR (about 5%).</p>
<table-wrap id="T10" position="float">
<label>Table 10</label>
<caption>
<p>Table of pathogenic SNPs in Homo sapiens from dbSNP and covering fuzzy tandem repeats.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Gene/Protein</th>
<th align="left">Seq length</th>
<th align="left">Num. path. SNP</th>
<th align="left">Covered by FTR</th>
<th align="left">FTR</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">FZD6</td>
<td align="left">3806</td>
<td align="left">2</td>
<td align="left">0</td>
<td align="left">-</td>
</tr>
<tr>
<td align="left">NSDHL</td>
<td align="left">1581</td>
<td align="left">2</td>
<td align="left">0</td>
<td align="left">-</td>
</tr>
<tr>
<td align="left">GJB1</td>
<td align="left">1623</td>
<td align="left">10</td>
<td align="left">0</td>
<td align="left">-</td>
</tr>
<tr>
<td align="left">IDS</td>
<td align="left">1437</td>
<td align="left">5</td>
<td align="left">0</td>
<td align="left">-</td>
</tr>
<tr>
<td align="left">IDS</td>
<td align="left">5832</td>
<td align="left">2</td>
<td align="left">0</td>
<td align="left">-</td>
</tr>
<tr>
<td align="left">SLC16A2</td>
<td align="left">4396</td>
<td align="left">2</td>
<td align="left">0</td>
<td align="left">-</td>
</tr>
<tr>
<td align="left">NSDHL</td>
<td align="left">1581</td>
<td align="left">2</td>
<td align="left">0</td>
<td align="left">-</td>
</tr>
<tr>
<td align="left">ABCB7</td>
<td align="left">2404</td>
<td align="left">2</td>
<td align="left">0</td>
<td align="left">-</td>
</tr>
<tr>
<td align="left">TIMM8A</td>
<td align="left">1459</td>
<td align="left">2</td>
<td align="left">0</td>
<td align="left">-</td>
</tr>
<tr>
<td align="left">UBA1</td>
<td align="left">3544</td>
<td align="left">3</td>
<td align="left">0</td>
<td align="left">-</td>
</tr>
<tr>
<td align="left">FLNA</td>
<td align="left">8533</td>
<td align="left">2</td>
<td align="left">2</td>
<td align="left">[1000-3946]</td>
</tr>
<tr>
<td align="left">MED12</td>
<td align="left">6985</td>
<td align="left">1</td>
<td align="left">0</td>
<td align="left">-</td>
</tr>
<tr>
<td align="left">PRPS1</td>
<td align="left">2156</td>
<td align="left">4</td>
<td align="left">0</td>
<td align="left">-</td>
</tr>
<tr>
<td align="left">ARSE</td>
<td align="left">2220</td>
<td align="left">4</td>
<td align="left">0</td>
<td align="left">-</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Table of pathogenic SNPs in Homo sapiens from dbSNP and covering fuzzy tandem repeats. The table reports: gene/protein code, sequence length, number of pathogenic SNPs, number of pathogenic SNPs covered by FTR, FTR [Begin - End] if existing.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>Specificity of fuzzy tandem repeats for genes with pathological in/dels</title>
<p>The database dbSNP (Human Build 135) lists, as of today, 391 records of pathogenic short in/dels for Homo sapiens sequences. We have selected a sample of such sequences (see Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
) and analyzed them using
<italic>TReaDS</italic>
. Data in table
<xref ref-type="table" rid="T11">11</xref>
show that of 67 pathogenic in/dels only 9 are covered by an FTR (13%).</p>
<table-wrap id="T11" position="float">
<label>Table 11</label>
<caption>
<p>Table of pathogenic in/dels in Homo sapiens from dbSNP and covering fuzzy tandem repeats.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Gene/Protein</th>
<th align="left">Seq length</th>
<th align="left">Num. path. in/dels</th>
<th align="left">Covered by FTR</th>
<th align="left">FTR</th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">CFTR/MRP</td>
<td align="left">1000 (*)</td>
<td align="left">2</td>
<td align="left">0</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">OTC</td>
<td align="left">1000 (*)</td>
<td align="left">3</td>
<td align="left">2</td>
<td align="left">[117 - 883],</td>
<td align="left">[429 - 569]</td>
</tr>
<tr>
<td align="left">OTC</td>
<td align="left">1647</td>
<td align="left">30</td>
<td align="left">0</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">HS mitochondrion</td>
<td align="left">16569</td>
<td align="left">1</td>
<td align="left">0</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">NSDHL</td>
<td align="left">1581</td>
<td align="left">2</td>
<td align="left">0</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">GJB1</td>
<td align="left">1623</td>
<td align="left">2</td>
<td align="left">1</td>
<td align="left">[319-373]</td>
<td></td>
</tr>
<tr>
<td align="left">SLC16A2</td>
<td align="left">4396</td>
<td align="left">2</td>
<td align="left">0</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">SLC6A8</td>
<td align="left">3580</td>
<td align="left">2</td>
<td align="left">0</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">CACNA1F</td>
<td align="left">6080</td>
<td align="left">1</td>
<td align="left">0</td>
<td align="left">-</td>
<td></td>
</tr>
<tr>
<td align="left">FLNA</td>
<td align="left">8533</td>
<td align="left">1</td>
<td align="left">1</td>
<td align="left">[280-329]</td>
<td></td>
</tr>
<tr>
<td align="left">KCNQ2</td>
<td align="left">3158</td>
<td align="left">21</td>
<td align="left">5</td>
<td align="left">[162-275]</td>
<td align="left">[1666-1691]</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td align="left">[2188-2214]</td>
<td align="left">[2654-2728]</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Table of pathogenic in/dels in Homo sapiens from dbSNP and covering fuzzy tandem repeats. The table reports: gene/protein code, sequence length, number of pathogenic in/dels, number of pathogenic in/dels covered by FTR, FTR [Begin - End] if existing. (*) indicates that the analysis has been done on a subsequence of length 1000 centered on the position of each in/del.</p>
</table-wrap-foot>
</table-wrap>
</sec>
</sec>
<sec>
<title>Conclusions</title>
<sec>
<title>Results on repeat expansion diseases</title>
<p>We have found that, for the current set of 29 repeat expansion diseases, in 27 cases (93%) there is a long fuzzy TR covering the expansion locus. The ratio of the length of the fuzzy TR to that of the expansion locus ranges from a minimum of 1.608 to a maximum of 15.194. The specificity of the association has also been investigated for the set of genes with CAG-encoded polyglutamine tracts, for pathogenic SNPs, for pathogenic in/dels, and for the non-pathogenic sections of the sequences. This specificity analysis shows that in only about 15% of the control cases is there an association with fuzzy TRs. These preliminary results indicate that fuzzy TRs may be an important novel cis-element that influences the instability of the expansion locus. However, a more in-depth analysis and a consideration of the causal mechanisms involved are needed to confirm the correlation between fuzzy TRs and RE diseases.</p>
</sec>
<sec>
<title>The power of TReaDS</title>
<p>As large-scale studies are being pursued, it is important to facilitate the use of the publicly available TR search engines. In the literature, comparisons of several TR finding tools have highlighted significant differences among their sets of results. Other work has made evident the importance of tuning the operating parameters. In this paper we presented
<italic>TReaDS</italic>
, a web application that provides a single user interface and enables the simultaneous application of different techniques to the same data set. With
<italic>TReaDS </italic>
the user can express the characteristics of her request through a simple and unified interface, or she can customize the set of parameters of each system. The user gets back a report that contains a global and comparative view of the results. The report can be downloaded for a deeper off-line investigation. This way,
<italic>TReaDS </italic>
allows users to harness the power of different web-based TR search engines with minimal effort.</p>
<p>Furthermore, merging and comparing the outcome of different search tools on the same data can be useful for gaining higher confidence that all the relevant TRs in the data set have been found.</p>
<p>To the best of our knowledge
<italic>TReaDS </italic>
is the first meta search engine for tandem repeats and there is no similar and comparable system freely available.</p>
</sec>
<sec>
<title>Future work</title>
<p>The database
<italic>TRbase </italic>
[
<xref ref-type="bibr" rid="B49">49</xref>
] maintains an annotated correspondence between genes known to be involved in some disease and the tandem repeats in their DNA sequence (detected with TRF [
<xref ref-type="bibr" rid="B29">29</xref>
]). For the class of repeat expansion diseases a direct causal link between TRs and the onset of the disease is known. As future work we plan to analyze the correlation between other diseases (or disease classes) and the presence and type of fuzzy TRs, using
<italic>TReaDS</italic>
, in order to suggest hypotheses on possible roles for fuzzy TRs in that context.</p>
<p>In this paper we studied those trinucleotide expansions (and repeat expansions) that lead to the manifestation of diseases. However, polymorphic microsatellites and minisatellites are very common in the human genome (as well as in all eukaryotic genomes), so one could advance the hypothesis that FTRs may have a facilitating role in such polymorphisms (independently of the manifestation of a pathology). Testing this far-reaching hypothesis, which is our next objective, is far from trivial, since comprehensive maps of polymorphic/monomorphic TRs for the human genome (even restricted to the coding regions) are only now being produced [
<xref ref-type="bibr" rid="B50">50</xref>
,
<xref ref-type="bibr" rid="B51">51</xref>
].</p>
</sec>
</sec>
<sec>
<title>List of abbreviations</title>
<p>DRPLA: Dentatorubropallidoluysian atrophy; EL: Expansion locus; FTR: Fuzzy tandem repeat; HD: Huntington disease; IR: Initiation region; polyA: polyalanine; polyQ: polyglutamine; RE: Repeat expansion; SCA: Spinocerebellar ataxia; TFBS: Transcription factor binding site; TNR: Trinucleotide repeat; TR: Tandem repeat; TReaDS: Tandem repeats discovery service.</p>
</sec>
<sec>
<title>Competing interests</title>
<p>The authors declare that they have no competing interests.</p>
</sec>
<sec>
<title>Availability and requirements</title>
<p>
<bold>Project name: </bold>
<italic>TReaDS</italic>
</p>
<p>
<bold>Project home page: </bold>
<ext-link ext-link-type="uri" xlink:href="http://bioalgo.iit.cnr.it/treads">http://bioalgo.iit.cnr.it/treads</ext-link>
</p>
<p>
<bold>Operating system(s): </bold>
Platform independent</p>
<p>
<bold>Programming language: </bold>
Java</p>
<p>
<bold>Other requirements: </bold>
JavaScript enabled (on the client side)</p>
<p>
<bold>License: </bold>
Lesser General Public License (LGPL)</p>
<p>
<bold>Any restrictions to use by non-academics: </bold>
None;
<italic>TReaDS </italic>
is a free web application open to all users.</p>
</sec>
<sec>
<title>Authors' contributions</title>
<p>AV conceived of the application tool, participated in its design and development, and helped to draft the manuscript. MER participated in the design and development of the application, performed the testing and debugging phases, performed experiments, and helped to draft the manuscript. MP conceived the application of
<italic>TReaDS </italic>
to repeat expansion sequences, performed experiments, drafted the final manuscript and exercised general supervision. All authors read and approved the final manuscript.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<supplementary-material content-type="local-data" id="S1">
<caption>
<title>Additional file 1</title>
<p>
<bold>"Tandem repeats discovery service (
<italic>TReaDS</italic>
) applied to finding novel cis-acting factors in repeat expansion diseases - supplementary information --" contains NCBI codes of analyzed sequences and dbSNP codes for the analyzed SNPs and in/dels</bold>
.</p>
</caption>
<media xlink:href="1471-2105-13-S4-S3-S1.pdf" mimetype="application" mime-subtype="pdf">
</media>
</supplementary-material>
</sec>
</body>
<back>
<sec>
<title>Acknowledgements</title>
<p>This work was partially supported by the European Community's Seventh Framework Programme FP7/2007-2013 under grant agreement no. 223920 (Virtual Physiological Human Network of Excellence).</p>
<p>This article has been published as part of
<italic>BMC Bioinformatics </italic>
Volume 13 Supplement 4, 2012: Italian Society of Bioinformatics (BITS): Annual Meeting 2011. The full contents of the supplement are available online at
<ext-link ext-link-type="uri" xlink:href="http://www.biomedcentral.com/bmcbioinformatics/supplements/13/S4">http://www.biomedcentral.com/bmcbioinformatics/supplements/13/S4</ext-link>
.</p>
</sec>
<ref-list>
<ref id="B1">
<mixed-citation publication-type="journal">
<name>
<surname>Cummings</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Zoghbi</surname>
<given-names>HY</given-names>
</name>
<article-title>Fourteen and counting: unraveling trinucleotide repeat diseases</article-title>
<source>Human Molecular Genetics</source>
<year>2000</year>
<volume>9</volume>
<issue>6</issue>
<fpage>909</fpage>
<lpage>916</lpage>
<pub-id pub-id-type="doi">10.1093/hmg/9.6.909</pub-id>
<pub-id pub-id-type="pmid">10767314</pub-id>
</mixed-citation>
</ref>
<ref id="B2">
<mixed-citation publication-type="journal">
<name>
<surname>Usdin</surname>
<given-names>K</given-names>
</name>
<article-title>The biological effects of simple tandem repeats: Lessons from the repeat expansion diseases</article-title>
<source>Genome Research</source>
<year>2008</year>
<volume>18</volume>
<issue>7</issue>
<fpage>1011</fpage>
<lpage>1019</lpage>
<pub-id pub-id-type="doi">10.1101/gr.070409.107</pub-id>
<pub-id pub-id-type="pmid">18593815</pub-id>
</mixed-citation>
</ref>
<ref id="B3">
<mixed-citation publication-type="journal">
<name>
<surname>Mirkin</surname>
<given-names>SM</given-names>
</name>
<article-title>Expandable DNA repeats and human disease</article-title>
<source>Nature</source>
<year>2007</year>
<volume>447</volume>
<fpage>932</fpage>
<lpage>940</lpage>
<pub-id pub-id-type="doi">10.1038/nature05977</pub-id>
<pub-id pub-id-type="pmid">17581576</pub-id>
</mixed-citation>
</ref>
<ref id="B4">
<mixed-citation publication-type="journal">
<name>
<surname>Richard</surname>
<given-names>GF</given-names>
</name>
<name>
<surname>Kerrest</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Dujon</surname>
<given-names>B</given-names>
</name>
<article-title>Comparative Genomics and Molecular Dynamics of DNA Repeats in Eukaryotes</article-title>
<source>Microbiol Mol Biol Rev</source>
<year>2008</year>
<volume>72</volume>
<issue>4</issue>
<fpage>686</fpage>
<lpage>727</lpage>
<pub-id pub-id-type="doi">10.1128/MMBR.00011-08</pub-id>
<pub-id pub-id-type="pmid">19052325</pub-id>
</mixed-citation>
</ref>
<ref id="B5">
<mixed-citation publication-type="journal">
<name>
<surname>Richards</surname>
<given-names>RI</given-names>
</name>
<article-title>Dynamic mutations: a decade of unstable expanded repeats in human genetic disease</article-title>
<source>Human Molecular Genetics</source>
<year>2001</year>
<volume>10</volume>
<issue>20</issue>
<fpage>2187</fpage>
<lpage>2194</lpage>
<pub-id pub-id-type="doi">10.1093/hmg/10.20.2187</pub-id>
<pub-id pub-id-type="pmid">11673400</pub-id>
</mixed-citation>
</ref>
<ref id="B6">
<mixed-citation publication-type="journal">
<name>
<surname>Jasinska</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Michlewski</surname>
<given-names>G</given-names>
</name>
<name>
<surname>de Mezer</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Sobczak</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Kozlowski</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Napierala</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Krzyzosiak</surname>
<given-names>WJ</given-names>
</name>
<article-title>Structures of trinucleotide repeats in human transcripts and their functional implications</article-title>
<source>Nucleic Acids Research</source>
<year>2003</year>
<volume>31</volume>
<issue>19</issue>
<fpage>5463</fpage>
<lpage>5468</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkg767</pub-id>
<pub-id pub-id-type="pmid">14500808</pub-id>
</mixed-citation>
</ref>
<ref id="B7">
<mixed-citation publication-type="journal">
<name>
<surname>Wells</surname>
<given-names>RD</given-names>
</name>
<name>
<surname>Dere</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Hebert</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Napierala</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Son</surname>
<given-names>LS</given-names>
</name>
<article-title>Advances in mechanisms of genetic instability related to hereditary neurological diseases</article-title>
<source>Nucleic Acids Research</source>
<year>2005</year>
<volume>33</volume>
<issue>12</issue>
<fpage>3785</fpage>
<lpage>3798</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gki697</pub-id>
<pub-id pub-id-type="pmid">16006624</pub-id>
</mixed-citation>
</ref>
<ref id="B8">
<mixed-citation publication-type="journal">
<name>
<surname>Nenguke</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Aladjem</surname>
<given-names>MI</given-names>
</name>
<name>
<surname>Gusella</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>Wexler</surname>
<given-names>NS</given-names>
</name>
<collab>The Venezuela HD Project</collab>
<name>
<surname>Arnheim</surname>
<given-names>N</given-names>
</name>
<article-title>Candidate DNA replication initiation regions at human trinucleotide repeat disease loci</article-title>
<source>Human Molecular Genetics</source>
<year>2003</year>
<volume>12</volume>
<issue>12</issue>
<fpage>1461</fpage>
<pub-id pub-id-type="doi">10.1093/hmg/ddg155</pub-id>
</mixed-citation>
</ref>
<ref id="B9">
<mixed-citation publication-type="journal">
<name>
<surname>Cleary</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Nichol</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>YH</given-names>
</name>
<name>
<surname>Pearson</surname>
<given-names>C</given-names>
</name>
<article-title>Evidence of cis-acting factors in replication-mediated trinucleotide repeat instability in primate cells</article-title>
<source>Nature Genetics</source>
<year>2002</year>
<volume>31</volume>
<fpage>37</fpage>
<lpage>46</lpage>
<pub-id pub-id-type="doi">10.1038/ng870</pub-id>
<pub-id pub-id-type="pmid">11967533</pub-id>
</mixed-citation>
</ref>
<ref id="B10">
<mixed-citation publication-type="journal">
<name>
<surname>Brock</surname>
<given-names>GJR</given-names>
</name>
<name>
<surname>Anderson</surname>
<given-names>NH</given-names>
</name>
<name>
<surname>Monckton</surname>
<given-names>DG</given-names>
</name>
<article-title>Cis-Acting Modifiers of Expanded CAG/CTG Triplet Repeat Expandability: Associations with Flanking GC Content and Proximity to CpG Islands</article-title>
<source>Human Molecular Genetics</source>
<year>1999</year>
<volume>8</volume>
<issue>6</issue>
<fpage>1061</fpage>
<lpage>1067</lpage>
<pub-id pub-id-type="doi">10.1093/hmg/8.6.1061</pub-id>
<pub-id pub-id-type="pmid">10332038</pub-id>
</mixed-citation>
</ref>
<ref id="B11">
<mixed-citation publication-type="journal">
<name>
<surname>Libby</surname>
<given-names>RT</given-names>
</name>
<name>
<surname>Hagerman</surname>
<given-names>KA</given-names>
</name>
<name>
<surname>Pineda</surname>
<given-names>VV</given-names>
</name>
<name>
<surname>Lau</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Cho</surname>
<given-names>DH</given-names>
</name>
<name>
<surname>Baccam</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Axford</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Cleary</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Moore</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Sopher</surname>
<given-names>BL</given-names>
</name>
<name>
<surname>Tapscott</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Filippova</surname>
<given-names>GN</given-names>
</name>
<name>
<surname>Pearson</surname>
<given-names>CE</given-names>
</name>
<name>
<surname>La Spada</surname>
<given-names>AR</given-names>
</name>
<article-title>CTCF cis-Regulates Trinucleotide Repeat Instability in an Epigenetic Manner: A Novel Basis for Mutational Hot Spot Determination</article-title>
<source>PLoS Genet</source>
<year>2008</year>
<volume>4</volume>
<issue>11</issue>
<fpage>e1000257</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pgen.1000257</pub-id>
<pub-id pub-id-type="pmid">19008940</pub-id>
</mixed-citation>
</ref>
<ref id="B12">
<mixed-citation publication-type="journal">
<name>
<surname>Warby</surname>
<given-names>SC</given-names>
</name>
<name>
<surname>Montpetit</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hayden</surname>
<given-names>AR</given-names>
</name>
<name>
<surname>Carroll</surname>
<given-names>JB</given-names>
</name>
<name>
<surname>Butland</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Visscher</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Collins</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Semaka</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hudson</surname>
<given-names>TJ</given-names>
</name>
<name>
<surname>Hayden</surname>
<given-names>MR</given-names>
</name>
<article-title>CAG expansion in the Huntington disease gene is associated with a specific and targetable predisposing haplogroup</article-title>
<source>Am J Hum Genet</source>
<year>2009</year>
<volume>84</volume>
<issue>3</issue>
<fpage>351</fpage>
<lpage>366</lpage>
<pub-id pub-id-type="doi">10.1016/j.ajhg.2009.02.003</pub-id>
<pub-id pub-id-type="pmid">19249009</pub-id>
</mixed-citation>
</ref>
<ref id="B13">
<mixed-citation publication-type="journal">
<name>
<surname>Boeva</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Regnier</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Papatsenko</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Makeev</surname>
<given-names>V</given-names>
</name>
<article-title>Short fuzzy tandem repeats in genomic sequences, identification, and possible role in regulation of gene expression</article-title>
<source>Bioinformatics</source>
<year>2006</year>
<volume>22</volume>
<issue>6</issue>
<fpage>676</fpage>
<lpage>684</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btk032</pub-id>
<pub-id pub-id-type="pmid">16403795</pub-id>
</mixed-citation>
</ref>
<ref id="B14">
<mixed-citation publication-type="journal">
<name>
<surname>Pellegrini</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Renda</surname>
<given-names>ME</given-names>
</name>
<name>
<surname>Vecchio</surname>
<given-names>A</given-names>
</name>
<article-title>TRStalker: an efficient heuristic for finding fuzzy tandem repeats</article-title>
<source>Bioinformatics</source>
<year>2010</year>
<volume>26</volume>
<issue>12</issue>
<fpage>i358</fpage>
<lpage>i366</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btq209</pub-id>
<pub-id pub-id-type="pmid">20529928</pub-id>
</mixed-citation>
</ref>
<ref id="B15">
<mixed-citation publication-type="journal">
<name>
<surname>Rolfsmeier</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Dixon</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Pessoa-Brandão</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Pelletier</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Miret</surname>
<given-names>JJ</given-names>
</name>
<name>
<surname>Lahue</surname>
<given-names>RS</given-names>
</name>
<article-title>Cis-Elements Governing Trinucleotide Repeat Instability in Saccharomyces cerevisiae</article-title>
<source>Genetics</source>
<year>2001</year>
<volume>157</volume>
<issue>4</issue>
<fpage>1569</fpage>
<lpage>1579</lpage>
<pub-id pub-id-type="pmid">11290713</pub-id>
</mixed-citation>
</ref>
<ref id="B16">
<mixed-citation publication-type="journal">
<name>
<surname>Bichara</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Wagner</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lambert</surname>
<given-names>IB</given-names>
</name>
<article-title>Mechanisms of tandem repeat instability in bacteria</article-title>
<source>Mutat Res</source>
<year>2006</year>
<volume>598</volume>
<issue>1-2</issue>
<fpage>144</fpage>
<lpage>163</lpage>
<pub-id pub-id-type="doi">10.1016/j.mrfmmm.2006.01.020</pub-id>
<pub-id pub-id-type="pmid">16519906</pub-id>
</mixed-citation>
</ref>
<ref id="B17">
<mixed-citation publication-type="journal">
<name>
<surname>Sobczak</surname>
<given-names>K</given-names>
</name>
<name>
<surname>de Mezer</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Michlewski</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Krol</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Krzyzosiak</surname>
<given-names>WJ</given-names>
</name>
<article-title>RNA structure of trinucleotide repeats associated with human neurological diseases</article-title>
<source>Nucleic Acids Research</source>
<year>2003</year>
<volume>31</volume>
<issue>19</issue>
<fpage>5469</fpage>
<lpage>5482</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkg766</pub-id>
<pub-id pub-id-type="pmid">14500809</pub-id>
</mixed-citation>
</ref>
<ref id="B18">
<mixed-citation publication-type="journal">
<name>
<surname>Heidenfelder</surname>
<given-names>BL</given-names>
</name>
<name>
<surname>Makhov</surname>
<given-names>AM</given-names>
</name>
<name>
<surname>Topal</surname>
<given-names>MD</given-names>
</name>
<article-title>Hairpin formation in Friedreich's Ataxia triplet-repeat expansion</article-title>
<source>J Biol Chem</source>
<year>2003</year>
<volume>278</volume>
<fpage>2425</fpage>
<lpage>2431</lpage>
<pub-id pub-id-type="doi">10.1074/jbc.M210643200</pub-id>
<pub-id pub-id-type="pmid">12441336</pub-id>
</mixed-citation>
</ref>
<ref id="B19">
<mixed-citation publication-type="journal">
<name>
<surname>Gacy</surname>
<given-names>AM</given-names>
</name>
<name>
<surname>Goellner</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Juranic</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Macura</surname>
<given-names>S</given-names>
</name>
<name>
<surname>McMurray</surname>
<given-names>CT</given-names>
</name>
<article-title>Trinucleotide repeats that expand in human disease form hairpin structures in vitro</article-title>
<source>Cell</source>
<year>1995</year>
<volume>81</volume>
<issue>4</issue>
<fpage>533</fpage>
<lpage>540</lpage>
<pub-id pub-id-type="doi">10.1016/0092-8674(95)90074-8</pub-id>
<pub-id pub-id-type="pmid">7758107</pub-id>
</mixed-citation>
</ref>
<ref id="B20">
<mixed-citation publication-type="journal">
<name>
<surname>Reddy</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Tam</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Bowater</surname>
<given-names>RP</given-names>
</name>
<name>
<surname>Barber</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Tomlinson</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Nichol Edamura</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>YH</given-names>
</name>
<name>
<surname>Pearson</surname>
<given-names>CE</given-names>
</name>
<article-title>Determinants of R-loop formation at convergent bidirectionally transcribed trinucleotide repeats</article-title>
<source>Nucleic Acids Research</source>
<year>2011</year>
<volume>39</volume>
<issue>5</issue>
<fpage>1749</fpage>
<lpage>1762</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkq935</pub-id>
<pub-id pub-id-type="pmid">21051337</pub-id>
</mixed-citation>
</ref>
<ref id="B21">
<mixed-citation publication-type="journal">
<name>
<surname>Michlewski</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Krzyzosiak</surname>
<given-names>WJ</given-names>
</name>
<article-title>Molecular Architecture of CAG Repeats in Human Disease Related Transcripts</article-title>
<source>Journal of Molecular Biology</source>
<year>2004</year>
<volume>340</volume>
<issue>4</issue>
<fpage>665</fpage>
<lpage>679</lpage>
<pub-id pub-id-type="doi">10.1016/j.jmb.2004.05.021</pub-id>
<pub-id pub-id-type="pmid">15223312</pub-id>
</mixed-citation>
</ref>
<ref id="B22">
<mixed-citation publication-type="journal">
<name>
<surname>Wang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Vitalis</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Wyczalkowski</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Pappu</surname>
<given-names>RV</given-names>
</name>
<article-title>Characterizing the conformational ensemble of monomeric polyglutamine</article-title>
<source>Proteins</source>
<year>2006</year>
<volume>63</volume>
<issue>2</issue>
<fpage>297</fpage>
<lpage>311</lpage>
<pub-id pub-id-type="pmid">16299774</pub-id>
</mixed-citation>
</ref>
<ref id="B23">
<mixed-citation publication-type="journal">
<name>
<surname>Faux</surname>
<given-names>NG</given-names>
</name>
<name>
<surname>Bottomley</surname>
<given-names>SP</given-names>
</name>
<name>
<surname>Lesk</surname>
<given-names>AM</given-names>
</name>
<name>
<surname>Irving</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Morrison</surname>
<given-names>JR</given-names>
</name>
<name>
<surname>de la Banda</surname>
<given-names>MG</given-names>
</name>
<name>
<surname>Whisstock</surname>
<given-names>JC</given-names>
</name>
<article-title>Functional insights from the distribution and role of homopeptide repeat-containing proteins</article-title>
<source>Genome Research</source>
<year>2005</year>
<volume>15</volume>
<issue>4</issue>
<fpage>537</fpage>
<lpage>551</lpage>
<pub-id pub-id-type="doi">10.1101/gr.3096505</pub-id>
<pub-id pub-id-type="pmid">15805494</pub-id>
</mixed-citation>
</ref>
<ref id="B24">
<mixed-citation publication-type="journal">
<name>
<surname>Kelkar</surname>
<given-names>YD</given-names>
</name>
<name>
<surname>Tyekucheva</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Chiaromonte</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Makova</surname>
<given-names>KD</given-names>
</name>
<article-title>The genome-wide determinants of human and chimpanzee microsatellite evolution</article-title>
<source>Genome Research</source>
<year>2008</year>
<volume>18</volume>
<fpage>30</fpage>
<lpage>38</lpage>
<pub-id pub-id-type="pmid">18032720</pub-id>
</mixed-citation>
</ref>
<ref id="B25">
<mixed-citation publication-type="journal">
<name>
<surname>Vogler</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Keys</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Nemoto</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Colman</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Jay</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Keim</surname>
<given-names>P</given-names>
</name>
<article-title>Effect of repeat copy number on variable-number tandem repeat mutations in Escherichia coli O157:H7</article-title>
<source>Journal of Bacteriology</source>
<year>2006</year>
<volume>188</volume>
<issue>12</issue>
<fpage>4253</fpage>
<lpage>4263</lpage>
<pub-id pub-id-type="doi">10.1128/JB.00001-06</pub-id>
<pub-id pub-id-type="pmid">16740932</pub-id>
</mixed-citation>
</ref>
<ref id="B26">
<mixed-citation publication-type="journal">
<name>
<surname>Wooster</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Cleton-Jansen</surname>
<given-names>AM</given-names>
</name>
<name>
<surname>Collins</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Mangion</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Cornelis</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Gusterson</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Ponder</surname>
<given-names>B</given-names>
</name>
<name>
<surname>von Deimling</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Wiestler</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Cornelisse</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Devilee</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Stratton</surname>
<given-names>M</given-names>
</name>
<article-title>Instability of short tandem repeats (microsatellites) in human cancers</article-title>
<source>Nature Genetics</source>
<year>1994</year>
<volume>6</volume>
<issue>2</issue>
<fpage>152</fpage>
<lpage>156</lpage>
<pub-id pub-id-type="doi">10.1038/ng0294-152</pub-id>
<pub-id pub-id-type="pmid">8162069</pub-id>
</mixed-citation>
</ref>
<ref id="B27">
<mixed-citation publication-type="journal">
<name>
<surname>O'Dushlaine</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Shields</surname>
<given-names>D</given-names>
</name>
<article-title>Tandem repeat copy-number variation in protein-coding regions of human genes</article-title>
<source>Genome Biology</source>
<year>2005</year>
<volume>6</volume>
<issue>8</issue>
<fpage>R69</fpage>
<pub-id pub-id-type="doi">10.1186/gb-2005-6-8-r69</pub-id>
<pub-id pub-id-type="pmid">16086851</pub-id>
</mixed-citation>
</ref>
<ref id="B28">
<mixed-citation publication-type="journal">
<name>
<surname>Legendre</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Pochet</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Pak</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Verstrepen</surname>
<given-names>KJ</given-names>
</name>
<article-title>Sequence-based estimation of minisatellite and microsatellite repeat variability</article-title>
<source>Genome Research</source>
<year>2007</year>
<volume>17</volume>
<issue>12</issue>
<fpage>1787</fpage>
<lpage>1796</lpage>
<pub-id pub-id-type="doi">10.1101/gr.6554007</pub-id>
<pub-id pub-id-type="pmid">17978285</pub-id>
</mixed-citation>
</ref>
<ref id="B29">
<mixed-citation publication-type="journal">
<name>
<surname>Benson</surname>
<given-names>G</given-names>
</name>
<article-title>Tandem repeats finder: A program to analyze DNA sequences</article-title>
<source>Nucleic Acids Research</source>
<year>1999</year>
<volume>27</volume>
<issue>2</issue>
<fpage>573</fpage>
<lpage>580</lpage>
<pub-id pub-id-type="doi">10.1093/nar/27.2.573</pub-id>
<pub-id pub-id-type="pmid">9862982</pub-id>
</mixed-citation>
</ref>
<ref id="B30">
<mixed-citation publication-type="journal">
<name>
<surname>Grissa</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Vergnaud</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Pourcel</surname>
<given-names>C</given-names>
</name>
<article-title>CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats</article-title>
<source>Nucleic Acids Research</source>
<year>2007</year>
<volume>35</volume>
<issue>Web Server issue</issue>
<fpage>W52</fpage>
<lpage>W57</lpage>
<pub-id pub-id-type="pmid">17537822</pub-id>
</mixed-citation>
</ref>
<ref id="B31">
<mixed-citation publication-type="journal">
<name>
<surname>Kolpakov</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Bana</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Kucherov</surname>
<given-names>G</given-names>
</name>
<article-title>mreps: efficient and flexible detection of tandem repeats in DNA</article-title>
<source>Nucleic Acids Research</source>
<year>2003</year>
<volume>31</volume>
<issue>13</issue>
<fpage>3672</fpage>
<lpage>3678</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkg617</pub-id>
<pub-id pub-id-type="pmid">12824391</pub-id>
</mixed-citation>
</ref>
<ref id="B32">
<mixed-citation publication-type="journal">
<name>
<surname>Kurtz</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Choudhuri</surname>
<given-names>JV</given-names>
</name>
<name>
<surname>Ohlebusch</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Schleiermacher</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Stoye</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Giegerich</surname>
<given-names>R</given-names>
</name>
<article-title>REPuter: the manifold applications of repeat analysis on a genomic scale</article-title>
<source>Nucleic Acids Research</source>
<year>2001</year>
<volume>29</volume>
<issue>22</issue>
<fpage>4633</fpage>
<lpage>4642</lpage>
<pub-id pub-id-type="doi">10.1093/nar/29.22.4633</pub-id>
<pub-id pub-id-type="pmid">11713313</pub-id>
</mixed-citation>
</ref>
<ref id="B33">
<mixed-citation publication-type="journal">
<name>
<surname>Wexler</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Yakhini</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Kashi</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Geiger</surname>
<given-names>D</given-names>
</name>
<article-title>Finding approximate tandem repeats in genomic sequences</article-title>
<source>Journal of Computational Biology</source>
<year>2005</year>
<volume>12</volume>
<issue>7</issue>
<fpage>928</fpage>
<lpage>942</lpage>
<pub-id pub-id-type="doi">10.1089/cmb.2005.12.928</pub-id>
<pub-id pub-id-type="pmid">16201913</pub-id>
</mixed-citation>
</ref>
<ref id="B34">
<mixed-citation publication-type="journal">
<name>
<surname>Sokol</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Benson</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Tojeira</surname>
<given-names>J</given-names>
</name>
<article-title>Tandem repeats over the edit distance</article-title>
<source>Bioinformatics</source>
<year>2007</year>
<volume>23</volume>
<issue>2</issue>
<fpage>e30</fpage>
<lpage>e35</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btl309</pub-id>
<pub-id pub-id-type="pmid">17237101</pub-id>
</mixed-citation>
</ref>
<ref id="B35">
<mixed-citation publication-type="journal">
<name>
<surname>Leclercq</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Rivals</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Jarne</surname>
<given-names>P</given-names>
</name>
<article-title>Detecting microsatellites within genomes: significant variation among algorithms</article-title>
<source>BMC Bioinformatics</source>
<year>2007</year>
<volume>8</volume>
<fpage>125</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-8-125</pub-id>
<pub-id pub-id-type="pmid">17442102</pub-id>
</mixed-citation>
</ref>
<ref id="B36">
<mixed-citation publication-type="other">
<article-title>JasperReports Welcome Page</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.jasperforge.org">http://www.jasperforge.org</ext-link>
</mixed-citation>
</ref>
<ref id="B37">
<mixed-citation publication-type="other">
<article-title>ATRhunter Welcome Page</article-title>
<ext-link ext-link-type="uri" xlink:href="http://bioinfo.cs.technion.ac.il/atrhunter">http://bioinfo.cs.technion.ac.il/atrhunter</ext-link>
</mixed-citation>
</ref>
<ref id="B38">
<mixed-citation publication-type="other">
<article-title>mreps Welcome Page</article-title>
<ext-link ext-link-type="uri" xlink:href="http://bioinfo.lifl.fr/mreps/">http://bioinfo.lifl.fr/mreps/</ext-link>
</mixed-citation>
</ref>
<ref id="B39">
<mixed-citation publication-type="other">
<article-title>TandemSWAN Welcome Page</article-title>
<ext-link ext-link-type="uri" xlink:href="http://favorov.imb.ac.ru/swan/home.html">http://favorov.imb.ac.ru/swan/home.html</ext-link>
</mixed-citation>
</ref>
<ref id="B40">
<mixed-citation publication-type="other">
<article-title>Tandem Repeats Finder Welcome Page</article-title>
<ext-link ext-link-type="uri" xlink:href="http://tandem.bu.edu/trf/trf.html">http://tandem.bu.edu/trf/trf.html</ext-link>
</mixed-citation>
</ref>
<ref id="B41">
<mixed-citation publication-type="journal">
<name>
<surname>Butland</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Devon</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Mead</surname>
<given-names>CL</given-names>
</name>
<name>
<surname>Meynert</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Neal</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Wilkinson</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Yuen</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hayden</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Holt</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Leavitt</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Ouellette</surname>
<given-names>BF</given-names>
</name>
<article-title>CAG-encoded polyglutamine length polymorphism in the human genome</article-title>
<source>BMC Genomics</source>
<year>2007</year>
<volume>8</volume>
<fpage>126</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2164-8-126</pub-id>
<pub-id pub-id-type="pmid">17519034</pub-id>
</mixed-citation>
</ref>
<ref id="B42">
<mixed-citation publication-type="journal">
<name>
<surname>Hayes</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Turecki</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Brisebois</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Lopes-Cendes</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Gaspar</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Riess</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Ranum</surname>
<given-names>LP</given-names>
</name>
<name>
<surname>Pulst</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Rouleau</surname>
<given-names>GA</given-names>
</name>
<article-title>CAG repeat length in RAI1 is associated with age at onset variability in spinocerebellar ataxia type 2 (SCA2)</article-title>
<source>Human Molecular Genetics</source>
<year>2000</year>
<volume>9</volume>
<issue>12</issue>
<fpage>1753</fpage>
<lpage>1758</lpage>
<pub-id pub-id-type="doi">10.1093/hmg/9.12.1753</pub-id>
<pub-id pub-id-type="pmid">10915763</pub-id>
</mixed-citation>
</ref>
<ref id="B43">
<mixed-citation publication-type="journal">
<name>
<surname>Ayres</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Shum</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Akarsu</surname>
<given-names>AN</given-names>
</name>
<name>
<surname>Dashner</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Takahashi</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Ikura</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Slavkin</surname>
<given-names>HC</given-names>
</name>
<name>
<surname>Nuckolls</surname>
<given-names>GH</given-names>
</name>
<article-title>DACH: Genomic Characterization, Evaluation as a Candidate for Postaxial Polydactyly Type A2, and Developmental Expression Pattern of the Mouse Homologue</article-title>
<source>Genomics</source>
<year>2001</year>
<volume>77</volume>
<issue>1-2</issue>
<fpage>18</fpage>
<lpage>26</lpage>
<pub-id pub-id-type="doi">10.1006/geno.2001.6618</pub-id>
<pub-id pub-id-type="pmid">11543628</pub-id>
</mixed-citation>
</ref>
<ref id="B44">
<mixed-citation publication-type="journal">
<name>
<surname>Köttgen</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Pattaro</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Böger</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Fuchsberger</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Olden</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Glazer</surname>
<given-names>NL</given-names>
</name>
<name>
<surname>Parsa</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>AV</given-names>
</name>
<name>
<surname>O'Connell</surname>
<given-names>JR</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Schmidt</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Tanaka</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Isaacs</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Ketkar</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Hwang</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>AD</given-names>
</name>
<name>
<surname>Dehghan</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Teumer</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Paré</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Atkinson</surname>
<given-names>EJ</given-names>
</name>
<name>
<surname>Zeller</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Lohman</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Cornelis</surname>
<given-names>MC</given-names>
</name>
<name>
<surname>Probst-Hensch</surname>
<given-names>NM</given-names>
</name>
<name>
<surname>Kronenberg</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Tönjes</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hayward</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Aspelund</surname>
<given-names>T</given-names>
</name>
<etal></etal>
<article-title>New loci associated with kidney function and chronic kidney disease</article-title>
<source>Nat Genet</source>
<year>2010</year>
<volume>42</volume>
<issue>5</issue>
<fpage>376</fpage>
<lpage>384</lpage>
<pub-id pub-id-type="doi">10.1038/ng.568</pub-id>
<pub-id pub-id-type="pmid">20383146</pub-id>
</mixed-citation>
</ref>
<ref id="B45">
<mixed-citation publication-type="journal">
<name>
<surname>Huang</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Winter</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Weinstock</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Xing</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Goodstadt</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Stenson</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Alba</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Ponting</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Fechtel</surname>
<given-names>K</given-names>
</name>
<article-title>Evolutionary conservation and selection of human disease gene orthologs in the rat and mouse genomes</article-title>
<source>Genome Biology</source>
<year>2004</year>
<volume>5</volume>
<issue>7</issue>
<fpage>R47</fpage>
<pub-id pub-id-type="doi">10.1186/gb-2004-5-7-r47</pub-id>
<pub-id pub-id-type="pmid">15239832</pub-id>
</mixed-citation>
</ref>
<ref id="B46">
<mixed-citation publication-type="journal">
<name>
<surname>Ring</surname>
<given-names>HZ</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Guilbot</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Brice</surname>
<given-names>A</given-names>
</name>
<name>
<surname>LeGuern</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Francke</surname>
<given-names>U</given-names>
</name>
<article-title>The human neuregulin-2 (NRG2) gene: cloning, mapping and evaluation as a candidate for the autosomal recessive form of Charcot-Marie-Tooth disease linked to 5q</article-title>
<source>Human Genetics</source>
<year>1999</year>
<volume>104</volume>
<fpage>326</fpage>
<lpage>332</lpage>
<pub-id pub-id-type="doi">10.1007/s004390050961</pub-id>
<pub-id pub-id-type="pmid">10369162</pub-id>
</mixed-citation>
</ref>
<ref id="B47">
<mixed-citation publication-type="journal">
<name>
<surname>Sherry</surname>
<given-names>ST</given-names>
</name>
<name>
<surname>Ward</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kholodov</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Phan</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Smigielski</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Sirotkin</surname>
<given-names>K</given-names>
</name>
<article-title>dbSNP: the NCBI database of genetic variation</article-title>
<source>Nucleic Acids Research</source>
<year>2001</year>
<volume>29</volume>
<fpage>308</fpage>
<lpage>311</lpage>
<pub-id pub-id-type="doi">10.1093/nar/29.1.308</pub-id>
<pub-id pub-id-type="pmid">11125122</pub-id>
</mixed-citation>
</ref>
<ref id="B48">
<mixed-citation publication-type="other">
<article-title>dbSNP Welcome Page</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/snp">http://www.ncbi.nlm.nih.gov/snp</ext-link>
</mixed-citation>
</ref>
<ref id="B49">
<mixed-citation publication-type="journal">
<name>
<surname>Boby</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Patch</surname>
<given-names>AM</given-names>
</name>
<name>
<surname>Aves</surname>
<given-names>SJ</given-names>
</name>
<article-title>TRbase: a database relating tandem repeats to disease genes for the human genome</article-title>
<source>Bioinformatics</source>
<year>2005</year>
<volume>21</volume>
<fpage>811</fpage>
<lpage>816</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/bti059</pub-id>
<pub-id pub-id-type="pmid">15479712</pub-id>
</mixed-citation>
</ref>
<ref id="B50">
<mixed-citation publication-type="journal">
<name>
<surname>Payseur</surname>
<given-names>BA</given-names>
</name>
<name>
<surname>Jing</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Haasl</surname>
<given-names>RJ</given-names>
</name>
<article-title>A Genomic Portrait of Human Microsatellite Variation</article-title>
<source>Molecular Biology and Evolution</source>
<year>2011</year>
<volume>28</volume>
<fpage>303</fpage>
<lpage>312</lpage>
<pub-id pub-id-type="doi">10.1093/molbev/msq198</pub-id>
<pub-id pub-id-type="pmid">20675409</pub-id>
</mixed-citation>
</ref>
<ref id="B51">
<mixed-citation publication-type="journal">
<name>
<surname>Mills</surname>
<given-names>RE</given-names>
</name>
<name>
<surname>Luttig</surname>
<given-names>CT</given-names>
</name>
<name>
<surname>Larkins</surname>
<given-names>CE</given-names>
</name>
<name>
<surname>Beauchamp</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Tsui</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Pittard</surname>
<given-names>WS</given-names>
</name>
<name>
<surname>Devine</surname>
<given-names>SE</given-names>
</name>
<article-title>An initial map of insertion and deletion (INDEL) variation in the human genome</article-title>
<source>Genome Research</source>
<year>2006</year>
<volume>16</volume>
<issue>9</issue>
<fpage>1182</fpage>
<lpage>1190</lpage>
<pub-id pub-id-type="doi">10.1101/gr.4565806</pub-id>
<pub-id pub-id-type="pmid">16902084</pub-id>
</mixed-citation>
</ref>
<ref id="B52">
<mixed-citation publication-type="journal">
<name>
<surname>Reddy</surname>
<given-names>PH</given-names>
</name>
<name>
<surname>Stockburger</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Gillevet</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Tagle</surname>
<given-names>DA</given-names>
</name>
<article-title>Mapping and Characterization of Novel (CAG)n Repeat cDNAs from Adult Human Brain Derived by the Oligo Capture Method</article-title>
<source>Genomics</source>
<year>1997</year>
<volume>46</volume>
<issue>2</issue>
<fpage>174</fpage>
<lpage>182</lpage>
<pub-id pub-id-type="doi">10.1006/geno.1997.5044</pub-id>
<pub-id pub-id-type="pmid">9417904</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

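# Directory of this exploration step (assumes WICRI_ROOT is already set)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/TelematiV1/Data/Pmc/Corpus
# Select record 000419 from biblio.hfd, indent the XML, and page through it
HfdSelect -h "$EXPLOR_STEP/biblio.hfd" -nk 000419 | SxmlIndent | more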

Or

# Same selection, assuming EXPLOR_AREA is already set to the root of this exploration area
HfdSelect -h "$EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd" -nk 000419 | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    TelematiV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

This area was generated with Dilib version V0.6.31.
Data generation: Thu Nov 2 16:09:04 2017. Site generation: Sun Mar 10 16:42:28 2024