Exploration server on relations between France and Australia


Internal identifier: 0007389 ( Pmc/Corpus ); previous: 0007388; next: 0007390



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Long-range evolutionary constraints reveal
<italic>cis</italic>
-regulatory interactions on the human X chromosome</title>
<author>
<name sortKey="Naville, Magali" sort="Naville, Magali" uniqKey="Naville M" first="Magali" last="Naville">Magali Naville</name>
<affiliation>
<nlm:aff id="a1">
<institution>Ecole Normale Supérieure, Institut de Biologie de l'ENS, IBENS</institution>
, 46 rue d'Ulm, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>CNRS</institution>
, UMR 8197, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a3">
<institution>Inserm</institution>
, U1024, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ishibashi, Minaka" sort="Ishibashi, Minaka" uniqKey="Ishibashi M" first="Minaka" last="Ishibashi">Minaka Ishibashi</name>
<affiliation>
<nlm:aff id="a4">
<institution>Brain and Mind Research Institute, Sydney Medical School, University of Sydney</institution>
, Camperdown, New South Wales 2050,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ferg, Marco" sort="Ferg, Marco" uniqKey="Ferg M" first="Marco" last="Ferg">Marco Ferg</name>
<affiliation>
<nlm:aff id="a5">
<institution>Institute of Toxicology and Genetics and European Zebrafish Resource Centre, Karlsruhe Institute of Technology</institution>
, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bengani, Hemant" sort="Bengani, Hemant" uniqKey="Bengani H" first="Hemant" last="Bengani">Hemant Bengani</name>
<affiliation>
<nlm:aff id="a6">
<institution>MRC Human Genetics Unit, MRC Institute of Medical Genetic and Molecular Medicine, University of Edinburgh</institution>
, Edinburgh EH4 2XU,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Rinkwitz, Silke" sort="Rinkwitz, Silke" uniqKey="Rinkwitz S" first="Silke" last="Rinkwitz">Silke Rinkwitz</name>
<affiliation>
<nlm:aff id="a4">
<institution>Brain and Mind Research Institute, Sydney Medical School, University of Sydney</institution>
, Camperdown, New South Wales 2050,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Krecsmarik, Monika" sort="Krecsmarik, Monika" uniqKey="Krecsmarik M" first="Monika" last="Krecsmarik">Monika Krecsmarik</name>
<affiliation>
<nlm:aff id="a7">
<institution>Paris-Saclay Institute for Neuroscience (Neuro-PSI), UMR9197 CNRS-Université Paris Sud</institution>
, Avenue de la Terrasse, Gif-sur-Yvette 91190,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hawkins, Thomas A" sort="Hawkins, Thomas A" uniqKey="Hawkins T" first="Thomas A." last="Hawkins">Thomas A. Hawkins</name>
<affiliation>
<nlm:aff id="a8">
<institution>C.D.B. Division of Biosciences, Anatomy building, UCL</institution>
, Gower street, London, WC1E 6BT,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wilson, Stephen W" sort="Wilson, Stephen W" uniqKey="Wilson S" first="Stephen W." last="Wilson">Stephen W. Wilson</name>
<affiliation>
<nlm:aff id="a8">
<institution>C.D.B. Division of Biosciences, Anatomy building, UCL</institution>
, Gower street, London, WC1E 6BT,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Manning, Elizabeth" sort="Manning, Elizabeth" uniqKey="Manning E" first="Elizabeth" last="Manning">Elizabeth Manning</name>
<affiliation>
<nlm:aff id="a4">
<institution>Brain and Mind Research Institute, Sydney Medical School, University of Sydney</institution>
, Camperdown, New South Wales 2050,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Chilamakuri, Chandra S R" sort="Chilamakuri, Chandra S R" uniqKey="Chilamakuri C" first="Chandra S. R." last="Chilamakuri">Chandra S. R. Chilamakuri</name>
<affiliation>
<nlm:aff id="a9">
<institution>Department of Tumor Biology, The Norwegian Radium Hospital</institution>
, 0310 Oslo,
<country>Norway</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wilson, David I" sort="Wilson, David I" uniqKey="Wilson D" first="David I." last="Wilson">David I. Wilson</name>
<affiliation>
<nlm:aff id="a10">
<institution>University of Southampton and University Hospital Southampton NHS Foundation Trust, Centre for Human Development, Stem Cells and Regeneration, MP808, Faculty of Medicine, Southampton General Hospital</institution>
, Tremona Road, Southampton SO16 6YD,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Louis, Alexandra" sort="Louis, Alexandra" uniqKey="Louis A" first="Alexandra" last="Louis">Alexandra Louis</name>
<affiliation>
<nlm:aff id="a1">
<institution>Ecole Normale Supérieure, Institut de Biologie de l'ENS, IBENS</institution>
, 46 rue d'Ulm, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>CNRS</institution>
, UMR 8197, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a3">
<institution>Inserm</institution>
, U1024, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lucy Raymond, F" sort="Lucy Raymond, F" uniqKey="Lucy Raymond F" first="F." last="Lucy Raymond">F. Lucy Raymond</name>
<affiliation>
<nlm:aff id="a11">
<institution>Cambridge Institute for Medical Research, University of Cambridge</institution>
, Hills Road, Cambridge CB2 0XY,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Rastegar, Sepand" sort="Rastegar, Sepand" uniqKey="Rastegar S" first="Sepand" last="Rastegar">Sepand Rastegar</name>
<affiliation>
<nlm:aff id="a5">
<institution>Institute of Toxicology and Genetics and European Zebrafish Resource Centre, Karlsruhe Institute of Technology</institution>
, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Str Hle, Uwe" sort="Str Hle, Uwe" uniqKey="Str Hle U" first="Uwe" last="Strähle">Uwe Strähle</name>
<affiliation>
<nlm:aff id="a5">
<institution>Institute of Toxicology and Genetics and European Zebrafish Resource Centre, Karlsruhe Institute of Technology</institution>
, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lenhard, Boris" sort="Lenhard, Boris" uniqKey="Lenhard B" first="Boris" last="Lenhard">Boris Lenhard</name>
<affiliation>
<nlm:aff id="a12">
<institution>Institute of Clinical Sciences, MRC Clinical Sciences Centre, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus</institution>
, Du Cane Road, London W12 0NN,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bally Cuif, Laure" sort="Bally Cuif, Laure" uniqKey="Bally Cuif L" first="Laure" last="Bally-Cuif">Laure Bally-Cuif</name>
<affiliation>
<nlm:aff id="a7">
<institution>Paris-Saclay Institute for Neuroscience (Neuro-PSI), UMR9197 CNRS-Université Paris Sud</institution>
, Avenue de la Terrasse, Gif-sur-Yvette 91190,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Van Heyningen, Veronica" sort="Van Heyningen, Veronica" uniqKey="Van Heyningen V" first="Veronica" last="van Heyningen">Veronica van Heyningen</name>
<affiliation>
<nlm:aff id="a6">
<institution>MRC Human Genetics Unit, MRC Institute of Medical Genetic and Molecular Medicine, University of Edinburgh</institution>
, Edinburgh EH4 2XU,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fitzpatrick, David R" sort="Fitzpatrick, David R" uniqKey="Fitzpatrick D" first="David R." last="FitzPatrick">David R. FitzPatrick</name>
<affiliation>
<nlm:aff id="a6">
<institution>MRC Human Genetics Unit, MRC Institute of Medical Genetic and Molecular Medicine, University of Edinburgh</institution>
, Edinburgh EH4 2XU,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Becker, Thomas S" sort="Becker, Thomas S" uniqKey="Becker T" first="Thomas S." last="Becker">Thomas S. Becker</name>
<affiliation>
<nlm:aff id="a4">
<institution>Brain and Mind Research Institute, Sydney Medical School, University of Sydney</institution>
, Camperdown, New South Wales 2050,
<country>Australia</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a13">
<institution>Department of Clinical Medicine, University of Bergen</institution>
, Bergen 5009,
<country>Norway</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Roest Crollius, Hugues" sort="Roest Crollius, Hugues" uniqKey="Roest Crollius H" first="Hugues" last="Roest Crollius">Hugues Roest Crollius</name>
<affiliation>
<nlm:aff id="a1">
<institution>Ecole Normale Supérieure, Institut de Biologie de l'ENS, IBENS</institution>
, 46 rue d'Ulm, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>CNRS</institution>
, UMR 8197, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a3">
<institution>Inserm</institution>
, U1024, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">25908307</idno>
<idno type="pmc">4423230</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4423230</idno>
<idno type="RBID">PMC:4423230</idno>
<idno type="doi">10.1038/ncomms7904</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000738</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000738</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Long-range evolutionary constraints reveal
<italic>cis</italic>
-regulatory interactions on the human X chromosome</title>
<author>
<name sortKey="Naville, Magali" sort="Naville, Magali" uniqKey="Naville M" first="Magali" last="Naville">Magali Naville</name>
<affiliation>
<nlm:aff id="a1">
<institution>Ecole Normale Supérieure, Institut de Biologie de l'ENS, IBENS</institution>
, 46 rue d'Ulm, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>CNRS</institution>
, UMR 8197, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a3">
<institution>Inserm</institution>
, U1024, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ishibashi, Minaka" sort="Ishibashi, Minaka" uniqKey="Ishibashi M" first="Minaka" last="Ishibashi">Minaka Ishibashi</name>
<affiliation>
<nlm:aff id="a4">
<institution>Brain and Mind Research Institute, Sydney Medical School, University of Sydney</institution>
, Camperdown, New South Wales 2050,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ferg, Marco" sort="Ferg, Marco" uniqKey="Ferg M" first="Marco" last="Ferg">Marco Ferg</name>
<affiliation>
<nlm:aff id="a5">
<institution>Institute of Toxicology and Genetics and European Zebrafish Resource Centre, Karlsruhe Institute of Technology</institution>
, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bengani, Hemant" sort="Bengani, Hemant" uniqKey="Bengani H" first="Hemant" last="Bengani">Hemant Bengani</name>
<affiliation>
<nlm:aff id="a6">
<institution>MRC Human Genetics Unit, MRC Institute of Medical Genetic and Molecular Medicine, University of Edinburgh</institution>
, Edinburgh EH4 2XU,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Rinkwitz, Silke" sort="Rinkwitz, Silke" uniqKey="Rinkwitz S" first="Silke" last="Rinkwitz">Silke Rinkwitz</name>
<affiliation>
<nlm:aff id="a4">
<institution>Brain and Mind Research Institute, Sydney Medical School, University of Sydney</institution>
, Camperdown, New South Wales 2050,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Krecsmarik, Monika" sort="Krecsmarik, Monika" uniqKey="Krecsmarik M" first="Monika" last="Krecsmarik">Monika Krecsmarik</name>
<affiliation>
<nlm:aff id="a7">
<institution>Paris-Saclay Institute for Neuroscience (Neuro-PSI), UMR9197 CNRS-Université Paris Sud</institution>
, Avenue de la Terrasse, Gif-sur-Yvette 91190,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hawkins, Thomas A" sort="Hawkins, Thomas A" uniqKey="Hawkins T" first="Thomas A." last="Hawkins">Thomas A. Hawkins</name>
<affiliation>
<nlm:aff id="a8">
<institution>C.D.B. Division of Biosciences, Anatomy building, UCL</institution>
, Gower street, London, WC1E 6BT,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wilson, Stephen W" sort="Wilson, Stephen W" uniqKey="Wilson S" first="Stephen W." last="Wilson">Stephen W. Wilson</name>
<affiliation>
<nlm:aff id="a8">
<institution>C.D.B. Division of Biosciences, Anatomy building, UCL</institution>
, Gower street, London, WC1E 6BT,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Manning, Elizabeth" sort="Manning, Elizabeth" uniqKey="Manning E" first="Elizabeth" last="Manning">Elizabeth Manning</name>
<affiliation>
<nlm:aff id="a4">
<institution>Brain and Mind Research Institute, Sydney Medical School, University of Sydney</institution>
, Camperdown, New South Wales 2050,
<country>Australia</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Chilamakuri, Chandra S R" sort="Chilamakuri, Chandra S R" uniqKey="Chilamakuri C" first="Chandra S. R." last="Chilamakuri">Chandra S. R. Chilamakuri</name>
<affiliation>
<nlm:aff id="a9">
<institution>Department of Tumor Biology, The Norwegian Radium Hospital</institution>
, 0310 Oslo,
<country>Norway</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wilson, David I" sort="Wilson, David I" uniqKey="Wilson D" first="David I." last="Wilson">David I. Wilson</name>
<affiliation>
<nlm:aff id="a10">
<institution>University of Southampton and University Hospital Southampton NHS Foundation Trust, Centre for Human Development, Stem Cells and Regeneration, MP808, Faculty of Medicine, Southampton General Hospital</institution>
, Tremona Road, Southampton SO16 6YD,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Louis, Alexandra" sort="Louis, Alexandra" uniqKey="Louis A" first="Alexandra" last="Louis">Alexandra Louis</name>
<affiliation>
<nlm:aff id="a1">
<institution>Ecole Normale Supérieure, Institut de Biologie de l'ENS, IBENS</institution>
, 46 rue d'Ulm, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>CNRS</institution>
, UMR 8197, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a3">
<institution>Inserm</institution>
, U1024, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lucy Raymond, F" sort="Lucy Raymond, F" uniqKey="Lucy Raymond F" first="F." last="Lucy Raymond">F. Lucy Raymond</name>
<affiliation>
<nlm:aff id="a11">
<institution>Cambridge Institute for Medical Research, University of Cambridge</institution>
, Hills Road, Cambridge CB2 0XY,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Rastegar, Sepand" sort="Rastegar, Sepand" uniqKey="Rastegar S" first="Sepand" last="Rastegar">Sepand Rastegar</name>
<affiliation>
<nlm:aff id="a5">
<institution>Institute of Toxicology and Genetics and European Zebrafish Resource Centre, Karlsruhe Institute of Technology</institution>
, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Str Hle, Uwe" sort="Str Hle, Uwe" uniqKey="Str Hle U" first="Uwe" last="Strähle">Uwe Strähle</name>
<affiliation>
<nlm:aff id="a5">
<institution>Institute of Toxicology and Genetics and European Zebrafish Resource Centre, Karlsruhe Institute of Technology</institution>
, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen,
<country>Germany</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lenhard, Boris" sort="Lenhard, Boris" uniqKey="Lenhard B" first="Boris" last="Lenhard">Boris Lenhard</name>
<affiliation>
<nlm:aff id="a12">
<institution>Institute of Clinical Sciences, MRC Clinical Sciences Centre, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus</institution>
, Du Cane Road, London W12 0NN,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bally Cuif, Laure" sort="Bally Cuif, Laure" uniqKey="Bally Cuif L" first="Laure" last="Bally-Cuif">Laure Bally-Cuif</name>
<affiliation>
<nlm:aff id="a7">
<institution>Paris-Saclay Institute for Neuroscience (Neuro-PSI), UMR9197 CNRS-Université Paris Sud</institution>
, Avenue de la Terrasse, Gif-sur-Yvette 91190,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Van Heyningen, Veronica" sort="Van Heyningen, Veronica" uniqKey="Van Heyningen V" first="Veronica" last="van Heyningen">Veronica van Heyningen</name>
<affiliation>
<nlm:aff id="a6">
<institution>MRC Human Genetics Unit, MRC Institute of Medical Genetic and Molecular Medicine, University of Edinburgh</institution>
, Edinburgh EH4 2XU,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fitzpatrick, David R" sort="Fitzpatrick, David R" uniqKey="Fitzpatrick D" first="David R." last="FitzPatrick">David R. FitzPatrick</name>
<affiliation>
<nlm:aff id="a6">
<institution>MRC Human Genetics Unit, MRC Institute of Medical Genetic and Molecular Medicine, University of Edinburgh</institution>
, Edinburgh EH4 2XU,
<country>UK</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Becker, Thomas S" sort="Becker, Thomas S" uniqKey="Becker T" first="Thomas S." last="Becker">Thomas S. Becker</name>
<affiliation>
<nlm:aff id="a4">
<institution>Brain and Mind Research Institute, Sydney Medical School, University of Sydney</institution>
, Camperdown, New South Wales 2050,
<country>Australia</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a13">
<institution>Department of Clinical Medicine, University of Bergen</institution>
, Bergen 5009,
<country>Norway</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Roest Crollius, Hugues" sort="Roest Crollius, Hugues" uniqKey="Roest Crollius H" first="Hugues" last="Roest Crollius">Hugues Roest Crollius</name>
<affiliation>
<nlm:aff id="a1">
<institution>Ecole Normale Supérieure, Institut de Biologie de l'ENS, IBENS</institution>
, 46 rue d'Ulm, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a2">
<institution>CNRS</institution>
, UMR 8197, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a3">
<institution>Inserm</institution>
, U1024, Paris F-75005,
<country>France</country>
</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Nature Communications</title>
<idno type="eISSN">2041-1723</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Enhancers can regulate the transcription of genes over long genomic distances. This is thought to lead to selection against genomic rearrangements within such regions that may disrupt this functional linkage. Here we test this concept experimentally using the human X chromosome. We describe a scoring method to identify evolutionary maintenance of linkage between conserved noncoding elements and neighbouring genes. Chromatin marks associated with enhancer function are strongly correlated with this linkage score. We test >1,000 putative enhancers by transgenesis assays in zebrafish to ascertain the identity of the target gene. The majority of active enhancers drive a transgenic expression in a pattern consistent with the known expression of a linked gene. These results show that evolutionary maintenance of linkage is a reliable predictor of an enhancer's function, and provide new information to discover the genetic basis of diseases caused by the mis-regulation of gene expression.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Benko, S" uniqKey="Benko S">S. Benko</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lettice, L A" uniqKey="Lettice L">L. A. Lettice</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Goode, D K" uniqKey="Goode D">D. K. Goode</name>
</author>
<author>
<name sortKey="Snell, P" uniqKey="Snell P">P. Snell</name>
</author>
<author>
<name sortKey="Smith, S F" uniqKey="Smith S">S. F. Smith</name>
</author>
<author>
<name sortKey="Cooke, J E" uniqKey="Cooke J">J. E. Cooke</name>
</author>
<author>
<name sortKey="Elgar, G" uniqKey="Elgar G">G. Elgar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kikuta, H" uniqKey="Kikuta H">H. Kikuta</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mongin, E" uniqKey="Mongin E">E. Mongin</name>
</author>
<author>
<name sortKey="Dewar, K" uniqKey="Dewar K">K. Dewar</name>
</author>
<author>
<name sortKey="Blanchette, M" uniqKey="Blanchette M">M. Blanchette</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Blanchette, M" uniqKey="Blanchette M">M. Blanchette</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lindblad Toh, K" uniqKey="Lindblad Toh K">K. Lindblad-Toh</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ernst, J" uniqKey="Ernst J">J. Ernst</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Blow, M J" uniqKey="Blow M">M. J. Blow</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Visel, A" uniqKey="Visel A">A. Visel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stevenson, R E" uniqKey="Stevenson R">R. E. Stevenson</name>
</author>
<author>
<name sortKey="Schwartz, C E" uniqKey="Schwartz C">C. E. Schwartz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sprague, J" uniqKey="Sprague J">J. Sprague</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, G" uniqKey="Li G">G. Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fong, A P" uniqKey="Fong A">A. P. Fong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ince Dunn, G" uniqKey="Ince Dunn G">G. Ince-Dunn</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Narayanan, G" uniqKey="Narayanan G">G. Narayanan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chepelev, I" uniqKey="Chepelev I">I. Chepelev</name>
</author>
<author>
<name sortKey="Wei, G" uniqKey="Wei G">G. Wei</name>
</author>
<author>
<name sortKey="Wangsa, D" uniqKey="Wangsa D">D. Wangsa</name>
</author>
<author>
<name sortKey="Tang, Q" uniqKey="Tang Q">Q. Tang</name>
</author>
<author>
<name sortKey="Zhao, K" uniqKey="Zhao K">K. Zhao</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Merkenschlager, M" uniqKey="Merkenschlager M">M. Merkenschlager</name>
</author>
<author>
<name sortKey="Odom, D T" uniqKey="Odom D">D. T. Odom</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Andersson, R" uniqKey="Andersson R">R. Andersson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kleinjan, D A" uniqKey="Kleinjan D">D. A. Kleinjan</name>
</author>
<author>
<name sortKey="Van Heyningen, V" uniqKey="Van Heyningen V">V. van Heyningen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Benko, S" uniqKey="Benko S">S. Benko</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Smemo, S" uniqKey="Smemo S">S. Smemo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Weedon, M N" uniqKey="Weedon M">M. N. Weedon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Noonan, J P" uniqKey="Noonan J">J. P. Noonan</name>
</author>
<author>
<name sortKey="Mccallion, A S" uniqKey="Mccallion A">A. S. McCallion</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Flicek, P" uniqKey="Flicek P">P. Flicek</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dunham, I" uniqKey="Dunham I">I. Dunham</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ashburner, M" uniqKey="Ashburner M">M. Ashburner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Soutoglou, E" uniqKey="Soutoglou E">E. Soutoglou</name>
</author>
<author>
<name sortKey="Talianidis, I" uniqKey="Talianidis I">I. Talianidis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Toedling, J" uniqKey="Toedling J">J. Toedling</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ishibashi, M" uniqKey="Ishibashi M">M. Ishibashi</name>
</author>
<author>
<name sortKey="Mechaly, A S" uniqKey="Mechaly A">A. S. Mechaly</name>
</author>
<author>
<name sortKey="Becker, T S" uniqKey="Becker T">T. S. Becker</name>
</author>
<author>
<name sortKey="Rinkwitz, S" uniqKey="Rinkwitz S">S. Rinkwitz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marz, M" uniqKey="Marz M">M. Marz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Turner, K J" uniqKey="Turner K">K. J. Turner</name>
</author>
<author>
<name sortKey="Bracewell, T G" uniqKey="Bracewell T">T. G. Bracewell</name>
</author>
<author>
<name sortKey="Hawkins, T A" uniqKey="Hawkins T">T. A. Hawkins</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lauter, G" uniqKey="Lauter G">G. Lauter</name>
</author>
<author>
<name sortKey="Soll, I" uniqKey="Soll I">I. Soll</name>
</author>
<author>
<name sortKey="Hauptmann, G" uniqKey="Hauptmann G">G. Hauptmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Thisse, C" uniqKey="Thisse C">C. Thisse</name>
</author>
<author>
<name sortKey="Thisse, B" uniqKey="Thisse B">B. Thisse</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bailey, T L" uniqKey="Bailey T">T. L. Bailey</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Thomas Chollier, M" uniqKey="Thomas Chollier M">M. Thomas-Chollier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wingender, E" uniqKey="Wingender E">E. Wingender</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jolma, A" uniqKey="Jolma A">A. Jolma</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mathelier, A" uniqKey="Mathelier A">A. Mathelier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rothenaigner, I" uniqKey="Rothenaigner I">I. Rothenaigner</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Nat Commun</journal-id>
<journal-id journal-id-type="iso-abbrev">Nat Commun</journal-id>
<journal-title-group>
<journal-title>Nature Communications</journal-title>
</journal-title-group>
<issn pub-type="epub">2041-1723</issn>
<publisher>
<publisher-name>Nature Pub. Group</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">25908307</article-id>
<article-id pub-id-type="pmc">4423230</article-id>
<article-id pub-id-type="pii">ncomms7904</article-id>
<article-id pub-id-type="doi">10.1038/ncomms7904</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Long-range evolutionary constraints reveal
<italic>cis</italic>
-regulatory interactions on the human X chromosome</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Naville</surname>
<given-names>Magali</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
<xref ref-type="aff" rid="a3">3</xref>
<xref ref-type="author-notes" rid="n1">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ishibashi</surname>
<given-names>Minaka</given-names>
</name>
<xref ref-type="aff" rid="a4">4</xref>
<xref ref-type="author-notes" rid="n1">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ferg</surname>
<given-names>Marco</given-names>
</name>
<xref ref-type="aff" rid="a5">5</xref>
<xref ref-type="author-notes" rid="n1">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Bengani</surname>
<given-names>Hemant</given-names>
</name>
<xref ref-type="aff" rid="a6">6</xref>
<xref ref-type="author-notes" rid="n1">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Rinkwitz</surname>
<given-names>Silke</given-names>
</name>
<xref ref-type="aff" rid="a4">4</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Krecsmarik</surname>
<given-names>Monika</given-names>
</name>
<xref ref-type="aff" rid="a7">7</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hawkins</surname>
<given-names>Thomas A.</given-names>
</name>
<xref ref-type="aff" rid="a8">8</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wilson</surname>
<given-names>Stephen W.</given-names>
</name>
<xref ref-type="aff" rid="a8">8</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0002-8557-5940</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Manning</surname>
<given-names>Elizabeth</given-names>
</name>
<xref ref-type="aff" rid="a4">4</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Chilamakuri</surname>
<given-names>Chandra S. R.</given-names>
</name>
<xref ref-type="aff" rid="a9">9</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wilson</surname>
<given-names>David I.</given-names>
</name>
<xref ref-type="aff" rid="a10">10</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Louis</surname>
<given-names>Alexandra</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
<xref ref-type="aff" rid="a3">3</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Raymond</surname>
<given-names>F. Lucy</given-names>
</name>
<xref ref-type="aff" rid="a11">11</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Rastegar</surname>
<given-names>Sepand</given-names>
</name>
<xref ref-type="aff" rid="a5">5</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Strähle</surname>
<given-names>Uwe</given-names>
</name>
<xref ref-type="aff" rid="a5">5</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Lenhard</surname>
<given-names>Boris</given-names>
</name>
<xref ref-type="aff" rid="a12">12</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0002-1114-1509</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Bally-Cuif</surname>
<given-names>Laure</given-names>
</name>
<xref ref-type="aff" rid="a7">7</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>van Heyningen</surname>
<given-names>Veronica</given-names>
</name>
<xref ref-type="aff" rid="a6">6</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>FitzPatrick</surname>
<given-names>David R.</given-names>
</name>
<xref ref-type="aff" rid="a6">6</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Becker</surname>
<given-names>Thomas S.</given-names>
</name>
<xref ref-type="corresp" rid="c1">a</xref>
<xref ref-type="aff" rid="a4">4</xref>
<xref ref-type="aff" rid="a13">13</xref>
<xref ref-type="author-notes" rid="n2"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Roest Crollius</surname>
<given-names>Hugues</given-names>
</name>
<xref ref-type="corresp" rid="c2">b</xref>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a2">2</xref>
<xref ref-type="aff" rid="a3">3</xref>
<xref ref-type="author-notes" rid="n2"></xref>
</contrib>
<aff id="a1">
<label>1</label>
<institution>Ecole Normale Supérieure, Institut de Biologie de l'ENS, IBENS</institution>
, 46 rue d'Ulm, Paris F-75005,
<country>France</country>
</aff>
<aff id="a2">
<label>2</label>
<institution>CNRS</institution>
, UMR 8197, Paris F-75005,
<country>France</country>
</aff>
<aff id="a3">
<label>3</label>
<institution>Inserm</institution>
, U1024, Paris F-75005,
<country>France</country>
</aff>
<aff id="a4">
<label>4</label>
<institution>Brain and Mind Research Institute, Sydney Medical School, University of Sydney</institution>
, Camperdown, New South Wales 2050,
<country>Australia</country>
</aff>
<aff id="a5">
<label>5</label>
<institution>Institute of Toxicology and Genetics and European Zebrafish Resource Centre, Karlsruhe Institute of Technology</institution>
, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen,
<country>Germany</country>
</aff>
<aff id="a6">
<label>6</label>
<institution>MRC Human Genetics Unit, MRC Institute of Medical Genetic and Molecular Medicine, University of Edinburgh</institution>
, Edinburgh EH4 2XU,
<country>UK</country>
</aff>
<aff id="a7">
<label>7</label>
<institution>Paris-Saclay Institute for Neuroscience (Neuro-PSI), UMR9197 CNRS-Université Paris Sud</institution>
, Avenue de la Terrasse, Gif-sur-Yvette 91190,
<country>France</country>
</aff>
<aff id="a8">
<label>8</label>
<institution>C.D.B. Division of Biosciences, Anatomy building, UCL</institution>
, Gower street, London, WC1E 6BT,
<country>UK</country>
</aff>
<aff id="a9">
<label>9</label>
<institution>Department of Tumor Biology, The Norwegian Radium Hospital</institution>
, 0310 Oslo,
<country>Norway</country>
</aff>
<aff id="a10">
<label>10</label>
<institution>University of Southampton and University Hospital Southampton NHS Foundation Trust, Centre for Human Development, Stem Cells and Regeneration, MP808, Faculty of Medicine, Southampton General Hospital</institution>
, Tremona Road, Southampton SO16 6YD,
<country>UK</country>
</aff>
<aff id="a11">
<label>11</label>
<institution>Cambridge Institute for Medical Research, University of Cambridge</institution>
, Hills Road, Cambridge CB2 0XY,
<country>UK</country>
</aff>
<aff id="a12">
<label>12</label>
<institution>Institute of Clinical Sciences, MRC Clinical Sciences Centre, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus</institution>
, Du Cane Road, London W12 0NN,
<country>UK</country>
</aff>
<aff id="a13">
<label>13</label>
<institution>Department of Clinical Medicine, University of Bergen</institution>
, Bergen 5009,
<country>Norway</country>
</aff>
</contrib-group>
<author-notes>
<corresp id="c1">
<label>a</label>
<email>tom.becker@sydney.edu.au</email>
</corresp>
<corresp id="c2">
<label>b</label>
<email>hrc@ens.fr</email>
</corresp>
<fn id="n1">
<label>*</label>
<p>These authors contributed equally to this work.</p>
</fn>
<fn id="n2">
<label></label>
<p>These authors jointly supervised this work.</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>24</day>
<month>04</month>
<year>2015</year>
</pub-date>
<volume>6</volume>
<elocation-id>6904</elocation-id>
<history>
<date date-type="received">
<day>01</day>
<month>05</month>
<year>2014</year>
</date>
<date date-type="accepted">
<day>12</day>
<month>03</month>
<year>2015</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2015, Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved.</copyright-statement>
<copyright-year>2015</copyright-year>
<copyright-holder>Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<pmc-comment>author-paid</pmc-comment>
<license-p>This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
</license-p>
</license>
</permissions>
<abstract>
<p>Enhancers can regulate the transcription of genes over long genomic distances. This is thought to lead to selection against genomic rearrangements within such regions that may disrupt this functional linkage. Here we test this concept experimentally using the human X chromosome. We describe a scoring method to identify evolutionary maintenance of linkage between conserved noncoding elements and neighbouring genes. Chromatin marks associated with enhancer function are strongly correlated with this linkage score. We test >1,000 putative enhancers by transgenesis assays in zebrafish to ascertain the identity of the target gene. The majority of active enhancers drive a transgenic expression in a pattern consistent with the known expression of a linked gene. These results show that evolutionary maintenance of linkage is a reliable predictor of an enhancer's function, and provide new information to discover the genetic basis of diseases caused by the mis-regulation of gene expression.</p>
</abstract>
<abstract abstract-type="web-summary">
<p>
<inline-graphic id="i1" xlink:href="ncomms7904-i1.jpg"></inline-graphic>
Enhancers regulate the transcription of genes over long genomic distances. Here, the authors show that enhancer function is correlated with maintenance of linkage between non-coding elements and neighbouring genes in the human X chromosome and that enhancers in zebrafish drive expression in a pattern consistent with the expression of a linked gene.</p>
</abstract>
</article-meta>
</front>
<body>
<p>C
<italic>is</italic>
-regulation is a vital mechanism for the normal development and health of an organism. The
<italic>cis</italic>
-regulation of protein-coding gene expression in vertebrate genomes is mediated by regulatory factors binding to enhancer elements that may be located as much as 1.5 Mb from their target genes
<xref ref-type="bibr" rid="b1">1</xref>
<xref ref-type="bibr" rid="b2">2</xref>
, and longer distances are entirely possible. Given the importance of this
<italic>cis</italic>
-interaction, negative selection is thought to prevent the evolutionary fixation of rearrangements that would either physically dissociate the enhancer from the target gene or separate them by an excessive genomic distance. Genomic regions bearing these properties have been described as genome regulatory blocks
<xref ref-type="bibr" rid="b3">3</xref>
<xref ref-type="bibr" rid="b4">4</xref>
, but systematic efforts to exploit this evolutionary signature on a genomic scale
<xref ref-type="bibr" rid="b5">5</xref>
have yet to be experimentally validated. Here we perform such an analysis on the human X chromosome, by developing a score that measures the evolutionary linkage between putative enhancers and their surrounding genes. We show that conserved noncoding elements (CNEs) showing the highest linkage scores are also enriched in functional marks such as epigenetic modifications characteristic of enhancers. We experimentally test >1,000 CNEs for their ability to replicate the expression pattern of their most strongly linked genes, and validate the predicted association for 60% of the cases where the expression pattern of the target gene was known. We finally show that putative enhancers linked to the same target gene are enriched in sequence motifs that may trigger the binding of specific transcription factors.</p>
<sec disp-level="1" sec-type="results">
<title>Results</title>
<sec disp-level="2">
<title>Prediction of CNE/target gene associations</title>
<p>We identified human X-chromosome CNEs by scanning a multispecies genomic alignment encompassing 46 vertebrate genomes
<xref ref-type="bibr" rid="b6">6</xref>
(
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 1A</xref>
), and looked for conserved regions, excluding exons and repeat sequences (Methods). This set was then merged with CNEs previously identified in eutherian mammals
<xref ref-type="bibr" rid="b7">7</xref>
. Together, these regions represent 174,473 distinct CNEs covering 4.4% of the human X chromosome, and likely account for most noncoding sequences under conservation. To test the hypothesis that functional interactions translate into physical linkage, we first devised a scoring procedure based on evolutionary conservation of linkage between a CNE and one of the human genes located within a radius of 1 Mb from the CNE. For a given CNE, the positions of the orthologous CNEs were first sought in all the vertebrate genomes that align at this position. Next, the orthologs of the human genes found in the 1-Mb radius were also collected in all vertebrate genomes. Four situations may arise depending on whether and where the orthologous gene is present: (i) it too is linked to the orthologous CNE in the defined radius, (ii) it is located on the same chromosome but beyond the defined radius, (iii) it is located on a different chromosome and (iv) it is not annotated in the genome. In each genome, each situation was diagnosed and labelled with a score that accounts for the conservation of synteny between the human genome and the genome of interest, and the sequencing coverage of the latter (
<xref ref-type="fig" rid="f1">Fig. 1a</xref>
and Methods). The maximum genomic interval allowed for linking the orthologous CNE and gene(s) was conservatively taken as 1 Mb but scaled in each genome depending on its relative size compared with the human genome. Together, this linkage and synteny information was used to compute an absolute score
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
between each CNE and each human gene within the 1 Mb radius (0<
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
<1), reflecting the degree of linkage between them in vertebrate genomes (
<xref ref-type="fig" rid="f1">Fig. 1a</xref>
and Methods). For each CNE, the best scoring genes were selected as plausible targets, with no minimal score threshold (
<xref ref-type="supplementary-material" rid="S1">Supplementary Data 1</xref>
), and CNEs targeting the same genes were merged if their positions were <100 bp apart (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 1B</xref>
). These merged CNEs are hereafter called RegHsa elements. We identified 102,647 RegHsa elements on the X chromosome with a mean size of 88 bp. Only 1% of RegHsas are not associated with a potential target gene (that is, their distance to the nearest human gene exceeds 1 Mb), 37.5% are associated with a single predicted target (single targets), and 61.5% are associated with several target genes with identical maximal score (multiple targets, not necessarily contiguous). Such multiple targets may arise for several reasons: evolutionarily neutral breakpoints may not yet have dissociated the locus, ‘bystander' genes may be captured in a genome regulatory block between an enhancer and its target gene
<xref ref-type="bibr" rid="b4">4</xref>
, or an enhancer may regulate several neighbouring genes. Of the 812 protein-coding genes annotated on the X chromosome, 389 were associated with at least one RegHsa element, while some genes, including
<italic>DIAP2</italic>
,
<italic>DMD</italic>
or
<italic>ODZ1</italic>
, are associated with >100 RegHsa elements. Of the RegHsa elements predicted to target a single gene, 60.7% target a gene that is not their direct neighbour. Interestingly, we observe a remarkably stable median linkage score in a 600-kb radius from the RegHsa element, with a sharp drop in linkage score values beyond this distance (
<xref ref-type="fig" rid="f1">Fig. 1b</xref>
). Although enhancers are known to function beyond 600 kb, this result may indicate that factors such as the three-dimensional chromatin conformation or breakpoint frequencies may generally be unfavourable to long-range regulatory interactions beyond this distance.</p>
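<p>The target selection step described above (keep the best-scoring gene or genes, with no minimum score threshold) can be illustrated with a minimal Python sketch. The function names and the example scores are hypothetical and are not taken from the authors' code; the sketch only shows how ties at the maximal score produce the single-target and multiple-target classes.</p>
<preformat>
```python
from typing import Dict, List


def best_targets(scores: Dict[str, float]) -> List[str]:
    """Return every gene that shares the maximal linkage score.

    No minimum score threshold is applied, mirroring the text.
    An empty input means no gene lies within the search radius.
    """
    if not scores:
        return []
    top = max(scores.values())
    return sorted(gene for gene, score in scores.items() if score == top)


def target_class(scores: Dict[str, float]) -> str:
    """Label an element by the number of genes tied at the best score."""
    n_best = len(best_targets(scores))
    if n_best == 0:
        return "no target"
    return "single target" if n_best == 1 else "multiple targets"


# Hypothetical scores, for illustration only.
example_scores = {"GENE_A": 0.91, "GENE_B": 0.91, "GENE_C": 0.40}
example_class = target_class(example_scores)
```
</preformat>
<p>With these illustrative values, two genes tie at the maximal score, so the element falls in the multiple-target class.</p>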
</sec>
<sec disp-level="2">
<title>The linkage score is correlated with functional marks</title>
<p>If our method correctly reflects a functional association between enhancers and their target genes, we expect the linkage score
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
to correlate with functional annotations known to be associated with enhancers. To examine this, we annotated all CNEs that constitute RegHsa elements with functional signals known to be associated with enhancer function including chromatin accessibility by DNAseI assays, H3K4me1, H3K4me3, H3K27ac histone modifications and transcription factor-binding assays obtained from seven human cell lines
<xref ref-type="bibr" rid="b8">8</xref>
, as well as p300 signals from the mouse embryonic heart, forebrain, midbrain and limb
<xref ref-type="bibr" rid="b9">9</xref>
<xref ref-type="bibr" rid="b10">10</xref>
. Because the human X chromosome is known to harbour a high proportion of genes involved in cognitive functions and expressed in neural tissues
<xref ref-type="bibr" rid="b11">11</xref>
, we also performed H3K4me1, H3K27ac and p300 ChIP-on-chip experiments on human foetal brain and mouse E14.5, E16.5 and P0 developing brain tissues (Methods). When ranking CNEs and target gene associations by increasing the
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
score, we observe a pronounced enrichment in all functional annotations (
<xref ref-type="fig" rid="f1">Fig. 1c</xref>
and Methods), with a fivefold increase in DNAseI accessibility (average over seven human cell lines) and a striking 10.8-fold increase in H3K4me1 marks in the developing human brain. Notably, the enrichment is not solely a consequence of the positive correlation between linkage score and conservation (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 2</xref>
) because the result remains even when controlling for conservation (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 3</xref>
). High scoring RegHsa elements (
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
>0.9) are linked to genes showing a marked enrichment in gene ontology (GO) terms, notably those associated with neuronal cell body, axon guidance and synapse (
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 1</xref>
). Finally, the linkage score
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
strongly correlates with an enrichment of known transcription factor-binding motifs (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 4</xref>
). Together, these results indicate that
<italic>cis</italic>
-interactions predicted only using evolutionary information are enriched in functional enhancers. Notably, this result is not limited to the X chromosome, because when we compute the
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
score on autosomes, they also show the same enrichment in functional annotations as a function of linkage score (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 5</xref>
).</p>
</sec>
<sec disp-level="2">
<title>Functional validation of predicted interactions</title>
<p>Next we directly tested the enhancer function of the interaction predicted by our comparative and functional genomic analyses by using transgenic assays. We selected 450 regions of ∼1 kb on the human X chromosome and overlapping 1,013 human RegHsa elements. These elements encompass a range of conservation levels and a large range of
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
scores (0.320–0.980) linking them to genes known to be involved in brain development (
<xref ref-type="supplementary-material" rid="S1">Supplementary Data 2</xref>
). We examined their ability to drive specific green fluorescent protein (GFP) expression patterns in zebrafish embryos, by analysing at least five different insertions in F1 lines at 2 days post fertilization for each element. RegHsa elements with a reproducible or partially reproducible pattern of expression (448 cases) allowed us to test if the predicted target gene or genes of the enhancer are compatible with this pattern. For 323 RegHsas, expression data were available for the zebrafish (described in the ZFIN database
<xref ref-type="bibr" rid="b12">12</xref>
) for at least one predicted target. Of these, 200 RegHsa elements (60%) drive a transgenic GFP pattern that fully or partially overlaps the ZFIN pattern of one of the predicted targets (
<xref ref-type="fig" rid="f2">Fig. 2</xref>
,
<xref ref-type="supplementary-material" rid="S1">Supplementary Figs 6 and 7</xref>
,
<xref ref-type="supplementary-material" rid="S1">Supplementary Data 2</xref>
and Methods). These cases support the prediction that the enhancer indeed regulates the target gene showing the best
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
score. Consistent with this result, the average
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
score is significantly higher for the 200 supported enhancer–gene associations than for those that are not (
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
score 0.923 versus 0.863;
<italic>P</italic>
<2 × 10
<sup>−16</sup>
, Wilcoxon test). Interestingly, while 25% of tested RegHsa elements are conserved in zebrafish genomic DNA, this figure increases to 44% for elements with a predicted target that is supported in the transgenic experiments. This shows that conservation of a RegHsa element is correlated with its function as an enhancer, but also that absence of conservation in fish does not preclude validation, since 56% of enhancers are validated without conservation in fish. To further confirm the identity of the target gene in a limited number of cases, we verified whether the enhancer drives GFP expression in the same brain region or cell type where the mRNA of its predicted target gene is expressed. To this end, we performed a detailed anatomical characterization of the GFP expression pattern in juvenile and/or adult zebrafish brains, for transgenic lines corresponding to 15 different human sequence elements overlapping 67 RegHsas (Methods). Of the 15 transgenic assays analysed, 13 (87%) show that the gene that is evolutionarily linked to the RegHsa element is expressed in a pattern that completely (6 cases) or partially (7 cases) overlaps with the transgenic GFP pattern in either juvenile or adult zebrafish brain (
<xref ref-type="supplementary-material" rid="S1">Supplementary Data 3</xref>
). For example, the RegHsa0032185 element is predicted to regulate the
<italic>BCOR</italic>
gene (
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
=0.917) yet is located 286 kb downstream of the nearest
<italic>BCOR</italic>
promoter (
<xref ref-type="fig" rid="f2">Fig. 2a</xref>
). The element reproducibly drives GFP expression in the developing zebrafish telencephalon and hindbrain. Neuroanatomical characterization of GFP expression in transgenic zebrafish lines carrying the RegHsa230032185 element, compared with endogenous zebrafish
<italic>bcor</italic>
mRNA expression in both juvenile and adult brains shows a strong overlap in the anterior telencephalon (
<xref ref-type="fig" rid="f3">Fig. 3</xref>
). Critically, the GFP expression pattern strongly overlaps the endogenous zebrafish
<italic>bcor</italic>
mRNA expression (
<xref ref-type="fig" rid="f3">Fig. 3c</xref>
). In addition, target gene predictions are consistent with published chromatin interaction maps. Indeed, of the 2,096 RegHsa elements that overlap the regions involved in 781 long-range chromatin interactions experimentally observed on the X chromosome by ChIA-PET in five human cell lines
<xref ref-type="bibr" rid="b13">13</xref>
, 69% are evolutionarily associated (that is, show the best
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
score) with the same gene as shown to be involved in the chromatin interaction (
<italic>P</italic>
value <10
<sup>−5</sup>
, permutation test). Notably, this overlap is the same if we only consider cases where the predicted target is the nearest gene to the RegHsa element or if we consider cases where one or more genes separate the two. Together, these results support the original target gene prediction, which was obtained solely using genome comparisons. Interestingly, while our data agree with the ‘nearest gene' strategy 60% of the time (as does the ChIA-PET data, 62%), a greater rate of validation is observed when comparing our data with the ChIA-PET data (69%), which necessarily includes non-nearest genes.</p>
</sec>
<sec disp-level="2">
<title>Motif discovery in CNEs assigned to the same target gene</title>
<p>The 389 single target genes are associated with a mean of 17 RegHsa elements each with
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
>0.9. We postulated that if different RegHsa elements are predicted to regulate the same target gene, they might share common sequence motifs recognized by the same transcription factor (TF). Consistent with this, we found significantly enriched motifs in elements targeting 124 genes (Methods), with up to 15 motifs per set of RegHsa targeting the same gene. Remarkably, different genes appear to be regulated by RegHsa elements that share the same motifs, despite the analysis being restricted to one human chromosome. The most striking case is a motif resembling the recognition sequence for the
<italic>NEUROD2</italic>
TF, present from 5 to 30 times in RegHsa elements targeting nine genes (
<xref ref-type="fig" rid="f4">Fig. 4a</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Data 4</xref>
).
<italic>NEUROD2</italic>
is expressed in the developing brain and is important for lineage progression through chromatin remodelling
<xref ref-type="bibr" rid="b14">14</xref>
<xref ref-type="bibr" rid="b15">15</xref>
. Notably, several of the nine genes that are suggested here to be regulated by
<italic>NEUROD2</italic>
through common binding motifs are known to participate in different aspects of brain development and activity. In addition, 19 pairs of X-chromosome genes are linked to different sets of RegHsa elements that share three or more overrepresented motifs. For example, RegHsa elements linked to
<italic>AFF2</italic>
and
<italic>IL1RAPL1</italic>
share five motifs (
<xref ref-type="fig" rid="f4">Fig. 4b</xref>
), including a motif similar to that of the
<italic>KLF12</italic>
transcription factor, which is differentially expressed in a cellular model of neural progenitors
<xref ref-type="bibr" rid="b16">16</xref>
. Similarly, RegHsa elements linked to
<italic>BCOR</italic>
and
<italic>MAGEB10</italic>
share four overrepresented motifs (
<xref ref-type="fig" rid="f4">Fig. 4c</xref>
) suggesting that each pair is co-regulated.</p>
</sec>
</sec>
<sec disp-level="1" sec-type="discussion">
<title>Discussion</title>
<p>In summary, we describe a method to identify the evolutionary linkage between human CNEs (here, RegHsa elements) and neighbouring protein-coding target genes. We show that this linkage is indicative of a regulatory action of the element on the expression of the linked protein-coding gene. Some of these interactions were confirmed experimentally, but detailed characterization of the different CNEs is still required. Experimental methods are already able to indicate the interactions between enhancers and genes
<xref ref-type="bibr" rid="b8">8</xref>
<xref ref-type="bibr" rid="b13">13</xref>
<xref ref-type="bibr" rid="b17">17</xref>
<xref ref-type="bibr" rid="b18">18</xref>
<xref ref-type="bibr" rid="b19">19</xref>
but they are strongly constrained by the tissue and time in which the interaction takes place. In contrast, evolutionary linkage is independent of the tissue or time of expression of the gene, and is applicable to any sequenced vertebrate genome, as was done here for human.</p>
<p>Regulatory mutations are known to cause diseases but few have been identified so far
<xref ref-type="bibr" rid="b20">20</xref>
<xref ref-type="bibr" rid="b21">21</xref>
<xref ref-type="bibr" rid="b22">22</xref>
<xref ref-type="bibr" rid="b23">23</xref>
, largely because the functional link between enhancers and their target gene is difficult to ascertain
<xref ref-type="bibr" rid="b24">24</xref>
. Here we provide a direct and simple approach to predict such interactions. For example, of the 45,449 RegHsa elements associated with one or more genes with a strong score (
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
> 0.8), 8,217 elements target a gene in which coding mutations have already been shown to cause intellectual disabilities. This strategy thus provides new material to accelerate the discovery of disease-causing mutations.</p>
</sec>
<sec disp-level="1" sec-type="methods">
<title>Methods</title>
<sec disp-level="2">
<title>Identification of CNEs</title>
<p>CNEs are defined based on their conservation in a range of vertebrate species, using an in-house algorithm called ‘ScanMaf' implemented as a Python script (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 1a</xref>
). ScanMaf scans the UCSC 46-species multiZ alignment and looks for conserved regions of a minimal length and identity, excluding exons annotated in Ensembl as well as repeats annotated by RepeatMasker and Tandem Repeat Finder. The algorithm does not require a fixed set of species to be present in the alignment; it only requires at least seven species in addition to human, regardless of their phylogenetic group. The same procedure can thus retrieve elements restricted to mammals as well as elements conserved between mammals and fish. Substitutions are allowed in each column of the alignment up to a threshold of 12% (in the minimal situation where only seven species are aligned to human, this threshold allows for one substitution); columns that stay within this threshold are considered conserved. The algorithm first identifies core windows of 10 bp containing at least 90% of such conserved columns. It then extends this nucleus in both directions, allowing up to three consecutive non-conserved columns. If the resulting human regions are conserved in the same subset of species, consecutive in each of their genomes, and separated by <100 bp in human, they are fused into a single element to ease further analysis. These predictions were then merged with the regions obtained by the SiPhy algorithm
<xref ref-type="bibr" rid="b7">7</xref>
. The resulting 174,473 distinct CNEs on the human X chromosome were used for further analysis. Each CNE was annotated with a score characterizing its evolutionary conservation between the human sequence and the other vertebrate sequences that align to it. For this purpose, vertebrate genome sequences from the UCSC 46-species multiple alignment were classified into five groups according to their phylogenetic position: Boreoeutheria, Atlantogenata, Monotremes and Marsupials, Sauropsids and Amphibians, and Teleostean fish. The maximum % ID between the human sequence and the sequences of each group, when present, is identified, and these maxima are summed to give the conservation score. For example, consider a CNE conserved from human to fish, whose maximum % ID in each group is: Boreoeutheria 97% (with chimpanzee), Atlantogenata 68% (with elephant), Monotremes and Marsupials 62% (with opossum), Sauropsids and Amphibians 54% (with chicken) and Teleostean fish 49% (with medaka). The conservation score for this CNE is thus 97+68+62+54+49=330.</p>
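<p>The two numerical steps described in this section, seeding 10-bp core windows and summing the per-group maximum identities, can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the ScanMaf source: the function names are hypothetical, and the input to the window step is assumed to be a precomputed list of per-column conserved flags.</p>
<preformat>
```python
from typing import Dict, List, Optional

# The five phylogenetic groups named in the text.
GROUPS = [
    "Boreoeutheria",
    "Atlantogenata",
    "Monotremes and Marsupials",
    "Sauropsids and Amphibians",
    "Teleostean fish",
]


def core_window_starts(conserved: List[bool],
                       width: int = 10,
                       min_fraction: float = 0.9) -> List[int]:
    """Return start indices of windows with enough conserved columns.

    `conserved` holds one flag per alignment column (assumed input).
    A window qualifies when at least `min_fraction` of its columns
    are conserved (10 bp and 90% in the text).
    """
    starts = []
    for start in range(len(conserved) - width + 1):
        window = conserved[start:start + width]
        if sum(window) >= min_fraction * width:
            starts.append(start)
    return starts


def conservation_score(max_identity: Dict[str, Optional[float]]) -> float:
    """Sum the maximum % identity over the groups where the CNE is present.

    Groups that are missing or set to None are skipped.
    """
    return sum(max_identity[g] for g in GROUPS
               if max_identity.get(g) is not None)


# Worked example from the text: 97 + 68 + 62 + 54 + 49 = 330.
example_identities = {
    "Boreoeutheria": 97,
    "Atlantogenata": 68,
    "Monotremes and Marsupials": 62,
    "Sauropsids and Amphibians": 54,
    "Teleostean fish": 49,
}
example_score = conservation_score(example_identities)
```
</preformat>
<p>The extension of each core window by up to three consecutive non-conserved columns, and the fusion of nearby regions, are deliberately left out of this sketch.</p>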
</sec>
<sec disp-level="2">
<title>Scoring CNE-target genes evolutionary linkage</title>
<p>Families of orthologous genes were retrieved from the Ensembl database
<xref ref-type="bibr" rid="b25">25</xref>
(version 66). Starting from the human genome as a reference (version hg19), the first step of the target prediction consists of collecting the neighbouring genes (located <1 Mb away) of each CNE in the human genome. A scoring procedure is then applied to these genes to identify the most probable CNE target. For any given CNE
<italic>i</italic>
present in
<italic>N</italic>
species, the absolute linkage score
<italic>S</italic>
<sub>
<italic>Ai</italic>
</sub>
is computed as follows:</p>
<p>
<disp-formula id="eq1">
<inline-graphic id="d33e822" xlink:href="ncomms7904-m1.jpg"></inline-graphic>
</disp-formula>
</p>
<p>where
<italic>C</italic>
<sub>
<italic>e</italic>
</sub>
is a corrective factor to minimize the influence of genome assemblies obtained at low sequence coverage (
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 2</xref>
),
<italic>R</italic>
<sub>
<italic>e</italic>
</sub>
the rearrangement rate of the genome of species
<italic>e</italic>
by comparison with the human genome (see below and
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 2</xref>
) and
<italic>S</italic>
<sub>
<italic>i</italic>
,
<italic>e</italic>
,0</sub>
,
<italic>S</italic>
<sub>
<italic>i</italic>
,
<italic>e</italic>
,1</sub>
,
<italic>S</italic>
<sub>
<italic>i</italic>
,
<italic>e</italic>
,2</sub>
,
<italic>S</italic>
<sub>
<italic>i</italic>
,
<italic>e</italic>
,3</sub>
the respective status of the orthologous gene considered in species
<italic>e</italic>
:
<italic>S</italic>
<sub>
<italic>i</italic>
,
<italic>e</italic>
,0</sub>
if absent (or mis-annotated),
<italic>S</italic>
<sub>
<italic>i</italic>
,
<italic>e</italic>
,1</sub>
if present and within distance
<italic>d</italic>
from the CNE,
<italic>S</italic>
<sub>
<italic>i</italic>
,
<italic>e</italic>
,2</sub>
if present and beyond distance
<italic>d</italic>
from the CNE,
<italic>S</italic>
<sub>
<italic>i</italic>
,
<italic>e</italic>
,3</sub>
if present but on another chromosome or scaffold. These
<italic>S</italic>
<sub>
<italic>i</italic>
,
<italic>e</italic>
</sub>
parameters take the value of 1 if the condition is fulfilled, 0 otherwise. Genome coverage, rearrangement rates and distance thresholds are listed in
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 2</xref>
. Distance
<italic>d</italic>
is taken as 1 Mb adjusted for the size of the genome of species
<italic>e</italic>
relative to that of the human genome (for example, if the genome of
<italic>e</italic>
is 80% of the size of the human genome, then
<italic>d</italic>
=0.8 Mb). The level of synteny
<italic>R</italic>
<sub>
<italic>e</italic>
</sub>
is computed as follows:</p>
<p>
<disp-formula id="eq2">
<inline-graphic id="d33e972" xlink:href="ncomms7904-m2.jpg"></inline-graphic>
</disp-formula>
</p>
<p>where
<italic>H</italic>
is the total number of gene pairs in the human genome, and
<italic>P</italic>
<sub>
<italic>e</italic>
</sub>
the number of these gene pairs that are also direct neighbours (in conserved synteny) in species
<italic>e</italic>
.
<italic>R</italic>
<sub>
<italic>e</italic>
</sub>
thus varies between 0 (a genome with no gene pairs in conserved synteny) and 1 (the human genome against itself). Of note, the baboon (papHam1) and the lamprey (petMar1) genome sequences, despite being present in the 46-species multiple alignment, were not used for the target search because of the high degree of fragmentation of their assemblies. After being calculated for every gene family neighbouring each CNE, the linkage scores are normalized to the [0,1] interval using a sigmoid transformation as follows:</p>
<p>
<disp-formula id="eq3">
<inline-graphic id="d33e995" xlink:href="ncomms7904-m3.jpg"></inline-graphic>
</disp-formula>
</p>
<p>
<disp-formula id="eq4">
<inline-graphic id="d33e998" xlink:href="ncomms7904-m4.jpg"></inline-graphic>
</disp-formula>
</p>
<p>
<disp-formula id="eq5">
<inline-graphic id="d33e1001" xlink:href="ncomms7904-m5.jpg"></inline-graphic>
</disp-formula>
</p>
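<p>As an illustration of the orthologue status indicators and of the genome-size-adjusted distance threshold defined above, the following minimal Python sketch assigns one of the four mutually exclusive statuses to a candidate gene in one species. The function and argument names (adjusted_distance, orthologue_status, genome_size_ratio) are hypothetical and are not taken from the Regulus scripts; representing a gene by a single coordinate and treating the threshold as inclusive are simplifying assumptions. The way the indicators are combined into the absolute linkage score is not reproduced here.</p>

```python
MB = 1_000_000  # base pairs in one megabase


def adjusted_distance(genome_size_ratio, base_distance=MB):
    """Distance threshold d for a species, scaled by its genome size relative to human."""
    return base_distance * genome_size_ratio


def orthologue_status(cne_chrom, cne_pos, gene, genome_size_ratio):
    """Return the indicators (S0, S1, S2, S3) for one gene in one species.

    gene is None when the orthologue is absent (or mis-annotated); otherwise
    it is a (chromosome, position) tuple. Exactly one indicator is set to 1.
    """
    if gene is None:
        return (1, 0, 0, 0)  # absent or mis-annotated
    gene_chrom, gene_pos = gene
    if gene_chrom != cne_chrom:
        return (0, 0, 0, 1)  # present, but on another chromosome or scaffold
    d = adjusted_distance(genome_size_ratio)
    if d >= abs(gene_pos - cne_pos):
        return (0, 1, 0, 0)  # present and within distance d (inclusive, by assumption)
    return (0, 0, 1, 0)  # present but beyond distance d
```

<p>For a species whose genome is 80% of the size of the human genome, adjusted_distance(0.8) corresponds to the 0.8 Mb threshold given in the text.</p>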
<p>After sorting linked genes by descending linkage score, a relative score can be computed for each, corresponding to the difference between the top-ranking linkage score and the second-best linkage score. The greater the relative score, the more the top-ranking gene stands out from the other candidates. However, if a CNE has only one putative target in its neighbourhood, the corresponding gene family is not assigned a relative score. The relative score is useful to identify cases where, among all possible targets within 1 Mb of a given CNE, one gene stands out: this gene will have a high relative score, because there will be a large difference between its linkage score and that of the next-best target. CNEs targeting the same genes and located less than 100 bp apart were fused, resulting in 102,647 RegHsa elements. The complete set of RegHsa elements, together with their scores and target genes, is available in
<xref ref-type="supplementary-material" rid="S1">Supplementary Data 1</xref>
. RegHsa elements linked to their target gene with a score >0.9 can be browsed on an interactive graphical server at
<ext-link ext-link-type="uri" xlink:href="http://www.genomicus.biologie.ens.fr/genomicus">http://www.genomicus.biologie.ens.fr/genomicus</ext-link>
.</p>
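<p>The relative score described in this section can be sketched as follows. This is a minimal illustration, assuming that the normalized linkage scores of the candidate genes of one CNE are held in a dictionary; the function name relative_score and the gene labels in the usage note are hypothetical.</p>

```python
def relative_score(linkage_scores):
    """Difference between the best and second-best normalized linkage scores.

    linkage_scores maps gene (family) identifiers to scores in [0, 1].
    Returns None when the CNE has a single putative target, in which case
    no relative score is attributed.
    """
    ranked = sorted(linkage_scores.values(), reverse=True)
    if 2 > len(ranked):
        return None
    return ranked[0] - ranked[1]
```

<p>For a CNE with three hypothetical candidates scored 0.9, 0.4 and 0.1, the relative score is the gap between the first two values; a CNE with a single candidate returns None.</p>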
</sec>
<sec disp-level="2">
<title>Enrichment in enhancer functional data</title>
<p>Functional information was collected from the Ensembl project for DNase hypersensitive sites (DHS)
<xref ref-type="bibr" rid="b26">26</xref>
, chromatin immunoprecipitation sequencing (ChIP-seq) for TFs
<xref ref-type="bibr" rid="b26">26</xref>
, and H3K4me1, H3K4me3 and H3K27ac histone modifications
<xref ref-type="bibr" rid="b26">26</xref>
for seven different cell lines (Gm12878, H1-hESC, HSMM, HUVEC, K562, NHEK, NHLF, HMEC and NH-A). We also collected published p300 functional annotations for mouse developing heart
<xref ref-type="bibr" rid="b9">9</xref>
and mouse developing forebrain, midbrain and limb
<xref ref-type="bibr" rid="b10">10</xref>
(see below for links to public data sources). Finally, we generated p300, H3K4me1 and H3K27ac annotations using ChIP-on-chip on the human X chromosome with chromatin isolated from human fetal brain and from E14.5 and P0 developing mouse brain (Methods). To compute the intersection between the functional data listed above and CNE intervals, the positions of the functional annotations and of the CNEs were compared. When the intervals overlapped by at least 1 bp, the CNE was assigned a ‘functional score' corresponding to the value of the overlapping signal weighted by the fraction of the CNE covered by the signal. For instance, if a 100-bp CNE overlaps a DHS peak of value 12 over 40 bp, the DHS value assigned to the CNE is 12 × (40/100)=4.8. For CNEs overlapping several distinct peaks, the signal values are summed. In
<xref ref-type="fig" rid="f1">Fig. 1b</xref>
and
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 5</xref>
, the proportion of RegHsa elements that overlap a functional annotation (with value >0) through at least one of their constitutive CNEs was computed for each of the annotations, for classes of RegHsa elements of increasing linkage score. To associate GO
<xref ref-type="bibr" rid="b27">27</xref>
terms with X-chromosome genes predicted to be functionally linked to RegHsa elements, we used the PathwayStudio platform (Elsevier B.V., Amsterdam). GO annotations from lists of genes linked to CNEs above given linkage score thresholds were compared with the lists drawn from the complete list of genes of the X chromosome (
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 1</xref>
). Statistical significance was estimated by Fisher's test, without correction for multiple testing.</p>
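<p>The functional score defined in this section (peak value weighted by the fraction of the CNE covered, summed over distinct peaks) can be sketched as follows. This is an illustrative sketch only: the function name functional_score and the half-open coordinate convention are assumptions, not a description of the scripts used in this study.</p>

```python
def functional_score(cne_start, cne_end, peaks):
    """Sum of peak values, each weighted by the fraction of the CNE it covers.

    Intervals are half-open [start, end) in base pairs (an assumed convention);
    peaks is an iterable of (start, end, value) tuples.
    """
    cne_length = cne_end - cne_start
    score = 0.0
    for peak_start, peak_end, value in peaks:
        overlap = min(cne_end, peak_end) - max(cne_start, peak_start)
        if overlap >= 1:  # at least 1 bp in common
            score += value * overlap / cne_length
    return score
```

<p>With a 100-bp CNE at positions 1000 to 1100 and a peak of value 12 covering its last 40 bp, the function reproduces the worked example of the text, 12 × (40/100)=4.8.</p>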
</sec>
<sec disp-level="2">
<title>Sources of public data for enhancer enrichment tests</title>
<p>CNEs were annotated with a range of functional annotations, both published and obtained in the course of this project:</p>
<p>ENCODE (Feb. 2012) DHS
<xref ref-type="bibr" rid="b26">26</xref>
.</p>
<p>(
<ext-link ext-link-type="uri" xlink:href="http://genome-euro.ucsc.edu/cgi-bin/hgTrackUi?hgsid=195751083&c=chr21&g=wgEncodeAwgDnaseUniform">http://genome-euro.ucsc.edu/cgi-bin/hgTrackUi?hgsid=195751083&c=chr21&g=wgEncodeAwgDnaseUniform</ext-link>
)</p>
<p>ENCODE (Feb. 2012) ChIP-seq for Transcription Factors
<xref ref-type="bibr" rid="b26">26</xref>
.</p>
<p>(
<ext-link ext-link-type="uri" xlink:href="http://genome-euro.ucsc.edu/cgi-bin/hgTrackUi?hgsid=195751457&c=chr21&g=wgEncodeAwgTfbsUniform">http://genome-euro.ucsc.edu/cgi-bin/hgTrackUi?hgsid=195751457&c=chr21&g=wgEncodeAwgTfbsUniform</ext-link>
)</p>
<p>ENCODE (Feb. 2012) H3K4me1, H3K4me3 and H3K27ac histone modifications
<xref ref-type="bibr" rid="b26">26</xref>
.</p>
<p>(
<ext-link ext-link-type="uri" xlink:href="http://genome-euro.ucsc.edu/cgi-bin/hgTrackUi?hgsid=195751457&c=chr21&g=wgEncodeBroadHistone">http://genome-euro.ucsc.edu/cgi-bin/hgTrackUi?hgsid=195751457&c=chr21&g=wgEncodeBroadHistone</ext-link>
).</p>
<p>In the three ENCODE data sets above, peaks correspond to local maxima of the different signals. We used data obtained in seven different cell lines (Gm12878, H1-hESC, HSMM, HUVEC, K562, NHEK, NHLF, HMEC and NH-A), computing the mean of each functional signal in 25-bp windows along the X chromosome before intersecting these annotations with the CNE intervals.</p>
<p>Blow
<italic>et al</italic>
.
<xref ref-type="bibr" rid="b9">9</xref>
: p300 ChIP-seq data from mouse developing heart.</p>
<p>(
<ext-link ext-link-type="uri" xlink:href="http://www.nature.com/ng/journal/v42/n9/extref/ng.650-S2.xls">
<italic>http://www.nature.com/ng/journal/v42/n9/extref/ng.650-S2.xls</italic>
</ext-link>
)</p>
<p>Visel
<italic>et al</italic>
.
<xref ref-type="bibr" rid="b10">10</xref>
: p300 ChIP-seq data from mouse developing forebrain, midbrain and limb.</p>
<p>(
<ext-link ext-link-type="uri" xlink:href="http://www.nature.com/nature/journal/v457/n7231/extref/nature07730-s2.xls">
<italic>http://www.nature.com/nature/journal/v457/n7231/extref/nature07730-s2.xls</italic>
</ext-link>
)</p>
</sec>
<sec disp-level="2">
<title>Overlap between this study and interactions shown by ChIA-PET</title>
<p>The genomic positions of RegHsa elements were compared with the regions shown by Li
<italic>et al</italic>
.
<xref ref-type="bibr" rid="b13">13</xref>
to interact with gene promoters in ChIA-PET experiments. The best-scoring genes of each overlapping RegHsa element were compared with the genes that interact with the corresponding region by ChIA-PET. The two ‘experiments' (linkage score in this study and ChIA-PET by Li
<italic>et al</italic>
.
<xref ref-type="bibr" rid="b13">13</xref>
) were considered consistent if one of the linked genes (with maximal score) was the same as one of the genes shown to interact by ChIA-PET. Of the 102,647 RegHsa elements identified on the human X chromosome, 2,096 elements overlap regions shown in the ChIA-PET experiment to interact with a promoter. We compared the genes evolutionarily linked (with a maximum score) to these elements with the gene(s) shown to interact, via their promoter, with the overlapping regions by ChIA-PET. For 1,454 elements (69%), the linked genes and the interacting gene are consistent. To compute a
<italic>P</italic>
value expressing the probability of obtaining the same result by chance, we performed 10,000 resamplings of the genes linked to the 2,096 RegHsa elements that overlap ChIA-PET enhancers. In each resampling, each RegHsa element was associated with as many genes as it has best-scoring linked genes, but randomly selected among all genes present in a 2-Mb window centred on the RegHsa element. If the ChIA-PET gene target was found among these randomly associated genes, we considered the two experiments to be consistent by chance. No resampling trial reached the number of coincidences between the ChIA-PET and ‘linkage score' experiments obtained with the real data. We thus estimate that the
<italic>P</italic>
value of the test is <10
<sup>−5</sup>
.</p>
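<p>The resampling procedure described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the three input dictionaries, the function names count_consistent and resampling_test, and the fixed random seed are hypothetical, and the construction of the 2-Mb windows is taken as given.</p>

```python
import random


def count_consistent(element_targets, chia_pet_targets):
    """Number of elements sharing at least one gene with their ChIA-PET targets."""
    return sum(
        1
        for element, genes in element_targets.items()
        if set(genes).intersection(chia_pet_targets[element])
    )


def resampling_test(linked, chia_pet_targets, neighbours, n_resamplings, seed=0):
    """Count the resamplings that reach the observed number of consistent elements.

    linked: element -> best-scoring linked genes (observed data)
    chia_pet_targets: element -> genes interacting by ChIA-PET
    neighbours: element -> all genes in the window centred on the element
    """
    rng = random.Random(seed)
    observed = count_consistent(linked, chia_pet_targets)
    reached = 0
    for _ in range(n_resamplings):
        # For each element, draw as many random neighbouring genes
        # as it has best-scoring linked genes in the real data.
        randomized = {
            element: rng.sample(sorted(neighbours[element]), len(genes))
            for element, genes in linked.items()
        }
        if count_consistent(randomized, chia_pet_targets) >= observed:
            reached += 1
    return observed, reached
```

<p>The function returns the observed number of consistent elements and the number of resamplings that reached it; when no resampling reaches the observed value, only an upper bound on the P value can be stated.</p>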
</sec>
<sec disp-level="2">
<title>ChIP-on-chip from human and mouse developing brain</title>
<p>This assay was performed for p300, H3K4me1 and H3K27ac as described
<xref ref-type="bibr" rid="b28">28</xref>
, with several modifications. Embryonic brain was isolated from human (three samples at 50 days of gestation) and mouse (E14.5 and P0) embryos. Human fetal brain tissues were collected with informed written consent and ethical approval by Southampton and South West Hants LREC. Pools of whole brain were treated with 1.5% formaldehyde for 10 min at room temperature. Crosslinking was stopped by the addition of glycine to a final concentration of 0.125 M. The brain tissue was chopped into small pieces (∼1 mm
<sup>3</sup>
) with a razor blade in cold 1 × PBS and a single-cell suspension was made using a Dounce homogenizer. The cells were swollen on ice for 10 min in 25 mM HEPES, pH 7.8, 1.5 mM MgCl
<sub>2</sub>
, 10 mM KCl, 0.1% NP-40, 1 mM DTT (dithiothreitol) and protease inhibitor cocktail (Roche) and the nuclei were collected by centrifugation at 2,500 r.p.m. Nuclei were resuspended in ‘sonication buffer' containing 50 mM HEPES pH 7.9, 140 mM NaCl, 1 mM EDTA, 1% Triton X-100, 0.1% Na-deoxycholate, 0.1% SDS and protease inhibitors, and sonicated on ice to an average length of 200–500 bp. The samples were centrifuged at 14,000 r.p.m. and the chromatin was precleared with protein-A-Dynabeads. Precleared chromatin was immunoprecipitated with 5 μg of H3K4me1 (ab8895, Abcam), 5 μg of H3K27ac (ab4729, Abcam) and 10 μg of p300 (C-20: sc585, Santa Cruz) antibodies and the immune complexes were collected by incubating with protein-A-Dynabeads. The beads were washed twice with ‘sonication buffer', twice with sonication buffer containing 500 mM NaCl, twice with 20 mM Tris, pH 8.0, 1 mM EDTA, 250 mM LiCl, 0.5% NP-40, 0.5% Na-deoxycholate and twice with TE buffer. The immunocomplexes were eluted with 50 mM Tris, pH 8.0, 1 mM EDTA and 1% SDS at 65 °C for 10 min, adjusted to 200 mM NaCl and incubated at 65 °C overnight to reverse the cross-links. After successive treatments with 10 μg ml
<sup>−1</sup>
RNase A and 20 μg ml
<sup>−1</sup>
proteinase K, the samples were eluted into 50 μl H<sub>2</sub>O using the QIAquick Spin Gel Purification Kit (Qiagen). ChIP DNA and input DNA were labelled with Cy5 and Cy3, respectively, using random priming with dye-labelled random hexamers and hybridized according to the manufacturer's protocol to a HX1 (2.16 million probes) custom microarray (NimbleGen) containing specific tiled regions encompassing 99.2 and 93.8 Mb of the human and mouse X chromosome, respectively. Arrays were scanned on a NimbleGen MS 200 Microarray scanner (NimbleGen) using a laser power of 100% and 2-μm resolution, and TIFF images were analysed using MS 200 Data Collection software to quantitate raw signal intensities. Computational analysis of the data was carried out using the Ringo R/Bioconductor package
<xref ref-type="bibr" rid="b29">29</xref>
. The Cy5/Cy3 log
<sub>2</sub>
ratios were calculated for each probe and scaled by subtracting Tukey's biweight mean, as recommended in the standard manufacturer's procedure (NimbleGen). Before calling ChIP-enriched regions, we smoothed the individual probe intensities. ChIP-enriched regions were called using the findChersOnSmoothed function from the Ringo package, using parameters distCutOff=100 and minProbesInRow=6. ChIP-chip data have been deposited in the GEO repository under accession number GSE57358. Human fetal tissue was obtained with informed consent and according to the protocol ethically approved by Southampton and South West Hants LREC. The principal investigator of these ethical approvals is D.I.W.</p>
</sec>
<sec disp-level="2">
<title>Zebrafish transgenic assays of human REG elements</title>
<p>Sequences chosen for testing were PCR-amplified from human genomic DNA as elements of 1–3 kb size and subcloned into pCR8 plasmid to create an entry vector for the Gateway system. Subsequent cloning into a Tol2-GFP-destination vector, microinjection of the plasmid into fertilized zebrafish eggs as well as fluorescent screening of the embryos, establishing transgenic lines and expression pattern documentation have been described elsewhere
<xref ref-type="bibr" rid="b30">30</xref>
. All experiments were approved by the animal ethics committee of the University of Sydney, were carried out in accordance with the German protection standards and were approved by the Government of Baden-Württemberg, Regierungspräsidium Karlsruhe, Germany.</p>
</sec>
<sec disp-level="2">
<title>CNE-target gene predictions and transgenic experiments</title>
<p>Transgenic elements tested in the course of this study were chosen based on a number of criteria, including sequence conservation, location near genes of medical interest and published information on enhancer function. Importantly, they were never chosen based on the linkage score described in Methods section 2. It is therefore possible to use the transgenic experiments to provide indirect support for two predictions:
<list id="l1" list-type="order">
<list-item>
<p>The regulatory potential of the CNE, if the latter drives specific and reproducible expression of the reporter gene (GFP) during zebrafish development.</p>
</list-item>
<list-item>
<p>The target gene being regulated by the CNE, if the GFP expression pattern overlaps the expression pattern of the predicted target.</p>
</list-item>
</list>
</p>
<p>The experiment may fail to deliver an interpretable result for reasons unrelated to whether the CNE functions as a human regulatory enhancer. For example, this may happen if the CNE regulates the expression of its target genes exclusively after zebrafish development is complete, if the reporter cassette (see Methods section 6) is integrated in a repressive chromatin environment, or if the human sequence element is not recognized by the zebrafish orthologue of the human TF (for example, if the zebrafish orthologue has an affinity for a different sequence, or if it is altogether absent from the zebrafish genome).</p>
<p>Here 436 human sequence elements were tested using zebrafish transgenic experiments (Methods). These sequence elements include 1,013 RegHsa elements (
<xref ref-type="supplementary-material" rid="S1">Supplementary Data 2</xref>
). Hereafter, results are described and discussed in terms of RegHsa elements, because RegHsa elements are the basic ‘units' of human sequences that are linked to target genes using the
<italic>S</italic>
<sub>
<italic>A</italic>
</sub>
score described in Methods. Of the 1,013 RegHsa, 574 (57%) overlap sequences that produced inconsistent expression patterns in the different F1 lines or no expression at all. The remaining 448 RegHsa produced partially or fully consistent GFP expression patterns and were further exploited. Of these, 125 elements are evolutionarily linked to one or several human genes with orthologues in zebrafish that have no recorded expression pattern in the ZFIN database. Therefore, these elements are not useful to assess the prediction that the RegHsa element is an enhancer that regulates its linked gene(s). Only the remaining 323 RegHsa elements fulfil the two conditions required to test if the transgenic experiment supports the prediction: they are contained in a sequence element that drives a partially reproducible or reproducible GFP pattern during zebrafish development, and their predicted human gene target(s) include at least one human gene with a zebrafish orthologue of known expression pattern. For the transgenic experiment, we examined the GFP expression pattern in at least five independent zebrafish F1 lines to assess the reproducibility of the pattern. The pattern was then manually recorded using ZFIN nomenclature according to the tissue(s) showing GFP expression. For the known expression pattern of the zebrafish orthologue(s), we listed the tissue(s) showing expression by
<italic>in situ</italic>
hybridization during development, or the tissue(s) affected by a mutation in the gene, or both (ZFIN database:
<ext-link ext-link-type="uri" xlink:href="http://zfin.org">http://zfin.org</ext-link>
). The GFP expression patterns and the ZFIN expression patterns were then compared, and results show that 200 RegHsa elements (60% of 323) drive a GFP expression pattern in a tissue that is included in the published expression pattern of the predicted target, or of one of the predicted targets when several exist with an identical maximum linkage score. A schematic diagram of the decision process described here is shown in
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 6</xref>
. We tested the possibility that this result may be due to a bias in the RegHsa elements. Indeed, the probability of a RegHsa element driving GFP expression by chance in at least one tissue in common with its predicted targets increases with the number of predicted targets. However, the 200 RegHsa elements that drive a GFP pattern that overlaps with the pattern of a target gene possess an average of 4.7 targets, while the 123 RegHsa elements that drive a GFP pattern that does not overlap with that of any of the target genes possess an average of 6.3 targets. The observed overlaps are thus not explained by a larger number of predicted targets. Therefore, the results are consistent with the starting hypothesis that a strong evolutionary linkage score between a CNE and one or more neighbouring genes reflects a regulatory role of the CNE on the expression of one of the linked genes.</p>
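<p>The comparison between GFP patterns and recorded expression patterns described in this section amounts to testing whether two sets of tissue terms share at least one element. A minimal sketch is given below, assuming that tissues are compared as exact strings; the function name patterns_overlap and the tissue and gene labels in the usage note are hypothetical, and relationships between anatomical terms (a structure included in a larger one) are not handled.</p>

```python
def patterns_overlap(gfp_tissues, target_patterns):
    """True if the GFP pattern shares a tissue with at least one predicted target.

    gfp_tissues: tissues showing GFP expression in the transgenic lines
    target_patterns: gene -> tissues recorded for its zebrafish orthologue
    """
    gfp = set(gfp_tissues)
    return any(gfp.intersection(tissues) for tissues in target_patterns.values())
```

<p>For example, a GFP pattern recorded in "hindbrain" and "eye" overlaps a predicted target expressed in "eye", but not a target recorded only in "heart".</p>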
</sec>
<sec disp-level="2">
<title>Anatomical characterization of zebrafish GFP expression</title>
<p>(a) Adult GFP expression analysis: the dissected brains of F1 adult (3–9 months) zebrafish from two different transgene integrations of each tested element were fixed in 4% paraformaldehyde for 4 h at room temperature. The following primary antibodies were applied onto free-floating 80-μm-thick vibratome sections: GFP (1:500, chicken, Aves Laboratories), HuC/D (1:2,000, human, a gift from Dr B. Zalc, Salpêtrière Hospital, Paris), glutamine-synthase (1:500, mouse, Millipore). DAPI (diamidino-2-phenylindole; 1:3,000) was used as a nuclear counterstain. Secondary antibodies raised in goat coupled to AlexaFluor dyes (Invitrogen) were used (1:1,000). HuC/D as a neuronal marker and glutamine-synthase as a glial marker label the two main cell types of the zebrafish telencephalon
<xref ref-type="bibr" rid="b31">31</xref>
and therefore make it possible to identify GFP expressing cells. All images were taken on a Zeiss LSM700 confocal microscope using × 20 air, × 40 oil or × 63 oil objectives. Images were processed using the ZEN software (Zeiss). Composite images were automatically stitched upon acquisition using ‘Tilescan' mode on the Zeiss ZEN software. (b) Adult mRNA expression analysis by chromogenic
<italic>in situ</italic>
hybridization: the dissected brains of adult (3–9 months) zebrafish from the wild-type AB strain were fixed in 4% paraformaldehyde for 14 h at 4 °C. Whole brains were incubated at 65 °C for 18 h in 2 ng μl
<sup>−1</sup>
digoxigenin (DIG)-labelled mRNA probes. After hybridization, the brains were embedded in 3% agarose and 80-μm-thick cross sections were cut using a vibratome. The sections were blocked in blocking buffer (2% normal goat serum, 2 mg ml
<sup>−1</sup>
bovine serum albumin) and incubated with anti-DIG AP Fab fragments (sheep, Roche, 1:5,000) and the signal was revealed with NBT/BCIP. Pictures were taken on a Nikon AZ100 microscope equipped with a Nikon DS Ri1 camera. Expression of GFP from transgenic lines and the expression of mRNA in wild-type fish were compared manually using neuroanatomical landmarks and immunohistochemical labels. (c) Detailed expression analysis in juvenile fish: F1 juvenile zebrafish (3 dpf and 6 dpf) from three different transgene integrations of each tested element were anaesthetised in MS-222 and fixed by immersion in 4% paraformaldehyde in 4% sucrose PBS (pH 7.3). Samples were split into two sets. One set (called neuroanatomy test) was examined using wholemount immunohistochemistry to detect GFP in the context of two immunohistochemical neuroanatomical markers: SV2 and acetylated α tubulin. These neuroanatomical markers provide well characterized neuroanatomical landmarks to interpret the location of GFP expression. The protocol followed was the same as that employed to prepare samples for
<ext-link ext-link-type="uri" xlink:href="zebrafishbrain.org">zebrafishbrain.org</ext-link>
<xref ref-type="bibr" rid="b32">32</xref>
. The second set (called
<italic>in situ</italic>
test) was used to perform wholemount fluorescent
<italic>in situ</italic>
hybridization using DIG-labelled probes and tyramide detection according to the protocol of Lauter
<italic>et al</italic>
.
<xref ref-type="bibr" rid="b33">33</xref>
followed by immunohistochemical detection of GFP. Both sets of samples were examined using confocal microscopy from a dorsal and lateral aspect (eye removed). Stacks were examined in 3D using Fiji software for neuroanatomical location and overlap between native gene expression and GFP expression. Frequently the
<italic>in situ</italic>
test set showed poor expression data for the
<italic>in situ</italic>
hybridization channel. For these sets,
<italic>in situ</italic>
hybridization was carried out on wild-type AB embryos using chromogenic detection of DIG-labelled probes according to the standard protocol of the Thisse laboratory
<xref ref-type="bibr" rid="b34">34</xref>
. Expression could then be compared between this sample and the neuroanatomical test sample. Output data took the form of text annotations of the neuroanatomical locations of GFP expression and its comparison with native zebrafish gene expression.</p>
</sec>
<sec disp-level="2">
<title>Analysis of sequence motifs in RegHsa elements</title>
<p>(a)
<italic>De novo</italic>
motif identification in CNEs. Conserved motifs were searched for in each set of CNEs constituting a given REG element, provided that the set fulfilled the following conditions (to minimize false positives): it must be associated with a single best target gene with a linkage score >0.3 and a relative score >0.05. Only sets comprising at least 10 CNEs (153 sets in total) were searched for possible motif enrichment. Motifs were detected using MEME3 (ref.
<xref ref-type="bibr" rid="b35">35</xref>
) with the following options and parameters: -dna -nmotifs 15 -revcomp -mod anr -wg 6 -ws 1 -minsites 5 -maxw 8. The different motif occurrences identified by MEME in the CNEs were further reviewed to increase the motif stringency. This was done by removing sequences presenting <80% identity with the first motif occurrence identified by MEME, which is considered to be the most similar to the motif. A threshold score characterizing each motif is then defined as the lowest weight obtained while matching the motif against each of its constitutive sequences, using the matrix-scan program of the RSATools suite
<xref ref-type="bibr" rid="b36">36</xref>
. This score is then used to seek the motif in other control CNE sets. For all RSAT tools used here, the background option (‘-bgfile') was applied, with background statistics calculated on the entire set of CNEs using the oligo-analysis program with the following parameters: -l 2 -1str -return freq. This program thus determined the frequencies of every possible dinucleotide in the total set of CNEs, and used these as background frequencies to compute the significance of observed motifs. (b) Are the motifs significantly overrepresented? Two statistical tests are further applied to eliminate motifs that may be due to chance occurrence. The first test consists in calculating a
<italic>P</italic>
value associated to the number of motif hits observed in the CNE set, by searching the motif in 1,000 random sets comprising the same number of CNEs, using matrix-scan and the weight threshold value previously computed. This
<italic>P</italic>
value reflects the number of times an equal or higher number of motif occurrences are found by chance, compared with the set of CNEs predicted to target the same gene. The second test consists in the search for motifs in the same CNE set but using shuffled motifs. These shuffled motifs are obtained by a column permutation of the motif of interest (reference motif), repeated up to 1,000 times until we obtain up to 10 motifs that are significantly different from the reference motif and from each other (the Pearson coefficient of correlation between position weight matrices, obtained by RSAT compare-matrices must be <0.30). Motifs were ultimately considered significant with this second test if none of the shuffled matrices found >2/3 of the number of matches found by the original motif, in the same CNE set. (c) Comparing motifs between sets of CNEs: after this filtering step, motifs obtained for distinct sets of CNEs targeting different genes were compared using the RSAT compare-matrices program
<xref ref-type="bibr" rid="b36">36</xref>
. Two motifs were considered as similar if the Pearson coefficient of correlation between their position weight matrices, further weighted by the length of the match, was >800. (d) Are CNEs enriched in known motifs? We computed the proportion of CNEs that match known motifs, as a function of increasing evolutionary linkage score to a neighbouring gene (similar to
<xref ref-type="fig" rid="f1">Fig. 1b</xref>
). CNEs were divided in classes of increasing linkage score, and each class was compared with the TRANSFAC database (complete vertebrate motifs; version 2010)
<xref ref-type="bibr" rid="b37">37</xref>
, to a list of sites established by high throughput SELEX
<xref ref-type="bibr" rid="b38">38</xref>
and to matrices from the JASPAR database (version 2011)
<xref ref-type="bibr" rid="b39">39</xref>
(
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 4</xref>
). Matches between CNEs and matrices of known motifs were identified using the matrix-scan program from RSAT
<xref ref-type="bibr" rid="b36">36</xref>
, with the background as described above and with the following parameters: -1str -lth score 5.0. Only motifs showing a score >15 were considered. A full description of motifs shown in
<xref ref-type="fig" rid="f4">Fig. 4</xref>
is in
<xref ref-type="supplementary-material" rid="S1">Supplementary Data 4</xref>
.</p>
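<p>Two of the filtering steps described in this section, the removal of motif occurrences with less than 80% identity to the first occurrence and the generation of shuffled motifs by column permutation, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, matrices are represented as lists of columns, and the plain Pearson correlation computed here (without any alignment offset or length weighting) is a simplified stand-in for the output of RSAT compare-matrices, not a reimplementation of it.</p>

```python
import random


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def filter_occurrences(occurrences, min_identity=0.8):
    """Keep occurrences with at least min_identity to the first occurrence."""
    reference = occurrences[0]
    return [s for s in occurrences if identity(s, reference) >= min_identity]


def shuffle_columns(matrix, rng):
    """Return a copy of a position weight matrix with its columns permuted.

    matrix is a list of columns, each a list of four values (A, C, G, T).
    """
    shuffled = list(matrix)
    rng.shuffle(shuffled)
    return shuffled


def pearson(matrix_a, matrix_b):
    """Pearson correlation between two matrices of identical dimensions."""
    xs = [v for column in matrix_a for v in column]
    ys = [v for column in matrix_b for v in column]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5
```

<p>In such a sketch, a shuffled matrix would be retained as a control only if its correlation with the reference motif, and with the shuffled motifs already retained, is below the 0.30 threshold given in the text.</p>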
</sec>
<sec disp-level="2">
<title>Code availability</title>
<p>Python scripts to identify CNEs in multiple alignments and to compute the linkage score are freely available under the GNU GPL v3 or later, and under a CeCILL v2 license in France, as a GitHub project named Regulus:
<ext-link ext-link-type="uri" xlink:href="https://github.com/DyogenIBENS/Regulus">https://github.com/DyogenIBENS/Regulus</ext-link>
.</p>
</sec>
</sec>
<sec disp-level="1">
<title>Author contributions</title>
<p>M.N. and H.R.C. designed the evolutionary genomics method and M.N. performed analyses with help from A.L. M.I., M.F., E.M., S.Ri. and S.Ra. performed zebrafish transgenic experiments. H.B. performed ChIP-chip experiments. M.K. and T.A.H. performed zebrafish
<italic>in situ</italic>
experiments. C.S.R.C. analysed ChIP-chip data. D.I.W. provided human fetal tissues. D.R.F., V.V.H., S.W., B.L., U.S., L.B.-C., T.S.B. and H.R.C. co-led the project with advice from F.L.R. T.S.B. designed the initial study. M.N. and H.R.C. wrote the manuscript with contributions from L.B.-C., M.K., T.A.H., U.S., B.L., V.V.H., T.S.B. and D.R.F.</p>
</sec>
<sec disp-level="1">
<title>Additional information</title>
<p>
<bold>Accession codes:</bold>
ChIP-chip data have been deposited to the GEO repository under accession code
<ext-link ext-link-type="NCBI:geo" xlink:href="GSE57358">GSE57358</ext-link>
.</p>
<p>
<bold>How to cite this article:</bold>
Naville, M.
<italic>et al</italic>
. Long-range evolutionary constraints reveal
<italic>cis</italic>
-regulatory interactions on the human X chromosome.
<italic>Nat. Commun</italic>
. 6:6904 doi: 10.1038/ncomms7904 (2015).</p>
</sec>
<sec sec-type="supplementary-material" id="S1">
<title>Supplementary Material</title>
<supplementary-material id="d33e18" content-type="local-data">
<caption>
<title>Supplementary Information</title>
<p>Supplementary Figures 1-7, Supplementary Tables 1-2 and Supplementary References</p>
</caption>
<media xlink:href="ncomms7904-s1.pdf"></media>
</supplementary-material>
<supplementary-material id="d33e24" content-type="local-data">
<caption>
<title>Supplementary Data 1</title>
<p>List of 102,647 RegHsa elements, their chromosomal positions, their linkage score and their predicted target genes.</p>
</caption>
<media xlink:href="ncomms7904-s2.txt"></media>
</supplementary-material>
<supplementary-material id="d33e30" content-type="local-data">
<caption>
<title>Supplementary Data 2</title>
<p>List of 1,013 RegHsa elements and the results of the transgenic experiments.</p>
</caption>
<media xlink:href="ncomms7904-s3.xlsx"></media>
</supplementary-material>
<supplementary-material id="d33e36" content-type="local-data">
<caption>
<title>Supplementary Data 3</title>
<p>Results of detailed comparison between transgenic GFP reporter assays and in situ mRNA patterns of predicted targets.</p>
</caption>
<media xlink:href="ncomms7904-s4.xlsx"></media>
</supplementary-material>
<supplementary-material id="d33e42" content-type="local-data">
<caption>
<title>Supplementary Data 4</title>
<p>List of individual sequence instances contributing to motifs described in Figure 4 of the main text.</p>
</caption>
<media xlink:href="ncomms7904-s5.xlsx"></media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<p>This work was funded by the 7th framework programme of the European Union (NeuroXsys Project HEALTH-F4-2009-223262). T.H. and S.W. were additionally supported by BBSRC grant FBACJ 512988 and H.R.C. received support under the programme « Investissements d'Avenir » launched by the French Government and implemented by the ANR (ANR-10-LABX-54 MEMO LIFE; ANR-11-IDEX-0001-02 PSL* Research University).</p>
</ack>
<ref-list>
<ref id="b1">
<mixed-citation publication-type="journal">
<name>
<surname>Benko</surname>
<given-names>S.</given-names>
</name>
<etal></etal>
.
<article-title>Highly conserved non-coding elements on either side of SOX9 associated with Pierre Robin sequence</article-title>
.
<source>Nat. Genet.</source>
<volume>41</volume>
,
<fpage>359</fpage>
<lpage>364</lpage>
(
<year>2009</year>
).
<pub-id pub-id-type="pmid">19234473</pub-id>
</mixed-citation>
</ref>
<ref id="b2">
<mixed-citation publication-type="journal">
<name>
<surname>Lettice</surname>
<given-names>L. A.</given-names>
</name>
<etal></etal>
.
<article-title>A long-range Shh enhancer regulates expression in the developing limb and fin and is associated with preaxial polydactyly</article-title>
.
<source>Hum. Mol. Genet.</source>
<volume>12</volume>
,
<fpage>1725</fpage>
<lpage>1735</lpage>
(
<year>2003</year>
).
<pub-id pub-id-type="pmid">12837695</pub-id>
</mixed-citation>
</ref>
<ref id="b3">
<mixed-citation publication-type="journal">
<name>
<surname>Goode</surname>
<given-names>D. K.</given-names>
</name>
,
<name>
<surname>Snell</surname>
<given-names>P.</given-names>
</name>
,
<name>
<surname>Smith</surname>
<given-names>S. F.</given-names>
</name>
,
<name>
<surname>Cooke</surname>
<given-names>J. E.</given-names>
</name>
&amp;
<name>
<surname>Elgar</surname>
<given-names>G.</given-names>
</name>
<article-title>Highly conserved regulatory elements around the SHH gene may contribute to the maintenance of conserved synteny across human chromosome 7q36.3</article-title>
.
<source>Genomics</source>
<volume>86</volume>
,
<fpage>172</fpage>
<lpage>181</lpage>
(
<year>2005</year>
).
<pub-id pub-id-type="pmid">15939571</pub-id>
</mixed-citation>
</ref>
<ref id="b4">
<mixed-citation publication-type="journal">
<name>
<surname>Kikuta</surname>
<given-names>H.</given-names>
</name>
<etal></etal>
.
<article-title>Genomic regulatory blocks encompass multiple neighboring genes and maintain conserved synteny in vertebrates</article-title>
.
<source>Genome Res.</source>
<volume>17</volume>
,
<fpage>545</fpage>
<lpage>555</lpage>
(
<year>2007</year>
).
<pub-id pub-id-type="pmid">17387144</pub-id>
</mixed-citation>
</ref>
<ref id="b5">
<mixed-citation publication-type="journal">
<name>
<surname>Mongin</surname>
<given-names>E.</given-names>
</name>
,
<name>
<surname>Dewar</surname>
<given-names>K.</given-names>
</name>
&amp;
<name>
<surname>Blanchette</surname>
<given-names>M.</given-names>
</name>
<article-title>Mapping association between long-range cis-regulatory regions and their target genes using synteny</article-title>
.
<source>J. Comput. Biol.</source>
<volume>18</volume>
,
<fpage>1115</fpage>
<lpage>1130</lpage>
(
<year>2011</year>
).
<pub-id pub-id-type="pmid">21899419</pub-id>
</mixed-citation>
</ref>
<ref id="b6">
<mixed-citation publication-type="journal">
<name>
<surname>Blanchette</surname>
<given-names>M.</given-names>
</name>
<etal></etal>
.
<article-title>Aligning multiple genomic sequences with the threaded blockset aligner</article-title>
.
<source>Genome Res.</source>
<volume>14</volume>
,
<fpage>708</fpage>
<lpage>715</lpage>
(
<year>2004</year>
).
<pub-id pub-id-type="pmid">15060014</pub-id>
</mixed-citation>
</ref>
<ref id="b7">
<mixed-citation publication-type="journal">
<name>
<surname>Lindblad-Toh</surname>
<given-names>K.</given-names>
</name>
<etal></etal>
.
<article-title>A high-resolution map of human evolutionary constraint using 29 mammals</article-title>
.
<source>Nature</source>
<volume>478</volume>
,
<fpage>476</fpage>
<lpage>482</lpage>
(
<year>2011</year>
).
<pub-id pub-id-type="pmid">21993624</pub-id>
</mixed-citation>
</ref>
<ref id="b8">
<mixed-citation publication-type="journal">
<name>
<surname>Ernst</surname>
<given-names>J.</given-names>
</name>
<etal></etal>
.
<article-title>Mapping and analysis of chromatin state dynamics in nine human cell types</article-title>
.
<source>Nature</source>
<volume>473</volume>
,
<fpage>43</fpage>
<lpage>49</lpage>
(
<year>2011</year>
).
<pub-id pub-id-type="pmid">21441907</pub-id>
</mixed-citation>
</ref>
<ref id="b9">
<mixed-citation publication-type="journal">
<name>
<surname>Blow</surname>
<given-names>M. J.</given-names>
</name>
<etal></etal>
.
<article-title>ChIP-Seq identification of weakly conserved heart enhancers</article-title>
.
<source>Nat. Genet.</source>
<volume>42</volume>
,
<fpage>806</fpage>
<lpage>810</lpage>
(
<year>2010</year>
).
<pub-id pub-id-type="pmid">20729851</pub-id>
</mixed-citation>
</ref>
<ref id="b10">
<mixed-citation publication-type="journal">
<name>
<surname>Visel</surname>
<given-names>A.</given-names>
</name>
<etal></etal>
.
<article-title>ChIP-seq accurately predicts tissue-specific activity of enhancers</article-title>
.
<source>Nature</source>
<volume>457</volume>
,
<fpage>854</fpage>
<lpage>858</lpage>
(
<year>2009</year>
).
<pub-id pub-id-type="pmid">19212405</pub-id>
</mixed-citation>
</ref>
<ref id="b11">
<mixed-citation publication-type="journal">
<name>
<surname>Stevenson</surname>
<given-names>R. E.</given-names>
</name>
&amp;
<name>
<surname>Schwartz</surname>
<given-names>C. E.</given-names>
</name>
<article-title>X-linked intellectual disability: unique vulnerability of the male genome</article-title>
.
<source>Dev. Disabil. Res. Rev.</source>
<volume>15</volume>
,
<fpage>361</fpage>
<lpage>368</lpage>
(
<year>2009</year>
).
<pub-id pub-id-type="pmid">20014364</pub-id>
</mixed-citation>
</ref>
<ref id="b12">
<mixed-citation publication-type="journal">
<name>
<surname>Sprague</surname>
<given-names>J.</given-names>
</name>
<etal></etal>
.
<article-title>The Zebrafish Information Network: the zebrafish model organism database</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>34</volume>
,
<fpage>D581</fpage>
<lpage>D585</lpage>
(
<year>2006</year>
).
<pub-id pub-id-type="pmid">16381936</pub-id>
</mixed-citation>
</ref>
<ref id="b13">
<mixed-citation publication-type="journal">
<name>
<surname>Li</surname>
<given-names>G.</given-names>
</name>
<etal></etal>
.
<article-title>Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation</article-title>
.
<source>Cell</source>
<volume>148</volume>
,
<fpage>84</fpage>
<lpage>98</lpage>
(
<year>2012</year>
).
<pub-id pub-id-type="pmid">22265404</pub-id>
</mixed-citation>
</ref>
<ref id="b14">
<mixed-citation publication-type="journal">
<name>
<surname>Fong</surname>
<given-names>A. P.</given-names>
</name>
<etal></etal>
.
<article-title>Genetic and epigenetic determinants of neurogenesis and myogenesis</article-title>
.
<source>Dev. Cell</source>
<volume>22</volume>
,
<fpage>721</fpage>
<lpage>735</lpage>
(
<year>2012</year>
).
<pub-id pub-id-type="pmid">22445365</pub-id>
</mixed-citation>
</ref>
<ref id="b15">
<mixed-citation publication-type="journal">
<name>
<surname>Ince-Dunn</surname>
<given-names>G.</given-names>
</name>
<etal></etal>
.
<article-title>Regulation of thalamocortical patterning and synaptic maturation by NeuroD2</article-title>
.
<source>Neuron</source>
<volume>49</volume>
,
<fpage>683</fpage>
<lpage>695</lpage>
(
<year>2006</year>
).
<pub-id pub-id-type="pmid">16504944</pub-id>
</mixed-citation>
</ref>
<ref id="b16">
<mixed-citation publication-type="journal">
<name>
<surname>Narayanan</surname>
<given-names>G.</given-names>
</name>
<etal></etal>
.
<article-title>Single-cell mRNA profiling identifies progenitor subclasses in neurospheres</article-title>
.
<source>Stem Cells Dev.</source>
<volume>21</volume>
,
<fpage>3351</fpage>
<lpage>3362</lpage>
(
<year>2012</year>
).
<pub-id pub-id-type="pmid">22834539</pub-id>
</mixed-citation>
</ref>
<ref id="b17">
<mixed-citation publication-type="journal">
<name>
<surname>Chepelev</surname>
<given-names>I.</given-names>
</name>
,
<name>
<surname>Wei</surname>
<given-names>G.</given-names>
</name>
,
<name>
<surname>Wangsa</surname>
<given-names>D.</given-names>
</name>
,
<name>
<surname>Tang</surname>
<given-names>Q.</given-names>
</name>
&amp;
<name>
<surname>Zhao</surname>
<given-names>K.</given-names>
</name>
<article-title>Characterization of genome-wide enhancer-promoter interactions reveals co-expression of interacting genes and modes of higher order chromatin organization</article-title>
.
<source>Cell Res.</source>
<volume>22</volume>
,
<fpage>490</fpage>
<lpage>503</lpage>
(
<year>2012</year>
).
<pub-id pub-id-type="pmid">22270183</pub-id>
</mixed-citation>
</ref>
<ref id="b18">
<mixed-citation publication-type="journal">
<name>
<surname>Merkenschlager</surname>
<given-names>M.</given-names>
</name>
&amp;
<name>
<surname>Odom</surname>
<given-names>D. T.</given-names>
</name>
<article-title>CTCF and cohesin: linking gene regulatory elements with their targets</article-title>
.
<source>Cell</source>
<volume>152</volume>
,
<fpage>1285</fpage>
<lpage>1297</lpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">23498937</pub-id>
</mixed-citation>
</ref>
<ref id="b19">
<mixed-citation publication-type="journal">
<name>
<surname>Andersson</surname>
<given-names>R.</given-names>
</name>
<etal></etal>
.
<article-title>An atlas of active enhancers across human cell types and tissues</article-title>
.
<source>Nature</source>
<volume>507</volume>
,
<fpage>455</fpage>
<lpage>461</lpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">24670763</pub-id>
</mixed-citation>
</ref>
<ref id="b20">
<mixed-citation publication-type="journal">
<name>
<surname>Kleinjan</surname>
<given-names>D. A.</given-names>
</name>
&amp;
<name>
<surname>van Heyningen</surname>
<given-names>V.</given-names>
</name>
<article-title>Long-range control of gene expression: emerging mechanisms and disruption in disease</article-title>
.
<source>Am. J. Hum. Genet.</source>
<volume>76</volume>
,
<fpage>8</fpage>
<lpage>32</lpage>
(
<year>2005</year>
).
<pub-id pub-id-type="pmid">15549674</pub-id>
</mixed-citation>
</ref>
<ref id="b21">
<mixed-citation publication-type="journal">
<name>
<surname>Benko</surname>
<given-names>S.</given-names>
</name>
<etal></etal>
.
<article-title>Highly conserved non-coding elements on either side of SOX9 associated with Pierre Robin sequence</article-title>
.
<source>Nat. Genet.</source>
<volume>41</volume>
,
<fpage>359</fpage>
<lpage>364</lpage>
(
<year>2009</year>
).
<pub-id pub-id-type="pmid">19234473</pub-id>
</mixed-citation>
</ref>
<ref id="b22">
<mixed-citation publication-type="journal">
<name>
<surname>Smemo</surname>
<given-names>S.</given-names>
</name>
<etal></etal>
.
<article-title>Regulatory variation in a TBX5 enhancer leads to isolated congenital heart disease</article-title>
.
<source>Hum. Mol. Genet.</source>
<volume>21</volume>
,
<fpage>3255</fpage>
<lpage>3263</lpage>
(
<year>2012</year>
).
<pub-id pub-id-type="pmid">22543974</pub-id>
</mixed-citation>
</ref>
<ref id="b23">
<mixed-citation publication-type="journal">
<name>
<surname>Weedon</surname>
<given-names>M. N.</given-names>
</name>
<etal></etal>
.
<article-title>Recessive mutations in a distal PTF1A enhancer cause isolated pancreatic agenesis</article-title>
.
<source>Nat. Genet.</source>
<volume>46</volume>
,
<fpage>61</fpage>
<lpage>64</lpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">24212882</pub-id>
</mixed-citation>
</ref>
<ref id="b24">
<mixed-citation publication-type="journal">
<name>
<surname>Noonan</surname>
<given-names>J. P.</given-names>
</name>
&amp;
<name>
<surname>McCallion</surname>
<given-names>A. S.</given-names>
</name>
<article-title>Genomics of long-range regulatory elements</article-title>
.
<source>Annu. Rev. Genomics Hum. Genet.</source>
<volume>11</volume>
,
<fpage>1</fpage>
<lpage>23</lpage>
(
<year>2010</year>
).
<pub-id pub-id-type="pmid">20438361</pub-id>
</mixed-citation>
</ref>
<ref id="b25">
<mixed-citation publication-type="journal">
<name>
<surname>Flicek</surname>
<given-names>P.</given-names>
</name>
<etal></etal>
.
<article-title>Ensembl 2012</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>40</volume>
,
<fpage>D84</fpage>
<lpage>D90</lpage>
(
<year>2012</year>
).
<pub-id pub-id-type="pmid">22086963</pub-id>
</mixed-citation>
</ref>
<ref id="b26">
<mixed-citation publication-type="journal">
<name>
<surname>Dunham</surname>
<given-names>I.</given-names>
</name>
<etal></etal>
.
<article-title>An integrated encyclopedia of DNA elements in the human genome</article-title>
.
<source>Nature</source>
<volume>489</volume>
,
<fpage>57</fpage>
<lpage>74</lpage>
(
<year>2012</year>
).
<pub-id pub-id-type="pmid">22955616</pub-id>
</mixed-citation>
</ref>
<ref id="b27">
<mixed-citation publication-type="journal">
<name>
<surname>Ashburner</surname>
<given-names>M.</given-names>
</name>
<etal></etal>
.
<article-title>Gene ontology: tool for the unification of biology. The Gene Ontology Consortium</article-title>
.
<source>Nat. Genet.</source>
<volume>25</volume>
,
<fpage>25</fpage>
<lpage>29</lpage>
(
<year>2000</year>
).
<pub-id pub-id-type="pmid">10802651</pub-id>
</mixed-citation>
</ref>
<ref id="b28">
<mixed-citation publication-type="journal">
<name>
<surname>Soutoglou</surname>
<given-names>E.</given-names>
</name>
&amp;
<name>
<surname>Talianidis</surname>
<given-names>I.</given-names>
</name>
<article-title>Coordination of PIC assembly and chromatin remodeling during differentiation-induced gene activation</article-title>
.
<source>Science</source>
<volume>295</volume>
,
<fpage>1901</fpage>
<lpage>1904</lpage>
(
<year>2002</year>
).
<pub-id pub-id-type="pmid">11884757</pub-id>
</mixed-citation>
</ref>
<ref id="b29">
<mixed-citation publication-type="journal">
<name>
<surname>Toedling</surname>
<given-names>J.</given-names>
</name>
<etal></etal>
.
<article-title>Ringo--an R/Bioconductor package for analyzing ChIP-chip readouts</article-title>
.
<source>BMC Bioinformatics</source>
<volume>8</volume>
, (
<year>2007</year>
).</mixed-citation>
</ref>
<ref id="b30">
<mixed-citation publication-type="journal">
<name>
<surname>Ishibashi</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Mechaly</surname>
<given-names>A. S.</given-names>
</name>
,
<name>
<surname>Becker</surname>
<given-names>T. S.</given-names>
</name>
&amp;
<name>
<surname>Rinkwitz</surname>
<given-names>S.</given-names>
</name>
<article-title>Using zebrafish transgenesis to test human genomic sequences for specific enhancer activity</article-title>
.
<source>Methods</source>
<volume>62</volume>
,
<fpage>216</fpage>
<lpage>225</lpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">23542551</pub-id>
</mixed-citation>
</ref>
<ref id="b31">
<mixed-citation publication-type="journal">
<name>
<surname>Marz</surname>
<given-names>M.</given-names>
</name>
<etal></etal>
.
<article-title>Heterogeneity in progenitor cell subtypes in the ventricular zone of the zebrafish adult telencephalon</article-title>
.
<source>Glia</source>
<volume>58</volume>
,
<fpage>870</fpage>
<lpage>888</lpage>
(
<year>2010</year>
).
<pub-id pub-id-type="pmid">20155821</pub-id>
</mixed-citation>
</ref>
<ref id="b32">
<mixed-citation publication-type="journal">
<name>
<surname>Turner</surname>
<given-names>K. J.</given-names>
</name>
,
<name>
<surname>Bracewell</surname>
<given-names>T. G.</given-names>
</name>
&amp;
<name>
<surname>Hawkins</surname>
<given-names>T. A.</given-names>
</name>
<article-title>Anatomical dissection of zebrafish brain development</article-title>
.
<source>Methods Mol. Biol.</source>
<volume>1082</volume>
,
<fpage>197</fpage>
<lpage>214</lpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">24048936</pub-id>
</mixed-citation>
</ref>
<ref id="b33">
<mixed-citation publication-type="journal">
<name>
<surname>Lauter</surname>
<given-names>G.</given-names>
</name>
,
<name>
<surname>Soll</surname>
<given-names>I.</given-names>
</name>
&amp;
<name>
<surname>Hauptmann</surname>
<given-names>G.</given-names>
</name>
<article-title>Sensitive whole-mount fluorescent
<italic>in situ</italic>
hybridization in zebrafish using enhanced tyramide signal amplification</article-title>
.
<source>Methods Mol. Biol.</source>
<volume>1082</volume>
,
<fpage>175</fpage>
<lpage>185</lpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">24048934</pub-id>
</mixed-citation>
</ref>
<ref id="b34">
<mixed-citation publication-type="journal">
<name>
<surname>Thisse</surname>
<given-names>C.</given-names>
</name>
&amp;
<name>
<surname>Thisse</surname>
<given-names>B.</given-names>
</name>
<article-title>High-resolution in situ hybridization to whole-mount zebrafish embryos</article-title>
.
<source>Nat. Protoc.</source>
<volume>3</volume>
,
<fpage>59</fpage>
<lpage>69</lpage>
(
<year>2008</year>
).
<pub-id pub-id-type="pmid">18193022</pub-id>
</mixed-citation>
</ref>
<ref id="b35">
<mixed-citation publication-type="journal">
<name>
<surname>Bailey</surname>
<given-names>T. L.</given-names>
</name>
<etal></etal>
.
<article-title>MEME SUITE: tools for motif discovery and searching</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>37</volume>
,
<fpage>W202</fpage>
<lpage>W208</lpage>
(
<year>2009</year>
).
<pub-id pub-id-type="pmid">19458158</pub-id>
</mixed-citation>
</ref>
<ref id="b36">
<mixed-citation publication-type="journal">
<name>
<surname>Thomas-Chollier</surname>
<given-names>M.</given-names>
</name>
<etal></etal>
.
<article-title>RSAT 2011: regulatory sequence analysis tools</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>39</volume>
,
<fpage>W86</fpage>
<lpage>W91</lpage>
(
<year>2011</year>
).
<pub-id pub-id-type="pmid">21715389</pub-id>
</mixed-citation>
</ref>
<ref id="b37">
<mixed-citation publication-type="journal">
<name>
<surname>Wingender</surname>
<given-names>E.</given-names>
</name>
<etal></etal>
.
<article-title>The TRANSFAC system on gene expression regulation</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>29</volume>
,
<fpage>281</fpage>
<lpage>283</lpage>
(
<year>2001</year>
).
<pub-id pub-id-type="pmid">11125113</pub-id>
</mixed-citation>
</ref>
<ref id="b38">
<mixed-citation publication-type="journal">
<name>
<surname>Jolma</surname>
<given-names>A.</given-names>
</name>
<etal></etal>
.
<article-title>DNA-binding specificities of human transcription factors</article-title>
.
<source>Cell</source>
<volume>152</volume>
,
<fpage>327</fpage>
<lpage>339</lpage>
(
<year>2013</year>
).
<pub-id pub-id-type="pmid">23332764</pub-id>
</mixed-citation>
</ref>
<ref id="b39">
<mixed-citation publication-type="journal">
<name>
<surname>Mathelier</surname>
<given-names>A.</given-names>
</name>
<etal></etal>
.
<article-title>JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>42</volume>
,
<fpage>D142</fpage>
<lpage>D147</lpage>
(
<year>2014</year>
).
<pub-id pub-id-type="pmid">24194598</pub-id>
</mixed-citation>
</ref>
<ref id="b40">
<mixed-citation publication-type="journal">
<name>
<surname>Rothenaigner</surname>
<given-names>I.</given-names>
</name>
<etal></etal>
.
<article-title>Clonal analysis by distinct viral vectors identifies bona fide neural stem cells in the adult zebrafish telencephalon and characterizes their division properties and fate</article-title>
.
<source>Development</source>
<volume>138</volume>
,
<fpage>1459</fpage>
<lpage>1469</lpage>
(
<year>2011</year>
).
<pub-id pub-id-type="pmid">21367818</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
<floats-group>
<fig id="f1">
<label>Figure 1</label>
<caption>
<title>Scoring evolutionary linkage.</title>
<p>(
<bold>a</bold>
) Strategy to compute the linkage score. The presence of human genes within a 1-Mb radius around a CNE is recorded, as well as the simultaneous presence or absence of their orthologues in the vicinity of the orthologous CNEs in different species (green ticks and red crosses, respectively, in the middle panel; hash signs indicate genes located beyond the 1-Mb threshold). The presence of an orthologue is weighted by the degree of conserved synteny
<italic>R</italic>
between that genome and the human genome, while the cost for the absence of a gene accounts for the sequencing coverage <italic>C</italic> of the genome. The final linkage score
<italic>S</italic>
is the sum of these weights over the genomes in which the CNE is present (right panel). The gene or genes with the maximum linkage score for a given CNE are considered the most likely targets. (
<bold>b</bold>
) The linkage scores of the CNE-target predictions were grouped in bins according to the genomic distance between the CNE and its predicted target (
<italic>x</italic>
axis). The median linkage score of the distributions (
<italic>y</italic>
axis) is stable for genes located up to ∼600 kb from the RegHsa element. (
<bold>c</bold>
) The linkage score is strongly correlated with an enrichment in annotations linked to enhancer function. An asterisk indicates data generated during this project.</p>
</caption>
<graphic xlink:href="ncomms7904-f1"></graphic>
</fig>
<fig id="f2">
<label>Figure 2</label>
<caption>
<title>
<italic>Cis</italic>
-regulatory interactions predicted by the linkage score are experimentally tested in developing zebrafish.</title>
<p>(
<bold>a</bold>
) Individual exons of the predicted target gene are depicted in green and those of neighbouring genes in pink. The arrowhead indicates the direction of transcription. Distances in kilobases between the CNE and the promoter of the predicted gene are indicated. (
<bold>b</bold>
) The predictions are supported by transgenic analysis in zebrafish. Expression at 48 hpf: NX_hs79: telencephalon (scale bar, 125 μm); NX_hs54: hindbrain, telencephalon (scale bar, 125 μm); NX_hs162: telencephalon, hypothalamus, otic vesicle (scale bar, 125 μm); NX_hs226: hindbrain (scale bar, 200 μm); NX_hs375: midbrain (scale bar, 200 μm).</p>
</caption>
<graphic xlink:href="ncomms7904-f2"></graphic>
</fig>
<fig id="f3">
<label>Figure 3</label>
<caption>
<title>Neuroanatomical characterization of the element NX_hs54.</title>
<p>This element includes RegHsa0032185 and was characterized in transgenic adult and juvenile zebrafish. (
<bold>a</bold>
–
<bold>d</bold>
) Immunohistochemical analysis of S100β (grey, radial glial stem cells), GFP (green), and Hu (magenta, neurons) expression in the telencephalon (level in
<bold>g</bold>
) in two different transgene integrations (2–1 and 4–1). Radial glial stem cells outline the telencephalic surface (yellow arrows,
<bold>b</bold>
) and generate neurons (white arrows,
<bold>b</bold>
)
<xref ref-type="bibr" rid="b40">40</xref>
. In one integration, GFP is expressed by virtually all neurons and their fibres underneath the radial glial cell layer (
<bold>b</bold>
). In the other integration (
<bold>c</bold>
,
<bold>d</bold>
), likely due to positional effects, GFP expression is restricted to individual neuronal clones (grey arrows). (
<bold>e</bold>
)
<italic>In situ</italic>
hybridization for endogenous
<italic>bcor</italic>
mRNA in the adult zebrafish telencephalon (level in
<bold>g</bold>
).
<italic>bcor</italic>
mRNA is expressed by the newborn neurons (white arrow,
<bold>f</bold>
) underlying the first cell layer of radial glial stem cells (yellow arrow,
<bold>f</bold>
). The extended GFP expression in transgenic lines is in agreement with GFP protein stability in neurons after endogenous
<italic>bcor</italic>
expression is switched off, and/or with the absence of a repressor element. (
<bold>g</bold>
) Schematic lateral and dorsal views of an adult zebrafish brain showing the region (red line) examined in
<bold>a</bold>
,
<bold>c</bold>
,
<bold>e</bold>
. (
<bold>h</bold>
,
<bold>j</bold>
) Immunohistochemical characterization of juvenile GFP expression in NX_hs54#4-1 demonstrates overlap with endogenous
<italic>bcor</italic>
expression (
<bold>l</bold>
,
<bold>m</bold>
). Use of two anatomical markers: acetylated tubulin (
<bold>h,i</bold>
,
<bold>j,k</bold>
; magenta) and nuclear staining (
<bold>i</bold>
,
<bold>k</bold>
; greyscale) allows GFP expression in the telencephalon to be described at two different section levels by confocal microscopy (
<bold>h</bold>
anterior to
<bold>j</bold>
). At 3 dpf in NX_hs54#4-1 transgenic embryos, GFP is widely expressed at a low level but also shows strong expression in the dorsal and lateral area adjacent to the ventricle (
<bold>h</bold>
,
<bold>j</bold>
; white arrowheads). This is similar to endogenous
<italic>bcor</italic>
mRNA, which also shows low-level expression throughout the telencephalon and whole brain but has an area of strong expression next to the ventricle (
<bold>l</bold>
,
<bold>m</bold>
; yellow arrowheads, ventricle boundary marked by red dashed line). Abbreviations: AC, anterior commissure; tel, telencephalon; OB, olfactory bulb. Scale bars,
<bold>a</bold>
,
<bold>c</bold>
,
<bold>e</bold>
, 100 μm;
<bold>b</bold>
,
<bold>f</bold>
, 60 μm;
<bold>d</bold>
, 40 μm;
<bold>h</bold>
,
<bold>i</bold>
,
<bold>j</bold>
,
<bold>k</bold>
, 100 μm;
<bold>l</bold>
,
<bold>m</bold>
, 40 μm.</p>
</caption>
<graphic xlink:href="ncomms7904-f3"></graphic>
</fig>
<fig id="f4">
<label>Figure 4</label>
<caption>
<title>Motifs shared between RegHsa elements suggest co-regulated genes.</title>
<p>(
<bold>a</bold>
) The NEUROD1/NEUROD2 binding site is recurrently found in multiple RegHsa elements linked to nine genes on the human X chromosome. (
<bold>b</bold>
) AFF2 and IL1RAPL1 share five overrepresented motifs in their linked RegHsa elements. Each motif logo is indicated together with the number of occurrences (occ.) in the set of RegHsa elements. Motif 3 is similar to the binding site of the KLF12 transcription factor. (
<bold>c</bold>
) BCOR and MAGEB10 share four overrepresented motifs in their linked RegHsa elements.</p>
</caption>
<graphic xlink:href="ncomms7904-f4"></graphic>
</fig>
</floats-group>
</pmc>
</record>
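
The linkage-score strategy described in the Figure 1 caption of the record above can be illustrated with a minimal sketch. The caption does not give the weighting functions, so this sketch assumes the simplest reading: a neighbouring orthologue adds the conserved-synteny value R of that genome, and a missing orthologue subtracts a cost equal to the sequencing coverage C. The names `Genome`, `linkage_score` and `predict_targets`, the gene labels and all numeric values are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genome:
    """One non-human genome in which the CNE has an orthologue (hypothetical type)."""
    name: str
    synteny: float         # R: degree of conserved synteny with the human genome
    coverage: float        # C: sequencing coverage, used here as the absence cost
    neighbours: frozenset  # human genes whose orthologues lie near the orthologous CNE


def linkage_score(gene, genomes):
    """Sum the per-genome weights for one candidate human gene.

    ASSUMPTION: presence contributes +R and absence contributes -C. The caption
    only states that presence is weighted by R and that absence costs account for C.
    """
    score = 0.0
    for g in genomes:
        if gene in g.neighbours:
            score += g.synteny
        else:
            score -= g.coverage
    return score


def predict_targets(human_neighbours, genomes):
    """Return (best_genes, scores) for the human genes around one CNE."""
    scores = {gene: linkage_score(gene, genomes) for gene in human_neighbours}
    if not scores:
        return [], {}
    best = max(scores.values())
    # Several genes may tie for the maximum, hence a list.
    return sorted(g for g, s in scores.items() if s == best), scores


# Hypothetical toy data: three genomes and three candidate human genes.
genomes = [
    Genome("genome1", synteny=0.75, coverage=1.0, neighbours=frozenset({"geneA", "geneB"})),
    Genome("genome2", synteny=0.5, coverage=0.5, neighbours=frozenset({"geneA"})),
    Genome("genome3", synteny=0.25, coverage=0.5, neighbours=frozenset({"geneA"})),
]
best, scores = predict_targets(["geneA", "geneB", "geneC"], genomes)
print(best, scores)  # ['geneA'] {'geneA': 1.5, 'geneB': -0.25, 'geneC': -2.0}
```

In this toy case `geneA` stays next to the CNE in every genome and receives the highest score, while `geneC` is never found nearby and is penalised in each genome. A real implementation would need the weighting functions from the article's Methods.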

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Asie/explor/AustralieFrV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 0007389 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 0007389 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Asie
   |area=    AustralieFrV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Tue Dec 5 10:43:12 2017. Site generation: Tue Mar 5 14:07:20 2024