MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Discovery of 33mer in chromosome 21 – the largest alpha satellite higher order repeat unit among all human somatic chromosomes

Internal identifier: 000460 (Pmc/Corpus); previous: 000459; next: 000461

Authors: Matko Glunčić; Ines Vlahović; Vladimir Paar

Source :

RBID : PMC:6718397

Abstract

The centromere is important for segregation of chromosomes during cell division in eukaryotes. Its destabilization results in chromosomal missegregation and aneuploidy, hallmarks of cancers and birth defects. In primate genomes, centromeres contain tandem repeats of ~171 bp alpha satellite DNA, commonly organized into higher order repeats (HORs). In spite of their crucial importance, satellites have been understudied because of gaps in sequencing, the genomic “black holes”. Bioinformatic studies of genomic sequences open possibilities to revolutionize the understanding of repetitive DNA datasets. Here, using the robust Global Repeat Map algorithm, we identified in the hg38 sequence of human chromosome 21 a complete ensemble of alpha satellite HORs with six long repeat units (≥20 mers), five of them novel. The novel 33mer HOR has the longest HOR unit identified so far among all somatic chromosomes, and the novel 23mer reverse HOR lies far from the centromere. We also found that in the hg38 assembly the 33mer sequences in chromosomes 21, 13, 14, and 22 are 100% identical but nearby gaps are present, which seems to call for additional, more precise sequencing. Chromosome 21 is of significant interest for deciphering the molecular basis of Down syndrome and of aneuploidies in general. Since chromosome identifier probes are largely based on the detection of higher order alpha satellite repeats, the distinctions between alpha satellite HORs in chromosomes 21 and 13 identified here might lead to a unique chromosome 21 probe in molecular cytogenetics, which would find utility in diagnostics. It is expected that its complete sequence analysis will have profound implications for understanding the pathogenesis of diseases and the development of new therapeutic approaches.


Url:
DOI: 10.1038/s41598-019-49022-2
PubMed: 31477765
PubMed Central: 6718397

Links to Exploration step

PMC:6718397

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Discovery of 33mer in chromosome 21 – the largest alpha satellite higher order repeat unit among all human somatic chromosomes</title>
<author>
<name sortKey="Glun I, Matko" sort="Glun I, Matko" uniqKey="Glun I M" first="Matko" last="Glun I">Matko Glunčić</name>
<affiliation>
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000 0001 0657 4636</institution-id>
<institution-id institution-id-type="GRID">grid.4808.4</institution-id>
<institution>Faculty of Science, University of Zagreb,</institution>
</institution-wrap>
10000 Zagreb, Croatia</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Vlahovi, Ines" sort="Vlahovi, Ines" uniqKey="Vlahovi I" first="Ines" last="Vlahovi">Ines Vlahović</name>
<affiliation>
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000 0001 0657 4636</institution-id>
<institution-id institution-id-type="GRID">grid.4808.4</institution-id>
<institution>Faculty of Science, University of Zagreb,</institution>
</institution-wrap>
10000 Zagreb, Croatia</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="Aff2">Algebra University College, Ilica 242, 10000 Zagreb, Croatia</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Paar, Vladimir" sort="Paar, Vladimir" uniqKey="Paar V" first="Vladimir" last="Paar">Vladimir Paar</name>
<affiliation>
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000 0001 0657 4636</institution-id>
<institution-id institution-id-type="GRID">grid.4808.4</institution-id>
<institution>Faculty of Science, University of Zagreb,</institution>
</institution-wrap>
10000 Zagreb, Croatia</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="Aff3">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000 0001 0806 5093</institution-id>
<institution-id institution-id-type="GRID">grid.454373.2</institution-id>
<institution>Croatian Academy of Sciences and Arts,</institution>
</institution-wrap>
10000 Zagreb, Croatia</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">31477765</idno>
<idno type="pmc">6718397</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6718397</idno>
<idno type="RBID">PMC:6718397</idno>
<idno type="doi">10.1038/s41598-019-49022-2</idno>
<date when="2019">2019</date>
<idno type="wicri:Area/Pmc/Corpus">000460</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000460</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Discovery of 33mer in chromosome 21 – the largest alpha satellite higher order repeat unit among all human somatic chromosomes</title>
<author>
<name sortKey="Glun I, Matko" sort="Glun I, Matko" uniqKey="Glun I M" first="Matko" last="Glun I">Matko Glunčić</name>
<affiliation>
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000 0001 0657 4636</institution-id>
<institution-id institution-id-type="GRID">grid.4808.4</institution-id>
<institution>Faculty of Science, University of Zagreb,</institution>
</institution-wrap>
10000 Zagreb, Croatia</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Vlahovi, Ines" sort="Vlahovi, Ines" uniqKey="Vlahovi I" first="Ines" last="Vlahovi">Ines Vlahović</name>
<affiliation>
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000 0001 0657 4636</institution-id>
<institution-id institution-id-type="GRID">grid.4808.4</institution-id>
<institution>Faculty of Science, University of Zagreb,</institution>
</institution-wrap>
10000 Zagreb, Croatia</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="Aff2">Algebra University College, Ilica 242, 10000 Zagreb, Croatia</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Paar, Vladimir" sort="Paar, Vladimir" uniqKey="Paar V" first="Vladimir" last="Paar">Vladimir Paar</name>
<affiliation>
<nlm:aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000 0001 0657 4636</institution-id>
<institution-id institution-id-type="GRID">grid.4808.4</institution-id>
<institution>Faculty of Science, University of Zagreb,</institution>
</institution-wrap>
10000 Zagreb, Croatia</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="Aff3">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000 0001 0806 5093</institution-id>
<institution-id institution-id-type="GRID">grid.454373.2</institution-id>
<institution>Croatian Academy of Sciences and Arts,</institution>
</institution-wrap>
10000 Zagreb, Croatia</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Scientific Reports</title>
<idno type="eISSN">2045-2322</idno>
<imprint>
<date when="2019">2019</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p id="Par1">The centromere is important for segregation of chromosomes during cell division in eukaryotes. Its destabilization results in chromosomal missegregation and aneuploidy, hallmarks of cancers and birth defects. In primate genomes, centromeres contain tandem repeats of ~171 bp alpha satellite DNA, commonly organized into higher order repeats (HORs). In spite of their crucial importance, satellites have been understudied because of gaps in sequencing, the genomic “black holes”. Bioinformatic studies of genomic sequences open possibilities to revolutionize the understanding of repetitive DNA datasets. Here, using the robust Global Repeat Map algorithm, we identified in the hg38 sequence of human chromosome 21 a complete ensemble of alpha satellite HORs with six long repeat units (≥20 mers), five of them novel. The novel 33mer HOR has the longest HOR unit identified so far among all somatic chromosomes, and the novel 23mer reverse HOR lies far from the centromere. We also found that in the hg38 assembly the 33mer sequences in chromosomes 21, 13, 14, and 22 are 100% identical but nearby gaps are present, which seems to call for additional, more precise sequencing. Chromosome 21 is of significant interest for deciphering the molecular basis of Down syndrome and of aneuploidies in general. Since chromosome identifier probes are largely based on the detection of higher order alpha satellite repeats, the distinctions between alpha satellite HORs in chromosomes 21 and 13 identified here might lead to a unique chromosome 21 probe in molecular cytogenetics, which would find utility in diagnostics. It is expected that its complete sequence analysis will have profound implications for understanding the pathogenesis of diseases and the development of new therapeutic approaches.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Waye, Js" uniqKey="Waye J">JS Waye</name>
</author>
<author>
<name sortKey="Willard, Hf" uniqKey="Willard H">HF Willard</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Aldrup Macdonald, Me" uniqKey="Aldrup Macdonald M">ME Aldrup-Macdonald</name>
</author>
<author>
<name sortKey="Sullivan, Ba" uniqKey="Sullivan B">BA Sullivan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Garrido Ramos, Manuel" uniqKey="Garrido Ramos M">Manuel Garrido-Ramos</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bersani, F" uniqKey="Bersani F">F Bersani</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, W" uniqKey="Zhang W">W Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Treangen, Tj" uniqKey="Treangen T">TJ Treangen</name>
</author>
<author>
<name sortKey="Salzberg, Sl" uniqKey="Salzberg S">SL Salzberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lower, Ss" uniqKey="Lower S">SS Lower</name>
</author>
<author>
<name sortKey="Mcgurk, Mp" uniqKey="Mcgurk M">MP McGurk</name>
</author>
<author>
<name sortKey="Clark, Ag" uniqKey="Clark A">AG Clark</name>
</author>
<author>
<name sortKey="Barbash, Da" uniqKey="Barbash D">DA Barbash</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Manuelidis, L" uniqKey="Manuelidis L">L Manuelidis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Warburton, Pe" uniqKey="Warburton P">PE Warburton</name>
</author>
<author>
<name sortKey="Willard, Hf" uniqKey="Willard H">HF Willard</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sullivan, Ll" uniqKey="Sullivan L">LL Sullivan</name>
</author>
<author>
<name sortKey="Chew, K" uniqKey="Chew K">K Chew</name>
</author>
<author>
<name sortKey="Sullivan, Ba" uniqKey="Sullivan B">BA Sullivan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Willard, Hf" uniqKey="Willard H">HF Willard</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Alexandrov, I" uniqKey="Alexandrov I">I Alexandrov</name>
</author>
<author>
<name sortKey="Kazakov, A" uniqKey="Kazakov A">A Kazakov</name>
</author>
<author>
<name sortKey="Tumeneva, I" uniqKey="Tumeneva I">I Tumeneva</name>
</author>
<author>
<name sortKey="Shepelev, V" uniqKey="Shepelev V">V Shepelev</name>
</author>
<author>
<name sortKey="Yurov, Y" uniqKey="Yurov Y">Y Yurov</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Vafa, O" uniqKey="Vafa O">O Vafa</name>
</author>
<author>
<name sortKey="Sullivan, Kf" uniqKey="Sullivan K">KF Sullivan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ikeno, M" uniqKey="Ikeno M">M Ikeno</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ando, S" uniqKey="Ando S">S Ando</name>
</author>
<author>
<name sortKey="Yang, H" uniqKey="Yang H">H Yang</name>
</author>
<author>
<name sortKey="Nozaki, N" uniqKey="Nozaki N">N Nozaki</name>
</author>
<author>
<name sortKey="Okazaki, T" uniqKey="Okazaki T">T Okazaki</name>
</author>
<author>
<name sortKey="Yoda, K" uniqKey="Yoda K">K Yoda</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Henikoff, S" uniqKey="Henikoff S">S Henikoff</name>
</author>
<author>
<name sortKey="Malik, Hs" uniqKey="Malik H">HS Malik</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schueler, Mg" uniqKey="Schueler M">MG Schueler</name>
</author>
<author>
<name sortKey="Sullivan, Ba" uniqKey="Sullivan B">BA Sullivan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hayden, Ke" uniqKey="Hayden K">KE Hayden</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Malik, Hs" uniqKey="Malik H">HS Malik</name>
</author>
<author>
<name sortKey="Henikoff, S" uniqKey="Henikoff S">S Henikoff</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rudd, Mk" uniqKey="Rudd M">MK Rudd</name>
</author>
<author>
<name sortKey="Schueler, Mg" uniqKey="Schueler M">MG Schueler</name>
</author>
<author>
<name sortKey="Willard, Hf" uniqKey="Willard H">HF Willard</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcnulty, Sm" uniqKey="Mcnulty S">SM McNulty</name>
</author>
<author>
<name sortKey="Sullivan, Ba" uniqKey="Sullivan B">BA Sullivan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Miga, Kh" uniqKey="Miga K">KH Miga</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Alkan, C" uniqKey="Alkan C">C Alkan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Macas, J" uniqKey="Macas J">J Macas</name>
</author>
<author>
<name sortKey="Neumann, P" uniqKey="Neumann P">P Neumann</name>
</author>
<author>
<name sortKey="Novak, P" uniqKey="Novak P">P Novak</name>
</author>
<author>
<name sortKey="Jiang, J" uniqKey="Jiang J">J Jiang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Dijk, El" uniqKey="Van Dijk E">EL van Dijk</name>
</author>
<author>
<name sortKey="Jaszczyszyn, Y" uniqKey="Jaszczyszyn Y">Y Jaszczyszyn</name>
</author>
<author>
<name sortKey="Naquin, D" uniqKey="Naquin D">D Naquin</name>
</author>
<author>
<name sortKey="Thermes, C" uniqKey="Thermes C">C Thermes</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jain, M" uniqKey="Jain M">M Jain</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sevim, V" uniqKey="Sevim V">V Sevim</name>
</author>
<author>
<name sortKey="Bashir, A" uniqKey="Bashir A">A Bashir</name>
</author>
<author>
<name sortKey="Chin, Cs" uniqKey="Chin C">CS Chin</name>
</author>
<author>
<name sortKey="Miga, Kh" uniqKey="Miga K">KH Miga</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Altschul, Sf" uniqKey="Altschul S">SF Altschul</name>
</author>
<author>
<name sortKey="Gish, W" uniqKey="Gish W">W Gish</name>
</author>
<author>
<name sortKey="Miller, W" uniqKey="Miller W">W Miller</name>
</author>
<author>
<name sortKey="Myers, Ew" uniqKey="Myers E">EW Myers</name>
</author>
<author>
<name sortKey="Lipman, Dj" uniqKey="Lipman D">DJ Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Benson, G" uniqKey="Benson G">G Benson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Thompson, Jd" uniqKey="Thompson J">JD Thompson</name>
</author>
<author>
<name sortKey="Higgins, Dg" uniqKey="Higgins D">DG Higgins</name>
</author>
<author>
<name sortKey="Gibson, Tj" uniqKey="Gibson T">TJ Gibson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sonnhammer, El" uniqKey="Sonnhammer E">EL Sonnhammer</name>
</author>
<author>
<name sortKey="Durbin, R" uniqKey="Durbin R">R Durbin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jurka, J" uniqKey="Jurka J">J Jurka</name>
</author>
<author>
<name sortKey="Klonowski, P" uniqKey="Klonowski P">P Klonowski</name>
</author>
<author>
<name sortKey="Dagman, V" uniqKey="Dagman V">V Dagman</name>
</author>
<author>
<name sortKey="Pelton, P" uniqKey="Pelton P">P Pelton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gluncic, M" uniqKey="Gluncic M">M Gluncic</name>
</author>
<author>
<name sortKey="Paar, V" uniqKey="Paar V">V Paar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Paar, V" uniqKey="Paar V">V Paar</name>
</author>
<author>
<name sortKey="Gluncic, M" uniqKey="Gluncic M">M Gluncic</name>
</author>
<author>
<name sortKey="Rosandic, M" uniqKey="Rosandic M">M Rosandic</name>
</author>
<author>
<name sortKey="Basar, I" uniqKey="Basar I">I Basar</name>
</author>
<author>
<name sortKey="Vlahovic, I" uniqKey="Vlahovic I">I Vlahovic</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Vlahovic, I" uniqKey="Vlahovic I">I Vlahovic</name>
</author>
<author>
<name sortKey="Gluncic, M" uniqKey="Gluncic M">M Gluncic</name>
</author>
<author>
<name sortKey="Rosandic, M" uniqKey="Rosandic M">M Rosandic</name>
</author>
<author>
<name sortKey="Ugarkovic, E" uniqKey="Ugarkovic E">E Ugarkovic</name>
</author>
<author>
<name sortKey="Paar, V" uniqKey="Paar V">V Paar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Paar, V" uniqKey="Paar V">V Paar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ziccardi, W" uniqKey="Ziccardi W">W Ziccardi</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hattori, M" uniqKey="Hattori M">M Hattori</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Choo, Kh" uniqKey="Choo K">KH Choo</name>
</author>
<author>
<name sortKey="Vissel, B" uniqKey="Vissel B">B Vissel</name>
</author>
<author>
<name sortKey="Nagy, A" uniqKey="Nagy A">A Nagy</name>
</author>
<author>
<name sortKey="Earle, E" uniqKey="Earle E">E Earle</name>
</author>
<author>
<name sortKey="Kalitsis, P" uniqKey="Kalitsis P">P Kalitsis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Vissel, B" uniqKey="Vissel B">B Vissel</name>
</author>
<author>
<name sortKey="Choo, Kh" uniqKey="Choo K">KH Choo</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tyler Smith, C" uniqKey="Tyler Smith C">C Tyler-Smith</name>
</author>
<author>
<name sortKey="Brown, Wr" uniqKey="Brown W">WR Brown</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jorgensen, Al" uniqKey="Jorgensen A">AL Jorgensen</name>
</author>
<author>
<name sortKey="Bostock, Cj" uniqKey="Bostock C">CJ Bostock</name>
</author>
<author>
<name sortKey="Bak, Al" uniqKey="Bak A">AL Bak</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Sci Rep</journal-id>
<journal-id journal-id-type="iso-abbrev">Sci Rep</journal-id>
<journal-title-group>
<journal-title>Scientific Reports</journal-title>
</journal-title-group>
<issn pub-type="epub">2045-2322</issn>
<publisher>
<publisher-name>Nature Publishing Group UK</publisher-name>
<publisher-loc>London</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">31477765</article-id>
<article-id pub-id-type="pmc">6718397</article-id>
<article-id pub-id-type="publisher-id">49022</article-id>
<article-id pub-id-type="doi">10.1038/s41598-019-49022-2</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Discovery of 33mer in chromosome 21 – the largest alpha satellite higher order repeat unit among all human somatic chromosomes</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Glunčić</surname>
<given-names>Matko</given-names>
</name>
<address>
<email>matko@phy.hr</email>
</address>
<xref ref-type="aff" rid="Aff1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Vlahović</surname>
<given-names>Ines</given-names>
</name>
<xref ref-type="aff" rid="Aff1">1</xref>
<xref ref-type="aff" rid="Aff2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Paar</surname>
<given-names>Vladimir</given-names>
</name>
<xref ref-type="aff" rid="Aff1">1</xref>
<xref ref-type="aff" rid="Aff3">3</xref>
</contrib>
<aff id="Aff1">
<label>1</label>
<institution-wrap>
<institution-id institution-id-type="ISNI">0000 0001 0657 4636</institution-id>
<institution-id institution-id-type="GRID">grid.4808.4</institution-id>
<institution>Faculty of Science, University of Zagreb,</institution>
</institution-wrap>
10000 Zagreb, Croatia</aff>
<aff id="Aff2">
<label>2</label>
Algebra University College, Ilica 242, 10000 Zagreb, Croatia</aff>
<aff id="Aff3">
<label>3</label>
<institution-wrap>
<institution-id institution-id-type="ISNI">0000 0001 0806 5093</institution-id>
<institution-id institution-id-type="GRID">grid.454373.2</institution-id>
<institution>Croatian Academy of Sciences and Arts,</institution>
</institution-wrap>
10000 Zagreb, Croatia</aff>
</contrib-group>
<pub-date pub-type="epub">
<day>2</day>
<month>9</month>
<year>2019</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>2</day>
<month>9</month>
<year>2019</year>
</pub-date>
<pub-date pub-type="collection">
<year>2019</year>
</pub-date>
<volume>9</volume>
<elocation-id>12629</elocation-id>
<history>
<date date-type="received">
<day>25</day>
<month>1</month>
<year>2019</year>
</date>
<date date-type="accepted">
<day>13</day>
<month>8</month>
<year>2019</year>
</date>
</history>
<permissions>
<copyright-statement>© The Author(s) 2019</copyright-statement>
<license license-type="OpenAccess">
<license-p>
<bold>Open Access</bold>
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
.</license-p>
</license>
</permissions>
<abstract id="Abs1">
<p id="Par1">The centromere is important for segregation of chromosomes during cell division in eukaryotes. Its destabilization results in chromosomal missegregation and aneuploidy, hallmarks of cancers and birth defects. In primate genomes, centromeres contain tandem repeats of ~171 bp alpha satellite DNA, commonly organized into higher order repeats (HORs). In spite of their crucial importance, satellites have been understudied because of gaps in sequencing, the genomic “black holes”. Bioinformatic studies of genomic sequences open possibilities to revolutionize the understanding of repetitive DNA datasets. Here, using the robust Global Repeat Map algorithm, we identified in the hg38 sequence of human chromosome 21 a complete ensemble of alpha satellite HORs with six long repeat units (≥20 mers), five of them novel. The novel 33mer HOR has the longest HOR unit identified so far among all somatic chromosomes, and the novel 23mer reverse HOR lies far from the centromere. We also found that in the hg38 assembly the 33mer sequences in chromosomes 21, 13, 14, and 22 are 100% identical but nearby gaps are present, which seems to call for additional, more precise sequencing. Chromosome 21 is of significant interest for deciphering the molecular basis of Down syndrome and of aneuploidies in general. Since chromosome identifier probes are largely based on the detection of higher order alpha satellite repeats, the distinctions between alpha satellite HORs in chromosomes 21 and 13 identified here might lead to a unique chromosome 21 probe in molecular cytogenetics, which would find utility in diagnostics. It is expected that its complete sequence analysis will have profound implications for understanding the pathogenesis of diseases and the development of new therapeutic approaches.</p>
</abstract>
<kwd-group kwd-group-type="npg-subject">
<title>Subject terms</title>
<kwd>Computational models</kwd>
<kwd>Data processing</kwd>
<kwd>Genome assembly algorithms</kwd>
</kwd-group>
<funding-group>
<award-group>
<funding-source>
<institution-wrap>
<institution-id institution-id-type="FundRef">https://doi.org/10.13039/501100004488</institution-id>
<institution>Hrvatska Zaklada za Znanost (Croatian Science Foundation)</institution>
</institution-wrap>
</funding-source>
<award-id>IP-2014-09-3626</award-id>
<principal-award-recipient>
<name>
<surname>Paar</surname>
<given-names>Vladimir</given-names>
</name>
</principal-award-recipient>
</award-group>
</funding-group>
<custom-meta-group>
<custom-meta>
<meta-name>issue-copyright-statement</meta-name>
<meta-value>© The Author(s) 2019</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body>
<sec id="Sec1" sec-type="introduction">
<title>Introduction</title>
<p id="Par2">Tandemly repeated DNA sequences, known as satellites, form a substantial part of the genome in many eukaryotes, including humans
<sup>
<xref ref-type="bibr" rid="CR1">1</xref>
–
<xref ref-type="bibr" rid="CR3">3</xref>
</sup>
. Some links between satellites and phenotypes have been established; for example, satellite derepression has been associated with cancer outcomes
<sup>
<xref ref-type="bibr" rid="CR4">4</xref>
</sup>
, with chromosome missegregation and aneuploidy
<sup>
<xref ref-type="bibr" rid="CR2">2</xref>
</sup>
, and with aging
<sup>
<xref ref-type="bibr" rid="CR5">5</xref>
</sup>
. Satellite arrays form essential chromosome structures, such as centromeres and telomeres
<sup>
<xref ref-type="bibr" rid="CR3">3</xref>
</sup>
, and they show astonishing variation in both sequence and copy number. However, in spite of their crucial importance, satellites have been understudied
<sup>
<xref ref-type="bibr" rid="CR6">6</xref>
,
<xref ref-type="bibr" rid="CR7">7</xref>
</sup>
.</p>
<p id="Par3">The most abundant constituents of centromeres in human and other primate chromosomes are repetitive but rather divergent alpha satellite monomers of ~171 bp
<sup>
<xref ref-type="bibr" rid="CR8">8</xref>
</sup>
. They are organized mostly as tandem repeats of
<italic>n</italic>
mer higher order repeat (HOR) copies, each consisting of
<italic>n</italic>
monomers, or as monomeric arrays without any HORs (Supplementary Fig. 
<xref rid="MOESM1" ref-type="media">S1</xref>
and Table 
<xref rid="Tab1" ref-type="table">1</xref>
). Divergence among HOR copies within a HOR array is a few percent, while divergence among monomers within each HOR copy is considerably larger, ~20 to 40%
<sup>
<xref ref-type="bibr" rid="CR9">9</xref>
,
<xref ref-type="bibr" rid="CR10">10</xref>
</sup>
.
<table-wrap id="Tab1">
<label>Table 1</label>
<caption>
<p>Alpha satellite
<italic>n</italic>
mer HOR arrays (
<italic>n</italic>
 ≥ 8) in hg38 sequence of human chromosome 21.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>
<italic>n</italic>
</th>
<th>HOR copies</th>
<th>Complete HOR copies</th>
<th>HOR start position</th>
<th>Monomers in HOR</th>
<th>HOR array length (bp)</th>
<th>HOR unit length (bp)</th>
<th>HOR divergence (%)</th>
<th>Monomer divergence (%)</th>
</tr>
</thead>
<tbody>
<tr>
<td>33</td>
<td>4</td>
<td>4</td>
<td>10,864,568</td>
<td>132</td>
<td>22,537</td>
<td>5639</td>
<td>5</td>
<td>19</td>
</tr>
<tr>
<td>23</td>
<td>23</td>
<td>18</td>
<td>10,887,205</td>
<td>517</td>
<td>88,022</td>
<td>3915</td>
<td>4</td>
<td>22</td>
</tr>
<tr>
<td>23*</td>
<td>20</td>
<td>11</td>
<td>7,970,290</td>
<td>440</td>
<td>74,877</td>
<td>3915</td>
<td>2</td>
<td>20</td>
</tr>
<tr>
<td>22</td>
<td>6</td>
<td>5</td>
<td>11,093,195</td>
<td>121</td>
<td>20,672</td>
<td>3758</td>
<td>3</td>
<td>17</td>
</tr>
<tr>
<td>20</td>
<td>16</td>
<td>15</td>
<td>10,975,327</td>
<td>316</td>
<td>54,133</td>
<td>3425</td>
<td>3</td>
<td>23</td>
</tr>
<tr>
<td>20</td>
<td>19</td>
<td>18</td>
<td>11,029,561</td>
<td>371</td>
<td>63,536</td>
<td>3425</td>
<td>4</td>
<td>21</td>
</tr>
<tr>
<td>16</td>
<td>16</td>
<td>2</td>
<td>11,124,080</td>
<td>465</td>
<td>22,561</td>
<td>2733</td>
<td>5</td>
<td>17</td>
</tr>
<tr>
<td>16</td>
<td>4</td>
<td>1</td>
<td>11,113,966</td>
<td>39</td>
<td>6,669</td>
<td>2736</td>
<td>4</td>
<td>17</td>
</tr>
<tr>
<td>11</td>
<td>712</td>
<td>12</td>
<td>12,283,230</td>
<td>3,722</td>
<td>632,586</td>
<td>1870</td>
<td>3</td>
<td>21</td>
</tr>
<tr>
<td>8</td>
<td>4</td>
<td>1</td>
<td>11,120,735</td>
<td>19</td>
<td>3,246</td>
<td>1368</td>
<td>6</td>
<td>17</td>
</tr>
<tr>
<td>8</td>
<td>854</td>
<td>826</td>
<td>11,146,741</td>
<td>6,650</td>
<td>1,134,213</td>
<td>1364</td>
<td>4</td>
<td>22</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>1
<sup>st</sup>
column: Number of monomers in HOR unit. Asterisk (*) denotes reverse monomer sequence in hg38 assembly with respect to the other HORs. 2
<sup>nd</sup>
column: Number of HOR copies in the HOR array. 3
<sup>rd</sup>
column: Number of complete HOR copies in the HOR array. 4
<sup>th</sup>
column: Start position of HOR array in genomic sequence of chromosome 21. 5
<sup>th</sup>
column: Number of monomers in HOR array. 6
<sup>th</sup>
column: Length of HOR array (bp). 7
<sup>th</sup>
column: Length of HOR repeat unit (bp). 8
<sup>th</sup>
column: Mean divergence among HOR copies (%). 9
<sup>th</sup>
column: Mean divergence among monomers within HOR copies (%). Mean divergence is rounded to the nearest integer. The divergence between HOR copies is determined as the mean divergence between pairs of corresponding monomers in the two HOR copies.</p>
</table-wrap-foot>
</table-wrap>
</p>
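<p>The divergence measures described in the footnote of Table 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that monomers are already aligned to equal length and that divergence is simply the percentage of mismatching positions, and the function names and toy sequences are hypothetical.</p>

```python
from typing import Sequence


def pairwise_divergence(a: str, b: str) -> float:
    """Percent of mismatching positions between two pre-aligned sequences.

    Assumption: both sequences are non-empty and have equal length.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * mismatches / len(a)


def hor_copy_divergence(copy_a: Sequence[str], copy_b: Sequence[str]) -> float:
    """Mean divergence over corresponding monomers (1st vs 1st, 2nd vs 2nd, ...)."""
    if len(copy_a) != len(copy_b) or not copy_a:
        raise ValueError("HOR copies must hold the same non-zero number of monomers")
    per_monomer = [pairwise_divergence(m, n) for m, n in zip(copy_a, copy_b)]
    return sum(per_monomer) / len(per_monomer)


def monomer_divergence_within_copy(copy: Sequence[str]) -> float:
    """Mean divergence over all distinct monomer pairs inside one HOR copy."""
    pairs = [(i, j) for i in range(len(copy)) for j in range(i + 1, len(copy))]
    if not pairs:
        raise ValueError("need at least two monomers")
    total = sum(pairwise_divergence(copy[i], copy[j]) for i, j in pairs)
    return total / len(pairs)


# Toy 2mer "HOR copies" built from 10-bp stand-ins for ~171 bp monomers.
copy_1 = ["ACGTACGTAC", "TTGCATGCAA"]
copy_2 = ["ACGTACGTAT", "TTGCATGCAA"]

between_copies = hor_copy_divergence(copy_1, copy_2)
within_copy = monomer_divergence_within_copy(copy_1)
print(between_copies, within_copy)
```

In this toy input the two copies differ at a single position, while the two monomers inside one copy differ at most positions, which mirrors the pattern in Table 1 of low divergence between HOR copies and much higher divergence between monomers within a copy.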
<p id="Par4">Several studies have provided evidence of a functional role for satellite DNA
<sup>
<xref ref-type="bibr" rid="CR2">2</xref>
,
<xref ref-type="bibr" rid="CR11">11</xref>
–
<xref ref-type="bibr" rid="CR15">15</xref>
</sup>
. Satellites are implicated in centromeric functions essential during cell division, such as chromosome segregation in mitosis and meiosis, pairing of homologous chromosomes, sister chromatid attachment and formation of kinetochore structures
<sup>
<xref ref-type="bibr" rid="CR11">11</xref>
,
<xref ref-type="bibr" rid="CR12">12</xref>
</sup>
. However, the interplay between genome sequences and the network involved in kinetochore assembly is poorly understood
<sup>
<xref ref-type="bibr" rid="CR11">11</xref>
,
<xref ref-type="bibr" rid="CR12">12</xref>
,
<xref ref-type="bibr" rid="CR16">16</xref>
–
<xref ref-type="bibr" rid="CR18">18</xref>
</sup>
. While a number of centromeric proteins share homology among evolutionarily distant organisms, a challenging problem is that centromeric DNA sequences differ significantly even among closely related species and evolve rapidly during speciation
<sup>
<xref ref-type="bibr" rid="CR19">19</xref>
</sup>
. Paradoxically, although the centromere’s role is conserved throughout eukaryotic evolution, the sequences that accomplish centromere function in different organisms are not conserved
<sup>
<xref ref-type="bibr" rid="CR16">16</xref>
</sup>
. The functional importance of centromeric sequences notwithstanding, these regions of the human genome remain poorly understood at the level of sequence assembly and annotation
<sup>
<xref ref-type="bibr" rid="CR20">20</xref>
</sup>
. Studying human satellite DNAs and RNAs still poses unique challenges, and recent work points toward technologies that will continue to advance our understanding of this still largely untapped portion of the genome
<sup>
<xref ref-type="bibr" rid="CR21">21</xref>
,
<xref ref-type="bibr" rid="CR22">22</xref>
</sup>
.</p>
<p id="Par5">In the past decade there have been impressive improvements in sequencing technology, with next-generation sequencing and third-generation (long-read) methods, including whole-genome sequencing
<sup>
<xref ref-type="bibr" rid="CR7">7</xref>
,
<xref ref-type="bibr" rid="CR23">23</xref>
–
<xref ref-type="bibr" rid="CR26">26</xref>
</sup>
. Unlike reads shorter than the underlying repeat structure, long reads allow direct inference of satellite higher order repeat (HOR) structure
<sup>
<xref ref-type="bibr" rid="CR27">27</xref>
</sup>
.</p>
<p id="Par6">Several computer algorithms for identification and analysis of tandem repeats in DNA sequences were previously designed and widely used, notably
<sup>
<xref ref-type="bibr" rid="CR28">28</xref>
–
<xref ref-type="bibr" rid="CR32">32</xref>
</sup>
. Two novel computational algorithms, convenient for HOR identification in novel centromeric repeat sequences obtained with long-read sequencing, were recently designed and validated
<sup>
<xref ref-type="bibr" rid="CR6">6</xref>
,
<xref ref-type="bibr" rid="CR7">7</xref>
</sup>
: GRM (Global Repeat Map)
<sup>
<xref ref-type="bibr" rid="CR33">33</xref>
–
<xref ref-type="bibr" rid="CR35">35</xref>
</sup>
and Alpha-CENTAURI
<sup>
<xref ref-type="bibr" rid="CR26">26</xref>
,
<xref ref-type="bibr" rid="CR27">27</xref>
</sup>
. The GRM algorithm is characterized by robustness with respect to deviations from a regular repeat pattern and by applicability to HORs with long repeat units and to long DNA sequences
<sup>
<xref ref-type="bibr" rid="CR33">33</xref>
,
<xref ref-type="bibr" rid="CR34">34</xref>
,
<xref ref-type="bibr" rid="CR36">36</xref>
</sup>
. It is used in this study to identify alpha satellite HORs in the hg38 genomic sequence of human chromosome 21. The advantage of using the new human genome assembly hg38 (GCA_000001405.15) for alpha satellite HOR studies is that it adds a number of alpha satellite sequences to human chromosome 21
<sup>
<xref ref-type="bibr" rid="CR37">37</xref>
,
<xref ref-type="bibr" rid="CR38">38</xref>
</sup>
. A number of human chromosome 21p clones have been added to the new assembly and the centromeric gap was filled with “reference models”, which are representations of alpha satellite HOR domains.</p>
</sec>
<sec id="Sec2" sec-type="results">
<title>Results</title>
<p id="Par7">Here, the recent hg38 DNA sequence of human chromosome 21 is analyzed computationally to identify and study the complete ensemble of alpha satellite HORs embedded in the DNA sequence, using the robust GRM algorithm (Methods). The alpha satellite HOR ideogram obtained in this way for chromosome 21 is shown in Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
. To our knowledge, this is the first time that a complete ensemble of
<italic>n</italic>
 ≥ 8 alpha satellite HORs of a human chromosome was determined for the centromeric region.
<fig id="Fig1">
<label>Figure 1</label>
<caption>
<p>Alpha satellite HOR ideogram for linear positioning of alpha satellite HOR arrays with long repeat units (
<italic>n</italic>
 ≥ 8) obtained by applying the GRM algorithm to the hg38 assembly sequence of human chromosome 21. CEN21 denotes the location of the centromere. Only the segment of chromosome 21 containing alpha satellite HOR arrays is displayed. Ten HOR arrays are located within the centromere. The 23mer HOR with reverse monomers in the long arm of chromosome 21 lies far from the centromere. A more detailed description of the HOR arrays is given in Table 
<xref rid="Tab1" ref-type="table">1</xref>
.</p>
</caption>
<graphic xlink:href="41598_2019_49022_Fig1_HTML" id="d29e828"></graphic>
</fig>
</p>
<p id="Par8">The computed GRM diagram for hg38 DNA sequence of the whole chromosome 21 is shown in Fig. 
<xref rid="Fig2" ref-type="fig">2a</xref>
. Pronounced peaks at fragment lengths that are approximately equal to 171·
<italic>n</italic>
bp, i.e., to multiples of alpha satellite monomer length 171 bp, are candidates for
<italic>n</italic>
mer alpha satellite HORs, especially if a peak at ~171·
<italic>n</italic>
bp is sizably higher than the neighboring peak at lower fragment length ~171·(
<italic>n-1</italic>
) bp. For example, for 8mer HOR, the peak at ~171·8 bp is sizably stronger than the peak at ~171·7 bp; for 11mer HOR, the peak at ~171·11 bp is sizably stronger than at ~171·10 bp; for 16mer HOR, the peak at ~171·16 bp is sizably stronger than at ~171·15 bp; for 20mer HOR the peak at ~171·20 bp is sizably stronger than at ~171·19 bp, etc. We can directly confirm this attribution by analyzing the corresponding DNA sequences. For monomeric alpha satellite arrays the frequencies of peaks at 171·
<italic>n</italic>
bp gradually decrease with increasing
<italic>n</italic>
and a peak sizably above this background is an indication for HOR.
<fig id="Fig2">
<label>Figure 2</label>
<caption>
<p>GRM diagrams for the whole human chromosome 21 and for the contig NT_187321 which contains 33mer HOR array. (
<bold>a</bold>
) GRM diagram for the whole chromosome 21. Pronounced peaks that correspond to alpha satellite HORs are denoted by number of monomers in
<italic>n</italic>
mer HOR repeat unit. Insets give a magnified view of the weak peaks for the 22mer and 33mer, which are largely masked by noise from other repeats in the whole chromosome 21. (
<bold>b</bold>
) GRM diagram for contig NT_187321 in which the 33mer HOR array is located. The pronounced GRM peak at 5,639 bp is a signature of the 33mer HOR (5,639/171 ≈ 33).</p>
</caption>
<graphic xlink:href="41598_2019_49022_Fig2_HTML" id="d29e870"></graphic>
</fig>
</p>
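<p>The peak-based reasoning above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the threshold min_ratio are assumptions, since the text only requires the peak at ~171·n bp to be sizably higher than its lower neighbor at ~171·(n-1) bp.</p>
<preformat>
```python
MONOMER_BP = 171  # approximate alpha satellite monomer length used in the text


def monomers_in_unit(fragment_length_bp):
    """Estimate n for a GRM peak by rounding fragment length / 171 bp."""
    return round(fragment_length_bp / MONOMER_BP)


def is_hor_candidate(peak_heights, n, min_ratio=2.0):
    """peak_heights maps n to the height of the peak near 171*n bp.

    An nmer is flagged when its peak is at least min_ratio times the peak
    of the (n-1)mer. min_ratio is a hypothetical threshold for illustration.
    """
    current = peak_heights.get(n, 0)
    lower = peak_heights.get(n - 1, 0)
    return current > 0 and current >= min_ratio * lower


# The 5,639 bp peak reported for contig NT_187321 maps to a 33mer.
n_33 = monomers_in_unit(5639)
```
</preformat>
<p>Any candidate flagged this way would still need confirmation by inspecting the underlying DNA sequences, as described in the text.</p>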
<p id="Par9">In order to identify the complete ensemble of alpha satellite HORs in a given DNA sequence, we extended the GRM algorithm with a novel algorithm, ALPHAsub (Methods), which identifies the positions of all alpha satellite arrays (regardless of whether they are of HOR or non-HOR type) in the DNA sequence. In this way we determine the contigs in which alpha satellite arrays are located, i.e., each alpha satellite array is assigned to its corresponding contig. Then we apply GRM to each of these contigs. The GRM analysis of contigs containing alpha satellite
<italic>n</italic>
mer HOR provides a sizably more pronounced peak at position 171·
<italic>n</italic>
bp, than the GRM analysis of the whole genome, because the noise due to other repeats is sizably smaller for a contig than for the whole chromosome. In this way it is straightforward to determine whether the alpha satellite array is HOR.</p>
<p id="Par10">The GRM peak corresponding to the 33mer in the GRM diagram for the whole chromosome 21 is small, but visible at 5,639 bp in the magnified segment of the diagram. On the other hand, the 5,639 bp peak of the 33mer HOR is sizeable in the GRM diagram for contig NT_187321.1, in which the 33mer HOR is located (Fig. 
<xref rid="Fig2" ref-type="fig">2b</xref>
). The length of a 33mer HOR copy is ~33 × 0.171 kb ≈ 5.6 kb.</p>
<p id="Par11">Schematic presentation of aligned monomer structure of 33mer HOR array is presented in Fig. 
<xref rid="Fig3" ref-type="fig">3a</xref>
. This HOR array has a very regular structure; all four of its HOR copies are complete. Using GRM, the DNA sequences of the 33mer HOR copies were determined from hg38 for human chromosome 21, and the consensus sequence was derived (Supplementary Table 
<xref rid="MOESM1" ref-type="media">S1</xref>
). The average divergence among monomers within the 33mer consensus HOR is 19%, and divergence between 33mer HOR copies in HOR array is 5%. The 23mer HOR starting at the chromosome position 7,968,750, far from the centromeric region, has reversed monomers with respect to the other ten HORs. This 23mer HOR array consists of 20 HOR copies. Schematic presentation of aligned monomer structure of 23mer HOR (reverse) array is presented in Fig. 
<xref rid="Fig3" ref-type="fig">3b</xref>
.
<fig id="Fig3">
<label>Figure 3</label>
<caption>
<p>Schematic presentation of aligned monomer structure of 33mer HOR and 23mer HOR (reverse) arrays in human chromosome 21. (
<bold>a</bold>
) 33mer (4 complete HOR copies). Top: enumeration of columns corresponding to 33 constituent consensus monomers (Nos. 1 to 33, enumeration of every fifth monomer is displayed). Each of the four 33mer HOR copies (denoted
<italic>h</italic>
<sub>
<italic>i</italic>
</sub>
,
<italic>i</italic>
 = 1, 2, 3, 4) is presented by 33 bars in the
<italic>i</italic>
th row. Each monomer of the same type (from consensus HOR) in different HOR copies is presented by a bar in the same column, corresponding to monomer enumeration at the top. (
<bold>b</bold>
) 23mer with reverse monomers (20 HOR copies). Each of the twenty 23mer HOR copies (denoted
<italic>h</italic>
<sub>
<italic>i</italic>
</sub>
,
<italic>i</italic>
 = 1, 2, … 20) is presented by 23 bars in the
<italic>i</italic>
th row.</p>
</caption>
<graphic xlink:href="41598_2019_49022_Fig3_HTML" id="d29e934"></graphic>
</fig>
</p>
<p id="Par12">Additionally, for each identified alpha satellite array we computed the corresponding dot-matrix diagrams to determine whether it has a HOR structure. The dot-matrix for 33mer HOR in human chromosome 21 is shown in Fig. 
<xref rid="Fig4" ref-type="fig">4a</xref>
. The HOR pattern is characterized by diagonal lines at the spacing of
<italic>n</italic>
 = 33 monomers, parallel to the self-diagonal. We computed dot-matrix diagrams for all high-multiple HORs in chromosome 21. For example, dot-matrix diagrams are shown for 23mer, 23mer (reverse) and 22mer HOR arrays containing 517, 448 and 371 monomers, respectively (Supplementary Fig. 
<xref rid="MOESM1" ref-type="media">S4</xref>
).
<fig id="Fig4">
<label>Figure 4</label>
<caption>
<p>Dot-matrix plots of 33mer HORs in four acrocentric chromosomes. (
<bold>a</bold>
) chromosome 21; (
<bold>b</bold>
) chromosome 13; (
<bold>c</bold>
) chromosome 14; (
<bold>d</bold>
) chromosome 22. Dot-matrix analyses determined the presence or absence of HOR structure using a window size of one monomer length and a mismatch limit of ~7%. Monomers are labeled in order of appearance and displayed in the matrix along the upper horizontal axis (from left to right) and along the left vertical axis (from top to bottom): on both axes label 1 corresponds to the first monomer in the alpha satellite ensemble, label 2 to the second monomer, etc. In this way, the alpha satellite ensemble is compared with itself, giving pairwise comparisons of divergence between the constituent alpha satellite monomers. Each cell in the dot-matrix that represents divergence between monomers located at identical positions in different HOR copies (e.g. the second monomer in the third HOR copy on the horizontal axis and the second monomer in the fourth HOR copy on the vertical axis) corresponds to relatively small divergence between monomers (here chosen below 7%) and is shown as a colored dot. The other cells, which correspond to higher divergence (above 7%), are blank. In this way, for each HOR array the dot-matrix diagram is obtained as a set of equidistant diagonal lines at a spacing equal to the number of monomers in the HOR unit (
<italic>n</italic>
 = 33), parallel to the self-diagonal.</p>
</caption>
<graphic xlink:href="41598_2019_49022_Fig4_HTML" id="d29e969"></graphic>
</fig>
</p>
<p id="Par13">In accordance with GRM diagrams (Fig. 
<xref rid="Fig2" ref-type="fig">2a</xref>
and Supplementary Fig. 
<xref rid="MOESM1" ref-type="media">S3a–c</xref>
) and dot-matrix analysis (Fig. 
<xref rid="Fig4" ref-type="fig">4a–d</xref>
), we find that in the hg38 assembly the 33mer HORs in chromosomes 21, 13, 14 and 22 are 100% identical, and that there are gaps in hg38 in the neighborhood of the 33mers. More complete sequencing of this region seems to be required.</p>
</sec>
<sec id="Sec3" sec-type="discussion">
<title>Discussion</title>
<p id="Par14">The DNA sequence of the whole human chromosome 21 was previously used in studies of its gene catalogue
<sup>
<xref ref-type="bibr" rid="CR39">39</xref>
</sup>
. The earlier restriction map estimates
<sup>
<xref ref-type="bibr" rid="CR40">40</xref>
,
<xref ref-type="bibr" rid="CR41">41</xref>
</sup>
for HORs with long repeat units largely differ from the present GRM results in Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
. For HOR repeat units of 20 monomers or more we identified here two 20mers, one 22mer, two 23mers and one 33mer, while earlier restriction map estimates in this range indicated one 22mer and one 28mer
<sup>
<xref ref-type="bibr" rid="CR40">40</xref>
,
<xref ref-type="bibr" rid="CR41">41</xref>
</sup>
. For
<italic>n</italic>
mers below
<italic>n</italic>
 = 20 monomers, the previously predicted 11mer and 16mer
<sup>
<xref ref-type="bibr" rid="CR40">40</xref>
,
<xref ref-type="bibr" rid="CR41">41</xref>
</sup>
are in accordance with present results, but for other
<italic>n</italic>
mers previous estimates show some differences.</p>
<p id="Par15">Extensive earlier work on alpha satellite sequences and satellite DNA in regions near the centromere of chromosome 21 was summarized in a monograph
<sup>
<xref ref-type="bibr" rid="CR42">42</xref>
</sup>
. A recent study of alphoid sequences revealed five distinct alphoid clusters, Mp1–Mp5, that are large in size (25–189 kb each) and extend over a distance of at least 5 Mb from D21Z1.</p>
<p id="Par16">Using a different method, alpha satellite arrays in hg38 sequence of chromosome 21 were recently studied
<sup>
<xref ref-type="bibr" rid="CR37">37</xref>
</sup>
and the reported positions of alpha satellite arrays in the centromere correspond to our GRM results, although there are also some pronounced differences. In ref.
<sup>
<xref ref-type="bibr" rid="CR37">37</xref>
</sup>
only four alpha satellite HORs were identified, out of the eleven identified here using GRM; among others, the 33mer HOR was not identified. Also, the pronounced reverse alpha satellite array far outside the centromeric region, which was identified here using GRM, was not found in ref.
<sup>
<xref ref-type="bibr" rid="CR37">37</xref>
</sup>
.</p>
<p id="Par17">Three of our eleven HOR arrays, the 11mer, one of the two 23mers and the 22mer, can be attributed as follows. The 11mer HOR corresponds to the well-known D21Z1 sequence, one of our 23mer HORs to D21Z3 (pTRA-1) with a 3.9-kb repeat unit positioned in the Mp3 cluster, and our 22mer to the SF5 HOR of ~3.7 kb (GJ212128), a part of which is almost identical to the pTRA-7 probe corresponding to D21Z7.</p>
<p id="Par18">The 33mer HOR identified here for chromosome 21 is the longest alpha satellite HOR repeat unit reported so far among all 22 somatic chromosomes. Evidence for a slightly longer HOR repeat unit, a 35mer, was found previously only in the sex chromosome Y
<sup>
<xref ref-type="bibr" rid="CR43">43</xref>
</sup>
.</p>
<p id="Par19">Previous analysis of hybridization of alpha satellite DNA showed that chromosomes 13 and 21 share a 4mer HOR array and that the HOR unit is indistinguishable on each chromosome, while the 6mer HOR array is shared by chromosomes 22 and 13, distinguishable from the 13/21 4mer HOR array
<sup>
<xref ref-type="bibr" rid="CR44">44</xref>
</sup>
. These results suggested that, at some point after they originated and were homogenized, different subfamilies of alphoid sequences must have exchanged between chromosomes 13 and 21 and separately between chromosomes 13 and 22. Following homogenization of one chromosome, a portion of HOR array could have been transferred between nonhomologous chromosomes by recombination. If this recombination event had happened recently in evolutionary time, any independent homogenization process operating on the two blocks of alpha satellite DNA, now separated on different chromosomes, may not have had sufficient time to cause isolated sequences to diverge
<sup>
<xref ref-type="bibr" rid="CR44">44</xref>
</sup>
.</p>
<p id="Par20">Chromosome identifier probes are largely based on detection of higher order alpha satellite repeats and it has been a constant thorn in the side of molecular cytogeneticists that the probe for chromosome 21 also picked up chromosome 13 (and vice versa). In this respect one could comment on the likelihood of recent work leading to a unique chromosome 21 probe, which would find great utility in diagnostics. To this end, in Table 
<xref rid="Tab2" ref-type="table">2</xref>
we present the alpha satellite
<italic>n</italic>
mer HOR arrays (
<italic>n</italic>
 ≥ 8) identified here in hg38 sequence of human chromosome 13 for comparison with Table 
<xref rid="Tab1" ref-type="table">1</xref>
for chromosome 21. Here we find some pronounced differences between alpha satellite HORs in chromosomes 21 and 13. In chromosome 21 we identify four complete 33mer HOR copies and in chromosome 13 three (in one HOR copy two monomers are deleted). A significant difference exists for 23mer HORs: we identify two distinct 23mer HOR arrays in human chromosome 21 (constituted of 23 and 20 HOR copies, respectively), while in chromosome 13 we identify only one HOR array (constituted of 23 HOR copies). This provides a pronounced distinction between chromosomes 21 and 13. We also note a sizable difference between 16mer HORs: we identify two distinct 16mer HOR arrays in human chromosome 21 (constituted of 16 and 4 HOR copies, respectively), while in chromosome 13 we identify only one HOR array (constituted of 3 HOR copies). Finally, we note a difference between lengths of long 8mer HOR arrays in chromosomes 21 and 13 (854 and 832 HOR copies, respectively).
<table-wrap id="Tab2">
<label>Table 2</label>
<caption>
<p>Alpha satellite
<italic>n</italic>
mer HOR arrays (
<italic>n</italic>
 ≥ 8) in hg38 sequence of human chromosome 13.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>
<italic>n</italic>
</th>
<th>HOR copies</th>
<th>Complete HOR copies</th>
<th>HOR
<break></break>
start position</th>
<th>Monomers
<break></break>
in HOR</th>
<th>HOR array length (bp)</th>
<th>HOR unit length (bp)</th>
<th>HOR
<break></break>
divergence (%)</th>
<th>Monomer divergence (%)</th>
</tr>
</thead>
<tbody>
<tr>
<td>33</td>
<td>4</td>
<td>3</td>
<td>16,000,179</td>
<td>131</td>
<td>22,366</td>
<td>5639</td>
<td>5</td>
<td>20</td>
</tr>
<tr>
<td>23</td>
<td>23</td>
<td>15</td>
<td>16,022,645</td>
<td>515</td>
<td>88,022</td>
<td>3915</td>
<td>4</td>
<td>21</td>
</tr>
<tr>
<td>22</td>
<td>6</td>
<td>5</td>
<td>
<italic>16,228,806</italic>
</td>
<td>120</td>
<td>20,672</td>
<td>3758</td>
<td>3</td>
<td>17</td>
</tr>
<tr>
<td>20</td>
<td>16</td>
<td>15</td>
<td>
<italic>16,110,767</italic>
</td>
<td>316</td>
<td>54,133</td>
<td>3426</td>
<td>3</td>
<td>24</td>
</tr>
<tr>
<td>20</td>
<td>19</td>
<td>18</td>
<td>
<italic>16,165,001</italic>
</td>
<td>371</td>
<td>63,536</td>
<td>3425</td>
<td>4</td>
<td>22</td>
</tr>
<tr>
<td>16</td>
<td>3</td>
<td>2</td>
<td>
<italic>16,249,406</italic>
</td>
<td>39</td>
<td>6,670</td>
<td>2736</td>
<td>5</td>
<td>17</td>
</tr>
<tr>
<td>11</td>
<td>363</td>
<td>229</td>
<td>
<italic>17,418,670</italic>
</td>
<td>3,721</td>
<td>632,415</td>
<td>1871</td>
<td>3</td>
<td>23</td>
</tr>
<tr>
<td>8</td>
<td>3</td>
<td>2</td>
<td>
<italic>16,256,175</italic>
</td>
<td>20</td>
<td>3,246</td>
<td>1368</td>
<td>6</td>
<td>17</td>
</tr>
<tr>
<td>8</td>
<td>832</td>
<td>818</td>
<td>
<italic>16,282,181</italic>
</td>
<td>6,646</td>
<td>1,134,112</td>
<td>1365</td>
<td>4</td>
<td>23</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>1
<sup>st</sup>
column: Number of monomers in HOR unit. 2
<sup>nd</sup>
column: Number of HOR copies in the HOR array. 3
<sup>rd</sup>
column: Number of complete HOR copies in the HOR array. 4
<sup>th</sup>
column: Start position of HOR array in genomic sequence of chromosome 13. 5
<sup>th</sup>
column: Number of monomers in HOR array. 6
<sup>th</sup>
column: Length of HOR array (bp). 7
<sup>th</sup>
column: Length of HOR repeat unit (bp). 8
<sup>th</sup>
column: Mean divergence among HOR copies (%). 9
<sup>th</sup>
column: Mean divergence among monomers within HOR copies (%). Mean divergence is rounded to the nearest integer. The divergence among HOR copies is determined as the mean value of divergence between pairs of corresponding monomers in the two HOR copies.</p>
</table-wrap-foot>
</table-wrap>
</p>
<p id="Par21">Analogously, it could be hypothesized here that at some point after it originated and was homogenized on one of the chromosomes 21, 13, 14 or 22, the 33mer HOR could have been transferred recently in evolutionary time by recombination to the other chromosomes.</p>
<p id="Par22">Almost complete genomic sequences have recently opened the possibility of determining the complete ensemble of alpha satellite HORs in the whole human genome, which in turn enables broad investigation of alpha satellite HORs and monomeric repeats and their influence on centromere dynamics. As noted by van Dijk
<italic>et al</italic>
.
<sup>
<xref ref-type="bibr" rid="CR25">25</xref>
</sup>
, ultralong reads may allow complete, gapless assembly of human genomes in the near future, which will further boost human genetic research and personalized medicine
<sup>
<xref ref-type="bibr" rid="CR25">25</xref>
</sup>
. Among others, complete DNA sequences will open new challenges related to HORs as possibly important regulatory elements. One could hypothesize that the richness of different long HOR repeat units will be found in centromere of other chromosomes too. We are only at the beginning of the third revolution in sequencing technology and the coming years may bring exciting new developments in HOR studies.</p>
</sec>
<sec id="Sec4">
<title>Methods</title>
<sec id="Sec5">
<title>Sequence data</title>
<p id="Par23">In this study the hg38 assembly sequence of chromosome 21 was used for HOR analysis.</p>
</sec>
<sec id="Sec6">
<title>ALPHAsub algorithm</title>
<p id="Par24">The novel method of identifying alpha satellite arrays in a DNA sequence is as follows. As the “ideal key word”, we use a robust 28-bp segment from alpha satellite DNA sequences, TGAGAAACTGCTTTGTGATGTGTGCATT, and its reverse complement. First, using the Levenshtein distance algorithm, all positions in the whole chromosome are determined where the 28-bp sequence of the “ideal key word” or its reverse complement differs from a “real key word” by at most nine nucleotides. Second, the distances between positions of neighboring “real key words” are calculated. Third, only those “real key words” are retained for which the distance to the previous neighbor is approximately equal to 171 bp or to a multiple of 171 bp (d(
<italic>n</italic>
,
<italic>n-1</italic>
) ~ 
<italic>m</italic>
·171;
<italic>m</italic>
 = 1, 2,…). In the latter case (
<italic>m</italic>
 > 1), the additional “real key words” (one for
<italic>m</italic>
 = 2, two for
<italic>m</italic>
 = 3, and so on) are determined in the sequence between “real key word” and its previous neighbor, using the Levenshtein distance algorithm, at positions with the smallest difference of “real key words” compared to “ideal key word” or its reverse complement. In general, a distance between the additional “real key words”, obtained by this method, is always approximately equal to 171 bp. In this way, we determined positions of all alpha satellites within chromosome 21. In the next step, using positions of “real key words”, all alpha satellites from chromosome 21 hg38 DNA sequence are extracted and different alpha satellite ensembles are identified. On this basis, we have designed our ALPHAsub algorithm and computer program. Applying ALPHAsub program to the hg38 sequence of chromosome 21 we determine location of all alpha satellite arrays within genomic sequence.</p>
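<p>The first three steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ALPHAsub program itself: fixed 28-bp windows are compared with the key word, overlapping hits are collapsed to the best one, the spacing is checked against the last retained hit, and the tolerance tol is a hypothetical parameter. The recovery of additional “real key words” for m greater than 1 is not covered.</p>
<preformat>
```python
KEY_WORD = "TGAGAAACTGCTTTGTGATGTGTGCATT"  # 28-bp "ideal key word" from the text
MONOMER_BP = 171


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def find_real_key_words(seq, key=KEY_WORD, max_dist=9):
    """Start positions of windows within max_dist edits of the key word or its
    reverse complement. Hits closer than one key length are collapsed, keeping
    the window with the smallest distance (an assumption of this sketch)."""
    k = len(key)
    rc = reverse_complement(key)
    hits = []  # list of (position, distance)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        d = min(levenshtein(window, key), levenshtein(window, rc))
        if d > max_dist:
            continue
        if hits and k > i - hits[-1][0]:
            if hits[-1][1] > d:
                hits[-1] = (i, d)
        else:
            hits.append((i, d))
    return [pos for pos, _ in hits]


def keep_periodic(positions, tol=5):
    """Retain hits whose gap to the last retained hit is close to m * 171 bp.
    The first hit is kept as a seed; tol (bp) is a hypothetical tolerance."""
    kept = []
    for pos in positions:
        if not kept:
            kept.append(pos)
            continue
        gap = pos - kept[-1]
        m = round(gap / MONOMER_BP)
        if m >= 1 and tol >= abs(gap - m * MONOMER_BP):
            kept.append(pos)
    return kept


# Synthetic array: five 171-bp "monomers", each starting with the key word.
toy_seq = (KEY_WORD + "A" * 143) * 5
toy_hits = find_real_key_words(toy_seq)
```
</preformat>
<p>The window scan is quadratic per window and is meant only to make the logic explicit; a whole-chromosome run would need a faster approximate matcher.</p>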
</sec>
<sec id="Sec7">
<title>GRM algorithm</title>
<p id="Par25">The Global Repeat Map (GRM) algorithm is an efficient and robust novel method to identify and study repeats, especially HORs, in a given DNA sequence
<sup>
<xref ref-type="bibr" rid="CR33">33</xref>
–
<xref ref-type="bibr" rid="CR35">35</xref>
</sup>
. To identify alpha satellite HORs, we compute GRM diagrams for genomic sequence of the whole chromosome. For long DNA sequences of whole chromosomes, due to many repeats in genomic sequence the noise in GRM diagram increases with increasing length of HOR repeat unit. This noise is significantly reduced by applying GRM to those regions which contain alpha satellite arrays. Such regions are first selected using ALPHAsub algorithm for analysis of the whole chromosome sequence. The novelty of GRM approach is a direct mapping of symbolic DNA sequence into frequency domain using complete
<italic>K</italic>
-string ensemble instead of statistically adjusted individual
<italic>K</italic>
-strings optimized locally. In this way, GRM provides a straightforward identification of DNA repeats using frequency domain, but avoiding mapping of symbolic DNA sequence into numerical sequence, and uses
<italic>K</italic>
-string matching, but avoiding statistical methods and locally optimizing individual
<italic>K</italic>
-strings. For a given sequence, the GRM algorithm provides in the first step (“identification step”) the corresponding GRM diagram; each significant peak (“fragment length”) presents the length of repeat unit. In the second step (“analysis step”), for each significant GRM peak the algorithm determines corresponding repeat sequences and their positions, the consensus repeat unit and divergence between repeat copies and with respect to consensus.</p>
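<p>The full GRM procedure is specified in the cited references and is not reproduced here. As a loose illustration of how K-string matching can yield a fragment-length diagram, the sketch below counts the distances between consecutive occurrences of every K-string present in a sequence. The function name, the choice K = 8 and the toy sequence are assumptions made for this example.</p>
<preformat>
```python
from collections import Counter


def fragment_length_spectrum(seq, k=8):
    """Histogram of distances between consecutive occurrences of each K-string.

    A tandem repeat with unit length L contributes a peak at L, because each
    K-string of the unit recurs L bases downstream. Simplified illustration
    only; it is not the published GRM implementation.
    """
    last_seen = {}
    spectrum = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            spectrum[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    return spectrum


# Toy tandem array: six exact copies of a 30-bp unit.
toy_unit = "ACGTTGCAAGCTTAGGCATCGATCCGTATG"
toy_spectrum = fragment_length_spectrum(toy_unit * 6)
peak_length, peak_count = toy_spectrum.most_common(1)[0]
```
</preformat>
<p>Restricting such a computation to contigs that contain alpha satellite arrays, as described above, reduces the background contributed by unrelated repeats elsewhere in the chromosome.</p>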
</sec>
<sec id="Sec8">
<title>GRM algorithm expanded by ALPHAsub algorithm</title>
<p id="Par26">Successive application of ALPHAsub and GRM algorithms is used for identification and analysis of alpha satellite HORs in a whole chromosome sequence: in the first step we identify contigs that contain alpha satellite arrays and in the second step we perform GRM computation for these contigs. In this way, an ensemble of all alpha satellite HORs is extracted from a given genomic sequence (here hg38).</p>
</sec>
<sec id="Sec9">
<title>ALPHAsub algorithm expanded by Dot-matrix method</title>
<p id="Par27">For each alpha satellite array identified by ALPHAsub algorithm, the corresponding dot-matrix diagrams are created to identify alpha satellite HORs on the basis of equidistant lines parallel to self-diagonal.</p>
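<p>A dot-matrix test of this kind can be sketched as follows. The divergence measure (edit distance divided by the longer monomer length), the 7% limit taken from the Fig. 4 caption, and the fill threshold min_fill are assumptions of this example, and the function names are hypothetical.</p>
<preformat>
```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def divergence(a, b):
    """Edit distance normalized by the longer monomer (assumed measure)."""
    return levenshtein(a, b) / max(len(a), len(b))


def dot_matrix(monomers, max_div=0.07):
    """Boolean matrix: True where two monomers diverge by at most max_div."""
    n = len(monomers)
    return [[max_div >= divergence(monomers[i], monomers[j]) for j in range(n)]
            for i in range(n)]


def hor_period(matrix, min_fill=0.9):
    """Smallest offset k whose diagonal (i, i + k) is mostly filled, else None.

    A result of 1 means neighboring monomers are already similar, i.e. a
    monomeric array rather than a HOR.
    """
    n = len(matrix)
    for k in range(1, n):
        cells = [matrix[i][i + k] for i in range(n - k)]
        if sum(cells) >= min_fill * len(cells):
            return k
    return None


# Toy array: four exact copies of a 3-monomer unit (20-bp toy monomers).
toy_monomers = [
    "ACGTACGTACGTACGTACGT",
    "TTTTTGGGGGCCCCCAAAAA",
    "GATTACAGATTACAGATTAC",
] * 4
toy_matrix = dot_matrix(toy_monomers)
```
</preformat>
<p>For the toy array the filled diagonals appear at offsets 3, 6 and 9, the analogue of the equidistant lines at a spacing of 33 monomers described for the 33mer HOR.</p>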
</sec>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary information</title>
<sec id="Sec10">
<p>
<supplementary-material content-type="local-data" id="MOESM1">
<media xlink:href="41598_2019_49022_MOESM1_ESM.pdf">
<caption>
<p>Supplementary Tables and Figures</p>
</caption>
</media>
</supplementary-material>
</p>
</sec>
</sec>
</body>
<back>
<fn-group>
<fn>
<p>
<bold>Publisher’s note:</bold>
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.</p>
</fn>
</fn-group>
<sec>
<title>Supplementary information</title>
<p>
<bold>Supplementary information</bold>
accompanies this paper at 10.1038/s41598-019-49022-2.</p>
</sec>
<ack>
<title>Acknowledgements</title>
<p>We thank C. Tyler-Smith for stimulating our interest for alpha satellites. We also acknowledge support from the QuantiXLie Centre of Excellence, a project cofinanced by the Croatian Government and European Union through the European Regional Development Fund - the Competitiveness and Cohesion Operational Programme (Grant KK.01.1.1.01.0004), and the grant IP-2014-09-3626 from Croatian Science Foundation.</p>
</ack>
<notes notes-type="author-contribution">
<title>Author Contributions</title>
<p>I.V. and M.G. performed the computations. M.G. wrote computational algorithm ALPHAsub. V.P. supervised the study. All authors analyzed computational results. V.P. and M.G. wrote the manuscript. All authors read and approved the final version of the manuscript.</p>
</notes>
<notes notes-type="data-availability">
<title>Code Availability</title>
<p>Further code information is available on request from the authors.</p>
</notes>
<notes notes-type="COI-statement">
<title>Competing Interests</title>
<p id="Par28">The authors declare no competing interests.</p>
</notes>
<ref-list id="Bib1">
<title>References</title>
<ref id="CR1">
<label>1.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Waye</surname>
<given-names>JS</given-names>
</name>
<name>
<surname>Willard</surname>
<given-names>HF</given-names>
</name>
</person-group>
<article-title>Nucleotide sequence heterogeneity of alpha satellite repetitive DNA: a survey of alphoid sequences from different human chromosomes</article-title>
<source>Nucleic Acids Res</source>
<year>1987</year>
<volume>15</volume>
<fpage>7549</fpage>
<lpage>69</lpage>
<pub-id pub-id-type="doi">10.1093/nar/15.18.7549</pub-id>
<pub-id pub-id-type="pmid">3658703</pub-id>
</element-citation>
</ref>
<ref id="CR2">
<label>2.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aldrup-Macdonald</surname>
<given-names>ME</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>BA</given-names>
</name>
</person-group>
<article-title>The past, present, and future of human centromere genomics</article-title>
<source>Genes (Basel)</source>
<year>2014</year>
<volume>5</volume>
<fpage>33</fpage>
<lpage>50</lpage>
<pub-id pub-id-type="doi">10.3390/genes5010033</pub-id>
<pub-id pub-id-type="pmid">24683489</pub-id>
</element-citation>
</ref>
<ref id="CR3">
<label>3.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Garrido-Ramos</surname>
<given-names>Manuel</given-names>
</name>
</person-group>
<article-title>Satellite DNA: An Evolving Topic</article-title>
<source>Genes</source>
<year>2017</year>
<volume>8</volume>
<issue>9</issue>
<fpage>230</fpage>
<pub-id pub-id-type="doi">10.3390/genes8090230</pub-id>
</element-citation>
</ref>
<ref id="CR4">
<label>4.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bersani</surname>
<given-names>F</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Pericentromeric satellite repeat expansions through RNA-derived DNA intermediates in cancer</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2015</year>
<volume>112</volume>
<fpage>15148</fpage>
<lpage>53</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.1518008112</pub-id>
<pub-id pub-id-type="pmid">26575630</pub-id>
</element-citation>
</ref>
<ref id="CR5">
<label>5.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>W</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Aging stem cells. A Werner syndrome stem cell model unveils heterochromatin alterations as a driver of human aging</article-title>
<source>Science</source>
<year>2015</year>
<volume>348</volume>
<fpage>1160</fpage>
<lpage>3</lpage>
<pub-id pub-id-type="doi">10.1126/science.aaa1356</pub-id>
<pub-id pub-id-type="pmid">25931448</pub-id>
</element-citation>
</ref>
<ref id="CR6">
<label>6.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Treangen</surname>
<given-names>TJ</given-names>
</name>
<name>
<surname>Salzberg</surname>
<given-names>SL</given-names>
</name>
</person-group>
<article-title>Repetitive DNA and next-generation sequencing: computational challenges and solutions</article-title>
<source>Nat Rev Genet</source>
<year>2011</year>
<volume>13</volume>
<fpage>36</fpage>
<lpage>46</lpage>
<pub-id pub-id-type="doi">10.1038/nrg3117</pub-id>
<pub-id pub-id-type="pmid">22124482</pub-id>
</element-citation>
</ref>
<ref id="CR7">
<label>7.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lower</surname>
<given-names>SS</given-names>
</name>
<name>
<surname>McGurk</surname>
<given-names>MP</given-names>
</name>
<name>
<surname>Clark</surname>
<given-names>AG</given-names>
</name>
<name>
<surname>Barbash</surname>
<given-names>DA</given-names>
</name>
</person-group>
<article-title>Satellite DNA evolution: old ideas, new approaches</article-title>
<source>Curr Opin Genet Dev</source>
<year>2018</year>
<volume>49</volume>
<fpage>70</fpage>
<lpage>78</lpage>
<pub-id pub-id-type="doi">10.1016/j.gde.2018.03.003</pub-id>
<pub-id pub-id-type="pmid">29579574</pub-id>
</element-citation>
</ref>
<ref id="CR8">
<label>8.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Manuelidis</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Chromosomal localization of complex and simple repeated human DNAs</article-title>
<source>Chromosoma</source>
<year>1978</year>
<volume>66</volume>
<fpage>23</fpage>
<lpage>32</lpage>
<pub-id pub-id-type="doi">10.1007/BF00285813</pub-id>
<pub-id pub-id-type="pmid">639625</pub-id>
</element-citation>
</ref>
<ref id="CR9">
<label>9.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Warburton</surname>
<given-names>PE</given-names>
</name>
<name>
<surname>Willard</surname>
<given-names>HF</given-names>
</name>
</person-group>
<article-title>Genomic analysis of sequence variation in tandemly repeated DNA. Evidence for localized homogeneous sequence domains within arrays of alpha-satellite DNA</article-title>
<source>J Mol Biol</source>
<year>1990</year>
<volume>216</volume>
<fpage>3</fpage>
<lpage>16</lpage>
<pub-id pub-id-type="doi">10.1016/S0022-2836(05)80056-7</pub-id>
<pub-id pub-id-type="pmid">2122000</pub-id>
</element-citation>
</ref>
<ref id="CR10">
<label>10.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sullivan</surname>
<given-names>LL</given-names>
</name>
<name>
<surname>Chew</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>BA</given-names>
</name>
</person-group>
<article-title>alpha satellite DNA variation and function of the human centromere</article-title>
<source>Nucleus</source>
<year>2017</year>
<volume>8</volume>
<fpage>331</fpage>
<lpage>339</lpage>
<pub-id pub-id-type="doi">10.1080/19491034.2017.1308989</pub-id>
<pub-id pub-id-type="pmid">28406740</pub-id>
</element-citation>
</ref>
<ref id="CR11">
<label>11.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Willard</surname>
<given-names>HF</given-names>
</name>
</person-group>
<article-title>Centromeres: the missing link in the development of human artificial chromosomes</article-title>
<source>Curr Opin Genet Dev</source>
<year>1998</year>
<volume>8</volume>
<fpage>219</fpage>
<lpage>25</lpage>
<pub-id pub-id-type="doi">10.1016/S0959-437X(98)80144-5</pub-id>
<pub-id pub-id-type="pmid">9610413</pub-id>
</element-citation>
</ref>
<ref id="CR12">
<label>12.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Alexandrov</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Kazakov</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Tumeneva</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Shepelev</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Yurov</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>Alpha-satellite DNA of primates: old and new families</article-title>
<source>Chromosoma</source>
<year>2001</year>
<volume>110</volume>
<fpage>253</fpage>
<lpage>66</lpage>
<pub-id pub-id-type="doi">10.1007/s004120100146</pub-id>
<pub-id pub-id-type="pmid">11534817</pub-id>
</element-citation>
</ref>
<ref id="CR13">
<label>13.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vafa</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>KF</given-names>
</name>
</person-group>
<article-title>Chromatin containing CENP-A and alpha-satellite DNA is a major component of the inner kinetochore plate</article-title>
<source>Curr Biol</source>
<year>1997</year>
<volume>7</volume>
<fpage>897</fpage>
<lpage>900</lpage>
<pub-id pub-id-type="doi">10.1016/S0960-9822(06)00381-2</pub-id>
<pub-id pub-id-type="pmid">9382804</pub-id>
</element-citation>
</ref>
<ref id="CR14">
<label>14.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ikeno</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Construction of YAC-based mammalian artificial chromosomes</article-title>
<source>Nat Biotechnol</source>
<year>1998</year>
<volume>16</volume>
<fpage>431</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="doi">10.1038/nbt0598-431</pub-id>
<pub-id pub-id-type="pmid">9592390</pub-id>
</element-citation>
</ref>
<ref id="CR15">
<label>15.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ando</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Nozaki</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Okazaki</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Yoda</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>CENP-A, -B, and -C chromatin complex that contains the I-type alpha-satellite array constitutes the prekinetochore in HeLa cells</article-title>
<source>Mol Cell Biol</source>
<year>2002</year>
<volume>22</volume>
<fpage>2229</fpage>
<lpage>2241</lpage>
<pub-id pub-id-type="doi">10.1128/MCB.22.7.2229-2241.2002</pub-id>
<pub-id pub-id-type="pmid">11884609</pub-id>
</element-citation>
</ref>
<ref id="CR16">
<label>16.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Henikoff</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Malik</surname>
<given-names>HS</given-names>
</name>
</person-group>
<article-title>Centromeres: selfish drivers</article-title>
<source>Nature</source>
<year>2002</year>
<volume>417</volume>
<fpage>227</fpage>
<pub-id pub-id-type="doi">10.1038/417227a</pub-id>
<pub-id pub-id-type="pmid">12015578</pub-id>
</element-citation>
</ref>
<ref id="CR17">
<label>17.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schueler</surname>
<given-names>MG</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>BA</given-names>
</name>
</person-group>
<article-title>Structural and functional dynamics of human centromeric chromatin</article-title>
<source>Annu Rev Genomics Hum Genet</source>
<year>2006</year>
<volume>7</volume>
<fpage>301</fpage>
<lpage>13</lpage>
<pub-id pub-id-type="doi">10.1146/annurev.genom.7.080505.115613</pub-id>
<pub-id pub-id-type="pmid">16756479</pub-id>
</element-citation>
</ref>
<ref id="CR18">
<label>18.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hayden</surname>
<given-names>KE</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Sequences associated with centromere competency in the human genome</article-title>
<source>Mol Cell Biol</source>
<year>2013</year>
<volume>33</volume>
<fpage>763</fpage>
<lpage>72</lpage>
<pub-id pub-id-type="doi">10.1128/MCB.01198-12</pub-id>
<pub-id pub-id-type="pmid">23230266</pub-id>
</element-citation>
</ref>
<ref id="CR19">
<label>19.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Malik</surname>
<given-names>HS</given-names>
</name>
<name>
<surname>Henikoff</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Conflict begets complexity: the evolution of centromeres</article-title>
<source>Curr Opin Genet Dev</source>
<year>2002</year>
<volume>12</volume>
<fpage>711</fpage>
<lpage>8</lpage>
<pub-id pub-id-type="doi">10.1016/S0959-437X(02)00351-9</pub-id>
<pub-id pub-id-type="pmid">12433586</pub-id>
</element-citation>
</ref>
<ref id="CR20">
<label>20.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rudd</surname>
<given-names>MK</given-names>
</name>
<name>
<surname>Schueler</surname>
<given-names>MG</given-names>
</name>
<name>
<surname>Willard</surname>
<given-names>HF</given-names>
</name>
</person-group>
<article-title>Sequence organization and functional annotation of human centromeres</article-title>
<source>Cold Spring Harb Symp Quant Biol</source>
<year>2003</year>
<volume>68</volume>
<fpage>141</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="doi">10.1101/sqb.2003.68.141</pub-id>
<pub-id pub-id-type="pmid">15338612</pub-id>
</element-citation>
</ref>
<ref id="CR21">
<label>21.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McNulty</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>BA</given-names>
</name>
</person-group>
<article-title>Alpha satellite DNA biology: finding function in the recesses of the genome</article-title>
<source>Chromosome Res</source>
<year>2018</year>
<volume>26</volume>
<fpage>115</fpage>
<lpage>138</lpage>
<pub-id pub-id-type="doi">10.1007/s10577-018-9582-3</pub-id>
<pub-id pub-id-type="pmid">29974361</pub-id>
</element-citation>
</ref>
<ref id="CR22">
<label>22.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Miga</surname>
<given-names>KH</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Centromere reference models for human chromosomes X and Y satellite arrays</article-title>
<source>Genome Res</source>
<year>2014</year>
<volume>24</volume>
<fpage>697</fpage>
<lpage>707</lpage>
<pub-id pub-id-type="doi">10.1101/gr.159624.113</pub-id>
<pub-id pub-id-type="pmid">24501022</pub-id>
</element-citation>
</ref>
<ref id="CR23">
<label>23.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Alkan</surname>
<given-names>C</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data</article-title>
<source>PLoS Comput Biol</source>
<year>2007</year>
<volume>3</volume>
<fpage>1807</fpage>
<lpage>18</lpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.0030181</pub-id>
<pub-id pub-id-type="pmid">17907796</pub-id>
</element-citation>
</ref>
<ref id="CR24">
<label>24.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Macas</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Neumann</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Novak</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Global sequence characterization of rice centromeric satellite based on oligomer frequency analysis in large-scale sequencing data</article-title>
<source>Bioinformatics</source>
<year>2010</year>
<volume>26</volume>
<fpage>2101</fpage>
<lpage>8</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btq343</pub-id>
<pub-id pub-id-type="pmid">20616383</pub-id>
</element-citation>
</ref>
<ref id="CR25">
<label>25.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>van Dijk</surname>
<given-names>EL</given-names>
</name>
<name>
<surname>Jaszczyszyn</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Naquin</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Thermes</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>The Third Revolution in Sequencing Technology</article-title>
<source>Trends Genet</source>
<year>2018</year>
<volume>34</volume>
<fpage>666</fpage>
<lpage>681</lpage>
<pub-id pub-id-type="doi">10.1016/j.tig.2018.05.008</pub-id>
<pub-id pub-id-type="pmid">29941292</pub-id>
</element-citation>
</ref>
<ref id="CR26">
<label>26.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jain</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Linear assembly of a human centromere on the Y chromosome</article-title>
<source>Nat Biotechnol</source>
<year>2018</year>
<volume>36</volume>
<fpage>321</fpage>
<lpage>323</lpage>
<pub-id pub-id-type="doi">10.1038/nbt.4109</pub-id>
<pub-id pub-id-type="pmid">29553574</pub-id>
</element-citation>
</ref>
<ref id="CR27">
<label>27.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sevim</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Bashir</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Chin</surname>
<given-names>CS</given-names>
</name>
<name>
<surname>Miga</surname>
<given-names>KH</given-names>
</name>
</person-group>
<article-title>Alpha-CENTAURI: assessing novel centromeric repeat sequence variation with long read sequencing</article-title>
<source>Bioinformatics</source>
<year>2016</year>
<volume>32</volume>
<fpage>1921</fpage>
<lpage>1924</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btw101</pub-id>
<pub-id pub-id-type="pmid">27153570</pub-id>
</element-citation>
</ref>
<ref id="CR28">
<label>28.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Altschul</surname>
<given-names>SF</given-names>
</name>
<name>
<surname>Gish</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Myers</surname>
<given-names>EW</given-names>
</name>
<name>
<surname>Lipman</surname>
<given-names>DJ</given-names>
</name>
</person-group>
<article-title>Basic local alignment search tool</article-title>
<source>J Mol Biol</source>
<year>1990</year>
<volume>215</volume>
<fpage>403</fpage>
<lpage>10</lpage>
<pub-id pub-id-type="doi">10.1016/S0022-2836(05)80360-2</pub-id>
<pub-id pub-id-type="pmid">2231712</pub-id>
</element-citation>
</ref>
<ref id="CR29">
<label>29.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Benson</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Tandem repeats finder: a program to analyze DNA sequences</article-title>
<source>Nucleic Acids Res</source>
<year>1999</year>
<volume>27</volume>
<fpage>573</fpage>
<lpage>80</lpage>
<pub-id pub-id-type="doi">10.1093/nar/27.2.573</pub-id>
<pub-id pub-id-type="pmid">9862982</pub-id>
</element-citation>
</ref>
<ref id="CR30">
<label>30.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Thompson</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Higgins</surname>
<given-names>DG</given-names>
</name>
<name>
<surname>Gibson</surname>
<given-names>TJ</given-names>
</name>
</person-group>
<article-title>CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice</article-title>
<source>Nucleic Acids Res</source>
<year>1994</year>
<volume>22</volume>
<fpage>4673</fpage>
<lpage>80</lpage>
<pub-id pub-id-type="doi">10.1093/nar/22.22.4673</pub-id>
<pub-id pub-id-type="pmid">7984417</pub-id>
</element-citation>
</ref>
<ref id="CR31">
<label>31.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sonnhammer</surname>
<given-names>EL</given-names>
</name>
<name>
<surname>Durbin</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis</article-title>
<source>Gene</source>
<year>1995</year>
<volume>167</volume>
<fpage>GC1</fpage>
<lpage>10</lpage>
<pub-id pub-id-type="doi">10.1016/0378-1119(95)00714-8</pub-id>
<pub-id pub-id-type="pmid">8566757</pub-id>
</element-citation>
</ref>
<ref id="CR32">
<label>32.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jurka</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Klonowski</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Dagman</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Pelton</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>CENSOR–a program for identification and elimination of repetitive elements from DNA sequences</article-title>
<source>Comput Chem</source>
<year>1996</year>
<volume>20</volume>
<fpage>119</fpage>
<lpage>21</lpage>
<pub-id pub-id-type="doi">10.1016/S0097-8485(96)80013-1</pub-id>
<pub-id pub-id-type="pmid">8867843</pub-id>
</element-citation>
</ref>
<ref id="CR33">
<label>33.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gluncic</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Paar</surname>
<given-names>V</given-names>
</name>
</person-group>
<article-title>Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm</article-title>
<source>Nucleic Acids Res</source>
<year>2013</year>
<volume>41</volume>
<fpage>e17</fpage>
<pub-id pub-id-type="doi">10.1093/nar/gks721</pub-id>
<pub-id pub-id-type="pmid">22977183</pub-id>
</element-citation>
</ref>
<ref id="CR34">
<label>34.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Paar</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Gluncic</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Rosandic</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Basar</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Vlahovic</surname>
<given-names>I</given-names>
</name>
</person-group>
<article-title>Intragene higher order repeats in neuroblastoma breakpoint family genes distinguish humans from chimpanzees</article-title>
<source>Mol Biol Evol</source>
<year>2011</year>
<volume>28</volume>
<fpage>1877</fpage>
<lpage>92</lpage>
<pub-id pub-id-type="doi">10.1093/molbev/msr009</pub-id>
<pub-id pub-id-type="pmid">21273634</pub-id>
</element-citation>
</ref>
<ref id="CR35">
<label>35.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vlahovic</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Gluncic</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Rosandic</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ugarkovic</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Paar</surname>
<given-names>V</given-names>
</name>
</person-group>
<article-title>Regular Higher Order Repeat Structures in Beetle Tribolium castaneum Genome</article-title>
<source>Genome Biol Evol</source>
<year>2017</year>
<volume>9</volume>
<fpage>2668</fpage>
<lpage>2680</lpage>
<pub-id pub-id-type="pmid">27492235</pub-id>
</element-citation>
</ref>
<ref id="CR36">
<label>36.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Paar</surname>
<given-names>V</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Large tandem, higher order repeats and regularly dispersed repeat units contribute substantially to divergence between human and chimpanzee Y chromosomes</article-title>
<source>J Mol Evol</source>
<year>2011</year>
<volume>72</volume>
<fpage>34</fpage>
<lpage>55</lpage>
<pub-id pub-id-type="doi">10.1007/s00239-010-9401-8</pub-id>
<pub-id pub-id-type="pmid">21103868</pub-id>
</element-citation>
</ref>
<ref id="CR37">
<label>37.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ziccardi</surname>
<given-names>W</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Clusters of alpha satellite on human chromosome 21 are dispersed far onto the short arm and lack ancient layers</article-title>
<source>Chromosome Res</source>
<year>2016</year>
<volume>24</volume>
<fpage>421</fpage>
<lpage>36</lpage>
<pub-id pub-id-type="doi">10.1007/s10577-016-9530-z</pub-id>
<pub-id pub-id-type="pmid">27430641</pub-id>
</element-citation>
</ref>
<ref id="CR38">
<label>38.</label>
<mixed-citation publication-type="other">Uralsky, L.I. <italic>et al</italic>. Classification and monomer-by-monomer annotation dataset of suprachromosomal family 1 alpha satellite higher-order repeats in hg38 human genome assembly. <italic>Data Brief</italic> <bold>24</bold>, 103708 (2019).</mixed-citation>
</ref>
<ref id="CR39">
<label>39.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hattori</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The DNA sequence of human chromosome 21</article-title>
<source>Nature</source>
<year>2000</year>
<volume>405</volume>
<fpage>311</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="doi">10.1038/35012518</pub-id>
<pub-id pub-id-type="pmid">10830953</pub-id>
</element-citation>
</ref>
<ref id="CR40">
<label>40.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Choo</surname>
<given-names>KH</given-names>
</name>
<name>
<surname>Vissel</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Nagy</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Earle</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Kalitsis</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>A survey of the genomic distribution of alpha satellite DNA on all the human chromosomes, and derivation of a new consensus sequence</article-title>
<source>Nucleic Acids Res</source>
<year>1991</year>
<volume>19</volume>
<fpage>1179</fpage>
<lpage>82</lpage>
<pub-id pub-id-type="doi">10.1093/nar/19.6.1179</pub-id>
<pub-id pub-id-type="pmid">2030938</pub-id>
</element-citation>
</ref>
<ref id="CR41">
<label>41.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vissel</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Choo</surname>
<given-names>KH</given-names>
</name>
</person-group>
<article-title>Four distinct alpha satellite subfamilies shared by human chromosomes 13, 14 and 21</article-title>
<source>Nucleic Acids Res</source>
<year>1991</year>
<volume>19</volume>
<fpage>271</fpage>
<lpage>7</lpage>
<pub-id pub-id-type="doi">10.1093/nar/19.2.271</pub-id>
<pub-id pub-id-type="pmid">2014167</pub-id>
</element-citation>
</ref>
<ref id="CR42">
<label>42.</label>
<mixed-citation publication-type="other">Liehr, T. <italic>Benign and Pathological Chromosomal Imbalances: Microscopic and Submicroscopic Copy Number Variations (CNVs) in Genetics and Counseling</italic>, 1–199 (2014).</mixed-citation>
</ref>
<ref id="CR43">
<label>43.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tyler-Smith</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Brown</surname>
<given-names>WR</given-names>
</name>
</person-group>
<article-title>Structure of the major block of alphoid satellite DNA on the human Y chromosome</article-title>
<source>J Mol Biol</source>
<year>1987</year>
<volume>195</volume>
<fpage>457</fpage>
<lpage>70</lpage>
<pub-id pub-id-type="doi">10.1016/0022-2836(87)90175-6</pub-id>
<pub-id pub-id-type="pmid">2821279</pub-id>
</element-citation>
</ref>
<ref id="CR44">
<label>44.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jorgensen</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Bostock</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Bak</surname>
<given-names>AL</given-names>
</name>
</person-group>
<article-title>Homologous subfamilies of human alphoid repetitive DNA on different nucleolus organizing chromosomes</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>1987</year>
<volume>84</volume>
<fpage>1075</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.84.4.1075</pub-id>
<pub-id pub-id-type="pmid">3469648</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000460 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000460 | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:6718397
   |texte=   Discovery of 33mer in chromosome 21 – the largest alpha satellite higher order repeat unit among all human somatic chromosomes
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:31477765" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 


This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021