Tamazight exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Internal identifier: 000036 (Pmc/Corpus); previous: 0000359; next: 0000370



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Convex Clustering: An Attractive Alternative to Hierarchical Clustering</title>
<author>
<name sortKey="Chen, Gary K" sort="Chen, Gary K" uniqKey="Chen G" first="Gary K." last="Chen">Gary K. Chen</name>
<affiliation>
<nlm:aff id="aff001">
<addr-line>Department of Preventive Medicine, Biostatistics Division, University of Southern California, Los Angeles, California, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Chi, Eric C" sort="Chi, Eric C" uniqKey="Chi E" first="Eric C." last="Chi">Eric C. Chi</name>
<affiliation>
<nlm:aff id="aff002">
<addr-line>Department of Electrical and Computer Engineering, Rice University, Houston, Texas, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ranola, John Michael O" sort="Ranola, John Michael O" uniqKey="Ranola J" first="John Michael O." last="Ranola">John Michael O. Ranola</name>
<affiliation>
<nlm:aff id="aff003">
<addr-line>Department of Statistics, University of Washington, Seattle, Washington, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lange, Kenneth" sort="Lange, Kenneth" uniqKey="Lange K" first="Kenneth" last="Lange">Kenneth Lange</name>
<affiliation>
<nlm:aff id="aff004">
<addr-line>Department of Biomathematics, University of California, Los Angeles, Los Angeles, California, United States of America</addr-line>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff005">
<addr-line>Department of Human Genetics, University of California, Los Angeles, Los Angeles, California, United States of America</addr-line>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff006">
<addr-line>Department of Statistics, University of California, Los Angeles, Los Angeles, California, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">25965340</idno>
<idno type="pmc">4429070</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4429070</idno>
<idno type="RBID">PMC:4429070</idno>
<idno type="doi">10.1371/journal.pcbi.1004228</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000036</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000036</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Convex Clustering: An Attractive Alternative to Hierarchical Clustering</title>
<author>
<name sortKey="Chen, Gary K" sort="Chen, Gary K" uniqKey="Chen G" first="Gary K." last="Chen">Gary K. Chen</name>
<affiliation>
<nlm:aff id="aff001">
<addr-line>Department of Preventive Medicine, Biostatistics Division, University of Southern California, Los Angeles, California, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Chi, Eric C" sort="Chi, Eric C" uniqKey="Chi E" first="Eric C." last="Chi">Eric C. Chi</name>
<affiliation>
<nlm:aff id="aff002">
<addr-line>Department of Electrical and Computer Engineering, Rice University, Houston, Texas, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ranola, John Michael O" sort="Ranola, John Michael O" uniqKey="Ranola J" first="John Michael O." last="Ranola">John Michael O. Ranola</name>
<affiliation>
<nlm:aff id="aff003">
<addr-line>Department of Statistics, University of Washington, Seattle, Washington, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lange, Kenneth" sort="Lange, Kenneth" uniqKey="Lange K" first="Kenneth" last="Lange">Kenneth Lange</name>
<affiliation>
<nlm:aff id="aff004">
<addr-line>Department of Biomathematics, University of California, Los Angeles, Los Angeles, California, United States of America</addr-line>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff005">
<addr-line>Department of Human Genetics, University of California, Los Angeles, Los Angeles, California, United States of America</addr-line>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff006">
<addr-line>Department of Statistics, University of California, Los Angeles, Los Angeles, California, United States of America</addr-line>
</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">PLoS Computational Biology</title>
<idno type="ISSN">1553-734X</idno>
<idno type="eISSN">1553-7358</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program
<sc>CONVEXCLUSTER</sc>
incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems.
<sc>CONVEXCLUSTER</sc>
can be freely downloaded from the UCLA Human Genetics web site at
<ext-link ext-link-type="uri" xlink:href="http://www.genetics.ucla.edu/software/">http://www.genetics.ucla.edu/software/</ext-link>
</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Cormen, Th" uniqKey="Cormen T">TH Cormen</name>
</author>
<author>
<name sortKey="Stein, C" uniqKey="Stein C">C Stein</name>
</author>
<author>
<name sortKey="Rivest, Rl" uniqKey="Rivest R">RL Rivest</name>
</author>
<author>
<name sortKey="Leiserson, Ce" uniqKey="Leiserson C">CE Leiserson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lindsten, F" uniqKey="Lindsten F">F Lindsten</name>
</author>
<author>
<name sortKey="Ohlsson, H" uniqKey="Ohlsson H">H Ohlsson</name>
</author>
<author>
<name sortKey="Ljung, L" uniqKey="Ljung L">L Ljung</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fisher, Ra" uniqKey="Fisher R">RA Fisher</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Borwein, Jm" uniqKey="Borwein J">JM Borwein</name>
</author>
<author>
<name sortKey="Lewis, As" uniqKey="Lewis A">AS Lewis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Clarke, Fh" uniqKey="Clarke F">FH Clarke</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Demyanov, Vf" uniqKey="Demyanov V">VF Demyanov</name>
</author>
<author>
<name sortKey="Fletcher, R" uniqKey="Fletcher R">R Fletcher</name>
</author>
<author>
<name sortKey="Terlaky, T" uniqKey="Terlaky T">T Terlaky</name>
</author>
<author>
<name sortKey="Di Pillo, G" uniqKey="Di Pillo G">G Di Pillo</name>
</author>
<author>
<name sortKey="Schoen, F" uniqKey="Schoen F">F Schoen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Borg, I" uniqKey="Borg I">I Borg</name>
</author>
<author>
<name sortKey="Groenen, Pj" uniqKey="Groenen P">PJ Groenen</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hunter, Dr" uniqKey="Hunter D">DR Hunter</name>
</author>
<author>
<name sortKey="Lange, K" uniqKey="Lange K">K Lange</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lange, K" uniqKey="Lange K">K Lange</name>
</author>
<author>
<name sortKey="Hunter, Dr" uniqKey="Hunter D">DR Hunter</name>
</author>
<author>
<name sortKey="Yang, I" uniqKey="Yang I">I Yang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wu, Tt" uniqKey="Wu T">TT Wu</name>
</author>
<author>
<name sortKey="Lange, K" uniqKey="Lange K">K Lange</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Deutsch, F" uniqKey="Deutsch F">F Deutsch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Parikh, N" uniqKey="Parikh N">N Parikh</name>
</author>
<author>
<name sortKey="Boyd, S" uniqKey="Boyd S">S Boyd</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lange, K" uniqKey="Lange K">K Lange</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sokal, Rr" uniqKey="Sokal R">RR Sokal</name>
</author>
<author>
<name sortKey="Michener, Cd" uniqKey="Michener C">CD Michener</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hopcroft, J" uniqKey="Hopcroft J">J Hopcroft</name>
</author>
<author>
<name sortKey="Tarjan, R" uniqKey="Tarjan R">R Tarjan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hubert, L" uniqKey="Hubert L">L Hubert</name>
</author>
<author>
<name sortKey="Arabie, P" uniqKey="Arabie P">P Arabie</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Su, Ys" uniqKey="Su Y">YS Su</name>
</author>
<author>
<name sortKey="Gelman, A" uniqKey="Gelman A">A Gelman</name>
</author>
<author>
<name sortKey="Hill, J" uniqKey="Hill J">J Hill</name>
</author>
<author>
<name sortKey="Yajima, M" uniqKey="Yajima M">M Yajima</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pritchard, Jk" uniqKey="Pritchard J">JK Pritchard</name>
</author>
<author>
<name sortKey="Stephens, M" uniqKey="Stephens M">M Stephens</name>
</author>
<author>
<name sortKey="Donnelly, P" uniqKey="Donnelly P">P Donnelly</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Alexander, Dh" uniqKey="Alexander D">DH Alexander</name>
</author>
<author>
<name sortKey="Novembre, J" uniqKey="Novembre J">J Novembre</name>
</author>
<author>
<name sortKey="Lange, K" uniqKey="Lange K">K Lange</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Price, Al" uniqKey="Price A">AL Price</name>
</author>
<author>
<name sortKey="Patterson, Nj" uniqKey="Patterson N">NJ Patterson</name>
</author>
<author>
<name sortKey="Plenge, Rm" uniqKey="Plenge R">RM Plenge</name>
</author>
<author>
<name sortKey="Weinblatt, Me" uniqKey="Weinblatt M">ME Weinblatt</name>
</author>
<author>
<name sortKey="Shadick, Na" uniqKey="Shadick N">NA Shadick</name>
</author>
<author>
<name sortKey="Reich, D" uniqKey="Reich D">D Reich</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rosenberg, Na" uniqKey="Rosenberg N">NA Rosenberg</name>
</author>
<author>
<name sortKey="Pritchard, Jk" uniqKey="Pritchard J">JK Pritchard</name>
</author>
<author>
<name sortKey="Weber, Jl" uniqKey="Weber J">JL Weber</name>
</author>
<author>
<name sortKey="Cann, Hm" uniqKey="Cann H">HM Cann</name>
</author>
<author>
<name sortKey="Kidd, Kk" uniqKey="Kidd K">KK Kidd</name>
</author>
<author>
<name sortKey="Zhivotovsky, La" uniqKey="Zhivotovsky L">LA Zhivotovsky</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kittles, Ra" uniqKey="Kittles R">RA Kittles</name>
</author>
<author>
<name sortKey="Weiss, Km" uniqKey="Weiss K">KM Weiss</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wang, S" uniqKey="Wang S">S Wang</name>
</author>
<author>
<name sortKey="Lewis, Cm" uniqKey="Lewis C">CM Lewis</name>
</author>
<author>
<name sortKey="Jakobsson, M" uniqKey="Jakobsson M">M Jakobsson</name>
</author>
<author>
<name sortKey="Ramachandran, S" uniqKey="Ramachandran S">S Ramachandran</name>
</author>
<author>
<name sortKey="Ray, N" uniqKey="Ray N">N Ray</name>
</author>
<author>
<name sortKey="Bedoya, G" uniqKey="Bedoya G">G Bedoya</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, Jz" uniqKey="Li J">JZ Li</name>
</author>
<author>
<name sortKey="Absher, Dm" uniqKey="Absher D">DM Absher</name>
</author>
<author>
<name sortKey="Tang, H" uniqKey="Tang H">H Tang</name>
</author>
<author>
<name sortKey="Southwick, Am" uniqKey="Southwick A">AM Southwick</name>
</author>
<author>
<name sortKey="Casto, Am" uniqKey="Casto A">AM Casto</name>
</author>
<author>
<name sortKey="Ramachandran, S" uniqKey="Ramachandran S">S Ramachandran</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rosenberg, Na" uniqKey="Rosenberg N">NA Rosenberg</name>
</author>
<author>
<name sortKey="Mahajan, S" uniqKey="Mahajan S">S Mahajan</name>
</author>
<author>
<name sortKey="Gonzalez Quevedo, C" uniqKey="Gonzalez Quevedo C">C Gonzalez-Quevedo</name>
</author>
<author>
<name sortKey="Blum, Mg" uniqKey="Blum M">MG Blum</name>
</author>
<author>
<name sortKey="Nino Rosales, L" uniqKey="Nino Rosales L">L Nino-Rosales</name>
</author>
<author>
<name sortKey="Ninis, V" uniqKey="Ninis V">V Ninis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Coudray, C" uniqKey="Coudray C">C Coudray</name>
</author>
<author>
<name sortKey="Olivieri, A" uniqKey="Olivieri A">A Olivieri</name>
</author>
<author>
<name sortKey="Achilli, A" uniqKey="Achilli A">A Achilli</name>
</author>
<author>
<name sortKey="Pala, M" uniqKey="Pala M">M Pala</name>
</author>
<author>
<name sortKey="Melhaoui, M" uniqKey="Melhaoui M">M Melhaoui</name>
</author>
<author>
<name sortKey="Cherkaoui, M" uniqKey="Cherkaoui M">M Cherkaoui</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Qamar, R" uniqKey="Qamar R">R Qamar</name>
</author>
<author>
<name sortKey="Ayub, Q" uniqKey="Ayub Q">Q Ayub</name>
</author>
<author>
<name sortKey="Mohyuddin, A" uniqKey="Mohyuddin A">A Mohyuddin</name>
</author>
<author>
<name sortKey="Helgason, A" uniqKey="Helgason A">A Helgason</name>
</author>
<author>
<name sortKey="Mazhar, K" uniqKey="Mazhar K">K Mazhar</name>
</author>
<author>
<name sortKey="Mansoor, A" uniqKey="Mansoor A">A Mansoor</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ablimit, A" uniqKey="Ablimit A">A Ablimit</name>
</author>
<author>
<name sortKey="Qin, W" uniqKey="Qin W">W Qin</name>
</author>
<author>
<name sortKey="Shan, W" uniqKey="Shan W">W Shan</name>
</author>
<author>
<name sortKey="Wu, W" uniqKey="Wu W">W Wu</name>
</author>
<author>
<name sortKey="Ling, F" uniqKey="Ling F">F Ling</name>
</author>
<author>
<name sortKey="Ling, Kh" uniqKey="Ling K">KH Ling</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nelson, Mr" uniqKey="Nelson M">MR Nelson</name>
</author>
<author>
<name sortKey="Bryc, K" uniqKey="Bryc K">K Bryc</name>
</author>
<author>
<name sortKey="King, Ks" uniqKey="King K">KS King</name>
</author>
<author>
<name sortKey="Indap, A" uniqKey="Indap A">A Indap</name>
</author>
<author>
<name sortKey="Boyko, Ar" uniqKey="Boyko A">AR Boyko</name>
</author>
<author>
<name sortKey="Novembre, J" uniqKey="Novembre J">J Novembre</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gravel, S" uniqKey="Gravel S">S Gravel</name>
</author>
<author>
<name sortKey="Henn, Bm" uniqKey="Henn B">BM Henn</name>
</author>
<author>
<name sortKey="Gutenkunst, Rn" uniqKey="Gutenkunst R">RN Gutenkunst</name>
</author>
<author>
<name sortKey="Indap, Ar" uniqKey="Indap A">AR Indap</name>
</author>
<author>
<name sortKey="Marth, Gt" uniqKey="Marth G">GT Marth</name>
</author>
<author>
<name sortKey="Clark, Ag" uniqKey="Clark A">AG Clark</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rochefort, H" uniqKey="Rochefort H">H Rochefort</name>
</author>
<author>
<name sortKey="Glondu, M" uniqKey="Glondu M">M Glondu</name>
</author>
<author>
<name sortKey="Sahla, Me" uniqKey="Sahla M">ME Sahla</name>
</author>
<author>
<name sortKey="Platet, N" uniqKey="Platet N">N Platet</name>
</author>
<author>
<name sortKey="Garcia, M" uniqKey="Garcia M">M Garcia</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Perou, Cm" uniqKey="Perou C">CM Perou</name>
</author>
<author>
<name sortKey="S Rlie, T" uniqKey="S Rlie T">T S?rlie</name>
</author>
<author>
<name sortKey="Eisen, Mb" uniqKey="Eisen M">MB Eisen</name>
</author>
<author>
<name sortKey="Van De Rijn, M" uniqKey="Van De Rijn M">M van de Rijn</name>
</author>
<author>
<name sortKey="Jeffrey, Ss" uniqKey="Jeffrey S">SS Jeffrey</name>
</author>
<author>
<name sortKey="Rees, Ca" uniqKey="Rees C">CA Rees</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chi, Ec" uniqKey="Chi E">EC Chi</name>
</author>
<author>
<name sortKey="Lange, K" uniqKey="Lange K">K Lange</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">PLoS Comput Biol</journal-id>
<journal-id journal-id-type="iso-abbrev">PLoS Comput. Biol</journal-id>
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="pmc">ploscomp</journal-id>
<journal-title-group>
<journal-title>PLoS Computational Biology</journal-title>
</journal-title-group>
<issn pub-type="ppub">1553-734X</issn>
<issn pub-type="epub">1553-7358</issn>
<publisher>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, CA USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">25965340</article-id>
<article-id pub-id-type="pmc">4429070</article-id>
<article-id pub-id-type="publisher-id">PCOMPBIOL-D-14-01570</article-id>
<article-id pub-id-type="doi">10.1371/journal.pcbi.1004228</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Convex Clustering: An Attractive Alternative to Hierarchical Clustering</article-title>
<alt-title alt-title-type="running-head">CONVEXCLUSTER</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Chen</surname>
<given-names>Gary K.</given-names>
</name>
<xref ref-type="aff" rid="aff001">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="cor001">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Chi</surname>
<given-names>Eric C.</given-names>
</name>
<xref ref-type="aff" rid="aff002">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ranola</surname>
<given-names>John Michael O.</given-names>
</name>
<xref ref-type="aff" rid="aff003">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Lange</surname>
<given-names>Kenneth</given-names>
</name>
<xref ref-type="aff" rid="aff004">
<sup>4</sup>
</xref>
<xref ref-type="aff" rid="aff005">
<sup>5</sup>
</xref>
<xref ref-type="aff" rid="aff006">
<sup>6</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff001">
<label>1</label>
<addr-line>Department of Preventive Medicine, Biostatistics Division, University of Southern California, Los Angeles, California, United States of America</addr-line>
</aff>
<aff id="aff002">
<label>2</label>
<addr-line>Department of Electrical and Computer Engineering, Rice University, Houston, Texas, United States of America</addr-line>
</aff>
<aff id="aff003">
<label>3</label>
<addr-line>Department of Statistics, University of Washington, Seattle, Washington, United States of America</addr-line>
</aff>
<aff id="aff004">
<label>4</label>
<addr-line>Department of Biomathematics, University of California, Los Angeles, Los Angeles, California, United States of America</addr-line>
</aff>
<aff id="aff005">
<label>5</label>
<addr-line>Department of Human Genetics, University of California, Los Angeles, Los Angeles, California, United States of America</addr-line>
</aff>
<aff id="aff006">
<label>6</label>
<addr-line>Department of Statistics, University of California, Los Angeles, Los Angeles, California, United States of America</addr-line>
</aff>
<contrib-group>
<contrib contrib-type="editor">
<name>
<surname>Kosakovsky Pond</surname>
<given-names>Sergei L.</given-names>
</name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"></xref>
</contrib>
</contrib-group>
<aff id="edit1">
<addr-line>University of California San Diego, UNITED STATES</addr-line>
</aff>
<author-notes>
<fn fn-type="conflict" id="coi001">
<p>The authors have declared that no competing interests exist.</p>
</fn>
<fn fn-type="con" id="contrib001">
<p>Conceived and designed the experiments: GKC ECC KL. Performed the experiments: GKC. Analyzed the data: GKC. Wrote the paper: GKC ECC KL. Prepared the POPRES data: JMOR.</p>
</fn>
<corresp id="cor001">* E-mail:
<email>gary.k.chen@usc.edu</email>
</corresp>
</author-notes>
<pub-date pub-type="collection">
<month>5</month>
<year>2015</year>
</pub-date>
<pub-date pub-type="epub">
<day>12</day>
<month>5</month>
<year>2015</year>
</pub-date>
<volume>11</volume>
<issue>5</issue>
<elocation-id>e1004228</elocation-id>
<history>
<date date-type="received">
<day>27</day>
<month>8</month>
<year>2014</year>
</date>
<date date-type="accepted">
<day>6</day>
<month>3</month>
<year>2015</year>
</date>
</history>
<permissions>
<copyright-statement>© 2015 Chen et al</copyright-statement>
<copyright-year>2015</copyright-year>
<copyright-holder>Chen et al</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:type="simple" xlink:href="pcbi.1004228.pdf"></self-uri>
<abstract>
<p>The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program
<sc>CONVEXCLUSTER</sc>
incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems.
<sc>CONVEXCLUSTER</sc>
can be freely downloaded from the UCLA Human Genetics web site at
<ext-link ext-link-type="uri" xlink:href="http://www.genetics.ucla.edu/software/">http://www.genetics.ucla.edu/software/</ext-link>
</p>
</abstract>
<abstract abstract-type="summary">
<title>Author Summary</title>
<p>Pattern discovery is one of the most important goals of data-driven research. In the biological sciences hierarchical clustering has achieved a position of pre-eminence due to its ability to capture multiple levels of data granularity. Hierarchical clustering’s visual displays of phylogenetic trees and gene-expression modules are indeed seductive. Despite its merits, hierarchical clustering is greedy by nature and often produces spurious clusters, particularly in the presence of substantial noise. This paper presents a relatively new alternative to hierarchical clustering known as convex clustering. Although convex clustering is more computationally demanding, it enjoys several advantages over hierarchical clustering and other traditional methods of clustering. Convex clustering delivers a uniquely defined clustering path that partially obviates the need for choosing an optimal number of clusters. Along the path small clusters gradually coalesce to form larger clusters. Clustering can be guided by external information through appropriately defined similarity weights. Comparisons to hierarchical clustering demonstrate the superior robustness of convex clustering to noise. Our genetics examples include inference of the demographic history of 52 populations across the world, a more detailed analysis of European demography, and a re-analysis of a well-known breast cancer expression dataset. We also introduce a new algorithm for solving the convex clustering problem. This algorithm belongs to a subclass of MM (minimization-majorization) algorithms known as proximal distance algorithms. The proximal distance convex clustering algorithm is inherently parallelizable and readily maps to modern many-core devices such as graphics processing units (GPUs). Our freely available software,
<sc>convexcluster</sc>
, exploits OpenCL routines that ensure compatibility across a variety of hardware environments.</p>
</abstract>
<funding-group>
<funding-statement>This work was funded in part by National Institutes of Health grants R01 MH100879, R01 ES019876, and R01 HG006465 supporting GKC, CIA Postdoctoral Fellowship 2012-12062800003 supporting ECC, NHGRI grant T32 HG00035 supporting JMOR, and R01 HG006139 and R01 GM53275 supporting KL. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</funding-statement>
</funding-group>
<counts>
<fig-count count="19"></fig-count>
<table-count count="3"></table-count>
<page-count count="31"></page-count>
</counts>
<custom-meta-group>
<custom-meta id="data-availability">
<meta-name>Data Availability</meta-name>
<meta-value>Datasets in the paper are held in public repositories: 1)
<ext-link ext-link-type="uri" xlink:href="https://archive.ics.uci.edu/ml/datasets.html">https://archive.ics.uci.edu/ml/datasets.html</ext-link>
2)
<ext-link ext-link-type="uri" xlink:href="http://web.stanford.edu/group/rosenberglab/diversity.html">http://web.stanford.edu/group/rosenberglab/diversity.html</ext-link>
3)
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000145.v4.p2">http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000145.v4.p2</ext-link>
4)
<ext-link ext-link-type="uri" xlink:href="http://genome-www.stanford.edu/sutech/">http://genome-www.stanford.edu/sutech/</ext-link>
</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
<notes>
<title>Data Availability</title>
<p>Datasets in the paper are held in public repositories: 1)
<ext-link ext-link-type="uri" xlink:href="https://archive.ics.uci.edu/ml/datasets.html">https://archive.ics.uci.edu/ml/datasets.html</ext-link>
2)
<ext-link ext-link-type="uri" xlink:href="http://web.stanford.edu/group/rosenberglab/diversity.html">http://web.stanford.edu/group/rosenberglab/diversity.html</ext-link>
3)
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000145.v4.p2">http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000145.v4.p2</ext-link>
4)
<ext-link ext-link-type="uri" xlink:href="http://genome-www.stanford.edu/sutech/">http://genome-www.stanford.edu/sutech/</ext-link>
</p>
</notes>
</front>
<body>
<disp-quote>
<p>This is a
<italic>PLOS Computational Biology</italic>
Methods paper.</p>
</disp-quote>
<sec sec-type="intro" id="sec001">
<title>Introduction</title>
<p>Pattern discovery is one of the primary goals of bioinformatics. Cluster analysis is a broad term for a variety of exploratory methods that reveal patterns based on similarities between data points. Well-known methods such as
<italic>k</italic>
-means invoke a fixed number of clusters. In complex biological data, the number of clusters is unknown in advance, and it is appealing to vary the number of clusters simultaneously with cluster assignment. Hierarchical clustering has been particularly helpful in understanding cluster granularity in gene-expression studies and other applications. In addition to producing easily visualized and interpretable results, hierarchical clustering is simple to implement and computationally quick. These are legitimate advantages, but they do not compensate for hierarchical clustering’s instability to small data perturbations such as measurement error. Cluster inference can be adversely affected as small errors accumulate.</p>
<p>All principled methods of clustering attempt to minimize some measure of within group dissimilarity. Hierarchical clustering constructs a bifurcating tree by fusing or dividing observations (features). Fusion is referred to as agglomerative clustering and splitting as divisive clustering. Because of the greedy nature of the choices in hierarchical clustering, it returns clusters that are only locally optimal with respect to the underlying criterion [
<xref rid="pcbi.1004228.ref001" ref-type="bibr">1</xref>
]. Solution quality may vary depending on how clusters are fused. There is no guarantee that UPGMA, single linkage, or complete linkage will agree or will collectively or individually give the optimal clusters. A potentially greater handicap is that small perturbations in the data can lead to large changes in hierarchical clustering assignments. This propensity makes hierarchical clustering sensitive to outliers and promotes the formation of spurious clusters. In combination, the presence of local minima and the sensitivity to outliers lead to irreproducible results.</p>
<p>Although hierarchical clustering has its drawbacks, completely reformulating it might be detrimental. Recently [
<xref rid="pcbi.1004228.ref002" ref-type="bibr">2</xref>
] and [
<xref rid="pcbi.1004228.ref003" ref-type="bibr">3</xref>
] introduced convex clustering based on minimizing a penalized sum of squares. Their criterion is coercive and strictly convex. Recall that a function
<italic>f</italic>
(
<bold>
<italic>x</italic>
</bold>
) is coercive if lim
<sub>‖
<bold>
<italic>x</italic>
</bold>
‖ → ∞</sub>
<italic>f</italic>
(
<bold>
<italic>x</italic>
</bold>
) = ∞. According to a classical theorem of mathematical analysis, a continuous coercive function achieves its minimum. Strict convexity of the convex clustering criterion ensures that the global minimizer is unique. The penalty term in the convex clustering criterion accommodates prior information through nonuniform weights on data pairs. The solution paths of convex clustering retain the straightforward interpretability of hierarchical clustering while ameliorating its sensitivity to outliers and tendency to get trapped by local minima.</p>
<p>Despite the persuasive advantages of convex clustering, two obstacles stand in the way of its becoming a practical tool in bioinformatics. The first is the challenge of large-scale problems: current algorithms are computationally intensive and scale poorly on high-dimensional problems. The second is the minimal guidance currently available on how to choose penalty weights. Hocking and colleagues suggest some rules of thumb but offer little detailed advice [
<xref rid="pcbi.1004228.ref003" ref-type="bibr">3</xref>
]. In our experience, the quality of the clustering path depends critically on well-designed weights. To address these issues, the current paper describes a fast new algorithm and a corresponding software implementation,
<sc>convexcluster</sc>
. Our advice on strategies for choosing penalty weights is grounded in some practical biological examples. These examples support our conviction that convex clustering can be more nuanced than hierarchical clustering. Our examples include Fisher’s Iris data from discriminant analysis, ethnicity clustering based on microsatellite genotypes from the Human Genome Diversity Project and SNP genotypes from the POPRES project, and breast cancer subtype classification via microarrays. In the POPRES data, we first reduce the genotypes to principal components and then use these to cluster. The paths computed under convex clustering expose features of the data hidden from less sophisticated clustering methods. The potential for understanding human evolution and history alone justifies wider adoption of convex clustering.</p>
</sec>
<sec sec-type="materials|methods" id="sec002">
<title>Methods</title>
<p>Assume that there are
<italic>n</italic>
cases and
<italic>p</italic>
features. For example, cases might correspond to cancer patients and features to their biomarker profiles. The more vivid language of graph theory speaks of nodes rather than cases and edges rather than pairs of cases. To implement convex clustering, [
<xref rid="pcbi.1004228.ref002" ref-type="bibr">2</xref>
] suggest minimizing the penalized loss function
<disp-formula id="pcbi.1004228.e001">
<alternatives>
<graphic xlink:href="pcbi.1004228.e001.jpg" id="pcbi.1004228.e001g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M1">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>μ</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="bold-italic">U</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>μ</mml:mi>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo><</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
<label>(1)</label>
</disp-formula>
relying on Euclidean norms. Here the column vector
<bold>
<italic>x</italic>
</bold>
<sub>
<italic>i</italic>
</sub>
∈ ℝ
<sup>
<italic>p</italic>
</sup>
of the matrix
<bold>
<italic>X</italic>
</bold>
∈ ℝ
<sup>
<italic>p</italic>
×
<italic>n</italic>
</sup>
records the features for case
<italic>i</italic>
, the column
<bold>
<italic>u</italic>
</bold>
<sub>
<italic>i</italic>
</sub>
of the matrix
<bold>
<italic>U</italic>
</bold>
denotes the cluster center assigned to case
<italic>i</italic>
,
<italic>μ</italic>
≥ 0 tunes the strength of the penalty, and
<italic>w</italic>
<sub>
<italic>ij</italic>
</sub>
≥ 0 weights the contribution of the case pair (
<italic>i</italic>
,
<italic>j</italic>
) to the penalty. Unless sparse, the weights
<italic>w</italic>
<sub>
<italic>ij</italic>
</sub>
are stored in a symmetric
<italic>n</italic>
×
<italic>n</italic>
adjacency matrix.
<xref ref-type="fig" rid="pcbi.1004228.g001">Fig 1</xref>
illustrates the concept of convex clustering on three data points extracted from the Iris dataset [
<xref rid="pcbi.1004228.ref004" ref-type="bibr">4</xref>
]. The objective function
<italic>f</italic>
<sub>
<italic>μ</italic>
</sub>
(
<bold>
<italic>U</italic>
</bold>
) treats the features symmetrically. If these range over widely varying scales, it is prudent to standardize each feature to have mean 0 and variance 1.</p>
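<p>To make Eq (1) concrete, the following minimal Python sketch (an illustration only, not code from <sc>convexcluster</sc>; the function name is ours) evaluates the convex clustering objective for a given data matrix, matrix of centers, and weight matrix.</p>
<preformat>
import numpy as np

def convex_clustering_objective(X, U, W, mu):
    """Evaluate Eq (1): 0.5 * sum_i ||x_i - u_i||^2
    plus mu * sum over pairs i, j (i below j) of w_ij * ||u_i - u_j||.
    X and U are p x n with columns as cases; W is a symmetric
    nonnegative n x n weight matrix."""
    n = X.shape[1]
    loss = 0.5 * np.sum((X - U) ** 2)
    penalty = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if W[i, j] > 0:     # sparse weights skip work here
                penalty += W[i, j] * np.linalg.norm(U[:, i] - U[:, j])
    return loss + mu * penalty
</preformat>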
<fig id="pcbi.1004228.g001" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g001</object-id>
<label>Fig 1</label>
<caption>
<title>Convex clustering concepts.</title>
<p>For clarity, we present three random data points extracted from the three classes in the Iris dataset. Black points denote the original data points
<bold>
<italic>X</italic>
</bold>
and blue points denote the cluster centers
<bold>
<italic>U</italic>
</bold>
. At
<italic>μ</italic>
= 0,
<bold>
<italic>X</italic>
</bold>
and
<bold>
<italic>U</italic>
</bold>
coincide. At intermediate
<italic>μ</italic>
values (middle figure),
<bold>
<italic>U</italic>
</bold>
coalesces towards its cluster center. For sufficiently large
<italic>μ</italic>
,
<bold>
<italic>U</italic>
</bold>
converges to cluster centers (right figure). Note that in this demonstration, only the left two points have non-zero pairwise weights
<italic>w</italic>
<sub>
<italic>ij</italic>
</sub>
. Hence, the two resulting clusters reflect the two connected components of the graph defined by the matrix of weights.</p>
</caption>
<graphic xlink:href="pcbi.1004228.g001"></graphic>
</fig>
<p>Because the objective function
<italic>f</italic>
<sub>
<italic>μ</italic>
</sub>
(
<bold>
<italic>U</italic>
</bold>
) is strictly convex and coercive, a unique minimum point exists for each value of
<italic>μ</italic>
. When
<italic>μ</italic>
= 0 and the
<bold>
<italic>x</italic>
</bold>
<sub>
<italic>i</italic>
</sub>
are unique, the choices
<bold>
<italic>u</italic>
</bold>
<sub>
<italic>i</italic>
</sub>
=
<bold>
<italic>x</italic>
</bold>
<sub>
<italic>i</italic>
</sub>
minimize
<italic>f</italic>
<sub>
<italic>μ</italic>
</sub>
(
<bold>
<italic>U</italic>
</bold>
), and there are as many clusters as cases. If the underlying graph is connected, then as
<italic>μ</italic>
increases, cluster centers coalesce until all centers merge into a single cluster with all
<inline-formula id="pcbi.1004228.e002">
<mml:math id="M2">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo></mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula>
, the average of the data points
<bold>
<italic>x</italic>
</bold>
<sub>
<italic>i</italic>
</sub>
. Although splitting events as well as fusion events can in principle occur along the solution path, following the path as
<italic>μ</italic>
increases typically reveals a hierarchical structure among the clusters. The weights encode prior information that guides clustering. Setting some of the weights equal to 0 reduces the computational load of minimizing
<italic>f</italic>
<sub>
<italic>μ</italic>
</sub>
(
<bold>
<italic>U</italic>
</bold>
) in the proximal distance algorithm introduced next.</p>
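<p>In practice the weights are often built from the data themselves. A common construction in the convex clustering literature, sketched below in Python under our own assumptions (it is not necessarily the exact recipe used in the paper's examples), combines a Gaussian kernel with k-nearest-neighbor sparsification, which both guides the path and zeroes out most pairs.</p>
<preformat>
import numpy as np

def gaussian_knn_weights(X, k=5, phi=0.5):
    """w_ij = exp(-phi * ||x_i - x_j||^2), kept only when i and j are
    k-nearest neighbors in either direction; all other weights are set
    to 0 so that the penalty stays sparse."""
    n = X.shape[1]
    d2 = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            d2[i, j] = np.sum((X[:, i] - X[:, j]) ** 2)
    W = np.exp(-phi * d2)
    np.fill_diagonal(W, 0.0)
    keep = np.zeros((n, n), dtype=bool)
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]   # skip self at position 0
        keep[i, nbrs] = True
    keep = keep | keep.T                    # symmetrize the k-NN graph
    return np.where(keep, W, 0.0)
</preformat>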
<sec id="sec003">
<title>The Proximal Distance Algorithm</title>
<p>The proximal distance principle is a new way of attacking constrained optimization problems [
<xref rid="pcbi.1004228.ref005" ref-type="bibr">5</xref>
]. The principle is capable of enforcing parsimony in parameter estimation while avoiding the shrinkage incurred by convex penalties such as the lasso. In parametric models, shrinkage leads to biased parameter estimates and entices false positives to enter the model. Imperfect models in turn fit new data poorly. The proximal distance principle seeks to minimize a function
<italic>h</italic>
(
<bold>
<italic>y</italic>
</bold>
), possibly nonsmooth, subject to
<bold>
<italic>y</italic>
</bold>
∈ <italic>C</italic>
, where
<italic>C</italic>
is a closed set, not necessarily convex. The set
<italic>C</italic>
encodes constraints such as sparsity. In the exact penalty method of Clarke [
<xref rid="pcbi.1004228.ref006" ref-type="bibr">6</xref>
,
<xref rid="pcbi.1004228.ref007" ref-type="bibr">7</xref>
,
<xref rid="pcbi.1004228.ref008" ref-type="bibr">8</xref>
], this constrained problem is replaced by the unconstrained problem of minimizing
<italic>h</italic>
(
<bold>
<italic>y</italic>
</bold>
) +
<italic>ρ</italic>
dist(
<bold>
<italic>y</italic>
</bold>
,
<italic>C</italic>
), where dist(
<bold>
<italic>y</italic>
</bold>
,
<italic>C</italic>
) denotes the Euclidean distance from
<bold>
<italic>y</italic>
</bold>
to
<italic>C</italic>
. Note that dist(
<bold>
<italic>y</italic>
</bold>
,
<italic>C</italic>
) = 0 is a necessary and sufficient condition for
<bold>
<italic>y</italic>
</bold>
∈ <italic>C</italic>
. If
<italic>ρ</italic>
is chosen large enough, say bigger than a Lipschitz constant for
<italic>h</italic>
(
<bold>
<italic>y</italic>
</bold>
), then the minima of the two problems coincide (Proposition 6.3.2 in [
<xref rid="pcbi.1004228.ref006" ref-type="bibr">6</xref>
]).</p>
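<p>A toy Python sketch may help fix ideas; the set, loss, and function names below are our own illustrative choices. Here C is the nonconvex set of vectors with at most s nonzero entries, its projection keeps the s largest-magnitude components, and the exact penalty augments a smooth loss with ρ times the distance to C.</p>
<preformat>
import numpy as np

def project_sparse(y, s):
    """Project y onto C = {vectors with at most s nonzeros}: keep the
    s largest-magnitude entries. C is closed but nonconvex, so the
    closest point need not be unique."""
    z = np.zeros_like(y)
    idx = np.argsort(np.abs(y))[-s:]
    z[idx] = y[idx]
    return z

def exact_penalty(y, b, rho, s):
    """Clarke's exact penalty form h(y) + rho * dist(y, C) with the
    smooth illustrative choice h(y) = 0.5 * ||y - b||^2."""
    h = 0.5 * np.sum((y - b) ** 2)
    return h + rho * np.linalg.norm(y - project_sparse(y, s))
</preformat>
<p>For ρ exceeding a local Lipschitz constant of h, minimizing this penalized function recovers the constrained minimum, in line with the cited proposition.</p>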
<p>How does convex clustering fit in this abstract framework? Although the objective function
<italic>f</italic>
<sub>
<italic>μ</italic>
</sub>
(
<bold>
<italic>U</italic>
</bold>
) is certainly nonsmooth, there are no constraints in sight. The strategy of parameter splitting introduces constraints to simplify the objective function. Since least squares problems are routine, the penalty terms constitute the intractable part of the objective function
<italic>f</italic>
<sub>
<italic>μ</italic>
</sub>
(
<bold>
<italic>U</italic>
</bold>
). One can simplify the term ‖
<bold>
<italic>u</italic>
</bold>
<sub>
<italic>i</italic>
</sub>
<bold>
<italic>u</italic>
</bold>
<sub>
<italic>j</italic>
</sub>
‖ by replacing the vector difference
<bold>
<italic>u</italic>
</bold>
<sub>
<italic>i</italic>
</sub>
<bold>
<italic>u</italic>
</bold>
<sub>
<italic>j</italic>
</sub>
by the single vector
<bold>
<italic>v</italic>
</bold>
<sub>
<italic>ij</italic>
</sub>
and imposing the constraint
<bold>
<italic>v</italic>
</bold>
<sub>
<italic>ij</italic>
</sub>
=
<bold>
<italic>u</italic>
</bold>
<sub>
<italic>i</italic>
</sub>
<bold>
<italic>u</italic>
</bold>
<sub>
<italic>j</italic>
</sub>
. Parameter splitting therefore leads to the revised objective function
<disp-formula id="pcbi.1004228.e003">
<alternatives>
<graphic xlink:href="pcbi.1004228.e003.jpg" id="pcbi.1004228.e003g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M3">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>μ</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="bold-italic">U</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>μ</mml:mi>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo><</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">v</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
<label>(2)</label>
</disp-formula>
with a simpler loss, an expanded set of parameters, and a linear constraint set
<italic>C</italic>
encapsulating the pairwise constraints
<bold>
<italic>v</italic>
</bold>
<sub>
<italic>ij</italic>
</sub>
=
<bold>
<italic>u</italic>
</bold>
<sub>
<italic>i</italic>
</sub>
<bold>
<italic>u</italic>
</bold>
<sub>
<italic>j</italic>
</sub>
.</p>
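<p>Projection onto this linear constraint set reduces to a least squares problem whose normal equations involve the Laplacian of the constraint graph. The following dense Python sketch is our own rendering of this standard linear-algebra step, with hypothetical names.</p>
<preformat>
import numpy as np

def project_onto_C(U0, V0, edges):
    """Project (U0, V0) onto C = {(U, V) : v_ij = u_i - u_j}.
    U0 is p x n; V0 maps each edge (i, j) with i below j to a p-vector.
    Eliminating V yields (I + L) U^T = U0^T + B, where L is the graph
    Laplacian of the edges and row k of B collects the signed v0
    vectors incident to node k."""
    p, n = U0.shape
    L = np.zeros((n, n))
    B = np.zeros((n, p))
    for (i, j) in edges:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
        B[i] += V0[(i, j)]
        B[j] -= V0[(i, j)]
    U = np.linalg.solve(np.eye(n) + L, U0.T + B).T
    V = {(i, j): U[:, i] - U[:, j] for (i, j) in edges}
    return U, V
</preformat>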
<p>The proximal distance method undertakes minimization of
<italic>h</italic>
(
<bold>
<italic>y</italic>
</bold>
) +
<italic>ρ</italic>
dist(
<bold>
<italic>y</italic>
</bold>
,
<italic>C</italic>
) by a combination of approximation, the MM (majorization-minimization) principle [
<xref rid="pcbi.1004228.ref009" ref-type="bibr">9</xref>
,
<xref rid="pcbi.1004228.ref010" ref-type="bibr">10</xref>
,
<xref rid="pcbi.1004228.ref011" ref-type="bibr">11</xref>
,
<xref rid="pcbi.1004228.ref012" ref-type="bibr">12</xref>
,
<xref rid="pcbi.1004228.ref013" ref-type="bibr">13</xref>
], and an appeal to set projection [
<xref rid="pcbi.1004228.ref014" ref-type="bibr">14</xref>
] and proximal mapping [
<xref rid="pcbi.1004228.ref015" ref-type="bibr">15</xref>
]. The latter operations have been intensely studied for years and implemented in a host of special cases. Thus, the proximal distance principle encourages highly modular solutions to difficult optimization problems. Furthermore, most proximal distance algorithms benefit from parallelization.</p>
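<p>The proximal mappings needed here are classical. In particular, the proximal map of a scaled Euclidean norm, which in algorithms of this kind typically governs the updates of the split vectors, is block soft-thresholding; a short sketch (our illustration):</p>
<preformat>
import numpy as np

def prox_euclidean_norm(z, c):
    """Proximal map of v -> c * ||v||, i.e. the minimizer of
    c * ||v|| + 0.5 * ||v - z||^2: shrink z toward 0 by c in norm,
    mapping it to 0 outright when ||z|| is at most c."""
    nrm = np.linalg.norm(z)
    if nrm <= c:
        return np.zeros_like(z)
    return (1.0 - c / nrm) * z
</preformat>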
<p>Let us consider each of the ingredients of the proximal distance algorithm in turn, starting with approximation. The function dist(
<bold>
<italic>y</italic>
</bold>
,
<italic>C</italic>
) is nonsmooth even when
<italic>C</italic>
is well behaved. For
<italic>ϵ</italic>
> 0 small, the revised distance
<inline-formula id="pcbi.1004228.e004">
<mml:math id="M4">
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="normal">dist</mml:mtext>
<mml:mi>ϵ</mml:mi>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">y</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>C</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="normal">dist</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">y</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>C</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>ϵ</mml:mi>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
</inline-formula>
is differentiable and approximates dist(
<bold>
<italic>y</italic>
</bold>
,
<italic>C</italic>
) well. The MM principle leads to algorithms that systematically decrease the objective function. In the case of minimizing
<italic>h</italic>
(
<bold>
<italic>y</italic>
</bold>
) +
<italic>ρ</italic>
dist(
<bold>
<italic>y</italic>
</bold>
,
<italic>C</italic>
) one can invoke the majorization dist(
<bold>
<italic>y</italic>
</bold>
,
<italic>C</italic>
) ≤ ‖
<bold>
<italic>y</italic>
</bold>
<italic>P</italic>
<sub>
<italic>C</italic>
</sub>
(
<bold>
<italic>y</italic>
</bold>
<sub>
<italic>m</italic>
</sub>
)‖, where
<italic>P</italic>
<sub>
<italic>C</italic>
</sub>
(
<bold>
<italic>y</italic>
</bold>
<sub>
<italic>m</italic>
</sub>
) is the projection of the current iterate
<bold>
<italic>y</italic>
</bold>
<sub>
<italic>m</italic>
</sub>
onto the set
<italic>C</italic>
. By definition dist(
<bold>
<italic>y</italic>
</bold>
<sub>
<italic>m</italic>
</sub>
,
<italic>C</italic>
) = ‖
<bold>
<italic>y</italic>
</bold>
<sub>
<italic>m</italic>
</sub>
<italic>P</italic>
<sub>
<italic>C</italic>
</sub>
(
<bold>
<italic>y</italic>
</bold>
<sub>
<italic>m</italic>
</sub>
)‖, and
<italic>P</italic>
<sub>
<italic>C</italic>
</sub>
(
<bold>
<italic>y</italic>
</bold>
<sub>
<italic>m</italic>
</sub>
) is a closest point in
<italic>C</italic>
to the point
<bold>
<italic>y</italic>
</bold>
<sub>
<italic>m</italic>
</sub>
. For a closed nonconvex set, there may be multiple closest points; for a closed convex set there is exactly one.</p>
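<p>Both ingredients are easy to state in code given a projection oracle. The Python sketch below (names ours) computes the smoothed distance and the projection-based majorizer, whose value matches dist(y, C) at the anchor point, as the MM principle requires.</p>
<preformat>
import numpy as np

def dist_eps(y, proj, eps):
    """Smoothed distance: sqrt(dist(y, C)^2 + eps), with dist(y, C)
    obtained from a projection oracle proj for the set C."""
    return np.sqrt(np.sum((y - proj(y)) ** 2) + eps)

def dist_majorizer(y, y_m, proj):
    """Majorizer of dist(y, C) anchored at y_m: the bound
    ||y - P_C(y_m)|| dominates dist(y, C), with equality at y = y_m."""
    return np.linalg.norm(y - proj(y_m))
</preformat>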
<p>According to the MM principle, minimizing the surrogate function
<disp-formula id="pcbi.1004228.e005">
<alternatives>
<graphic xlink:href="pcbi.1004228.e005.jpg" id="pcbi.1004228.e005g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M5">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>μ</mml:mi>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo><</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">v</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi>ρ</mml:mi>
<mml:msqrt>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>(</mml:mo>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="bold-italic">U</mml:mi>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="bold-italic">V</mml:mi>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo>)</mml:mo>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>C</mml:mi>
</mml:msub>
<mml:mo>(</mml:mo>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mi mathvariant="bold-italic">U</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo>)</mml:mo>
<mml:mo></mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>ϵ</mml:mi>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
<label>(3)</label>
</disp-formula>
drives the approximate objective function
<disp-formula id="pcbi.1004228.e006">
<alternatives>
<graphic xlink:href="pcbi.1004228.e006.jpg" id="pcbi.1004228.e006g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M6">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>μ</mml:mi>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo><</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">v</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi>ρ</mml:mi>
<mml:msqrt>
<mml:mrow>
<mml:mi>dist</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mo>(</mml:mo>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="bold-italic">U</mml:mi>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="bold-italic">V</mml:mi>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo>)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>C</mml:mi>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>ϵ</mml:mi>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
</disp-formula>
downhill. The surrogate function
<xref ref-type="disp-formula" rid="pcbi.1004228.e005">Eq (3)</xref>
is still too complicated for our purposes. The remedy is another round of majorization. This time the majorization
<disp-formula id="pcbi.1004228.e007">
<alternatives>
<graphic xlink:href="pcbi.1004228.e007.jpg" id="pcbi.1004228.e007g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M7">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:msqrt>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>ϵ</mml:mi>
</mml:mrow>
</mml:msqrt>
</mml:mtd>
<mml:mtd>
<mml:mo></mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:mi>ϵ</mml:mi>
</mml:mrow>
</mml:msqrt>
<mml:mo>+</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msqrt>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:mi>ϵ</mml:mi>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
<label>(4)</label>
</disp-formula>
comes into play based on the concavity of the function $\sqrt{t+\epsilon}$ for $t \ge 0$. This follows from the fact that a differentiable concave function is always bounded above by its first-order Taylor expansion. As required by the MM principle, equality holds in the majorization Eq (4) when $t = t_m$. Applying this majorization to the surrogate function Eq (3) yields the new surrogate
<disp-formula id="pcbi.1004228.e009">
<alternatives>
<graphic xlink:href="pcbi.1004228.e009.jpg" id="pcbi.1004228.e009g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M9">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="bold-italic">U</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">U</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>μ</mml:mi>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo><</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">v</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mfrac>
<mml:mi>ρ</mml:mi>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:msup>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>(</mml:mo>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="bold-italic">U</mml:mi>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="bold-italic">V</mml:mi>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo>)</mml:mo>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>C</mml:mi>
</mml:msub>
<mml:mo>(</mml:mo>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mi mathvariant="bold-italic">U</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo>)</mml:mo>
<mml:mo></mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right">
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:msqrt>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>(</mml:mo>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mi mathvariant="bold-italic">U</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo>)</mml:mo>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>C</mml:mi>
</mml:msub>
<mml:mo>(</mml:mo>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mi mathvariant="bold-italic">U</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo>)</mml:mo>
<mml:mo></mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>ϵ</mml:mi>
</mml:mrow>
</mml:msqrt>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
<label>(5)</label>
</disp-formula>
up to an irrelevant constant. The surrogate function Eq (5) resulting from these maneuvers separates all of the vectors $\boldsymbol{u}_i$ and $\boldsymbol{v}_{ij}$. The derivative of the surrogate with respect to $\boldsymbol{u}_i$ is
<disp-formula id="pcbi.1004228.e010">
<alternatives>
<graphic xlink:href="pcbi.1004228.e010.jpg" id="pcbi.1004228.e010g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M10">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:mfrac>
<mml:mi></mml:mi>
<mml:mrow>
<mml:mi></mml:mi>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="bold-italic">U</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">U</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:mfrac>
<mml:mi>ρ</mml:mi>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mfrac>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">a</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
</disp-formula>
where $\boldsymbol{a}_{n,i}$ is the part of the projection pertaining to $\boldsymbol{u}_i$. One can explicitly solve for the update
<disp-formula id="pcbi.1004228.e011">
<alternatives>
<graphic xlink:href="pcbi.1004228.e011.jpg" id="pcbi.1004228.e011g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M11">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mfrac>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mrow>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:mi>ρ</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:mfrac>
<mml:mi>ρ</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:mi>ρ</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:msub>
<mml:mi mathvariant="bold-italic">a</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
</disp-formula>
The update of $\boldsymbol{v}_{ij}$ involves shrinkage. Let $\boldsymbol{b}_{n,ij}$ denote the part of the projection pertaining to $\boldsymbol{v}_{ij}$. Standard arguments from convex calculus [16] show that the minimum of $\mu w_{ij} \lVert \boldsymbol{v}_{ij} \rVert + \frac{\rho}{2 d_m} \lVert \boldsymbol{v}_{ij} - \boldsymbol{b}_{n,ij} \rVert^2$ is achieved by
<disp-formula id="pcbi.1004228.e013">
<alternatives>
<graphic xlink:href="pcbi.1004228.e013.jpg" id="pcbi.1004228.e013g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M13">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:msub>
<mml:mi mathvariant="bold-italic">v</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo movablelimits="true" form="prefix">max</mml:mo>
<mml:mo>{</mml:mo>
<mml:mo>(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo></mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>μ</mml:mi>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mi>ρ</mml:mi>
<mml:mo></mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo></mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mo>)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>}</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
<label>(6)</label>
</disp-formula>
In the exceptional case $\boldsymbol{b}_{n,ij} = \boldsymbol{0}$, the solution $\boldsymbol{v}_{n+1,ij} = \boldsymbol{0}$ is clear from inspection of the $\boldsymbol{v}_{ij}$ criterion Eq (6). Both of these solution maps fall under the heading of proximal operators; hence the name proximal distance algorithm.
If a weight $w_{ij} = 0$, then it is computationally inefficient to introduce a difference vector $\boldsymbol{v}_{ij}$. Thus, in many applications the weight matrix $\boldsymbol{W} = (w_{ij})$ may be sparse. The block descent algorithm for projection, which we discuss next, exploits the sparsity pattern of $\boldsymbol{W}$ and consequently employs fewer difference vectors. Let $E$ denote the set of edges $\{i,j\}$ with positive weights $w_{ij} = w_{ji}$. Divide the neighborhood $N_i$ of a node $i$ into left and right node neighborhoods $L_i = \{j < i : w_{ji} > 0\}$ and $R_i = \{j > i : w_{ij} > 0\}$. Clearly $N_i = L_i \cup R_i$, and $E = \cup_{i=1}^{n} N_i$. Projection minimizes the criterion
<disp-formula id="pcbi.1004228.e015">
<alternatives>
<graphic xlink:href="pcbi.1004228.e015.jpg" id="pcbi.1004228.e015g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M15">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo></mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo></mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>}</mml:mo>
<mml:mo></mml:mo>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:munder>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo></mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">v</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo></mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
</disp-formula>
for $\tilde{\boldsymbol{U}}$ and $\tilde{\boldsymbol{V}}$ given. One can minimize this criterion by equating its derivative with respect to $\boldsymbol{u}_i$ to $\boldsymbol{0}$. It is unclear how to massage the stationarity equation
<disp-formula id="pcbi.1004228.e018">
<alternatives>
<graphic xlink:href="pcbi.1004228.e018.jpg" id="pcbi.1004228.e018g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M18">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mn mathvariant="bold-italic">0</mml:mn>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">v</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo></mml:mo>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">v</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
</disp-formula>
into a solvable form. However, the block updates
<disp-formula id="pcbi.1004228.e019">
<alternatives>
<graphic xlink:href="pcbi.1004228.e019.jpg" id="pcbi.1004228.e019g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M19">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>+</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">v</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo></mml:mo>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">v</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
</disp-formula>
are available. Here $\lvert N_i \rvert$ denotes the cardinality of $N_i$. One cycle of the block descent algorithm updates $\boldsymbol{u}_1$ through $\boldsymbol{u}_n$ sequentially. This cycle is repeated until all of the vectors $\boldsymbol{u}_i$ stabilize. Once convergence is achieved, one sets $\boldsymbol{v}_{ij} = \boldsymbol{u}_i - \boldsymbol{u}_j$ for the relevant pairs.
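The cycle is easy to code. Below is a minimal R sketch of one Gauss-Seidel pass, with a data layout of our own choosing: Ut holds the rows $\tilde{\boldsymbol{u}}_i$, Vt is a list of vectors keyed by the string "i,j", and L and R list each node's left and right neighbors.

```r
# One cycle of block descent for the projection step. Because U is
# updated in place, each u_i uses the freshest values of its neighbors.
block_descent_cycle <- function(U, Ut, Vt, L, R) {
  for (i in seq_len(nrow(U))) {
    acc <- Ut[i, ]
    for (j in R[[i]]) acc <- acc + Vt[[paste(i, j, sep = ",")]] + U[j, ]
    for (j in L[[i]]) acc <- acc - Vt[[paste(j, i, sep = ",")]] + U[j, ]
    U[i, ] <- acc / (1 + length(L[[i]]) + length(R[[i]]))
  }
  U
}
```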
Missing Data
In general, clustering methods require complete data. Pre-imputation of missing values is one remedy, but it can be sensitive to the model assumptions underlying the chosen imputation method. A better remedy is to change the clustering criterion to reflect missing data directly. It is then straightforward to accommodate missing data in $\boldsymbol{X}$ by another round of majorization. Suppose $\Gamma$ is the set of ordered index pairs $(i,j)$ corresponding to the observed entries $x_{ij}$ of $\boldsymbol{X}$. We now minimize the revised criterion
<disp-formula id="pcbi.1004228.e020">
<alternatives>
<graphic xlink:href="pcbi.1004228.e020.jpg" id="pcbi.1004228.e020g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M20">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>μ</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="bold-italic">U</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo></mml:mo>
<mml:mo>Γ</mml:mo>
</mml:mrow>
</mml:munder>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>u</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>μ</mml:mi>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo><</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
<label>(7)</label>
</disp-formula>
which unfortunately lacks the symmetry of the original problem. To restore the lost symmetry, we invoke the majorization
<disp-formula id="pcbi.1004228.e021">
<alternatives>
<graphic xlink:href="pcbi.1004228.e021.jpg" id="pcbi.1004228.e021g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M21">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo></mml:mo>
<mml:mo>Γ</mml:mo>
</mml:mrow>
</mml:munder>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>u</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mo></mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo></mml:mo>
<mml:mo>Γ</mml:mo>
</mml:mrow>
</mml:munder>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>u</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>)</mml:mo>
<mml:mo></mml:mo>
<mml:mo>Γ</mml:mo>
</mml:mrow>
</mml:munder>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi>u</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>u</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
</disp-formula>
where $u_{mij}$ is a component of $\boldsymbol{U}_m$. In essence, the term $(u_{mij} - u_{ij})^2$ majorizes 0. If the $n \times p$ matrix $\boldsymbol{Y} = (y_{ij})$ has entries $y_{ij} = x_{ij}$ for $(i,j) \in \Gamma$ and $y_{ij} = u_{mij}$ for $(i,j) \notin \Gamma$, then in the minimization step of the proximal distance algorithm we simply minimize the surrogate function
<disp-formula id="pcbi.1004228.e022">
<alternatives>
<graphic xlink:href="pcbi.1004228.e022.jpg" id="pcbi.1004228.e022g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M22">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>μ</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="bold-italic">U</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>μ</mml:mi>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo><</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">v</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
<label>(8)</label>
</disp-formula>
The rest of the proximal distance algorithm remains the same.
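In code the device reduces to a single masking step. A minimal R sketch, assuming the unobserved entries of X (the complement of $\Gamma$) are marked NA:

```r
# Rebuild Y at iteration m by filling unobserved entries of X with the
# current centroid estimates Um; observed entries pass through unchanged.
make_Y <- function(X, Um) {
  miss <- is.na(X)        # (i, j) outside Gamma
  Y <- X
  Y[miss] <- Um[miss]     # y_ij = u_mij off Gamma; y_ij = x_ij on Gamma
  Y
}
```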
Calibration of Weights
The pairwise weight $w_{ij} = w_{ji}$ introduced in the penalty term of Eq (1) determines the importance of similarity between nodes $i$ and $j$. Two principles guide our choice of weights. First, the weight $w_{ij}$ should be inversely related to the distance between the $i$th and $j$th points. This inverse relationship accords with intuition: as $w_{ij}$ increases, the pressure for the $i$th and $j$th centroids to coalesce increases. If the weights $w_{ij}$ are correlated with the similarity of the feature vectors $\boldsymbol{x}_i$ and $\boldsymbol{x}_j$, then the pressure for the centroids of similar points to merge is especially great. Second, the weight matrix $\boldsymbol{W}$ should be sparse. Although small positive weights and zero weights lead to similar clustering paths, the computational advantages of zero weights cannot be ignored.
These observations prompt the following choice of weights. To maintain computational efficiency, it is helpful to focus on the $k$ nearest neighbors of each node. We define the distance $d_{ij}$ between two nodes $i$ and $j$ by the Euclidean norm $\lVert \boldsymbol{x}_i - \boldsymbol{x}_j \rVert$ and write $i \sim_k j$ if $j$ occurs among the $k$ nearest neighbors of $i$ or $i$ occurs among the $k$ nearest neighbors of $j$. Based on these considerations, the weights
<disp-formula id="pcbi.1004228.e023">
<alternatives>
<graphic xlink:href="pcbi.1004228.e023.jpg" id="pcbi.1004228.e023g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M23">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:msub>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mi>i</mml:mi>
<mml:msub>
<mml:mo></mml:mo>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mi>j</mml:mi>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:msub>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mi>ϕ</mml:mi>
<mml:msubsup>
<mml:mi>d</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
<label>(9)</label>
</disp-formula>
are reasonable, where $1_{\{i \sim_k j\}}$ is the indicator function of the event $\{i \sim_k j\}$ and $\phi \ge 0$ is a tuning constant. The case $\phi = 0$ corresponds to uniform weights between nearest neighbors. When $\phi$ is positive, $w_{ij}$ strictly decreases as a function of $d_{ij}$. Complete coalescence of the nodes occurs as $\mu$ increases provided the graph defined by the $w_{ij}$ is connected. Using squared distances $d_{ij}^2$ rather than distances $d_{ij}$ induces more aggressive coalescence of nearby points and slower coalescence of distant points. In practice we normalize the weights so that they sum to 1; this harmless tactic is equivalent to rescaling $\mu$. This generic framework was proposed by [3].
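A minimal R sketch of Eq (9) follows. It assumes the data points are distinct, so that the smallest entry in each row of the distance matrix is the self-distance.

```r
# Symmetric k-nearest-neighbor weights with a Gaussian kernel on the
# squared Euclidean distance, normalized to sum to 1 as in the text.
knn_weights <- function(X, k, phi) {
  D <- as.matrix(dist(X))
  n <- nrow(D)
  W <- matrix(0, n, n)
  for (i in seq_len(n)) {
    nbr <- order(D[i, ])[2:(k + 1)]        # skip self at position 1
    W[i, nbr] <- exp(-phi * D[i, nbr]^2)   # e^{-phi d_ij^2}
  }
  W <- pmax(W, t(W))                       # i ~_k j if either is a neighbor
  W / sum(W[upper.tri(W)])                 # rescale so the w_ij sum to 1
}
```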
We now discuss a strategy for leveraging additional information. When expert knowledge of the relationships among nodes is available and can be quantified, incorporating it may improve the clustering path. This must be done delicately so that prior information does not overwhelm the observed data. If $\boldsymbol{x}_i$ and $\boldsymbol{y}_i$ store the genotypes and GPS (global positioning system) coordinates of subject $i$, respectively, then the weighted average
<disp-formula id="pcbi.1004228.e025">
<alternatives>
<graphic xlink:href="pcbi.1004228.e025.jpg" id="pcbi.1004228.e025g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M25">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
<mml:mtd>
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mrow>
<mml:mi>α</mml:mi>
<mml:mo></mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo></mml:mo>
<mml:mi>α</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo></mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">y</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>,</mml:mo>
<mml:mspace width="1.em"></mml:mspace>
<mml:mspace width="1.em"></mml:mspace>
<mml:mi>α</mml:mi>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
<label>(10)</label>
</disp-formula>
serves as a composite distance helpful in clustering subjects. In Eq (10) observe that the components of the difference $\boldsymbol{y}_i - \boldsymbol{y}_j$ must be computed in modulo arithmetic. Given a proper choice of the scaling constant $\alpha$, an even better alternative replaces $\lVert \boldsymbol{y}_i - \boldsymbol{y}_j \rVert$ by the geodesic distance between $i$ and $j$. One could reverse the roles of the vector pairs $\boldsymbol{y}_i$ and $\boldsymbol{x}_i$, but it seems to us that genotype similarity rather than physical proximity should be the primary driver of clustering. GPS coordinates are less informative, crudely estimated, and shared across many cases.
Evaluation of Clusters
Our program convexcluster minimizes the penalized loss Eq (2) for a range of user-specified $\mu$ values. For each $\mu$ the optimized matrix $\boldsymbol{U}$ of cluster centers is stored in a temporary file for later construction of the cluster path. To facilitate visualization, convexcluster encourages users to project the cluster path onto any two principal components of the original data. The first example of Section 1 relies on the classical Iris data of discriminant analysis [4]. This dataset contains 150 cases spread over three species and can be downloaded from the UCI machine learning repository [17]. For purposes of comparison, we also evaluated the clusters formed by agglomerative hierarchical clustering. In contrast to convex clustering, hierarchical clustering results are usually visualized via dendrograms. Hierarchical clustering comes in several flavors; we chose UPGMA (Unweighted Pair Group Method with Arithmetic Mean) [18] as implemented in the R function hclust. Although hclust offers six other options for merging clusters, UPGMA is probably the most reliable in reducing the detrimental effects of outliers since it averages information across all cluster members. UPGMA operates on a matrix of pairwise distances between nodes. In our genetics examples, we take these to be the composite distances defined by Eq (10). To make a fair comparison between convex and hierarchical clustering, we invoke the composite distance in both methods. We also present results graphically by projecting cluster paths onto the first two principal components of the genetic data in Examples 1 and 2 and the expression data in the last example. To generate a cluster path for hierarchical clustering, we set each fusion node on the tree to the average of the values of its descendant leaves.
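For reference, the R fragment below reproduces this construction of a hierarchical cluster path; D (the composite distance matrix) and X (the data matrix) are assumed given.

```r
# UPGMA tree plus fusion-node centers: each merge is assigned the average
# of its descendant leaves, tracing a path comparable to convex clustering.
hc <- hclust(as.dist(D), method = "average")   # UPGMA
fusion_centers <- matrix(NA, nrow(X) - 1, ncol(X))
members <- vector("list", nrow(X) - 1)
for (m in seq_len(nrow(X) - 1)) {
  leaves <- unlist(lapply(hc$merge[m, ], function(v)
    if (v < 0) -v else members[[v]]))          # negative entries are leaves
  members[[m]] <- leaves
  fusion_centers[m, ] <- colMeans(X[leaves, , drop = FALSE])
}
```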
Results

Guidance on Selecting Constants k and ϕ
In computing pairwise weights, one is immediately confronted with the question of how to select the constants $k$ (the number of nearest neighbors) and $\phi$ (the strength of the soft-thresholding effect). The answer depends on one's research goals. Unlike supervised learning tasks such as classification, clustering is inherently exploratory. In practice one usually looks for coarse-level relationships among the data points before drilling down within coarse clusters to look for fine-level relationships. In hierarchical clustering, different levels of granularity can be explored by drawing a line bisecting all branches at a given level of the tree. Our recommendation for convex clustering is to begin with large values of $k$ and then examine the patterns revealed as $k$ is progressively reduced. All points eventually coalesce into a single cluster provided $k$ exceeds a particular threshold, which is determined by the separation of the nodes.
To get a sense of the impact of the constants $k$ and $\phi$ on the Iris data, we generated cluster paths for various pairs $(k, \phi)$. As Fig 2 illustrates, $k$ quantifies the connectivity of the underlying graph. Eventual coalescence only occurs for $k = 50$; even then the apparent Iris-versicolor outlier does not coalesce until very late. All values of $k$ support a clear separation of Iris-setosa from the other two species, Iris-versicolor and Iris-virginica. Separation of Iris-versicolor and Iris-virginica into two different groups becomes discernible at $k = 20$. Subgroups within each species are evident for $k = 5$ and $k = 2$. Improved resolution comes at a price: the two small two-member clusters seen in the top right corner of the main Iris-versicolor cluster never fully coalesce with the main cluster when $k = 2$. The distance tuning constant $\phi$ also exerts a subtle influence along each row of Fig 2. This influence is felt more strongly for low values of $k$. For example, for $k = 2$ and $k = 5$ with $\phi = 4$, the two green points at the bottom left of the cluster graph coalesce much later than when $\phi$ is set to smaller values. Examination of the Iris data suggests exploring cluster granularity over a range of $k$ values with $\phi$ set to 0. One can find the minimum $k$ ensuring full connectivity by combining bisection with either breadth-first search or depth-first search [19]. Once the desired granularity is achieved, $\phi$ can be increased to reveal more subtle details. Note that increasing $\phi$ sends most weights between $k$ nearest neighbors toward 0. As previously noted, the proximal distance algorithm takes substantially more iterations to converge for large values of $\phi$.
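The search for the minimum connecting $k$ is cheap. The R sketch below combines bisection with breadth-first search; it exploits the fact that connectivity of the symmetric k-nearest-neighbor graph is monotone in $k$, since edges are only added as $k$ grows.

```r
# Breadth-first search from node 1 over a logical adjacency matrix.
is_connected <- function(A) {
  n <- nrow(A); seen <- rep(FALSE, n); queue <- 1; seen[1] <- TRUE
  while (length(queue) > 0) {
    v <- queue[1]; queue <- queue[-1]
    nbr <- which(A[v, ] & !seen)
    seen[nbr] <- TRUE; queue <- c(queue, nbr)
  }
  all(seen)
}

# Bisection over k for the smallest k giving a connected k-NN graph.
min_k_connected <- function(X) {
  D <- as.matrix(dist(X)); n <- nrow(D)
  knn_graph <- function(k) {
    A <- matrix(FALSE, n, n)
    for (i in seq_len(n)) A[i, order(D[i, ])[2:(k + 1)]] <- TRUE
    A | t(A)
  }
  lo <- 1; hi <- n - 1
  while (lo < hi) {
    mid <- (lo + hi) %/% 2
    if (is_connected(knn_graph(mid))) hi <- mid else lo <- mid + 1
  }
  lo
}
```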
<fig id="pcbi.1004228.g002" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g002</object-id>
<label>Fig 2</label>
<caption>
<title>Effects of the parameters
<italic>k</italic>
and
<italic>ϕ</italic>
on cluster paths in the Iris data.</title>
<p>Black, red, and green points denote the species Iris-setosa, Iris-versicolor, and Iris-virginica, respectively. These points are projections of the Iris dataset on the first two principal components (PCs). Lines trace the cluster centers as they traverse the regularization path. The subtle impact of
<italic>ϕ</italic>
is revealed in two cases. At
<italic>k</italic>
= 50, a red dot coalesces with the right cluster at
<italic>ϕ</italic>
= 0, but with the left cluster for larger values of
<italic>ϕ</italic>
. At
<italic>k</italic>
= 5 or
<italic>k</italic>
= 10, the two green dots at the extreme lower left corner coalesce later at the largest value of
<italic>ϕ</italic>
.</p>
</caption>
<graphic xlink:href="pcbi.1004228.g002"></graphic>
</fig>
As the Iris data illustrate, cluster inference is robust over a wide range of $k$ values. Across all four rows of Fig 2, we would have learned that there are two major classes of Iris even if the points had been plotted in a single color. By decreasing $k$, we were able to discern relationships within the two classes. The figure also shows that the parameter $\phi$ is less critical than $k$. Note, however, that for low values of $k$, better resolution is achieved by increasing $\phi$ from 0.
Cluster Accuracy in the Presence of Noise
Although agglomerative hierarchical clustering is computationally efficient, it is greedy, and greedy algorithms tend to produce suboptimal solutions [1]. In particular, hierarchical clustering can falter in the face of noisy data. To test this hypothesis, we simulated new data from the Iris data. In creating a dataset, we perturbed each row of the data matrix $\boldsymbol{X}$ by adding normal deviates with mean 0 and standard deviation equal to the sample standard deviation $s$ of the corresponding feature multiplied by a constant $c$. We then clustered the data points into three clusters and quantitatively evaluated clustering performance through normalized Rand indices [20]. For convex clustering, visual inspection of the converged clustering paths reveals roughly three major clusters for values of $k$ between 5 and 15. With hierarchical clustering, three clusters were constructed by choosing a cut point on the full tree intersecting three branches. Table 1 summarizes Rand indices averaged over 100 replicates under the two methods. Larger values of the Rand index represent higher accuracy; the maximum value of 1 indicates error-free clustering. Examination of the table suggests that convex clustering is indeed more accurate in the face of noise over a wide range of $k$ values.
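A minimal R sketch of one replicate of the hierarchical arm of this simulation follows (the convex arm runs through the external convexcluster program). We take the adjusted Rand index from the mclust package as one common implementation of a normalized index.

```r
library(mclust)  # for adjustedRandIndex

# Perturb each feature with N(0, (c * s_j)^2) noise, where s_j is that
# feature's sample standard deviation, then score a 3-cluster UPGMA cut.
noisy_rand <- function(X, labels, c) {
  s <- apply(X, 2, sd)
  noise <- sweep(matrix(rnorm(length(X)), nrow(X), ncol(X)), 2, c * s, "*")
  cl <- cutree(hclust(dist(X + noise), method = "average"), k = 3)
  adjustedRandIndex(cl, labels)
}
```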
<table-wrap id="pcbi.1004228.t001" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.t001</object-id>
<label>Table 1</label>
<caption>
<title>Avg Rand indices (RI) as a function of noise in the Iris data.</title>
</caption>
<alternatives>
<graphic id="pcbi.1004228.t001g" xlink:href="pcbi.1004228.t001"></graphic>
<table frame="box" rules="all" border="0">
<colgroup span="1">
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
</colgroup>
<thead>
<tr>
<th align="left" rowspan="2" colspan="1">Noise level
<italic>c</italic>
</th>
<th align="left" rowspan="1" colspan="1">
<sc>hclust</sc>
</th>
<th align="center" colspan="3" rowspan="1">
<sc>convexcluster</sc>
</th>
</tr>
<tr>
<th align="left" rowspan="1" colspan="1">UPGMA RI</th>
<th align="left" rowspan="1" colspan="1">k = 5 RI</th>
<th align="left" rowspan="1" colspan="1">k = 10 RI</th>
<th align="left" rowspan="1" colspan="1">k = 15 RI</th>
</tr>
</thead>
<tbody>
<tr>
<td align="char" char="char" rowspan="1" colspan="1">0.02</td>
<td align="left" rowspan="1" colspan="1">.83(.05)</td>
<td align="left" rowspan="1" colspan="1">.88(.03)</td>
<td align="left" rowspan="1" colspan="1">.89(.01)</td>
<td align="left" rowspan="1" colspan="1">.89(.02)</td>
</tr>
<tr>
<td align="char" char="char" rowspan="1" colspan="1">0.04</td>
<td align="left" rowspan="1" colspan="1">.83(.05)</td>
<td align="left" rowspan="1" colspan="1">.88(.03)</td>
<td align="left" rowspan="1" colspan="1">.88(.02)</td>
<td align="left" rowspan="1" colspan="1">.88(.03)</td>
</tr>
<tr>
<td align="char" char="char" rowspan="1" colspan="1">0.06</td>
<td align="left" rowspan="1" colspan="1">.83(.05)</td>
<td align="left" rowspan="1" colspan="1">.88(.03)</td>
<td align="left" rowspan="1" colspan="1">.88(.03)</td>
<td align="left" rowspan="1" colspan="1">.88(.03)</td>
</tr>
<tr>
<td align="char" char="char" rowspan="1" colspan="1">0.08</td>
<td align="left" rowspan="1" colspan="1">.82(.05)</td>
<td align="left" rowspan="1" colspan="1">.88(.04)</td>
<td align="left" rowspan="1" colspan="1">.88(.03)</td>
<td align="left" rowspan="1" colspan="1">.87(.03)</td>
</tr>
<tr>
<td align="char" char="char" rowspan="1" colspan="1">0.10</td>
<td align="left" rowspan="1" colspan="1">.82(.05)</td>
<td align="left" rowspan="1" colspan="1">.87(.04)</td>
<td align="left" rowspan="1" colspan="1">.87(.04)</td>
<td align="left" rowspan="1" colspan="1">.86(.04)</td>
</tr>
</tbody>
</table>
</alternatives>
<table-wrap-foot>
<fn id="t001fn001">
<p>Standard deviations in parentheses. For computational efficiency,
<italic>ϕ</italic>
was set to zero for convex clustering.</p>
</fn>
</table-wrap-foot>
</table-wrap>
Cluster Accuracy with Missing Values
We carried out a second simulation study on the Iris data to assess the accuracy of cluster inference as a function of missingness. Because the Iris data include only four features (the lengths and widths of sepals and petals), simply selecting entries of the data matrix at random can leave some cases with no data. To avoid these degeneracies, we randomly selected cases and then deleted a random feature from each selected case. Given case selection rates of 25%, 50%, 75%, and 100%, the proportion of missing observations consequently ranged from 5% to 25%. Hierarchical clustering with missing data requires either that cases with missing entries be omitted or that missing entries be imputed. We employed the second strategy, filling in missing entries by multiple imputation as implemented in the R package mi [21]. Hierarchical clustering was then applied to the completed data. For convex clustering, we also applied multiple imputation, but solely to compute the convex clustering weights. We then applied convex clustering to the original incomplete data under the objective function Eq (7). Accuracy for each method was estimated in the same manner as in the previous simulations. The Rand indices in Table 2 suggest that convex clustering does indeed outperform hierarchical clustering in the presence of missing data.
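A minimal R sketch of the deletion scheme:

```r
# Select a fraction 'rate' of cases at random and blank one randomly
# chosen feature in each, so no case loses all of its data.
delete_entries <- function(X, rate) {
  rows <- sample(nrow(X), size = round(rate * nrow(X)))
  for (i in rows) X[i, sample(ncol(X), 1)] <- NA
  X
}
```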
<table-wrap id="pcbi.1004228.t002" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.t002</object-id>
<label>Table 2</label>
<caption>
<title>Avg Rand indices (RI) as a function of missingness in the Iris data.</title>
</caption>
<alternatives>
<graphic id="pcbi.1004228.t002g" xlink:href="pcbi.1004228.t002"></graphic>
<table frame="box" rules="all" border="0">
<colgroup span="1">
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
</colgroup>
<thead>
<tr>
<th align="left" rowspan="2" colspan="1">Proportion of rows with a missing attribute
<italic>c</italic>
</th>
<th align="left" rowspan="1" colspan="1">
<sc>hclust</sc>
</th>
<th align="center" colspan="3" rowspan="1">
<sc>convexcluster</sc>
</th>
</tr>
<tr>
<th align="left" rowspan="1" colspan="1">UPGMA RI</th>
<th align="left" rowspan="1" colspan="1">k = 5 RI</th>
<th align="left" rowspan="1" colspan="1">k = 10 RI</th>
<th align="left" rowspan="1" colspan="1">k = 15 RI</th>
</tr>
</thead>
<tbody>
<tr>
<td align="char" char="char" rowspan="1" colspan="1">0.25</td>
<td align="left" rowspan="1" colspan="1">.82(.05)</td>
<td align="left" rowspan="1" colspan="1">.88(.03)</td>
<td align="left" rowspan="1" colspan="1">.88(.03)</td>
<td align="left" rowspan="1" colspan="1">.87(.02)</td>
</tr>
<tr>
<td align="char" char="char" rowspan="1" colspan="1">0.50</td>
<td align="left" rowspan="1" colspan="1">.83(.05)</td>
<td align="left" rowspan="1" colspan="1">.87(.04)</td>
<td align="left" rowspan="1" colspan="1">.86(.03)</td>
<td align="left" rowspan="1" colspan="1">.86(.03)</td>
</tr>
<tr>
<td align="char" char="char" rowspan="1" colspan="1">0.75</td>
<td align="left" rowspan="1" colspan="1">.82(.05)</td>
<td align="left" rowspan="1" colspan="1">.86(.05)</td>
<td align="left" rowspan="1" colspan="1">.85(.04)</td>
<td align="left" rowspan="1" colspan="1">.86(.04)</td>
</tr>
<tr>
<td align="char" char="char" rowspan="1" colspan="1">1.00</td>
<td align="left" rowspan="1" colspan="1">.82(.04)</td>
<td align="left" rowspan="1" colspan="1">.86(.05)</td>
<td align="left" rowspan="1" colspan="1">.84(.05)</td>
<td align="left" rowspan="1" colspan="1">.85(.04)</td>
</tr>
</tbody>
</table>
</alternatives>
<table-wrap-foot>
<fn id="t002fn001">
<p>Standard deviations in parentheses. For computational efficiency,
<italic>ϕ</italic>
was set to zero for convex clustering.</p>
</fn>
</table-wrap-foot>
</table-wrap>
Inference of Ethnicity
As genotyping costs have dropped in recent years, it has become straightforward to relate ethnicity to subtle genetic variation. Several software tools are now available for this purpose. For example, the programs structure [22] and admixture [23] estimate a subject's admixture proportions across a set of predefined or inferred ancestral populations, while eigenstrat [24] employs a handful of principal components to explain ethnic variation. Principal component analysis (PCA) is attractive for its speed and ease of visualization. Clustering can also separate subjects by ethnicity if individuals of mixed ethnicity are omitted. The advantage of convex clustering is that one can follow the dynamic behavior of the relationship clusters along the regularization path. In the next two examples on population structure, the data consist of multi-dimensional genotypes. We project our convex clustering paths onto the first two principal components of the data. This produces plots in which population substructure aligns with geographic regions of origin.
<sec id="sec012">
<title>World-wide genetic diversity</title>
For a practical demonstration of convex clustering, we now turn to the Human Genome Diversity Project (HGDP). This collaboration makes publicly available several datasets that vary in marker type (SNPs versus microsatellites) and sample size. The HGDP 2002 dataset considered here includes 1,056 individuals from 52 populations genotyped at 377 autosomal microsatellites [25]. Care must be taken in analyzing microsatellites since, in contrast to SNPs, they display more alleles and greater levels of polymorphism. Recall that an allele at a microsatellite approximates the number of short tandem repeats of some simple motif. Because treating microsatellite genotypes as continuous variables is problematic, we encode each microsatellite genotype as a sequence of allele counts. Each count ranges from 0 to 2, and there are as many count variables as alleles. This encoding yields a revised 2002 dataset with the 377 microsatellite genotypes expanded to 4,682 different attributes.
<p>As expected, these data exhibit clines in allele frequencies [
<xref rid="pcbi.1004228.ref026" ref-type="bibr">26</xref>
]. To take advantage of the correlation between geographic separation and ethnic similarity, we defined penalty weights
<italic>w</italic>
<sub>
<italic>ij</italic>
</sub>
according to the composite distance in
<xref ref-type="disp-formula" rid="pcbi.1004228.e025">Eq (10)</xref>
with constant
<italic>α</italic>
= 0.5. We chose this value of
<italic>α</italic>
to give equal weight to both sources of information. Results for other values of
<italic>α</italic>
∈ (0, 1) are similar. We progressively reduced
<italic>k</italic>
from a large value such as 10 until we could observe separation of the seven major continental groups. Variations in
<italic>ϕ</italic>
make no discernible differences in the analysis of these data.
<xref ref-type="fig" rid="pcbi.1004228.g003">Fig 3</xref>
plots cluster paths for these data given the settings
<italic>ϕ</italic>
= 1 and
<italic>k</italic>
= 4. With
<italic>k</italic>
= 4 nearest neighbors, we observe broad-scale clustering events that link up the major continental groups. In the north, Europeans fall into a single cluster, later joined by populations from the Middle East. In the east the Chinese merge into a cluster that subsequently merges with two Oceania populations from New Guinea. This mega cluster then merges with various Central Asian populations of predominantly Pakistani origin. In the west five Central/South American populations cluster, and in the south six African populations cluster. Considering the continental clusters in the figure, the American cluster (red points) and the Central/East Asian cluster (green points) are linked by a straight line, while the northern (turquoise, green, and magenta points) and southern continental clusters (black points) appear to fuse at a point just below this straight line. This accords with known links between East Asians and American Indians, who crossed the Bering strait, possibly multiple times, during the Ice Age [
<xref rid="pcbi.1004228.ref027" ref-type="bibr">27</xref>
].
<xref ref-type="fig" rid="pcbi.1004228.g004">Fig 4</xref>
presents the output of hierarchical clustering, where datapoints and their fused (averaged) values are projected onto the same coordinates as the convex clustering results. Although the two methods give fairly consistent plots, there is a striking difference in how the African San population is treated: in hierarchical clustering it coalesces to the origin as a single outlier continental region. The Central Asian groups also appear more closely related to Europeans under hierarchical clustering. For convex clustering,
<xref ref-type="fig" rid="pcbi.1004228.g005">Fig 5</xref>
depicts finer-grained events exposed by setting
<italic>k</italic>
= 1. Along the western axis, taking
<italic>k</italic>
= 1 is uninformative, but among the African populations along the southern axis, we observe three major clusters: a two-member cluster representing the two Pygmy sub-groups; a three-member cluster comprising Bantu-speaking peoples from Kenya, Yorubans from Nigeria, and Mandenkas from Senegal; and finally a singleton cluster for the San from Namibia. These results are consistent with a recent phylogenetic study [
<xref rid="pcbi.1004228.ref028" ref-type="bibr">28</xref>
] that found the San to be the most isolated of the African populations, followed by the two Pygmy populations, and finally the three Bantu-language populations. Along the eastern axis, the two Papua New Guinea populations cluster together and do not join the remaining Asian populations.</p>
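<p>The weight construction can be sketched as follows. Eq (10) itself is not reproduced in this section, so the convex combination below, and the Gaussian k-nearest-neighbor kernel, are assumptions modeled on the text's description of the roles of α, ϕ, and k:</p>
<preformat>
# A hedged sketch of the penalty weights: the composite distance is assumed
# to be d = alpha * d_genetic + (1 - alpha) * d_geo (per the text's reading
# of Eq (10)), with w_ij = exp(-phi * d_ij**2) kept only between k-nearest
# neighbors and zeroed elsewhere, which sparsifies the fusion penalty.
import numpy as np

def knn_weights(d_genetic, d_geo, alpha=0.5, phi=1.0, k=4):
    d = alpha * d_genetic + (1 - alpha) * d_geo   # composite distance matrix
    W = np.exp(-phi * d ** 2)
    np.fill_diagonal(W, 0.0)
    keep = np.zeros_like(W, dtype=bool)
    for i in range(d.shape[0]):
        # rank 0 is the subject itself (zero self-distance), so skip it
        keep[i, np.argsort(d[i])[1:k + 1]] = True
    return W * (keep | keep.T)                    # symmetrize the graph

d = np.array([[0., 1., 4.], [1., 0., 2.], [4., 2., 0.]])
print(knn_weights(d, d, k=1))
</preformat>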
<fig id="pcbi.1004228.g003" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g003</object-id>
<label>Fig 3</label>
<caption>
<title>Convex clustering of the HGDP data using a large number of nearest neighbors to infer intercontinental connections (
<italic>k</italic>
= 4,
<italic>ϕ</italic>
= 1).</title>
</caption>
<graphic xlink:href="pcbi.1004228.g003"></graphic>
</fig>
<fig id="pcbi.1004228.g004" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g004</object-id>
<label>Fig 4</label>
<caption>
<title>Hierarchical clustering of the 52 populations from the HGDP data.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g004"></graphic>
</fig>
<fig id="pcbi.1004228.g005" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g005</object-id>
<label>Fig 5</label>
<caption>
<title>Convex clustering of the HGDP data using a small number
<italic>k</italic>
of nearest neighbors to resolve intracontinental connections (
<italic>k</italic>
= 1,
<italic>ϕ</italic>
= 1).</title>
</caption>
<graphic xlink:href="pcbi.1004228.g005"></graphic>
</fig>
<p>Figs
<xref ref-type="fig" rid="pcbi.1004228.g006">6</xref>
and
<xref ref-type="fig" rid="pcbi.1004228.g007">7</xref>
focus on related populations along the eastern axis (East Asia) and the northern axis (Europe and Central Asia), respectively. Most of the Chinese populations along the eastern axis appear to coalesce simultaneously. Some of the other populations along the northern border of China coalesce earlier. The Hezhen and Oroqen peoples reside predominantly in the Heilongjiang province of northeast China [
<xref rid="pcbi.1004228.ref029" ref-type="bibr">29</xref>
,
<xref rid="pcbi.1004228.ref030" ref-type="bibr">30</xref>
]. These two populations cluster early with the Inner Mongolians and the Xibo population, who occupy northeast China and the northwest region of Xinjiang province. Three distinct clusters of Middle Easterners, Central Asians, and Europeans occur along the northern axis. All European populations except for the Russian populations are grouped into a single cluster. The two Russian populations instead merge with a second cluster that includes three populations from Israel. The Mozabites, who coalesce late with this cluster, exhibit high frequencies of North African haplotypes as previously noted in the literature [
<xref rid="pcbi.1004228.ref031" ref-type="bibr">31</xref>
,
<xref rid="pcbi.1004228.ref032" ref-type="bibr">32</xref>
]. A third cluster within Central Asia unites Pakistani populations with Uygurs from China. Within this cluster, the Brahui, Balochi, and Makran populations of the Baluchistan province of northwestern Pakistan coalesce early with the Sindhi people of the Sindh province on the eastern border of Baluchistan. Later-coalescing populations include the Hazara, Uygurs, and Kalash. The Hazaras of Pakistan and the Uygurs of China share common Mongolian and Turkic ancestry and some physical attributes [
<xref rid="pcbi.1004228.ref033" ref-type="bibr">33</xref>
,
<xref rid="pcbi.1004228.ref034" ref-type="bibr">34</xref>
]. In contrast, hierarchical clustering suggests a more distant relationship between these two ethnic groups (
<xref ref-type="fig" rid="pcbi.1004228.g008">Fig 8</xref>
). A previous admixture analysis carried out on high-density SNP data via
<sc>structure</sc>
[
<xref rid="pcbi.1004228.ref022" ref-type="bibr">22</xref>
] supports our observation that the Kalash people constitute a single distinct cluster, one of seven clusters separating all of the populations covered in the HGDP data [
<xref rid="pcbi.1004228.ref031" ref-type="bibr">31</xref>
].</p>
<fig id="pcbi.1004228.g006" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g006</object-id>
<label>Fig 6</label>
<caption>
<title>Magnified view of the convex clustering results for the HGDP data in East Asia.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g006"></graphic>
</fig>
<fig id="pcbi.1004228.g007" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g007</object-id>
<label>Fig 7</label>
<caption>
<title>Magnified view of the convex clustering results for the HGDP data in Europe and Central Asia.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g007"></graphic>
</fig>
<fig id="pcbi.1004228.g008" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g008</object-id>
<label>Fig 8</label>
<caption>
<title>Magnified view of the hierarchical clustering results for the HGDP data in Europe and Central Asia.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g008"></graphic>
</fig>
</sec>
<sec id="sec013">
<title>Population structure of Europe</title>
<p>We next investigate whether convex clustering can glean further insights into the population structure of Europe. The POPRES resource archives high-density genotypes generated on the Illumina 550k microarray platform [
<xref rid="pcbi.1004228.ref035" ref-type="bibr">35</xref>
]. Version 2 of POPRES contains genotype and phenotype data on 4,077 subjects genotyped across 457,297 SNPs. For this analysis, we include only non-admixed Europeans who report all four grandparents of the same ethnicity. This leaves 1,896 subjects. SNP data presents advantages and disadvantages compared to microsatellite data. Dense marker panels may be more sensitive to subtle differences driven by population events such as migration, expansion, and bottlenecks [
<xref rid="pcbi.1004228.ref036" ref-type="bibr">36</xref>
]. Challenges include the lower information content of biallelic markers and the correlations between markers caused by linkage disequilibrium (LD). After considerable experimentation, we found that the leading principal components offered more insight into population structure than the raw genotypes themselves. We employed
<sc>eigenstrat</sc>
to extract the ten leading principal components from the genotype matrix.
<sc>eigenstrat</sc>
prunes SNPs in LD with
<italic>r</italic>
<sup>2</sup>
exceeding a user-specified threshold [
<xref rid="pcbi.1004228.ref024" ref-type="bibr">24</xref>
]. In our case the threshold 0.8 discards all but 276,823 nearly independent SNPs. Our choice of the composite distance defined in
<xref ref-type="disp-formula" rid="pcbi.1004228.e025">Eq (10)</xref>
places equal weight (
<italic>α</italic>
= 0.5) on genetic distances and GPS distances between the capital cities of participants. To ease visualization, our figures display a maximum of 20 subjects from each ethnicity, for a total of 370 subjects. The computed convex clustering path is projected onto the first two principal components of the POPRES data; these components capture geographic east-west and north-south axes, respectively.</p>
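<p>The eigenstrat internals are not reproduced here; as a generic stand-in, the principal-component scores used for clustering and plotting can be obtained from a centered genotype matrix by a singular value decomposition, as in this sketch:</p>
<preformat>
# A generic sketch (not eigenstrat itself) of extracting leading principal
# components from a subjects-by-SNPs matrix: center each SNP column, take
# the economy SVD X = U S V^T, and use U S as the component scores.
import numpy as np

def leading_pcs(X, n_components=10):
    Xc = X - X.mean(axis=0)                       # center each SNP column
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(50, 200)).astype(float)
print(leading_pcs(genotypes).shape)   # (50, 10)
</preformat>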
<p>In the Iris and the HGDP datasets, the number of nearest neighbors
<italic>k</italic>
was more critical in resolving cluster evolution than the tuning constant
<italic>ϕ</italic>
. In the European POPRES data, where inter-class differences are more subtle, increasing
<italic>ϕ</italic>
can be critical in resolving details for
<italic>k</italic>
large. As in the previous examples, we gradually reduced
<italic>k</italic>
from a large value until major clusters along the northwest, northeast, and southeast geographic axes emerged.
<xref ref-type="fig" rid="pcbi.1004228.g009">Fig 9</xref>
depicts a clustering path with
<italic>k</italic>
= 40 neighbors and
<italic>ϕ</italic>
= 0. Increasing
<italic>ϕ</italic>
to 10 gives a similar clustering pattern, except that each of the major trunks coalesces before converging to the origin. Thus,
<xref ref-type="fig" rid="pcbi.1004228.g010">Fig 10</xref>
shows several major clusters connected by six major trunks. Spain and Portugal constitute a major cluster in the southwest trunk. The southeast trunk includes Italy and southeast Europe; these populations eventually merge into a single cluster. The northeast trunk defines a cluster that includes Poland, Russia, Ukraine, the Czech Republic, Hungary, and Slovenia. Norway, Sweden, and Germany cluster along the northern trunk, and the British Isles merge with Belgium and the Netherlands to form the northwest trunk. A large cluster comprising France and the Swiss linguistic groups (French, German, and Italian) constitutes the western trunk. Hierarchical clustering for the most part recapitulates these major clusters, but they are less discernible (
<xref ref-type="fig" rid="pcbi.1004228.g011">Fig 11</xref>
). Replotting the clustering path from convex clustering with
<italic>ϕ</italic>
= 1 and
<italic>k</italic>
= 3 shows Norway and Sweden breaking away from Germany and forming their own disjoint cluster (
<xref ref-type="fig" rid="pcbi.1004228.g012">Fig 12</xref>
). France breaks away from the Swiss groups to form its own disjoint cluster. Along the south trunk, Italy now separates from southeast Europe and eventually clusters with the Swiss-Italians.</p>
<fig id="pcbi.1004228.g009" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g009</object-id>
<label>Fig 9</label>
<caption>
<title>Convex clustering of the European populations from the POPRES data using
<italic>ϕ</italic>
= 0 and
<italic>k</italic>
= 40.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g009"></graphic>
</fig>
<fig id="pcbi.1004228.g010" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g010</object-id>
<label>Fig 10</label>
<caption>
<title>Convex clustering of the European populations from the POPRES data using
<italic>ϕ</italic>
= 10 and
<italic>k</italic>
= 40.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g010"></graphic>
</fig>
<fig id="pcbi.1004228.g011" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g011</object-id>
<label>Fig 11</label>
<caption>
<title>Hierarchical clustering of the European populations from the POPRES data.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g011"></graphic>
</fig>
<fig id="pcbi.1004228.g012" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g012</object-id>
<label>Fig 12</label>
<caption>
<title>Convex clustering of the European populations from the POPRES data using
<italic>ϕ</italic>
= 1 and
<italic>k</italic>
= 3.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g012"></graphic>
</fig>
<p>
<xref ref-type="fig" rid="pcbi.1004228.g013">Fig 13</xref>
depicts the clustering path of southeast Europe, where South Slavic languages predominate. Here Greece first coalesces with Macedonia, a Slavic population bordering Greece on the north. A cluster comprising Bosnia-Herzegovina and Serbia merges with Romania before merging into the primary trunk of southeast Europe. Finally, at the northern end of the trunk, Croatia and Slovenia form their own cluster. The groups in the Bosnia-Herzegovina cluster and the Macedonian cluster are consistent with the recent break-up of Yugoslavia. Poland and Russia cluster in the northernmost branch of the northeast trunk (
<xref ref-type="fig" rid="pcbi.1004228.g014">Fig 14</xref>
). The Czech Republic, Austria, and Hungary define a distinct cluster along the southern branch. Given that Austria conquered Hungary in 1699 and established rule over Bohemia (the predecessor of the modern Czech state) as early as 1526, these results are not surprising.</p>
<fig id="pcbi.1004228.g013" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g013</object-id>
<label>Fig 13</label>
<caption>
<title>Magnified view of results from convex clustering of Southeast Europe.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g013"></graphic>
</fig>
<fig id="pcbi.1004228.g014" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g014</object-id>
<label>Fig 14</label>
<caption>
<title>Magnified view of results from convex clustering of Northeast Europe.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g014"></graphic>
</fig>
<p>In the POPRES data, convex clustering and hierarchical clustering occasionally disagree. For example, hierarchical clustering merges the Netherlands and Belgium with Britain before it merges Britain with Ireland and Scotland (
<xref ref-type="fig" rid="pcbi.1004228.g015">Fig 15</xref>
). In light of the geography and history of Britain, it is reasonable to expect Britain to merge first with Scotland and Ireland. Convex clustering yields precisely this expected effect (
<xref ref-type="fig" rid="pcbi.1004228.g016">Fig 16</xref>
). The British-Scotland-Ireland cluster then merges with the neighboring cluster of Belgium and the Netherlands. Owing to a few outliers, the greedy nature of hierarchical clustering appears to force a spurious coalescence, which cannot be repaired until later. Another discrepancy occurs in clustering the Swiss linguistic groups. Convex clustering first groups the Swiss-German, Swiss-French, and Swiss-Italian into a single Swiss cluster (
<xref ref-type="fig" rid="pcbi.1004228.g017">Fig 17</xref>
). Hierarchical clustering groups France with this cluster. At the next higher level, rather than cluster Italy with the Swiss, hierarchical clustering merges it with Greece and populations from the former Yugoslavia. Convex clustering, in contrast, merges Italy with the Swiss before joining both to the southeast European trunk. In this case it is unclear which method provides the more accurate solution; given Italy's geographic extent, proximity suggests a closer relationship between southern Italians and Greece, and between northern Italians and the Swiss. Further details on the geographic origins of the POPRES Italian subjects would help resolve this discrepancy.</p>
<fig id="pcbi.1004228.g015" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g015</object-id>
<label>Fig 15</label>
<caption>
<title>Hierarchical clustering projection showing genetic relationships among populations in and near the British Isles.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g015"></graphic>
</fig>
<fig id="pcbi.1004228.g016" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g016</object-id>
<label>Fig 16</label>
<caption>
<title>Convex clustering projection showing genetic relationships among populations in and near the British Isles.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g016"></graphic>
</fig>
<fig id="pcbi.1004228.g017" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g017</object-id>
<label>Fig 17</label>
<caption>
<title>Magnified view of results from convex clustering of Swiss linguistic groups.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g017"></graphic>
</fig>
</sec>
</sec>
<sec id="sec014">
<title>Inferring Cancer Subtypes</title>
<p>It is well accepted that cancers of a given tissue often fall into distinct subtypes. In breast cancer, for instance, patients whose tumors are negative for the estrogen receptor (ER) and the epidermal growth factor receptor ErbB2 are less responsive to hormone-based treatment than those possessing active receptors [
<xref rid="pcbi.1004228.ref037" ref-type="bibr">37</xref>
]. High-throughput platforms such as gene-expression microarrays and RNA-Seq have enabled researchers to classify cancer patients based on their molecular phenotypes. Hierarchical clustering by [
<xref rid="pcbi.1004228.ref038" ref-type="bibr">38</xref>
] established five gene-expression profiles across 9,216 genes in 84 breast-cancer patients. Among the 84 patients, only 16 also had a clinical assessment of hormone-receptor status. Here we attempt to determine whether convex and hierarchical clustering can infer clusters consistent with the clinical outcomes for these 16 patients. Under the tuning constants
<italic>ϕ</italic>
= 0.5 and
<italic>k</italic>
= 1, convex clustering recovers two distinct clusters.
<xref ref-type="fig" rid="pcbi.1004228.g018">Fig 18</xref>
projects the cluster centers along the cluster path onto the first and third principal components of the original data. The left and right clusters correspond roughly to ER-positive and ER-negative tumors, respectively. Two ER-negative tumors cluster with the ER-positive tumors.
<xref ref-type="fig" rid="pcbi.1004228.g019">Fig 19</xref>
depicts results from hierarchical clustering. Based on the order of fusion events, hierarchical clustering does not appear to group the tumors into distinct ER-positive and ER-negative groups. This could be an artifact of the hard binary choices imposed by hierarchical clustering. The two ErbB2-positive samples that clustered together under convex clustering appear in distant clusters under hierarchical clustering.</p>
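<p>The mechanics behind the two distinct clusters are worth spelling out: with sparse weights (here k = 1) the weight graph may be disconnected, and the fusion penalty can never merge centers across components. The sketch below reads off the terminal clusters as connected components of the weight graph; the classic algorithm is that of [<xref rid="pcbi.1004228.ref019" ref-type="bibr">19</xref>].</p>
<preformat>
# A sketch of recovering terminal clusters from a sparse weight matrix W:
# the fusion penalty only links points joined by nonzero weights, so each
# connected component of the weight graph remains a separate cluster.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def terminal_clusters(W):
    """W: symmetric n-by-n weights, zero where no edge exists."""
    return connected_components(csr_matrix(W), directed=False)

W = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(terminal_clusters(W))   # (2, array([0, 0, 1, 1]))
</preformat>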
<fig id="pcbi.1004228.g018" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g018</object-id>
<label>Fig 18</label>
<caption>
<title>Convex clustering of the breast cancer samples.</title>
<p>Points on the plot indicate data vectors projected onto the first and third principal components (PCs) of the sample. Lines trace the cluster centers as they traverse the regularization path.</p>
</caption>
<graphic xlink:href="pcbi.1004228.g018"></graphic>
</fig>
<fig id="pcbi.1004228.g019" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.g019</object-id>
<label>Fig 19</label>
<caption>
<title>Average linkage hierarchical clustering of the breast cancer samples.</title>
</caption>
<graphic xlink:href="pcbi.1004228.g019"></graphic>
</fig>
</sec>
<sec id="sec015">
<title>Run-Time Benchmarks</title>
<p>For a dataset with a large number of attributes, parallelization can substantially reduce run times.
<sc>convexcluster</sc>
includes code written in OpenCL, a language designed to run on many-core devices such as GPUs. For each of the three genetic analyses presented above, we recorded the total run-time along the entire regularization path using standard C++ code for the CPU and OpenCL code for the GPU. For the sake of comparison, we also recorded run-times for
<sc>clusterpath</sc>
[
<xref rid="pcbi.1004228.ref003" ref-type="bibr">3</xref>
], an R package that also implements convex clustering, on the same datasets and weighting schemes.
<xref ref-type="table" rid="pcbi.1004228.t003">Table 3</xref>
records the run time to minimize the objective function, averaged over all values of the regularization parameter. We chose this strategy because
<sc>clusterpath</sc>
does not allow users to pre-specify a grid of regularization values. The bottom line is that
<sc>convexcluster</sc>
needed only 16%, 47%, and 75% of the time required by
<sc>clusterpath</sc>
to fit the HGDP, POPRES, and breast cancer datasets, respectively. When a GPU is available, further improvements can be realized. On an nVidia C2050 GPU,
<sc>convexcluster</sc>
runs 4.6 and 5.5 times faster than the CPU version on the HGDP and breast cancer examples. In contrast, on the POPRES example, the GPU version is actually 3.5 times slower than the CPU version. In its current form,
<sc>convexcluster</sc>
reads the updated matrix
<bold>
<italic>U</italic>
</bold>
from the GPUs at each point on the
<italic>μ</italic>
-regularization path before saving the data to disk. This large I/O overhead can overwhelm gains from parallelization for low-dimensional datasets such as the POPRES data. In general, GPU implementations of standard algorithms require a high degree of parallelization, limited data transfers between the master CPU and the slave GPUs, and maximal synchrony of the GPUs. Depending on the nature of the clustering data,
<sc>convexcluster</sc>
satisfies these requirements to varying degrees. It does not for the POPRES data, and computational efficiency suffers in the GPU version.</p>
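<p>Schematically (this is not convexcluster's actual code, and the function arguments are placeholders), the path computation and its per-μ transfer look like the loop below; shrinking what crosses the device boundary at each step is exactly the optimization proposed in the Discussion:</p>
<preformat>
# A schematic of the regularization-path loop: the solver keeps U on the
# GPU, but one full copy of U crosses the device boundary at every grid
# point, which dominates run time when the per-point solve is cheap.
import numpy as np

def path_loop(X, mu_grid, solve_on_device, fetch_from_device):
    snapshots = []
    U = X.copy()                                  # warm start at mu = 0
    for mu in mu_grid:
        U = solve_on_device(U, mu)                # heavy work stays on device
        snapshots.append(fetch_from_device(U))    # per-mu transfer to host
    return snapshots

X = np.zeros((370, 10))                           # a POPRES-sized problem
out = path_loop(X, [0.1, 1.0], lambda U, mu: 0.9 * U, lambda U: U.copy())
print(len(out))   # 2
</preformat>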
<table-wrap id="pcbi.1004228.t003" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1004228.t003</object-id>
<label>Table 3</label>
<caption>
<title>Average runtimes in seconds for different analyses.</title>
</caption>
<alternatives>
<graphic id="pcbi.1004228.t003g" xlink:href="pcbi.1004228.t003"></graphic>
<table frame="box" rules="all" border="0">
<colgroup span="1">
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
<col align="left" valign="top" span="1"></col>
</colgroup>
<thead>
<tr>
<th align="left" rowspan="1" colspan="1">Analysis</th>
<th align="left" rowspan="1" colspan="1">Datapoints</th>
<th align="left" rowspan="1" colspan="1">Variables</th>
<th align="left" rowspan="1" colspan="1">
<sc>clusterpath</sc>
</th>
<th align="center" colspan="2" rowspan="1">
<sc>convexcluster</sc>
</th>
</tr>
<tr>
<th align="left" rowspan="1" colspan="1"></th>
<th align="left" rowspan="1" colspan="1"></th>
<th align="left" rowspan="1" colspan="1"></th>
<th align="left" rowspan="1" colspan="1"></th>
<th align="left" rowspan="1" colspan="1">CPU</th>
<th align="left" rowspan="1" colspan="1">GPU</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">HGDP</td>
<td align="left" rowspan="1" colspan="1">52</td>
<td align="left" rowspan="1" colspan="1">4,682</td>
<td align="char" char="char" rowspan="1" colspan="1">8.67</td>
<td align="char" char="char" rowspan="1" colspan="1">1.46</td>
<td align="char" char="char" rowspan="1" colspan="1">.32</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">POPRES</td>
<td align="left" rowspan="1" colspan="1">370</td>
<td align="left" rowspan="1" colspan="1">10</td>
<td align="char" char="char" rowspan="1" colspan="1">2.53</td>
<td align="char" char="char" rowspan="1" colspan="1">1.21</td>
<td align="char" char="char" rowspan="1" colspan="1">4.29</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Breast Cancer data</td>
<td align="left" rowspan="1" colspan="1">16</td>
<td align="left" rowspan="1" colspan="1">9,216</td>
<td align="char" char="char" rowspan="1" colspan="1">3.14</td>
<td align="char" char="char" rowspan="1" colspan="1">2.37</td>
<td align="char" char="char" rowspan="1" colspan="1">.43</td>
</tr>
</tbody>
</table>
</alternatives>
</table-wrap>
</sec>
</sec>
<sec sec-type="conclusions" id="sec016">
<title>Discussion</title>
<p>The literature on cluster analysis is enormous. Each clustering method offers its own advantages in simplicity, speed, reliability, interpretability, or scalability. If the number of clusters is known in advance, then
<italic>k</italic>
-means clustering is usually preferred. In convex clustering one can often achieve a predetermined number of clusters by varying the number of nearest neighbors and following the solution path to its final destination. Alternatively, if the underlying graph is fully connected, then one can follow the solution path until
<italic>k</italic>
clusters appear. The downside of
<italic>k</italic>
-means clustering is that it offers no insight into cluster similarity. If the goal in clustering is to obtain a snapshot of the relationships among observed data points at different levels of granularity, the choices are limited, and most biologists opt for hierarchical clustering. Hierarchical clustering is notable for its speed and visual appeal. Balanced against these assets are its greedy nature and its sensitivity to noise and outliers. Convex clustering occupies an enviable middle ground between
<italic>k</italic>
-means clustering and hierarchical clustering. Our extensive exploration of the HGDP and POPRES datasets showcases the subtle solution paths of convex clustering. These paths offer considerable insight into population history and correct some of the greedy mistakes of hierarchical clustering.</p>
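<p>The recipe of following the solution path until k clusters appear admits a simple sketch (the tolerance and the single-linkage grouping below are illustrative choices, not the paper's prescription): scan the stored centers in order of increasing μ and stop when the number of numerically fused groups reaches k.</p>
<preformat>
# A sketch of stopping the solution path at k clusters: at each grid point,
# treat centers lying within a small tolerance of one another as fused and
# count the resulting groups.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def n_clusters(U, tol=1e-6):
    """Count groups of numerically fused rows in the center matrix U."""
    return fcluster(linkage(U, method="single"), t=tol,
                    criterion="distance").max()

def stop_at_k(center_path, k):
    for mu, U in center_path:             # path ordered by increasing mu
        if n_clusters(U) <= k:
            return mu, U
    return None

path = [(0.1, np.array([[0., 0.], [1., 1.], [5., 5.]])),
        (1.0, np.array([[0.5, 0.5], [0.5, 0.5], [5., 5.]]))]
print(stop_at_k(path, 2)[0])   # 1.0
</preformat>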
<p>Nonetheless, hierarchical clustering can be the more practical choice when noise is low and a premium is put on computational speed. In the Iris data with no introduced noise, the two methods yield equivalent results. Total runtimes for the convex clustering analyses in this paper ranged from 5 minutes to 30 minutes. In contrast, even for the largest datasets analyzed here, hierarchical clustering required no more than 5 seconds to complete. Our perturbations of the Iris data, however, demonstrate hierarchical clustering's sensitivity to noise, so its speed comes at a price.</p>
<p>Given the novelty of convex clustering [
<xref rid="pcbi.1004228.ref002" ref-type="bibr">2</xref>
], it is hardly surprising that only a few previous programs (
<sc>clusterpath</sc>
[
<xref rid="pcbi.1004228.ref003" ref-type="bibr">3</xref>
] and
<sc>cvxclustr</sc>
[
<xref rid="pcbi.1004228.ref039" ref-type="bibr">39</xref>
]) implement it. Our program is unique in offering a fast implementation when GPU devices are available. These earlier programs perform similarly to ours on modest problems such as the Iris data. Unfortunately, on large datasets such as the HGDP data,
<sc>clusterpath</sc>
depletes all available memory and terminates prematurely. Furthermore,
<sc>clusterpath</sc>
lacks two features that work to the advantage of convex clustering. First, it does not support disconnected graphs defined by sparse weights. In our breast cancer example, clustering with disconnected graphs reveals fine-grained details. Second,
<sc>clusterpath</sc>
does not allow for missing entries in the data matrix. The current paper documents
<sc>convexcluster</sc>
’s ability to scale realistically to dimensions typical of modern genomic data. A combination of careful algorithmic development and exploitation of modern many-core chipsets lies behind
<sc>convexcluster</sc>
. The proximal distance algorithm propelling
<sc>convexcluster</sc>
separates parameters and enables massive parallelization. OpenCL made it relatively easy to implement parallel versions of our original serial code. Further speedups are possible. For instance,
<sc>convexcluster</sc>
spends an inordinate amount of execution time moving matrices over relatively slow I/O channels in preparation for plotting. One could instead project the data onto its leading principal components on the GPU itself prior to data transfer. More recent ATI or nVidia GPUs should improve the speedups on high-dimensional data reported here.</p>
<p>Convex clustering also shows promise as a building block for more sophisticated exploratory tools in computational biology. A companion paper [
<xref rid="pcbi.1004228.ref040" ref-type="bibr">40</xref>
] introduces a convex formulation of the biclustering problem. In biclustering one seeks to cluster the observations and the features of a data matrix simultaneously. Cancer-subtype discovery can be formulated as a biclustering problem in which gene-expression data are partitioned into a checkerboard-like pattern highlighting the associations between groups of patients and the groups of genes that distinguish them. To bicluster a data matrix, hierarchical clustering can be applied independently to its rows and columns. Convex biclustering produces more stable biclusterings while retaining the interpretability of hierarchical biclustering; it does so by repeatedly solving convex clustering subproblems.</p>
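<p>For contrast with the convex formulation of [<xref rid="pcbi.1004228.ref040" ref-type="bibr">40</xref>], here is a sketch of the baseline just described: biclustering by hierarchically clustering the rows and columns of a matrix independently and reordering both axes to expose a checkerboard.</p>
<preformat>
# A sketch of baseline hierarchical biclustering: cluster rows and columns
# independently, then reorder both axes by the dendrogram leaf order so
# that co-expressed blocks line up as a checkerboard.
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage

def checkerboard(X):
    row_order = leaves_list(linkage(X, method="average"))
    col_order = leaves_list(linkage(X.T, method="average"))
    return X[np.ix_(row_order, col_order)]

rng = np.random.default_rng(1)
print(checkerboard(rng.normal(size=(8, 6))).shape)   # (8, 6)
</preformat>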
<p>The field of cluster analysis is crowded with so many competing methods that it would be foolish to conclude that convex clustering is uniformly superior. Our goal here is more modest: to illustrate the versatility of convex clustering. The reflex reaction of most biologists is to employ hierarchical or
<italic>k</italic>
-means clustering. We suggest that biologists take a second look. Convex clustering’s ability to reliably deliver an entire solution path is compelling. The insights discussed here will enhance the careful exploration of many big datasets. The present algorithm, and indeed the present formulation of convex clustering, are unlikely to be the last words on the subject. We encourage other computational biologists and statisticians to refine these promising tools.
<sc>convexcluster</sc>
can be freely downloaded from the UCLA Human Genetics web site at
<ext-link ext-link-type="uri" xlink:href="http://www.genetics.ucla.edu/software/">http://www.genetics.ucla.edu/software/</ext-link>
for analysis and comparison purposes.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="pcbi.1004228.ref001">
<label>1</label>
<mixed-citation publication-type="book">
<name>
<surname>Cormen</surname>
<given-names>TH</given-names>
</name>
,
<name>
<surname>Stein</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Rivest</surname>
<given-names>RL</given-names>
</name>
,
<name>
<surname>Leiserson</surname>
<given-names>CE</given-names>
</name>
.
<source>Introduction to Algorithms</source>
.
<edition>2nd ed.</edition>
<publisher-name>McGraw-Hill Higher Education</publisher-name>
;
<year>2001</year>
.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref002">
<label>2</label>
<mixed-citation publication-type="book">
<name>
<surname>Lindsten</surname>
<given-names>F</given-names>
</name>
,
<name>
<surname>Ohlsson</surname>
<given-names>H</given-names>
</name>
,
<name>
<surname>Ljung</surname>
<given-names>L</given-names>
</name>
.
<chapter-title>Clustering using sum-of-norms regularization: With application to particle filter output computation</chapter-title>
In:
<source>Statistical Signal Processing Workshop (SSP), 2011 IEEE</source>
.
<publisher-name>IEEE</publisher-name>
;
<year>2011</year>
p.
<fpage>201</fpage>
<lpage>204</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref003">
<label>3</label>
<mixed-citation publication-type="other">Hocking T, Vert JP, Bach F, Joulin A. Clusterpath: an Algorithm for Clustering using Convex Fusion Penalties. In: Getoor L, Scheffer T, editors. Proceedings of the 28th International Conference on Machine Learning (ICML-11). ICML ‘11. New York, NY, USA: ACM; 2011. p. 745–752.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref004">
<label>4</label>
<mixed-citation publication-type="journal">
<name>
<surname>Fisher</surname>
<given-names>RA</given-names>
</name>
.
<article-title>The use of multiple measurements in taxonomic problems</article-title>
.
<source>Annals of eugenics</source>
.
<year>1936</year>
;
<volume>7</volume>
(
<issue>2</issue>
):
<fpage>179</fpage>
<lpage>188</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1111/j.1469-1809.1936.tb02137.x">10.1111/j.1469-1809.1936.tb02137.x</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref005">
<label>5</label>
<mixed-citation publication-type="other">Lange K, Keys KL. The MM proximal distance algorithm. Proceedings 2014 International Congress of Mathematicians. 2014;(in press).</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref006">
<label>6</label>
<mixed-citation publication-type="book">
<name>
<surname>Borwein</surname>
<given-names>JM</given-names>
</name>
,
<name>
<surname>Lewis</surname>
<given-names>AS</given-names>
</name>
.
<source>Convex Analysis and Nonlinear Optimization. vol. 3 of CMS Books in Mathematics</source>
.
<edition>2nd ed.</edition>
<publisher-name>Springer</publisher-name>
;
<year>2006</year>
.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref007">
<label>7</label>
<mixed-citation publication-type="book">
<name>
<surname>Clarke</surname>
<given-names>FH</given-names>
</name>
.
<source>Optimization and Nonsmooth Analysis. vol. 5 of Classics in Applied Mathematics</source>
.
<publisher-name>SIAM</publisher-name>
;
<year>1990</year>
.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref008">
<label>8</label>
<mixed-citation publication-type="book">
<name>
<surname>Demyanov</surname>
<given-names>VF</given-names>
</name>
,
<name>
<surname>Fletcher</surname>
<given-names>R</given-names>
</name>
,
<name>
<surname>Terlaky</surname>
<given-names>T</given-names>
</name>
,
<name>
<surname>Di Pillo</surname>
<given-names>G</given-names>
</name>
,
<name>
<surname>Schoen</surname>
<given-names>F</given-names>
</name>
.
<source>Nonlinear Optimization</source>
.
<publisher-name>Springer</publisher-name>
;
<year>2010</year>
.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref009">
<label>9</label>
<mixed-citation publication-type="book">
<name>
<surname>Borg</surname>
<given-names>I</given-names>
</name>
,
<name>
<surname>Groenen</surname>
<given-names>PJ</given-names>
</name>
.
<source>Modern Multidimensional Scaling: Theory and Applications</source>
.
<publisher-name>Springer</publisher-name>
;
<year>2005</year>
.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref010">
<label>10</label>
<mixed-citation publication-type="other">Heiser WJ. Convergent computation by iterative majorization: theory and applications in multidimensional data analysis. Recent Advances in Descriptive Multivariate Analysis. 1995;p. 157–189.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref011">
<label>11</label>
<mixed-citation publication-type="journal">
<name>
<surname>Hunter</surname>
<given-names>DR</given-names>
</name>
,
<name>
<surname>Lange</surname>
<given-names>K</given-names>
</name>
.
<article-title>A tutorial on MM algorithms</article-title>
.
<source>American Statistician</source>
.
<year>2004</year>
;
<volume>58</volume>
:
<fpage>30</fpage>
<lpage>37</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1198/0003130042836">10.1198/0003130042836</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref012">
<label>12</label>
<mixed-citation publication-type="journal">
<name>
<surname>Lange</surname>
<given-names>K</given-names>
</name>
,
<name>
<surname>Hunter</surname>
<given-names>DR</given-names>
</name>
,
<name>
<surname>Yang</surname>
<given-names>I</given-names>
</name>
.
<article-title>Optimization transfer using surrogate objective functions</article-title>
.
<source>Journal of Computational and Graphical Statistics</source>
.
<year>2000</year>
;
<volume>9</volume>
:
<fpage>1</fpage>
<lpage>20</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2307/1390605">10.2307/1390605</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref013">
<label>13</label>
<mixed-citation publication-type="journal">
<name>
<surname>Wu</surname>
<given-names>TT</given-names>
</name>
,
<name>
<surname>Lange</surname>
<given-names>K</given-names>
</name>
.
<article-title>The MM alternative to EM</article-title>
.
<source>Statistical Science</source>
.
<year>2010</year>
;
<volume>25</volume>
:
<fpage>492</fpage>
<lpage>505</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1214/08-STS264">10.1214/08-STS264</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref014">
<label>14</label>
<mixed-citation publication-type="book">
<name>
<surname>Deutsch</surname>
<given-names>F</given-names>
</name>
.
<source>Best Approximation in Inner Product Spaces. vol. 7 of CMS Books in Mathematics</source>
.
<publisher-name>Springer-Verlag</publisher-name>
;
<year>2001</year>
.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref015">
<label>15</label>
<mixed-citation publication-type="journal">
<name>
<surname>Parikh</surname>
<given-names>N</given-names>
</name>
,
<name>
<surname>Boyd</surname>
<given-names>S</given-names>
</name>
.
<article-title>Proximal algorithms</article-title>
.
<source>Foundations and Trends in Optimization</source>
.
<year>2013</year>
;
<volume>1</volume>
(
<issue>3</issue>
):
<fpage>123</fpage>
<lpage>231</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref016">
<label>16</label>
<mixed-citation publication-type="book">
<name>
<surname>Lange</surname>
<given-names>K</given-names>
</name>
.
<chapter-title>Optimization</chapter-title>
<edition>2nd ed.</edition>
<source>Springer Texts in Statistics</source>
.
<publisher-name>Springer-Verlag</publisher-name>
;
<year>2012</year>
.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref017">
<label>17</label>
<mixed-citation publication-type="other">Bache K, Lichman M. UCI Machine Learning Repository; 2013. Available from:
<ext-link ext-link-type="uri" xlink:href="http://archive.ics.uci.edu/ml">http://archive.ics.uci.edu/ml</ext-link>
.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref018">
<label>18</label>
<mixed-citation publication-type="journal">
<name>
<surname>Sokal</surname>
<given-names>RR</given-names>
</name>
,
<name>
<surname>Michener</surname>
<given-names>CD</given-names>
</name>
.
<article-title>A statistical method for evaluating systematic relationships</article-title>
.
<source>University of Kansas Science Bulletin</source>
.
<year>1958</year>
;
<volume>38</volume>
:
<fpage>1409</fpage>
<lpage>1438</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref019">
<label>19</label>
<mixed-citation publication-type="journal">
<name>
<surname>Hopcroft</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Tarjan</surname>
<given-names>R</given-names>
</name>
.
<article-title>Algorithm 447: Efficient algorithms for graph manipulation</article-title>
.
<source>Communications of the ACM</source>
.
<year>1973</year>
;
<volume>16</volume>
(
<issue>6</issue>
):
<fpage>372</fpage>
<lpage>378</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1145/362248.362272">10.1145/362248.362272</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref020">
<label>20</label>
<mixed-citation publication-type="journal">
<name>
<surname>Hubert</surname>
<given-names>L</given-names>
</name>
,
<name>
<surname>Arabie</surname>
<given-names>P</given-names>
</name>
.
<article-title>Comparing partitions</article-title>
.
<source>Journal of Classification</source>
.
<year>1985</year>
;
<volume>2</volume>
(
<issue>1</issue>
):
<fpage>193</fpage>
<lpage>218</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1007/BF01908075">10.1007/BF01908075</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref021">
<label>21</label>
<mixed-citation publication-type="journal">
<name>
<surname>Su</surname>
<given-names>YS</given-names>
</name>
,
<name>
<surname>Gelman</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Hill</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Yajima</surname>
<given-names>M</given-names>
</name>
.
<article-title>Multiple Imputation with Diagnostics (mi) in R: Opening Windows into the Black Box</article-title>
.
<source>Journal of Statistical Software</source>
.
<year>2011</year>
<month>12</month>
;
<volume>45</volume>
(
<issue>2</issue>
):
<fpage>1</fpage>
<lpage>31</lpage>
. Available from:
<ext-link ext-link-type="uri" xlink:href="http://www.jstatsoft.org/v45/i02">http://www.jstatsoft.org/v45/i02</ext-link>
.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref022">
<label>22</label>
<mixed-citation publication-type="journal">
<name>
<surname>Pritchard</surname>
<given-names>JK</given-names>
</name>
,
<name>
<surname>Stephens</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Donnelly</surname>
<given-names>P</given-names>
</name>
.
<article-title>Inference of population structure using multilocus genotype data</article-title>
.
<source>Genetics</source>
.
<year>2000</year>
<month>6</month>
;
<volume>155</volume>
(
<issue>2</issue>
):
<fpage>945</fpage>
<lpage>959</lpage>
.
<pub-id pub-id-type="pmid">10835412</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref023">
<label>23</label>
<mixed-citation publication-type="journal">
<name>
<surname>Alexander</surname>
<given-names>DH</given-names>
</name>
,
<name>
<surname>Novembre</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Lange</surname>
<given-names>K</given-names>
</name>
.
<article-title>Fast model-based estimation of ancestry in unrelated individuals</article-title>
.
<source>Genome Research</source>
.
<year>2009</year>
;
<volume>19</volume>
(
<issue>9</issue>
):
<fpage>1655</fpage>
<lpage>1664</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1101/gr.094052.109">10.1101/gr.094052.109</ext-link>
</comment>
<pub-id pub-id-type="pmid">19648217</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref024">
<label>24</label>
<mixed-citation publication-type="journal">
<name>
<surname>Price</surname>
<given-names>AL</given-names>
</name>
,
<name>
<surname>Patterson</surname>
<given-names>NJ</given-names>
</name>
,
<name>
<surname>Plenge</surname>
<given-names>RM</given-names>
</name>
,
<name>
<surname>Weinblatt</surname>
<given-names>ME</given-names>
</name>
,
<name>
<surname>Shadick</surname>
<given-names>NA</given-names>
</name>
,
<name>
<surname>Reich</surname>
<given-names>D</given-names>
</name>
.
<article-title>Principal components analysis corrects for stratification in genome-wide association studies</article-title>
.
<source>Nat Genet</source>
.
<year>2006</year>
<month>8</month>
;
<volume>38</volume>
(
<issue>8</issue>
):
<fpage>904</fpage>
<lpage>909</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/ng1847">10.1038/ng1847</ext-link>
</comment>
<pub-id pub-id-type="pmid">16862161</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref025">
<label>25</label>
<mixed-citation publication-type="journal">
<name>
<surname>Rosenberg</surname>
<given-names>NA</given-names>
</name>
,
<name>
<surname>Pritchard</surname>
<given-names>JK</given-names>
</name>
,
<name>
<surname>Weber</surname>
<given-names>JL</given-names>
</name>
,
<name>
<surname>Cann</surname>
<given-names>HM</given-names>
</name>
,
<name>
<surname>Kidd</surname>
<given-names>KK</given-names>
</name>
,
<name>
<surname>Zhivotovsky</surname>
<given-names>LA</given-names>
</name>
,
<etal>et al</etal>
<article-title>Genetic structure of human populations</article-title>
.
<source>Science</source>
.
<year>2002</year>
<month>12</month>
;
<volume>298</volume>
(
<issue>5602</issue>
):
<fpage>2381</fpage>
<lpage>2385</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1126/science.1078311">10.1126/science.1078311</ext-link>
</comment>
<pub-id pub-id-type="pmid">12493913</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref026">
<label>26</label>
<mixed-citation publication-type="journal">
<name>
<surname>Kittles</surname>
<given-names>RA</given-names>
</name>
,
<name>
<surname>Weiss</surname>
<given-names>KM</given-names>
</name>
.
<article-title>Race, ancestry, and genes: implications for defining disease risk</article-title>
.
<source>Annual Rev Genomics Hum Genet</source>
.
<year>2003</year>
;
<volume>4</volume>
:
<fpage>33</fpage>
<lpage>67</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1146/annurev.genom.4.070802.110356">10.1146/annurev.genom.4.070802.110356</ext-link>
</comment>
<pub-id pub-id-type="pmid">14527296</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref027">
<label>27</label>
<mixed-citation publication-type="journal">
<name>
<surname>Wang</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Lewis</surname>
<given-names>CM</given-names>
<suffix>Jr</suffix>
</name>
,
<name>
<surname>Jakobsson</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Ramachandran</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Ray</surname>
<given-names>N</given-names>
</name>
,
<name>
<surname>Bedoya</surname>
<given-names>G</given-names>
</name>
,
<etal>et al</etal>
<article-title>Genetic Variation and Population Structure in Native Americans</article-title>
.
<source>PLoS Genet</source>
.
<year>2007</year>
<month>11</month>
;
<volume>3</volume>
(
<issue>11</issue>
):
<fpage>e185</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pgen.0030185">10.1371/journal.pgen.0030185</ext-link>
</comment>
<pub-id pub-id-type="pmid">18039031</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref028">
<label>28</label>
<mixed-citation publication-type="journal">
<name>
<surname>Li</surname>
<given-names>JZ</given-names>
</name>
,
<name>
<surname>Absher</surname>
<given-names>DM</given-names>
</name>
,
<name>
<surname>Tang</surname>
<given-names>H</given-names>
</name>
,
<name>
<surname>Southwick</surname>
<given-names>AM</given-names>
</name>
,
<name>
<surname>Casto</surname>
<given-names>AM</given-names>
</name>
,
<name>
<surname>Ramachandran</surname>
<given-names>S</given-names>
</name>
,
<etal>et al</etal>
<article-title>Worldwide human relationships inferred from genome-wide patterns of variation</article-title>
.
<source>Science</source>
.
<year>2008</year>
<month>2</month>
;
<volume>319</volume>
(
<issue>5866</issue>
):
<fpage>1100</fpage>
<lpage>1104</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1126/science.1153717">10.1126/science.1153717</ext-link>
</comment>
<pub-id pub-id-type="pmid">18292342</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref029">
<label>29</label>
<mixed-citation publication-type="other">census bureau C. The Fourth Population Census of China in 1990; 1990.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref030">
<label>30</label>
<mixed-citation publication-type="other">census bureau C. Population Census of China in 2000; 2000.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref031">
<label>31</label>
<mixed-citation publication-type="journal">
<name>
<surname>Rosenberg</surname>
<given-names>NA</given-names>
</name>
,
<name>
<surname>Mahajan</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Gonzalez-Quevedo</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Blum</surname>
<given-names>MG</given-names>
</name>
,
<name>
<surname>Nino-Rosales</surname>
<given-names>L</given-names>
</name>
,
<name>
<surname>Ninis</surname>
<given-names>V</given-names>
</name>
,
<etal>et al</etal>
<article-title>Low levels of genetic divergence across geographically and linguistically diverse populations from India</article-title>
.
<source>PLoS Genet</source>
.
<year>2006</year>
<month>12</month>
;
<volume>2</volume>
(
<issue>12</issue>
):
<fpage>e215</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pgen.0020215">10.1371/journal.pgen.0020215</ext-link>
</comment>
<pub-id pub-id-type="pmid">17194221</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref032">
<label>32</label>
<mixed-citation publication-type="journal">
<name>
<surname>Coudray</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Olivieri</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Achilli</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Pala</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Melhaoui</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Cherkaoui</surname>
<given-names>M</given-names>
</name>
,
<etal>et al</etal>
<article-title>The complex and diversified mitochondrial gene pool of Berber populations</article-title>
.
<source>Ann Hum Genet</source>
.
<year>2009</year>
<month>3</month>
;
<volume>73</volume>
(
<issue>2</issue>
):
<fpage>196</fpage>
<lpage>214</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1111/j.1469-1809.2008.00493.x">10.1111/j.1469-1809.2008.00493.x</ext-link>
</comment>
<pub-id pub-id-type="pmid">19053990</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref033">
<label>33</label>
<mixed-citation publication-type="journal">
<name>
<surname>Qamar</surname>
<given-names>R</given-names>
</name>
,
<name>
<surname>Ayub</surname>
<given-names>Q</given-names>
</name>
,
<name>
<surname>Mohyuddin</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Helgason</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Mazhar</surname>
<given-names>K</given-names>
</name>
,
<name>
<surname>Mansoor</surname>
<given-names>A</given-names>
</name>
,
<etal>et al</etal>
<article-title>Y-chromosomal DNA variation in Pakistan</article-title>
.
<source>Am J Hum Genet</source>
.
<year>2002</year>
<month>5</month>
;
<volume>70</volume>
(
<issue>5</issue>
):
<fpage>1107</fpage>
<lpage>1124</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1086/339929">10.1086/339929</ext-link>
</comment>
<pub-id pub-id-type="pmid">11898125</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref034">
<label>34</label>
<mixed-citation publication-type="journal">
<name>
<surname>Ablimit</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Qin</surname>
<given-names>W</given-names>
</name>
,
<name>
<surname>Shan</surname>
<given-names>W</given-names>
</name>
,
<name>
<surname>Wu</surname>
<given-names>W</given-names>
</name>
,
<name>
<surname>Ling</surname>
<given-names>F</given-names>
</name>
,
<name>
<surname>Ling</surname>
<given-names>KH</given-names>
</name>
,
<etal>et al</etal>
<article-title>Genetic diversities of cytochrome B in Xinjiang Uyghur unveiled its origin and migration history</article-title>
.
<source>BMC Genet</source>
.
<year>2013</year>
;
<volume>14</volume>
:
<fpage>100</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1186/1471-2156-14-100">10.1186/1471-2156-14-100</ext-link>
</comment>
<pub-id pub-id-type="pmid">24103151</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref035">
<label>35</label>
<mixed-citation publication-type="journal">
<name>
<surname>Nelson</surname>
<given-names>MR</given-names>
</name>
,
<name>
<surname>Bryc</surname>
<given-names>K</given-names>
</name>
,
<name>
<surname>King</surname>
<given-names>KS</given-names>
</name>
,
<name>
<surname>Indap</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Boyko</surname>
<given-names>AR</given-names>
</name>
,
<name>
<surname>Novembre</surname>
<given-names>J</given-names>
</name>
,
<etal>et al</etal>
<article-title>The Population Reference Sample, POPRES: a resource for population, disease, and pharmacological genetics research</article-title>
.
<source>Am J Hum Genet</source>
.
<year>2008</year>
<month>9</month>
;
<volume>83</volume>
(
<issue>3</issue>
):
<fpage>347</fpage>
<lpage>358</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/j.ajhg.2008.08.005">10.1016/j.ajhg.2008.08.005</ext-link>
</comment>
<pub-id pub-id-type="pmid">18760391</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref036">
<label>36</label>
<mixed-citation publication-type="journal">
<name>
<surname>Gravel</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Henn</surname>
<given-names>BM</given-names>
</name>
,
<name>
<surname>Gutenkunst</surname>
<given-names>RN</given-names>
</name>
,
<name>
<surname>Indap</surname>
<given-names>AR</given-names>
</name>
,
<name>
<surname>Marth</surname>
<given-names>GT</given-names>
</name>
,
<name>
<surname>Clark</surname>
<given-names>AG</given-names>
</name>
,
<etal>et al</etal>
<article-title>Demographic history and rare allele sharing among human populations</article-title>
.
<source>Proceedings of the National Academy of Sciences</source>
.
<year>2011</year>
;
<volume>108</volume>
(
<issue>29</issue>
):
<fpage>11983</fpage>
<lpage>11988</lpage>
. Available from:
<ext-link ext-link-type="uri" xlink:href="http://www.pnas.org/content/108/29/11983.abstract">http://www.pnas.org/content/108/29/11983.abstract</ext-link>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1073/pnas.1019276108">10.1073/pnas.1019276108</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref037">
<label>37</label>
<mixed-citation publication-type="journal">
<name>
<surname>Rochefort</surname>
<given-names>H</given-names>
</name>
,
<name>
<surname>Glondu</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Sahla</surname>
<given-names>ME</given-names>
</name>
,
<name>
<surname>Platet</surname>
<given-names>N</given-names>
</name>
,
<name>
<surname>Garcia</surname>
<given-names>M</given-names>
</name>
.
<article-title>How to target estrogen receptor-negative breast cancer?</article-title>
<source>Endocr Relat Cancer</source>
.
<year>2003</year>
<month>6</month>
;
<volume>10</volume>
(
<issue>2</issue>
):
<fpage>261</fpage>
<lpage>266</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1677/erc.0.0100261">10.1677/erc.0.0100261</ext-link>
</comment>
<pub-id pub-id-type="pmid">12790787</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref038">
<label>38</label>
<mixed-citation publication-type="journal">
<name>
<surname>Perou</surname>
<given-names>CM</given-names>
</name>
,
<name>
<surname>Sørlie</surname>
<given-names>T</given-names>
</name>
,
<name>
<surname>Eisen</surname>
<given-names>MB</given-names>
</name>
,
<name>
<surname>van de Rijn</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Jeffrey</surname>
<given-names>SS</given-names>
</name>
,
<name>
<surname>Rees</surname>
<given-names>CA</given-names>
</name>
,
<etal>et al</etal>
<article-title>Molecular portraits of human breast tumours</article-title>
.
<source>Nature</source>
.
<year>2000</year>
<month>8</month>
;
<volume>406</volume>
(
<issue>6797</issue>
):
<fpage>747</fpage>
<lpage>752</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/35021093">10.1038/35021093</ext-link>
</comment>
<pub-id pub-id-type="pmid">10963602</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref039">
<label>39</label>
<mixed-citation publication-type="journal">
<name>
<surname>Chi</surname>
<given-names>EC</given-names>
</name>
,
<name>
<surname>Lange</surname>
<given-names>K</given-names>
</name>
.
<article-title>Splitting methods for convex clustering</article-title>
.
<source>Journal of Computational and Graphical Statistics</source>
.
<year>2013</year>
.</mixed-citation>
</ref>
<ref id="pcbi.1004228.ref040">
<label>40</label>
<mixed-citation publication-type="other">Chi EC, Allen GI, Baraniuk RG. Convex Biclustering; 2014. arXiv:1408.0856 [stat.ME]. Available from:
<ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/1408.0856">http://arxiv.org/abs/1408.0856</ext-link>
.</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Linguistique/explor/TamazightV2/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000036  | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000036  | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Wicri/Linguistique
   |area=    TamazightV2
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Wed Nov 15 18:28:35 2017. Site generation: Sat Feb 10 16:46:27 2024