Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.
Identifieur interne : 000674 ( PubMed/Curation ); précédent : 000673; suivant : 000675Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.
Auteurs : Romeo Rizzi [Italie] ; Pritha Mahata ; Luke Mathieson ; Pablo MoscatoSource :
- PloS one [ 1932-6203 ] ; 2010.
Descripteurs français
- KwdFr :
- Algorithmes, Analyse de profil d'expression de gènes, Analyse de regroupements, Biologie informatique (), Humains, Leucémies (métabolisme), Lignée cellulaire tumorale, Modèles statistiques, Modèles théoriques, Mélanome (métabolisme), Régulation de l'expression des gènes tumoraux, Tumeurs du côlon (métabolisme).
- MESH :
English descriptors
- KwdEn :
- MESH :
Abstract
Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering problem defined by a novel objective function we call the arithmetic-harmonic cut. We show that the problem of finding such a cut is NP-hard and APX-hard but is fixed-parameter tractable, which indicates that although the problem is unlikely to have a polynomial time algorithm (even for approximation), exact parameterized and local search based techniques may produce workable algorithms. To this end, we implement a memetic algorithm for the problem and demonstrate the effectiveness of the arithmetic-harmonic cut on a number of datasets including a cancer type dataset and a corona virus dataset. We show favorable performance compared to currently used hierarchical clustering techniques such as k-Means, Graclus and Normalized-Cut. The arithmetic-harmonic cut metric overcoming difficulties other hierarchical methods have in representing both intercluster differences and intracluster similarities.
DOI: 10.1371/journal.pone.0014067
PubMed: 21151943
Links toward previous steps (curation, corpus...)
- to stream PubMed, to step Corpus: Pour aller vers cette notice dans l'étape Curation :000674
Links to Exploration step
pubmed:21151943Le document en format XML
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.</title>
<author><name sortKey="Rizzi, Romeo" sort="Rizzi, Romeo" uniqKey="Rizzi R" first="Romeo" last="Rizzi">Romeo Rizzi</name>
<affiliation wicri:level="1"><nlm:affiliation>Dipartimento di Matematica ed Informatica, University of Udine, Udine, Italy.</nlm:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>Dipartimento di Matematica ed Informatica, University of Udine, Udine</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Mahata, Pritha" sort="Mahata, Pritha" uniqKey="Mahata P" first="Pritha" last="Mahata">Pritha Mahata</name>
</author>
<author><name sortKey="Mathieson, Luke" sort="Mathieson, Luke" uniqKey="Mathieson L" first="Luke" last="Mathieson">Luke Mathieson</name>
</author>
<author><name sortKey="Moscato, Pablo" sort="Moscato, Pablo" uniqKey="Moscato P" first="Pablo" last="Moscato">Pablo Moscato</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2010">2010</date>
<idno type="RBID">pubmed:21151943</idno>
<idno type="pmid">21151943</idno>
<idno type="doi">10.1371/journal.pone.0014067</idno>
<idno type="wicri:Area/PubMed/Corpus">000674</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000674</idno>
<idno type="wicri:Area/PubMed/Curation">000674</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000674</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.</title>
<author><name sortKey="Rizzi, Romeo" sort="Rizzi, Romeo" uniqKey="Rizzi R" first="Romeo" last="Rizzi">Romeo Rizzi</name>
<affiliation wicri:level="1"><nlm:affiliation>Dipartimento di Matematica ed Informatica, University of Udine, Udine, Italy.</nlm:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>Dipartimento di Matematica ed Informatica, University of Udine, Udine</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Mahata, Pritha" sort="Mahata, Pritha" uniqKey="Mahata P" first="Pritha" last="Mahata">Pritha Mahata</name>
</author>
<author><name sortKey="Mathieson, Luke" sort="Mathieson, Luke" uniqKey="Mathieson L" first="Luke" last="Mathieson">Luke Mathieson</name>
</author>
<author><name sortKey="Moscato, Pablo" sort="Moscato, Pablo" uniqKey="Moscato P" first="Pablo" last="Moscato">Pablo Moscato</name>
</author>
</analytic>
<series><title level="j">PloS one</title>
<idno type="eISSN">1932-6203</idno>
<imprint><date when="2010" type="published">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithms</term>
<term>Cell Line, Tumor</term>
<term>Cluster Analysis</term>
<term>Colonic Neoplasms (metabolism)</term>
<term>Computational Biology (methods)</term>
<term>Gene Expression Profiling</term>
<term>Gene Expression Regulation, Neoplastic</term>
<term>Humans</term>
<term>Leukemia (metabolism)</term>
<term>Melanoma (metabolism)</term>
<term>Models, Statistical</term>
<term>Models, Theoretical</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr"><term>Algorithmes</term>
<term>Analyse de profil d'expression de gènes</term>
<term>Analyse de regroupements</term>
<term>Biologie informatique ()</term>
<term>Humains</term>
<term>Leucémies (métabolisme)</term>
<term>Lignée cellulaire tumorale</term>
<term>Modèles statistiques</term>
<term>Modèles théoriques</term>
<term>Mélanome (métabolisme)</term>
<term>Régulation de l'expression des gènes tumoraux</term>
<term>Tumeurs du côlon (métabolisme)</term>
</keywords>
<keywords scheme="MESH" qualifier="metabolism" xml:lang="en"><term>Colonic Neoplasms</term>
<term>Leukemia</term>
<term>Melanoma</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en"><term>Computational Biology</term>
</keywords>
<keywords scheme="MESH" qualifier="métabolisme" xml:lang="fr"><term>Leucémies</term>
<term>Mélanome</term>
<term>Tumeurs du côlon</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Algorithms</term>
<term>Cell Line, Tumor</term>
<term>Cluster Analysis</term>
<term>Gene Expression Profiling</term>
<term>Gene Expression Regulation, Neoplastic</term>
<term>Humans</term>
<term>Models, Statistical</term>
<term>Models, Theoretical</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr"><term>Algorithmes</term>
<term>Analyse de profil d'expression de gènes</term>
<term>Analyse de regroupements</term>
<term>Biologie informatique</term>
<term>Humains</term>
<term>Lignée cellulaire tumorale</term>
<term>Modèles statistiques</term>
<term>Modèles théoriques</term>
<term>Régulation de l'expression des gènes tumoraux</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering problem defined by a novel objective function we call the arithmetic-harmonic cut. We show that the problem of finding such a cut is NP-hard and APX-hard but is fixed-parameter tractable, which indicates that although the problem is unlikely to have a polynomial time algorithm (even for approximation), exact parameterized and local search based techniques may produce workable algorithms. To this end, we implement a memetic algorithm for the problem and demonstrate the effectiveness of the arithmetic-harmonic cut on a number of datasets including a cancer type dataset and a corona virus dataset. We show favorable performance compared to currently used hierarchical clustering techniques such as k-Means, Graclus and Normalized-Cut. The arithmetic-harmonic cut metric overcoming difficulties other hierarchical methods have in representing both intercluster differences and intracluster similarities.</div>
</front>
</TEI>
<pubmed><MedlineCitation Status="MEDLINE" Owner="NLM"><PMID Version="1">21151943</PMID>
<DateCompleted><Year>2011</Year>
<Month>07</Month>
<Day>05</Day>
</DateCompleted>
<DateRevised><Year>2018</Year>
<Month>11</Month>
<Day>13</Day>
</DateRevised>
<Article PubModel="Electronic"><Journal><ISSN IssnType="Electronic">1932-6203</ISSN>
<JournalIssue CitedMedium="Internet"><Volume>5</Volume>
<Issue>12</Issue>
<PubDate><Year>2010</Year>
<Month>Dec</Month>
<Day>02</Day>
</PubDate>
</JournalIssue>
<Title>PloS one</Title>
<ISOAbbreviation>PLoS ONE</ISOAbbreviation>
</Journal>
<ArticleTitle>Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.</ArticleTitle>
<Pagination><MedlinePgn>e14067</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1371/journal.pone.0014067</ELocationID>
<Abstract><AbstractText>Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering problem defined by a novel objective function we call the arithmetic-harmonic cut. We show that the problem of finding such a cut is NP-hard and APX-hard but is fixed-parameter tractable, which indicates that although the problem is unlikely to have a polynomial time algorithm (even for approximation), exact parameterized and local search based techniques may produce workable algorithms. To this end, we implement a memetic algorithm for the problem and demonstrate the effectiveness of the arithmetic-harmonic cut on a number of datasets including a cancer type dataset and a corona virus dataset. We show favorable performance compared to currently used hierarchical clustering techniques such as k-Means, Graclus and Normalized-Cut. The arithmetic-harmonic cut metric overcoming difficulties other hierarchical methods have in representing both intercluster differences and intracluster similarities.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y"><Author ValidYN="Y"><LastName>Rizzi</LastName>
<ForeName>Romeo</ForeName>
<Initials>R</Initials>
<AffiliationInfo><Affiliation>Dipartimento di Matematica ed Informatica, University of Udine, Udine, Italy.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Mahata</LastName>
<ForeName>Pritha</ForeName>
<Initials>P</Initials>
</Author>
<Author ValidYN="Y"><LastName>Mathieson</LastName>
<ForeName>Luke</ForeName>
<Initials>L</Initials>
</Author>
<Author ValidYN="Y"><LastName>Moscato</LastName>
<ForeName>Pablo</ForeName>
<Initials>P</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList><PublicationType UI="D016428">Journal Article</PublicationType>
<PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic"><Year>2010</Year>
<Month>12</Month>
<Day>02</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo><Country>United States</Country>
<MedlineTA>PLoS One</MedlineTA>
<NlmUniqueID>101285081</NlmUniqueID>
<ISSNLinking>1932-6203</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList><MeshHeading><DescriptorName UI="D000465" MajorTopicYN="N">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D045744" MajorTopicYN="N">Cell Line, Tumor</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D016000" MajorTopicYN="N">Cluster Analysis</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D003110" MajorTopicYN="N">Colonic Neoplasms</DescriptorName>
<QualifierName UI="Q000378" MajorTopicYN="N">metabolism</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D019295" MajorTopicYN="N">Computational Biology</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D020869" MajorTopicYN="N">Gene Expression Profiling</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D015972" MajorTopicYN="Y">Gene Expression Regulation, Neoplastic</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D006801" MajorTopicYN="N">Humans</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D007938" MajorTopicYN="N">Leukemia</DescriptorName>
<QualifierName UI="Q000378" MajorTopicYN="N">metabolism</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D008545" MajorTopicYN="N">Melanoma</DescriptorName>
<QualifierName UI="Q000378" MajorTopicYN="N">metabolism</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D015233" MajorTopicYN="N">Models, Statistical</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D008962" MajorTopicYN="N">Models, Theoretical</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData><History><PubMedPubDate PubStatus="received"><Year>2010</Year>
<Month>07</Month>
<Day>21</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted"><Year>2010</Year>
<Month>10</Month>
<Day>28</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez"><Year>2010</Year>
<Month>12</Month>
<Day>15</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed"><Year>2010</Year>
<Month>12</Month>
<Day>15</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline"><Year>2011</Year>
<Month>7</Month>
<Day>6</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>epublish</PublicationStatus>
<ArticleIdList><ArticleId IdType="pubmed">21151943</ArticleId>
<ArticleId IdType="doi">10.1371/journal.pone.0014067</ArticleId>
<ArticleId IdType="pmc">PMC2997101</ArticleId>
</ArticleIdList>
<ReferenceList><Reference><Citation>Nat Genet. 2000 Mar;24(3):227-35</Citation>
<ArticleIdList><ArticleId IdType="pubmed">10700174</ArticleId>
</ArticleIdList>
</Reference>
<Reference><Citation>Evol Comput. 2000 Spring;8(1):61-91</Citation>
<ArticleIdList><ArticleId IdType="pubmed">10753231</ArticleId>
</ArticleIdList>
</Reference>
<Reference><Citation>Proc Natl Acad Sci U S A. 2002 Apr 2;99(7):4465-70</Citation>
<ArticleIdList><ArticleId IdType="pubmed">11904358</ArticleId>
</ArticleIdList>
</Reference>
<Reference><Citation>IEEE Trans Pattern Anal Mach Intell. 2007 Nov;29(11):1944-57</Citation>
<ArticleIdList><ArticleId IdType="pubmed">17848776</ArticleId>
</ArticleIdList>
</Reference>
<Reference><Citation>Bioinformatics. 2004 Jun 12;20(9):1453-4</Citation>
<ArticleIdList><ArticleId IdType="pubmed">14871861</ArticleId>
</ArticleIdList>
</Reference>
<Reference><Citation>Science. 1999 Oct 15;286(5439):531-7</Citation>
<ArticleIdList><ArticleId IdType="pubmed">10521349</ArticleId>
</ArticleIdList>
</Reference>
<Reference><Citation>BMC Bioinformatics. 2003 Sep 20;4:43</Citation>
<ArticleIdList><ArticleId IdType="pubmed">14499005</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
</PubmedData>
</pubmed>
</record>
Pour manipuler ce document sous Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Sante/explor/CovidV1/Data/PubMed/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000674 | SxmlIndent | more
Ou
HfdSelect -h $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd -nk 000674 | SxmlIndent | more
Pour mettre un lien sur cette page dans le réseau Wicri
{{Explor lien |wiki= Wicri/Sante |area= CovidV1 |flux= PubMed |étape= Curation |type= RBID |clé= pubmed:21151943 |texte= Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments. }}
Pour générer des pages wiki
HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Curation/RBID.i -Sk "pubmed:21151943" \ | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd \ | NlmPubMed2Wicri -a CovidV1
This area was generated with Dilib version V0.6.33. |