Serveur d'exploration Covid

Attention, ce site est en cours de développement !
Attention, site généré par des moyens informatiques à partir de corpus bruts.
Les informations ne sont donc pas validées.

Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.

Identifieur interne : 000674 ( PubMed/Curation ); précédent : 000673; suivant : 000675

Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.

Auteurs : Romeo Rizzi [Italie] ; Pritha Mahata ; Luke Mathieson ; Pablo Moscato

Source :

RBID : pubmed:21151943

Descripteurs français

English descriptors

Abstract

Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering problem defined by a novel objective function we call the arithmetic-harmonic cut. We show that the problem of finding such a cut is NP-hard and APX-hard but is fixed-parameter tractable, which indicates that although the problem is unlikely to have a polynomial time algorithm (even for approximation), exact parameterized and local search based techniques may produce workable algorithms. To this end, we implement a memetic algorithm for the problem and demonstrate the effectiveness of the arithmetic-harmonic cut on a number of datasets including a cancer type dataset and a corona virus dataset. We show favorable performance compared to currently used hierarchical clustering techniques such as k-Means, Graclus and Normalized-Cut. The arithmetic-harmonic cut metric overcoming difficulties other hierarchical methods have in representing both intercluster differences and intracluster similarities.

DOI: 10.1371/journal.pone.0014067
PubMed: 21151943

Links toward previous steps (curation, corpus...)


Links to Exploration step

pubmed:21151943

Le document en format XML

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.</title>
<author>
<name sortKey="Rizzi, Romeo" sort="Rizzi, Romeo" uniqKey="Rizzi R" first="Romeo" last="Rizzi">Romeo Rizzi</name>
<affiliation wicri:level="1">
<nlm:affiliation>Dipartimento di Matematica ed Informatica, University of Udine, Udine, Italy.</nlm:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>Dipartimento di Matematica ed Informatica, University of Udine, Udine</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Mahata, Pritha" sort="Mahata, Pritha" uniqKey="Mahata P" first="Pritha" last="Mahata">Pritha Mahata</name>
</author>
<author>
<name sortKey="Mathieson, Luke" sort="Mathieson, Luke" uniqKey="Mathieson L" first="Luke" last="Mathieson">Luke Mathieson</name>
</author>
<author>
<name sortKey="Moscato, Pablo" sort="Moscato, Pablo" uniqKey="Moscato P" first="Pablo" last="Moscato">Pablo Moscato</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2010">2010</date>
<idno type="RBID">pubmed:21151943</idno>
<idno type="pmid">21151943</idno>
<idno type="doi">10.1371/journal.pone.0014067</idno>
<idno type="wicri:Area/PubMed/Corpus">000674</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000674</idno>
<idno type="wicri:Area/PubMed/Curation">000674</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000674</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.</title>
<author>
<name sortKey="Rizzi, Romeo" sort="Rizzi, Romeo" uniqKey="Rizzi R" first="Romeo" last="Rizzi">Romeo Rizzi</name>
<affiliation wicri:level="1">
<nlm:affiliation>Dipartimento di Matematica ed Informatica, University of Udine, Udine, Italy.</nlm:affiliation>
<country xml:lang="fr">Italie</country>
<wicri:regionArea>Dipartimento di Matematica ed Informatica, University of Udine, Udine</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Mahata, Pritha" sort="Mahata, Pritha" uniqKey="Mahata P" first="Pritha" last="Mahata">Pritha Mahata</name>
</author>
<author>
<name sortKey="Mathieson, Luke" sort="Mathieson, Luke" uniqKey="Mathieson L" first="Luke" last="Mathieson">Luke Mathieson</name>
</author>
<author>
<name sortKey="Moscato, Pablo" sort="Moscato, Pablo" uniqKey="Moscato P" first="Pablo" last="Moscato">Pablo Moscato</name>
</author>
</analytic>
<series>
<title level="j">PloS one</title>
<idno type="eISSN">1932-6203</idno>
<imprint>
<date when="2010" type="published">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Cell Line, Tumor</term>
<term>Cluster Analysis</term>
<term>Colonic Neoplasms (metabolism)</term>
<term>Computational Biology (methods)</term>
<term>Gene Expression Profiling</term>
<term>Gene Expression Regulation, Neoplastic</term>
<term>Humans</term>
<term>Leukemia (metabolism)</term>
<term>Melanoma (metabolism)</term>
<term>Models, Statistical</term>
<term>Models, Theoretical</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de profil d'expression de gènes</term>
<term>Analyse de regroupements</term>
<term>Biologie informatique ()</term>
<term>Humains</term>
<term>Leucémies (métabolisme)</term>
<term>Lignée cellulaire tumorale</term>
<term>Modèles statistiques</term>
<term>Modèles théoriques</term>
<term>Mélanome (métabolisme)</term>
<term>Régulation de l'expression des gènes tumoraux</term>
<term>Tumeurs du côlon (métabolisme)</term>
</keywords>
<keywords scheme="MESH" qualifier="metabolism" xml:lang="en">
<term>Colonic Neoplasms</term>
<term>Leukemia</term>
<term>Melanoma</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Computational Biology</term>
</keywords>
<keywords scheme="MESH" qualifier="métabolisme" xml:lang="fr">
<term>Leucémies</term>
<term>Mélanome</term>
<term>Tumeurs du côlon</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Cell Line, Tumor</term>
<term>Cluster Analysis</term>
<term>Gene Expression Profiling</term>
<term>Gene Expression Regulation, Neoplastic</term>
<term>Humans</term>
<term>Models, Statistical</term>
<term>Models, Theoretical</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de profil d'expression de gènes</term>
<term>Analyse de regroupements</term>
<term>Biologie informatique</term>
<term>Humains</term>
<term>Lignée cellulaire tumorale</term>
<term>Modèles statistiques</term>
<term>Modèles théoriques</term>
<term>Régulation de l'expression des gènes tumoraux</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering problem defined by a novel objective function we call the arithmetic-harmonic cut. We show that the problem of finding such a cut is NP-hard and APX-hard but is fixed-parameter tractable, which indicates that although the problem is unlikely to have a polynomial time algorithm (even for approximation), exact parameterized and local search based techniques may produce workable algorithms. To this end, we implement a memetic algorithm for the problem and demonstrate the effectiveness of the arithmetic-harmonic cut on a number of datasets including a cancer type dataset and a corona virus dataset. We show favorable performance compared to currently used hierarchical clustering techniques such as k-Means, Graclus and Normalized-Cut. The arithmetic-harmonic cut metric overcoming difficulties other hierarchical methods have in representing both intercluster differences and intracluster similarities.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">21151943</PMID>
<DateCompleted>
<Year>2011</Year>
<Month>07</Month>
<Day>05</Day>
</DateCompleted>
<DateRevised>
<Year>2018</Year>
<Month>11</Month>
<Day>13</Day>
</DateRevised>
<Article PubModel="Electronic">
<Journal>
<ISSN IssnType="Electronic">1932-6203</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>5</Volume>
<Issue>12</Issue>
<PubDate>
<Year>2010</Year>
<Month>Dec</Month>
<Day>02</Day>
</PubDate>
</JournalIssue>
<Title>PloS one</Title>
<ISOAbbreviation>PLoS ONE</ISOAbbreviation>
</Journal>
<ArticleTitle>Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.</ArticleTitle>
<Pagination>
<MedlinePgn>e14067</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1371/journal.pone.0014067</ELocationID>
<Abstract>
<AbstractText>Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering problem defined by a novel objective function we call the arithmetic-harmonic cut. We show that the problem of finding such a cut is NP-hard and APX-hard but is fixed-parameter tractable, which indicates that although the problem is unlikely to have a polynomial time algorithm (even for approximation), exact parameterized and local search based techniques may produce workable algorithms. To this end, we implement a memetic algorithm for the problem and demonstrate the effectiveness of the arithmetic-harmonic cut on a number of datasets including a cancer type dataset and a corona virus dataset. We show favorable performance compared to currently used hierarchical clustering techniques such as k-Means, Graclus and Normalized-Cut. The arithmetic-harmonic cut metric overcoming difficulties other hierarchical methods have in representing both intercluster differences and intracluster similarities.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Rizzi</LastName>
<ForeName>Romeo</ForeName>
<Initials>R</Initials>
<AffiliationInfo>
<Affiliation>Dipartimento di Matematica ed Informatica, University of Udine, Udine, Italy.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Mahata</LastName>
<ForeName>Pritha</ForeName>
<Initials>P</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Mathieson</LastName>
<ForeName>Luke</ForeName>
<Initials>L</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Moscato</LastName>
<ForeName>Pablo</ForeName>
<Initials>P</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
<PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2010</Year>
<Month>12</Month>
<Day>02</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>United States</Country>
<MedlineTA>PLoS One</MedlineTA>
<NlmUniqueID>101285081</NlmUniqueID>
<ISSNLinking>1932-6203</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D000465" MajorTopicYN="N">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D045744" MajorTopicYN="N">Cell Line, Tumor</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016000" MajorTopicYN="N">Cluster Analysis</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D003110" MajorTopicYN="N">Colonic Neoplasms</DescriptorName>
<QualifierName UI="Q000378" MajorTopicYN="N">metabolism</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D019295" MajorTopicYN="N">Computational Biology</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D020869" MajorTopicYN="N">Gene Expression Profiling</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D015972" MajorTopicYN="Y">Gene Expression Regulation, Neoplastic</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D006801" MajorTopicYN="N">Humans</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D007938" MajorTopicYN="N">Leukemia</DescriptorName>
<QualifierName UI="Q000378" MajorTopicYN="N">metabolism</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D008545" MajorTopicYN="N">Melanoma</DescriptorName>
<QualifierName UI="Q000378" MajorTopicYN="N">metabolism</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D015233" MajorTopicYN="N">Models, Statistical</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D008962" MajorTopicYN="N">Models, Theoretical</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="received">
<Year>2010</Year>
<Month>07</Month>
<Day>21</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted">
<Year>2010</Year>
<Month>10</Month>
<Day>28</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2010</Year>
<Month>12</Month>
<Day>15</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2010</Year>
<Month>12</Month>
<Day>15</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2011</Year>
<Month>7</Month>
<Day>6</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>epublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">21151943</ArticleId>
<ArticleId IdType="doi">10.1371/journal.pone.0014067</ArticleId>
<ArticleId IdType="pmc">PMC2997101</ArticleId>
</ArticleIdList>
<ReferenceList>
<Reference>
<Citation>Nat Genet. 2000 Mar;24(3):227-35</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">10700174</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Evol Comput. 2000 Spring;8(1):61-91</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">10753231</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Proc Natl Acad Sci U S A. 2002 Apr 2;99(7):4465-70</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">11904358</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>IEEE Trans Pattern Anal Mach Intell. 2007 Nov;29(11):1944-57</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17848776</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2004 Jun 12;20(9):1453-4</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">14871861</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Science. 1999 Oct 15;286(5439):531-7</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">10521349</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>BMC Bioinformatics. 2003 Sep 20;4:43</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">14499005</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
</PubmedData>
</pubmed>
</record>

Pour manipuler ce document sous Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Sante/explor/CovidV1/Data/PubMed/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000674 | SxmlIndent | more

Ou

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd -nk 000674 | SxmlIndent | more

Pour mettre un lien sur cette page dans le réseau Wicri

{{Explor lien
   |wiki=    Wicri/Sante
   |area=    CovidV1
   |flux=    PubMed
   |étape=   Curation
   |type=    RBID
   |clé=     pubmed:21151943
   |texte=   Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.
}}

Pour générer des pages wiki

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Curation/RBID.i   -Sk "pubmed:21151943" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a CovidV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Fri Mar 27 18:14:15 2020. Site generation: Sun Jan 31 15:15:08 2021