Serveur d'exploration sur l'OCR

Attention, ce site est en cours de développement !
Attention, site généré par des moyens informatiques à partir de corpus bruts.
Les informations ne sont donc pas validées.

A new clustering algorithm with multiple runs of iterative procedures

Identifieur interne : 001871 ( Istex/Corpus ); précédent : 001870; suivant : 001872

A new clustering algorithm with multiple runs of iterative procedures

Auteurs : Qiwen Zhang ; Roger D. Boyle

Source :

RBID : ISTEX:3E8A1CB2353FA1D809908872659479713B9542C9

Abstract

The K-means clustering algorithm is well known and widely used. Recently, an alternative algorithm based on an earlier idea (Duda and Hart, Pattern Classification and Scene Analysis (1973); Ismail and Kamel, Pattern Recognition22, 75–89 (1989)) has been proposed; in comprehensive experiments we verify that this alternative is superior to K-means. Both techniques suffer from the complexity of the cost function surface they are descending, and a consequent inclination to fall into poor local minima. A popular solution to this problem is to rerun the chosen iterative procedure from many randomly chosen starting positions, saving the best. We introduce here an alternative strategy to form the starting position for such a multiple run which consists of perturbing an existing result in an attempt to improve its poor aspects. This “controlled configuration change” has comparable complexity with random initialisation but usually produces superior clustering results. Comprehensive experiments confirm this result.

Url:
DOI: 10.1016/0031-3203(91)90003-N

Links to Exploration step

ISTEX:3E8A1CB2353FA1D809908872659479713B9542C9

Le document en format XML

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>A new clustering algorithm with multiple runs of iterative procedures</title>
<author>
<name sortKey="Zhang, Qiwen" sort="Zhang, Qiwen" uniqKey="Zhang Q" first="Qiwen" last="Zhang">Qiwen Zhang</name>
<affiliation>
<mods:affiliation>School of Computer Studies, The University of Leeds, Leeds LS2 9JT, U.K.</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Boyle, Roger D" sort="Boyle, Roger D" uniqKey="Boyle R" first="Roger D." last="Boyle">Roger D. Boyle</name>
<affiliation>
<mods:affiliation>School of Computer Studies, The University of Leeds, Leeds LS2 9JT, U.K.</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:3E8A1CB2353FA1D809908872659479713B9542C9</idno>
<date when="1991" year="1991">1991</date>
<idno type="doi">10.1016/0031-3203(91)90003-N</idno>
<idno type="url">https://api.istex.fr/document/3E8A1CB2353FA1D809908872659479713B9542C9/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001871</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a">A new clustering algorithm with multiple runs of iterative procedures</title>
<author>
<name sortKey="Zhang, Qiwen" sort="Zhang, Qiwen" uniqKey="Zhang Q" first="Qiwen" last="Zhang">Qiwen Zhang</name>
<affiliation>
<mods:affiliation>School of Computer Studies, The University of Leeds, Leeds LS2 9JT, U.K.</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Boyle, Roger D" sort="Boyle, Roger D" uniqKey="Boyle R" first="Roger D." last="Boyle">Roger D. Boyle</name>
<affiliation>
<mods:affiliation>School of Computer Studies, The University of Leeds, Leeds LS2 9JT, U.K.</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Pattern Recognition</title>
<title level="j" type="abbrev">PR</title>
<idno type="ISSN">0031-3203</idno>
<imprint>
<publisher>ELSEVIER</publisher>
<date type="published" when="1990">1990</date>
<biblScope unit="volume">24</biblScope>
<biblScope unit="issue">9</biblScope>
<biblScope unit="page" from="835">835</biblScope>
<biblScope unit="page" to="848">848</biblScope>
</imprint>
<idno type="ISSN">0031-3203</idno>
</series>
<idno type="istex">3E8A1CB2353FA1D809908872659479713B9542C9</idno>
<idno type="DOI">10.1016/0031-3203(91)90003-N</idno>
<idno type="PII">0031-3203(91)90003-N</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0031-3203</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">The K-means clustering algorithm is well known and widely used. Recently, an alternative algorithm based on an earlier idea (Duda and Hart, Pattern Classification and Scene Analysis (1973); Ismail and Kamel, Pattern Recognition22, 75–89 (1989)) has been proposed; in comprehensive experiments we verify that this alternative is superior to K-means. Both techniques suffer from the complexity of the cost function surface they are descending, and a consequent inclination to fall into poor local minima. A popular solution to this problem is to rerun the chosen iterative procedure from many randomly chosen starting positions, saving the best. We introduce here an alternative strategy to form the starting position for such a multiple run which consists of perturbing an existing result in an attempt to improve its poor aspects. This “controlled configuration change” has comparable complexity with random initialisation but usually produces superior clustering results. Comprehensive experiments confirm this result.</div>
</front>
</TEI>
<istex>
<corpusName>elsevier</corpusName>
<author>
<json:item>
<name>Qiwen Zhang</name>
<affiliations>
<json:string>School of Computer Studies, The University of Leeds, Leeds LS2 9JT, U.K.</json:string>
</affiliations>
</json:item>
<json:item>
<name>Roger D. Boyle</name>
<affiliations>
<json:string>School of Computer Studies, The University of Leeds, Leeds LS2 9JT, U.K.</json:string>
</affiliations>
</json:item>
</author>
<subject>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Clustering problem</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Configuration</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>K-means</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Random initialisation</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Cost minimisation</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Global minimum</value>
</json:item>
</subject>
<language>
<json:string>eng</json:string>
</language>
<abstract>The K-means clustering algorithm is well known and widely used. Recently, an alternative algorithm based on an earlier idea (Duda and Hart, Pattern Classification and Scene Analysis (1973); Ismail and Kamel, Pattern Recognition22, 75–89 (1989)) has been proposed; in comprehensive experiments we verify that this alternative is superior to K-means. Both techniques suffer from the complexity of the cost function surface they are descending, and a consequent inclination to fall into poor local minima. A popular solution to this problem is to rerun the chosen iterative procedure from many randomly chosen starting positions, saving the best. We introduce here an alternative strategy to form the starting position for such a multiple run which consists of perturbing an existing result in an attempt to improve its poor aspects. This “controlled configuration change” has comparable complexity with random initialisation but usually produces superior clustering results. Comprehensive experiments confirm this result.</abstract>
<qualityIndicators>
<score>6.776</score>
<pdfVersion>1.2</pdfVersion>
<pdfPageSize>548 x 785 pts</pdfPageSize>
<refBibsNative>true</refBibsNative>
<keywordCount>6</keywordCount>
<abstractCharCount>1019</abstractCharCount>
<pdfWordCount>6830</pdfWordCount>
<pdfCharCount>39241</pdfCharCount>
<pdfPageCount>14</pdfPageCount>
<abstractWordCount>148</abstractWordCount>
</qualityIndicators>
<title>A new clustering algorithm with multiple runs of iterative procedures</title>
<pii>
<json:string>0031-3203(91)90003-N</json:string>
</pii>
<genre>
<json:string>research-article</json:string>
</genre>
<serie>
<genre></genre>
<language>
<json:string>unknown</json:string>
</language>
<title>Master Thesis</title>
</serie>
<host>
<volume>24</volume>
<pii>
<json:string>S0031-3203(00)X0153-7</json:string>
</pii>
<pages>
<last>848</last>
<first>835</first>
</pages>
<issn>
<json:string>0031-3203</json:string>
</issn>
<issue>9</issue>
<genre>
<json:string>Journal</json:string>
</genre>
<language>
<json:string>unknown</json:string>
</language>
<title>Pattern Recognition</title>
<publicationDate>1991</publicationDate>
</host>
<categories>
<wos>
<json:string>COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE</json:string>
<json:string>ENGINEERING, ELECTRICAL & ELECTRONIC</json:string>
</wos>
</categories>
<publicationDate>1990</publicationDate>
<copyrightDate>1991</copyrightDate>
<doi>
<json:string>10.1016/0031-3203(91)90003-N</json:string>
</doi>
<id>3E8A1CB2353FA1D809908872659479713B9542C9</id>
<fulltext>
<json:item>
<original>true</original>
<mimetype>application/pdf</mimetype>
<extension>pdf</extension>
<uri>https://api.istex.fr/document/3E8A1CB2353FA1D809908872659479713B9542C9/fulltext/pdf</uri>
</json:item>
<json:item>
<original>true</original>
<mimetype>text/plain</mimetype>
<extension>txt</extension>
<uri>https://api.istex.fr/document/3E8A1CB2353FA1D809908872659479713B9542C9/fulltext/txt</uri>
</json:item>
<json:item>
<original>false</original>
<mimetype>application/zip</mimetype>
<extension>zip</extension>
<uri>https://api.istex.fr/document/3E8A1CB2353FA1D809908872659479713B9542C9/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/3E8A1CB2353FA1D809908872659479713B9542C9/fulltext/tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a">A new clustering algorithm with multiple runs of iterative procedures</title>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher>ELSEVIER</publisher>
<availability>
<p>ELSEVIER</p>
</availability>
<date>1991</date>
</publicationStmt>
<sourceDesc>
<biblStruct type="inbook">
<analytic>
<title level="a">A new clustering algorithm with multiple runs of iterative procedures</title>
<author>
<persName>
<forename type="first">Qiwen</forename>
<surname>Zhang</surname>
</persName>
<affiliation>School of Computer Studies, The University of Leeds, Leeds LS2 9JT, U.K.</affiliation>
</author>
<author>
<persName>
<forename type="first">Roger D.</forename>
<surname>Boyle</surname>
</persName>
<affiliation>Author to whom correspondence should be addressed.</affiliation>
<affiliation>School of Computer Studies, The University of Leeds, Leeds LS2 9JT, U.K.</affiliation>
</author>
</analytic>
<monogr>
<title level="j">Pattern Recognition</title>
<title level="j" type="abbrev">PR</title>
<idno type="pISSN">0031-3203</idno>
<idno type="PII">S0031-3203(00)X0153-7</idno>
<imprint>
<publisher>ELSEVIER</publisher>
<date type="published" when="1990"></date>
<biblScope unit="volume">24</biblScope>
<biblScope unit="issue">9</biblScope>
<biblScope unit="page" from="835">835</biblScope>
<biblScope unit="page" to="848">848</biblScope>
</imprint>
</monogr>
<idno type="istex">3E8A1CB2353FA1D809908872659479713B9542C9</idno>
<idno type="DOI">10.1016/0031-3203(91)90003-N</idno>
<idno type="PII">0031-3203(91)90003-N</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<creation>
<date>1991</date>
</creation>
<langUsage>
<language ident="en">en</language>
</langUsage>
<abstract xml:lang="en">
<p>The K-means clustering algorithm is well known and widely used. Recently, an alternative algorithm based on an earlier idea (Duda and Hart, Pattern Classification and Scene Analysis (1973); Ismail and Kamel, Pattern Recognition22, 75–89 (1989)) has been proposed; in comprehensive experiments we verify that this alternative is superior to K-means. Both techniques suffer from the complexity of the cost function surface they are descending, and a consequent inclination to fall into poor local minima. A popular solution to this problem is to rerun the chosen iterative procedure from many randomly chosen starting positions, saving the best. We introduce here an alternative strategy to form the starting position for such a multiple run which consists of perturbing an existing result in an attempt to improve its poor aspects. This “controlled configuration change” has comparable complexity with random initialisation but usually produces superior clustering results. Comprehensive experiments confirm this result.</p>
</abstract>
<textClass>
<keywords scheme="keyword">
<list>
<head>Keywords</head>
<item>
<term>Clustering problem</term>
</item>
<item>
<term>Configuration</term>
</item>
<item>
<term>K-means</term>
</item>
<item>
<term>Random initialisation</term>
</item>
<item>
<term>Cost minimisation</term>
</item>
<item>
<term>Global minimum</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc>
<change when="1990-06-01">Received</change>
<change when="1991-01-30">Modified</change>
<change when="1990">Published</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="Elsevier, elements deleted: tail">
<istex:xmlDeclaration>version="1.0" encoding="utf-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//ES//DTD journal article DTD version 4.5.2//EN//XML" URI="art452.dtd" name="istex:docType"></istex:docType>
<istex:document>
<converted-article version="4.5.2" docsubtype="fla">
<item-info>
<jid>PR</jid>
<aid>9190003N</aid>
<ce:pii>0031-3203(91)90003-N</ce:pii>
<ce:doi>10.1016/0031-3203(91)90003-N</ce:doi>
<ce:copyright type="unknown" year="1991"></ce:copyright>
</item-info>
<head>
<ce:title>A new clustering algorithm with multiple runs of iterative procedures</ce:title>
<ce:author-group>
<ce:author>
<ce:given-name>Qiwen</ce:given-name>
<ce:surname>Zhang</ce:surname>
</ce:author>
<ce:author>
<ce:given-name>Roger D.</ce:given-name>
<ce:surname>Boyle</ce:surname>
<ce:cross-ref refid="COR1">
<ce:sup></ce:sup>
</ce:cross-ref>
</ce:author>
<ce:affiliation>
<ce:textfn>School of Computer Studies, The University of Leeds, Leeds LS2 9JT, U.K.</ce:textfn>
</ce:affiliation>
<ce:correspondence id="COR1">
<ce:label></ce:label>
<ce:text>Author to whom correspondence should be addressed.</ce:text>
</ce:correspondence>
</ce:author-group>
<ce:date-received day="1" month="6" year="1990"></ce:date-received>
<ce:date-revised day="30" month="1" year="1991"></ce:date-revised>
<ce:abstract>
<ce:section-title>Abstract</ce:section-title>
<ce:abstract-sec>
<ce:simple-para>The
<ce:italic>K</ce:italic>
-means clustering algorithm is well known and widely used. Recently, an alternative algorithm based on an earlier idea (Duda and Hart,
<ce:italic>Pattern Classification and Scene Analysis</ce:italic>
(1973); Ismail and Kamel,
<ce:italic>Pattern Recognition</ce:italic>
<ce:bold>22</ce:bold>
, 75–89 (1989)) has been proposed; in comprehensive experiments we verify that this alternative is superior to
<ce:italic>K</ce:italic>
-means. Both techniques suffer from the complexity of the cost function surface they are descending, and a consequent inclination to fall into poor local minima. A popular solution to this problem is to rerun the chosen iterative procedure from many randomly chosen starting positions, saving the best. We introduce here an alternative strategy to form the starting position for such a multiple run which consists of perturbing an existing result in an attempt to improve its poor aspects. This “controlled configuration change” has comparable complexity with random initialisation but usually produces superior clustering results. Comprehensive experiments confirm this result.</ce:simple-para>
</ce:abstract-sec>
</ce:abstract>
<ce:keywords>
<ce:section-title>Keywords</ce:section-title>
<ce:keyword>
<ce:text>Clustering problem</ce:text>
</ce:keyword>
<ce:keyword>
<ce:text>Configuration</ce:text>
</ce:keyword>
<ce:keyword>
<ce:text>
<ce:italic>K</ce:italic>
-means</ce:text>
</ce:keyword>
<ce:keyword>
<ce:text>Random initialisation</ce:text>
</ce:keyword>
<ce:keyword>
<ce:text>Cost minimisation</ce:text>
</ce:keyword>
<ce:keyword>
<ce:text>Global minimum</ce:text>
</ce:keyword>
</ce:keywords>
</head>
</converted-article>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo>
<title>A new clustering algorithm with multiple runs of iterative procedures</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA">
<title>A new clustering algorithm with multiple runs of iterative procedures</title>
</titleInfo>
<name type="personal">
<namePart type="given">Qiwen</namePart>
<namePart type="family">Zhang</namePart>
<affiliation>School of Computer Studies, The University of Leeds, Leeds LS2 9JT, U.K.</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Roger D.</namePart>
<namePart type="family">Boyle</namePart>
<affiliation>School of Computer Studies, The University of Leeds, Leeds LS2 9JT, U.K.</affiliation>
<description>Author to whom correspondence should be addressed.</description>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="research-article" displayLabel="Full-length article"></genre>
<originInfo>
<publisher>ELSEVIER</publisher>
<dateIssued encoding="w3cdtf">1990</dateIssued>
<dateCaptured encoding="w3cdtf">1990-06-01</dateCaptured>
<dateModified encoding="w3cdtf">1991-01-30</dateModified>
<copyrightDate encoding="w3cdtf">1991</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
</language>
<physicalDescription>
<internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract lang="en">The K-means clustering algorithm is well known and widely used. Recently, an alternative algorithm based on an earlier idea (Duda and Hart, Pattern Classification and Scene Analysis (1973); Ismail and Kamel, Pattern Recognition22, 75–89 (1989)) has been proposed; in comprehensive experiments we verify that this alternative is superior to K-means. Both techniques suffer from the complexity of the cost function surface they are descending, and a consequent inclination to fall into poor local minima. A popular solution to this problem is to rerun the chosen iterative procedure from many randomly chosen starting positions, saving the best. We introduce here an alternative strategy to form the starting position for such a multiple run which consists of perturbing an existing result in an attempt to improve its poor aspects. This “controlled configuration change” has comparable complexity with random initialisation but usually produces superior clustering results. Comprehensive experiments confirm this result.</abstract>
<subject>
<genre>Keywords</genre>
<topic>Clustering problem</topic>
<topic>Configuration</topic>
<topic>K-means</topic>
<topic>Random initialisation</topic>
<topic>Cost minimisation</topic>
<topic>Global minimum</topic>
</subject>
<relatedItem type="host">
<titleInfo>
<title>Pattern Recognition</title>
</titleInfo>
<titleInfo type="abbreviated">
<title>PR</title>
</titleInfo>
<genre type="Journal">journal</genre>
<originInfo>
<dateIssued encoding="w3cdtf">1991</dateIssued>
</originInfo>
<identifier type="ISSN">0031-3203</identifier>
<identifier type="PII">S0031-3203(00)X0153-7</identifier>
<part>
<date>1991</date>
<detail type="volume">
<number>24</number>
<caption>vol.</caption>
</detail>
<detail type="issue">
<number>9</number>
<caption>no.</caption>
</detail>
<extent unit="issue pages">
<start>815</start>
<end>919</end>
</extent>
<extent unit="pages">
<start>835</start>
<end>848</end>
</extent>
</part>
</relatedItem>
<identifier type="istex">3E8A1CB2353FA1D809908872659479713B9542C9</identifier>
<identifier type="DOI">10.1016/0031-3203(91)90003-N</identifier>
<identifier type="PII">0031-3203(91)90003-N</identifier>
<recordInfo>
<recordContentSource>ELSEVIER</recordContentSource>
</recordInfo>
</mods>
</metadata>
<enrichments>
<istex:catWosTEI uri="https://api.istex.fr/document/3E8A1CB2353FA1D809908872659479713B9542C9/enrichments/catWos">
<teiHeader>
<profileDesc>
<textClass>
<classCode scheme="WOS">COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE</classCode>
<classCode scheme="WOS">ENGINEERING, ELECTRICAL & ELECTRONIC</classCode>
</textClass>
</profileDesc>
</teiHeader>
</istex:catWosTEI>
</enrichments>
</istex>
</record>

Pour manipuler ce document sous Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001871 | SxmlIndent | more

Ou

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 001871 | SxmlIndent | more

Pour mettre un lien sur cette page dans le réseau Wicri

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:3E8A1CB2353FA1D809908872659479713B9542C9
   |texte=   A new clustering algorithm with multiple runs of iterative procedures
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024