A New Feature Selection and Feature Contrasting Approach Based on Quality Metric: Application to Efficient Classification of Complex Textual Data
Internal identifier: 001572 (Istex/Curation); previous: 001571; next: 001573
Authors: Jean-Charles Lamirel [France]; Pascal Cuxac [France]; Aneesh Sreevallabh Chivukula [India]; Kafil Hajlaoui [France]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ]
Abstract
Abstract: Feature maximization is a cluster quality metric which favors clusters with maximum feature representation with regard to their associated data. In this paper we go one step further, showing that a straightforward adaptation of this metric can provide a highly efficient feature selection and feature contrasting model in the context of supervised classification. In particular, we show that this technique can enhance the performance of classification methods whilst very significantly outperforming (+80%) state-of-the-art feature selection techniques when classifying unbalanced, highly multidimensional and noisy textual data with a high degree of similarity between the classes.
Url:
DOI: 10.1007/978-3-642-40319-4_32
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: To go to this record in the Curation step: 001591
Links to Exploration step
ISTEX:5E5E321E04152FC0E1A70514A3E8C0A3194602FD
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">A New Feature Selection and Feature Contrasting Approach Based on Quality Metric: Application to Efficient Classification of Complex Textual Data</title>
<author><name sortKey="Lamirel, Jean Charles" sort="Lamirel, Jean Charles" uniqKey="Lamirel J" first="Jean-Charles" last="Lamirel">Jean-Charles Lamirel</name>
<affiliation wicri:level="1"><mods:affiliation>SYNALP Team - LORIA, INRIA Nancy-Grand Est, Vandoeuvre-les-Nancy, France</mods:affiliation>
<country xml:lang="fr">France</country>
<wicri:regionArea>SYNALP Team - LORIA, INRIA Nancy-Grand Est, Vandoeuvre-les-Nancy</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: jean-charles.lamirel@loria.fr</mods:affiliation>
<country wicri:rule="url">France</country>
</affiliation>
</author>
<author><name sortKey="Cuxac, Pascal" sort="Cuxac, Pascal" uniqKey="Cuxac P" first="Pascal" last="Cuxac">Pascal Cuxac</name>
<affiliation wicri:level="1"><mods:affiliation>INIST-CNRS, Vandoeuvre-les-Nancy, France</mods:affiliation>
<country xml:lang="fr">France</country>
<wicri:regionArea>INIST-CNRS, Vandoeuvre-les-Nancy</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: pascal.cuxac@inist.fr</mods:affiliation>
<country wicri:rule="url">France</country>
</affiliation>
</author>
<author><name sortKey="Chivukula, Aneesh Sreevallabh" sort="Chivukula, Aneesh Sreevallabh" uniqKey="Chivukula A" first="Aneesh Sreevallabh" last="Chivukula">Aneesh Sreevallabh Chivukula</name>
<affiliation wicri:level="1"><mods:affiliation>Center for Data Engineering, International Institute of Information Technology, Gachibowli, Hyderabad, Andhra Pradesh, India</mods:affiliation>
<country xml:lang="fr">Inde</country>
<wicri:regionArea>Center for Data Engineering, International Institute of Information Technology, Gachibowli, Hyderabad, Andhra Pradesh</wicri:regionArea>
</affiliation>
<affiliation><mods:affiliation>E-mail: aneesh.chivukula@gmail.com</mods:affiliation>
<wicri:noCountry code="no comma">E-mail: aneesh.chivukula@gmail.com</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Hajlaoui, Kafil" sort="Hajlaoui, Kafil" uniqKey="Hajlaoui K" first="Kafil" last="Hajlaoui">Kafil Hajlaoui</name>
<affiliation wicri:level="1"><mods:affiliation>INIST-CNRS, Vandoeuvre-les-Nancy, France</mods:affiliation>
<country xml:lang="fr">France</country>
<wicri:regionArea>INIST-CNRS, Vandoeuvre-les-Nancy</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:5E5E321E04152FC0E1A70514A3E8C0A3194602FD</idno>
<date when="2013" year="2013">2013</date>
<idno type="doi">10.1007/978-3-642-40319-4_32</idno>
<idno type="url">https://api.istex.fr/ark:/67375/HCB-ZKP9VB3P-8/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001591</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">001591</idno>
<idno type="wicri:Area/Istex/Curation">001572</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">A New Feature Selection and Feature Contrasting Approach Based on Quality Metric: Application to Efficient Classification of Complex Textual Data</title>
<author><name sortKey="Lamirel, Jean Charles" sort="Lamirel, Jean Charles" uniqKey="Lamirel J" first="Jean-Charles" last="Lamirel">Jean-Charles Lamirel</name>
<affiliation wicri:level="1"><mods:affiliation>SYNALP Team - LORIA, INRIA Nancy-Grand Est, Vandoeuvre-les-Nancy, France</mods:affiliation>
<country xml:lang="fr">France</country>
<wicri:regionArea>SYNALP Team - LORIA, INRIA Nancy-Grand Est, Vandoeuvre-les-Nancy</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: jean-charles.lamirel@loria.fr</mods:affiliation>
<country wicri:rule="url">France</country>
</affiliation>
</author>
<author><name sortKey="Cuxac, Pascal" sort="Cuxac, Pascal" uniqKey="Cuxac P" first="Pascal" last="Cuxac">Pascal Cuxac</name>
<affiliation wicri:level="1"><mods:affiliation>INIST-CNRS, Vandoeuvre-les-Nancy, France</mods:affiliation>
<country xml:lang="fr">France</country>
<wicri:regionArea>INIST-CNRS, Vandoeuvre-les-Nancy</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: pascal.cuxac@inist.fr</mods:affiliation>
<country wicri:rule="url">France</country>
</affiliation>
</author>
<author><name sortKey="Chivukula, Aneesh Sreevallabh" sort="Chivukula, Aneesh Sreevallabh" uniqKey="Chivukula A" first="Aneesh Sreevallabh" last="Chivukula">Aneesh Sreevallabh Chivukula</name>
<affiliation wicri:level="1"><mods:affiliation>Center for Data Engineering, International Institute of Information Technology, Gachibowli, Hyderabad, Andhra Pradesh, India</mods:affiliation>
<country xml:lang="fr">Inde</country>
<wicri:regionArea>Center for Data Engineering, International Institute of Information Technology, Gachibowli, Hyderabad, Andhra Pradesh</wicri:regionArea>
</affiliation>
<affiliation><mods:affiliation>E-mail: aneesh.chivukula@gmail.com</mods:affiliation>
</affiliation>
</author>
<author><name sortKey="Hajlaoui, Kafil" sort="Hajlaoui, Kafil" uniqKey="Hajlaoui K" first="Kafil" last="Hajlaoui">Kafil Hajlaoui</name>
<affiliation wicri:level="1"><mods:affiliation>INIST-CNRS, Vandoeuvre-les-Nancy, France</mods:affiliation>
<country xml:lang="fr">France</country>
<wicri:regionArea>INIST-CNRS, Vandoeuvre-les-Nancy</wicri:regionArea>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s" type="main" xml:lang="en">Lecture Notes in Computer Science</title>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Feature maximization is a cluster quality metric which favors clusters with maximum feature representation with regard to their associated data. In this paper we go one step further, showing that a straightforward adaptation of this metric can provide a highly efficient feature selection and feature contrasting model in the context of supervised classification. In particular, we show that this technique can enhance the performance of classification methods whilst very significantly outperforming (+80%) state-of-the-art feature selection techniques when classifying unbalanced, highly multidimensional and noisy textual data with a high degree of similarity between the classes.</div>
</front>
</TEI>
</record>
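The record above can also be read programmatically. The following is a minimal Python sketch, not part of Dilib, that pulls the title, author names, year and DOI out of such a record with the standard library. It assumes the record is available as a string. Because the `wicri:` and `mods:` prefixes are used without being declared, the sketch wraps the fragment in a hypothetical `<wrapper>` element that binds them to placeholder URIs (`urn:example:...`, invented here only so the parser accepts the input). The function name `parse_record` and the trimmed `SAMPLE` (affiliations and most identifiers omitted) are likewise illustrative.

```python
import xml.etree.ElementTree as ET

# Assumption: placeholder namespace URIs. The record does not declare the
# wicri: and mods: prefixes, so they are bound here to made-up URIs.
WRAPPER_OPEN = (
    '<wrapper xmlns:wicri="urn:example:wicri" '
    'xmlns:mods="urn:example:mods">'
)
WRAPPER_CLOSE = "</wrapper>"


def parse_record(record_xml):
    """Return title, authors, year and DOI from a <record> fragment."""
    root = ET.fromstring(WRAPPER_OPEN + record_xml + WRAPPER_CLOSE)
    # titleStmt holds the first copy of the title and author list.
    title_stmt = root.find(".//titleStmt")
    date = root.find(".//publicationStmt/date")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "year": date.get("when") if date is not None else None,
        "doi": root.findtext(".//publicationStmt/idno[@type='doi']"),
    }


# Trimmed copy of the record shown above (affiliations removed).
SAMPLE = """
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc>
<titleStmt><title xml:lang="en">A New Feature Selection and Feature Contrasting Approach Based on Quality Metric: Application to Efficient Classification of Complex Textual Data</title>
<author><name>Jean-Charles Lamirel</name></author>
<author><name>Pascal Cuxac</name></author>
<author><name>Aneesh Sreevallabh Chivukula</name></author>
<author><name>Kafil Hajlaoui</name></author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<date when="2013" year="2013">2013</date>
<idno type="doi">10.1007/978-3-642-40319-4_32</idno>
</publicationStmt>
</fileDesc></teiHeader></TEI></record>
"""

if __name__ == "__main__":
    info = parse_record(SAMPLE)
    print(info["authors"], info["year"], info["doi"])
```

Wrapping the fragment rather than editing it leaves the record text untouched; a real pipeline would presumably substitute the actual namespace URIs if they are known.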
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Istex/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001572 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Istex/Curation/biblio.hfd -nk 001572 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Lorraine |area= InforLorV4 |flux= Istex |étape= Curation |type= RBID |clé= ISTEX:5E5E321E04152FC0E1A70514A3E8C0A3194602FD |texte= A New Feature Selection and Feature Contrasting Approach Based on Quality Metric: Application to Efficient Classification of Complex Textual Data }}
This area was generated with Dilib version V0.6.33.