Cheating to achieve Formal Concept Analysis over a large formal context
Internal identifier: 001459 (Hal/Curation); previous: 001458; next: 001460
Authors: Victor Codocedo [Chile]; Carla Taramasco [France]; Hernan Astudillo [Chile]
Abstract
Researchers are facing one of the main problems of the Information Era. As more articles are made electronically available, it gets harder to follow trends in the different domains of research. Knowledge models of research domains that are cheap, coherent, and fast to construct will be much needed when information becomes unmanageable. While Formal Concept Analysis (FCA) has been widely used in several areas to construct knowledge artifacts for this purpose (ontology development, information retrieval, software refactoring, knowledge discovery), the large number of documents and terms used in research domains makes it a poor option because of its high computational cost and humanly unprocessable output. In this article we propose a novel heuristic to create a taxonomy from a large term-document dataset using Latent Semantic Analysis and Formal Concept Analysis. We provide and discuss its implementation on a real dataset from the Software Architecture community obtained from the ISI Web of Knowledge (4400 documents).
Links toward previous steps (curation, corpus...)
- to stream Hal, to step Corpus: to go to this record in the Curation step: 001459
Links to Exploration step
Hal:hal-00654576
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Cheating to achieve Formal Concept Analysis over a large formal context</title>
<author><name sortKey="Codocedo, Victor" sort="Codocedo, Victor" uniqKey="Codocedo V" first="Victor" last="Codocedo">Victor Codocedo</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-36850" status="VALID"><orgName>Departamento de Informatica [Valparaíso, Chile]</orgName>
<desc><address><addrLine>Av.España 1680 - Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://portal.inf.utfsm.cl</ref>
</desc>
<listRelation><relation active="#struct-406898" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-406898" type="direct"><org type="institution" xml:id="struct-406898" status="VALID"><orgName>Universidad Tecnica Federico Santa Maria [Valparaiso]</orgName>
<orgName type="acronym">UTFSM</orgName>
<desc><address><addrLine>Avenida España 1680, Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://www.usm.cl/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Chili</country>
</affiliation>
</author>
<author><name sortKey="Taramasco, Carla" sort="Taramasco, Carla" uniqKey="Taramasco C" first="Carla" last="Taramasco">Carla Taramasco</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-1173" status="OLD"><orgName>Centre de recherche en épistémologie appliquée</orgName>
<orgName type="acronym">CREA</orgName>
<desc><address><addrLine>ROUTE DE SACLAY 91128 PALAISEAU CEDEX</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.crea.polytechnique.fr/LeCREA/</ref>
</desc>
<listRelation><relation name="UMR7656" active="#struct-441569" type="direct"></relation>
<relation active="#struct-300340" type="direct"></relation>
</listRelation>
<tutelles><tutelle name="UMR7656" active="#struct-441569" type="direct"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300340" type="direct"><org type="institution" xml:id="struct-300340" status="VALID"><orgName>Polytechnique - X</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Astudillo, Hernan" sort="Astudillo, Hernan" uniqKey="Astudillo H" first="Hernan" last="Astudillo">Hernan Astudillo</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-36850" status="VALID"><orgName>Departamento de Informatica [Valparaíso, Chile]</orgName>
<desc><address><addrLine>Av.España 1680 - Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://portal.inf.utfsm.cl</ref>
</desc>
<listRelation><relation active="#struct-406898" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-406898" type="direct"><org type="institution" xml:id="struct-406898" status="VALID"><orgName>Universidad Tecnica Federico Santa Maria [Valparaiso]</orgName>
<orgName type="acronym">UTFSM</orgName>
<desc><address><addrLine>Avenida España 1680, Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://www.usm.cl/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Chili</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-00654576</idno>
<idno type="halId">hal-00654576</idno>
<idno type="halUri">https://hal.archives-ouvertes.fr/hal-00654576</idno>
<idno type="url">https://hal.archives-ouvertes.fr/hal-00654576</idno>
<date when="2011-10-17">2011-10-17</date>
<idno type="wicri:Area/Hal/Corpus">001459</idno>
<idno type="wicri:Area/Hal/Curation">001459</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Cheating to achieve Formal Concept Analysis over a large formal context</title>
<author><name sortKey="Codocedo, Victor" sort="Codocedo, Victor" uniqKey="Codocedo V" first="Victor" last="Codocedo">Victor Codocedo</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-36850" status="VALID"><orgName>Departamento de Informatica [Valparaíso, Chile]</orgName>
<desc><address><addrLine>Av.España 1680 - Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://portal.inf.utfsm.cl</ref>
</desc>
<listRelation><relation active="#struct-406898" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-406898" type="direct"><org type="institution" xml:id="struct-406898" status="VALID"><orgName>Universidad Tecnica Federico Santa Maria [Valparaiso]</orgName>
<orgName type="acronym">UTFSM</orgName>
<desc><address><addrLine>Avenida España 1680, Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://www.usm.cl/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Chili</country>
</affiliation>
</author>
<author><name sortKey="Taramasco, Carla" sort="Taramasco, Carla" uniqKey="Taramasco C" first="Carla" last="Taramasco">Carla Taramasco</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-1173" status="OLD"><orgName>Centre de recherche en épistémologie appliquée</orgName>
<orgName type="acronym">CREA</orgName>
<desc><address><addrLine>ROUTE DE SACLAY 91128 PALAISEAU CEDEX</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.crea.polytechnique.fr/LeCREA/</ref>
</desc>
<listRelation><relation name="UMR7656" active="#struct-441569" type="direct"></relation>
<relation active="#struct-300340" type="direct"></relation>
</listRelation>
<tutelles><tutelle name="UMR7656" active="#struct-441569" type="direct"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300340" type="direct"><org type="institution" xml:id="struct-300340" status="VALID"><orgName>Polytechnique - X</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Astudillo, Hernan" sort="Astudillo, Hernan" uniqKey="Astudillo H" first="Hernan" last="Astudillo">Hernan Astudillo</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-36850" status="VALID"><orgName>Departamento de Informatica [Valparaíso, Chile]</orgName>
<desc><address><addrLine>Av.España 1680 - Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://portal.inf.utfsm.cl</ref>
</desc>
<listRelation><relation active="#struct-406898" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-406898" type="direct"><org type="institution" xml:id="struct-406898" status="VALID"><orgName>Universidad Tecnica Federico Santa Maria [Valparaiso]</orgName>
<orgName type="acronym">UTFSM</orgName>
<desc><address><addrLine>Avenida España 1680, Valparaíso</addrLine>
<country key="CL"></country>
</address>
<ref type="url">http://www.usm.cl/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Chili</country>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Researchers are facing one of the main problems of the Information Era. As more articles are made electronically available, it gets harder to follow trends in the different domains of research. Cheap, coherent and fast to construct knowledge models of research domains will be much required when information becomes unmanageable. While Formal Concept Analysis (FCA) has been widely used on several areas to construct knowledge artifacts for this purpose (Ontology development, Information Retrieval, Software Refactoring, Knowledge Discovery), the large amount of documents and terminology used on research domains makes it not a very good option (because of the high computational cost and humanly-unprocessable output). In this article we propose a novel heuristic to create a taxonomy from a large term-document dataset using Latent Semantic Analysis and Formal Concept Analysis. We provide and discuss its implementation on a real dataset from the Software Architecture community obtained from the ISI Web of Knowledge (4400 documents).</div>
</front>
</TEI>
<hal api="V3"><titleStmt><title xml:lang="en">Cheating to achieve Formal Concept Analysis over a large formal context</title>
<author role="aut"><persName><forename type="first">Victor</forename>
<surname>Codocedo</surname>
</persName>
<email>victor.codocedo@inria.fr</email>
<idno type="halauthor">673217</idno>
<affiliation ref="#struct-36850"></affiliation>
<affiliation ref="#struct-2358"></affiliation>
</author>
<author role="aut"><persName><forename type="first">Carla</forename>
<surname>Taramasco</surname>
</persName>
<email></email>
<idno type="halauthor">480166</idno>
<affiliation ref="#struct-1173"></affiliation>
</author>
<author role="aut"><persName><forename type="first">Hernan</forename>
<surname>Astudillo</surname>
</persName>
<email></email>
<idno type="halauthor">448702</idno>
<affiliation ref="#struct-36850"></affiliation>
</author>
<editor role="depositor"><persName><forename>Victor</forename>
<surname>Codocedo</surname>
</persName>
<email>victor.codocedo@inria.fr</email>
</editor>
<funder>Quaero Program, funded by OSEO, French State agency for innovation</funder>
</titleStmt>
<editionStmt><edition n="v1" type="current"><date type="whenSubmitted">2011-12-22 12:38:30</date>
<date type="whenWritten">2011-10-10</date>
<date type="whenModified">2016-05-18 08:55:36</date>
<date type="whenReleased">2011-12-22 13:28:28</date>
<date type="whenProduced">2011-10-17</date>
<date type="whenEndEmbargoed">2011-12-22</date>
<ref type="file" target="https://hal.archives-ouvertes.fr/hal-00654576/document"><date notBefore="2011-12-22"></date>
</ref>
<ref type="file" subtype="author" n="1" target="https://hal.archives-ouvertes.fr/hal-00654576/file/codocedo.pdf"><date notBefore="2011-12-22"></date>
</ref>
</edition>
<respStmt><resp>contributor</resp>
<name key="168231"><persName><forename>Victor</forename>
<surname>Codocedo</surname>
</persName>
<email>victor.codocedo@inria.fr</email>
</name>
</respStmt>
</editionStmt>
<publicationStmt><distributor>CCSD</distributor>
<idno type="halId">hal-00654576</idno>
<idno type="halUri">https://hal.archives-ouvertes.fr/hal-00654576</idno>
<idno type="halBibtex">codocedo:hal-00654576</idno>
<idno type="halRefHtml">The Eighth International Conference on Concept Lattices and their Applications - CLA 2011, Oct 2011, Nancy, France. pp.349-362, 2011</idno>
<idno type="halRef">The Eighth International Conference on Concept Lattices and their Applications - CLA 2011, Oct 2011, Nancy, France. pp.349-362, 2011</idno>
</publicationStmt>
<seriesStmt><idno type="stamp" n="CNRS">CNRS - Centre national de la recherche scientifique</idno>
<idno type="stamp" n="INRIA">INRIA - Institut National de Recherche en Informatique et en Automatique</idno>
<idno type="stamp" n="INPL">Institut National Polytechnique de Lorraine</idno>
<idno type="stamp" n="LORIA2">Publications du LORIA</idno>
<idno type="stamp" n="INRIA-NANCY-GRAND-EST">INRIA Nancy - Grand Est</idno>
<idno type="stamp" n="LORIA">LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications</idno>
<idno type="stamp" n="LORIA-TALC" p="LORIA">Traitement automatique des langues et des connaissances</idno>
<idno type="stamp" n="PARISTECH">ParisTech</idno>
<idno type="stamp" n="X-CREA" p="X">Centre de Recherche en Epistémologie Appliquée (CREA)</idno>
<idno type="stamp" n="X" p="PARISTECH">Ecole Polytechnique</idno>
<idno type="stamp" n="X-DEP-SHS" p="X-DEP">Département humanités et sciences sociales</idno>
<idno type="stamp" n="X-DEP">Polytechnique</idno>
<idno type="stamp" n="INRIA2">INRIA 2</idno>
<idno type="stamp" n="INRIA-LORRAINE">INRIA Nancy - Grand Est</idno>
<idno type="stamp" n="LABO-LORIA-SET" p="LORIA">LABO-LORIA-SET</idno>
<idno type="stamp" n="UNIV-LORRAINE">Université de Lorraine</idno>
</seriesStmt>
<notesStmt><note type="audience" n="2">International</note>
<note type="invited" n="0">No</note>
<note type="popular" n="0">No</note>
<note type="peer" n="1">Yes</note>
<note type="proceedings" n="1">Yes</note>
</notesStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Cheating to achieve Formal Concept Analysis over a large formal context</title>
<author role="aut"><persName><forename type="first">Victor</forename>
<surname>Codocedo</surname>
</persName>
<email>victor.codocedo@inria.fr</email>
<idno type="halAuthorId">673217</idno>
<affiliation ref="#struct-36850"></affiliation>
<affiliation ref="#struct-2358"></affiliation>
</author>
<author role="aut"><persName><forename type="first">Carla</forename>
<surname>Taramasco</surname>
</persName>
<idno type="halAuthorId">480166</idno>
<affiliation ref="#struct-1173"></affiliation>
</author>
<author role="aut"><persName><forename type="first">Hernan</forename>
<surname>Astudillo</surname>
</persName>
<idno type="halAuthorId">448702</idno>
<affiliation ref="#struct-36850"></affiliation>
</author>
</analytic>
<monogr><title level="m">The Eighth International Conference on Concept Lattices and their Applications - CLA 2011</title>
<meeting><title>The Eighth International Conference on Concept Lattices and their Applications - CLA 2011</title>
<date type="start">2011-10-17</date>
<date type="end">2011-10-20</date>
<settlement>Nancy</settlement>
<country key="FR">France</country>
</meeting>
<imprint><biblScope unit="pp">349-362</biblScope>
<date type="datePub">2011</date>
</imprint>
</monogr>
</biblStruct>
</sourceDesc>
<profileDesc><langUsage><language ident="en">English</language>
</langUsage>
<textClass><classCode scheme="halDomain" n="info.info-tt">Computer Science [cs]/Document and Text Processing</classCode>
<classCode scheme="halDomain" n="info.info-ir">Computer Science [cs]/Information Retrieval [cs.IR]</classCode>
<classCode scheme="halTypology" n="COMM">Conference papers</classCode>
</textClass>
<abstract xml:lang="en">Researchers are facing one of the main problems of the Information Era. As more articles are made electronically available, it gets harder to follow trends in the different domains of research. Cheap, coherent and fast to construct knowledge models of research domains will be much required when information becomes unmanageable. While Formal Concept Analysis (FCA) has been widely used on several areas to construct knowledge artifacts for this purpose (Ontology development, Information Retrieval, Software Refactoring, Knowledge Discovery), the large amount of documents and terminology used on research domains makes it not a very good option (because of the high computational cost and humanly-unprocessable output). In this article we propose a novel heuristic to create a taxonomy from a large term-document dataset using Latent Semantic Analysis and Formal Concept Analysis. We provide and discuss its implementation on a real dataset from the Software Architecture community obtained from the ISI Web of Knowledge (4400 documents).</abstract>
</profileDesc>
</hal>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Hal/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001459 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Hal/Curation/biblio.hfd -nk 001459 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Lorraine |area= InforLorV4 |flux= Hal |étape= Curation |type= RBID |clé= Hal:hal-00654576 |texte= Cheating to achieve Formal Concept Analysis over a large formal context }}
This area was generated with Dilib version V0.6.33.