Towards Semantically Sensitive Text Clustering: A Feature Space Modeling Technology Based on Dimension Extension
Internal identifier: 000109 (Main/Exploration); previous: 000108; next: 000110
Authors: Yuanchao Liu; Ming Liu; Xin Wang
Source:
- PLoS ONE; 2015.
Abstract
The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction and similarity computation method. By combining the similarity in traditional feature space and that in extension space, the adverse effects of the complexity and diversity of natural language can be addressed and clustering semantic sensitivity can be improved correspondingly. The generated clusters can be organized using different granularities. The experimental evaluations on well-known clustering algorithms and datasets have verified the effectiveness of our approach.
Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367988
DOI: 10.1371/journal.pone.0117390
PubMed: 25794172
PubMed Central: 4367988
Affiliations:
Links to previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: 000039
- to stream Pmc, to step Curation: 000039
- to stream Pmc, to step Checkpoint: 000021
- to stream Ncbi, to step Merge: 000D26
- to stream Ncbi, to step Curation: 000D26
- to stream Ncbi, to step Checkpoint: 000D26
- to stream Main, to step Merge: 000106
- to stream Main, to step Curation: 000109
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Towards Semantically Sensitive Text Clustering: A Feature Space Modeling Technology Based on Dimension Extension</title>
<author><name sortKey="Liu, Yuanchao" sort="Liu, Yuanchao" uniqKey="Liu Y" first="Yuanchao" last="Liu">Yuanchao Liu</name>
<affiliation><nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author><name sortKey="Liu, Ming" sort="Liu, Ming" uniqKey="Liu M" first="Ming" last="Liu">Ming Liu</name>
<affiliation><nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author><name sortKey="Wang, Xin" sort="Wang, Xin" uniqKey="Wang X" first="Xin" last="Wang">Xin Wang</name>
<affiliation><nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">25794172</idno>
<idno type="pmc">4367988</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367988</idno>
<idno type="RBID">PMC:4367988</idno>
<idno type="doi">10.1371/journal.pone.0117390</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000039</idno>
<idno type="wicri:Area/Pmc/Curation">000039</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000021</idno>
<idno type="wicri:Area/Ncbi/Merge">000D26</idno>
<idno type="wicri:Area/Ncbi/Curation">000D26</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000D26</idno>
<idno type="wicri:Area/Main/Merge">000106</idno>
<idno type="wicri:Area/Main/Curation">000109</idno>
<idno type="wicri:Area/Main/Exploration">000109</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Towards Semantically Sensitive Text Clustering: A Feature Space Modeling Technology Based on Dimension Extension</title>
<author><name sortKey="Liu, Yuanchao" sort="Liu, Yuanchao" uniqKey="Liu Y" first="Yuanchao" last="Liu">Yuanchao Liu</name>
<affiliation><nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author><name sortKey="Liu, Ming" sort="Liu, Ming" uniqKey="Liu M" first="Ming" last="Liu">Ming Liu</name>
<affiliation><nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author><name sortKey="Wang, Xin" sort="Wang, Xin" uniqKey="Wang X" first="Xin" last="Wang">Xin Wang</name>
<affiliation><nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
</analytic>
<series><title level="j">PLoS ONE</title>
<idno type="e-ISSN">1932-6203</idno>
<imprint><date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><p>The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction and similarity computation method. By combining the similarity in traditional feature space and that in extension space, the adverse effects of the complexity and diversity of natural language can be addressed and clustering semantic sensitivity can be improved correspondingly. The generated clusters can be organized using different granularities. The experimental evaluations on well-known clustering algorithms and datasets have verified the effectiveness of our approach.</p>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct><analytic><author><name sortKey="Kaski, S" uniqKey="Kaski S">S Kaski</name>
</author>
<author><name sortKey="Honkela, T" uniqKey="Honkela T">T Honkela</name>
</author>
<author><name sortKey="Lagus, K" uniqKey="Lagus K">K Lagus</name>
</author>
<author><name sortKey="Kohonen, T" uniqKey="Kohonen T">T Kohonen</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Chim, H" uniqKey="Chim H">H Chim</name>
</author>
<author><name sortKey="Xiaotie, D" uniqKey="Xiaotie D">D Xiaotie</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Guerrero, R" uniqKey="Guerrero R">R Guerrero</name>
</author>
<author><name sortKey="Vincent, P" uniqKey="Vincent P">P Vincent</name>
</author>
<author><name sortKey="Moya, A" uniqKey="Moya A">A Moya</name>
</author>
<author><name sortKey="Victor, H" uniqKey="Victor H">H Victor</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Shan, C" uniqKey="Shan C">C Shan</name>
</author>
<author><name sortKey="Damminda, A" uniqKey="Damminda A">A Damminda</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Merkl, D" uniqKey="Merkl D">D Merkl</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Hartigan, A" uniqKey="Hartigan A">A Hartigan</name>
</author>
<author><name sortKey="Wong, A" uniqKey="Wong A">A Wong</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kohonen, T" uniqKey="Kohonen T">T Kohonen</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Hammouda, M" uniqKey="Hammouda M">M Hammouda</name>
</author>
<author><name sortKey="Kamel, S" uniqKey="Kamel S">S Kamel</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Salton, G" uniqKey="Salton G">G Salton</name>
</author>
<author><name sortKey="Wong, A" uniqKey="Wong A">A Wong</name>
</author>
<author><name sortKey="Yang, C" uniqKey="Yang C">C Yang</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kwak, M" uniqKey="Kwak M">M Kwak</name>
</author>
<author><name sortKey="Leroy, G" uniqKey="Leroy G">G Leroy</name>
</author>
<author><name sortKey="Martinez, Jd" uniqKey="Martinez J">JD Martinez</name>
</author>
<author><name sortKey="Harwell, J" uniqKey="Harwell J">J Harwell</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Turney, D" uniqKey="Turney D">D Turney</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Letsche, A" uniqKey="Letsche A">A Letsche</name>
</author>
<author><name sortKey="Berry, W" uniqKey="Berry W">W Berry</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Griffiths, T" uniqKey="Griffiths T">T Griffiths</name>
</author>
<author><name sortKey="Steyvers, M" uniqKey="Steyvers M">M Steyvers</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Cui, X" uniqKey="Cui X">X Cui</name>
</author>
<author><name sortKey="Potok, E" uniqKey="Potok E">E Potok</name>
</author>
<author><name sortKey="Palathingal, P" uniqKey="Palathingal P">P Palathingal</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Liu, Y" uniqKey="Liu Y">Y Liu</name>
</author>
<author><name sortKey="Cai, J" uniqKey="Cai J">J Cai</name>
</author>
<author><name sortKey="Yin, J" uniqKey="Yin J">J Yin</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Pinto, D" uniqKey="Pinto D">D Pinto</name>
</author>
<author><name sortKey="Benedi, Jm" uniqKey="Benedi J">JM Benedí</name>
</author>
<author><name sortKey="Rosso, P" uniqKey="Rosso P">P Rosso</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Dhillon, S" uniqKey="Dhillon S">S Dhillon</name>
</author>
<author><name sortKey="Mallela, S" uniqKey="Mallela S">S Mallela</name>
</author>
<author><name sortKey="Kumar, R" uniqKey="Kumar R">R Kumar</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Elahi, A" uniqKey="Elahi A">A Elahi</name>
</author>
<author><name sortKey="Rostami, S" uniqKey="Rostami S">S Rostami</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Granados, A" uniqKey="Granados A">A Granados</name>
</author>
<author><name sortKey="Camacho, D" uniqKey="Camacho D">D Camacho</name>
</author>
<author><name sortKey="Rodriguez, B" uniqKey="Rodriguez B">B Rodríguez</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Shaoxu, S" uniqKey="Shaoxu S">S Shaoxu</name>
</author>
<author><name sortKey="Jian, Z" uniqKey="Jian Z">Z Jian</name>
</author>
<author><name sortKey="Chunping, L" uniqKey="Chunping L">L Chunping</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Yubao, L" uniqKey="Yubao L">L Yubao</name>
</author>
<author><name sortKey="Jiarong, C" uniqKey="Jiarong C">C Jiarong</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Jing, J" uniqKey="Jing J">J Jing</name>
</author>
<author><name sortKey="Dong Qing, Y" uniqKey="Dong Qing Y">Y Dong-Qing</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Jiaju, M" uniqKey="Jiaju M">M Jiaju</name>
</author>
<author><name sortKey="Yiming, Z" uniqKey="Yiming Z">Z YiMing</name>
</author>
<author><name sortKey="Yunqi, G" uniqKey="Yunqi G">G YunQi</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Griffiths, T" uniqKey="Griffiths T">T Griffiths</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Shen, H" uniqKey="Shen H">H Shen</name>
</author>
<author><name sortKey="Zheng, C" uniqKey="Zheng C">C Zheng</name>
</author>
<author><name sortKey="Yong, Y" uniqKey="Yong Y">Y Yong</name>
</author>
<author><name sortKey="Wei Ying, M" uniqKey="Wei Ying M">M Wei-Ying</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Azcarraga, A" uniqKey="Azcarraga A">A Azcarraga</name>
</author>
<author><name sortKey="Yap, Tn" uniqKey="Yap T">TN Yap</name>
</author>
<author><name sortKey="Tan, J" uniqKey="Tan J">J Tan</name>
</author>
<author><name sortKey="Chua, Ts" uniqKey="Chua T">TS Chua</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ercan, G" uniqKey="Ercan G">G Ercan</name>
</author>
<author><name sortKey="Cicekli, I" uniqKey="Cicekli I">I Cicekli</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Nasira, Ja" uniqKey="Nasira J">JA Nasira</name>
</author>
<author><name sortKey="Varlamisb, I" uniqKey="Varlamisb I">I Varlamisb</name>
</author>
<author><name sortKey="Karima, A" uniqKey="Karima A">A Karima</name>
</author>
<author><name sortKey="Tsatsaronisc, G" uniqKey="Tsatsaronisc G">G Tsatsaronisc</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<affiliations><list></list>
<tree><noCountry><name sortKey="Liu, Ming" sort="Liu, Ming" uniqKey="Liu M" first="Ming" last="Liu">Ming Liu</name>
<name sortKey="Liu, Yuanchao" sort="Liu, Yuanchao" uniqKey="Liu Y" first="Yuanchao" last="Liu">Yuanchao Liu</name>
<name sortKey="Wang, Xin" sort="Wang, Xin" uniqKey="Wang X" first="Xin" last="Wang">Xin Wang</name>
</noCountry>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Musique/explor/OperaV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000109 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000109 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Musique |area= OperaV1 |flux= Main |étape= Exploration |type= RBID |clé= PMC:4367988 |texte= Towards Semantically Sensitive Text Clustering: A Feature Space Modeling Technology Based on Dimension Extension }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i -Sk "pubmed:25794172" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \
       | NlmPubMed2Wicri -a OperaV1
This area was generated with Dilib version V0.6.21.