SGML exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

An intelligent information system for organizing online text documents

Internal identifier: 001A84 (Istex/Corpus); previous: 001A83; next: 001A85

Authors: Han-Joon Kim; Sang-Goo Lee

Source: Knowledge and Information Systems, vol. 6, no. 2, pp. 125-149 (2004)

RBID : ISTEX:6663944DF77BA2C70A2E066877CAA96357B79CCF

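English descriptors

Document categorization; Document clustering; Fuzzy relations; Hierarchical agglomerative clustering; Information systems; Naïve Bayes; Topic hierarchy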

Abstract

This paper describes an intelligent information system for effectively managing huge amounts of online text documents (such as Web documents) in a hierarchical manner. The organizational capabilities of this system are able to evolve semi-automatically with minimal human input. The system starts with an initial taxonomy in which documents are automatically categorized, and then evolves so as to provide a good indexing service as the document collection grows or its usage changes. To this end, we propose a series of algorithms that utilize text-mining technologies such as document clustering, document categorization, and hierarchy reorganization. In particular, clustering and categorization algorithms have been intensively studied in order to provide evolving facilities for hierarchical structures and categorization criteria. Through experiments using the Reuters-21578 document collection, we evaluate the performance of the proposed clustering and categorization methods by comparing them to those of well-known conventional methods.

URL: https://api.istex.fr/ark:/67375/VQC-5MH63HNF-Z/fulltext.pdf
DOI: 10.1007/BF02637152

Links to Exploration step

ISTEX:6663944DF77BA2C70A2E066877CAA96357B79CCF

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">An intelligent information system for organizing online text documents</title>
<author>
<name sortKey="Kim, Han Joon" sort="Kim, Han Joon" uniqKey="Kim H" first="Han-Joon" last="Kim">Han-Joon Kim</name>
<affiliation>
<mods:affiliation>Department of Electrical and Computer Engineering, The University of Seoul, 90 Jeonnong-dong, Dongdaemun-gu, 130-743, Seoul, Korea</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: khj@uos.ac.kr</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Lee, Sang Goo" sort="Lee, Sang Goo" uniqKey="Lee S" first="Sang-Goo" last="Lee">Sang-Goo Lee</name>
<affiliation>
<mods:affiliation>School of Computer Science and Engineering, Seoul National University, Seoul, Korea</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:6663944DF77BA2C70A2E066877CAA96357B79CCF</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1007/BF02637152</idno>
<idno type="url">https://api.istex.fr/ark:/67375/VQC-5MH63HNF-Z/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001A84</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">001A84</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">An intelligent information system for organizing online text documents</title>
<author>
<name sortKey="Kim, Han Joon" sort="Kim, Han Joon" uniqKey="Kim H" first="Han-Joon" last="Kim">Han-Joon Kim</name>
<affiliation>
<mods:affiliation>Department of Electrical and Computer Engineering, The University of Seoul, 90 Jeonnong-dong, Dongdaemun-gu, 130-743, Seoul, Korea</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: khj@uos.ac.kr</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Lee, Sang Goo" sort="Lee, Sang Goo" uniqKey="Lee S" first="Sang-Goo" last="Lee">Sang-Goo Lee</name>
<affiliation>
<mods:affiliation>School of Computer Science and Engineering, Seoul National University, Seoul, Korea</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Knowledge and Information Systems</title>
<title level="j" type="abbrev">Knowledge and Information Systems</title>
<idno type="ISSN">0219-1377</idno>
<idno type="eISSN">0219-3116</idno>
<imprint>
<publisher>Springer-Verlag</publisher>
<pubPlace>London</pubPlace>
<date type="published" when="2004-03-01">2004-03-01</date>
<biblScope unit="volume">6</biblScope>
<biblScope unit="issue">2</biblScope>
<biblScope unit="page" from="125">125</biblScope>
<biblScope unit="page" to="149">149</biblScope>
</imprint>
<idno type="ISSN">0219-1377</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0219-1377</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Document categorization</term>
<term>Document clustering</term>
<term>Fuzzy relations</term>
<term>Hierarchical agglomerative clustering</term>
<term>Information systems</term>
<term>Naïve Bayes</term>
<term>Topic hierarchy</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: This paper describes an intelligent information system for effectively managing huge amounts of online text documents (such as Web documents) in a hierarchical manner. The organizational capabilities of this system are able to evolve semi-automatically with minimal human input. The system starts with an initial taxonomy in which documents are automatically categorized, and then evolves so as to provide a good indexing service as the document collection grows or its usage changes. To this end, we propose a series of algorithms that utilize text-mining technologies such as document clustering, document categorization, and hierarchy reorganization. In particular, clustering and categorization algorithms have been intensively studied in order to provide evolving facilities for hierarchical structures and categorization criteria. Through experiments using the Reuters-21578 document collection, we evaluate the performance of the proposed clustering and categorization methods by comparing them to those of well-known conventional methods.</div>
</front>
</TEI>
<istex>
<corpusName>springer-journals</corpusName>
<author>
<json:item>
<name>Han-joon Kim</name>
<affiliations>
<json:string>Department of Electrical and Computer Engineering, The University of Seoul, 90 Jeonnong-dong, Dongdaemun-gu, 130-743, Seoul, Korea</json:string>
<json:string>E-mail: khj@uos.ac.kr</json:string>
</affiliations>
</json:item>
<json:item>
<name>Sang-goo Lee</name>
<affiliations>
<json:string>School of Computer Science and Engineering, Seoul National University, Seoul, Korea</json:string>
</affiliations>
</json:item>
</author>
<subject>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Document categorization</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Document clustering</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Fuzzy relations</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Hierarchical agglomerative clustering</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Information systems</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Naïve Bayes</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>Topic hierarchy</value>
</json:item>
</subject>
<articleId>
<json:string>BF02637152</json:string>
<json:string>Art1</json:string>
</articleId>
<arkIstex>ark:/67375/VQC-5MH63HNF-Z</arkIstex>
<language>
<json:string>eng</json:string>
</language>
<originalGenre>
<json:string>OriginalPaper</json:string>
</originalGenre>
<abstract>Abstract: This paper describes an intelligent information system for effectively managing huge amounts of online text documents (such as Web documents) in a hierarchical manner. The organizational capabilities of this system are able to evolve semi-automatically with minimal human input. The system starts with an initial taxonomy in which documents are automatically categorized, and then evolves so as to provide a good indexing service as the document collection grows or its usage changes. To this end, we propose a series of algorithms that utilize text-mining technologies such as document clustering, document categorization, and hierarchy reorganization. In particular, clustering and categorization algorithms have been intensively studied in order to provide evolving facilities for hierarchical structures and categorization criteria. Through experiments using the Reuters-21578 document collection, we evaluate the performance of the proposed clustering and categorization methods by comparing them to those of well-known conventional methods.</abstract>
<qualityIndicators>
<score>8.728</score>
<pdfWordCount>11212</pdfWordCount>
<pdfCharCount>67343</pdfCharCount>
<pdfVersion>1.3</pdfVersion>
<pdfPageCount>25</pdfPageCount>
<pdfPageSize>432 x 666 pts</pdfPageSize>
<refBibsNative>false</refBibsNative>
<abstractWordCount>144</abstractWordCount>
<abstractCharCount>1056</abstractCharCount>
<keywordCount>7</keywordCount>
</qualityIndicators>
<title>An intelligent information system for organizing online text documents</title>
<genre>
<json:string>research-article</json:string>
</genre>
<host>
<title>Knowledge and Information Systems</title>
<language>
<json:string>unknown</json:string>
</language>
<publicationDate>2004</publicationDate>
<copyrightDate>2004</copyrightDate>
<issn>
<json:string>0219-1377</json:string>
</issn>
<eissn>
<json:string>0219-3116</json:string>
</eissn>
<journalId>
<json:string>10115</json:string>
</journalId>
<volume>6</volume>
<issue>2</issue>
<pages>
<first>125</first>
<last>149</last>
</pages>
<genre>
<json:string>journal</json:string>
</genre>
<subject>
<json:item>
<value>Information Systems and Communication Service</value>
</json:item>
<json:item>
<value>Business Information Systems</value>
</json:item>
</subject>
</host>
<ark>
<json:string>ark:/67375/VQC-5MH63HNF-Z</json:string>
</ark>
<publicationDate>2004</publicationDate>
<copyrightDate>2004</copyrightDate>
<doi>
<json:string>10.1007/BF02637152</json:string>
</doi>
<id>6663944DF77BA2C70A2E066877CAA96357B79CCF</id>
<score>1</score>
<fulltext>
<json:item>
<extension>pdf</extension>
<original>true</original>
<mimetype>application/pdf</mimetype>
<uri>https://api.istex.fr/ark:/67375/VQC-5MH63HNF-Z/fulltext.pdf</uri>
</json:item>
<json:item>
<extension>zip</extension>
<original>false</original>
<mimetype>application/zip</mimetype>
<uri>https://api.istex.fr/ark:/67375/VQC-5MH63HNF-Z/bundle.zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/ark:/67375/VQC-5MH63HNF-Z/fulltext.tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a" type="main" xml:lang="en">An intelligent information system for organizing online text documents</title>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher scheme="https://scientific-publisher.data.istex.fr">Springer-Verlag</publisher>
<pubPlace>London</pubPlace>
<availability>
<licence>
<p>Springer-Verlag London Ltd., 2004</p>
</licence>
<p scheme="https://loaded-corpus.data.istex.fr/ark:/67375/XBH-3XSW68JL-F">springer</p>
</availability>
<date>2004</date>
</publicationStmt>
<notesStmt>
<note type="research-article" scheme="https://content-type.data.istex.fr/ark:/67375/XTP-1JC4F85T-7">research-article</note>
<note type="journal" scheme="https://publication-type.data.istex.fr/ark:/67375/JMC-0GLKJH51-B">journal</note>
<note>Regular Papers</note>
</notesStmt>
<sourceDesc>
<biblStruct type="inbook">
<analytic>
<title level="a" type="main" xml:lang="en">An intelligent information system for organizing online text documents</title>
<author xml:id="author-0000" corresp="yes">
<persName>
<forename type="first">Han-joon</forename>
<surname>Kim</surname>
</persName>
<email>khj@uos.ac.kr</email>
<affiliation>Department of Electrical and Computer Engineering, The University of Seoul, 90 Jeonnong-dong, Dongdaemun-gu, 130-743, Seoul, Korea</affiliation>
</author>
<author xml:id="author-0001">
<persName>
<forename type="first">Sang-goo</forename>
<surname>Lee</surname>
</persName>
<affiliation>School of Computer Science and Engineering, Seoul National University, Seoul, Korea</affiliation>
</author>
<idno type="istex">6663944DF77BA2C70A2E066877CAA96357B79CCF</idno>
<idno type="ark">ark:/67375/VQC-5MH63HNF-Z</idno>
<idno type="DOI">10.1007/BF02637152</idno>
<idno type="article-id">BF02637152</idno>
<idno type="article-id">Art1</idno>
</analytic>
<monogr>
<title level="j">Knowledge and Information Systems</title>
<title level="j" type="abbrev">Knowledge and Information Systems</title>
<idno type="pISSN">0219-1377</idno>
<idno type="eISSN">0219-3116</idno>
<idno type="journal-ID">true</idno>
<idno type="issue-article-count">6</idno>
<idno type="volume-issue-count">4</idno>
<imprint>
<publisher>Springer-Verlag</publisher>
<pubPlace>London</pubPlace>
<date type="published" when="2004-03-01"></date>
<biblScope unit="volume">6</biblScope>
<biblScope unit="issue">2</biblScope>
<biblScope unit="page" from="125">125</biblScope>
<biblScope unit="page" to="149">149</biblScope>
</imprint>
</monogr>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<creation>
<date>2004</date>
</creation>
<langUsage>
<language ident="en">en</language>
</langUsage>
<abstract xml:lang="en">
<p>Abstract: This paper describes an intelligent information system for effectively managing huge amounts of online text documents (such as Web documents) in a hierarchical manner. The organizational capabilities of this system are able to evolve semi-automatically with minimal human input. The system starts with an initial taxonomy in which documents are automatically categorized, and then evolves so as to provide a good indexing service as the document collection grows or its usage changes. To this end, we propose a series of algorithms that utilize text-mining technologies such as document clustering, document categorization, and hierarchy reorganization. In particular, clustering and categorization algorithms have been intensively studied in order to provide evolving facilities for hierarchical structures and categorization criteria. Through experiments using the Reuters-21578 document collection, we evaluate the performance of the proposed clustering and categorization methods by comparing them to those of well-known conventional methods.</p>
</abstract>
<textClass xml:lang="en">
<keywords scheme="keyword">
<list>
<head>Keywords</head>
<item>
<term>Document categorization</term>
</item>
<item>
<term>Document clustering</term>
</item>
<item>
<term>Fuzzy relations</term>
</item>
<item>
<term>Hierarchical agglomerative clustering</term>
</item>
<item>
<term>Information systems</term>
</item>
<item>
<term>Naïve Bayes</term>
</item>
<item>
<term>Topic hierarchy</term>
</item>
</list>
</keywords>
</textClass>
<textClass>
<keywords scheme="Journal Subject">
<list>
<head>Computer Science</head>
<item>
<term>Information Systems and Communication Service</term>
</item>
<item>
<term>Business Information Systems</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc>
<change when="2004-03-01">Published</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<extension>txt</extension>
<original>false</original>
<mimetype>text/plain</mimetype>
<uri>https://api.istex.fr/ark:/67375/VQC-5MH63HNF-Z/fulltext.txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="corpus springer-journals not found" wicri:toSee="no header">
<istex:xmlDeclaration>version="1.0" encoding="UTF-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//Springer-Verlag//DTD A++ V2.4//EN" URI="http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd" name="istex:docType"></istex:docType>
<istex:document>
<Publisher>
<PublisherInfo>
<PublisherName>Springer-Verlag</PublisherName>
<PublisherLocation>London</PublisherLocation>
</PublisherInfo>
<Journal>
<JournalInfo JournalProductType="ArchiveJournal" NumberingStyle="Unnumbered">
<JournalID>10115</JournalID>
<JournalPrintISSN>0219-1377</JournalPrintISSN>
<JournalElectronicISSN>0219-3116</JournalElectronicISSN>
<JournalTitle>Knowledge and Information Systems</JournalTitle>
<JournalAbbreviatedTitle>Knowledge and Information Systems</JournalAbbreviatedTitle>
<JournalSubjectGroup>
<JournalSubject Type="Primary">Computer Science</JournalSubject>
<JournalSubject Type="Secondary">Information Systems and Communication Service</JournalSubject>
<JournalSubject Type="Secondary">Business Information Systems</JournalSubject>
</JournalSubjectGroup>
</JournalInfo>
<Volume>
<VolumeInfo TocLevels="0" VolumeType="Regular">
<VolumeIDStart>6</VolumeIDStart>
<VolumeIDEnd>6</VolumeIDEnd>
<VolumeIssueCount>4</VolumeIssueCount>
</VolumeInfo>
<Issue IssueType="Regular">
<IssueInfo TocLevels="0">
<IssueIDStart>2</IssueIDStart>
<IssueIDEnd>2</IssueIDEnd>
<IssueArticleCount>6</IssueArticleCount>
<IssueHistory>
<CoverDate>
<Year>2004</Year>
<Month>3</Month>
</CoverDate>
</IssueHistory>
<IssueCopyright>
<CopyrightHolderName>Springer-Verlag London Ltd.</CopyrightHolderName>
<CopyrightYear>2004</CopyrightYear>
</IssueCopyright>
</IssueInfo>
<Article ID="Art1">
<ArticleInfo ArticleType="OriginalPaper" ContainsESM="No" Language="En" NumberingStyle="Unnumbered" TocLevels="0">
<ArticleID>BF02637152</ArticleID>
<ArticleDOI>10.1007/BF02637152</ArticleDOI>
<ArticleSequenceNumber>1</ArticleSequenceNumber>
<ArticleTitle Language="En">An intelligent information system for organizing online text documents</ArticleTitle>
<ArticleCategory>Regular Papers</ArticleCategory>
<ArticleFirstPage>125</ArticleFirstPage>
<ArticleLastPage>149</ArticleLastPage>
<ArticleHistory>
<RegistrationDate>
<Year>2007</Year>
<Month>5</Month>
<Day>23</Day>
</RegistrationDate>
</ArticleHistory>
<ArticleCopyright>
<CopyrightHolderName>Springer-Verlag London Ltd.</CopyrightHolderName>
<CopyrightYear>2004</CopyrightYear>
</ArticleCopyright>
<ArticleGrants Type="Regular">
<MetadataGrant Grant="OpenAccess"></MetadataGrant>
<AbstractGrant Grant="OpenAccess"></AbstractGrant>
<BodyPDFGrant Grant="Restricted"></BodyPDFGrant>
<BodyHTMLGrant Grant="Restricted"></BodyHTMLGrant>
<BibliographyGrant Grant="Restricted"></BibliographyGrant>
<ESMGrant Grant="Restricted"></ESMGrant>
</ArticleGrants>
<ArticleContext>
<JournalID>10115</JournalID>
<VolumeIDStart>6</VolumeIDStart>
<VolumeIDEnd>6</VolumeIDEnd>
<IssueIDStart>2</IssueIDStart>
<IssueIDEnd>2</IssueIDEnd>
</ArticleContext>
</ArticleInfo>
<ArticleHeader>
<AuthorGroup>
<Author AffiliationIDS="Aff1" CorrespondingAffiliationID="Aff1">
<AuthorName DisplayOrder="Western">
<GivenName>Han-joon</GivenName>
<FamilyName>Kim</FamilyName>
</AuthorName>
<Contact>
<Email>khj@uos.ac.kr</Email>
</Contact>
</Author>
<Author AffiliationIDS="Aff2">
<AuthorName DisplayOrder="Western">
<GivenName>Sang-goo</GivenName>
<FamilyName>Lee</FamilyName>
</AuthorName>
</Author>
<Affiliation ID="Aff1">
<OrgDivision>Department of Electrical and Computer Engineering</OrgDivision>
<OrgName>The University of Seoul</OrgName>
<OrgAddress>
<Street>90 Jeonnong-dong, Dongdaemun-gu</Street>
<Postcode>130-743</Postcode>
<City>Seoul</City>
<Country>Korea</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff2">
<OrgDivision>School of Computer Science and Engineering</OrgDivision>
<OrgName>Seoul National University</OrgName>
<OrgAddress>
<City>Seoul</City>
<Country>Korea</Country>
</OrgAddress>
</Affiliation>
</AuthorGroup>
<Abstract ID="Abs1" Language="En">
<Heading>Abstract</Heading>
<Para>This paper describes an intelligent information system for effectively managing huge amounts of online text documents (such as Web documents) in a hierarchical manner. The organizational capabilities of this system are able to evolve semi-automatically with minimal human input. The system starts with an initial taxonomy in which documents are automatically categorized, and then evolves so as to provide a good indexing service as the document collection grows or its usage changes. To this end, we propose a series of algorithms that utilize text-mining technologies such as document clustering, document categorization, and hierarchy reorganization. In particular, clustering and categorization algorithms have been intensively studied in order to provide evolving facilities for hierarchical structures and categorization criteria. Through experiments using the Reuters-21578 document collection, we evaluate the performance of the proposed clustering and categorization methods by comparing them to those of well-known conventional methods.</Para>
</Abstract>
<KeywordGroup Language="En">
<Heading>Keywords</Heading>
<Keyword>Document categorization</Keyword>
<Keyword>Document clustering</Keyword>
<Keyword>Fuzzy relations</Keyword>
<Keyword>Hierarchical agglomerative clustering</Keyword>
<Keyword>Information systems</Keyword>
<Keyword>Naïve Bayes</Keyword>
<Keyword>Topic hierarchy</Keyword>
</KeywordGroup>
</ArticleHeader>
<NoBody></NoBody>
</Article>
</Issue>
</Volume>
</Journal>
</Publisher>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo lang="en">
<title>An intelligent information system for organizing online text documents</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA">
<title>An intelligent information system for organizing online text documents</title>
</titleInfo>
<name type="personal" displayLabel="corresp">
<namePart type="given">Han-joon</namePart>
<namePart type="family">Kim</namePart>
<affiliation>Department of Electrical and Computer Engineering, The University of Seoul, 90 Jeonnong-dong, Dongdaemun-gu, 130-743, Seoul, Korea</affiliation>
<affiliation>E-mail: khj@uos.ac.kr</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sang-goo</namePart>
<namePart type="family">Lee</namePart>
<affiliation>School of Computer Science and Engineering, Seoul National University, Seoul, Korea</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="research-article" displayLabel="OriginalPaper" authority="ISTEX" authorityURI="https://content-type.data.istex.fr" valueURI="https://content-type.data.istex.fr/ark:/67375/XTP-1JC4F85T-7">research-article</genre>
<originInfo>
<publisher>Springer-Verlag</publisher>
<place>
<placeTerm type="text">London</placeTerm>
</place>
<dateIssued encoding="w3cdtf">2004-03-01</dateIssued>
<copyrightDate encoding="w3cdtf">2004</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
</language>
<abstract lang="en">Abstract: This paper describes an intelligent information system for effectively managing huge amounts of online text documents (such as Web documents) in a hierarchical manner. The organizational capabilities of this system are able to evolve semi-automatically with minimal human input. The system starts with an initial taxonomy in which documents are automatically categorized, and then evolves so as to provide a good indexing service as the document collection grows or its usage changes. To this end, we propose a series of algorithms that utilize text-mining technologies such as document clustering, document categorization, and hierarchy reorganization. In particular, clustering and categorization algorithms have been intensively studied in order to provide evolving facilities for hierarchical structures and categorization criteria. Through experiments using the Reuters-21578 document collection, we evaluate the performance of the proposed clustering and categorization methods by comparing them to those of well-known conventional methods.</abstract>
<note>Regular Papers</note>
<subject lang="en">
<genre>Keywords</genre>
<topic>Document categorization</topic>
<topic>Document clustering</topic>
<topic>Fuzzy relations</topic>
<topic>Hierarchical agglomerative clustering</topic>
<topic>Information systems</topic>
<topic>Naïve Bayes</topic>
<topic>Topic hierarchy</topic>
</subject>
<relatedItem type="host">
<titleInfo>
<title>Knowledge and Information Systems</title>
</titleInfo>
<titleInfo type="abbreviated">
<title>Knowledge and Information Systems</title>
</titleInfo>
<genre type="journal" authority="ISTEX" authorityURI="https://publication-type.data.istex.fr" valueURI="https://publication-type.data.istex.fr/ark:/67375/JMC-0GLKJH51-B">journal</genre>
<originInfo>
<publisher>Springer</publisher>
<dateIssued encoding="w3cdtf">2004-03-01</dateIssued>
<copyrightDate encoding="w3cdtf">2004</copyrightDate>
</originInfo>
<subject>
<genre>Computer Science</genre>
<topic>Information Systems and Communication Service</topic>
<topic>Business Information Systems</topic>
</subject>
<identifier type="ISSN">0219-1377</identifier>
<identifier type="eISSN">0219-3116</identifier>
<identifier type="JournalID">10115</identifier>
<identifier type="IssueArticleCount">6</identifier>
<identifier type="VolumeIssueCount">4</identifier>
<part>
<date>2004</date>
<detail type="volume">
<number>6</number>
<caption>vol.</caption>
</detail>
<detail type="issue">
<number>2</number>
<caption>no.</caption>
</detail>
<extent unit="pages">
<start>125</start>
<end>149</end>
</extent>
</part>
<recordInfo>
<recordOrigin>Springer-Verlag London Ltd., 2004</recordOrigin>
</recordInfo>
</relatedItem>
<identifier type="istex">6663944DF77BA2C70A2E066877CAA96357B79CCF</identifier>
<identifier type="ark">ark:/67375/VQC-5MH63HNF-Z</identifier>
<identifier type="DOI">10.1007/BF02637152</identifier>
<identifier type="ArticleID">BF02637152</identifier>
<identifier type="ArticleID">Art1</identifier>
<accessCondition type="use and reproduction" contentType="copyright">Springer-Verlag London Ltd., 2004</accessCondition>
<recordInfo>
<recordContentSource authority="ISTEX" authorityURI="https://loaded-corpus.data.istex.fr" valueURI="https://loaded-corpus.data.istex.fr/ark:/67375/XBH-3XSW68JL-F">springer</recordContentSource>
<recordOrigin>Springer-Verlag London Ltd., 2004</recordOrigin>
</recordInfo>
</mods>
<json:item>
<extension>json</extension>
<original>false</original>
<mimetype>application/json</mimetype>
<uri>https://api.istex.fr/ark:/67375/VQC-5MH63HNF-Z/record.json</uri>
</json:item>
</metadata>
<serie></serie>
</istex>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Informatique/explor/SgmlV1/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001A84 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 001A84 | SxmlIndent | more
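
Once the record has been extracted with the commands above, it can be post-processed with a general-purpose XML parser. The following Python sketch is illustrative only and is not part of Dilib. It assumes the output is rooted at a literal `<record>` tag, as in the record shown on this page. Because the record uses the prefixes `wicri:`, `mods:`, `json:` and `istex:` without declaring them, the sketch binds them to made-up placeholder URIs before parsing. The function names `parse_record` and `summarize` and the trimmed `SAMPLE` fragment are hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder URIs (an assumption): the record uses these prefixes without
# declaring them, so a namespace-aware parser rejects it unless they are bound.
PLACEHOLDER_NS = {p: f"urn:placeholder:{p}" for p in ("wicri", "mods", "json", "istex")}


def parse_record(xml_text):
    """Parse a Wicri record after binding its undeclared prefixes."""
    decls = " ".join(f'xmlns:{p}="{uri}"' for p, uri in PLACEHOLDER_NS.items())
    # Assumes the root start tag is exactly "<record>"; otherwise parsing
    # fails with an "unbound prefix" error.
    patched = xml_text.replace("<record>", f"<record {decls}>", 1)
    return ET.fromstring(patched)


def summarize(record):
    """Return title, DOI and keywords from the first TEI header of the record."""
    tei = record.find("TEI")
    return {
        "title": tei.findtext("teiHeader/fileDesc/titleStmt/title"),
        "doi": tei.findtext("teiHeader/fileDesc/publicationStmt/idno[@type='doi']"),
        "keywords": [
            term.text
            for term in tei.findall("teiHeader/profileDesc/textClass/keywords/term")
        ],
    }


# Trimmed fragment of the record shown on this page, used as test input.
SAMPLE = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">An intelligent information system for organizing online text documents</title>
</titleStmt>
<publicationStmt>
<idno type="RBID">ISTEX:6663944DF77BA2C70A2E066877CAA96357B79CCF</idno>
<idno type="doi">10.1007/BF02637152</idno>
</publicationStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Document categorization</term>
<term>Document clustering</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>"""

info = summarize(parse_record(SAMPLE))
print(info["title"])
print(info["doi"])
print(", ".join(info["keywords"]))
```

In practice the text passed to `parse_record` would be the output of the `HfdSelect` pipeline saved to a file. The paths are anchored at the first `TEI` element so that the nested `istex:fulltextTEI` header, which repeats some of the same element names, is not picked up.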

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Informatique
   |area=    SgmlV1
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:6663944DF77BA2C70A2E066877CAA96357B79CCF
   |texte=   An intelligent information system for organizing online text documents
}}
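
When many records need such links, the template call above can be generated rather than typed by hand. The following Python sketch assumes only the layout visible in the example: three leading spaces, then `|key=` padded to ten characters, then the value. The helper name `explor_link` and its argument names are hypothetical and are not part of Dilib or Wicri.

```python
def explor_link(wiki, area, flux, etape, type_, cle, texte):
    """Render an {{Explor lien}} template call in the layout shown above."""
    # Parameter names are copied from the template on this page.
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", etape),
        ("type", type_),
        ("clé", cle),
        ("texte", texte),
    ]
    lines = ["{{Explor lien"]
    for key, value in fields:
        # "|key=" is left-justified in a 10-character column before the value.
        lines.append("   " + f"|{key}=".ljust(10) + value)
    lines.append("}}")
    return "\n".join(lines)


link = explor_link(
    wiki="Wicri/Informatique",
    area="SgmlV1",
    flux="Istex",
    etape="Corpus",
    type_="RBID",
    cle="ISTEX:6663944DF77BA2C70A2E066877CAA96357B79CCF",
    texte="An intelligent information system for organizing online text documents",
)
print(link)
```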

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jul 1 14:26:08 2019. Site generation: Wed Apr 28 21:40:44 2021